Eating your own dog food
More and more of our sponsors are starting their open data journey, and they've been asking us for advice on how to publish open data. This can get quite technical (data formats, standards, platforms) but frequently, for those starting out, the biggest hurdle is just getting something published. There are lots of cases where people have spent a lot of effort exporting or extracting a file, put it somewhere public (such as Data Mill North), and then forgotten about it. This can lead to problems. I've seen published datasets that are empty, saved incorrectly, saved with the wrong file type, or full of junk. Sometimes, when a dataset is updated, a previously good quality file ends up broken and the publisher often doesn't know.
If you are a publisher, one thing to consider is "eating your own dog food". This analogy stems from a 1970s TV advert in which the manufacturer of a dog food stated that they fed it to their own dogs. The idea is that if your own dogs (or you!) have to eat the dog food you make, you'll make better dog food. In terms of data, this means that you could make your own tools or reports use the datasets that you publish openly. This encourages you to:
- publish enough open data to be able to do something useful
- keep your open data up-to-date
- keep your open data formatted correctly (otherwise you break your own tools)
- have one canonical version of a dataset that is (by default) open.
A good example of a big international company "eating their own dog food" is Amazon, who have used Amazon Web Services for their own retail sites since 2010. If AWS breaks, so does amazon.com (and amazon.co.uk etc.) and that would affect sales. So Amazon have a huge incentive to keep it up and running.
At ODI Leeds we also "eat our own dog food", as we make use of many of the datasets we publish. We publish several CSV files on GitHub where we record our events, website visits, social media stats, sponsors, projects, revenue and more. These files drive our dashboard, our events page, our sponsors page, our list of projects, and our data-driven reports. In the recent Future Energy Scenarios project (a collaboration with Northern Powergrid), we also made sure that the visualisation was driven by the very same files that are listed on Data Mill North. That means that the visualisation always has the latest data, and if any of the files break we quickly get to know about it.
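As a rough illustration of how a tool that consumes a published file can notice when that file breaks, here is a minimal Python sketch. It is not the code behind our dashboard; the function name `check_published_csv` and the column names in the usage note are made up for this example. It only looks for a few of the failures mentioned earlier (an empty file, a web page saved as a CSV, missing columns, no data rows).

```python
import csv
import io


def check_published_csv(text, required_columns):
    """Return a list of problems found in CSV text.

    An empty list means the file passed these (deliberately basic) checks.
    """
    # An export that silently produced nothing.
    if not text.strip():
        return ["file is empty"]

    # An error page or web page saved with a .csv extension.
    if text.lstrip().lower().startswith(("<!doctype", "<html")):
        return ["file looks like HTML, not CSV"]

    problems = []
    reader = csv.DictReader(io.StringIO(text))
    header = reader.fieldnames or []

    # Columns that downstream tools rely on must still be there.
    missing = [name for name in required_columns if name not in header]
    if missing:
        problems.append("missing columns: " + ", ".join(missing))

    # A header with nothing underneath is not much use to anyone.
    if not list(reader):
        problems.append("header present but no data rows")

    return problems
```

Calling `check_published_csv(text, ["date", "visits"])` on the contents of a published file gives back a list you could log, email, or use to fail a build. What counts as "broken" will differ from dataset to dataset, so treat these checks as a starting point.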
If your organisation is publishing open data, try to re-ingest those files back into your organisation rather than using local copies (where possible). Become a user of your own data and build on top of it.
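To make that concrete, here is a small Python sketch of a report reading the published copy of a file instead of a local one. The URL is a placeholder and `load_published_events` and `parse_events` are names invented for this example, so swap in wherever your own dataset actually lives. It assumes the file is a UTF-8 CSV with a header row.

```python
import csv
import io
import urllib.request

# Placeholder address: replace with the public URL of your own dataset.
PUBLISHED_URL = "https://example.org/open-data/events.csv"


def parse_events(text):
    """Turn CSV text into a list of dictionaries, one per row."""
    return list(csv.DictReader(io.StringIO(text)))


def load_published_events(url=PUBLISHED_URL, timeout=10):
    """Fetch the public file and parse it, exactly as an outside user would."""
    with urllib.request.urlopen(url, timeout=timeout) as response:
        # "utf-8-sig" also copes with the byte order mark some spreadsheet exports add.
        text = response.read().decode("utf-8-sig")
    return parse_events(text)
```

Because the report calls `load_published_events()` rather than opening a file on a shared drive, a bad update to the public file shows up in your own report straight away, which is the whole point.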