Opinion: Improving data portal feedback
Some of our sponsors own Data Portals where they publish (and sometimes encourage others to publish) data. Examples include Data Mill North, Calderdale Dataworks, and Northern Powergrid's Open Data Portal. Yiu-Shing Pang recently shared some thoughts on their new Open Data Portal in a blog post, which you can read here.
A common problem publishers of open data face is justifying the effort they put into publishing datasets. The main tools they usually have for showing that the effort is worthwhile are download/visitor figures, surveys, and comment sections. As an organisation, we both publish datasets and use datasets published by others, and we'd like a more holistic approach to feedback.
First let's look at the limitations of existing "feedback" methods:
- Download/visitor numbers can tell you that one dataset has been downloaded or visited 3 times and another dataset 3000 times. But that doesn't mean that one is 1000 times more useful, or more used, than the other. The dataset with 3 downloads may power a tool that is used by huge numbers of people. The dataset with 3000 downloads may owe its count to a single developer whose inefficient code fetches the file over and over.
- Surveys and online 'Contact' forms can give useful feedback about a Data Portal in general but often aren't great for feedback about a specific dataset. They also require people to know they exist and to be happy to volunteer their time for no clear benefit to themselves.
- Comment sections are useful for individual datasets (as a user of the data you could report an issue with it) but can suffer from a lack of responses from the dataset publisher or can get filled up with irrelevant spam.
Ideally there would be a better feedback loop that has benefits to all parties: publisher, developers/users, and random members of the public.
Adding a showcase per dataset
Our very modest proposal is for every dataset page to have a "showcase" (or "existing uses" or "products") section that would be curated by the publisher of the dataset. This section would consist of a series of "blocks", each specific to that dataset, that could have a title, a description, a thumbnail image, an author and a URL. It would resemble existing Showcase/Products pages.
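To make the idea of a "block" concrete, here is a minimal sketch in Python of what one might look like as data. It is only an illustration: the names `ShowcaseBlock` and `showcase_for`, the choice to require a title and a URL, and the example.org addresses are all our own assumptions rather than part of any existing Data Portal.

```python
from dataclasses import dataclass, asdict
from typing import Optional
from urllib.parse import urlparse


@dataclass(frozen=True)
class ShowcaseBlock:
    """One curated entry in a dataset's showcase (hypothetical schema)."""
    title: str
    url: str
    description: str = ""
    author: str = ""
    thumbnail: Optional[str] = None  # address of a thumbnail image, if any

    def __post_init__(self):
        # Assumption: a block is only useful with a title and a web link.
        if not self.title.strip():
            raise ValueError("a showcase block needs a title")
        parsed = urlparse(self.url)
        if parsed.scheme not in ("http", "https") or not parsed.netloc:
            raise ValueError(f"not a web URL: {self.url!r}")


def showcase_for(dataset_id, blocks):
    """Bundle the blocks for one dataset into a plain dictionary."""
    return {"dataset": dataset_id, "showcase": [asdict(b) for b in blocks]}


# Example: a publisher adds one external use of a (made-up) dataset.
block = ShowcaseBlock(
    title="Example map",
    url="https://example.org/map",
    description="An interactive map built from the dataset.",
    author="A. Developer",
)
page = showcase_for("example-dataset", [block])
```

Keeping the block this small is deliberate: a publisher can curate entries by hand, and the resulting dictionary could be stored alongside whatever metadata the portal already holds for the dataset.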
An immediate use of a showcase would be for publishers to separate their own visualisations from the data files. But because it could also promote the work of external developers, journalists and citizens who have used the data, it gives those people an incentive to submit their uses back to the publisher. That gives the publisher a better understanding of how people are using their data, beyond simple download and visitor numbers.
It also benefits random members of the public who stumble across the dataset page from a search engine. These people may not be clued up on the industry and may not have interacted with that specific dataset before, but they may be interested in using the data in original ways not anticipated by the publisher. We often see Data Portals that require the user to already know that the dataset they're interested in exists and how to use it. Giving people an incentive to share their "use cases" by showcasing them on the data portal itself, i.e. publicity and/or recognition for innovative use of the data, benefits both parties and inspires ideas for future, less experienced data users - everyone gains.
A further step would be to encourage data users to share their code and methodology. Publishers would then have an easy justification for their efforts, as they could see directly how their data is being used by the industry and the general public, and where the dataset could be improved.