A key part of the Common Ground tool was the ability to show multiple open datasets at the same time. Separately, we've been having discussions with the open data teams from Bradford, Calderdale, and Leeds councils. They've been getting better at publishing open data but need better ways for the average citizen to explore and make sense of what's been published. Sound familiar? It turns out this has a lot in common (excuse the pun) with Common Ground.
To address the need for visualisation of datasets published on Data Mill North and Calderdale Data Works, I've re-used the code behind Common Ground in a separate (but related) tool: the ODI Leeds Data Mapper. Data Mapper will display open geographic datasets from across the region, not just Leeds.
One input format to rule them all
We've created a lot of maps over recent years powered by CSV files. In nearly every case we've had to wrangle each and every CSV file first to get it cleaned up and ready to be displayed on a map. Data wrangling can be a huge fraction of the total time spent creating a map or tool. Aside from all the horrors of formatting issues and data gremlins, there are no commonly agreed column headings for geographic data in CSV (although that may be changing).
Uber's Kepler can read CSV files but it needs human input to say which columns contain the geography. It isn't always easy for a human to tell which columns those are, never mind working it out reliably in code, and this isn't a task I want to push onto the average citizen. The process needs to be as frictionless as possible.
The original publisher knows their own data better than anyone. Geographic sanitisation should be done upstream, either by them or by a tool that knows how to deal with their specific CSV file. Consistency upstream should make a dataset display more reliably without the end-user needing to do anything. By encouraging more GeoJSON versions of datasets to exist, we should also make life easier for other web developers.
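As a rough illustration of what that upstream conversion could look like, here is a minimal Python sketch that turns a CSV with point locations into GeoJSON. It is not the code behind Data Mapper. The function name `csv_to_geojson` and the sample column names are made up for the example, and it assumes the publisher already knows which columns hold latitude and longitude in WGS84.

```python
import csv
import io
import json


def csv_to_geojson(csv_text, lat_col, lon_col):
    """Convert CSV text with point locations into a GeoJSON FeatureCollection.

    lat_col and lon_col are supplied by whoever knows the file (the publisher);
    this sketch makes no attempt to guess them.
    """
    features = []
    for row in csv.DictReader(io.StringIO(csv_text)):
        try:
            lat = float(row[lat_col])
            lon = float(row[lon_col])
        except (KeyError, TypeError, ValueError):
            # Skip rows with missing or non-numeric coordinates.
            continue
        # Every other column becomes a property of the feature.
        properties = {k: v for k, v in row.items() if k not in (lat_col, lon_col)}
        features.append({
            "type": "Feature",
            # GeoJSON positions are ordered [longitude, latitude].
            "geometry": {"type": "Point", "coordinates": [lon, lat]},
            "properties": properties,
        })
    return {"type": "FeatureCollection", "features": features}


# Hypothetical sample data, for illustration only.
sample = "name,latitude,longitude\nTown Hall,53.8,-1.55\nNo location,,\n"
collection = csv_to_geojson(sample, "latitude", "longitude")
print(json.dumps(collection, indent=2))
```

The row without coordinates is dropped rather than guessed at, which is the sort of decision that is easier for the publisher to make than for a downstream tool.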
The ODI Leeds Data Mapper gets data from multiple sources:
- Data Mill North and Calderdale Data Works. In the background I've written code that queries the APIs of our local open data repositories once a day looking for geographic datasets published as GeoJSON. When it finds them it also grabs the metadata and creates a list of datasets that can be shown in the tool.
- OpenStreetMap. Previously I wrote code that grabs a West Yorkshire extract of OpenStreetMap and creates themed GeoJSON files for Calderdale, Leeds, and Bradford every day. Those are hosted on GitHub and are very easy to re-use.
- Ordnance Survey. Amy took the OS Open Greenspace dataset, clipped it to West Yorkshire in QGIS, simplified the shapes to optimise the file size, and saved the output as GeoJSON.
- Other sources. We're including data from the crowd-sourcing platforms Open Plaques and Open Benches. It would be great to be able to include the Great British Public Toilet Map too.
- We've also manually converted some interesting datasets ourselves e.g. the Environment Agency's high and medium flood risk maps.
Data Mapper doesn't really mind where the GeoJSON files are hosted as long as the browser can load them. That means we can load the GeoJSON file straight from Data Mill North, Calderdale Data Works, or GitHub, ensuring the user gets the latest version of the data.
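The daily catalogue check described above boils down to filtering dataset metadata for GeoJSON resources and keeping their URLs. The sketch below shows that step in Python on data that has already been fetched. The catalogue structure (a list of datasets, each with `title`, `licence`, and `resources` entries carrying `format` and `url`) is an assumption made for the example and not the actual API response of Data Mill North or Calderdale Data Works, and `find_geojson_layers` is a hypothetical name.

```python
def find_geojson_layers(catalogue):
    """Return one layer entry per GeoJSON resource found in a catalogue.

    `catalogue` is assumed to be a list of dataset dicts. The real
    repositories may structure their metadata differently.
    """
    layers = []
    for dataset in catalogue:
        for resource in dataset.get("resources", []):
            fmt = (resource.get("format") or "").strip().lower()
            url = resource.get("url") or ""
            # Accept either an explicit format label or a .geojson extension.
            if fmt == "geojson" or url.lower().endswith(".geojson"):
                layers.append({
                    "title": dataset.get("title", ""),
                    "licence": dataset.get("licence", ""),
                    # Keep the publisher's own URL so the browser always
                    # loads the latest version of the file.
                    "url": url,
                })
    return layers


# Hypothetical catalogue entries, for illustration only.
catalogue = [
    {
        "title": "Allotments",
        "licence": "OGL",
        "resources": [
            {"format": "CSV", "url": "https://example.org/allotments.csv"},
            {"format": "GeoJSON", "url": "https://example.org/allotments.geojson"},
        ],
    },
    {
        "title": "Spending over 500",
        "resources": [{"format": "CSV", "url": "https://example.org/spend.csv"}],
    },
]

layers = find_geojson_layers(catalogue)
print(layers)
```

Because only the URL and metadata are stored, the list stays small and the data itself is never copied.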
As a result of creating the Data Mapper, Bradford council have already published their Priority Gritting Routes and allotments in GeoJSON. Calderdale have added listed buildings in West Yorkshire as GeoJSON. If an organisation publishes a GeoJSON file to Data Mill North it will show up on the map within a day.
Sharing a view
If you've loaded multiple data layers you may discover something interesting in the data, and you might want to share that particular view with someone else. So, by default, the URL stores the map's current coordinates and the datasets that you've loaded. That means you can share a link to a specific view. It also gives us the option of linking to a view of a dataset from its dataset page on Data Mill North or Calderdale Data Works.
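To make the idea of a shareable view concrete, here is a small Python sketch that writes a map position and a list of layers into a URL and reads them back. The parameter names (`lat`, `lon`, `zoom`, `layers`), the base URL, and both function names are invented for the example. The URL format Data Mapper really uses may well be different.

```python
from urllib.parse import parse_qs, urlencode, urlsplit


def build_view_url(base_url, lat, lon, zoom, layers):
    """Encode the map centre, zoom level and loaded layers as query parameters."""
    query = urlencode({
        # Five decimal places is roughly metre-level precision.
        "lat": f"{lat:.5f}",
        "lon": f"{lon:.5f}",
        "zoom": zoom,
        "layers": ",".join(layers),
    })
    return f"{base_url}?{query}"


def parse_view_url(url):
    """Recover the view state from a URL made by build_view_url."""
    params = parse_qs(urlsplit(url).query)
    # parse_qs drops empty values, so fall back to an empty layer list.
    layers = params.get("layers", [""])[0]
    return {
        "lat": float(params["lat"][0]),
        "lon": float(params["lon"][0]),
        "zoom": int(params["zoom"][0]),
        "layers": [name for name in layers.split(",") if name],
    }


# Hypothetical base URL and layer identifiers, for illustration only.
url = build_view_url("https://example.org/mapper/", 53.7959, -1.5474, 12,
                     ["allotments", "gritting-routes"])
view = parse_view_url(url)
print(url)
print(view)
```

Anyone opening the link gets the same centre, zoom, and layers without any state being stored on a server.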
A virtuous circle
One of the advantages of open data for organisations is that there are "more eyes on the data" meaning that errors are more easily found. If there is a way for a member of the public to let you know about an error, you can improve your dataset. Completing the circle in this way is good for everyone.
There are different ways an organisation might get feedback. Some datasets will have a contact email. Others might have a feedback form we can link to. Better still are data sources where users can provide feedback on individual data points. In those cases we can add specific update/edit links to each pop-up bubble, encouraging people to participate and improve the data. In practice we have to deal with all these options but I hope we can encourage publishers to see the advantages of better user feedback.
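One way to handle that mix is to pick the most specific feedback route available for each pop-up. The Python sketch below shows that fallback order. The metadata keys (`edit_url_template`, `feedback_form`, `contact_email`), the `id` property, and the function name `feedback_link` are all assumptions for illustration rather than fields the tool is known to use.

```python
from urllib.parse import quote


def feedback_link(dataset_meta, feature_properties):
    """Choose the most specific feedback link available for one feature.

    Preference order: per-feature edit link, then a feedback form,
    then a contact email. Returns None when the dataset offers none.
    """
    template = dataset_meta.get("edit_url_template")
    feature_id = feature_properties.get("id")
    if template and feature_id is not None:
        # The template is assumed to contain an {id} placeholder.
        return template.format(id=quote(str(feature_id), safe=""))
    if dataset_meta.get("feedback_form"):
        return dataset_meta["feedback_form"]
    if dataset_meta.get("contact_email"):
        return "mailto:" + dataset_meta["contact_email"]
    return None


# Hypothetical metadata, for illustration only.
crowd_sourced = {"edit_url_template": "https://example.org/items/{id}/edit"}
council = {"contact_email": "opendata@example.org"}

print(feedback_link(crowd_sourced, {"id": 1234}))
print(feedback_link(council, {"name": "Allotment site"}))
```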
We already have 84 datasets included in the Data Mapper. Some cover cities or parts of West Yorkshire. Some cover parts of Greater Manchester. Some are national datasets. As our local councils add more datasets, this will increase.
Don't be all things to all people
This tool isn't supposed to replace all the great GIS tools that already exist. It fulfils a limited and specific use case: quickly showing multiple local government or third-sector datasets together. Citizens can start to explore datasets without needing lots of GIS knowledge, just by following a simple web link. Data Mapper doesn't have every bell and whistle of GIS software and neither should it. Other tools do that job better. We've added download links so that people can easily get hold of the data and load it into something more powerful when they need to.