Our Data Mapper continues to gain geographic datasets. At the end of last week we added several layers based on data from data.police.uk. Each layer shows crimes grouped by constituency. The first shows the total number of crimes from January to July 2018.
We also have breakdowns for each constituency by type of crime:
- All crime
- Antisocial behaviour
- Bicycle theft - Oxford, Cambridge, Manchester, and London are the hotspots
- Public order
- Theft from the person
- Possession of weapons - Saffron Walden is the hotspot because many of these crimes take place at Stansted Airport
How I did it
I downloaded data from data.police.uk. This covers crimes reported in England, Wales, and Northern Ireland, which means it also includes some crimes that took place in Scotland but were reported elsewhere (for whatever reason). The download is pretty big and comes with a directory for each month. Within each directory there are a bunch of CSV files, one for each police force.
A national dataset like this contains a lot of records. Working out which constituency every single crime is in takes time, and some of that work is unnecessary because several crimes can take place at the same location. So I wrote a little bit of code to combine all the CSV files within a month and group the crimes that took place at the same coordinates. That reduced the number of unique locations by about half, which sped up the next step.
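The grouping step could be sketched in Python roughly as follows. This is an illustration rather than the original script: the function names are hypothetical, and it assumes each CSV has "Longitude", "Latitude" and "Crime type" columns, as the data.police.uk street-level files do. Coordinates are compared as the exact strings found in the files, so only crimes with identical coordinates are merged.

```python
import csv
from collections import Counter, defaultdict
from pathlib import Path


def group_month_by_location(month_dir):
    """Merge every CSV in one month's directory, keyed by coordinate pair."""
    # (longitude, latitude) -> Counter of crime types seen at that point
    locations = defaultdict(Counter)
    for csv_path in sorted(Path(month_dir).glob("*.csv")):
        with open(csv_path, newline="", encoding="utf-8") as handle:
            for row in csv.DictReader(handle):
                # Column names are an assumption about the file layout.
                lon = (row.get("Longitude") or "").strip()
                lat = (row.get("Latitude") or "").strip()
                if not lon or not lat:
                    continue  # skip records that carry no coordinates
                crime_type = row.get("Crime type") or "Unknown"
                locations[(lon, lat)][crime_type] += 1
    return locations


def write_grouped(locations, out_path):
    """Write one row per unique location: a total plus a count per crime type."""
    crime_types = sorted({t for counts in locations.values() for t in counts})
    with open(out_path, "w", newline="", encoding="utf-8") as handle:
        writer = csv.writer(handle)
        writer.writerow(["Longitude", "Latitude", "Total"] + crime_types)
        for (lon, lat), counts in sorted(locations.items()):
            per_type = [counts[t] for t in crime_types]
            writer.writerow([lon, lat, sum(counts.values())] + per_type)
```

Calling group_month_by_location on one month's directory and passing the result to write_grouped gives a single, much smaller CSV with one point per unique location, which is the kind of file that can then be loaded into a GIS.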
The next task was to work out which constituency each location corresponded to. I downloaded a copy of Westminster Parliamentary Constituencies from the ONS's Open Geography Portal and loaded this into the great open source QGIS software. I used QGIS's Layer->Add Layer->Add delimited text layer option to load in the CSV file for a particular month. Then I used the Vector->Data Management Tools->Join attributes by location tool. This found which constituency each point in the CSV file layer was within and then added the pcon17nm identifiers to each row. The resulting table was then saved and the process repeated for each month. A final bit of code then went through these monthly files adding up crimes by constituency (using the pcon17cd codes) before writing the totals as properties of each constituency in a final GeoJSON file. It was then pretty easy to plug this into Data Mapper for display.
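The final adding-up step could look something like the Python sketch below. It is an illustration under stated assumptions, not the original code: the function names, the "Total" count column in each joined monthly file, and the "crimes" property name are all hypothetical, and it assumes both the joined files and the constituency GeoJSON carry a pcon17cd field.

```python
import csv
import json
from collections import defaultdict


def total_by_constituency(joined_csv_paths, count_column="Total"):
    """Sum the per-location counts in each monthly file by pcon17cd code."""
    totals = defaultdict(int)
    for path in joined_csv_paths:
        with open(path, newline="", encoding="utf-8") as handle:
            for row in csv.DictReader(handle):
                code = (row.get("pcon17cd") or "").strip()
                if not code:
                    continue  # point fell outside every constituency polygon
                totals[code] += int(row[count_column])
    return dict(totals)


def add_totals_to_geojson(geojson, totals, property_name="crimes"):
    """Attach each constituency's total to its feature's properties."""
    for feature in geojson["features"]:
        code = feature["properties"].get("pcon17cd")
        feature["properties"][property_name] = totals.get(code, 0)
    return geojson


def write_geojson(geojson, out_path):
    """Save the updated FeatureCollection to disk."""
    with open(out_path, "w", encoding="utf-8") as handle:
        json.dump(geojson, handle)
```

Running total_by_constituency once per crime type, with a different count_column and property_name each time, would give one property per layer on every constituency feature.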