
Project Cygnus - Old house, cold house?

In our blog post last week, we described how we wanted to explore and test the current state of energy-related data infrastructure by building something that fulfilled a need within a Net Zero context. So we chose retrofitting houses to be more energy efficient (benefits to homeowners, reducing the reliance on energy to heat a home, etc). We would need to start with Energy Performance Certificate data.

Of all the fields in EPC data, the most relevant to our task are EPC grades (current and potential) and Energy Efficiency (current and potential). But alongside these (and many, many, MANY more fields) is 'Construction Age Band', which can give us an idea of how old a domestic building might be. Why would we be interested in building age? Well, it's an *interesting* extra data story to explore. Does the housing stock in the UK and Ireland fit the stereotype of older houses being the worst for energy efficiency? Where are the outliers, and why are they different? Has there been a much bigger local drive for energy efficiency in, for example, Barnsley versus Devon?

Exploring data like this can unearth interesting stories, and part of what we do is making data more accessible for folk to explore in their own way. Which translates to 'let's build something!' A map is usually a good place to start as people can engage with a map. In our previous blog post, we had initially thought about a hexmap but that does bring some unique challenges for this project. So we're staying with a geographic map for now, especially as we have new data and new geographies to work with in the form of Building Energy Rating data (equivalent to EPC data) from the Republic of Ireland.

What we did

The domestic EPC data for England and Wales is available to download from the MHCLG's 'Open Data' website, although you need to register to access it and the full download is around 30GB of CSV files. So, we began by downloading an extract of the data for properties in Leeds only, with a view to expanding this in future to England, Wales, and ROI (plus Scotland and Northern Ireland, if they published their EPC data openly!). We then started exploring the data in a Jupyter Notebook, which is very much a work in progress but is available openly on GitHub.

When a dataset has so many fields (90 columns!), it's hard to know what to look at first - which fields are actually relevant? So, we decided to concentrate on a few specific measures for now: current & potential EPC ratings, current & potential Energy Efficiency Ratings, and property age bands. We then calculated average values for each LSOA in Leeds, and compared the number of properties and the property ages in the EPC data with those in the VOA Housing Stock 2020 data. This gave us some interesting insights. For example, in Leeds as a whole, only about 60% of VOA-registered properties have a registered EPC, and that figure is as low as 20% in some LSOAs.
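To give a flavour of the per-LSOA step described above, here is a minimal sketch in plain Python. It is not the code from our notebook: the field names (`lsoa`, `current_efficiency`, `potential_efficiency`), the LSOA labels, the rows, and the VOA counts are all made up for illustration, and the function `summarise_by_lsoa` is a hypothetical helper.

```python
from collections import defaultdict
from statistics import mean

# Illustrative rows only (not real EPC records); field names are assumptions.
epc_rows = [
    {"lsoa": "LSOA_A", "current_efficiency": 55, "potential_efficiency": 80},
    {"lsoa": "LSOA_A", "current_efficiency": 65, "potential_efficiency": 84},
    {"lsoa": "LSOA_B", "current_efficiency": 70, "potential_efficiency": 90},
]

# Hypothetical count of VOA-registered properties per LSOA.
voa_counts = {"LSOA_A": 4, "LSOA_B": 5}


def summarise_by_lsoa(rows, property_counts):
    """Average the efficiency scores per LSOA and estimate EPC coverage."""
    grouped = defaultdict(list)
    for row in rows:
        grouped[row["lsoa"]].append(row)

    summary = {}
    for lsoa, group in grouped.items():
        total = property_counts.get(lsoa)
        summary[lsoa] = {
            "epc_count": len(group),
            "mean_current": mean(r["current_efficiency"] for r in group),
            "mean_potential": mean(r["potential_efficiency"] for r in group),
            # Share of VOA-registered properties with an EPC; None if the
            # LSOA has no (or a zero) VOA count to compare against.
            "coverage": len(group) / total if total else None,
        }
    return summary


summary = summarise_by_lsoa(epc_rows, voa_counts)
for lsoa, stats in sorted(summary.items()):
    print(lsoa, stats)
```

Note that this counts certificates rather than unique properties, so if a property appears more than once in the EPC data it would need de-duplicating first for the coverage figure to be meaningful.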

We've published our work so far as a static website where you can see our working and code, a full explanation of what we have done, and some basic insights. We're currently working on adding some interactive maps to allow users to explore this data too, although this has turned out to be a much harder job than we anticipated!

The idea behind our work so far is to demonstrate what could be done with the data we have available. We've just taken a small subset of the data and tried to make some interesting comparisons, but we could easily expand our approach to look at different parameters: number of solar panels, type of heating, window glazing type - the list is endless.

What next

At this stage of the project, we didn't want to focus too much on the 'extensions' that we could add later. It's important to make something and iterate as we go along. However, it was worth briefly thinking about making connections between energy efficiency retrofitting and 'green funding'. Initial searches for data on things like the UK government's Green Deal scheme (which is no longer running) have not turned up much. Even tools like the UK spending tool or the EU spending tool don't shed much light on funding for sustainable projects.

So is there anything else we could look at?

There are a couple of options available. If we stay with a local perspective, Leeds now has a district heating network supplying homes and businesses, so there is bound to be some data about how many people will benefit, what it means for energy costs, etc. Looking more widely, we can look to Ofgem for open data, with the Energy Company Obligation being one potential area to explore.

We know we are going in the right direction as we have already had someone reach out to us about this exact challenge with EPC data. Local government has struggled to interrogate the data at a suitable level, which is where our exploration and development could help.