Northernlands 2 - Do cities have access to the private sector data they need?

Description

Jack Hardinges, who leads on the 'data institutions' programme at the ODI, explores the current state of data access in cities.

Transcript

This transcript comes from the captions associated with the video above. It is "as spoken".

Hi everyone, my name is Jack Hardinges and I'm program lead at

the Open Data Institute. I'm really pleased to be pre-recording this

for Northernlands 2020 and looking forward to joining lots

of the other sessions around this one next week.

In this session I'm going to be setting up two arguments in

response to the question of do cities have access to the

private sector data they need?

In terms of an agenda I'll first give a bit of context to the question so

where it's come from and some of the ODI's work on this so far and

I'm going to present the case for yes. Cities do have access

to the data they need, which will feature Richard

Emmott at Yorkshire Water, and then I follow up with some

evidence in favor of no, they don't. So we will be hearing

from Marius Jennings at Bristol City Council as well. And then I

finish with some ideas about what might come next, and open

up for some live Q&A.

I'll start by saying that cities have long built and maintained

infrastructure. We depend on their roads and bridges such as

these nice ones belonging to Hong Kong, as well as things

like their communications networks and utilities.

And given that I'm speaking at a part ODI Leeds event I perhaps

don't need to make the case that

cities are an important lens to use to talk about infrastructure

and responsibilities about building and maintaining it.

At the ODI we talk about data infrastructure, so to

us, data infrastructure includes the data itself.

So statistics, maps, sensor readings as well as the

people, processes and technologies that enable it

to be used. We think that data infrastructure will only

become more important as our economies and our societies

become ever more reliant on getting value from data and

from a cities context as more and more people end up

living in urban environments.

As is similar with physical infrastructure, almost

everywhere a city's data infrastructure consists of data

that's managed by public institutions, so housing

authorities and transport authorities as well as data that's

collected and held by the private sector organizations

that operate within them. So when we think about private

sector city data, we talk about both data generated by

businesses that work with cities to deliver public services as

well as those that provide services directly to their

citizens, visitors and others.

I guess one of the underlying arguments for lots of our work

on this and related topics is that we strategically plan, fund

and govern our physical infrastructure to varying

degrees of success, but we're not necessarily treating

or thinking about data in the same way right now.

So last year we set up small research projects as part of our

R&D program to explore the current state of city access to

private sector data, and we set out to find out three things

really. One of which was do cities have access to data held

by private sector organizations, and how do they access that? So

what types of processes in what kinds of ways? And then also

what are the challenges to cities doing this? So what are the

barriers? If we are right that we're not necessarily thinking

about city data infrastructure in the same way as physical

infrastructure. I should say that when I'm talking about

cities in this context, I'm using it as shorthand for city

government and public sector bodies as well as local

organizations, communities and people. So it could be

researchers, startups. So when I say cities I'm meaning something

quite broad, deliberately here.

As you might guess we would, we tried to do some of this

research in the open, so this is an open document that we set up

for people to contribute to and use as we went. It includes some

really interesting examples of cities accessing data held by

the private sector and what we've done is

set up this matrix, which kind of clusters them around the

purpose of the use. So is it to support innovation in a

particular place, or is it more about transparency and

accountability? We've also set out the role of cities or the

approaches they used to access that data. So are they collaborating

with the data holders? The firms that have collected it?

Or are they having to, in some cases, legislate for access to it?

So it's a real mix of different approaches in different examples.

Been really pleased by the contributions from the

community on this - it's been quite a popular open document, so if

you've added to it and are listening, thank you very much.

So onto the question and the main body of the session today.

So do cities have access to the private sector data they need?

I'm going to be using examples from the open document I just

shared to present the case for both yes and no in response to

this question. And I'll say it now rather than leave it to the end:

obviously the answer is it depends, and in particular it

depends on what data you're talking about and what city. But

bear with me while I present these binary polar options; it's

deliberate and I hope it's a

useful way to surface some interesting and concrete

examples of what's happening or not across different cities.

Let's start with yes.

So firstly we came across some really interesting examples of

companies producing or commissioning research about their

impact in particular places. So this is an extract from one of

Airbnb's Insight reports related to the West Midlands, so I think

this is evidence of recognition from some organizations that the

data they hold and collect about a particular place has

value to it and should be accessible to people and organizations

other than just themselves. This is my evidence point for that.

Alternatively, cities can turn to some of the more

specialist data providers for access to data about things like

the movement of people. So this is a footfall graph from a

company called SafeGraph, who generate these through access to

cell phone activity data. This might be particularly useful to

a city looking at the impact of COVID-19 on movement and the

local economy. Separately, there's some cool examples of

cities working with the private sector to collaboratively

maintain and produce data, and based on our research in

particular, lots of maps. So this is the Greater Manchester

Open Data Infrastructure Map, which includes data from the

private sector. So it has things like cycle network data and the

layer that I've got here is Zoopla property data.

Another example is this one from Amsterdam, so this is an attempt

by the City Council to build a sensor registry that

encourages both government departments and companies to

upload the location of cameras and sensors in that

particular city mainly for transparency purposes so people

know when data about them is being collected and where.

And lastly, on the case for Yes - we came across some quite

sophisticated attempts by companies, particularly in the

transport sector, to develop services and tools that provide

quite enriched access to data that their services have

collected. So this is a screenshot of Strava's Metro

service that enables city planners to access data to

better understand trends in movement and to inform their

planning and decisions about investment in transport

infrastructure. I just chose this example from Louisiana as I

like the name of the place.

And just to say, this isn't always about static data and

maps for policy making and planning. This is a

picture of Transport for London's operations room and

the example that we came across was it now ingesting

real time data from the mapping service Waze, which it uses for

operational traffic management.

And I'll cut to Richard for his view on this.

My name is Richard Emmott. I'm director of corporate affairs

for Yorkshire Water, and I've been leading our open data

policy since 2018. I think there is a lot of potential for more

data sharing from the private sector within cities, although I

think it's come a long way, particularly in the fields of

transport. From a Yorkshire Water perspective key areas of data

that we've been looking to share with other public service

providers have been things like information on flooding, how our

treatment works are operating, and other issues around how we

employ people such as gender diversity and how the workforce

of our organization is

constituted. The way we see that as a possibility is that it

means that the data sharing process can enable us to

develop really strong joint partnerships with our public

service provider partners to improve service within

the cities and to provide a much more integrated data landscape

for the city.

So the first stage of what we've done to share data at Yorkshire

Water is to participate in Data Mill North, and we published a

whole range of data sets on there in a fairly raw format.

But obviously that's quite static data, albeit that it

has made our information accessible alongside some of

the key public service partners. The next stage of that which

we're about to launch shortly is to run a series of consumer

customizable dashboards, so that what you can do is kind

of take out our raw data feeds from the website

and drill down into really quite local granular level to

understand and analyse our performance on some of the key

issues within really quite small individual

communities. We think that's a really exciting possibility and

it starts to kind of create a different relationship with the

public in terms of accountability and transparency.

So we're really keen to get that launched and see what sort of

public response we get to it.

There are three real benefits for Yorkshire Water ourselves,

and that's transparency, accountability, and potentially

innovation. In terms of transparency, we are, we provide

an essential public service and the public have a genuine

interest in finding out how we're performing. So it's

important that we're completely open about both the good and the

bad of our performance. That's why, for example, we published

five years' worth of data on our pollution performance, some of

which doesn't make for great reading. It's important that

people understand that.

And that leads obviously to accountability. I think where we

have to do more work is around how we can collaborate with

other partners, potential service providers, on how they

can use our data to innovate and give us some different

approaches to things like leakage or flooding.

In terms of partnership with local authorities I've always been

very conscious that taken on its own our data is useful, but

it's much more useful when you put it alongside other data, say,

from the Environment Agency.

So if you've got some of our data on drainage planning and

flooding, put that alongside some of the Environment Agency's

data, say on river levels, and you potentially get a much more

dynamic and rich data set that's a hell of a lot more use

to potential data users. I think most organisations see the

barriers to data sharing as being in two sectors.

One, there's a big concern about cyber security and data

integrity and the whole GDPR

regime has been a challenge, and certainly that came in

at the very early days when we were starting down our open data

journey. Likewise, cyber security. Clearly there's also a

commercial sensitivity around access to some data, and there's

also a general nervousness I think about opening the inner

workings of an organization up to the outside world. I think

what we found when we've started doing things is that

to some extent, we've had to be fairly cavalier about some of

the rules except for GDPR which we've respected very

carefully. But I remember a conversation about potential

cyber security data and some of our water treatment works, and I

discovered that the information that we were not supposed to

release was actually already publicly available on Google, so

it was particularly pointless to sort of maintain that

restriction. So I think I'd recommend the approach that we

took, which was to publish some data, see what happens, if the

sky doesn't fall in and you find something interesting out as a

result of it publish some more. Build your sort of risk

tolerance as that happens, and your skill and expertise in

doing it. And also kind of build the richness of your

engagement with potential data users along the way. And

gradually they'll start to educate the organization that

this is actually a very powerful and interesting thing

to do, rather than something that is challenging or difficult.

And now onto the fun part: the argument for No.

And to start with and I don't have a screenshot for this, but

these are two quotes from our research that we found

interesting, that kind of show the extent of the hacks and

workarounds that cities are having to put in place just to try to

access data and then to use it to inform their decision making.

So the first involves pinging Uber and Lyft's APIs, which is

interesting as a form of data collection, to say the least.

I'm not sure how robust that is, and the second is an attempt to

essentially replicate the datasets that these kinds of

rideshare companies already have, and to me creating that kind of

shadow data set feels like fairly unnecessary

effort that shouldn't necessarily need to be expended.

Again, no screenshot, but a quote. This is another that we came

across that has stuck with us. This is a description of a

particular company's view of

Copenhagen's data platform. Describing it as parasitic and of

no real benefit to the companies the city would like to

contribute data to it has stuck with us, and some of our further

research has really validated this. In particular a piece of work

that we did in 2018 which found that as a city or borough in the

UK it was nigh on impossible to find out how many properties

were available on Airbnb or short term letting sites in a

particular area. Essentially, the data infrastructure didn't

exist. It wasn't the case that it was broken, it just wasn't

there. And this screenshot looks great, but it's not, and I'll

explain why. So, because the short term letting platforms

don't share data with anyone robustly, or if they do, it's

for very limited research purposes, this is a service that

sprung up called AirDNA, that systematically scrapes these

platforms and packages up the data like this so you can derive

different kinds of stats and analysis from it.

But not only is this on fairly dodgy ground from a

licensing point of view in terms of whether or not the

data can be scraped from the platforms in this way, if you

come to any conclusions based on this data then the

platforms tend to dispute it, because they argue that

the data is inherently inaccurate and doesn't match up

with their own internal records. So there's

real limitations to what you can do with this.

Lastly, on the argument for no, cities don't have access to the

private sector data they need, I point to the various live battles in

court that we came across in our research, especially in the US

between organizations like Uber and Lyft and city transport

authorities over access to data as evidence of it not being there,

that infrastructure not being developed or governed in the way

that we need it to be.

So since 2017 New York has made access to data part of its

licensing conditions. So in order to operate as that

particular kind of operator, then you need to provide monthly

access to data that you've collected. And others seem to be

following suit, and there's some controversy over whether or not

some of these attempts by cities constitute over-reach and the

privacy implications of those.

And as well as cities, drivers on these platforms are taking legal

action to access data that these types of platforms hold. So this

is a screenshot of James Farrar's data, who's an ex Uber

driver and is taking them to court in part over fair access to

data that the platform holds about him in order for him to

build a case or at least understand his employment and

the way that he was directed by the platform over a

particular period of time.

Now I'll cut to Marius.

My name is Marius Jennings and I work for Bristol City Council as

our urban data lead. So my work is really seeing how we can work

with internal partners and

external partners to make open data available to support

our understanding of our city challenges and to promote open

data in general.

On reflection

I don't feel that cities currently have enough access to

private data to tackle the range of challenges they face, and I

think that's particularly evident at the moment. We are

facing some of the most momentous change ever. And

some of the most ambiguous situations that cities have had to

cope with. And this has all happened really, really quickly.

And this means that we're going to have to start doing things

in different ways. We're going to have to start looking at our

macro environment and

seeing how do we experiment and what do we need?

And that's a real challenge, because a) it's identifying

what is the information that we need internally and

then b) how do we actually go about getting that data

that may not be available?

So there are a couple of initiatives that Bristol City

Council are exploring.

One of them came out of the ODI open cities workshop that we

held in December 2019 and what that really gave was an

opportunity for us to bring together a range of

stakeholders to see what data is available and really map our

ecosystem. And understand what are the ethical considerations

when you're getting data from different parties and how do you

share that information,

understanding in what context it is suitable to share that data?

So some of that data on the data spectrum may be freely available

that we would want to share with anybody. Some of that might,

however, be quite confidential and not appropriate. And how do

you best go about it

in a legally compliant way where you are seen as a

trusted provider? One of the things that we're looking at in

Bristol City Council is working with our procurement team so

that as new contracts are put into place, we look at social

value, and ask private sector companies to provide data that

is not necessarily

commercially sensitive with the council to kind of help shape

our understanding of our city

and provide that data in meaningful ways to our citizens.

And another initiative that we are involved in is called the

Bristol One City Plan, and it's the city's aspirations for where

we want to be by 2050, and what that really does is it moves

from the council taking ownership for those goals to

rather a collective co-design piece where we work with city

stakeholders to make the city a

fairer place while

making it a cleaner, more economically viable city, and

what that means is that we're having various KPIs that we

need to look at on a yearly

basis. And working with our city partners so that we can make

that information available to monitor how we're progressing.

So unfortunately there are a number of limitations around

sharing data, in terms of

the private sector sharing with the local council, so that it can

be used in an effective manner and drive the best value.

So some of those things are a) Is there a business case to do

this? At the end of the day often councils don't necessarily

have the data science or the GIS or business intelligence teams

who can devote the time to

take data that's given and really extrapolate the value.

It often means that there isn't the understanding from private

sector organisations on the sort of data that you're asking for,

and that may mean that there's quite a low risk appetite where

they're not necessarily engaged because they're not seeing the

value of that.

There may be user agreements that are in place that hamper

the availability of data.

There may be

potential commercial sensitivities around the data.

So there are a number of things that have to be taken

into account and addressed before you can do this in an

effective manner.

So I do have a number of recommendations for improving

the situation, and particularly about how we get to work more

effectively with private sector

organisations. I think one of the main things is education.

It's been going out there and really talking about the benefit

of providing that data.

Easing the private sector organisations' concerns

and showing that often

customers are really interested in knowing

how they're doing and

if they are having that interest and that honesty and

transparency, that's often quite

well received. And though they may not be performing as well on

some of the issues, it doesn't mean that they're going to be

criticised for that. If they do it in quite a transparent way.

I think there also needs to be a better understanding with

organisations and also councils around GDPR. Often that gets

utilised as an excuse.

And it doesn't necessarily need to be, particularly if you're

looking at more open data, or the data is being shared in a

compliant way that is taking

into account concerns.

I think there are also circumstances where councils

will automatically have powers to access certain data,

like if there was a major incident. That often means that

different organizations come

together to enable the city's civil contingency plans. So some

of this work does happen in the background, but people may not

be aware of it.

I probably would also say tackle one thing at a time.

It's complex. And if you have already got existing

relationships with private sector organizations, spend the

time with them, understand what

they're doing. For example, we work quite closely with Bristol

Waste. To be fair, which is a Bristol City Council owned

company but really what they do is share how they're performing

and how they are able to

recycle more each year and that enables us to understand

how Bristol responds to the climate emergency, where we're

trying to be carbon neutral by 2030. I think if there can be

seen to be mutual goals being aligned by this data sharing, people

might be more engaged to do it.

Also, I think you need to have quite an honest discussion

internally and review what capacity you've got to utilize

this data because there's no point in going out there and

getting it and then finding that you're not actually able to

interpret effectively or share effectively. There needs to be

support the local council's giving to the

external organization so that that data is used in

the most effective way.

And so to wrap up: what next? And from my point of view, the real

conclusion from our work so far is that while there are some

really promising examples of open collaborative access to

data between cities and the private sector, this isn't

spread evenly and there remains a lot more to be done.

This is actually the first time we presented some of our

findings from this research, and so ideally we'd like to go

further and, if it's useful, to produce some guidance for cities

or for companies thinking about this issue based on the things

that we've come across and what might constitute best practice.

Even better would be a pilot or two or some practical work in

this topic. So if you are one of those two groups, then do

let me know if you're interested in talking to us and potentially

working with us. For cities we also run an online workshop

that came out of the project

that I described, where we use tools such as our data

ecosystem mapping tool and the data ethics canvas to help

cities understand current gaps in their data infrastructure or

to identify the ethical implications of increased data

access or a new data project.

And for private sector data holders, we actually have a

series of webinars at the moment that are working through similar

topics to the one that I've covered today. So do have a

look on our website for those if

they're of interest. And I'll stop there to take any questions

you have or ideas you wanted to share. Thank you for listening.

  • Jack Hardinges

    Programme Lead - Data Institutions
    The Open Data Institute

    Jack is Programme Lead for Data Institutions at the ODI. He is responsible for the strategic direction of the ODI’s work to explore different approaches to data stewardship, leading the delivery of related projects and building partnerships with other organisations.

    Recently, his work has focused on involving the public, patients and other stakeholders in the stewardship of health data, and the flows of data between cities and private sector organisations.

Sponsors

Northernlands 2 is a collaboration between ODI Leeds and The Kingdom of the Netherlands, the start of activity to create, support, and amplify the cultural links between The Netherlands and the North of England. It is with their generous and vigorous support, and the support of other energetic organisations, that Northernlands can be delivered.

  • Kingdom of the Netherlands