Northernlands 2 - Why is collaboration around data important in 2020 and beyond?
Leigh Dodds shares his knowledge and experience about data collaboration
This transcript comes from the captions associated with the video above. It is "as spoken".
Good afternoon everyone and welcome to this Northernlands
session on collaborating around data. My name is Leigh Dodds.
I'm the director of delivery at the Open Data Institute.
I'm going to introduce the session, put a bit of context to the
discussion and then we'll hear from 2 speakers: Sam Milsom from Open
Data Manchester and Lindsey Marchessault from the Open
Contracting Partnership. So the question that we are considering
in this session is
"why is collaboration around data important in 2020 and beyond?"
Well, we're living in
interesting times. We're dealing with a global pandemic.
Whilst being faced often daily with the issues and
problems of unethical collection and use of data.
All set against the backdrop of a rapidly changing climate.
So we have some big challenges to face in our
societies, in our economies, and in the environment.
Data can help us to address those challenges, but only
if we are open and transparent in how that data is being
collected and used.
And only if that data is fairly representing the communities
that will be benefiting from it or be potentially impacted
through its use.
And we need to ensure that data is being used in responsible and
ethical ways. And to do that we need to collaborate. We need to
work together to try and maximize the value that we can
create by using data whilst minimizing its potential harmful
impacts. And it's only through that that we will create a world
where data works for everyone.
So in this session, we're going to look at collaboration around
data in two ways, firstly around collaborating on its collection
and maintenance. So collaborative maintenance. And secondly around
its publication and use.
So let's jump into the first of those - collaborative maintenance.
Now when we're typically introducing open data, we focus
on the definition that open data is data that anyone can access,
use and share.
And while that is true and useful, it doesn't get into the
practice of open data which, even in 2020, is still largely
around public and private sector organisations publishing data
that they have already collected, so increasing access
to it to allow others to use the data in new ways. But it doesn't
have to be that way.
We can work together to collect and maintain data
and there are many projects that exist that have explored that
approach, so I think it's important to recognize that
collaborative maintenance is an option that communities can work
together to collect and maintain data, which is then made
available under an open licence.
When we look at the kinds of work that is involved in
maintenance, it turns out that we can share quite a lot of it.
We can decide on what data should be collected in the first
place and how it should be
stored and managed. We can share the work of collecting
and maintaining the data so it can maintain its use and
value over time. We can work together on managing access to
data and improving its quality.
And we can collaborate on building the tools and the
communities that can help us to unlock value from that dataset
once it's been collected.
So there's a range of different activities we can be involved with.
So let's look at a few
brief examples. So the first and perhaps most widely known is
OpenStreetMap, which is a very successful collaborative
maintenance project. Within OpenStreetMap, one particular
initiative is the Humanitarian OpenStreetMap Project, which
brings together local mappers working on the ground with
remote mappers, working with satellite imagery
to help to improve responses to disasters around the world.
By working together, those mappers can rapidly update and
improve the map in areas where humanitarian aid is needed to
ensure that aid is directed in the most useful and most impactful ways.
Another example would be the Mozilla Common Voice Project
which is a project to try and develop a more diverse
and representative data set that can support the development
of voice recognition applications.
So the project encourages people to contribute their
voice to the dataset and get involved in transcribing
recordings that have been contributed to it.
The third example we can look at is the work of Open Data Manchester
in Stockport. So let's hear from Sam Milsom about that project.
I'm Sam, I work for Open Data Manchester. We're a
not-for-profit formed from a diverse group of open data
advocates back in 2010, and we support organizations to release
data and we help people to use it. We build and support good data
practice through expert advice, advocacy, events, research.
That sort of thing.
So back in 2019, as part of the ODI's open GIS call, we ran
a project which we called Mapping Mobility Stockport which was a
program about trying to map the experience of people with
mobility and accessibility impairments. Looking at the routes
and the strategies that they used to travel in and around
Stockport Town Center.
Quite often these sort of communities can be overlooked in
town planning, and you know whether that's because the urban
environment deteriorates over time or just doesn't account
for their experience. So it could be things like not having
dropped curbs at the pavement at busy crossings, for example.
These things, historically, can
be overlooked. And again, moving forward in terms of planning and
redeveloping the Town Center.
Quite often again, these things are quite hard to sort of map
and to take into account often because the GIS systems
themselves don't. They have a very sort of basic way of
describing these things like drop curbs or tactile paving,
but they don't take into account the vast range of experiences
that people have, so they might
not take into account quite acute neurological conditions,
for example. So how do you sort
of plan for that? So.
With Mapping Mobility Stockport we had some funding
from the ODI in which we sort
of developed a methodology for capturing and mapping this data.
So we worked with Stockport Town Center and some community groups
in Stockport, so Disability Stockport and also Age UK
Stockport, and we ran some workshops with them where we got
maps of the Town Center from the council and we literally just
started scribbling on the maps the routes that people took to
get where they needed to go around the Town Center and
started to identify
problem areas, but we also started to identify the routes
that people took regularly. Because quite often people do
have their own routes and strategies to move around the
Town Center, but these kind of remained tacitly within those
communities, so it was a really good exercise for us and for
Stockport Council to be able to actually map these routes and to
think about why do people take these routes? Why not this
particular route when it's more direct? Maybe because there's
you know it's too steep for a wheelchair to pass down safely.
That kind of thing. So it was a really, really good exercise for
Stockport Town Council because they were well,
redeveloping their, doing a lot of redevelopment of the Town
Center. The bus stop they are currently redeveloping. So a lot
of the experiences of the bus station area we... some of
the information that we captured has fed directly into the
redevelopment of the bus station.
So making it more accessible.
With the workshops themselves, yep we started off doing
workshops where we just scribbled down problem areas.
But then we also went out onto the streets and conducted
walkabouts; one-to-one walkabouts with people, so wheelchair users,
people with visual impairments, the elderly, and let people just walk
around with a dictaphone and taking photos and we just took the
routes that they took and got them to describe their experience.
And in the end we were able to build a sort of
actual digital map
of these routes and of the good areas and of the problem areas.
And as I say, the project was really, really successful.
Stockport Town Center you know have used the data that we captured
and mapped in the redevelopment of their bus station, but
actually moving forward. At Open Data Manchester we've been
thinking about how can we capture these experiences?
Because there are so many more experiences and I think that a
lot of GIS systems don't have
the... let's say the language or the structures through which we
can describe these experiences. Quite often you're looking at
urban areas from the point of view of the buildings or from
the streets rather than the experience of the human being;
the embodied experience of the human in that space. So from the
Mapping Mobility Project what we're hoping to do is we're
actually going to take the data that we collected with these
groups and start to build a framework. A sort of really
basic. I suppose trying to look at is there a... can we create a
sort of basic standard for describing experience, so
describing and mapping areas from the human being, the human's
perspective, and from that it can also incorporate a multitude
of different experiences. So we're actually hoping this year
to start developing that project further. So taking the data and
the stuff that we collected and developing a kind of prototype.
Whether that's going to be a
questionnaire or some kind of standard. We're going to sort of
hopefully work on that and develop it further, and to see
whether we can, you know
yeah, create a kind of a standard or a framework for
capturing more people's experience of moving around
urban areas which we think could make town centers, city centers
much more inclusive and just a lot safer and easier for all
people to move and travel around and live in.
So thanks, Sam. What's really interesting to me about that
project is not only were they able to create some local
impact by working with the local community on collecting
missing data, it's also led to some reflections about better
ways to collect data in future so that more... it's more
representative of those communities.
That's something that we're really interested in at the ODI.
We recently published a guidebook to help people design and
build systems and tools that
will support collaborative maintenance.
We'd really love to get some feedback on this, so
please take a look and let us know what you think.
So let's move now onto the second way in which we can think about
collaboration. So collaborating on the publication and use of
data. And again, I think it's illustrative to look at what is
happening in the open source movement and think about how we
might apply that to open data initiatives.
So some of the more successful open source projects
have recognized that in order to be impactful, they need to get
people to contribute more than just code and bug fixes.
They need a wider range of skill sets. They need people to help
produce documentation. They need user researchers to make sure
that the software is well designed. They need designers to
help improve the user interfaces and the user experience of using
those tools. So there's been a move to try and get more
people involved in a whole variety of different ways.
I recently came across this framework called BASEDEF that
had been developed by a number of people in the open source
community. It tries to spell out a variety of ways in which we
might contribute to
an open source project and in helping it achieve some impact.
So we might blog about it to help other people
understand how it could be used. We might try it in a project so
that we can provide useful feedback about how it might be
used. We can file bugs. We can contribute bug fixes, but we
could also just work to try and improve the documentation to
help other people, perhaps with less technical skills to get
some value from the software.
Now, what if we applied the same framework to open data?
We could write about new datasets as they're published to
help others to understand both the value of the data, perhaps
also its limitations.
We could write tutorials and produce code that can illustrate
how a dataset can be used, perhaps to show how it could
be used in specific tools, or to answer specific types of
questions. We can give feedback to the publisher, maybe to
suggest how they could improve how the data has been published,
perhaps to align it better with other datasets, or to adopt a
specific set of standards.
And through the freedoms that are given to us by open
licences, we can enrich and improve those datasets. Perhaps
to add in additional data or to link it with other sources.
And increasingly, we need to make sure that data is well
documented so we can all contribute to that. We can help
to fill in the gaps and support publishers in making sure that
their data is well described and
well published. Increasingly, there's a... there's a range of
open data initiatives that are realising that this kind of
broader support is necessary in order to achieve impact.
One example of those is the Open Contracting Partnership.
So let's hear from Lindsey about the work
that they've been doing.
Hi, my name is Lindsey Marchessault and I'm the Director
of Data and Engagement at the Open Contracting Partnership
where we support governments and other stakeholders around the
world to improve their public services by improving their
public contracting. Public spending on contracts adds up to
more than 10 trillion dollars of public investment per year. So
it's very important to get it right. And by that I mean value
for money for government. Fair access to opportunity for
business and high quality goods, works and services for everyone,
which includes everything from buying textbooks to maintaining
power plants. Unfortunately, in many cases, we don't have the
data that we need to ensure an effective procurement process
or to diagnose the performance of the system as a whole.
These data challenges might be because the data is not
collected, not digitized, or because of poor data quality.
Unusually, there's an element of collaboration in this.
I'll give an example. Currently in the COVID-19 emergency response,
many governments are currently purchasing or have purchased
ventilators. And the process to effectively buy ventilators
involves a lot of needed data as input. This includes
understanding what is our current stock and supply of
ventilators. Where are the ventilators? Are the ventilators
functioning? Are they the appropriate ventilators that we
need? What do we expect in terms of the amount of ventilators
that we need and when we're going to need them and where
we're going to need them? What are our best predictions and our
best advice? Finally, what do we know about the market for
ventilators and where are they
available? What are market reference prices and who might be able to
supply these things. Without the process to understand and
collect this data, digitize it, and use it effectively,
the quality of the procurement outcomes will be affected. And
that's just one example of how data is important in all
different types of public purchases at all different
stages. Whether that's the planning of the procurement,
the procurement process itself, or how we want to effectively
manage that contract so that it delivers value for what was
needed and intended.
So this situation is why we support partners to collect,
publish, improve and use open data about the planning,
procurement and implementation of public contracts.
And to do this, we maintain a variety of tools and guidance,
most notably the Open Contracting Data Standard.
And we provide both policy and technical advice, including on
change management strategies. And we also provide a free
global help desk for the Open Contracting Data Standard in
both Spanish and English.
Over the years, we've supported hundreds of partners from dozens
of countries, and we are working to build an international
community of Open Contracting
practitioners. And one thing that we've learned is how
important multi-stakeholder collaboration is to
achieving measurable results. Collaboration has
been even more important in 2020, as the entire open
contracting community has focused on ensuring the
effectiveness and accountability of the
COVID-19 response procurement. So this is to
ensure we have the necessary medical equipment,
facilities and services to address the crisis.
And I'll share one example.
The government of Paraguay, through their procurement agency,
wanted to ensure transparency of the COVID-19
emergency purchases. Because of the emergency, government
agencies were allowed to buy quickly without a competition,
without an advertised process. But just because something has
to be done quickly doesn't mean that it has to happen without
accountability, and so our helpdesk advised them on how they
could use the Open Contracting Data Standard to identify the
COVID-19 related procurement and ensure that it was referenced
to the COVID-19 emergency budget lines as well. Our
engagement leads for Latin America also worked closely with
the government and users of the data to ensure that the data was
being presented to people in a way that they could use it, and
one way that this happened was by a tool developed by the
Inter-American Development Bank that we advised and supported on,
which reused the data and combined it with other
information including budget
graphical data and something called an "investment map" and the
data included information on the buyers, the prices, the suppliers
and information about the items and services being
purchased and their delivery.
And this data is being used by a wide range of stakeholders
within the country, including by the government to improve their
own processes. By business to understand the market and the
opportunities. By civil society, academia and media. In fact,
journalists have reported about several irregular procurements
in the media, including a case in which large volumes of
tonic water were purchased at roughly five times the market
price as a COVID-19 emergency procurement, and as a result of
this case coming to light
the government issued a new requirement that buyers must
report and publish their market reference prices to
justify the prices they're paying, which will hopefully
avoid other inflated emergency
procurement. So at the Open Contracting Partnership, we're
also supporting similar efforts to publish and use procurement
data all around the world. Some specific to the COVID-19
response, but also starting to prepare for the coming spending
associated with the economic recovery. And, for example, we
recently launched an action research program where we're
supporting researchers from 12 countries to use data and
investigate the effectiveness and integrity of the COVID-19
emergency response procurement. And we hope that their findings
will lead to similarly
actionable policy recommendations which can be
implemented through collaboration with government.
I'll stop there, but I'm happy to answer any questions
about our work, our tools, or our support.
So thanks, Lindsey. Again, I think there's something really
powerful there about demonstrating how, by bridging
between the publishers and the users of the data, the
partnership has been able to increase the impact, increase
the value that has been gained by sharing that data in the
first place. So to return to our original question,
it's because we need to work together to
get more people involved in understanding how
data is being collected. Get more people involved in
maintaining that data over the long term so that we have a
stronger data infrastructure. And we need to work together
as communities to make sure that we can maximize the value
of that data while minimizing its potential harmful impacts.
And through that work we can create an open, trustworthy,
So thank you for listening. If you'd like to learn
more about the work of our speakers or the work that we
have done at the ODI, then please follow the links in the slides.
Thank you very much.
Director of Advisory
The Open Data Institute
Leigh has 17 years of experience working in a variety of technology-focused roles including software engineer, product manager, technical consultant and CTO. Leigh spent 10 years working in the publishing industry dealing with data integration and management issues. At Publishing Technology, whilst CTO of their scholarly division, Leigh was responsible for designing an innovative publishing framework based on semantic web technology. On moving to Talis Group in 2011, Leigh became programme manager for their “Data as a Service” products, overseeing product development and launching their Linked Data consulting business.
Recently Leigh has been working as an independent technical consultant helping businesses explore technology and best practices that support the integration and publication of Linked Data and Open Data. Leigh has worked with a variety of organisations ranging from small startups through to large multi-national businesses. Leigh also enjoys writing and has worked as a freelance author publishing articles and training materials for O’Reilly and IBM.
Northernlands 2 is a collaboration between ODI Leeds and The Kingdom of the Netherlands, the start of activity to create, support, and amplify the cultural links between The Netherlands and the North of England. It is with their generous and vigorous support, and the support of other energetic organisations, that Northernlands can be delivered.