A good talk about good data - An interview about data ethics with Sam Gilbert

As part of our data ethics project, we're reaching out to pick the brains of those doing interesting and exciting work on the topic, to see if their perspectives can give us a better picture of what's going on and what to focus on. We do this through a series of casual interviews with people involved in the area. Louis Connell and Jennifer Cheung conducted the first of these on Tuesday, with Sam Gilbert of the Bennett Institute for Public Policy, author of Good Data.

Who is Sam?

Sam's background is in data-driven marketing, having worked as Head of Strategy and Development at credit expert Experian and as Head of Consumer Finance at Santander. He was Chief Marketing Officer at Bought by Many, a FinTech company which used web search data to tailor its product offers to unmet needs. He recently published his book Good Data, which emphasises optimism against a background of concern about privacy and surveillance capitalism. In his own words, he wrote it partly out of nostalgia for a period of the digital revolution when people had a much higher awareness of the enormous benefits digitisation could and would bring to our lives. He believes in optimism and hopefulness in this regard, and we do too!

What did we talk about?

Our conversation started out with a general discussion of what data ethics means to different people and what some of the most frequently asked questions in this area might be. Data ethics is a contested topic: some consider AI, and the ethics surrounding its use and development, to be the pivot on which the conversation turns. Others believe the concept really concerns the rules and understanding we need around data collection, sharing, transfer and, ultimately, data's use as a tool.

There's quite a well-known academic definition of what data ethics is, advanced by Luciano Floridi and Mariarosaria Taddeo:

A new branch of ethics that studies and evaluates the moral problems related to data (including generation, recording, curation, processing, dissemination, sharing and use), algorithms (including artificial intelligence, artificial agents, machine learning and robots) and corresponding practices (including responsible innovation, programming, hacking and professional codes), in order to formulate and support morally good solutions.

As useful as that definition might be in theory, data ethics is a constantly evolving topic, with a nebula of different connections and branches. So instead, Sam says, we should focus on finding out what the most important questions are. We find these by analysing as much academic work and commentary as possible and extracting the general themes.

So, what are the big questions?

Of course, one of the biggest concerns in this area relates to privacy and surveillance. This is perhaps what most people think of when asked about data ethics - the Big Brother idea of Google tracking you through your maps. Shoshana Zuboff's bestselling book The Age of Surveillance Capitalism sums up these concerns. She argues that the mass digital tracking conducted by platforms and companies in order to sell advertising has a significant detrimental impact. Essentially, she believes there is something inherently wrong about collecting, storing and using personal data. While the control that big data platforms have over datasets is a genuine concern, this sort of thing may get too much airtime: Sam tells us he thinks the fixation on surveillance capitalism can draw focus away from more tangible, material threats to people's wellbeing, such as bias and discrimination in machine learning and datasets.

We firmly believe that data should be allowed to circulate and be used for positive public good. We want, as Sam puts it so succinctly, "correlative public duties between people and organisations". For this, we need to find and develop positive, practical solutions, rather than just focusing on risk mitigation. But, as Sam points out, the focus on privacy and surveillance means the prevailing trend remains in the direction of the latter.

What are some examples of practical action?

Sam's favourite example is simple yet effective, and could be employed by anyone. It is the analysis of search data made public by Google through Google Trends, which shows general trends rather than targeted personal data. Businesses can use search data to identify unmet needs to their advantage, such as when Sam's FinTech Bought by Many identified a need for pug pet insurance. ScrewFix, as Sam identified, used the same approach to find out why their product named "indoor lighting" wasn't selling very well. People were in fact searching more frequently for kitchen lighting - a simple renaming helped the business, and helped people find lights to host dazzling kitchen parties. Further, Bill Lampos developed a predictive model for covid cases based on searches for symptoms. This is now used by Public Health England (PHE), and predicts case levels and surges 17 days in advance. Google and Bing have done a lot to make their search data available, and this is a big positive.
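The ScrewFix anecdote can be sketched in a few lines of code. This is a minimal, illustrative sketch only: the weekly relative-interest scores below are invented numbers in the 0-100 style that Google Trends reports, and `preferred_term` is a hypothetical helper, not anything ScrewFix or Google provides.

```python
# Hypothetical weekly relative-interest scores (0-100), in the style of
# Google Trends output; these values are invented for illustration.
weekly_interest = {
    "indoor lighting": [12, 15, 11, 14, 13],
    "kitchen lighting": [48, 52, 55, 50, 49],
}

def preferred_term(interest):
    """Return the search term with the highest average relative interest."""
    return max(interest, key=lambda term: sum(interest[term]) / len(interest[term]))

print(preferred_term(weekly_interest))  # -> kitchen lighting
```

The point is simply that aggregated search data can reveal which wording people actually use, so a product can be renamed to match demand without touching any personal data.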

There are also some more ambitious ideas: data unions and data trusts try to reclaim choice and ownership over personal data, with individuals getting to decide how and when their data is used. These initiatives are very new - just a few options among many possibilities - and they are not perfect. For example, Sam makes some very good observations about the nature of data unions. Data is, and should be, optimised for society. Monetising data by and for individuals might not produce optimal outcomes for society, since it treats data as private property. This idea of private ownership of data may also pose legal and philosophical issues.

Data trusts might help us pursue the collaborative data sharing idea. As Sam says, if we want to control how, when and why our data is used, that requires constant decision-making that we realistically cannot do as individuals every day. Data trusts, using stewards of our data, can help make these decisions and take some control over how our data is used. This is a very promising idea; Sam talks about 'data as a commons' in relation to this model. However, there remains the question of just how collaborative the companies holding the big swathes of data will have to be to make this happen. Perhaps we need to find a middle way between what Diane Coyle calls the "last hurrah for neo-liberalism" of hyper-individualised data monetisation, and relying on companies to give us access to data.

Nevertheless, there is a lot to be optimistic about, and a lot more to be done in this area. Sam is doing some great work and his positive outlook really is encouraging. You can check out his paper on data ethics here, and his book Good Data was recently published. If you know of anything that you think could help, or would like to get involved, please let us know via susanne.schmidt@odileeds.org