Data Ethics: A starting point

Over the last few decades, the debate on data ethics has gained significant attention amongst industry leaders, public agencies, and academics. Whilst these are important conversations to have, wading through the enormous volume of literature can be quite confusing. At ODI Leeds, our latest project aims to analyse and evaluate relevant data ethics publications, distilling them into tools that can provide useful conceptual and instrumental value to organizations.

The purpose of this blog post is to serve as a starting point: what is data ethics? What questions do we have? Do we have answers to them, or at least, are we getting any closer?

What is data ethics?

Broadly speaking, data ethics studies the moral problems related to data (including its generation, collection, processing, storage, sharing, and use) and algorithms (including AI, ML, and robotics).

Whilst there are no existing global standards, there are a number of recurrent themes. The most prominent are transparency, justice and fairness, non-maleficence, responsibility, and privacy. Definitions vary from publication to publication, but the general idea remains fairly similar:

  • transparency aims to increase explainability and interpretability
  • justice and fairness aim to tackle unwanted bias and discrimination
  • non-maleficence aims to prevent foreseeable harm, typically through safety and security measures
  • responsibility can range from addressing liability to diversity and sustainability
  • privacy is often related to data protection, including differential privacy, privacy by design, access, and regulation

These cornerstones of data ethics are quite technical, so they are usually addressed using open-source toolkits, audits, and other validation systems. However, there are also a number of conceptual and abstract challenges that remain unanswered. For example, how do we tackle the phenomenon of dataveillance (surveillance using digital technologies)? How do we balance the need to be seen and represented with individuals' privacy and autonomy? How does this tie in with the growing global technological divide?

The growth of data ethics

Whilst these questions are by no means easy to answer, the proliferation of data ethics publications in the last few years is opening new avenues for conversation and development.

In the private sector, we are seeing a growing volume of publications and collaborations: Microsoft, Google, OpenAI, and IBM have published principle lists and charters; Amazon, Apple, Baidu, Facebook, Google, IBM and Intel have joined the Partnership on AI, a non-profit coalition that seeks to develop data ethics practices.

In the area of standards and frameworks, the IEEE has embarked on one of the most comprehensive data ethics projects in existence today: it has published 14 AI standards, an ethics glossary, and a conceptual framework for Ethically Aligned Design. The AI Now Institute has also published the Algorithmic Impact Assessment, a practical framework designed to help public agencies assess automated decision systems.

Academic literature on AI ethics is also growing substantially, and along with it, we are seeing greater discussion of the social impacts of Big Data and AI. One particular example of this is the idea of 'data justice'. The works of 'data justice' proponents, such as Linnet Taylor and Richard Heeks, offer new ways of thinking about the datafication of development programs, with a view to ensuring that social protection programs do not marginalize the very communities they aim to serve.

What does this mean?

The question now is: how useful are these guidelines and publications? How effective is soft law, as opposed to hard regulation? Which of these organizations are truly committed to furthering data ethics, and how do their commitments translate into real-world scenarios? It is these kinds of questions we will be attempting to answer in our 6-week exploration of the field of data ethics.

If you think you can help us, please get in touch; we would be delighted to hear from you. For now, watch this space.