Big data: A tool for development or a threat to privacy?

Big data consists mainly of data that is openly created, stored and made available. It includes public sector data such as national health statistics, procurement and budgetary information, and transport and infrastructure data. While big data may carry benefits for development initiatives, it also carries serious risks, which are often ignored. In the pursuit of the promised social benefits that big data may bring, it is critical that fundamental human rights and ethical values are not cast aside.

Expanding beyond publicly accessible data

One key advocate and user of big data, along with other humanitarian organisations and UN agencies, is UN Global Pulse, launched in 2009 in recognition of the need for more timely information to track and monitor the impact of global and local socio-economic crises. The initiative explores how digital data sources and real-time analytics technologies can help policymakers understand human well-being and emerging vulnerabilities in real time, in order to better protect populations from shocks.

UN Global Pulse clearly identified the privacy concerns linked to its use of big data, and its impact on privacy, in “Big Data for Development: Challenges & Opportunities”, and has adopted Privacy and Data Protection Principles. While these are positive steps in the right direction, more needs to be done, given the increasingly complex web of actors involved, the expanding scope of their work, the growing amount of data that can be collected on individuals, and the poor legal protections in place.

Increasingly, big data includes not only openly available information but also information collected by the private sector, such as Twitter feeds, Google searches, and call detail records held by network providers. The efforts of groups such as UN Global Pulse are focused on opening access to this private sector data; UN Global Pulse has noted this “challenge” and has been encouraging enterprises to participate in “data philanthropy” by providing access to their data for public benefit.

Dangers of big data

While access to such data is posited as opening opportunities for development, it also has the potential to seriously threaten the right of individuals to keep their personal information private.

If private sector data falls into the wrong hands, it could enable the monitoring, identification and surveillance of individuals. Despite guarantees of anonymisation, the correlation of separate pieces of data can (re)identify an individual and reveal information about them that is even more private than the data they consented to share, such as their religion, ethnicity or sexual orientation. In certain contexts the consequences could be tragic, especially if the data concerned relates to vulnerable groups such as minorities or refugees, or to societal groups including journalists, social dissidents and human rights advocates.
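
The mechanics of such a linkage attack can be illustrated with a short sketch in Python (using the pandas library). The datasets, column names and values below are entirely invented for the example: an “anonymised” release stripped of names is joined with a public register on a few shared attributes, re-attaching identities to sensitive data.

    import pandas as pd

    # "Anonymised" records released for research: direct identifiers removed,
    # but attributes such as birth year, postcode and sex retained.
    # All records here are fictional.
    released = pd.DataFrame({
        "birth_year": [1985, 1985, 1990],
        "postcode":   ["1000", "2000", "1000"],
        "sex":        ["F", "M", "F"],
        "religion":   ["A", "B", "C"],   # the sensitive attribute
    })

    # A separate public dataset (say, a voter roll) that does carry names.
    public = pd.DataFrame({
        "name":       ["Alice", "Bob"],
        "birth_year": [1985, 1985],
        "postcode":   ["1000", "2000"],
        "sex":        ["F", "M"],
    })

    # Joining on the shared attributes re-attaches names to "anonymous" rows,
    # revealing a sensitive attribute the individuals never agreed to disclose.
    reidentified = public.merge(released, on=["birth_year", "postcode", "sex"])
    print(reidentified[["name", "religion"]])

The more auxiliary datasets that exist, the more likely it is that some combination of innocuous attributes is unique to one person.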

Big data development initiatives such as the one conducted by the French telecommunications company Orange in Côte d’Ivoire have shown that even a basic mobile phone traffic data set can support conclusions about social divisions and segregation on the basis of ethnicity, language, religion or political persuasion.

What about consent?

Because big data is derived from aggregated data drawn from various sources (which are not always identifiable), there is no process for requesting a person’s consent to the data that emerges from the aggregation. In many cases, that derived data is more personal than the data the person consented to give.

In October 2012, MIT and the Université Catholique de Louvain, in Belgium, published research demonstrating the uniqueness of human mobility traces and the implications this has for privacy. The researchers analysed the anonymised data of 1.5 million mobile phone users in a small European country, collected between April 2006 and June 2007, and found that just four points of reference, with fairly low spatial and temporal resolution, were sufficient to uniquely identify 95 per cent of them. This showed that even if an anonymised dataset contains no name, home address, phone number or other obvious identifier, the uniqueness of individuals’ movement patterns (e.g. a user’s most-visited locations) means the data can still be linked back to them.
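
The kind of uniqueness test involved can be sketched in a few lines of Python. The data below is entirely synthetic (uniformly random, where real traces are regular but still highly individual), so it demonstrates only the principle, not the study’s method in detail: each user’s trace is a set of (antenna, hour) points, and we check how often a handful of points drawn from one trace matches no one else’s.

    import random

    random.seed(0)

    # Toy mobility traces: each user is a set of (antenna_id, hour) points.
    # Real call detail records are far richer; this only shows the principle.
    NUM_USERS, ANTENNAS, HOURS = 1000, 50, 24
    traces = [
        {(random.randrange(ANTENNAS), random.randrange(HOURS)) for _ in range(40)}
        for _ in range(NUM_USERS)
    ]

    def is_unique(user, k=4):
        """True if k points drawn from this user's trace match no other trace."""
        points = set(random.sample(sorted(traces[user]), k))
        matches = sum(1 for trace in traces if points <= trace)
        return matches == 1  # only the user's own trace contains all k points

    unique = sum(is_unique(u) for u in range(NUM_USERS))
    print(f"{unique / NUM_USERS:.0%} of users uniquely identified by 4 points")

In this sparse synthetic data a handful of points almost always singles out one trace; the study’s striking finding was that real human traces, despite their regularity, are nearly as unique.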

Advocates for big data for development argue that there is no need to request consent because they concern themselves only with unidentifiable, anonymised data. Yet even if one actor in one context uses data anonymously, this does not mean that the same data set will not be de-anonymised by another actor. UN Global Pulse can promise that it will do nothing that could violate the right to privacy or permit re-identification, but can it guarantee that everyone else along the chain applies the same ethical safeguards?

Whose data and for what policies and programmes?

New technologies are enabling the creation of new forms and vast quantities of data that can inform policy-making processes, improving the effectiveness and efficiency of public policy and administration. However, the data used can be inaccurate: it may not be regularly updated, may relate only to a sample of the population, or may lack contextual analysis.

A recurring criticism of big data, and of its use in analysing socio-economic trends for the purpose of developing policies and programmes, is that the data collected does not necessarily represent those at whom the policies are targeted. The collection of data may itself be exclusionary when it relates only to users of a certain service (health care, social benefits), platform (e.g. Facebook users, Twitter account holders) or other grouping (e.g. online shoppers, loyalty card members of airlines or supermarkets).

In the developing world, only 31 per cent of the population is online, 63 in 100 inhabitants have a mobile phone, and 11 per cent have access to mobile broadband. Ninety per cent of the 1.1 billion households that are not connected to the internet are located in the developing world. Some countries in Africa have less than 10 per cent of their population active on the internet. This means whole populations can be excluded from data-based decision-making processes.

So what must be done?

As noted by Linnet Taylor, a researcher at the Oxford Internet Institute working on a project about big data and its meaning for the social sciences, a quick analysis of the big data discourse reveals a clear double standard:

“There is a certain irony here: 90% of the discussion at the forum referred to big data as a tool for surveillance, whereas the thread of debate that focused on developing countries alone treated it as a way to ‘observe’ the poor in order to remedy poverty”.

Data is data. Yet the short- and long-term consequences of collecting data in environments that lack appropriate legal and institutional safeguards have not been properly explored. Amassing and analysing data always has the potential to enable surveillance, regardless of the well-intentioned objectives that may underpin its collection. Development is not merely about economic prosperity and social services. It is about providing individuals with a safe environment in which they can live in dignity.

Towards accountability

In their recently published paper, “Big Data and Due Process: Toward a Framework to Redress Predictive Privacy Harms”, Crawford and Schultz propose a new framework for a “right to procedural data due process”, arguing that “individuals who are privately and often secretly ‘judged’ by big data should have similar rights to those judged by the courts with respect to how their personal data has been used in such adjudications”.

Unlike data falling within the familiar model of personally identifiable information, big data does not fit easily within legally protected categories of data. This means there are no legal provisions protecting the data being collected, processed and disclosed, or the rights of the individuals whose data is being analysed.

Crawford and Schultz have therefore revisited some of the founding principles of the legal concept of due process. Due process (as understood in the American context) prohibits the government from depriving an individual of life, liberty or property without affording them access to certain basic procedural components of the adjudication process. An equivalent concept exists under European human rights law, where it is more commonly called procedural fairness.

In doing so, Crawford and Schultz challenge the fairness of the process of collection rather than attempting to regulate it directly, which would be more complex and contested. They apply these due process principles to the privacy concerns raised by the development and use of big data, proposing:

  • requiring those who use big data to “adjudicate” others to post some form of notice disclosing not only the type of predictions they are attempting, but also the general sources of data that they are drawing upon as inputs, including a means whereby those whose personal data is included can learn of that fact;
  • providing an opportunity for a hearing at which to challenge the fairness of the predictive process; and
  • establishing an impartial adjudicator, and judicial review, to ensure the accountability of those who adjudicate others, i.e. that those who deprive individuals of a liberty interest do so without unwarranted bias or a direct financial interest in the outcome.

The use of big data is intrinsically linked to ethical values, which means that the starting point must be the development of international guidelines governing access to and analysis of individuals’ data. As Crawford and Schultz conclude:

“Before there can be greater social acceptance of big data’s role in decision-making, especially within government, it must also appear fair, and have an acceptable degree of predictability, transparency, and rationality. Without these characteristics, we cannot trust big data to be part of governance”.