Hindustan Times (Lucknow)

Why data hubs in states pose a new threat to privacy

DOUBLE TROUBLE: UIDAI says it is the sole custodian of citizen info. But state govts and police are using Aadhaar numbers to consolidate data, which is being stored without any coherent data security policy and is accessed by private players.

- Aman Sethi aman.sethi@hindustantimes.com

On January 18, a policeman showed up at the door of Abdul Hafeez, a 37-year-old businessman in Kulwakurthy in Telangana’s Mahbubnagar district, and asked to record his fingerprints, Aadhaar number, phone numbers, social media accounts, voter ID, passport, and the names and numbers of his family members, associates, lawyers, pawn brokers, and “concubines”, if any.

The policeman also geo-tagged Hafeez’s home by noting down its GPS coordinates. This information, Hafeez was told, was for the Sakala Nerasthula Samagra Survey, the brainchild of the state police to gather a vast archive of personal information of convicted criminals, suspects and under-trials.

The data, according to a police circular, would be stored in TSCOP – a new software application downloaded onto the phones of the state’s constables.

“With TSCOP, a policeman moving through a difficult area can see in real time: who are the criminals in this area? Where do they live? Who are the people here who can help the police?” said M Mahendar Reddy, Telangana’s Director General of Police, who oversaw the development of TSCOP. “It is not meant for senior officers, it is meant for beat policemen who are on the front lines.”

For Hafeez, however, the survey was the most recent instance of harassment at the hands of the local police.

“I am not a repeat offender,” Hafeez said in an interview. “I am a victim of police harassment.”

The story of Hafeez and TSCOP offers a peek into the largely unnoticed big-data revolution sweeping through Indian state governments. Administrators and police departments are using individual Aadhaar numbers to consolidate citizen data scattered across disparate government departments, allowing for the creation of detailed personal databases.

This data is being collected without the consent of citizens, in the absence of a data privacy law, and stored without any coherent data security policy.

“Modern databases are designed to be interoperable,” said Rahul Matthan, a partner at law firm Trilegal. “But when we start to use these databases, we need a proper privacy framework that holds government and private parties to account for how they use this data.”

Such a framework, Matthan said, needs to move beyond simply asking for user consent, as individuals may not understand the implications of sharing their data. Rather, the framework should hold data-gatherers and users accountable if their use of this data results in any harm being caused to the individual.

BIOMETRIC DATABASES

On February 22 this year, Gujarat state counsel Rakesh Dwivedi acknowledged that the Gujarat government had maintained an archive of biometric data of the state’s residents in a software application called the “State Resident Data Hub” (SRDH).

This data, Dwivedi said, was erased shortly after the Aadhaar Act was implemented in March 2016. His admission, before the constitutional bench of the Supreme Court, marked the first time a state government had formally conceded to maintaining such a database. Meanwhile, Gujarat continues to maintain a separate biometric database of ration-card holders, as reported by HT on January 21, 2018.

The Unique Identification Authority of India (UIDAI) has consistently maintained that it is the sole custodian of citizen data collected during the Aadhaar enrolment process. Dwivedi’s statement in court reveals this was not always the case.

“These state hubs were created under an agreement between the state and UIDAI,” Dwivedi said in remarks reported in HT. Hindustan Times has seen one such draft MoU.

Applications like TSCOP and SRDH, which contain the Aadhaar numbers and biometrics of citizens, reveal the complete absence of public information regarding who has access to the sensitive personal data of over one billion Indians enrolled in the controversial Aadhaar programme, and how it is put to use.

These applications also reveal Aadhaar’s central paradox, wherein a system designed to minimise data collection has given state and central governments the ability to gather and centralise increasingly detailed information about their citizens.

“Any policy that proceeds without adequate public consultation can only damage democratic practices in the country,” said Reetika Khera, a professor of economics at IIT Delhi, who has written extensively on privacy and Aadhaar. “This is no exception.”

Telangana’s TSCOP and Gujarat’s SRDH appear to be two different applications, but six state-level IT administrators and programmers familiar with the projects say they are based on the same principle – using Aadhaar as a common identifier to integrate previously discrete data silos.

“The SRDH was a prototype to showcase certain capabiliti­es,” said a coder who worked on the project.

The UIDAI did not respond to a detailed HT questionnaire asking about the existence of these biometric databases, or the veracity of the draft MoU accessed by HT. In 2012, the UIDAI asked a consortium of private companies to develop data hubs to help states “leverage the true potential of Aadhaar and Aadhaar enabled service delivery”, according to the SRDH Institutional Framework published in April that year.

A draft copy of an agreement between the UIDAI and state governments, obtained by HT, reveals that SRDHs were created to provide a way for states to utilise the demographic data they had collected on behalf of the UIDAI during Aadhaar enrolment.

The UIDAI, the draft MoU said, would define “the process for accessing, securing and keeping up-to-date resident KYR (Know Your Resident) data as collected during enrolment.” HT was unable to access a signed copy of these agreements between states and the UIDAI.

An SRDH advisory document issued by UIDAI in March 2012 notes that “SRDH does not (and should not) store biometric data.” However, Dwivedi’s admission that Gujarat was holding biometric resident data in its SRDH points to the existence of separate agreements allowing states to merge biometric information they had gathered with the demographic data they obtained from UIDAI.

“We made a generic program for states to manage their Aadhaar data,” a coder who worked on the project said, seeking anonymity as he was bound by a non-disclosure agreement. “We provided them with runnable code. Each state was responsible for implementing it themselves.”

States could modify the code, UIDAI documents said, but customisation would void the warranty support offered by Mahindra-Satyam, a private software company, which also prepared the deployment guide for the software. Accenture, KPMG, Ernst and Young, PricewaterhouseCoopers, Wipro and Deloitte Touche Tohmatsu were empanelled as project consultants, the document said.

The UIDAI’s Project Management Unit coordinated with a network of private vendors and government departments to implement the programme.

The SRDH created a resident data repository equipped with tools to allow states to seed this bulk database with individual Aadhaar numbers.

“We created a data repository of bulk resident data that states could seed with Aadhaar numbers,” the coder explained. “We call it inorganic batch seeding.”

For example: A list of ration card holders could be merged with a list of names and Aadhaar numbers to create a unified list of ration-card holders organised by Aadhaar number.

This could further be merged with a list of old-age pensioners to get a list of ration-card holders who also availed of pensions, and so on to create master databases that were integrated with Aadhaar-based biometric authentication to access these services.
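The merging described above can be illustrated with a minimal Python sketch. It is not the SRDH code: the records, field names and Aadhaar-style numbers are invented for illustration, and the name lookup assumes exact matches, which the coder says real seeding could not rely on.

```python
# Hypothetical records; every name and number here is made up.
ration_cards = [
    {"name": "Asha Rao", "ration_card": "RC-001"},
    {"name": "Ravi Kumar", "ration_card": "RC-002"},
]
aadhaar_list = [
    {"name": "Asha Rao", "aadhaar": "0000-0000-0001"},
    {"name": "Ravi Kumar", "aadhaar": "0000-0000-0002"},
]
pensioners = [{"aadhaar": "0000-0000-0001", "pension_id": "P-17"}]


def seed_with_aadhaar(records, aadhaar_rows):
    """Key each record by Aadhaar number, using an exact name lookup.

    Exact matching is a simplifying assumption for this sketch.
    """
    by_name = {row["name"]: row["aadhaar"] for row in aadhaar_rows}
    return {
        by_name[rec["name"]]: dict(rec)
        for rec in records
        if rec["name"] in by_name
    }


def merge_on_aadhaar(seeded, other_rows):
    """Keep residents found in both lists and combine their fields."""
    merged = {}
    for row in other_rows:
        key = row["aadhaar"]
        if key in seeded:
            merged[key] = {**seeded[key], **row}
    return merged


# Ration-card holders organised by Aadhaar number.
seeded = seed_with_aadhaar(ration_cards, aadhaar_list)
# Ration-card holders who also draw a pension.
ration_and_pension = merge_on_aadhaar(seeded, pensioners)
```

Once every list carries the same identifier, each further merge is a simple key lookup, which is what makes consolidation across departments so easy.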

The process of seeding in bulk, the coder said, is probabilistic rather than exact.

“There are many residents with the same name – so we give different weightages to variables like name, date of birth, pin code,” he said. “The system will give possible matches and ask the system administra­tor to manually select one.”
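A rough Python sketch of the weighted matching the coder describes is given here. The weights, the threshold and the string-similarity measure are all assumptions; the article names the variables (name, date of birth, pin code) but not how SRDH scored them.

```python
from difflib import SequenceMatcher

# Assumed weights; the article does not give the actual values.
WEIGHTS = {"name": 0.5, "dob": 0.3, "pin_code": 0.2}


def similarity(a, b):
    """Case-insensitive string similarity between 0.0 and 1.0."""
    return SequenceMatcher(None, a.lower(), b.lower()).ratio()


def score(record, candidate):
    """Weighted score: fuzzy on name, exact on date of birth and pin code."""
    total = WEIGHTS["name"] * similarity(record["name"], candidate["name"])
    total += WEIGHTS["dob"] * (record["dob"] == candidate["dob"])
    total += WEIGHTS["pin_code"] * (record["pin_code"] == candidate["pin_code"])
    return total


def possible_matches(record, candidates, threshold=0.6):
    """Return (score, candidate) pairs above the threshold, best first.

    The final choice is left to a human administrator, as in the article.
    """
    scored = [(score(record, c), c) for c in candidates]
    kept = [pair for pair in scored if pair[0] >= threshold]
    return sorted(kept, key=lambda pair: pair[0], reverse=True)


# Hypothetical data: two residents share a name and a pin code.
record = {"name": "Ravi Kumar", "dob": "1980-01-01", "pin_code": "500001"}
candidates = [
    {"name": "Ravi Kumar", "dob": "1975-06-30", "pin_code": "500001",
     "aadhaar": "0000-0000-0003"},
    {"name": "Ravi Kumar", "dob": "1980-01-01", "pin_code": "500001",
     "aadhaar": "0000-0000-0002"},
    {"name": "Asha Rao", "dob": "1990-03-12", "pin_code": "509324",
     "aadhaar": "0000-0000-0001"},
]
matches = possible_matches(record, candidates)
```

In this made-up case both residents named Ravi Kumar clear the threshold, so an administrator who picks the wrong one would attach a ration card to the wrong Aadhaar number.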

Mismatches during inorganic seeding could explain periodic news reports of villagers being denied food rations. Project Management Unit communication, interviews and documents reviewed by HT reveal that the SRDH rollout epitomised the “build the plane while you’re flying” approach beloved of software engineers.

Private vendors had access to the sensitive personal information and Aadhaar numbers of millions of Indians. Much of the data was stored as unencrypted CSV files on thumb drives, and emails exchanged between troubleshooting teams included screenshots of resident data including Aadhaar numbers.

HT found the SRDH deployment guide – essentially an installation manual – hosted on an open-access server.

“With this kind of guide, you have a lot of information on how the website is structured, the default passwords etc,” said Robert Baptiste, a French security researcher who has exposed numerous flaws in databases managed by Indian government departments.

In his most recent exploit, Baptiste gained access to 40 GB worth of personal data of 47,000 BSNL employees, hosted on a BSNL website.

Publishing the SRDH deployment guide online, Baptiste said, made it very easy for hackers to spot holes in the security architecture of the SRDH. “This should be an internal document,” Baptiste said.

Coders functioned as a hive mind, developing “innovations” with little regard to security or privacy concerns.

One internal document shows how a coder came up with a way for SRDH users with administrator privileges to link multiple identity cards like passports and driver’s licences without the consent of the individual.

“Since SRDH houses a giant database of UID numbers, it makes sense to link a person’s UID number to other identity/proof of residence cards that he/she might have,” the coder wrote. “With KYR Plus you can view details about other identity cards that the individual might have easily.”

In retrospect, ‘KYR Plus’ marked the point where Aadhaar shifted from a lean system to verify the identity of a user, into a tool to consolidate vast amounts of information about every resident.

The final slide of the internal document illustrates how a system administrator could extend the SRDH to inter-link any personal document of a resident.

“If the type of card you are looking for doesn’t exist. You can easily add a new card type here,” the coder wrote, offering an example: “License to Kill.”

HT interviewed senior IT officials in four states, who said they struggled to integrate the SRDH with their programmes.

Yet, some states latched on to its potential and built their own portals.

Andhra Pradesh created a version called “People’s Hub” which uses a resident’s Aadhaar number to consolidate 29 different department databases to create a “single source of truth” on the resident, according to AP government documents. The hub, officials said, does not hold biometric data.

“Now we know per household, the benefits being given,” said A Babu, CEO of the Real Time Governance Society, the state entity overseeing the People’s Hub. “Each household has an 8-digit number and its GPS coordinates are fixed on a map.”

The data in the People’s Hub is from a detailed survey conducted by the Andhra government. The results are partially visible on an open access website that plots the GPS location of every single house in surveyed districts.

Clicking on a house on the map reveals the names of the residents, and their partially masked Aadhaar numbers. Babu said integrating citizen data at this scale makes it easy to quickly identify the beneficiaries of government schemes, and seamlessly route entitlements to their respective bank accounts.

Privacy advocates like Khera question how much citizen data must be collected, and under what circumstances.

“Efficient administration requires ‘some’ information, not all information, and certainly does not require it to be centralised,” Khera said. Recent research on data-mining, she said, indicates that “algorithms based on inaccurate assumptions can end up harming instead of helping.”

Abdul Hafeez, the man who took on TSCOP – the Telangana Police app – is one example of the dangers of indiscriminate data collection and mining. TSCOP follows the same principle as the SRDH, except with police data. Here Aadhaar numbers are integrated with police databases to build detailed profiles of convicted criminals and under-trials.

TSCOP is based on a similar platform called HydCOP, according to M Mahendar Reddy, the DGP of Telangana, who developed both applications.

For HydCOP, Reddy wrote in a 2016 paper, the homes of 3,500 “repeat offenders” in Hyderabad were surveyed and “geo-tagged for periodical visits by the front-line police officers.” The platform, Reddy wrote, was integrated with “UIDAI and NIC databases.”

When asked about the legal basis for integrating a police application with the UIDAI, Reddy said UIDAI integration was only for policemen to mark their attendance using the biometric sensors of their phones. UIDAI did not respond to a questionnaire sent by HT.

So, when TSCOP was launched, Reddy announced a similar state-wide survey of criminals to collect data for the app. This brought the police to Abdul Hafeez’s house. In 2010, Hafeez had filed an RTI application at his tehsil office to establish the ownership status of public land encroached upon by a local builder.

“When I filed the RTI, the local police and land mafia filed a false case of cheating against me,” Hafeez said. “I was acquitted in 2013.” Then the police filed another case against him for possession of black jaggery, a controlled substance, even though, Hafeez said, he had provided them with all the relevant documentation.

“So when the police came and said they are entering my name in a repeat offenders database, I took them to court,” Hafeez said.

Once the petition was admitted, the state government simply withdrew the police circular and the state-wide survey was suspended. “The state government did not even contest the case, because they knew they cannot just gather personal data like this,” said Y Sheelu Raj, Hafeez’s lawyer. “They are taking advantage of the fact that our country does not have a data protection law, and hoping no one will protest.”

DGP Reddy declined to give the legal basis for the police gathering such intrusive citizen data. He also refused to say what the Telangana police has done with all the fingerprints and geo-tagged data they have already collected. “We are following all provisions of relevant laws,” he insisted.

