Hindustan Times (Patiala)

Indian data ecosystem needs an overhaul

Either there isn’t enough data available, or the data that exists is sometimes unreliable but is used anyway because there is no alternative

- Samarth Bansal samarth.bansal@htlive.com

In July, in front of a roomful of policy wonks, government officials and journalists, Union health secretary CK Mishra made an honest acknowledgement: there are serious problems with India’s public health statistics.

For one, he said, data from the latest round of the National Family Health Survey (NFHS-4) — the major source for detailed health statistics in India, conducted under the aegis of the ministry of health and family welfare (MoHFW) itself — is unreliable for certain states.

On top of that, the Health Management Information System (HMIS), which Mishra called “a data mine”, is not effectively used. “We use very little of it in the planning process” due to lack of expertise to read and understand the data, he said.

The health secretary’s statement raises concerns: how can the country formulate evidence-based policy or plan wisely for the future without credible data? And Mishra, a 34-year veteran of the Indian Administrative Service who was appointed to head the MoHFW last year, is not alone. A recent paper by the Health Team of the National Institute of Public Finance and Policy, New Delhi, found that the country’s health data was unreliable, irregularly published, and failed to cover a broad enough population.

PROBLEMS GALORE

And such problems are not restricted to the health sector alone. The entire Indian data ecosystem needs improvement. Former RBI governor Duvvuri Subbarao has stated that monetary policy decisions often go astray because of erroneous data provided by the government. The debate on the reliability of India’s macroeconomic data, GDP and IIP numbers, for instance, remains unsettled. At a time when unemployment (or rather, underemployment) is a key socio-economic concern, economists cannot measure the problem’s magnitude because they do not have credible figures and surveys. India’s agricultural statistics have also come under the scanner. Talk about crime, and all you have is aggregated data from FIRs; no official crime victimisation surveys have been instituted yet.

To be sure, every data set comes with caveats that must be considered when making interpretations. But some failings appear to be a standard characteristic of Indian data sets.

To begin with, there isn’t enough data. The data that does exist is sometimes unreliable but is used anyway because there is no alternative. Several important data sets are released with a huge time lag. Others are missing granular district-level estimates. If such estimates are present, they are not always used for policy making or governance. And even when data sets are good and people want to use them, there may be too few who understand how to work with them, as Mishra said about HMIS.

Taken together, these shortcomings amount to an Indian statistical ecosystem that falls short of the needs of the world’s largest democracy.

MODES OF DATA COLLECTION

There are two major modes of data collection: administrative, which refers to data collected as a result of an organisation’s daily operations (think of patient registrations at a hospital or new accounts opened at a bank); and surveys, which are based on how a part of a population (what statisticians call a ‘sample’) responds to a set of questions.

P.C. Mahalanobis, the statistician credited with laying the foundations of the data systems of independent India, “focused on creating credible data sets from representative sample surveys,” says a Mint essay which traced the history of the Indian statistical system.

But Mahalanobis’s preference for surveys came at the expense of data collection at the administrative level, the essay argued, and may have undermined the government’s ability to collect regular, reliable data.

“Instead of being sparingly used for purposes where there was no alternative to sampling, sampling became the first choice of technique for collecting data.”

Sometimes, surveys are the only way to capture data. Economic statistics, for example, cannot be collected at the administrative level because of the huge size of the Indian economy’s informal sector, which employs around 90% of the country’s workforce, says Pronab Sen, former chief statistician of India.

Yet India faces challenges to conducting good surveys that simply do not exist in much of the rest of the world, says Sen: a population of more than a billion people, relatively high rates of illiteracy, and dependence on the informal economy.

VACANCY ISSUES

The government also employs too few people to carry out regular and robust surveys. In the National Sample Survey Office’s (NSSO) field operations division, which is responsible for collecting primary socio-economic data, around 24% of the posts of junior and senior statistical officers are vacant.

The NSSO’s critics do not realise how hard it is to undertake actual data collection on the ground, Sonalde Desai, professor of sociology at the University of Maryland, who also conducts the India Human Development Survey (IHDS), said in an email. Without adequate internal staff, the agency must contract with outside agencies.

“This is what both IHDS and NFHS do, and only we know how difficult it is to maintain quality. Some of the agencies we work with are fantastic, and some are struggling themselves. This requires enormous supervision, and if one slips there, the data can be highly questionable,” Desai said.

“This hit-and-miss approach is not acceptable for data that form the core of our policy-building process.”

Experts say that technology can be leveraged to improve data collection systems. Private data collection agencies are already making use of apps and tools to conduct surveys electronically, rather than on paper. But that comes with its own challenges. Richa Verma, who leads the research and analysis team at Social Cops, a data intelligence company, says that better design is key to making it easier for people to adopt technology.

While working with the government and various non-profits, Verma found that many of their trainees had never used a smartphone. Data collection technology must be made simple, and appropriate training must be conducted, so that anyone can be trained to use it.


HT FILE A health department official conducts a survey in Ludhiana.
