When economists look to the sky

Mint Asia ST - - News -

bined to obtain a cloud-free composite image. The machine learning algorithm helps categorize the composite image data (in the form of pixels, each of which is a vector of quantities in different bands) into a discrete set of land cover categories. How far such workarounds capture the real level of economic activity remains a matter for further research.

Even as satellite-based data on lights are being mined, other sources are also being harnessed to understand the dynamism of economies, especially urban economies. For instance, Google Street View offers a rich source of visual snapshots of cities across the world. Harvard University economist Edward Glaeser and others (hbs.me/2vcrzbl) have used Google Street View images of road quality and dwelling types to estimate income at a highly disaggregated level. They find that Google Street View data predict income and housing prices within New York reasonably well.

Google Street View images can also help us understand gentrification in cities. In their 2014 American Journal of Sociology research paper, Harvard University sociologists Jackelyn Hwang and Robert Sampson scoured thousands of images for 23 cities in the US to show that gentrification raised inequality in American cities, with Black residents bearing the brunt of it.

Even after controlling for a number of factors, including crime rate, perception and access to amenities, race still explains why certain neighbourhoods tend to be poor and others tend to be rich.

As cell phones become ubiquitous in developing countries, mobile data is also being used to measure wealth and urban commuting. Using an anonymized database of call records covering billions of interactions in Rwanda, Joshua Blumenstock of the University of California, Berkeley, in a 2015 research paper published in Science (bit.ly/2es16mf), created a measure of wealth based on the length and duration of calls. He found that it closely tracked the socio-economic status of individuals and, at an aggregate level, the wealth of regions.

Although these studies are quite innovative in applying modern data mining techniques to get around the problem of irregular or patchy economic data, it is worth noting that they are meant to be workarounds for the most part. As with any other modelling exercise, most economic estimations using satellite imagery rest on implicit assumptions.

One typical assumption is that the economic activity or luminosity of each distinct geographical unit is independent of that of every other unit (spatial independence, as economists term it). But this assumption can be violated for satellite images, given that the value of a variable at a particular location is affected by its value at neighbouring locations.
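To make the spatial independence assumption concrete, here is a minimal sketch of Moran's I, a standard statistic for spatial autocorrelation; it is a swapped-in illustration, not a method taken from the studies discussed here. The grids are made-up luminosity values, the function name `morans_i` is hypothetical, and neighbours are assumed to be cells that share an edge. A value near zero is roughly what spatial independence would imply, a clearly positive value means bright cells sit next to bright cells, and a negative value means neighbours tend to differ.

```python
def morans_i(grid):
    """Moran's I for a 2-D grid, treating edge-sharing cells as neighbours."""
    rows, cols = len(grid), len(grid[0])
    values = [v for row in grid for v in row]
    n = len(values)
    mean = sum(values) / n
    # Total squared deviation from the mean (the denominator of the statistic).
    denom = sum((v - mean) ** 2 for v in values)
    num = 0.0          # sum of cross-products over neighbouring pairs
    weight_total = 0   # number of (directed) neighbour pairs, each weighted 1
    for r in range(rows):
        for c in range(cols):
            for dr, dc in ((1, 0), (-1, 0), (0, 1), (0, -1)):
                rr, cc = r + dr, c + dc
                if 0 <= rr < rows and 0 <= cc < cols:
                    num += (grid[r][c] - mean) * (grid[rr][cc] - mean)
                    weight_total += 1
    return (n / weight_total) * (num / denom)


# Hypothetical luminosity grids (illustrative values only, not real data).
clustered = [
    [9, 9, 1, 1],
    [9, 9, 1, 1],
    [9, 9, 1, 1],
    [9, 9, 1, 1],
]
checkerboard = [
    [9, 1, 9, 1],
    [1, 9, 1, 9],
    [9, 1, 9, 1],
    [1, 9, 1, 9],
]

print("clustered:", morans_i(clustered))        # positive: neighbours look alike
print("checkerboard:", morans_i(checkerboard))  # negative: neighbours differ
```

In a grid like the clustered one, which resembles a lit city next to dark countryside, treating each cell as an independent observation would overstate how much information the image contains.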

Secondly, all satellite-based data depend on the orbits that satellites take around the earth. Moreover, the quality of images captured by a satellite varies over space and time. How this affects analysis is still not entirely clear.

Thirdly, as Dave Don­ald­son and Adam Storey­gard em­pha­size in a re­cent review pa­per (bit.ly/2ezmxjm) on the use of satel­lite data, the use of ma­chine learn­ing tech­niques im­poses ad­di­tional costs in terms of re­sources and anal­y­sis that a re­searcher has to deploy on the ground to ar­rive at ro­bust con­clu­sions.

“A crit­i­cal in­put to these (ma­chine learn­ing tech­niques) and other meth­ods is the avail­abil­ity of training data on the vari­able of in­ter­est that as­signs ground truth val­ues to sam­ple sites,” the duo point out. “For ex­am­ple, de­lin­eat­ing im­aged ur­ban neigh­bor­hoods as res­i­den­tial, or even more specif­i­cally as slums, re­quires first pro­vid­ing a set of ar­eas pre-de­fined as slums by other means. Do­ing so well re­quires a training dataset that re­flects the full di­ver­sity of dis­tinct neigh­bor­hoods within the cat­e­gory of slums. This is es­pe­cially chal­leng­ing when the ob­ject of in­ter­est is het­ero­ge­neous or im­pre­cisely de­fined….one could imag­ine economists us­ing re­motely sensed in­for­ma­tion on build­ings to es­ti­mate a re­gion’s cap­i­tal stock; in such a case, the ideal training data would con­cern build­ing val­ues in­stead of build­ing types. Be­cause these training datasets are used to de­fine the classes un­der­ly­ing a clas­si­fi­ca­tion al­go­rithm, they must be pro­duced out­side the al­go­rithm. Thus, they are typ­i­cally a la­bor-in­ten­sive ana­log con­straint on a tech­nol­ogy that oth­er­wise can op­er­ate with all the scale ben­e­fits of com­puter pro­cess­ing.”

Finally, and perhaps most importantly, most satellite-based data can potentially identify individuals and households. Cell phone data are the most problematic in this regard, and have serious repercussions for privacy.

It is also worth noting that most early proponents of night-lights data advocated using them as a substitute for national accounts and household survey data where the latter are unreliable or irregular, and as a complement where they are indeed available. The reason for the continued preference for old-fashioned data collection techniques is that they generate thick layers of information, which collectively can convey a richer sense of an economy than mere satellite images can. Traditional databases thus help us form inferences based on a wider variety of data. The flip side, of course, is that it may not be possible to disaggregate the traditional measures in the same manner as satellite data, which are available at a granular level.

To sum up, new and exciting data-sets are helping us understand the world better. However, it is erroneous to believe that these new data-sets can substitute for existing survey-based or national accounts data.

A satel­lite can hardly tell us any­thing about in­tra-house­hold al­lo­ca­tion of re­sources, for in­stance, or the level of dis­crim­i­na­tion in a ru­ral labour mar­ket in a coun­try such as In­dia.

The use of satel­lite data in a het­ero­ge­neous coun­try such as ours also re­quires in­ten­sive use of on-ground re­sources in sev­eral cases, as dis­cussed above. More­over, economists are still grap­pling with chal­lenges in in­ter­pret­ing the in­for­ma­tion from these new and big datasets, which means in­fer­ences must be drawn with greater cau­tion.

Un­doubt­edly, the un­der­stand­ing of such data will evolve over time. At this point, it is best to think of these new data-sets as com­ple­ments to the tra­di­tional data sources col­lected by the reg­u­lar sta­tis­ti­cal ma­chin­ery.

Su­mit Mishra teaches eco­nom­ics at the In­sti­tute for Fi­nan­cial Man­age­ment and Re­search, Sri City.

Eco­nom­ics Ex­press runs ev­ery fort­night, and looks at the world through the lens of eco­nom­ics.
