Living big data

Financial Mirror (Cyprus) - FRONT PAGE - By Alex Pentland

Big data is made from the digital trail that we leave behind when we use credit cards, mobile phones, or the Web. Used carefully and accurately, these data give us unprecedented scope to understand our society, and improve the way we live and work. But what works in theory may not translate well in the real world, where complex human interactions cannot always be captured, even by the most sophisticated models. Big data requires us to experiment on a big scale.

My own laboratory, for example, is building a Web site which, based on Google Maps, uses society’s digital trail to map poverty, infant mortality, crime rates, changes in GDP, and other social indicators, neighbourhood by neighbourhood – all of which will be updated daily. This new capability allows viewers to see, for example, where government initiatives are working or failing.

But, while such impressive visualisation tools can dramatically enhance transparency and public knowledge, they are surprisingly limited when applied to solving society’s problems. One reason is that such rich streams of data encourage spurious correlations.

Even the use of the normal scientific method no longer works; given so many measurements, and so many more potential connections among what’s being measured, our standard statistical tools generate nonsensical results. Without knowing all possible alternatives, we cannot form a limited, testable set of clear hypotheses. And if we can no longer rely on laboratory experiments to test causality, we must test it in the real world, using massive volumes of real-time data. This involves moving beyond the closed, question-and-answer process typical of the lab, and applying our ideas in society, earlier and more frequently than ever before.
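The claim that many measurements invite spurious correlations can be illustrated with a small simulation. The sketch below is only an illustration under stated assumptions, not a description of any tool used in the work discussed here: it generates 100 series of pure random noise with 30 observations each, and counts how many pairs nonetheless look "significantly" correlated. The function names, the sample sizes, and the cut-off of about 0.361 (the approximate two-sided 5% critical value of the correlation coefficient for 30 observations) are all assumptions chosen for the example.

```python
import itertools
import math
import random


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / math.sqrt(var_x * var_y)


def count_spurious(n_vars=100, n_obs=30, r_crit=0.361, seed=0):
    """Count pairs of unrelated noise series whose |r| exceeds r_crit.

    All parameters are illustrative assumptions. r_crit is the approximate
    two-sided 5% critical value of r for 30 observations.
    Returns (number of pairs examined, number of pairs flagged).
    """
    rng = random.Random(seed)
    # Every series is independent Gaussian noise, so no real relationship exists.
    series = [[rng.gauss(0, 1) for _ in range(n_obs)] for _ in range(n_vars)]
    pairs = 0
    hits = 0
    for x, y in itertools.combinations(series, 2):
        pairs += 1
        if abs(pearson(x, y)) > r_crit:
            hits += 1
    return pairs, hits


if __name__ == "__main__":
    pairs, hits = count_spurious()
    print(f"{hits} of {pairs} pairs of pure noise pass the 5% test")
```

With 100 variables there are 4,950 possible pairs, so a test that wrongly flags about one pair in twenty should be expected to report a couple of hundred "relationships" where none exist. The count grows with the square of the number of variables, which is why adding more measurements makes the problem worse rather than better.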

To see how things work in reality, we must construct living laboratories – that is, communities willing to try new ways of doing things (to be blunt, to act as guinea pigs). An example of such a living lab is the “open data city,” which I launched with the city of Trento in Italy, along with Telecom Italia, Telefónica, the research university Fondazione Bruno Kessler, the Institute for Data Driven Design, and local companies. Importantly, this living lab has the approval and informed consent of all involved; they understand that they are participating in a gigantic experiment whose goal is to create a better way of living.

One major challenge for a living lab is to protect individual privacy without diminishing the potential for better government. The Trento lab, for example, will pilot my proposed “New Deal on Data,” which gives users greater control over their personal data through trust-network software such as our openPDS (Personal Data Store) system. We hope that the ability to share data safely, while protecting privacy, will encourage individuals, companies, and governments to communicate their ideas widely, and so increase productivity and creativity across the entire city.

But the biggest difficulty in using big data to build a better society is being able to develop a human-scale, intuitive understanding of social physics. Although dense, continuous data and modern computation allow us to map many details about society, and to explain how communities might work, such raw mathematical models contain too many variables and complex relationships for most people to understand.

What is needed is some kind of dialogue between human intuition and the compelling reality of big data – a dialogue that is currently absent from management and government systems. If big data is to be deployed effectively, people must be able to understand and interpret the relevant statistics.

This calls for a new understanding of human behaviour and social dynamics that goes beyond traditional economic and political models. Only by developing the science and language of social physics will we be able to make a world of big data a world in which we want to live.

