A common refrain around corporate spaces these days is “give me the data”, and it’s often heard at the end of a meeting when the boss is trying to wriggle out of spending money. Well, here’s what can happen when you give them the data.

Data shows there’s a correlation between the number of people who drown in pools and the number of films Nicolas Cage appears in. Data also shows a strong association between per capita cheese consumption and the number of people who die by becoming entangled in sheets. There are also worrying figures that have linked the divorce rate with margarine consumption, and another set linking marriage rates with the number of people who drown after falling out of a fishing boat.

Those correlations were collected by military intelligence analyst Tyler Vigen, who wanted to show the difference between correlation and causation and, along the way, also showed the many weird ways Americans die. His blog has now been turned into a book, Spurious Correlations, and it’s a timely reminder that data analysis is a science and that data should be handled with care.

Data now has a halo of respectability in meeting rooms. It’s treated with respect (sometimes awe); it wins arguments; it’s dressed in graphs to make it more imposing; and it’s usually given the last word. The only way to beat data in a boardroom is to bring another set of statistics to the table.

The reason everyone is getting excited by data is that there’s so much of it around, and there’s enough computer power to crunch it into “meaningful” insights. Many bosses demand data when they can’t make decisions. The mantra in big companies is that their people should collect as much data as possible so the geeks in data analytics can dump it into algorithms and get the answers to every question, including those nobody ever thought of asking. It’s more fun than an executive toy.

As much fun as it is, some fear we are entering an era of dodgy data. One of them is Richard Nisbett, who wrote Mindware: Tools for Smart Thinking, and who is particularly worried about how data is used in health studies. For instance, he cites a statistical correlation between taking vitamin E and a lower risk of prostate cancer in men. This finding got researchers so excited they ran a proper experiment on people and found vitamin E actually contributes to the likelihood of prostate cancer. So the data told men to pop a vitamin, while the experiment told them to take a chill pill.

The health industry is awash with popular correlations. For instance, when we learned people in the Mediterranean live longer, everyone rushed to the conclusion it was the olive oil. Others have linked the extended lifespan of the Japanese to soy consumption or seaweed diets. And when the longevity of Greek men on the island of Icaria was discovered, people started drawing conclusions from the fact they didn’t wear watches, or that they drank a special “mountain tea”. And, of course, we all drink a lot of red wine because long-lived French people do.

Now, all of those things may have something to do with a long life; olive oil is a good starting point. Or they might be only part of the story: the French paradox might have more to do with cheese. Or they might be none of the story. But to sceptics a correlation alone is not good enough. For instance, that finding about vitamin E intake and prostate cancer didn’t ask the obvious question: what else are men who take vitamin E doing? Wouldn’t they be the people who are so interested in their health they also eat well, exercise, get regular medical checks and have the money to invest in their health?

The media is partly to blame. Data gives us headlines: it fits into big type and sometimes makes readers sit up and shake. But data should be a way into the conversation, not the final word. It should flag an area worth investigating, and then we should do what we’ve always done: ask good questions, look at the data in context and be aware of the limitations of what it’s telling us. And when a boss says the data should provide the answer, ask the boss what the question is.

