A common refrain around corporate spaces these days is “give me the data”, and it’s often heard at the end of a meeting when the boss is trying to wriggle out of spending money. Well, here’s what can happen when you give them the data.
Data shows there’s a correlation between the number of people who drown in pools and the number of films Nicolas Cage appears in. Data also shows a strong association between per capita cheese consumption and the number of people who die by becoming entangled in sheets. There are also worrying figures that have linked the divorce rate with margarine consumption, and another set linking marriage rates with the number of people who drown after falling out of a fishing boat.
Those correlations were collected by military intelligence analyst Tyler Vigen, who wanted to show the difference between correlation and causation and, along the way, also showed the many weird ways Americans die. His blog has now been turned into a book, Spurious Correlations, and it’s a timely reminder that data needs to be handled with care.
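For anyone who wants to see how easily such matches turn up, here is a minimal sketch. It does not use Vigen's figures or his method. It simply invents 40 unrelated "annual statistics" as random walks over ten years (the counts, the seed and the function names are all assumptions made for illustration) and then hunts for the best-matched pair.

```python
import random
from itertools import combinations


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / (var_x * var_y) ** 0.5


def random_walk(rng, length):
    """An invented 'annual statistic': each year drifts randomly from the last."""
    value, series = 0.0, []
    for _ in range(length):
        value += rng.gauss(0, 1)
        series.append(value)
    return series


def strongest_pair(n_series=40, n_years=10, seed=1):
    """Build unrelated series and return the most correlated pair and its r."""
    rng = random.Random(seed)
    series = [random_walk(rng, n_years) for _ in range(n_series)]
    best = max(
        combinations(range(n_series), 2),
        key=lambda pair: abs(pearson(series[pair[0]], series[pair[1]])),
    )
    return best, pearson(series[best[0]], series[best[1]])


if __name__ == "__main__":
    (i, j), r = strongest_pair()
    print(f"series {i} and series {j}: r = {r:.2f}")
```

Forty series give 780 possible pairs, so a striking match between two of them is close to guaranteed even though nothing connects any of them. Compare enough things and something will always line up.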
Data now has a halo of respectability in meeting rooms. It’s treated with respect (sometimes awe); it wins arguments; it’s dressed in graphs to make it more imposing; and it’s usually given the last word. The only way to beat data in a boardroom is to bring another set of statistics to the table.
The reason everyone is getting excited by data is that there’s so much of it around — and there’s enough computer power to crunch it into “meaningful” insights. Many bosses demand data when they can’t make decisions. The mantra in big companies is that their people should collect as much data as possible so the geeks in data analytics can dump it into algorithms and get the answers to every question, including those nobody ever thought of asking. It’s more fun than an executive toy.
As much fun as it is, some fear we are entering an era of dodgy data. One of them is Richard Nisbett, author of Mindware: Tools for Smart Thinking, who is particularly worried about how data is used in health studies. For instance, he cites a statistical correlation between taking vitamin E and a lower risk of prostate cancer in men. This finding got researchers so excited they ran a proper experiment on people and found vitamin E actually contributes to the likelihood of prostate cancer. So the data told men to pop a vitamin, while the experiment told them to take a chill pill.
The health industry is awash with popular correlations. For instance, when we learned people in the Mediterranean live longer, everyone rushed to the conclusion it was the olive oil. Others have linked the extended lifespan of the Japanese to soy consumption or seaweed diets. And when the longevity of Greek men on the island of Icaria was discovered, people started drawing conclusions from the fact they don’t wear watches, or that they drink a special “mountain tea”. And, of course, we all drink a lot of red wine because long-lived French people do.
Now, all of those things may have something to do with a long life; olive oil is a good starting point. Or they might be only part of the story: the French paradox might have more to do with cheese. Or they might be none of the story. But to sceptics a correlation on its own is not good enough. For instance, that finding about vitamin E intake and prostate cancer left the obvious question unasked: what else are men who take vitamin E doing? Wouldn’t they be the people who are so interested in their health they also eat well, exercise, get regular medical checks and have the money to invest in their health?
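The sceptics' point can be sketched in a few lines of code. What follows is a toy simulation, not a model of any real study: every probability in it is invented, and the pill is built to have no effect at all. The only thing that changes the risk of illness is whether a person is health-conscious, and health-conscious people are simply more likely to take the pill.

```python
import random


def simulate(n=100_000, randomised=False, seed=7):
    """Return (illness rate among pill takers, illness rate among non-takers).

    All probabilities are invented. By construction the pill does nothing.
    """
    rng = random.Random(seed)
    counts = {True: [0, 0], False: [0, 0]}  # takes_pill -> [people, ill]
    for _ in range(n):
        health_conscious = rng.random() < 0.5
        if randomised:
            # A trial assigns the pill by coin flip, ignoring lifestyle.
            takes_pill = rng.random() < 0.5
        else:
            # Left to themselves, the health-conscious take it far more often.
            takes_pill = rng.random() < (0.7 if health_conscious else 0.2)
        # Illness depends only on lifestyle, never on the pill.
        ill = rng.random() < (0.05 if health_conscious else 0.15)
        counts[takes_pill][0] += 1
        counts[takes_pill][1] += ill
    return tuple(counts[k][1] / counts[k][0] for k in (True, False))


if __name__ == "__main__":
    print("observed:   takers %.3f, non-takers %.3f" % simulate())
    print("randomised: takers %.3f, non-takers %.3f" % simulate(randomised=True))
```

In the observed data the pill takers look healthier, purely because of who chooses to take pills. Hand the pills out by coin flip and the gap all but disappears, which is why an experiment can contradict a correlation that looked convincing.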
The media is partly to blame. Data gives us headlines, it fits into big type and sometimes makes readers sit up and shake. But data should be a way into the conversation, not the final word. It should flag an area worth investigating, and then we should do what we’ve always done — ask good questions, look at it in context and be aware of the limitations of what it’s telling us. And when a boss says the data should provide the answer, ask the boss what the question is.