complex algorithms. These formulas reveal patterns that investigators can then compare with whatever the CDC or other health agencies report about the sickness. If a computer-generated prediction matches reality, we know the experts are onto something.
Search queries aren’t the only vein of data that researchers mine for flu clues. Svitlana Volkova, a data scientist at the Pacific Northwest National Laboratory, looks for gems of information on Twitter. She recently verified a new deep-learning method that probes tweets for signs of the flu. In an analysis of more than 170 million tweets posted over three years, Volkova and her colleagues found their model could accurately produce three-day forecasts of flu-like illnesses at a local level. That’s much quicker than waiting for flu reports from the CDC, which lag up to two weeks behind what’s happening in the world. (Facebook says it’s not in the flu-predicting business, so for now, your sick emoji doesn’t serve a greater good.)
Social media adds more data for researchers to work with, but it still has limitations. Annoyingly, the image we present online doesn’t always match the mucus-plagued person we are at home. Michael Paul, an information scientist at the University of Colorado at Boulder, recently found that people rarely tweet about their flu-like symptoms. In fact, he and his colleagues found that people tweet less when they’re ill. So the next time your favorite Twitter personality seems oddly quiet, it could be because they’re sick of Twitter, but it might just be that they’re sick. Paul also investigated Instagram and found that acute illness is the least common health topic for photo posting. Not surprisingly, flu-ridden people don’t love taking selfies.
Disease detectives, including Simonsen, hope that electronic health records could augment data from our tweets and posts. Insurance-claim forms, which list ailments and how they were treated, are particularly crucial. But people are typically reluctant to share private health data with researchers.
Epidemiologists would like to calm those privacy worries. They want only the numbers, never the names. But the final call lies with individuals. The public, Simonsen says, must weigh the trade-off: “Privacy on one side and the need to know more on the other.” That deliberation is even more pertinent since the EU implemented the General Data Protection Regulation this year, giving people more say in how their information is used.
Adding information from an app used to log health status, just as we do with fitness trackers or diet programs, could make big-data-based flu forecasts even more accurate, Simonsen says. And private companies might come around: UNICEF is working with several, including IBM, to gather data in order to improve responses to global illnesses.
Ultimately, the potential for big data to predict the next flu pandemic might depend on all of us, around the globe, oversharing our illnesses. The more we tweet about our #flu symptoms, the more data we generate. The more we allow companies to share that data with researchers, the more accurate those predictions can become. And all that sharing, Volkova says, “will help the world.”