More and more, it seems to know you better than you even know yourself
MENLO PARK, Calif. — We all know by now Facebook isn’t cool. Yet somehow it’s more popular than ever. This week, the company announced its growth continues to surge — not only in terms of the sheer number of Facebook users, but in terms of how much they use the site. On any given day, Mark Zuckerberg said, 63 per cent of Facebook’s 1.28 billion users log into the site. The proportion of users who log in at least six days a week has now surpassed 50 per cent.
How is it possible Facebook keeps getting more addictive over time, rather than less?
It’s possible because Facebook knows what you like — and it’s getting better at understanding you all the time.
As much work and data — your data — as Facebook feeds into its targeted advertising, it works at least as hard at figuring out which of your friends’ posts you’re most likely to want to see each time you open the app. Advertisers may butter Facebook’s bread, but its most pressing interest of all is in keeping its users coming back for more. If it ever fails, its advertising business will implode.
So how does Facebook know what we like? On a recent visit to the company’s headquarters in Menlo Park, Calif., I talked about that with Will Cathcart, who oversees the product management teams that work on the company’s news feed. The answer holds lessons for the future of machine learning, the media and the Internet at large.
Facebook launched the news feed in 2006, but it didn’t introduce the “like” button until 2009. Only then did the site have a way to figure out which posts you were actually interested in, and which new posts you might be interested in, based on what your friends and others were liking. In the years since its launch, the news feed has gone from being a simple chronological list to a machine learning product, with posts ranked in your feed according to the likelihood that you will find them interesting. The goal is to ensure, for example, that the first picture of your best friend’s new baby takes precedence over a remote acquaintance’s most recent Mafia Wars score.
For a while, Facebook likes — coupled with a few other metrics, such as shares, comments, and clicks — served as a pretty decent proxy for engagement. But they were far from perfect, Cathcart concedes. A funny photo meme might get thousands of quick likes, while a thoughtful news story analyzing the conflict in Ukraine would be punished by Facebook’s algorithms because it didn’t lend itself to a simple thumbs-up. The result was people’s news feeds became littered with the social-media equivalent of junk food. Facebook had become optimized for stories people Facebook-liked, rather than stories that people actually liked.
Worse, many of the same stories thousands of people Facebook-liked turned out to be ones thousands of other people genuinely hated. They included posts with clicky headlines designed to score cheap likes and clicks, which actually led to pages filled with spammy ads rather than the content the headline promised. But in the absence of a “dislike” button, Facebook’s algorithms had no way of knowing which posts were turning users off. Eventually, about a year ago, Facebook acknowledged it had a “quality content” problem.
This is not a problem specific to Facebook. It’s a problem that confronts every company or product that harnesses data analytics to drive decision-making. So how do you solve it? For some, the answer might be to temper data-driven insights with a healthy dose of human intuition. But Facebook’s news feed operates on a scale and a level of personalization that makes direct human intervention infeasible. So for Facebook, the answer was to begin collecting new forms of data designed to generate insights that the old forms of data, such as shares, comments and clicks, couldn’t.
Three sources of data in particular are helping Facebook to refashion its news feed algorithms to show users the kinds of posts that will keep them coming back: surveys, A/B tests, and data on the time users spend away from Facebook once they click on a given post, along with what they do when they come back.
Surveys can get at questions that other metrics can’t, while A/B tests offer Facebook a way to put its hunches under a microscope. Every time its developers make a tweak to the algorithms, Facebook tests it by showing it to a small percentage of users. At any given moment, Cathcart says, there might be 1,000 different versions of Facebook running for different groups of users. Facebook is gathering data on all of them, to see which changes are generating positive reactions and which ones are falling flat.
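For readers curious how a change can be shown to only a small percentage of users, here is a minimal sketch of one common approach: hashing a user ID to assign each person to a test group. It is an illustration only, not a description of Facebook’s system. The function name `assign_variant`, the experiment label and the one per cent default are all assumptions made for the example.

```python
import hashlib


def assign_variant(user_id: str, experiment: str, treatment_share: float = 0.01) -> str:
    """Deterministically place a user in 'treatment' or 'control'.

    Hypothetical helper: the names and the 1% default are assumptions
    for illustration, not anything Facebook has described.
    """
    # Hash the experiment name together with the user ID so that the
    # same user can land in different groups for different experiments.
    digest = hashlib.sha256(f"{experiment}:{user_id}".encode("utf-8")).hexdigest()
    # Map the first 8 hex digits (32 bits) to a number in [0, 1).
    bucket = int(digest[:8], 16) / 0x100000000
    return "treatment" if bucket < treatment_share else "control"


if __name__ == "__main__":
    # The same user always gets the same answer for a given experiment.
    print(assign_variant("user-42", "like-bait-demotion"))
```

Because the assignment depends only on the hash, no list of test users has to be stored, and many experiments can run side by side on overlapping groups of users.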
For instance, Facebook recently tested a series of changes designed to correct for the proliferation of “like-bait” — stories or posts that explicitly ask users to hit the “like” button in order to boost their ranking in your news feed. Some in the media worried Facebook was making unjustified assumptions about its users’ preferences. In fact, Facebook had already tested the changes on a small group of users before it publicly announced them. “We actually very quickly saw that the people we launched that improvement to were clicking on more articles in their news feed,” Cathcart explains.
When users click on a link in their news feed, Cathcart says, Facebook looks very carefully at what happens next. “If you’re someone who, every time you see an article from the New York Times, you not only click on it, but go offsite and stay offsite for a while before you come back, we can probably infer that you in particular find articles from the New York Times more relevant” — even if you don’t actually hit “like” on them.
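The inference Cathcart describes can be sketched in a few lines. The example below assumes a hypothetical log of clicks, each recorded as a source name and the number of seconds the user stayed away before returning, and summarizes each source by its median time away. The function name, the data layout and the three-click minimum are all assumptions for illustration; the article does not say how Facebook actually computes this.

```python
from collections import defaultdict
from statistics import median


def source_affinity(click_events, min_clicks=3):
    """Return {source: median seconds away} for one user.

    click_events is an iterable of (source, seconds_away) pairs.
    Sources with fewer than min_clicks clicks are left out, since a
    single long visit says little. All names here are hypothetical.
    """
    by_source = defaultdict(list)
    for source, seconds_away in click_events:
        by_source[source].append(seconds_away)
    return {
        source: median(times)
        for source, times in by_source.items()
        if len(times) >= min_clicks
    }


if __name__ == "__main__":
    events = [
        ("nytimes.com", 120), ("nytimes.com", 200), ("nytimes.com", 90),
        ("spammy.example", 3), ("spammy.example", 5), ("spammy.example", 4),
    ]
    print(source_affinity(events))
```

A long median time away hints that the reader actually read the article, while a quick bounce back to the feed suggests the link did not deliver, whether or not anyone hit “like.”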
At the same time, Facebook has begun more carefully differentiating between the likes a post gets before users click on it and the ones it gets after they’ve clicked. A lot of people might be quick to hit the like button on a post based solely on a headline or teaser that panders to their political sensibilities. But if very few of them go on to like or share the article after they’ve read it, that might indicate to Facebook that the story didn’t deliver.
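One way to picture that comparison is to count a post’s likes in two buckets, before and after the click, and look at how often readers still like it once they have seen the article. The sketch below is a hedged illustration: the `PostStats` fields, the 100-click minimum and the two per cent threshold are invented for the example and are not figures from Facebook.

```python
from dataclasses import dataclass


@dataclass
class PostStats:
    """Hypothetical per-post counters; field names are assumptions."""
    likes_before_click: int  # likes from users who never opened the link
    clicks: int              # users who opened the link
    likes_after_click: int   # likes from users who opened it first


def post_click_like_rate(stats):
    """Share of clickers who liked the post after reading, or None if no clicks."""
    if stats.clicks == 0:
        return None
    return stats.likes_after_click / stats.clicks


def may_not_deliver(stats, min_clicks=100, min_rate=0.02):
    """Flag posts liked mostly on the headline and rarely after reading.

    The thresholds are illustrative assumptions, not known values.
    """
    rate = post_click_like_rate(stats)
    if rate is None or stats.clicks < min_clicks:
        return False  # too little evidence to judge
    return stats.likes_before_click > stats.likes_after_click and rate < min_rate


if __name__ == "__main__":
    headline_bait = PostStats(likes_before_click=5000, clicks=1000, likes_after_click=5)
    print(may_not_deliver(headline_bait))
```

A real ranking system would presumably treat this as one signal among many rather than a yes-or-no rule, but the split between the two kinds of likes is the core of the idea.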
Some have speculated Facebook’s news feed changes were specifically targeting certain sites for demotion while elevating the ranking of others. That’s not the case, Cathcart insists. Facebook defines high-quality content not by any objective ranking system, but according to the tastes of its users. If you love Upworthy and find the Times snooze-worthy, then Facebook’s goal is to show you more of the former and less of the latter.
Each time you log in to Facebook, the site’s algorithms have to choose from among an average of 1,500 possible posts to place at the top of your news feed. “The perfect test for us,” Cathcart says, “would be if we sat you down and gave you all 1,500 stories and asked you to rearrange them from 1 to 1,500 in the order of what was most relevant for you. That would be the gold standard.” But that’s a little too much testing, even for Facebook.
For a lot of people, the knowledge that Facebook’s computers are deciding which stories to show them, and which ones to hide, remains galling. Avid Twitter users swear by that platform’s more straightforward chronological timeline, which relies on users to carefully curate their own list of people to follow. But there’s a reason Facebook’s engagement metrics keep growing while Twitter’s are stagnant. As much as we’d like to think we could do a better job than the algorithms, the fact is most of us don’t have time to sift through 1,500 posts on a daily basis. And so, even as we resent Facebook’s paternalism, we keep coming back to it.
Just maybe, if Facebook keeps getting better at figuring out what we actually like as opposed to what we just Facebook-like, we’ll start to actually like Facebook itself a little more than we do today.