Why data scientists reject a ‘Muslim ban’
Behavior, not beliefs, is best predictor of future actions
Because some Muslims want to harm Americans, wouldn’t banning or monitoring Muslims keep us safer? This question burns as a white-hot national controversy.
Even though President Trump’s travel ban targets by country rather than religion, the question of religious screening remains open, lacking broad agreement. Many stand behind the president’s campaign pledge to ban Muslims, his support for a Muslim registry, and his consideration of Muslim internment. They believe religious identity predicts risk and reason that adopting religious equality — meaning that religion plays no part in any kind of vetting — would endanger civilians.
They see such a policy as overly idealistic political correctness.
What best predicts behavior? Behavior. This stands as a core tenet in my field, predictive analytics (a.k.a. machine learning, a branch of data science), the science of learning from data to drive decisions. As its universal value solidifies, the private and public sectors are rapidly deploying this technology, propelling its market beyond a projected $6.5 billion within a few years.
What do the data tell us? Screening by a demographic category such as religion weakens security; screening instead by behavior strengthens it. Religious screening compromises the advancements in security we stand to gain from the latest in data science.
When used most effectively, predictive analytics assesses individuals by their prior behavior. Behavioral data identifies risk more proficiently than demographic data. The reason for this is intuitive: What you’ve done reflects who you are, while the category in which you belong provides much less precise insight.
Religious screening operates like the bluntest of instruments. Banning an entire religious group from entry would eliminate some attacks, but so would banning, for example, all males between 18 and 35 (across religions). Such broad-stroked approaches would serve only as extremely approximate stand-ins for more revealing behavioral indicators, which can include prior online, financial, travel and criminal activities. Factors such as these present the best opportunity to improve security screening.
Predictive analytics automatically discovers effective combinations of predictive factors, i.e., patterns. Companies and governments apply these patterns to screen millions of individuals for risk on a daily basis. They predict whether you will commit fraud, miss a bill payment, default on a loan, cancel as a customer, quit your job, commit a crime, or respond poorly to a medical treatment. Across these domains, the data repeatedly show that your prior behavior provides the most powerful indicator of future behavior.
Admittedly, data analysis itself hasn’t settled the debate as to whether being Muslim is intrinsically predictive of terrorism. One side claims Islamic doctrines are more susceptible to the perversions of extremism. The other, with which I agree, contends that all major religions are susceptible, and that root causes such as geopolitical and socioeconomic factors are the true catalysts, even when terrorists claim they act in the name of a religion.
Tallying up which religion generates the most terrorism has failed to resolve the dispute, because the results vary depending on which attacks are considered true terrorism, which are considered secular, and which time spans are included.
But data analysis does deliver one largely undisputed, critical insight: For no major religion do individual members present a particular danger. Even in the least favorable assessment, individual members of any one religion show an admissibly low risk level, usually below one hundredth of a percent.
As a result, religious identity fails to expose malicious intent. Like any other demographic category, religion places people into broad groups. If religions differ in their overall risk levels, this reflects only general trends rather than absolutes.
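The arithmetic behind this point can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the group size of one million is hypothetical, the risk level of 0.01 percent is the upper bound quoted above rather than a measured figure, and the function name is invented for this example.

```python
def category_screen_outcomes(group_size: int, base_rate: float) -> dict:
    """Outcomes of flagging every member of a group (hypothetical helper).

    group_size: number of people in the category (assumed for illustration)
    base_rate:  fraction of members who pose a real threat
    """
    # People correctly flagged: the few who actually pose a threat.
    true_positives = round(group_size * base_rate)
    # Everyone else in the category is flagged wrongly.
    false_positives = group_size - true_positives
    # Precision: the share of flagged people who were real threats.
    precision = true_positives / group_size
    return {
        "true_positives": true_positives,
        "false_positives": false_positives,
        "precision": precision,
    }


# Assumed inputs: one million people, risk of one hundredth of a percent.
result = category_screen_outcomes(group_size=1_000_000, base_rate=0.0001)
print(result)
```

Under these assumptions, a screen that flags the whole category is wrong about 999,900 of every million people it flags. That is the sense in which membership in a broad group says almost nothing about any one individual.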
I’ve found that even my more conservative colleagues, who feel Islam is culpable, understand the culprits are a minuscule minority for any religion. Those colleagues agree that there are many millions of peaceful Muslims who — at least ideally — should be free to practice their faith without being treated differently. As for the opinion of my colleagues as an overall group, the majority of data scientists polled oppose Trump’s travel ban.
Ultimately, behavioral data always prevail. But only a steadfast investment in this more sophisticated, behavior-based approach delivers the goods. During development, if behavioral data fail to out-predict religious screening, that is not a sign of failure; rather, it is a signal that we must continue the effort by collecting more behavioral data.
This practice pays off handsomely. Behavioral screening built upon enriched data strengthens security. This is the very process of bringing security fully into the Information Age. It’s also the antidote to religious screening, a practice that satisfies the definitions of religious intolerance, discrimination and prejudice (“prejudging” by religion). We can do better for both security and social justice, so we must.
Eric Siegel, Ph.D., is the founder of the Predictive Analytics World conference series, author of “Predictive Analytics: The Power to Predict Who Will Click, Buy, Lie, or Die,” and a former computer science professor at Columbia University. To comment, submit your letter to the editor at www.sfchronicle.com/letters.