As psychology ‘cleans house,’ do the classic experiments hold up?
The urge to pull down statues extends well beyond the public squares of nations in turmoil. Lately it has been stirring the air in some corners of science, particularly psychology.
In recent months, researchers and some journalists have strung cables around the necks of at least three monuments of the modern psychological canon:
The famous Stanford Prison Experiment, which found that people play-acting as guards quickly exhibited uncharacteristic cruelty.
The landmark marshmallow test, which found that young children who could delay gratification showed greater educational achievement years later than those who could not.
And the lesser known but influential concept of ego depletion — the idea that willpower is like a muscle that can be built up but also tires.
The assaults on these studies are not all new. Each is a story in its own right, involving debates over methodology and statistical bias that have surfaced before in some form.
But since 2011, the psychology field has been giving itself an intensive background check, redoing more than 100 well-known studies. Often the original results cannot be reproduced, and the entire contentious process has been coloured, inevitably, by generational change and charges of patriarchy.
It is one thing to frisk the studies appearing almost daily in journals that form the current back-and-forth of behaviour research. It is somewhat different to call out experiments that became classics — and world-famous outside of psychology — because they dramatized something people recognized in themselves and in others.
They live in the common culture as powerful metaphors, explanations for aspects of our behaviour that we sense are true and that are captured somehow in a laboratory mini-drama constructed by an inventive researcher, or research team.
The Stanford Prison Experiment is a case in point. In the summer of 1971, Philip Zimbardo, a midcareer psychologist, recruited 24 college students through newspaper ads and randomly cast half of them as “prisoners” and half as “guards,” setting them up in a mock prison, complete with cells and uniforms. He had the simulation filmed.
After six days, Zimbardo called the experiment off, reporting that the “guards” began to assume their roles too well. They became abusive, some of them shockingly so.
Zimbardo published dispatches about the experiment in a couple of obscure journals. He provided a more complete report in an article he wrote in the New York Times, describing how cruel instincts could emerge spontaneously in ordinary people as a result of situational pressures and expectations.
That article and Quiet Rage, a documentary about the experiment, helped make Zimbardo a star in the field and media favourite, most recently in the wake of the Abu Ghraib prison scandal in the early 2000s.
Perhaps the central challenge to the study’s claims is that its author coached the “guards” to be hard cases.
Is this coaching “not an overt invitation to be abusive in all sorts of psychological ways?” wrote Peter Gray, a psychologist at Boston College who decided to exclude any mention of the simulation from his popular introductory textbook.
“And, when the guards did behave in these ways and escalated that behavior, with Zimbardo watching and apparently (by his silence) approving, would that not have confirmed in the subjects’ minds that they were behaving as they should?”
Recent challenges have echoed Gray’s, and earlier this month Zimbardo was moved to post a response online.
“My instructions to the guards, as documented by recordings of guard orientation, were that they could not hit the prisoners but could create feelings of boredom, frustration, fear and a sense of powerlessness — that is, ‘we have total power of the situation and they have none,’” he wrote. “We did not give any formal or detailed instructions about how to be an effective guard.”
In an interview, Zimbardo said that the simulation was a “demonstration of what could happen” to some people influenced by powerful social roles and outside pressures, and that his critics had missed this point.
Given modern ethics restrictions, mounting precise replications of old experiments is not always possible. The prison experiment would likely have to be seriously modified to pass institutional review.
When Brian Nosek, a professor of psychology at the University of Virginia, published his first major replication paper in 2015, finding that about 60 per cent of prominent studies did not pan out on a second try, it was a gift to skeptics eager to dismiss the entire field as a congregation of poorly anchored findings. It is not. On the contrary. Housecleaning is a crucial corrective in science, and psychology has led by example. But in science, as in life, there is reason for care before dragging the big items to the curb.