Most Experiments Don’t Work, and That’s OK


Tony Evans, PhD:
In a new study, teams of researchers joined forces to conduct a “megastudy” testing different approaches to encouraging people to go to the gym more regularly (Milkman et al., 2021). Among the approaches the teams proposed and tested were the following three:
Value affirmation: Participants completed a personality survey asking questions about their “grit” (their ability to stick with long-term goals) and were asked to reflect on the questions.
Defaults: By default, participants were scheduled to attend the gym three times each week.
Signed pledge: Participants wrote and signed a pledge to follow a regular workout schedule.
Which of the three approaches was successful in encouraging people to go to the gym more often?
The answer is “none of the above.” The study included a total of 54 different approaches, and only 45% of them were successful in encouraging participants to go to the gym more often. The researchers also tested whether the interventions had long-term effects on behavior: of the 45% of approaches that were successful in the short term, only 8% had lasting effects on participants after the conclusion of the study.
This study reveals two important things about applied behavioral science: First, changing people’s behavior is hard. Second, behavioral scientists need to be transparent about the experiments that don’t work out.
Human behavior is complicated and difficult to change. Even academic experts with years of experience, testing approaches grounded in scientific theories with decades of support, cannot reliably encourage people to go to the gym more often. We should not expect experiments to work out anywhere near 100% of the time, and we probably should not trust a scientist who delivers amazing results with every new study.
Additionally, the difficulty of changing behavior means that we should plan studies on the assumption that not everything will work. If the rate of success is 45%, then a study that tests three different approaches may make more sense than a study that focuses on only one: with three independent approaches, there is roughly an 83% chance that at least one of them will work, as the short sketch below shows.
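As a rough illustration, here is the arithmetic behind that 83% figure. This is a minimal sketch, assuming each approach succeeds independently with the megastudy’s 45% base rate; the independence assumption is a simplification for illustration, not something the study itself tests.

```python
# Minimal sketch: probability that at least one of n independent
# approaches succeeds, given a per-approach success rate p.
# The 0.45 base rate comes from the megastudy; treating the three
# approaches as independent is an illustrative assumption.

def p_at_least_one(p: float, n: int) -> float:
    """Chance that at least one of n independent approaches works."""
    return 1 - (1 - p) ** n

print(f"{p_at_least_one(0.45, 1):.3f}")  # 0.450 -> a single approach
print(f"{p_at_least_one(0.45, 3):.3f}")  # 0.834 -> three approaches
```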
The second lesson from the megastudy by Milkman and colleagues is that researchers should be transparent about the interventions that don’t work out. In behavioral science, there is a strong tendency to focus on publishing significant results. For a long time, researchers published the experiments that worked out and ignored (and in some cases, explicitly hid) the studies that failed to produce significant results. In the 1990s and early 2000s, it was common for researchers to conduct ten experiments, publish a paper based on the one that worked, and forget about the nine that failed (Baumeister, 2016). This left many readers (like me, as a graduate student) with the incorrect impression that good researchers were infallible, and it created pressure to make the results of failed experiments look better than they actually were. Part of the value of evidence-based behavioral science is that researchers can give people honest advice on what works (and what doesn’t). In this sense, being transparent about experiments that don’t work is key.
In the past, researchers focused primarily on talking about what works. But for behavioral science to have a lasting impact on organizations and public policy, it’s just as important to talk about what doesn’t work.

(Tony Evans, PhD, is an assistant professor of social psychology at Tilburg University in the Netherlands.)
