The Data Are In: Experiments In Policy Are Worth It

Our guest author today is David Dunning, professor of psychology at Cornell University, and a fellow of both the American Psychological Society and the American Psychological Association. 

When I was a younger academic, I often taught a class on research methods in the behavioral sciences. On the first day of that class, I took as my mission to teach students only one thing—that conducting research in the behavioral sciences ages a person. I meant that in two ways. First, conducting research is humbling and frustrating. I cannot count the number of pet ideas I have had through the years, all of them beloved, that have gone to die in the laboratory at the hands of data unwilling to verify them.

But, second, there is another, more positive way in which research ages a person. At times, data come back and verify a cherished idea, or even reveal a more provocative or valuable one that no one had ever expected. It is a heady experience in those moments for the researcher to know something that perhaps no one else knows, to be wiser (more aged, if you will) in a small corner of the human experience that he or she cares about deeply.

Here, I take as my mission to assert that what is true of the individual researcher is also true of policymakers trying to become wiser, or more aged, in the corner they most care about, whether it be education, health, the law, or the workplace. The best way to become wiser in the pursuit of effective policy is to subject one’s ideas to possible empirical verification through research. That research can take on many modes. Many new technologies allow for “data mining,” combing through data (such as student grades or stock market moves) to try to find regular patterns. Or, one can take “found” data to try to determine the causes of important outcomes.

The Worth of Experiments

But I submit that the best way to subject one’s ideas to research verification is through the classic experiment, particularly the traditional random field trial, the history of which is well-told in Jim Manzi’s recent book, Uncontrolled. Nothing, as Manzi suggests, really outperforms auditioning a new idea in an experiment. Would changing the name of a store, for example, produce higher sales? No armchair analysis or argument will settle the issue as fast, or as decisively, as changing the name at a few locations and comparing what happens to similar stores where the name stays the same.
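The store-name example can be sketched in a few lines of code. What follows is only a minimal illustration, not a method taken from Manzi's book: the function names, the store identifiers, and the sales figures are all hypothetical, and a real trial would also ask how much of the observed difference could be due to chance.

```python
import random
from statistics import mean


def assign_stores(store_ids, n_treated, seed=0):
    """Randomly pick which stores get the new name; the rest serve as controls."""
    rng = random.Random(seed)
    treated = set(rng.sample(sorted(store_ids), n_treated))
    control = set(store_ids) - treated
    return treated, control


def difference_in_means(sales, treated, control):
    """Average sales in renamed stores minus average sales in unchanged stores."""
    return mean(sales[s] for s in treated) - mean(sales[s] for s in control)


if __name__ == "__main__":
    # Hypothetical store identifiers.
    stores = [f"store_{i}" for i in range(20)]
    treated, control = assign_stores(stores, n_treated=10, seed=42)

    # Made-up weekly sales, for illustration only: noise around 100,
    # plus an assumed bump of 3 for the renamed stores.
    rng = random.Random(1)
    sales = {s: 100 + rng.gauss(0, 5) + (3 if s in treated else 0) for s in stores}

    print(round(difference_in_means(sales, treated, control), 2))
```

The point of letting a random draw choose the renamed stores, rather than a manager's judgment, is that the two groups then differ only by chance on everything except the name.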

And research in the form of an experiment matters. Although research is meant to lead, sometimes it misleads. However, experiments lead more reliably to the truth than do other forms of empirical study. Ioannidis, in his provocative 2005 JAMA paper examining 49 highly influential intervention studies, found that 90 percent were successfully replicated in full when they were randomized trial experiments, but only 20 percent achieved such success when they were not. Instead, nonrandom studies tended to produce replication attempts that were failures or that generated reduced benefits.

Experiments can be successful in “aging” policymakers in both the ways I cited above. First, experiments can be cheap and quick ways for bad ideas, no matter how attractive, to be disconfirmed. Regrettably, there are many more ideas that are plausible than are correct, and experiments are well positioned to discover the ideas that fail to live up to their reputation. And, sometimes, the best benefit of a small experiment is avoiding the cost and wasted time of a full-scale rollout that produces unwanted, unintended consequences or that distracts policymakers from pursuing more effective interventions.

But experiments also reveal those ideas that work. And in education, for example, experiments are revealing the worth of a number of exciting ideas. For example, in a recent NBER paper, Eric Bettinger and Rachel Baker have discovered that “coaching” students through college does a better job at nudging them to complete their degrees than do programs that increase financial aid. Greg Walton and Geoff Cohen in a Science paper discovered the remarkable value of revealing to college students that their adjustment difficulties were temporary and shared by all.

No longer apt to blame their troubles on “not belonging,” African American students presented with such insight earned higher GPAs three years afterward, cutting the minority achievement gap in half. And, of course, there are the long-standing benefits resulting from the Perry and Abecedarian pre-school enrichment programs, which produced higher education and job rates, and lower teenage pregnancy rates, among participants at age 21 relative to those not assigned to these programs.

Enhancing the Value of Experiments

However, to involve experimentation in policymaking most effectively, one must be mindful of four basic facts. First, one should not base a policy on only one experiment. Experiments do point to the truth, but they do so with some imperfection, and so continued study and replication are always called for.

Second, replication is necessary because it allows for refinements or extensions of any policy or intervention. If there is any single lesson from my specific sub-field, social psychology, it is that details and nuance matter. Thus, experiments should take care to test for any impact due to nuance.

For example, Robert Cialdini recently found that he could boost the willingness of hotel guests to forgo having their towels and linens washed if they saw a card truthfully stating that most hotel guests preferred it that way. However, this rate was boosted even more if guests were told that most guests who had previously stayed in their exact hotel room had held this preference. And there is the classic finding of Leventhal and colleagues that they could increase vaccination rates nine-fold among college students in an intervention simply by making sure students signed up for an appointment and were given a map to the campus health center.

Third, it is crucial that experiments be conducted by those with the proper training. In recent years, behavioral scientists have been gratified to see colleagues in related fields, such as philosophy and the law, begin to collect their own data and conduct their own experiments. However, behavioral scientists have also seen these smart and motivated colleagues make “rookie” errors in procedure or data analysis that render their results valueless. For example, every behavioral scientist knows that one needs to switch around the order of questions in long surveys, lest questions at the end be influenced by boredom or fatigue. These simple mistakes can be caught and corrected by those with appropriate training, making that training essential to the enterprise.
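One simple way to switch around question order is to give each respondent an independently shuffled copy of the survey. The sketch below assumes full randomization, which is only one of several counterbalancing schemes a trained researcher might choose; the function name, the respondent identifiers, and the questions are hypothetical.

```python
import random


def question_order(questions, respondent_id, seed=0):
    """Return a shuffled copy of the questions for one respondent.

    Seeding with the respondent's id makes each order reproducible,
    so the analyst can later recover who saw which question when.
    """
    rng = random.Random(f"{seed}-{respondent_id}")
    order = list(questions)  # copy, so the master list is left untouched
    rng.shuffle(order)
    return order


if __name__ == "__main__":
    # Hypothetical survey items.
    survey = ["Q1: sleep", "Q2: diet", "Q3: exercise", "Q4: stress", "Q5: mood"]
    for respondent in ["r001", "r002", "r003"]:
        print(respondent, question_order(survey, respondent))
```

Because each question lands near the end of the survey for only some respondents, fatigue and boredom are spread across all questions instead of piling up on the same few.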

Fourth, it is important for the behavioral scientist to fashion the research they conduct to best address the questions that policymakers may have. For example, my own research has inspired a few studies in medical education research to see if medical students and physicians hold accurate views of their professional skill. One can ask, for example, if doctors rate their skills too high or too low relative to their impressions of their colleagues. But that is not the most relevant question to ask. The most relevant question to ask is whether doctors know when to turn to a specialist for a consultation, that is, just before they are about to make a mistake.

Concluding Thoughts

In sum, the time is ripe for a new focus on experimentation to inform policymakers. With new technologies, it is just easier to reach people in the real world. All an experimenter needs is a tablet computer with a questionnaire loaded on it or the number of a respondent’s smart phone. With all these newly emerging advantages, it just makes sense to use experiments as a way to inform and shape policy.

That is, aging has its frustrations, but I do not know of anyone who would trade away the wisdom that it brings. And experiments are one of the most effective avenues available for a society to purchase such insight to the benefit of its members.

- David Dunning

I don't know. I'd like to see your assertions tested in a randomized controlled trial. Let's pick 100 policy ideas, then randomly assign them to be evaluated using experimental and non-experimental research designs. Whichever group has a higher uptake rate will be the winner.


Great post. Thoroughly enjoyed it.

Our high school tried a randomized intervention just this year. Data is still coming in, but I don't think it "worked."

Some of our stakeholders believed it was working, so the randomization will help. At the very least, we'll tinker with the intervention and continue to proceed cautiously.


Interesting article, thank you. What is considered to be a "long survey" that would require putting the questions in different order?