Revisiting The "Best Evidence" Theory Of Charter School Performance
Among the more persistent arguments one hears in the debate over charter schools is that the “best evidence” shows charters are more effective. I have discussed this issue before (as have others), but it seems to come up from time to time, even in mainstream media coverage.
The basic point is that we should essentially dismiss – or at least regard with extreme skepticism – the two dozen or so high-quality “non-experimental” studies, which, on the whole, show modest or no differences in test-based effectiveness between charters and comparable regular public schools. In contrast, “randomized controlled trials” (RCTs), which exploit the random assignment of admission lotteries to control for differences between students, tend to yield positive results. Since, so the story goes, the “gold standard” research shows that charters are superior, we should go with that conclusion.
RCTs, though not without their own limitations, are without question powerful, and there is plenty of subpar charter research out there. That said, the “best evidence” argument is not particularly compelling (and it's also a distraction from the positive shift away from obsessing about whether charters do or don't work toward an examination of why). A full discussion of the methodological issues in the charter school literature would be long and burdensome, but it might be helpful to lay out three very basic points to bear in mind when you hear this argument.
1 – Only a relatively tiny handful of charters have ever been part of an RCT. This point is somewhat "cosmetic," but worth noting. As of now, there are ten random assignment studies of charter schools in the U.S. Ten studies sounds impressive, but if you add up the total number of schools across all these analyses, it comes to roughly 150-160 (for perspective, there are currently over 5,000 charters in the U.S., though these ten studies span the 2000s). And that figure "double counts" roughly 10-20 schools, as some of them are included in more than one analysis (e.g., the RCTs of Boston and Massachusetts include a bunch of the same charters).
Among the remaining eight analyses, five of them focus on a single school. All but two include data for fewer than 10 schools.
The two exceptions, which include about two-thirds of the charters among the 150-160, are the Hoxby/Murarka/Kang analysis of roughly 75 New York City charters (discussed here and reviewed here), and the 36 charter middle schools, located across 16 different states, included in this Mathematica study (discussed here).
Drawing conclusions about charters in general from the results for such a small group is a bit of a stretch, especially given the second point, below.
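To put the counts above in proportion, here is a quick back-of-the-envelope calculation. It uses only the rough figures quoted in this post; the variable names are mine, and because the national total is "over 5,000," the shares it prints should be read as approximate upper bounds rather than precise estimates.

```python
# Rough figures quoted in the post; every range is an approximation.
total_counted = (150, 160)   # schools summed across the ten RCTs
double_counted = (10, 20)    # schools that appear in more than one study
us_charters = 5000           # "over 5,000" nationally, so shares are upper bounds
nyc_schools = 75             # Hoxby/Murarka/Kang analysis (roughly)
mathematica_schools = 36     # Mathematica charter middle school study

# Unique schools: smallest total minus largest overlap, and vice versa.
unique_low = total_counted[0] - double_counted[1]
unique_high = total_counted[1] - double_counted[0]

# Share of all U.S. charters that have ever been part of an RCT.
share_low = unique_low / us_charters
share_high = unique_high / us_charters

# Share of the 150-160 total accounted for by the two largest studies.
two_largest = nyc_schools + mathematica_schools
largest_share_low = two_largest / total_counted[1]
largest_share_high = two_largest / total_counted[0]

print(f"Unique schools in RCTs: {unique_low}-{unique_high}")
print(f"Share of U.S. charters: {share_low:.1%}-{share_high:.1%}")
print(f"Two largest studies: {largest_share_low:.0%}-{largest_share_high:.0%} of the total")
```

On these numbers, the schools studied experimentally amount to about 3 percent or less of the sector, and two studies account for most of them.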
2 – Oversubscribed charters are not representative of charters as a whole. It's no accident that there are very few charters included in the group of RCTs thus far released. Typically, a charter school becomes “eligible” for an RCT by having an admission lottery. That usually means they have to be “oversubscribed” – i.e., they must have more applicants than available spaces.
So, unlike their students (from the pool of applicants), schools included in RCTs are not chosen randomly. Although it’s going too far to say that every oversubscribed charter is an excellent school, it’s a fairly safe assumption that they tend to be better than average (for instance, the Boston RCT found that oversubscribed charters outperformed those that weren't oversubscribed).
Half of the ten RCTs mentioned above focus on one or two schools in a given area that are oversubscribed, including high-cost models such as the Harlem Children’s Zone, KIPP and the SEED boarding school in Washington, D.C. This is a form of cherry-picking (in the non-judgmental sense of the term). The only real exceptions are the studies of charters in New York and Boston, two cities with relatively small, well-funded and carefully-managed charter sectors (see here for a discussion of these two sectors).
In other words, to the degree demand signals quality, oversubscribed schools are not your average charters (see here and here for more on this issue of external validity). RCTs may show positive charter effects not because they are better analyses, but rather because they select better schools. It is not defensible to argue that even the most rigorous analyses of these schools can be used to draw conclusions about charter schools nationally (or even in a given location).
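The selection argument can be illustrated with a toy simulation. This is a sketch under invented assumptions, not an estimate of anything: the number of schools, the spread of school effects, the link between demand and quality, and the 5 percent oversubscription cutoff are all hypothetical. It also assumes that a lottery study recovers each school's true effect perfectly, so any gap it shows comes purely from which schools get studied.

```python
import random
import statistics

def simulate(n_schools=5000, oversubscribed_share=0.05, seed=0):
    """Compare the average effect of all schools with that of lottery schools.

    All parameters are hypothetical and chosen only for illustration.
    """
    rng = random.Random(seed)
    # True effect of each charter relative to comparable regular public
    # schools, in test-score standard deviations; zero on average by design.
    effects = [rng.gauss(0.0, 0.1) for _ in range(n_schools)]
    # Assumption: applicant demand partly tracks quality, plus noise.
    demand = [e + rng.gauss(0.0, 0.1) for e in effects]
    # Schools above the demand cutoff are oversubscribed, so they hold
    # lotteries and are the only ones eligible for an RCT.
    cutoff = sorted(demand)[int((1 - oversubscribed_share) * n_schools)]
    lottery_effects = [e for e, d in zip(effects, demand) if d >= cutoff]
    return statistics.mean(effects), statistics.mean(lottery_effects)

all_mean, lottery_mean = simulate()
print(f"Average effect, all charters:     {all_mean:+.3f}")
print(f"Average effect, lottery charters: {lottery_mean:+.3f}")
```

In this setup, the lottery schools show a clearly positive average effect even though charters as a whole are no different from regular public schools. The studies are right about the schools they examine; they simply describe a different population.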
3 – The evidence so far suggests that experimental charter studies produce conclusions similar to those of non-experimental studies. As mentioned above, the “best evidence” theory is basically premised on the idea that there’s an unacceptably high probability that the results from two decades' worth of research on charter school performance would be substantially different were experimental evaluations conducted.
And this is in many respects the crux of the matter: Do RCTs and "non-experimental" methods produce different conclusions?
The evidence on this score is scarce (mostly because RCTs are so rare), but, so far, the research suggests that they do not. In fact, three of the experimental analyses included in the “best evidence” repertoire – the report on Boston charters, the Massachusetts study (including Boston), and Mathematica’s reanalysis of charter middle schools – tested this proposition, directly or indirectly, by performing the analysis both ways (the latter in a very rigorous manner). They all found that the two approaches produced similar conclusions, though, in some cases, the magnitudes of the differences varied between approaches (and, for Boston pilot schools, the divergence was substantial, but the sample was quite small).*
So, it is fair to say that this issue remains an open question, but that, so far, the comparisons have been encouraging. It is, of course, entirely possible that future analyses will tell a different story (one which, by the way, wouldn't necessarily show that RCTs are biased in favor of regular public schools).
In the meantime, for all their limitations, there is little basis for the level of skepticism of strong "non-experimental" analyses that would be required to dismiss this considerable body of research in favor of the RCT results for a small group of oversubscribed charters.
In general, charter schools are one of the few reforms under heavy discussion today that actually has a solid, long-term research base. If some charter supporters want to argue that this body of evidence – RCT and otherwise – suggests that charters might be more effective with low-income students, or when they set up shop in urban areas, there’s a case to be made there (though the evidence is far more mixed than is sometimes implied).
However, the research overall is rather clear – there are good, bad and medium charters, and the same goes for regular public schools.
Those who cling to the “best evidence” theory certainly score points for social scientific caution, but taking this viewpoint too far – i.e., essentially ignoring all research but RCTs – seems like wishful thinking more than anything else. And, more importantly, it distracts from the crucial task of trying to explain the wide variation in measured charter performance in terms of concrete policies and practices, which can inform all schools, regardless of their governance structures. The charter movement as a whole seems to be moving in this direction. Let’s hope this continues.
- Matt Di Carlo
* Similarly, one might compare RCTs and non-RCTs across studies. For example, two major analyses of New York City charters – the Hoxby RCT mentioned above and this CREDO report – also reached the same broad-brush conclusions, though the magnitudes of their estimates were a bit different, and the reasons for this are disputed by both parties (see here and here).