The Unfortunate Truth About This Year's NYC Charter School Test Results
There have now been several stories in the New York news media about New York City’s charter schools’ “gains” on this year’s state tests (see here, here, here, here and here). All of them trumpeted the 3-7 percentage point increase in proficiency among the city’s charter students, compared with the 2-3 point increase among their counterparts in regular public schools. The consensus: Charters performed fantastically well this year.
In fact, the NY Daily News asserted that the "clear lesson" from the data is that "public school administrators must gain the flexibility enjoyed by charter leaders," and "adopt [their] single-minded focus on achievement." For his part, Mayor Michael Bloomberg claimed that the scores are evidence that the city should expand its charter sector.
All of this reflects a fundamental misunderstanding of how to interpret testing data, one that is frankly a little frightening to find among experienced reporters and elected officials.
As I've discussed many times, short-term changes in raw state testing results (whether scores or rates), especially small changes, are almost always a poor gauge of schools' effectiveness. This is true for several reasons, one of them being that you're comparing two different groups of students (i.e., the data don't follow students over time).
In the case of charter schools, this is an even more critical consideration, as charter sectors tend to be relatively small and in rapid flux. It seems that almost nobody thought to check on this before drawing conclusions from this year's results for NYC’s charter schools.
Had they done so, they would have found that charter enrollment increased a rather incredible 25 percent between 2010-11 and 2011-12, to roughly 47,000 students (most of this jump is presumably due to 11 new schools opening). This means that almost 10,000 of these students, a large proportion of them in tested grades, were not in charters last year.
There is no way to know, based on the public data, how this affected the results.
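For anyone who wants to check the enrollment arithmetic, here is a minimal sketch. It uses only the two figures quoted above (roughly 47,000 students and a 25 percent increase); the variable names are mine, and the result is a net change, so the actual number of students new to charters could be higher.

```python
# Figures quoted in the post; both are approximate.
current_enrollment = 47_000   # charter enrollment in 2011-12 ("roughly 47,000")
growth_rate = 0.25            # increase over 2010-11 ("25 percent")

# Back out the implied 2010-11 enrollment and the net increase.
prior_enrollment = current_enrollment / (1 + growth_rate)
net_new_students = current_enrollment - prior_enrollment

# Share of this year's charter students accounted for by net growth alone.
share_new = net_new_students / current_enrollment

print(f"Implied 2010-11 enrollment: {prior_enrollment:,.0f}")
print(f"Net increase in students:   {net_new_students:,.0f}")
print(f"Share of 2011-12 enrollment: {share_new:.0%}")
```

Net growth alone comes to about 9,400 students, or roughly one in five of this year's charter enrollment, before any other source of turnover is considered.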
(And this isn't even counting "normal" mobility - i.e., students entering/leaving the district or moving between sectors, as well as a new cohort of third graders entering the sample, while eighth graders leave it. When you add it all up, it's plausible that roughly half the charter students tested this year are not included in the tested sample from last year.)
In other words, the average scores for charter students as a whole are not comparable between years. The data by themselves cannot even tell you whether last year's charter students improved, to say nothing of how much of the change is due to actual school effectiveness, and how that compares with that of regular public schools (see here and here for actual evidence from 2009 and earlier).
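To make the composition problem concrete, here is a purely hypothetical sketch. Every count and proficiency rate below is invented for illustration (only the two totals are chosen to match the enrollment figures above), and the group names are my own assumptions. It shows that a sector's overall proficiency rate can rise by several points even when the students present in both years show no change at all.

```python
# HYPOTHETICAL numbers throughout; none of these rates come from real data.

# Students tested in both years, with NO change in their proficiency rate.
returning_n = 30_000
returning_rate = 0.50

# Students tested only in year 1 (e.g., eighth graders who aged out).
leavers_n = 7_600
leavers_rate = 0.40

# Students tested only in year 2 (e.g., new schools, new third graders).
entrants_n = 17_000
entrants_rate = 0.60


def pooled_rate(groups):
    """Overall proficiency rate for a list of (count, rate) groups."""
    total = sum(n for n, _ in groups)
    proficient = sum(n * rate for n, rate in groups)
    return proficient / total


year1 = pooled_rate([(returning_n, returning_rate), (leavers_n, leavers_rate)])
year2 = pooled_rate([(returning_n, returning_rate), (entrants_n, entrants_rate)])
change_points = (year2 - year1) * 100

print(f"Year 1 rate: {year1:.1%}")
print(f"Year 2 rate: {year2:.1%}")
print(f"Change: {change_points:+.1f} percentage points")
```

With these made-up inputs, the overall rate rises by about 5.6 percentage points, entirely because of who entered and left the tested sample. Reverse the leavers' and entrants' rates and the same arithmetic produces a decline of the same size. This says nothing about what actually happened in NYC; it only shows why the year-to-year change cannot be read as a "gain."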
So, every news story that presented the increase in average charter scores or proficiency rates as “gains,” and used the results to draw conclusions about the effectiveness of the sector, is seriously - albeit unintentionally - misinforming the public.*
(And, by the way, this also goes for arguments in previous years asserting the opposite position, based on charter students’ lackluster results.)
Charter schools and their students may have done very well over the past year, or they may not have. These data cannot tell us either way. Period.
- Matt Di Carlo
* The same could be said for coverage of the city's regular public schools (the sheer size of the city's regular public school population makes it less prone than charters to large amounts of sampling variation, but this year's changes were so small that it's unsafe to draw any conclusions [for this and other reasons as well, including the fact that NY administered new, longer tests this year, designed by a new contractor]).
Good question. The short answer is no (at least not in this case). For one thing, standard educational variables (e.g., lunch program eligibility) are notoriously crude. They’re just “yes/no” variables, and ignore all the variation within these groups.
Looking at subgroups is helpful, and, in certain cases, when increases are very large and shared by subgroups, I think you can cautiously conclude that there has been some “real” improvement between years among students in the overlapping groups (though the magnitude of that improvement, as well as its causes, would remain open).
In this instance, given the growth of the city’s charter enrollment, the small size of the changes, and the other factors involved (e.g., new tests), there’s no way to draw any conclusions about the overall sector.
By the way, this is a great paper on this topic, one that I should have cited in the post:
Thanks for your comment,