Schools Aren't The Only Reason Test Scores Change

In all my many posts about the interpretation of state testing data, it seems that I may have failed to articulate one major implication, which is almost always ignored in the news coverage of the release of annual testing data. That is: raw, unadjusted changes in student test scores are not by themselves very good measures of schools' test-based effectiveness.

In other words, schools can have a substantial impact on performance, but student test scores also increase, decrease or remain flat for reasons that have little or nothing to do with schools. The first, most basic reason is error. There is measurement error in all test scores - for various reasons, students taking the same test twice will get different scores, even if their "knowledge" remains constant. Also, as I've discussed many times, there is extra imprecision when using cross-sectional data, since each year's results come from a different group of students. Often, changes in scores or rates, especially when they’re small in magnitude and/or based on smaller samples (e.g., individual schools), do not represent actual progress (see here and here). Finally, even when changes are "real," they are influenced by a variety of non-schooling inputs, such as parental education levels, families' economic circumstances, parental involvement, etc. These factors don't just influence how highly students score; they are also associated with progress (that's why value-added models exist).
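To make the point about error and sample size concrete, here is a minimal simulation sketch. Everything in it is an assumption chosen for illustration (the score scale, the spread between students, the size of the test noise, and the function names); it is not based on any state's actual data. It draws a new cohort each year from the same underlying distribution, so any year-to-year "change" in a school's average is pure noise.

```python
import random
import statistics


def simulated_change(n_students, rng, true_mean=500.0, student_sd=50.0, error_sd=30.0):
    """Year-to-year change in a school's mean score when nothing real has changed.

    All parameter values are hypothetical. Each year is a fresh cohort
    (cross-sectional data), and each score includes test measurement error.
    """
    def cohort_mean():
        return statistics.fmean(
            rng.gauss(true_mean, student_sd) + rng.gauss(0.0, error_sd)
            for _ in range(n_students)
        )

    return cohort_mean() - cohort_mean()


def spread_of_changes(n_students, n_schools=500, seed=0):
    """Standard deviation of the spurious changes across many simulated schools."""
    rng = random.Random(seed)
    changes = [simulated_change(n_students, rng) for _ in range(n_schools)]
    return statistics.pstdev(changes)


small = spread_of_changes(25)    # e.g., one tested grade in a small school
large = spread_of_changes(400)   # e.g., a much larger school or a small district
print(f"typical spurious change, 25 students:  {small:.1f} points")
print(f"typical spurious change, 400 students: {large:.1f} points")
```

Under these assumptions, the smaller group's average swings several times more than the larger group's, even though the simulated "schools" are identical in both years.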

Thus, to the degree that test scores are a valid measure of student performance, and changes in those scores a valid measure of student learning, schools aren’t the only suitors at the dance. We should stop judging school or district performance by comparing unadjusted scores or rates between years.

Do Charter Schools Serve Fewer Special Education Students?

A new report from the U.S. Government Accountability Office (GAO) provides one of the first large-scale comparisons of special education enrollment between charter and regular public schools. The report’s primary finding, which, predictably, received a fair amount of attention, is that roughly 11 percent of students enrolled in regular public schools were on special education plans in 2009-10, compared with just 8 percent of charter school students.

The GAO report’s authors are very careful to note that their findings merely describe what you might call the “service gap” – i.e., the proportion of special education students served by charters versus regular public schools – but that they do not indicate the reasons for this disparity.

This is an important point, but I would take the warning a step further: The national- and state-level gaps themselves should be interpreted with extreme caution.

Ohio's New School Rating System: Different Results, Same Flawed Methods

Without question, designing school and district rating systems is a difficult task, and Ohio was somewhat ahead of the curve in attempting to do so (the state is also great about releasing a ton of data every year). As part of its application for ESEA waivers, the state recently announced a newly designed version of its long-standing system, with the changes slated to go into effect in 2014-15. State officials told reporters that the new scheme is a “more accurate reflection of … true [school and district] quality."

In reality, however, despite its best intentions, what Ohio has done is perpetuate a troubled system by making less-than-substantive changes that seem to serve the primary purpose of giving lower grades to more schools in order for the results to square with preconceptions about the distribution of “true quality." It’s not a better system in terms of measurement - both the new and old schemes consist of mostly the same inappropriate components, and the ratings differentiate schools based largely on student characteristics rather than school performance.

So, whether or not the aggregate results seem more plausible is not particularly important, since the manner in which they're calculated is still deeply flawed. And demonstrating this is very easy.

Making (Up) The Grade In Ohio

In a post last week over at Flypaper, the Fordham Institute’s Terry Ryan took a “frank look” at the ratings of the handful of Ohio charter schools that Fordham’s Ohio branch manages. He noted that the Fordham schools didn’t make a particularly strong showing, ranking 24th among the state’s 47 charter authorizers in terms of the aggregate “performance index” among the schools it authorizes. Mr. Ryan takes the opportunity to offer a few valid explanations as to why Fordham ranked in the middle of the charter authorizer pack, such as the fact that the state’s “dropout recovery schools," which accept especially hard-to-serve students who left public schools, aren’t included (which would likely bump up Fordham's relative ranking).

Mr. Ryan doth protest too little. His primary argument, which he touches on but does not flesh out, should be that Ohio’s performance index is more a measure of student characteristics than of any defensible concept of school effectiveness. By itself, it reveals relatively little about the “quality” of schools operated by Ohio’s charter authorizers.

But the limitations of measures like the performance index, which are discussed below (and in the post linked above), have implications far beyond Ohio’s charter authorizers. The primary means by which Ohio assesses school/district performance is the state’s overall “report card grades," which are composite ratings composed of multiple test-based measures, including the performance index. Unfortunately, these ratings are also not a particularly useful measure of school effectiveness. Not only are the grades unstable between years, but they also rely too heavily on test-based measures, including the index, that fail to account for student characteristics. While any attempt to measure school performance using testing data is subject to imprecision, Ohio’s effort falls short.

The Stability Of Ohio's School Value-Added Ratings And Why It Matters

I have discussed before how most testing data released to the public are cross-sectional, and how comparing them between years entails the comparison of two different groups of students. One way to address these issues is to calculate and release school- and district-level value-added scores.

Value-added estimates are not only longitudinal (i.e., they follow students over time), but the models go a long way toward accounting for differences in the characteristics of students between schools and districts. Put simply, these models calculate “expectations” for student test score gains based on student (and sometimes school) characteristics, which are then used to gauge whether schools’ students did better or worse than expected.
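The "expected versus actual" logic can be sketched in a few lines. This is a deliberately stripped-down illustration, not Ohio's model or any state's actual value-added specification: it assumes the only student characteristic is the prior-year score, fits a single straight line, and uses made-up records and hypothetical function names.

```python
from collections import defaultdict


def fit_line(xs, ys):
    """Ordinary least squares fit of y = intercept + slope * x."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return mean_y - slope * mean_x, slope


def school_value_added(records):
    """records: list of (school, prior_score, current_score) tuples.

    Assumption: expectations depend on the prior score alone. Real models
    typically use more test history and more student characteristics.
    """
    intercept, slope = fit_line([r[1] for r in records], [r[2] for r in records])
    residuals = defaultdict(list)
    for school, prior, current in records:
        expected = intercept + slope * prior          # the "expectation"
        residuals[school].append(current - expected)  # better or worse than expected
    return {school: sum(v) / len(v) for school, v in residuals.items()}


# Made-up data: two schools serving students with identical prior scores.
records = [
    ("A", 400, 410), ("A", 500, 510), ("A", 600, 610),
    ("B", 400, 430), ("B", 500, 530), ("B", 600, 630),
]
value_added = school_value_added(records)
print(value_added)
```

In this toy example, school B's students score above their expected level and school A's below it, even though both schools' students started from the same place. A comparison of raw scores alone cannot make that distinction when students' starting points differ.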

Ohio is among the few states that release school- and district-level value-added estimates (though this number will probably increase very soon). These results are also used in high-stakes decisions, as they are a major component of Ohio’s “report card” grades for schools, which can be used to close or sanction specific schools. So, I thought it might be useful to take a look at these data and their stability over the past two years. In other words, what proportion of the schools that receive a given rating in one year will get that same rating the next year?
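One simple way to frame the stability question is sketched below. The school IDs, the rating labels ("above", "met", "below") and the function name are placeholders for illustration, not the state's actual data or file format. The function counts, among schools rated in both years, the share that kept the same rating, overall and by first-year rating.

```python
from collections import Counter


def rating_stability(year1, year2):
    """year1, year2: dicts mapping a school ID to its rating in each year.

    Returns (overall share with the same rating, share by year-1 rating),
    computed only for schools that appear in both years.
    """
    common = year1.keys() & year2.keys()
    if not common:
        raise ValueError("no schools appear in both years")
    totals = Counter(year1[s] for s in common)
    same = Counter(year1[s] for s in common if year1[s] == year2[s])
    overall = sum(same.values()) / len(common)
    by_rating = {rating: same[rating] / totals[rating] for rating in totals}
    return overall, by_rating


# Hypothetical ratings for a handful of schools.
y1 = {"s1": "above", "s2": "above", "s3": "met", "s4": "below", "s5": "met"}
y2 = {"s1": "above", "s2": "met", "s3": "met", "s4": "met", "s6": "below"}
overall, by_rating = rating_stability(y1, y2)
print(overall, by_rating)
```

Schools rated in only one of the two years (here "s5" and "s6") are dropped, which is one of several reasonable choices; a fuller analysis might report how many schools that excludes.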

In Ohio, Charter School Expansion By Income, Not Performance

For over a decade, Ohio law has dictated where charter schools can open. Expansion was unlimited in Lucas County (the “pilot district” for charters) and in the “Ohio 8” urban districts (Akron, Canton, Cincinnati, Cleveland, Columbus, Dayton, Toledo, and Youngstown). But, in any given year, charters could open up in any other district that was classified as a “challenged district," as measured by whether the district received a state “report card” rating of “academic watch” or “academic emergency." This is a performance-based standard.

Under this system, there was of course very rapid charter proliferation in Lucas County and the “Ohio 8” districts. Only a small number of other districts (around 20-30 per year) “met” the performance-based standard. As a whole, the state’s current charter law was supposed to “open up” districts for charter schools when the districts are not doing well.

Starting next year, the state is adding a fourth criterion: Any district with a “performance index” in the bottom five percent for the state will also be open for charter expansion. Although this may seem like a logical addition, in reality, the change offends basic principles of both fairness and educational measurement.

Charter And Regular Public School Performance In "Ohio 8" Districts, 2010-11

Every year, the state of Ohio releases an enormous amount of district- and school-level performance data. Since Ohio has among the largest charter school populations in the nation, the data provide an opportunity to examine performance differences between charters and regular public schools in the state.

Ohio’s charters are concentrated largely in the urban “Ohio 8” districts (sometimes called the “Big 8”): Akron; Canton; Cincinnati; Cleveland; Columbus; Dayton; Toledo; and Youngstown. Charter coverage varies considerably between the “Ohio 8” districts, but it is, on average, about 20 percent, compared with roughly five percent across the whole state. I will therefore limit my quick analysis to these districts.

Let’s start with the measure that gets the most attention in the state: Overall “report card grades." Schools (and districts) can receive one of six possible ratings: Academic emergency; academic watch; continuous improvement; effective; excellent; and excellent with distinction.

These ratings represent a weighted combination of four measures. Two of them measure performance “growth," while the other two measure “absolute” performance levels. The growth measures are AYP (yes or no), and value-added (whether schools meet, exceed, or come in below the growth expectations set by the state’s value-added model). The first “absolute” performance measure is the state’s “performance index," which is calculated based on the percentage of a school’s students who fall into the four NCLB categories of advanced, proficient, basic and below basic. The second is the number of “state standards” that schools meet as a percentage of the number of standards for which they are “eligible." For example, the state requires 75 percent proficiency in all the grade/subject tests that a given school administers, and schools are “awarded” a “standard met” for each grade/subject in which three-quarters of their students score above the proficiency cutoff (state standards also include targets for attendance and a couple of other non-test outcomes).
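The "state standards" measure described above amounts to a simple tally, sketched here under stated assumptions: it covers only the test-based standards (the attendance and other non-test targets are left out), it treats exactly 75 percent as meeting the requirement, and the proficiency rates and function name are invented for the example.

```python
def standards_met_share(proficiency_rates, threshold=75.0):
    """proficiency_rates: dict mapping (grade, subject) to percent proficient.

    Returns standards met as a share of the standards for which the school
    is eligible, i.e., the grade/subject tests it actually administers.
    Assumption: a rate of exactly `threshold` counts as meeting the standard.
    """
    eligible = len(proficiency_rates)
    if eligible == 0:
        raise ValueError("school has no eligible standards")
    met = sum(1 for rate in proficiency_rates.values() if rate >= threshold)
    return met / eligible


# Hypothetical school administering four grade/subject tests.
rates = {
    (3, "reading"): 80.0,
    (3, "math"): 74.9,
    (4, "reading"): 75.0,
    (4, "math"): 60.0,
}
share = standards_met_share(rates)
print(f"standards met: {share:.0%}")
```

Note that nothing in this calculation adjusts for where students started: a school just under the cutoff in a grade gets no credit for it, however much its students improved.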

The graph below presents the raw breakdown in report card ratings for charter and regular public schools.