Burden Of Proof, Benefit Of Assumption
** Also posted on Valerie Strauss’ “Answer Sheet” in the Washington Post
Michelle Rhee, the controversial former chancellor of D.C. public schools, is a lightning rod. Her confrontational style has made her many friends as well as enemies. As is usually the case, people’s reaction to her approach in no small part depends on whether or not they support her policy positions.
I try to be open-minded toward people with whom I don’t often agree, and I can certainly accept that people operate in different ways. Honestly, I have no doubt as to Ms. Rhee’s sincere belief in what she’s doing; and, even if I think she could go about it differently, I respect her willingness to absorb so much negative reaction in order to try to get it done.
What I find disturbing is how she continues to try to build her reputation and advance her goals based on interpretations of testing results that are insulting to the public’s intelligence.
In a New York Daily News op-ed, published earlier this week, Ms. Rhee once again offered up testing results during her time at the helm of D.C. schools as evidence that her policy preferences work. In supporting New York City Mayor Michael Bloomberg’s newly announced plan to give bonuses to the city’s teachers, she pointed to her own plan in D.C., as a result of which she said “teachers are being rewarded for great work.”
In the very next sentence, she goes on to say:
“That’s not all. D.C. students have made strong academic gains. This progress by the city’s children, who were lagging so far behind their peers around the country, has been demonstrated on both city-administered tests and the federal test known as the National Assessment of Educational Progress. High school graduation rates also rose, and the city experienced public school enrollment increases for the first time in four decades. Sure, merit pay alone did not produce all of these successes. There is unfortunately no single solution that by itself can solve the many problems facing our schools.”

In other words, she’s saying that the performance bonus program produced some – but not all – of these “successes.”
In order to assess the utter emptiness of this claim, we need to do a little housekeeping of her premises. Let’s put aside the fact that the test results to which she refers are not “gains” but cohort changes (comparisons between two different groups of students) and that these results were almost certainly influenced by changes in demographics among DCPS’s rapidly changing student body. Let’s also ignore the fact that, in the case of D.C.’s state tests (the DC-CAS), the district only releases proficiency rates and not test scores, which means that we really don’t know much about the actual performance of the “typical student.”
Finally, and most importantly, let’s dismiss all the tried and true principles of policy analysis and assume that we can actually judge the effectiveness of a particular policy intervention by looking at changes in raw test scores immediately after it is implemented. In other words, we’ll assume that merit pay itself is at least partially responsible for any changes in scores that coincide with its implementation, rather than all the other policies and factors, school- and non-school, that influence results.
Even if we ignore the fact that all these premises are, at best, highly problematic – and give her the benefit of the doubt – the evidence she is using supports a conclusion that is opposite from the one she reaches.
DCPS’s NAEP scores and state test proficiency rates increased quite a bit between 2007 and 2009. Michelle Rhee’s performance pay plan awarded its first bonuses based on teacher evaluation results for the 2009-10 school year (the bonus amounts also depended on other factors, such as the poverty level of the school in which teachers work).
Since that time, DCPS performance on both tests has been largely flat.
DC-CAS proficiency rates for elementary school students are actually a few percentage points lower than they were in 2009, while the rates among secondary students are a few points higher. These are both rather modest (and perhaps not statistically significant) two-year changes. Although the timing of the NAEP TUDA test does not coincide perfectly with the start of the DC bonus program, scores were also statistically unchanged between 2009 and 2011 in three out of the four NAEP tests (fourth grade math, and fourth and eighth grade reading), while there was a moderate discernible increase in the average eighth grade math score.
For the most part, then, there was little meaningful change in DCPS testing performance over the past two full school years.
Moreover, the graduation rate was also essentially unchanged – moving from 72 percent in 2009 to 73 percent in 2010 (2011 rates will be released later this year).
So, according to the standards by which Ms. Rhee judges policy effects – and advocates for the expansion of those policies – her performance bonus program has not worked. And, for the record, the same thing goes for her other signature policies, including D.C.’s new evaluation system and the annual dismissals based on the results of that system.
Yet, on the pages of a major newspaper, she’s using the same evidence to say they’ve succeeded. It’s not only unsupportable – it’s downright nonsensical.
Luckily for Ms. Rhee, the standard by which she would have her policies be judged is completely inappropriate. The relative success or failure of her bonus program, new teacher evaluation system, and all her other policies is an open question – one that must be addressed by high-quality program evaluations that are specifically designed to isolate these policies’ effects from all the other factors that affect performance. She is also fortunate that, insofar as the purpose of merit pay plans, according to proponents, is to attract “better candidates” to the district and keep them around, the outcomes of this program – if there are any – would take several years to even begin to show up. We therefore should not judge this policy based on short-term testing (or non-testing) outcomes.
Put simply, Ms. Rhee’s own evidentiary standards are so flawed that we must ignore the fact that, in this case, if anything, they actually work against her.
Michelle Rhee, while hardly alone in misinterpreting data to support policy beliefs, is a national figure. In many respects, she is the most prominent voice among the market-based reform crowd. Some of her ideas may even have some merit, and she is trying to make her case, but she does herself, her supporters, and the public a disservice by continuing to abuse evidence in an attempt to make it. This cheapens the debate and perpetuates a flawed understanding of the uses of data and research to inform policy. That benefits nobody, especially students.
- Matt Di Carlo