The Status Fallacy: New York State Edition

A recent New York Times story addresses directly New York Governor Andrew Cuomo’s suggestion, in his annual “State of the State” speech, that New York schools are in a state of crisis and "need dramatic reform." The article’s general conclusion is that the “data suggest otherwise.”

There are a bunch of important points raised in the article, but most of the piece is really just discussing student rather than school performance. Simple statistics about how highly students score on tests – i.e., “status measures” – tell you virtually nothing about the effectiveness of the schools those students attend, since, among other reasons, they don’t account for the fact that many students enter the system at low levels. How much students in a school know in a given year is very different from how much they learned over the course of that year.
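The distinction between status and growth can be sketched with a toy calculation. Everything below is a hypothetical illustration: the two schools, their scores, and the function names `status` and `growth` are invented, and a simple average gain stands in for the far more careful growth models used in practice.

```python
from statistics import mean

# Hypothetical fall and spring scale scores for the same four students
# in each school (invented numbers, for illustration only).
school_a = {"fall": [310, 320, 330, 340], "spring": [312, 322, 332, 342]}
school_b = {"fall": [250, 260, 270, 280], "spring": [275, 285, 295, 305]}

def status(school):
    """Average end-of-year score: how much students know."""
    return mean(school["spring"])

def growth(school):
    """Average gain for the same students: how much they learned."""
    return mean(s - f for f, s in zip(school["fall"], school["spring"]))

for name, school in [("A", school_a), ("B", school_b)]:
    print(name, "status:", status(school), "growth:", growth(school))
```

In this made-up example, School A looks better on status while School B's students gain far more over the year, so ranking the schools by status alone would get their apparent effectiveness backwards.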

I (and many others) have written about this “status fallacy” dozens of times (see our resources page), not because I enjoy repeating myself (I don’t), but rather because I am continually amazed just how insidious it is, and how much of an impact it has on education policy and debate in the U.S. And it feels like every time I see signs that things might be changing for the better, there is an incident, such as Governor Cuomo’s speech, that makes me question how much progress there really has been at the highest levels.

Before discussing the speech, however, I would like quickly to note that virtually none of what I am about to say actually pertains to whether or not Governor Cuomo’s general policy proposals are correct. This is not about the policies. It is about the interpretations of data used to craft and justify those policies.

Let’s start with the governor’s widely-reported reaction to the state’s teacher evaluation results. His “State of the State” speech included the following passage:

Now 38% of high school students are college ready. 38%. 98.7% of high school teachers are rated effective. How can that be? How can 38% of the students be ready, but 98% of the teachers effective? 31% of third to eighth graders are proficient in English, but 99% of the teachers are rated effective. 35% of third to eighth graders are proficient in math but 98% of the math teachers are rated effective. Who are we kidding, my friends? The problem is clear and the solution is clear. We need real, accurate, fair teacher evaluations.

Though not uncommon among advocates and policymakers, this comparison is, at best, highly misleading. In some respects, frankly, it is a little absurd.

For one thing, note that these student performance rates are based on the state’s new tests and higher standards, and that the proficiency rates were 75-85 percent just a few years ago (a point Aaron Pallas made, correctly, in the New York Times article mentioned above). Under the previous standards, the same comparison might have made the teacher evaluation results look fine. If your argument about school effectiveness is not robust to changes in proficiency cut scores, you're probably not on very firm ground.*
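A small sketch of why proficiency rates are so sensitive to where the bar is set. The scores and both cut scores below are hypothetical, chosen only to show the mechanics; they are not New York's actual data or standards.

```python
# Hypothetical scale scores for ten students (invented for illustration).
scores = [280, 295, 300, 305, 310, 315, 320, 330, 340, 350]

def proficiency_rate(scores, cut):
    """Share of students scoring at or above the cut score."""
    return sum(s >= cut for s in scores) / len(scores)

old_rate = proficiency_rate(scores, cut=300)  # assumed "old" standard
new_rate = proficiency_rate(scores, cut=325)  # assumed "new", higher standard

print(f"old cut: {old_rate:.0%}, new cut: {new_rate:.0%}")
```

The students and their scores are identical in both calculations; only the cut score moves, and the rate falls from 80 percent to 30 percent.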

But there’s a deeper problem here. This argument is pure status fallacy – i.e., it says that the proportion of non-proficient students in a given state is a direct measure of the performance of schools in that state (and this is not even mentioning the idea that student and teacher “pass rates” can be directly compared, which is extremely shaky).

That may be a good sound bite, but it is wrong, and it's wrong for precisely the same reasons that teacher evaluation systems would never judge teachers by how highly their students score at the end of the year. Doing so would be ridiculous, since everyone knows that a given teacher might be wonderfully effective in boosting test scores, and generate huge improvements among her students, but those students might still exhibit relatively low scores at the end of the year due to the simple fact that they started the year even further behind.

The same basic logic applies to schools, and there is great irony in the governor’s interpreting teacher evaluation results based on assumptions contradictory to those of the evaluations themselves.

So, whatever one thinks of New York’s evaluation results, or those of any other policy, assessing them in this manner may unduly influence one’s approach to policymaking. If, for instance, a policymaker (inappropriately) judges schools based on how many of their students are proficient in math and reading, he or she likely will hold ill-informed perceptions of the overall extent and distribution of the problem. He or she will be unable to assess accurately which schools/districts are effective and which are not, to use this inter-district variation to identify what works, and to understand the research that evaluates policies. In short, it’s very difficult, if not impossible, to make good policy when you don’t understand how to evaluate school performance and policy effects. It's a recipe for bad decision making.

(Side note: In my personal opinion, this is precisely what seems to be happening in New York, at least in the area of teacher evaluations.**)

Let’s move on to a different part of the speech:

For too many [education] is now the great discriminator and the truth is we have two systems; one for the rich and one for the poor and the greatest symbol of disparity is our failing schools. Students in failing schools lag well behind in virtually every academic category. State average for graduation is 76%, in a failing school it is 47%. Worse – more than nine out of ten students in failing schools are minority or poor students. Nine out of ten are minority or poor students.

Here again we see the status fallacy on full display. Since Governor Cuomo defines “failing schools” largely in terms of status (proficiency and/or graduation rates), and since status is predominantly a function of student background, the schools deemed “failing” are almost always those serving large proportions of students from disadvantaged backgrounds. The governor, in fact, makes this very point when he notes that “nine out of ten students in failing schools are minority or poor students.”

Yet, when he compares the graduation rates between “failing” and “non-failing” schools, he characterizes it as a gap in school effectiveness, when in reality his argument is tautological: Schools serving large proportions of lower-performing students tend to exhibit lower average performance.

To be clear, the fact that poor and minority students exhibit lower proficiency and graduation rates than their white, more affluent peers is very important, and it does have policy implications. By itself, however, it is far more an argument about the distribution of students across schools than it is about the distribution of school effectiveness across different populations of students.

And we can see the repercussions of this misinterpretation when we move on to the thrust of this part of the speech, which continues:

There are 178 failing schools in New York State. 77 have been failing for an entire decade. Over the last ten years, 250,000 children went through those failing schools while New York State government did nothing. Just think about that and that has to end this year. I understand the obstacles. I also understand what our students need to move forward. We should be ashamed of those numbers.

Any honest person must take seriously that there are, in fact, low-performing schools in New York, that many of them have been low-performing for a long time, and that something should be done about that – even if there are disagreements about which schools are low-performing and the best interventions.

But the precision he projects here – 178 failing schools, 77 failing for an entire decade – is based on a definition of “failing school” that, again, is driven by the status fallacy. Schools with the lowest proficiency and graduation rates are “failing schools."

As we showed in a recent post, many of these schools are actually effective. They are boosting student testing outcomes and helping students graduate who otherwise wouldn’t have. Their proficiency/graduation rates are low because their students enter the schools performing relatively poorly (at least to the degree tests and graduation standards capture that), and they remain low over time because cross-sectional proficiency/graduation rate changes are not “growth” or "progress," and can remain flat even when students are making tremendous progress.
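To see how a cross-sectional rate can stay flat while students make real progress, consider a toy school observed over three years. All numbers are hypothetical, as are the cut score and the names `cohorts`, `rate`, and `avg_gain`; each year's cohort is a different group of students who enter far behind.

```python
CUT = 300  # assumed proficiency cut score

# year -> (entry scores, exit scores) for that year's cohort.
# Each cohort is a different set of hypothetical students.
cohorts = {
    2012: ([240, 250, 260, 310], [270, 280, 290, 340]),
    2013: ([238, 252, 261, 308], [268, 282, 291, 338]),
    2014: ([242, 249, 259, 312], [272, 279, 289, 342]),
}

def rate(scores, cut=CUT):
    """Share of students at or above the cut score."""
    return sum(s >= cut for s in scores) / len(scores)

def avg_gain(entry, exit_):
    """Average score gain for the same students between entry and exit."""
    return sum(x - e for e, x in zip(entry, exit_)) / len(entry)

exit_rates = {year: rate(exit_) for year, (entry, exit_) in cohorts.items()}
gains = {year: avg_gain(entry, exit_) for year, (entry, exit_) in cohorts.items()}

print(exit_rates)  # the rate reported for each year
print(gains)       # what each cohort's students actually gained
```

In this sketch the reported proficiency rate is 25 percent every year, which looks like a decade-style record of stagnation, yet every cohort gains 30 points on average. The year-to-year comparison is between different students, so it says nothing about how much any of them learned.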

Of course, many of the schools in poor neighborhoods are actually “failing,” and all of them could use additional resources to help their students catch up. But it is not a coincidence that virtually all of the schools New York deems “failing” are in higher-poverty, minority communities. It is a direct consequence of the measures being used, and it is poorly-designed policy. There is a big difference between a low-performing school and a school that serves relatively low-performing students.

It is, for example, difficult to think of anything more unfair and destructive than closing schools, or even just labeling them as “failing,” when they are actually effective in serving the most disadvantaged student populations. But that’s exactly what’s happening in New York, and in a very high-profile manner, due in large part to misunderstanding of basic concepts of data interpretation and causal inference (the governor's office has since launched a PR campaign on this "failing schools" issue, and released a report listing each of them, while credulous advocacy organizations and reporters amplify what is essentially a massive misclassification due to misinterpreted data).

We find a similar situation when Governor Cuomo’s speech turns to school funding:

The education industry’s cry that more money will solve the problem is false. Money without reform only grows the bureaucracy. It does not improve performance. The state average per student is $8,000. The state average in a high-needs district is $12,000. A failing district like Buffalo, which has been a failing district for many, many years, the state spends $16,000 per student. So don’t tell me that if we only had more money, it would change. We have been putting more money into this system every year for a decade and it hasn’t changed and 250,000 will condemn the failing schools by this system.

In this passage, Governor Cuomo is basically saying that putting more money into the system, without accompanying reforms (presumably those he personally supports), will not help. Now, again, there’s more than a grain of truth here. Adequate funding is a necessary but not sufficient condition for strong school performance, and, indeed, it matters how you spend money; additional spending used unwisely can very well fail to yield benefits. That is why the research on school funding is large, complex and nuanced (see Baker 2012 for a review).

But Governor Cuomo’s conclusions, despite his single qualification (“without reform”), suggest virtually no familiarity with this body of evidence, and a reliance instead on simplistic conclusions based on the conflation of school and student performance.

He’s pretty clearly implying that spending more money is not a wise approach (“don’t tell me that if we only had more money…”), because, he claims, spending more does not improve performance. His evidence could not be more simple, or more sloppy. He instructs his constituents to look at Buffalo, which spends $16,000 per student and, yet, has remained a “failing district” for “many, many years.” Once again, he is calling Buffalo a “failing district” mostly because of its low proficiency/graduation rates, and he is asserting that money doesn’t matter because this “failing district” spends what seems like a lot of money but still exhibits low proficiency/graduation rates.

Yet, again, the low rates tell you virtually nothing about effectiveness (Buffalo serves a relatively disadvantaged student population), nor, as discussed above, does the persistence of these low rates. I have no doubt that Buffalo, like almost all districts, could stand some improvement on multiple fronts. And there’s a serious discussion to be had about that. But labeling the district as “failing” based largely on the students it serves, and juxtaposing that loaded term with a seemingly large per-pupil spending amount, may be good politics, but it is horrible policy analysis. If you misinterpret data on educational inputs and outputs, it’s almost impossible to understand the crucial relationship between them.

***

In summary, then, the status fallacy is not just some innocent, isolated nitpick. It plays a highly consequential role in our national policy and debate about education. That is why, in this single speech, the fallacy could be found underlying several pretty substantial misinterpretations, and these misinterpretations seem to have influenced several of the cornerstones of the governor’s education reform proposals going forward.

Granted, it is very important to acknowledge here that Governor Cuomo is absolutely not the only person who makes these mistakes; they are endemic (the governor is, perhaps, more forceful in his expression of them, and more drastic in his policy reactions).

Moreover, belief in the status fallacy does not necessarily mean that one’s policy proposals are wrong or misguided. It does, however, put one at risk of misdiagnosing problems and making poor decisions about solutions, while also perpetuating the flawed measurement that was institutionalized under NCLB, and continues to pollute our education debate and policymaking.

*****

* Every time a high-level public official employs this kind of rhetoric, it adds fuel to the fire set by opponents of Common Core, who claim that implementation of the new standards and tests will result “in more schools being labeled failing.” The veracity of this claim rests entirely on the assumption that school officials and policymakers will commit malpractice in their interpretation of testing data, and one cannot blame Common Core opponents for their skepticism when high-profile elected officials do so.

** Instead of examining the variation in evaluation results and its relationship to districts’ designs, Governor Cuomo has declared as “baloney” all districts’ systems, even though they vary quite widely in design and results. He has proposed altering evaluations to produce the results he desires by effectively cancelling the systems districts negotiated, and imposing new, state-determined scoring schemes, as well as by increasing the weight assigned to state growth measures from 25 to 50 percent. This is a clumsy, ill-considered means of producing a wider spread of final ratings. It also bears mentioning that the original state law on evaluations pre-determined the weights assigned to each component, as well as the scheme for converting final point totals into performance categories, but also required districts to negotiate scoring systems for their local "learning measures" and observations before they even got the first round of results. Finally, note that the governor frames his recommendations in part as a means to reduce testing. It is not entirely clear how this is the case, as his proposals would still require assessments for teachers in grades and subjects that do not have state tests.
