Making Sense Of Florida's School And Teacher Performance Ratings
Last week, Florida State Senate President Don Gaetz (R – Niceville) expressed skepticism about the recently released results of the state’s new teacher evaluation system. The senator was particularly troubled by how the ratings compare with schools’ “A-F” grades. He noted, “If you have a C school, 90 percent of the teachers in a C school can’t be highly effective. That doesn’t make sense.”
There’s an important discussion to be had about the results of both the school and teacher evaluation systems, and the distributions of the ratings can definitely be part of that discussion (even if this issue is sometimes approached in a superficial manner). However, arguing that we can validate Florida’s teacher evaluations using its school grades, or vice-versa, suggests little understanding of either. Actually, given the design of both systems, finding a modest or even weak association between them would make pretty good sense.
In order to understand why, there are two facts to consider.
First, as shown here and here, Florida’s school grades are largely driven by how highly students score on tests (e.g., status measures such as proficiency rates), rather than how quickly they make progress (e.g., growth model estimates).
This is not only because 50 percent of the grades are based nominally on straight proficiency rates in four subjects, but also because another 25 percent comes from a growth measure that is rather redundant with proficiency, as a direct result of how the state codes students as “making gains” (see here). In addition, the absolute performance measures vary a great deal more than the "gains" scores, which means the former play a larger role in determining final school scores than their assigned weights suggest.
This doesn’t necessarily mean it’s a “bad” system – accountability is about incentives, and there’s a role for measures that capture both status and growth. It does, however, mean that Florida’s “A-F” grades tell you much more about the performance levels of students at a given school than about the school’s contribution to those levels, which is why the grades are strongly associated with characteristics such as subsidized lunch eligibility (a rough proxy for poverty/income).
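The point about variation and assigned weights can be illustrated with a little arithmetic. The sketch below is a simplified, hypothetical two-component index, not Florida's actual formula: the 50/50 nominal weights, the standard deviations, and the assumption that the two components are uncorrelated are all made up for illustration. It shows that a component with a wider spread across schools can account for most of the variation in the final score even when the nominal weights are equal.

```python
def variance_shares(weights, std_devs):
    """Share of composite-score variance contributed by each component.

    Assumes the components are uncorrelated, which is a simplifying
    assumption made only for this illustration.
    """
    contributions = [(w * s) ** 2 for w, s in zip(weights, std_devs)]
    total = sum(contributions)
    return [c / total for c in contributions]


# Hypothetical numbers, NOT Florida's actual figures.
nominal_weights = [0.5, 0.5]  # status measure, gains measure
std_devs = [20.0, 5.0]        # status varies far more across schools

shares = variance_shares(nominal_weights, std_devs)
for name, share in zip(["status", "gains"], shares):
    print(f"{name}: {share:.1%} of the variation in the composite score")
```

Under these made-up numbers, the status measure would drive the large majority of the differences between schools' final scores, despite carrying only half the nominal weight.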
The second fact to keep in mind here is that Florida’s teacher evaluations, in contrast with its school grades, are intended to capture teachers’ performance in a manner that is, to the degree possible, independent of students’ absolute performance levels. Most notably, the value-added model that Florida uses controls directly for how highly students score in the previous year, which is the primary reason why the scores it yields, unlike those from the school grading system, are not correlated significantly with measurable student characteristics such as poverty (a finding to which state officials very eagerly called attention).
Similarly, the other major component of the evaluations – classroom observations – is also supposed to informally “control” for students’ performance levels. To whatever extent it can be avoided, an observation score should not penalize teachers in classes with large numbers of lower-scoring students.
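To make the value-added point concrete, here is a toy simulation. It is emphatically not Florida's model, which is more elaborate; it is a minimal sketch in which the only control is the prior-year score, and every number in it (score scales, the size of the poverty gap, the spread of teacher effects) is an assumption chosen for illustration. In this simulated world, teacher effects are unrelated to poverty by construction, so the question is simply which measure reflects that.

```python
import random


def pearson(xs, ys):
    """Plain Pearson correlation coefficient."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    vx = sum((x - mx) ** 2 for x in xs)
    vy = sum((y - my) ** 2 for y in ys)
    return cov / (vx * vy) ** 0.5


def class_means(teacher_ids, values, n_teachers):
    """Average a student-level value within each teacher's class."""
    sums, counts = [0.0] * n_teachers, [0] * n_teachers
    for t, v in zip(teacher_ids, values):
        sums[t] += v
        counts[t] += 1
    return [s / c for s, c in zip(sums, counts)]


# All parameters below are hypothetical.
N_TEACHERS, CLASS_SIZE = 200, 30
rng = random.Random(0)

teacher_ids, prior, current, poverty_rate = [], [], [], []
for t in range(N_TEACHERS):
    p = rng.random()            # share of low-income students in the class
    effect = rng.gauss(0, 2)    # teacher effect, independent of poverty
    poverty_rate.append(p)
    for _ in range(CLASS_SIZE):
        poor = rng.random() < p
        last_year = 60 - (15 if poor else 0) + rng.gauss(0, 8)
        this_year = 5 + 0.9 * last_year + effect + rng.gauss(0, 5)
        teacher_ids.append(t)
        prior.append(last_year)
        current.append(this_year)

# Status measure: each class's average current-year score.
status = class_means(teacher_ids, current, N_TEACHERS)

# Value-added sketch: regress current score on prior score (simple OLS),
# then average each teacher's residuals.
n = len(prior)
mp, mc = sum(prior) / n, sum(current) / n
slope = (sum((x - mp) * (y - mc) for x, y in zip(prior, current))
         / sum((x - mp) ** 2 for x in prior))
intercept = mc - slope * mp
residuals = [y - (intercept + slope * x) for x, y in zip(prior, current)]
value_added = class_means(teacher_ids, residuals, N_TEACHERS)

corr_status = pearson(poverty_rate, status)
corr_va = pearson(poverty_rate, value_added)
print(f"poverty vs. status measure: {corr_status:.2f}")
print(f"poverty vs. value-added:    {corr_va:.2f}")
```

In this setup, the status measure tracks classroom poverty closely, while the residual-based estimate does not. That is the sense in which the two kinds of measures are built to answer different questions.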
So, in summary: Florida’s school grades are heavily driven by students’ absolute performance levels, while its teacher evaluation ratings are designed to be independent of those levels. Again, both are matters of degree, and there are other reasons to expect to find some level of concentration of “lower-performing” teachers in schools with lower absolute performance scores (e.g., recruitment/retention issues).
That said, Florida’s school and teacher rating systems are, by design, measuring different things. If anything, an extremely strong relationship between the grades and evaluation ratings might be seen as a red flag that the latter are biased. At the very least, validating one by assuming it must match up with the other is, to put it gently, inappropriate.
Perhaps we can give Senator Gaetz a pass for not being intimately familiar with the properties of these systems’ measures. But there are certainly people in Florida with a better grasp of these issues, and let’s just hope their voices are being heard.
- Matt Di Carlo