The Proportionality Principle In Teacher Evaluations
Our guest author today is Cory Koedel, Assistant Professor of Economics at the University of Missouri.
In a 2012 post on this blog, Dr. Di Carlo reviewed an article that I coauthored with colleagues Mark Ehlert, Eric Parsons and Michael Podgursky. The initial article (full version here, or for a shorter, less-technical version, see here) argues for the policy value of growth models that are designed to force comparisons between schools and teachers in observationally similar circumstances.
The discussion is couched within the context of achieving three key policy objectives that we associate with the adoption of more-rigorous educational evaluation systems: (1) improving system-wide instruction by providing useful performance signals to schools and teachers; (2) eliciting optimal effort from school personnel; and (3) ensuring that current labor-market inequities between advantaged and disadvantaged schools are not exacerbated by the introduction of the new systems.
We argue that a model that forces comparisons to be between equally-circumstanced schools and teachers – which we describe as a “proportional” model – is best-suited to achieve these policy objectives. The conceptual appeal of the proportional approach is that it fully levels the playing field between high- and low-poverty schools. In contrast, some other growth models have been shown to produce estimates that are consistently associated with the characteristics of students being served (e.g., Student Growth Percentiles).
However, while the proportional approach appeals to an intuitive sense of “fairness” for some, others rightly point out that it might be unfair to enforce proportionality through the structure of a growth model, and that a proportional evaluation system could generate inaccurate teacher ratings. Based on empirical evidence from a number of studies (e.g., see here and here), the concern is that schools serving advantaged students, and teachers at those schools, are more effective than those serving disadvantaged students, a fact that would not be captured by the output from a proportional model. As noted in Dr. Di Carlo’s original blog post, this is a plausible hypothesis for a variety of reasons related to the differences between high- and low-poverty schools in terms of resources, labor-market access, etc. (although it is difficult to confirm this hypothesis based on available data and models).
My co-author Jiaxi Li and I tackle this concern head-on in a new paper. We empirically evaluate the relative merits of using a proportional model – i.e., one that “levels the playing field” by design – and we do so by first simulating data in which teachers in more-advantaged circumstances are more effective than those in disadvantaged circumstances (in terms of “true” value-added, which we can only observe because we simulate it).
(Note that if this is not the case, the argument in favor of proportionality becomes much easier to make because it suggests that other models produce biased estimates.)
Using a proportional model in this situation could potentially lead to negative outcomes, because the model, by design, forces equal shares of teachers to be removed from high- and low-poverty schools. Since we know that advantaged schools actually have better teachers in our simulated data, the proportional model will result in the removal of teachers who are, on average, more effective than the teachers removed using a model that does not impose proportionality. In other words, forcing the model to “equalize” outcomes between high- and low-poverty schools means that teachers in low-poverty schools may still be identified as ineffective even if they have a higher “true” value-added than some of their counterparts in high-poverty schools who are not identified.
We ask whether enforcing proportionality in this situation and using the model to inform decisions about teacher removals leads to better or worse test-based outcomes for students.
Interestingly, we find that proportional policies do not result in worse outcomes for students. This is because all of the reasons that lead to systematic differences in effectiveness for incumbent teachers in high- versus low-poverty schools also lead to systematic differences in effectiveness for replacement teachers in these schools.
Put differently, the proportional model removes more teachers from low-poverty schools than would a model that does not account for school poverty, and the teachers it removes are of higher quality on average. But it also results in more hiring into positions at low-poverty schools, where the new hires are of higher average value-added as well.
In short, then, when we take labor dynamics – i.e., the quality of replacements – seriously, proportionality has the potential to offer the policy benefits that we discuss in the original paper without harming workforce quality overall.
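To make the replacement mechanism concrete, here is a toy sketch in Python. It is not the simulation design from the paper. Every name and number in it is an assumption chosen for illustration: two equally sized groups of teachers, a gap of 0.5 standard deviations in average “true” value-added favoring low-poverty schools, noisy value-added estimates, a 10 percent removal share, and replacements drawn from the same distribution as incumbents at the same type of school. The sketch compares a pooled removal rule (lowest estimates overall) with a proportional rule (lowest estimates within each group).

```python
import random
import statistics


def simulate(n_per_group=20000, removal_share=0.10, gap=0.5,
             noise_sd=0.5, seed=0):
    """Toy comparison of pooled vs. proportional teacher removal.

    All parameters are illustrative assumptions, not estimates.
    """
    rng = random.Random(seed)
    # Assumption: teachers at low-poverty schools have higher mean true VA.
    group_means = {"low_poverty": gap / 2, "high_poverty": -gap / 2}

    teachers = []  # (group, true value-added, estimated value-added)
    for group, mu in group_means.items():
        for _ in range(n_per_group):
            true_va = rng.gauss(mu, 1.0)
            est_va = true_va + rng.gauss(0.0, noise_sd)  # noisy estimate
            teachers.append((group, true_va, est_va))

    def apply_policy(removed_indices):
        removed = set(removed_indices)
        post = []
        for i, (group, true_va, _) in enumerate(teachers):
            if i in removed:
                # Assumption: a replacement comes from the same distribution
                # as incumbents at that type of school.
                post.append(rng.gauss(group_means[group], 1.0))
            else:
                post.append(true_va)
        share = {
            g: sum(1 for i in removed if teachers[i][0] == g) / n_per_group
            for g in group_means
        }
        return {"mean_va": statistics.fmean(post), "removed_share": share}

    # Teacher indices sorted from lowest to highest estimated value-added.
    by_estimate = sorted(range(len(teachers)), key=lambda i: teachers[i][2])

    # Pooled rule: remove the lowest estimates overall.
    pooled = by_estimate[:round(removal_share * len(teachers))]

    # Proportional rule: remove the same share within each group.
    k_group = round(removal_share * n_per_group)
    proportional = []
    for g in group_means:
        in_group = [i for i in by_estimate if teachers[i][0] == g]
        proportional.extend(in_group[:k_group])

    return {
        "baseline": statistics.fmean(t[1] for t in teachers),
        "pooled": apply_policy(pooled),
        "proportional": apply_policy(proportional),
    }


if __name__ == "__main__":
    for name, value in simulate().items():
        print(name, value)
```

Running the script prints the average true value-added of the workforce before any removals and after each policy, along with the share of teachers removed from each group. The key assumption is that replacement quality follows the school type. Under it, the expected gain from removing a teacher depends on how far that teacher falls below the average for similar schools, which is the quantity a proportional rule ranks on.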
The results from our new paper, built around a hypothetical removal policy targeted at the least effective teachers, will translate directly to a symmetric retention-bonus policy targeted at retaining highly effective teachers.
A proportional retention-bonus system would spread state- or district-level rewards evenly across teachers in different circumstances rather than disproportionately awarding them to teachers in low-poverty schools.
It is also important to recognize that although our analysis is rooted in test-based evaluation metrics, our findings will generalize to other, non-test-based metrics including classroom observations, student surveys, etc. Emerging evidence indicates that there are systematic differences in teacher ratings based on these alternative metrics across schooling environments. The possibility of imposing proportionality on classroom observation scores and other non-test-based metrics should be given careful consideration as evaluation systems continue to come online.
Our work on proportionality is far from the last word on the topic – in fact, it is more like the first word. We began the conversation by highlighting a number of policy benefits that can result from an educational evaluation system that enforces proportionality, and we have now shown that significant costs of such a system, at least along the most obvious dimensions, seem unlikely.
Still, like anything else, proportional models have their limitations and implementation challenges. Our goal with this work is not to assert that proportional evaluations are clearly superior to all available alternatives. Instead, we have the more modest objective of simply bringing the idea to the table for serious discussion. Proportional evaluations should be viewed as a viable alternative for administrators charged with developing and implementing educational evaluation systems, and their strengths and weaknesses should be considered side-by-side with other available options.
- Cory Koedel