Throughout the process of reforming teacher evaluation systems over the past 5-10 years, perhaps the most contentious, discussed issue was the importance, or weights, assigned to different components. Specifically, there was a great deal of debate about the proper weight to assign to test-based teacher productivity measures, such estimates from value-added and other growth models.
Some commentators, particularly those more enthusiastic about test-based accountability, argued that the new teacher evaluations somehow were not meaningful unless value-added or growth model estimates constituted a substantial proportion of teachers’ final evaluation ratings. Skeptics of test-based accountability, on the other hand, tended toward a rather different viewpoint – that test-based teacher performance measures should play little or no role in the new evaluation systems. Moreover, virtually all of the discussion of these systems’ results, once they were finally implemented, focused on the distribution of final ratings, particularly the proportions of teachers rated “ineffective.”
A recent working paper by Matthew Steinberg and Matthew Kraft directly addresses and informs this debate. Their very straightforward analysis shows just how consequential these weighting decisions, as well as choices of where to set the cutpoints for final rating categories (e.g., how many points does a teacher need to be given an “effective” versus “ineffective” rating), are for the distribution of final ratings.