Teacher Evaluations: Don't Begin Assembly Until You Have All The Parts
** Also posted on Valerie Strauss’ “Answer Sheet” in the Washington Post
Over the past year or two, roughly 15-20 states have passed or are considering legislation calling for the overhaul of teacher evaluation. The central feature of most of these laws is a mandate to incorporate measures of student test score growth, in most cases specifying a minimum percentage of a teacher’s total score that must consist of these estimates.
There’s some variation across states, but the percentages are all quite high. For example, Florida and Colorado both require that at least 50 percent of an evaluation be based on growth measures, while New York mandates a minimum of 40 percent. These laws also vary in other specifics, such as the degree to which the growth measure proportion must be based on state tests (rather than other assessments), how much flexibility districts have in designing their systems, and how teachers in untested grades and subjects are evaluated. But they all share the defining feature of mandating a minimum proportion – or “weight” – that must be attached to a test-based estimate of teacher effects (at least for those teachers in tested grades and subjects).
Unfortunately, this practice of fixing weights in advance is typical of the misguided manner in which many lawmakers (and the advocates advising them) have approached the difficult task of overhauling teacher evaluation systems. For instance, I have previously discussed the failure of most systems to account for random error. The weighting issue is another important example, and mandating a weight up front violates a basic rule of designing performance assessment systems: You should exercise extreme caution in pre-deciding the importance of any one component until you know what the other components will be. Put simply, you should have all the parts in front of you before you begin the assembly process.