The Categorical Imperative In New Teacher Evaluations
There is a push among many individuals and groups advocating new teacher evaluations to predetermine the number of outcome categories (e.g., highly effective, effective, developing, ineffective) that these new systems will include. For instance, a “statement of principles” signed by 25 education advocacy organizations recommends that the reauthorized ESEA law require “four or more levels of teacher performance.” The New Teacher Project’s primary report on redesigning evaluations made the same suggestion.* For their part, many states have followed suit, mandating new systems with a minimum of four or five categories.
The rationale here is pretty simple on the surface: Those pushing for a minimum number of outcome categories believe that teacher performance must be adequately differentiated, a goal on which prior systems, most of which relied on dichotomous satisfactory/unsatisfactory schemes, fell short. In other words, the categories in new evaluation systems must reflect the variation in teacher performance, and that cannot be accomplished when there are only a couple of categories.
It’s certainly true that the number of categories matters – it is an implicit statement as to the system’s ability to tease out the “true” variation in teacher performance. The number of categories a teacher evaluation system employs should depend on how well it can differentiate teachers with a reasonable degree of accuracy. If a system is unable to pick up this “true” variation, then using several categories may end up doing more harm than good, because it will be providing faulty information. And, at this early stage, despite the appearance of certainty among some advocates, it remains unclear whether all new teacher evaluation systems should require four or more levels of “effectiveness.”