Let’s say we were trying to evaluate a teacher’s performance for this academic year, and part of that evaluation would use students’ test scores (if you object to using test scores this way, put that aside for a moment). We checked the data and reached two conclusions. First, we found that her students made fantastic progress this year. Second, we also saw that the students’ scores were still quite a bit lower than their peers’ in the district. Which measure should we use to evaluate this teacher?
Would we consider judging her even partially on the latter: students’ average scores? Of course not. Those students made huge progress, and the only reason their absolute performance levels are relatively low is that they were low at the beginning of the year. This teacher could not control the fact that she was assigned lower-scoring students. All she can do is make sure that they improve. That’s why no teacher evaluation system places any importance on students’ absolute performance, instead focusing on growth (and, of course, non-test measures). In fact, growth models control for absolute performance (the prior year’s test scores) so that students’ starting points don’t bias the results.
If we would never judge teachers based on absolute performance, why are we judging schools that way? Why does virtually every school/district rating system place some emphasis – often the primary emphasis – on absolute performance?