The Impact Of Race To The Top Is An Open Question (But At Least It's Being Asked)

You don’t have to look far to find strong opinions about Race to the Top (RTTT), the U.S. Department of Education’s (USED) stimulus-funded state-level grant program (which has recently been joined by a district-level spinoff). Some think it is a smashing success, while others assert that it is a dismal failure. The truth, of course, is that these claims, particularly the extreme views on either side, are little more than speculation.*

To win the grants, states were strongly encouraged to make several types of changes, such as adopting new standards, lifting or raising charter school caps, installing new data systems, and implementing brand-new teacher evaluations. This means that any real evaluation of the program’s impact will take years and will have to be multifaceted – that is, implementation and effects are all but certain to vary not only by each of these components, but also between states.

In other words, the success or failure of RTTT is an empirical question, one that is still almost entirely open. But there is a silver lining here: USED is at least asking that question, in the form of a five-year, $19 million evaluation program, administered through the National Center for Education Evaluation and Regional Assistance, designed to assess the impact and implementation of various RTTT-fueled policy changes, as well as those of the controversial School Improvement Grants (SIGs).

The Categorical Imperative In New Teacher Evaluations

There is a push among many individuals and groups advocating new teacher evaluations to predetermine the number of outcome categories – e.g., highly effective, effective, developing, ineffective – that these new systems will include. For instance, a "statement of principles" signed by 25 education advocacy organizations recommends that the reauthorized ESEA law require “four or more levels of teacher performance." The New Teacher Project’s primary report on redesigning evaluations made the same suggestion.* For their part, many states have followed suit, mandating new systems with a minimum of four or five categories.

The rationale here is pretty simple on the surface: Those pushing for a minimum number of outcome categories believe that teacher performance must be adequately differentiated, a goal on which prior systems, most of which relied on dichotomous satisfactory/unsatisfactory schemes, fell short. In other words, the categories in new evaluation systems must reflect the variation in teacher performance, and that cannot be accomplished when there are only a couple of categories.

It’s certainly true that the number of categories matters – it is an implicit statement as to the system’s ability to tease out the “true” variation in teacher performance. The number of categories a teacher evaluation system employs should depend on how well it can differentiate teachers with a reasonable degree of accuracy. If a system is unable to pick up this “true” variation, then using several categories may end up doing more harm than good, because it will be providing faulty information. And, at this early stage, despite the appearance of certainty among some advocates, it remains unclear whether all new teacher evaluation systems should require four or more levels of “effectiveness."
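To make that reliability concern concrete, here is a minimal simulation sketch of my own – nothing in the studies or statements cited above uses these figures. It assumes teachers’ “true” effectiveness is normally distributed, that observed ratings equal truth plus random measurement error, and that categories are assigned by quantile cutoffs; the reliability value of 0.5 is a hypothetical choice made purely for illustration.

```python
# Illustrative sketch (hypothetical numbers): how measurement error limits the
# number of rating categories that can be assigned reliably.
import numpy as np

rng = np.random.default_rng(0)
n_teachers = 100_000
reliability = 0.5  # assumed share of rating variance that is "true" signal

true_effect = rng.normal(0, 1, n_teachers)
noise = rng.normal(0, np.sqrt((1 - reliability) / reliability), n_teachers)
observed = true_effect + noise  # what the evaluation system actually sees

def category(scores, n_categories):
    """Assign each score to one of n equal-sized (quantile-based) categories."""
    cutoffs = np.quantile(scores, np.linspace(0, 1, n_categories + 1)[1:-1])
    return np.digitize(scores, cutoffs)

for k in (2, 4, 5):
    agree = np.mean(category(true_effect, k) == category(observed, k))
    print(f"{k} categories: observed rating matches 'true' category {agree:.0%} of the time")
```

Under these assumptions, the share of teachers whose observed category matches their “true” one falls steadily as categories are added – which is the sense in which finer-grained ratings can convey more faulty information when the underlying measure is noisy.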

Persistently Low-Performing Incentives

Today, the National Center on Performance Incentives (NCPI) and the RAND Corporation released a long-awaited experimental evaluation of teacher performance pay in Nashville, Tenn. It finds that performance bonuses have virtually no effect on student math test scores (there were small but significant gains by fifth graders, but only in two of the three years examined, and the gains did not last into sixth grade).

Since this is such a politically contentious issue, these findings are likely to spark a lot of posturing and debate. So it’s worth trying to put them in context. As I discussed in a prior post, we now have at least preliminary results from three randomized experimental evaluations of merit pay in the U.S., the first contemporary, high-quality evidence of its kind. This Nashville report and the two previously released studies – one from Chicago and one from New York City's schoolwide bonus program – reached the same conclusion: Performance bonuses for teachers have little or no discernible effect on student test scores.

Although the NYC and Chicago findings are preliminary (those evaluations are still in progress), the NYC program provides schoolwide rather than individual bonuses, and one additional study (Round Rock, Tex.) has yet to be released, the three reports already out do represent a fairly impressive, though still very tentative, body of evidence on merit pay’s utility as a means to improve test scores.

And at this point, it’s a good bet that, when all the evaluations are final and the smoke has cleared, we will have to conclude that performance bonuses are, at the very least, an unpromising policy for producing short-term test score gains.

Why Aren't We Closing The Achievement Gap?

When it comes to closing the academic achievement gap between students from lower- and higher-income families, we share the fate of the Greek mythological figure Sisyphus, who was sentenced to spend eternity pushing a giant rock uphill, watching it roll back down, and then repeating the task.

The gap in school performance comes “pre-installed," as it were, beginning well before children ever set foot in a classroom. By the time they enter kindergarten, poor children are already at a huge disadvantage relative to their counterparts from high-income families. By the time they take their first standardized test, the differences in vocabulary, background knowledge, and non-cognitive skills are so large that most poor children will never overcome them – no matter what school they attend, which teachers they are assigned to, or how those teachers are evaluated. And, like Sisyphus, whatever gap-closing progress we may make with each cohort of struggling students after they enter school, we must start all over again with the next.

What can be done? Stop putting out fires and prevent them – address the achievement gap before it widens.