The Smoke And The Fire From Evaluations Of Teach For America
A recent study by the always reliable research organization Mathematica takes a look at the characteristics and test-based effectiveness of Teach For America (TFA) teachers who were recruited as part of a $50 million federal “Investing in Innovation” grant, which is supporting a substantial scale-up of TFA’s presence in U.S. public schools.
The results of this study pertain to a small group of recruits (and comparison non-TFA teachers) from the first two years of the program – i.e., a sample of 156 PK-5 teachers (66 TFA and 90 non-TFA) in 36 schools spread throughout 10 states. What distinguishes the analysis methodologically is that it exploits the random assignment of students to teachers in these schools, which ensures that any measured differences between TFA and comparison teachers are not due to unobserved differences in the students they are assigned to teach.
The Mathematica researchers found, in short, that the estimated differences in the impact of TFA and comparison teachers on math and reading scores across all grades were modest in magnitude and not statistically discernible at any conventional level. There were, however, meaningful positive estimated differences in the earliest grades (PK-2), though they were only statistically significant in reading, and the coefficient in reading for grades 3-5 was negative (and not significant). Let’s take a quick look at these and other findings from this report and what they might mean.
The political controversy surrounding TFA overshadows its educational impact. The public reaction to the report was one of those half disturbing, half amusing instances in which the results seemed to confirm the pre-existing beliefs of the beholder. TFA critics pointed out that TFA teachers did no better overall, while TFA supporters noted that the comparison teachers had, on average, about 14 years of experience (i.e., TFA teachers did as well as comparison teachers with far more training and experience).
It is certainly fair to note that the TFA teachers were less experienced, and had less formal training, than their colleagues in the comparison sample (though it is also true that TFA supporters sometimes downplay the importance of teacher experience, and that TFA's model consists of short-term teaching commitments and compressed training regimens).
(One needed only to look past the executive summary to find that the report included estimates comparing TFA with novice non-TFA teachers - i.e., those with 1-2 years of experience. The results were basically the same in math, while TFA teachers’ estimated test-based performance was better than that of comparison novice teachers in reading. Neither coefficient, however, was statistically significant, perhaps due to the very small samples.)
In any case, research on TFA tends to provoke incredibly strong reactions among both supporters and opponents. In contrast, the results of this and prior studies on TFA, though they vary by level, location and the definition of the comparison group, are fairly mixed: the estimated differences in test-based effectiveness tend to be positive in math and nil in reading (e.g., Decker et al. 2004; Boyd et al. 2006; Kane et al. 2007; Xu et al. 2011; Clark et al. 2013; Hanson et al. 2014). In this sense, the “TFA issue” is more important politically than it is in terms of educational impact.
The results of this report were not entirely consistent with prior research on TFA. Many commentators characterized the results of this study as being generally in line with previous research on TFA. That’s not entirely true. To the degree that this new report found any differences, they were in reading, rather than math. With very few exceptions (e.g., Henry et al. 2014), prior research has tended to find the opposite, and this is a noteworthy discrepancy.
Perhaps it is due in part to the fact that this particular study was concentrated on early grades (PK-5), whereas most (but not all) previous research has been limited to grades 3-8 (and, in this latest evaluation, the test used for PK-2 teachers was of course different from that used for teachers in grades 3-5). It is also possible that TFA's scale-up for this grant caused its recruits to differ from past cohorts. Or it may simply be the case that this small sample of teachers is not representative of TFA recruits during this time period (or that the comparison teachers are unrepresentative). Regardless, this is something worth keeping an eye on in future studies, particularly among recruits who are part of this TFA expansion.
There are interesting differences and similarities between TFA and non-TFA teachers in terms of outcomes other than test-based effectiveness. With these evaluations of controversial issues, such as TFA, there's a temptation to skip right to the conclusions about estimated test-based productivity and ignore the rest. This is often (if not usually) a mistake. There are some very interesting descriptive statistics in the report, particularly the results of surveys of teachers in the sample. Bearing in mind that the two groups of teachers (TFA and comparison teachers) are different in terms of experience, background, etc., and that the samples are small and not necessarily representative, interesting findings include (but are not limited to) the following:
- TFA and comparison teachers were asked about a dozen or so factors that hinder their classroom instruction, including student misbehavior, lack of resources, and conflicts among students. With relatively few exceptions, the proportions were similar across the two groups (and none of the differences were statistically significant), suggesting that the day-to-day difficulties of TFA and comparison teachers don’t differ drastically (Table IV.9);
- In contrast, there were substantial differences in terms of responses to questions about job satisfaction. For example, compared with non-TFA teachers, far smaller proportions of TFA teachers reported satisfaction with school-level factors such as the “professional caliber of colleagues,” “the leadership and vision of principals,” and “influence over school policies and practices.” Still, results were similar across the two groups in terms of some areas, particularly those pertaining to the profession in general, such as “personal fulfillment” and the “opportunity to help students succeed academically” (Table IV.10);
- Finally, almost 90 percent of TFA teachers reported that they did not plan to spend their entire careers as teachers, compared with about one quarter of comparison teachers (remember, again, that members of the latter group were older, on average, which means many had already made their career choices). Among TFA teachers who said they would be leaving teaching, about 43 percent intended to stay in the education field, and a roughly equal proportion said they planned to leave (the remaining 14 percent were undecided). In contrast, about four out of five non-TFA teachers who said they planned to leave the classroom expressed a desire to remain in education (Table IV.11).
We should be careful not to miss the big picture lessons from these TFA evaluations. The estimated differences between TFA and comparison teachers in terms of test-based effects may not always be large or clear cut, but that of course does not mean they do not carry policy implications, particularly for those of us who are neither pro- nor anti-TFA.
Now, on the one hand, it’s absolutely fair to use the results of this and previous TFA evaluations to suggest that we may have something to learn from TFA training and recruitment (e.g., Dobbie 2011). Like all new teachers, TFA recruits struggle at first, but they do seem to perform as well as or better than other teachers, many of whom have had considerably more experience and formal training.
On the other hand, as I've discussed before, there is also, perhaps, an implication here regarding the “type” of person we are trying to recruit into teaching. Consider that TFA recruits are the very epitome of the hard-charging, high-achieving young folks that many advocates are desperate to attract to the profession. To be clear, it is a great thing any time talented, ambitious, service-oriented young people choose teaching, and I personally think TFA deserves credit for bringing them in. Yet, no matter how you cut it, they are, at best, only modestly more effective (in raising math and reading test scores) than non-TFA teachers.
This reflects the fact that identifying good teachers based on pre-service characteristics is extraordinarily difficult; the best teachers are very often not those who attended the most selective colleges or scored highly on their SATs. And yet so much of our education reform debate is about overhauling long-standing human resource policies largely to attract these high-flying young people. It follows, then, that perhaps we should be careful not to fixate on an unsupported idea of the “type” of person we want to attract and what they are looking for. Instead, we might pay a little more attention to investigating alternative observable characteristics that may prove more useful, and to identifying employment conditions and work environments that maximize retention of effective teachers who are already in the classroom.