Teacher Quality Only Matters If Students Come To School

The “no excuses” mantra in education started with an irrefutable premise: Nobody should use poverty as an excuse to tolerate dysfunctional public schools. For some (but not all) people, it eventually became an accusation as well, hurled at those who brought up the fact – often in a perfectly reasonable manner – that there is a strong, demonstrated relationship between income and achievement. But in its most virulent form, “no excuses” fosters the colonization of additional problems for which schools and teachers can be “held accountable.”

Former DCPS Chancellor Michelle Rhee and her fiancé, Sacramento Mayor Kevin Johnson, were hosted by the University of Arkansas’ Clinton School of Public Service for a discussion on education that aired on C-SPAN a few weeks ago. The moderator asked the panelists for their views on the dismal conditions in many cities, and how that relates to efforts to improve neighborhood schools.

Rhee recounted a story from the final year of her chancellorship, in which she visited a school unannounced, arriving early in the morning. Many of the classrooms were mostly empty. When she inquired, she was told that attendance was low because it was Friday and raining. Rhee said that she was horrified, but continued to tour the school. She finally found a classroom that was full, and asked one of the students about the class. The student told Rhee that this was her favorite teacher.

Great Expectations

A couple of years ago, Eat Pray Love author Elizabeth Gilbert explored the negative side of our unrealistically high expectations for artists and, more generally, for those who rely on their creativity to make a living. In ancient Rome, Gilbert recounts, creativity was associated with a sort of divine spirit that came to human beings from some distant and unknowable source, for distant and unfathomable reasons. The Romans referred to this intangible spirit as a genius. An individual was not a genius, but rather had a genius - a magical entity who was believed to live in the walls of an artist's studio and who would come out and invisibly assist the artist with his/her work. The lesson Gilbert draws is one of humility (i.e., successes are not entirely ours – don’t be such a narcissist) and emancipatory relief (i.e., failures are not completely our fault either – can’t hurt to try).

What does all this have to do with education and teachers? It seems to me that our expectations for both teachers and artists are sometimes unrealistic and unproductive, if not detrimental. Great teachers are often portrayed as superheroes, unencumbered by anything that might distract them from their teaching crusade – "refusing to surrender to the combined menaces of poverty, bureaucracy, and budgetary shortfalls." As a recent article in The Atlantic explained, Teach for America now asks applicants to talk about how they have overcome the challenges in their lives and uses these answers to rate their perseverance.

Yet the meaning of "Great Teacher" rarely gets analyzed. Instead, our definition of greatness – or even competence – remains a convenient black box, leading some to suggest that the question of what makes a teacher great is less important than separating the wheat from the chaff. In turn, this reveals a simplistic and, in my view, negative assumption that greatness, unlike Gilbert’s genius, is a stable, static, innate, and independent attribute. You either have it or you don’t.

Value-Added In Teacher Evaluations: Built To Fail

With all the controversy and acrimonious debate surrounding the use of value-added models in teacher evaluation, few seem to be paying much attention to the implementation details in those states and districts that are already moving ahead. This is unfortunate, because most new evaluation systems that use value-added estimates are literally being designed to fail.

Much of the criticism of value-added (VA) focuses on systematic bias, such as that stemming from non-random classroom assignment (also here). But the truth is that most of the imprecision of value-added estimates stems from random error. Months ago, I lamented the fact that most states and districts incorporating value-added estimates into their teacher evaluations were not making any effort to account for this error. Everyone knows that there is a great deal of imprecision in value-added ratings, but few policymakers seem to realize that there are relatively easy ways to mitigate the problem.

This is the height of foolishness. Policy is details. The manner in which one uses value-added estimates is just as important as – perhaps even more important than – the properties of the models themselves. By ignoring error when incorporating these estimates into evaluation systems, policymakers virtually guarantee that most teachers will receive incorrect ratings. Let me explain.
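To make the role of random error concrete, here is a minimal simulation sketch. It is not a value-added model and it is not drawn from any state's actual system: the function name, the five rating categories, and the reliability values are all assumptions chosen for illustration. Each simulated teacher has a "true" effect, the estimate is that effect plus random noise, and both are sorted into equal-sized rating categories. The function returns the share of teachers whose estimated category differs from their true one.

```python
import random


def misclassification_rate(n_teachers=10_000, reliability=0.4, n_bins=5, seed=0):
    """Share of simulated teachers whose estimated rating bin differs from
    the bin they would get if their true effect were observed directly.

    reliability is an ASSUMED value: var(true effect) / var(estimate).
    """
    rng = random.Random(seed)
    # True effects have variance 1, so the noise variance follows from reliability.
    noise_sd = ((1 - reliability) / reliability) ** 0.5
    true = [rng.gauss(0, 1) for _ in range(n_teachers)]
    est = [t + rng.gauss(0, noise_sd) for t in true]

    def bins(values):
        # Rank the teachers, then cut the ranking into n_bins equal-sized groups.
        order = sorted(range(len(values)), key=values.__getitem__)
        out = [0] * len(values)
        for rank, idx in enumerate(order):
            out[idx] = rank * n_bins // len(values)
        return out

    true_bins, est_bins = bins(true), bins(est)
    return sum(a != b for a, b in zip(true_bins, est_bins)) / n_teachers


if __name__ == "__main__":
    for r in (0.4, 0.9):  # hypothetical reliabilities, not estimates from any study
        print(f"reliability={r}: share misrated = {misclassification_rate(reliability=r):.2f}")
```

Under these assumptions, the noisier the estimate, the larger the share of teachers placed in the wrong category, even though no systematic bias has been introduced anywhere in the simulation.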

To Understand The Impact Of Teacher-Focused Reforms, Pay Attention To Teachers

You don’t need to be a policy analyst to know that huge changes in education are happening at the state and local levels right now – teacher performance pay, the restriction of teachers’ collective bargaining rights, the incorporation of heavily-weighted growth model estimates in teacher evaluations, the elimination of tenure, etc. Like many, I am concerned about the possible consequences of some of these new policies (particularly about their details), as well as about the apparent lack of serious efforts to monitor them.

Our “traditional” gauge of “what works” – cross-sectional test score gains – is totally inadequate, even under ideal circumstances. Even assuming high quality tests that are closely aligned to what has been taught, raw test scores alone cannot account for changes in the student population over time and are subject to measurement error. There is also no way to know whether fluctuations in test scores (even fluctuations that are real) are the result of any particular policy (or lack thereof).

Needless to say, test scores can (and will) play some role, but I for one would like to see more states and districts commissioning reputable, independent researchers to perform thorough, longitudinal analyses of their assessment data (which would at least mitigate the measurement issues). Even so, there is really no way to know how these new, high-stakes test-based policies will influence the validity of testing data, and, as I have argued elsewhere, we should not expect large, immediate testing gains even if policies are working well. If we rely on these data as our only yardstick of how various policies are working, we will be getting a picture that is critically incomplete and potentially biased.

What are the options? Well, we can’t solve all the measurement and causality issues mentioned above, but insofar as the policy changes are focused on teacher quality, it makes sense to evaluate them in part by looking at teacher behavior and characteristics, particularly in those states with new legislation. Here are a few suggestions.

A Big Fish In A Small Causal Pond

** Also posted here on “Valerie Strauss’ Answer Sheet” in the Washington Post

In three previous posts, I discussed what I’ve begun to call the “trifecta” of teacher-focused education reform talking points:

In many respects, this “trifecta” is driving the current education debate. You would have trouble finding many education reform articles, reports, or speeches that don’t use at least one of these arguments.

Indeed, they are guiding principles behind much of the Obama Administration’s education agenda, as well as the philosophies of high-profile market-based reformers, such as Joel Klein and Michelle Rhee. The talking points have undeniable appeal. They imply, deliberately or otherwise, that policies focused on improving teacher quality in and of themselves can take us a very long way - not all the way, but perhaps most of the way - towards solving all of our education problems.

This is a fantasy.

Teacher Quality Is Not A Policy

I often hear the following argument: Improving teacher quality is more cost-effective than other options, such as reducing class size (see here, for example). I am all for evaluating policy alternatives based on their costs relative to their benefits, even though we tend to define the benefits side of the equation very narrowly - in terms of test score gains.

But “improving teacher quality” cannot yet be included in a concrete costs/benefits comparison with class size or anything else. It is not an actual policy. At best, it is a category of policy options, all of which are focused on recruitment, preparation, retention, improvement, and dismissal of teachers. When people invoke it, they are presumably referring to the fact that teachers vary widely in their test-based effectiveness. Yes, teachers matter, but altering the quality distribution is a whole different ballgame from measuring it overall. It’s actually a whole different sport.

I think it is reasonable to speculate that we might get more bang for our buck if we could somehow get substantially better teachers, rather than more of them, as would be necessary to reduce class sizes. But the sad, often unstated truth about teacher quality is that there is very little evidence, at least as yet, that public policy can be used to improve it, whether cost-effectively or otherwise.

How Many Teachers Does It Take To Close An Achievement Gap?

** Also posted here on “Valerie Strauss’ Answer Sheet” in the Washington Post

Over the weekend, New York Times columnist Nick Kristof made a persuasive argument that teachers should be paid more. In making his case, he also put forth a point that you’ve probably heard before: “One Los Angeles study found that having a teacher from the 25 percent most effective group of teachers for four years in a row would be enough to eliminate the black-white achievement gap."

This is an instance of what we might call the “X consecutive teachers” argument (sometimes it’s three, sometimes four or five). It is often invoked to support, directly or indirectly, specific policy prescriptions, such as merit pay, ending tenure, or, in this case, higher salaries (see also here and here). To his credit, Kristof’s use of the argument is on the cautious side, but there are plenty of examples in which it is used as evidence supporting particular policies.

Actually, the day after the column ran, in a 60 Minutes segment featuring “The Equity Project," a charter school that pays its teachers $125,000 a year, the school’s principal was asked how he planned to narrow the achievement gap with his school. His reply was: “The difference between a great teacher and a mediocre or poor teacher is several grade levels of achievement in a given year. A school that focuses all of its energy and its resources on fantastic teaching can bridge the achievement gap."

Indeed, it is among the most common arguments in our education policy debate today. In reality, however, it is little more than a stylistic riff on empirical research findings, and a rough one at that. It is not at all useful when it comes to choosing between different policy options.

Students First, Facts Later

On Wednesday, Michelle Rhee’s new organization, Students First, rolled out its first big policy campaign: It’s called “Save Great Teachers," and it is focused on ending so-called “seniority-based layoffs."

Rhee made several assertions at the initial press conference and in an accompanying op-ed in the Atlanta Journal-Constitution (and one on CNN.com). At least three of these claims address the empirical research on teacher layoffs and quality. Two are false; the other is misleading. If history is any guide, she is certain to repeat these “findings” many times in the coming months.

As discussed in a previous post, I actually support the development of a better alternative to seniority-based layoffs, but I am concerned that the debate is proceeding as if we already have one (most places don't), and that there's quite a bit of outrage-inspiring misinformation flying around on this topic. So, in the interest of keeping the discussion honest, as well as highlighting a few issues that bear on the layoff debate generally, I do want to try and correct Rhee preemptively.

The 5-10 Percent Solution

** Also posted here on “Valerie Strauss’ Answer Sheet” in the Washington Post.

In the world of education policy, the following assertion has become ubiquitous: If we just fire the bottom 5-10 percent of teachers, our test scores will be at the level of the highest-performing nations, such as Finland. Michelle Rhee likes to make this claim. So does Bill Gates.

The source and sole support for this claim is a calculation by economist Eric Hanushek, which he sketches out roughly in a chapter of the edited volume Creating a New Teaching Profession (published by the Urban Institute). The chapter is called "Teacher Deselection" (“deselection” is a polite way of saying “firing”). Hanushek is a respected economist, who has been researching education for over 30 years. He is willing to say some of the things that many other market-based reformers also believe, and say privately, but won’t always admit to in public.

So, would systematically firing large proportions of teachers every year based solely on their students’ test scores improve overall scores over time? Of course it would, at least to some degree. When you repeatedly select (or, in this case, deselect) on a measurable variable, even when the measurement is imperfect, you can usually change that outcome overall.
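The selection logic in the paragraph above can be illustrated with a toy simulation. This is a sketch under stated assumptions, not a reconstruction of Hanushek's calculation: the function name, the 5 percent cut, the noise levels, and the premise that replacements are drawn from the same pool as the original workforce are all hypothetical. Each year, the teachers with the lowest noisy estimates are replaced by fresh draws, and the average true effect of the workforce is recorded.

```python
import random
from statistics import mean


def simulate_deselection(n_teachers=5_000, years=10, cut_share=0.05,
                         noise_sd=1.0, seed=0):
    """Track the mean TRUE effect of a simulated workforce when the lowest
    scorers on a NOISY measure are replaced each year.

    All parameters are illustrative assumptions, not empirical estimates.
    """
    rng = random.Random(seed)
    true = [rng.gauss(0, 1) for _ in range(n_teachers)]
    history = [mean(true)]
    n_cut = int(n_teachers * cut_share)
    for _ in range(years):
        # The observed rating is the true effect plus fresh random error each year.
        est = [t + rng.gauss(0, noise_sd) for t in true]
        order = sorted(range(n_teachers), key=est.__getitem__)
        for idx in order[:n_cut]:
            # Assumed: replacements come from the same distribution as the originals.
            true[idx] = rng.gauss(0, 1)
        history.append(mean(true))
    return history


if __name__ == "__main__":
    for sd in (0.5, 3.0):  # hypothetical amounts of measurement error
        path = simulate_deselection(noise_sd=sd)
        print(f"noise_sd={sd}: mean true effect {path[0]:.2f} -> {path[-1]:.2f}")
```

In this setup the average rises even when the measure is imperfect, and it rises less when the measure is noisier. The sketch says nothing about how large the real-world effect would be, how teachers or applicants would respond to such a policy, or whether test scores capture what matters.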

But anyone who says that firing the bottom 5-10 percent of teachers is all we have to do to boost our scores to Finland-like levels is selling magic beans – and not only because of cross-national poverty differences or the inherent limitations of most tests as valid measures of student learning (we’ll put these very real concerns aside for this post).