The Education Reporter's Dilemma

I’ve written so many posts about the misinterpretation of testing data in news stories that I’m starting to annoy myself. For example, I’ve shown that year-to-year changes in testing results might be attributable to the fact that, each year, a different set of students takes the test. I’ve discussed the fact that proficiency rates are not test scores – they only tell you the proportion of students above a given line – and that the rates and actual scores can move in opposite directions (see this simple illustration). And I’ve pleaded with journalists, most of whom I like and respect, to write with care about these issues (and, I should note, many of them do so).
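The rates-versus-scores point is easy to demonstrate with a few made-up numbers (these are purely illustrative, not from any actual test): if a handful of students clear the proficiency cutoff while top performers slip a bit, the proficiency rate can rise even as the average score falls.

```python
# Illustrative (invented) scores for the same grade in two consecutive years.
year1 = [55, 58, 62, 80, 90]
year2 = [61, 61, 63, 70, 72]
CUTOFF = 60  # hypothetical proficiency cutoff

def proficiency_rate(scores, cutoff=CUTOFF):
    """Share of students at or above the cutoff."""
    return sum(s >= cutoff for s in scores) / len(scores)

def average(scores):
    return sum(scores) / len(scores)

# The proficiency rate rises (60% -> 100%) ...
print(proficiency_rate(year1), proficiency_rate(year2))
# ... while the average score falls (69.0 -> 65.4).
print(average(year1), average(year2))
```

A story reporting only the rate would call this "progress"; one reporting only the average would call it "decline." Both would be describing the same students.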

Yet here I am, back on my soapbox again. This time the culprit is the recent release of SAT testing data, generating dozens of error-plagued stories from newspapers and organizations. Like virtually all public testing data, the SAT results are cross-sectional – each year, the test is taken by a different group of students. This means that demographic changes in the sample of test takers influence the results. This problem is even more acute in the case of the SAT, since it is voluntary. Despite the best efforts of the College Board (see their press release), a slew of stories improperly equated the decline in average SAT scores since the previous year with an overall decline in student performance – a confirmation of educational malaise (in fairness, there were many exceptions).
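The compositional problem can be sketched with hypothetical numbers (invented for illustration; they are not actual SAT figures): if every subgroup's average rises but a lower-scoring subgroup grows as a share of test takers, the overall average can still fall.

```python
# Hypothetical subgroup (mean score, share of test takers) in two years.
# Both groups improve by 5 points, but the lower-scoring group grows
# from 40 to 60 percent of the test-taking pool.
year1 = {"group_a": (520, 0.6), "group_b": (420, 0.4)}
year2 = {"group_a": (525, 0.4), "group_b": (425, 0.6)}

def overall_mean(groups):
    """Share-weighted overall average."""
    return sum(mean * share for mean, share in groups.values())

print(overall_mean(year1))  # 480.0
print(overall_mean(year2))  # 465.0 -- lower, even though both groups gained
```

A headline built on the overall average would report a 15-point "decline" in a year when every group of students actually scored higher.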

I’ve come to think that there’s a fundamental problem here: When you interpret testing data properly, you don’t have much of a story.

How Cross-Sectional Are Cross-Sectional Testing Data?

In several posts, I’ve complained about how, in our public discourse, we misinterpret changes in proficiency rates (or actual test scores) as “gains” or “progress,” when they actually represent cohort changes—that is, they are performance snapshots for different groups of students who are potentially quite dissimilar.

For example, the most common way testing results are presented in news coverage and press releases is to present year-to-year testing results across entire schools or districts – e.g., the overall proficiency rate across all grades in one year compared with the next. One reason why the two groups of students being compared (the first versus the second year) are different is obvious. In most districts, tests are only administered to students in grades 3-8. As a result, the eighth graders who take the test in Year 1 will not take it in Year 2, as they will have moved on to the ninth grade (unless they are retained). At the same time, a new cohort of third graders will take the test in Year 2 despite not having been tested in Year 1 (because they were in second grade). That’s a large amount of inherent “turnover” between years (this same situation applies when results are averaged for elementary and secondary grades). Variations in cohort performance can generate the illusion of "real" change in performance, positive or negative.

But there’s another big cause of incomparability between years: Student mobility. Students move in and out of districts every year. In urban areas, mobility is particularly high. And, in many places, this mobility includes students who move to charter schools, which are often run as separate school districts.

I think we all know intuitively about these issues, but I’m not sure many people realize just how different the group of tested students across an entire district can be in one year compared with the next. In order to give an idea of this magnitude, we might do a rough calculation for the District of Columbia Public Schools (DCPS).
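As a back-of-envelope sketch of that magnitude (the mobility rate below is a placeholder for illustration, not an actual DCPS statistic), assume six tested grades of roughly equal size and some annual in/out mobility:

```python
TESTED_GRADES = 6      # grades 3-8
MOBILITY_RATE = 0.15   # hypothetical annual share entering/leaving the district

# Of Year 1's test takers, roughly 1/6 (the 8th graders) age out of the
# tested grades, and a comparably sized cohort of new 3rd graders enters.
stay_in_tested_grades = 1 - 1 / TESTED_GRADES

# Among those still in the tested grades, some fraction leaves the district.
overlap = stay_in_tested_grades * (1 - MOBILITY_RATE)

# Rough share of Year 2's test takers who were not tested in Year 1
# (ignoring retention, and assuming exits are offset by comparable entries).
new_share = 1 - overlap
print(round(new_share, 2))  # roughly 0.29 under these assumptions
```

In other words, even under fairly conservative assumptions, something like three in ten tested students in any given year were not in the prior year's testing sample.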

What Americans Think About Teachers Versus What They're Hearing

The results from the recent Gallup/PDK education survey found that 71 percent of surveyed Americans “have trust and confidence in the men and women who are teaching children in public schools.” Although this finding received a fair amount of media attention, it is not at all surprising. Polls have long indicated that teachers are among the most trusted professionals in the U.S., up there with doctors, nurses and firefighters.

(Side note: The teaching profession also ranks among the most prestigious U.S. occupations – both in analyses of survey data and in polls [though see here for an argument that occupational prestige scores are obsolete].)

What was rather surprising, on the other hand, were the Gallup/PDK results for the question about what people are hearing about teachers in the news media. Respondents were asked, “Generally speaking, do you hear more good stories or bad stories about teachers in the news media?”

Over two-thirds (68 percent) said they heard more bad stories than good ones. A little over a quarter (28 percent) said the opposite.

Comparing Teacher Turnover In Charter And Regular Public Schools

** Also posted here on “Valerie Strauss’ Answer Sheet” in the Washington Post

A couple of weeks ago, a new working paper on teacher turnover in Los Angeles got a lot of attention, and for good reason. Teacher turnover, which tends to be alarmingly high in lower-income schools and districts, has been identified as a major impediment to improvements in student achievement.

Unfortunately, some of the media coverage of this paper has tended to miss the mark. Mostly, we have seen horserace stories focusing on the fact that many charter schools have very high teacher turnover rates, much higher than most regular public schools in LA. The problem is that, as a group, charter school teachers are significantly dissimilar to their public school peers. For instance, they tend to be younger and/or less experienced than public school teachers overall; and younger, less experienced teachers tend to exhibit higher levels of turnover across all types of schools. So, if there is more overall churn in charter schools, this may simply be a result of the demographics of the teaching force or other factors, rather than any direct effect of charter schools per se (e.g., more difficult working conditions).
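The confound can be illustrated with invented numbers (these are not from the Los Angeles paper): suppose exit rates by experience level are identical across sectors, but charters employ a larger share of novice teachers. The raw sector gap is then entirely compositional.

```python
# Invented turnover rates by experience level, identical in both sectors.
TURNOVER = {"novice": 0.25, "veteran": 0.10}

# Hypothetical staffing mixes: charters skew heavily toward novices.
staff_mix = {
    "charter":  {"novice": 0.7, "veteran": 0.3},
    "district": {"novice": 0.3, "veteran": 0.7},
}

def raw_turnover(sector):
    """Overall turnover rate implied by the sector's experience mix."""
    return sum(TURNOVER[g] * share for g, share in staff_mix[sector].items())

print(round(raw_turnover("charter"), 3))   # 0.205
print(round(raw_turnover("district"), 3))  # 0.145
# A 6-point raw gap, even though similar teachers exit at identical rates.
```

This is why the interesting question is not the raw rates but whether comparable teachers are more likely to leave one type of school than the other.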

But the important results in this paper aren’t about the amount of turnover in charters versus regular public schools, which can be measured very easily, but rather the likelihood that similar teachers in these schools will exit.

Melodramatic

At a press conference earlier this week, New York City Mayor Michael Bloomberg announced the city’s 2011 test results. Wall Street Journal reporter Lisa Fleisher, who was on the scene, tweeted Mayor Bloomberg’s remarks. According to Fleisher, the mayor claimed that there was a “dramatic difference” between his city’s testing progress between 2010 and 2011, as compared with the rest of the state.

Putting aside the fact that the results do not measure “progress” per se, but rather cohort changes – a comparison of cross-sectional data that measures the aggregate performance of two different groups of students – I must say that I was a little astounded by this claim. Fleisher was also kind enough to tweet a photograph that the mayor put on the screen in order to illustrate the “dramatic difference” between the gains of NYC students relative to their non-NYC counterparts across the state.  Here it is:

We Interrupt This Message

So, I’m reading an opinion piece by Harold Meyerson in the online edition of yesterday’s Washington Post. Meyerson starts by talking about how teachers’ unions get blamed for everything. All of a sudden, in the middle of the text, right after the second paragraph, the piece is interrupted by the following message:

(Watch a video of D.C. Schools Chancellor Michelle Rhee discussing the D.C. Public School system.)

Strange, I thought. Then, right after Meyerson gets going again, criticizing “Waiting for Superman” and hailing the Baltimore teachers’ contract as meaningful progress, I am interrupted yet again:

(For more opinions on the trouble with America's education system, read Jo-Ann Armao's "Is the public turning against teacher unions?" and a Post editorial "Education jobs bill is motivated by politics.")

Now I am taken aback. I’m reading this piece defending teachers’ unions, and at two separate points, in the middle of the text, the Post inserts links: one to a video of Michelle Rhee; another to Jo-Ann Armao's short article, which is fair but has undertones; and a third to an editorial implying that the education jobs bill is a gift to teachers’ unions. Opinions within opinions, it seems.

Talking About But Not Learning From Finland

Finland’s education system has become an international celebrity. Its remarkable results are being trumpeted, usually in the “What can we learn from them?” context. Yet a lot of the recent discussion about what we can learn – as far as concrete policies – has been rather shallow.

Right now, the factoid that is getting the most play is that Finnish teachers come from the “top ten percent” of those entering the labor force, whereas U.S. teachers don’t. But without knowing the reasons behind this difference, this fact is not particularly useful.

Although there has been some interesting research on these issues (see here, here, here, here, here, and here), I still haven’t really seen a simple comparison of Finnish vs. American policies that can help us understand what they’re doing right (and perhaps what we’re doing wrong). I am not an expert in comparative education, but I have assembled a few quick lists of features and policies. Needless to say, I am not suggesting that we do everything Finland does, and cease doing everything they don’t. It's very difficult to isolate the unique effects of each of these policies. Also, more broadly, Finland is small (less than six million residents), homogeneous, and their welfare state keeps poverty and inequality at one of the lowest levels among all developed nations (the U.S. is among the highest).

But if we are going to learn anything from the Finnish system, it is important to lay out the concrete differences (I inevitably missed things, so please leave a comment if you have additions).

Teacher Quality On The Red Carpet; Accuracy Swept Under The Rug

The media campaign surrounding “Waiting for Superman,” which has already drawn considerable coverage, only promises to get bigger. While I would argue – at least in theory – that increased coverage of education is a good thing, it also means that this is a critically important (and, in some respects, dangerous) time. Millions of new people will be tuning in, believing that they are hearing serious discussions about the state of public education in America and “what the research says” about how it can be improved.

It’s therefore a sure bet that what I’ve called the “teacher effects talking point” will be making regular appearances. It goes something like this: Teachers are the most important schooling factor influencing student achievement. This argument provides much of the empirical backbone for the current push toward changes in teacher personnel policies. It is an important finding based on high-quality research, one with clear policy implications. It is also potentially very misleading.

The same body of evidence that shows that teachers are the most important within-school factor influencing test score gains also demonstrates that non-school factors matter a great deal more. The first wave of high-profile articles in our newly-energized education debate is not merely failing to provide this context; it is ignoring it completely. Deliberately or not, these articles are presenting incorrect information dressed up as empirical fact, spreading it throughout a mass audience new to the topic, to the detriment of us all.

The Test-Based Language Of Education

A recent poll on education attitudes from Gallup and Phi Delta Kappan got a lot of attention, including a mention on ABC’s "This Week with Christiane Amanpour," which devoted most of its show to education yesterday. They flashed results for one of the poll’s questions, showing that 72 percent of Americans believe that "each teacher should be paid on the basis of the quality of his or her work," rather than on a "standard-scale basis."

Anyone who knows anything about survey methodology knows that responses to questions can vary dramatically with different wordings (death tax, anyone?). The wording of this Gallup/PDK question, of course, presumes that the "quality of work" among teachers might be measured accurately. The term "teacher quality" is thrown around constantly in education circles, and in practice, it is usually used in the context of teachers’ effects on students’ test scores (as estimated by various classes of "value-added" models).

But let’s say the Gallup/PDK poll asked respondents if "each teacher should be paid on the basis of their estimated effect on their students’ standardized test scores, relative to other teachers." Think the results would be different? Of course. This doesn’t necessarily say anything about the "merit" of the compensation argument, so to speak, nor does it suggest that survey questions should always emphasize perfect accuracy over clarity (which would also create bias of a different sort). But has anyone looked around recently and seen just how many powerful words, such as "quality," are routinely used to refer to standardized test score-related measures? I made a tentative list.

Teachers Matter, But So Do Words

The following quote comes from the Obama Administration’s education "blueprint," which is its plan for reauthorizing ESEA, placing a heavy emphasis, among many other things, on overhauling teacher human capital policies:

Of all the work that occurs at every level of our education system, the interaction between teacher and student is the primary determinant of student success.

Specific wordings vary, but if you follow education even casually, you hear some version of this argument with incredible frequency. In fact, most Americans are hearing it – I’d be surprised if many days pass when some approximation of it isn’t made in a newspaper, magazine, or high-traffic blog. It is the shorthand justification – the talking point, if you will – for the current efforts to base teachers’ hiring, firing, evaluation, and compensation on students’ test scores and other "performance" measures.