NAEP Shifting

** Also posted here on “Valerie Strauss’ Answer Sheet” in the Washington Post

Tomorrow, the education world will get the results of the 2011 National Assessment of Educational Progress (NAEP), often referred to as the “nation’s report card.” The findings – reading and math scores among a representative sample of fourth and eighth graders – will drive at least part of the debate for the next two years, until the next round comes out.

I’m going to make a prediction, one that is definitely a generalization, but is hardly uncommon in policy debates: People on all “sides” will interpret the results favorably no matter how they turn out.

If NAEP scores are positive – i.e., overall scores rise by a statistically significant margin, and/or there are encouraging increases among key subgroups such as low performers or low-income students – supporters of market-based reform will say that their preferred policies are working. They’ll claim that the era of test-based accountability, which began with the enactment of No Child Left Behind ten years ago, has produced real results. Market reform skeptics, on the other hand, will say that virtually none of the policies for which reformers are pushing, such as test-based teacher evaluations and merit pay, were in force in more than a handful of locations between 2009 and 2011. Therefore, they’ll claim, the NAEP progress shows that the system is working without these changes.

If the NAEP results are not encouraging – i.e., overall progress is flat (or negative), and there are no strong gains among key subgroups – the market-based crowd will use the occasion to argue that the “status quo” isn’t producing results, and they will strengthen their call for policies like new evaluations and merit pay. Skeptics, in contrast, will claim that NCLB and standardized test-based accountability were failures from the get-go. Some will even use the NAEP results to advocate for the wholesale elimination of standardized testing.

Needless to say, most actual reactions won’t be this clear-cut – there will be an endless stream of articles, posts and policy briefs breaking down the results by state, subgroup and subject, all of them trying to discern some kind of meaning from the data, preferably one that fits in with their pre-existing positions. I’m all for that, even if much of it will reflect misinterpretation. No doubt I will do it myself. NAEP is a great test, and the data, used responsibly, can be very useful.

What I would like to see is for people on all “sides” to acknowledge that, no matter how the results turn out, they can’t be used to draw even moderately strong inferences about what works and what doesn’t. The main NAEP assessments provide a snapshot of math and reading performance among fourth and eighth graders at a single point in time. Even broken down by subgroup, the data can mask serious shifts in the conditions and characteristics of students taking the test. This is especially true given that the past two years have been marked by severe economic hardship among U.S. families, as well as massive budget cuts to public education.

More importantly, test scores at this aggregate level – across entire states – cannot be used to make arguments about the causal impact of specific policies. There are just too many intervening factors, inside and outside of education policy, that can influence test scores. You can make suggestions, and present tentative evidence, and that’s all great. But you can’t draw anything resembling strong policy conclusions using these data alone. Period.

The fact that, in the past, all “sides” of the education debate have used the same NAEP data as “evidence” for their completely opposite viewpoints suggests that these data really aren’t suited for this purpose. So, let’s get on with the scatterplots and line graphs, but if we choose to offer our interpretations as to what they mean about education policy, let’s call this what it is: Speculation.

- Matt Di Carlo