** Also posted here on “Valerie Strauss’ Answer Sheet” in the Washington Post
About two weeks ago, the National Education Policy Center (NEPC) released a review of last year’s Los Angeles Times (LAT) value-added analysis – with a specific focus on the technical report upon which the paper’s articles were based (done by RAND’s Richard Buddin). The critique’s authors – Derek Briggs and Ben Domingue – redid the LAT analysis and found, in line with prior research, that teachers’ scores vary widely, but also that the LAT estimates change under different model specifications, are error-prone, and conceal systematic bias from non-random classroom assignments. They were also, for reasons yet unknown, unable to replicate the results.
Since then, the Times has issued two responses. The first was a quickly-published article, which claimed (including in the headline) that the LAT results were confirmed by Briggs/Domingue – even though the review reached the opposite conclusions. The basis for this claim, according to the piece, was that both analyses showed wide variation in teachers’ effects on test scores (see NEPC’s reply to this article). Then, a couple of days ago, there was another response, this time on the Times’ ombudsman-style blog. This piece quotes the paper’s Assistant Managing Editor, David Lauter, who stands by the paper’s findings and the earlier article, arguing that the biggest question is:
...whether teachers have a significant impact on what their students learn or whether student achievement is all about ... factors outside of teachers’ control. ... The Colorado study comes down on our side of that debate. ... For parents and others concerned about this issue, that’s the most significant finding: the quality of teachers matters.
Saying “teachers matter” is roughly equivalent to saying that teacher effects vary widely - the more teachers vary in their effectiveness, controlling for other relevant factors, the more they can be said to “matter” as a factor explaining student outcomes. Since both analyses found such variation, the Times claims that the NEPC review confirms their “most significant finding.”
The review’s authors had a much different interpretation (see their second reply). This may seem frustrating. All the back and forth has mostly focused on somewhat technical issues, such as model selection, sample comparability, and research protocol (with some ethical charges thrown in for good measure). These are essential matters, but there is also an even simpler reason for the divergent interpretations, one that is critically important and arises constantly in our debates about value-added.