About Value-Added And "Junk Science"
One can often hear opponents of value-added referring to these methods as “junk science.” The term is meant to express the argument that value-added is unreliable and/or invalid, and that its scientific “façade” is without merit.
Now, I personally am not opposed to using these estimates in evaluations and other personnel policies, but I certainly understand opponents’ skepticism. For one thing, there are some states and districts in which design and implementation have been somewhat careless, and, in these situations, I very much share the skepticism. Moreover, the common argument that evaluations, in order to be “meaningful,” must give value-added measures a heavily-weighted role (e.g., 45-50 percent) is, in my view, unsupportable.
All that said, calling value-added “junk science” completely obscures the important issues. The real questions here are less about the merits of the models per se than about how they’re being used.
If value-added is “junk science” regardless of how it's employed, then a fairly large chunk of social scientific research is “junk science." If that’s your opinion, then okay – you’re entitled to it – but it’s not very compelling, at least in my (admittedly biased) view.
And those who hold this opinion will find that their options for using evidence to support their policy views are extremely limited. They should, for instance, cease citing the CREDO charter school study, which uses a somewhat similar approach – i.e., put simply, judging effectiveness by statistical comparison with schools serving similar students. In this sense, CREDO (and most of the charter school literature) must also be called “junk science."
Furthermore, what is the case against calling classroom observations “junk science” too? Even when done properly – by well-trained observers observing multiple times throughout the year – observation scores also fluctuate over time and between raters, and they are subject to systematic bias (e.g., poorly-trained or vindictive principals).
You might believe that human judgment is a better way to assess performance than analyzing large-scale test score datasets, and you might be correct, but that's just an opinion, and it hardly means that all alternative measures are "junk" no matter their policy deployment.
It's also important to bear in mind the obvious fact that value-added has a wide range of research-related uses outside of high-stakes personnel decisions (including program evaluation). In fact, many of the conclusions from this literature are things with which few teachers would disagree – e.g., that teachers vary widely in their measured performance, that they improve a great deal during their first few years, etc.
In short, value-added models are what they are – sophisticated but imperfect tools that must be used properly. We can and should disagree about their proper uses, but calling the models "junk science" adds almost nothing of substance to that debate.
- Matt Di Carlo