
Value-Added Versus Observations, Part Two: Validity


Thanks for your diligent, thoughtful treatment of this topic, Matt. There is much that is troubling about value-added’s rapid rise in public policy, not the least of which is that few seem to have paused to ask some very fundamental (and seemingly obvious) questions about the reliability and validity of the measure. But, hey, let’s not let a lack of supporting empirical evidence thwart the advancement of a popular political agenda.
An examination of value-added (VA) gain scores for more than 7,296 cohorts of students in Ohio for 2009-10 and 2010-11, and of the change in the VA gain for those cohorts from one year to the next, yielded the following correlations:
1. VA gain in 2009-10 is significantly and negatively correlated with VA gain in 2010-11 (r = -0.351, p < .01). 57.7% of positive VA gain scores for a cohort in 2009-10 yielded negative VA gain scores for the same cohort in 2010-11 (2,074 out of 3,593). 61.1% of negative VA gain scores for a cohort in 2009-10 yielded positive VA gain scores for the same cohort in 2010-11 (2,262 out of 3,697).
2. VA gain in 2009-10 is significantly and negatively correlated with the change (increase/decrease) in VA gain in 2010-11 (r = -0.832, p < .01). 78.7% of positive VA gain scores for a cohort in 2009-10 yielded smaller VA gain scores for the same cohort in 2010-11 (2,831 out of 3,593). 80.9% of negative VA gain scores for a cohort in 2009-10 yielded larger VA gain scores for the same cohort in 2010-11 (2,994 out of 3,697).
Accordingly, positive VA gain scores for a cohort in 2009-10 will most likely decrease for the same cohort in 2010-11, and negative VA gain scores for a cohort in 2009-10 will most likely increase for the same cohort in 2010-11. Similar analysis of value-added gain scores from 2005-06, 2006-07, 2007-08, and 2008-09 revealed nearly identical results.
Given the stature of value-added as an accountability metric in Ohio, and given that it is about to be elevated to a teacher effectiveness, evaluation, and compensation metric, I am alarmed by empirical evidence suggesting that what is really on display here is not teacher effectiveness or impact on learning but little more than regression to the mean.
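For anyone who wants to see how far regression to the mean alone can go, here is a minimal simulation sketch. It does not use the Ohio data. It assumes, purely for illustration, that cohort gain scores in the two years are standardized, normally distributed, of equal variance, and correlated at the r = -0.351 reported above. The function names (`pearson`, `simulate`) are hypothetical and not part of any state model.

```python
import math
import random


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mx = sum(xs) / n
    my = sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    vx = sum((x - mx) ** 2 for x in xs)
    vy = sum((y - my) ** 2 for y in ys)
    return cov / math.sqrt(vx * vy)


def simulate(n_cohorts=7296, rho=-0.351, seed=0):
    """Draw simulated (not real) gain scores for two years.

    Assumption: both years are standard normal with correlation rho.
    """
    rng = random.Random(seed)
    year1, year2 = [], []
    for _ in range(n_cohorts):
        z1 = rng.gauss(0.0, 1.0)
        z2 = rng.gauss(0.0, 1.0)
        year1.append(z1)
        # Mixing z1 and z2 this way gives unit variance and correlation rho.
        year2.append(rho * z1 + math.sqrt(1.0 - rho ** 2) * z2)
    return year1, year2


rho = -0.351
year1, year2 = simulate(rho=rho)
change = [b - a for a, b in zip(year1, year2)]

r_years = pearson(year1, year2)
r_change = pearson(year1, change)

# With equal variances, corr(year1, year2 - year1) = -sqrt((1 - rho) / 2).
expected_r_change = -math.sqrt((1.0 - rho) / 2.0)

# Among cohorts with a positive first-year gain: how many flip sign,
# and how many post a smaller gain the following year?
positive = [(a, b) for a, b in zip(year1, year2) if a > 0]
share_flip = sum(1 for a, b in positive if b < 0) / len(positive)
share_shrink = sum(1 for a, b in positive if b < a) / len(positive)

print(f"r(year1, year2)        = {r_years:.3f}")
print(f"r(year1, change)       = {r_change:.3f}")
print(f"closed-form r(change)  = {expected_r_change:.3f}")
print(f"positive -> negative   = {share_flip:.1%}")
print(f"positive -> smaller    = {share_shrink:.1%}")
```

Under the equal-variance assumption, the closed form works out to about -0.82, close to the -0.832 reported in item 2. If that assumption roughly holds for the real scores, the second correlation is largely a mechanical consequence of the first rather than a separate finding, which is consistent with the regression-to-the-mean reading.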

Thanks, Matt...awesome pair of posts that I'll be rereading. ;0) I guess my one question/reaction is not so much about the accuracy of the VA results as it is about what the student tests are assessing in the first place. Current standardized tests offer feedback on only a very thin slice of the learning outcomes we want for our kids. Seems like that's a whole 'nother part of the conversation that may actually have to come before we get to the VA and observation piece.

Very interesting and informative post. It raises more questions than it answers but contributes to the dialogue over the effectiveness of classroom evaluations. A recent book (review below) has the potential to move the evaluative process into the realm of professional development, where I believe there is a strong likelihood of improving not only instruction but also student achievement. Review follows:
A Value Added Decision: To Support the Delivery of the Common Core Standards by Maria C. Guilott and Gaylynn Parker. Outskirts Press, 2012 (available on Amazon and as an E-Book)
This book goes beyond its title to offer a “Values Added” aspect to bringing down the cost of staff development. In most school districts in the country, the budget for professional development fosters school board knife-sharpening. “Why not cut here? After all, the staff we hired are trained professionals. Why spend more money on keeping up their skills when they can very well pay for it themselves through additional course work?” Guilott and Parker have offered a sensible, dynamic and focused staff development program through this little gem of a book. Here they create a method for principals to achieve status as instructional leaders in their respective buildings instead of being seen as evaluators of something they know nothing about. Let’s face it. The faculty rooms of America are filled with teachers who resent being observed and evaluated by people who never taught their subject or who left the classroom because they were not good in the classroom. Good teachers know good teachers, and they know who was not a good teacher. But this system of professional development partners an administrator with other excellent teachers in their buildings; it calls for frank observation of learning and analysis of why a particular method increased the learning possibility for students. Notice that the target is the learning, not the teaching.
A team of one administrator and a few teacher colleagues visits a classroom and focuses on the kids learning, not the teacher talking. These “walk-throughs” are not evaluative; they are learning experiences for the observing teachers, who join the principal on a journey through the process of good teaching and increased student achievement. The principal acts as a guide, helping her staff see good learning experiences happening right in their own buildings. As Grant Wiggins has noted, this form of “look-for” “…puts the camera on the players instead of the coach.” Guilott and Parker’s approach puts the camera on the learners learning rather than the teacher covering stuff. The experience ends with the likelihood that the teachers experiencing the CLW (Collegial Learning Walks) will begin to make similar changes in their own practice. The process foresees a different type of professional dialogue, in place of the typical faculty room talk complaining about the frustrations of dealing with kids. Bringing the CLW approach to instructional leadership fosters action research on the part of the participants. It encourages a new relationship between administrator and teachers. It provides small and large school districts with an inexpensive reform process that can lead to increased student achievement. Finally, it meets the single most important criterion for professional improvement voiced in all the research done on instructional improvement: it allows teachers to see other teachers doing a good job, and teachers want to learn from other good teachers. This little book is one of the most powerful instruments to bring about positive change in the classroom. If implemented in the prescribed manner, this process has the potential to help the American educational system turn that long-awaited corner discussed in all the journals and in the media. It is a simple, elegant idea leading the average principal into the arena of truly becoming an instructional leader in his or her building.
This book is a must-read for teachers, principals, union leaders and superintendents. Its approach is at the heart of a well-planned reform process that can begin at the building level. It reflects Jay McTighe’s suggestions to all school leaders who want to have their vision of professional growth flourish. This system calls for the leaders to “Think big, act small and go for an early win in Iowa.”

In Washington, DC public schools (DCPS), value-added is part of the formula that determines a teacher's Impact score and, as such, directly affects both compensation and continued employment. Please contemplate the implications of the following inequity: At my school, there is one first grade classroom, one second grade classroom, and one combination first/second grade classroom. In assigning children to the combination class, the rationale seems to have been to combine the lowest performing second graders with a random group of first graders. Of the ten second graders in this class, six are English language learners and/or special education students. Not surprisingly, this group consistently makes a poor showing on the standardized tests administered five times a year, while the other second grade class (with no special needs students, and the top five first graders from the year before) consistently outperforms the district average. Because second grade is a testing grade, its outcomes carry weight. At the end of the year, the value-added components of each teacher's professional evaluation will be vastly different, not because of any glaring disparity of skill or effort on their parts, but because of budget/enrollment factors and administrative decisions. I wonder if anyone can argue convincingly that this is in any way a fair or valid use of test scores to evaluate teacher performance, or that it gives any useful information about the relationship between teacher competence and student outcomes.

Really great post. I think your paragraph about the high-stakes use of VA compelling teaching to the test is a critical one. Not only does it result in a watered-down and perverse curriculum, but it also dilutes the data itself. In other words, measuring something changes not only the measurement but also the value of that measurement. To me, the tragedy of this is not just that it causes teaching to the test. Teaching to the test would not be that terrible if the test were well designed and supported by a good curriculum. I teach to "the test" all the time, but sometimes "the test" is a presentation, an essay test, or a long paper. I think supporters of the use of VAM hear "teach to the test" and think to themselves, "Why is that so bad? If the test assesses whether you can read, then why is teaching to the test such an awful thing? Kids need to know how to read." Needless to say, they don't spend the time to look at what the test assesses. Most do not understand the difference between different elements of reading (such as decoding and vocabulary). But when we look at what happens when we use high-stakes metrics (and this is true in any field where there are high-stakes metrics), we get a narrowing of pedagogy and conservative strategies. It becomes much safer to narrow your pedagogy and game the checklist than to take a risk with innovation, or with skills or knowledge that may have long-term benefits rather than benefits at the end of this year. So we then have the problem that teachers are just teaching to this year's test, not next year's, or the one ten years from now. Part of me can't understand why the business community that supports VA doesn't realize that this is exactly the same situation that perverts American business and stifles innovation, which is one of the last comparative advantages that we have in today's global economy. Enron's focus on quarterly stock price led it to game that short-term metric, with no regard for the long-term value of the company.
How can businessmen celebrate long-term visionaries like Steve Jobs or Bill Gates or any number of internet innovators, and then go about discouraging, in our education system, the very conditions that make their long-term thinking possible? Charter schools or vouchers aren't magical innovation machines if they are held to the same high-stakes testing standards. And if the high stakes are the problem, why not free our public school teachers to innovate themselves?


This web site and the information contained herein are provided as a service to those who are interested in the work of the Albert Shanker Institute (ASI). ASI makes no warranties, either express or implied, concerning the information contained on or linked from this site. The visitor uses the information provided herein at his/her own risk. ASI, its officers, board members, agents, and employees specifically disclaim any and all liability from damages which may result from the utilization of the information provided herein. The content in the Shanker Blog may not necessarily reflect the views or official policy positions of ASI or any related entity or organization.