Value-Added, For The Record

Comments

(I posted this once and it disappeared. Sorry if this is a dupe.) Given the "thin research" on VAM and high school, why would you write an entire post in favor of using VA to assess teachers without mentioning whether you would limit it to teachers of grades 3-8? Seems like a big hole. So, do you think it's only appropriate for elementary school teachers, or do you think that all new teachers (and apparently new teachers only) should be assessed with VA? Second question, and this one is real: does VA take the average proficiency level per classroom into account? Teacher A has 30 students: 5 advanced, 15 proficient, 10 just below basic. Teacher B has 30 students: 5 basic, 10 below basic, 15 far below basic. Are the expectations for each teacher the same for each proficiency level? Or, a real-life example: http://educationrealist.wordpress.com/2012/02/24/algebra-student-distribution-an-example/ This is the actual distribution of incoming algebra ability at a Title I school (I collected the data myself, as one of the teachers). Should each teacher be assessed based on incoming ability only, even though some teachers had far fewer below-basic students than others?
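A minimal sketch may help pin down what "taking incoming ability into account" means in the simplest value-added setups. This is illustrative only — the teacher names and scores are hypothetical, and real models such as EVAAS are far more elaborate — but the core idea is: regress each student's current score on his or her prior score, then average the residuals by teacher.

```python
# Illustrative sketch (not any state's actual model): a bare-bones
# value-added estimate regresses each student's current score on the
# prior score, then averages the residuals by teacher.
# All data and teacher names below are hypothetical.

def ols(x, y):
    """Slope and intercept of a simple least-squares regression."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    slope = (sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
             / sum((xi - mx) ** 2 for xi in x))
    return slope, my - slope * mx

def value_added(students):
    """students: list of (teacher, prior_score, current_score) tuples."""
    prior = [s[1] for s in students]
    curr = [s[2] for s in students]
    slope, intercept = ols(prior, curr)  # expected score given prior score
    residuals = {}
    for teacher, p, c in students:
        residuals.setdefault(teacher, []).append(c - (intercept + slope * p))
    # A teacher's "value added" is the mean residual: did the students
    # beat the prediction made from their incoming ability alone?
    return {t: sum(r) / len(r) for t, r in residuals.items()}
```

Under this sketch, the expectation for a far-below-basic student is lower than for an advanced one, so Teacher B is not penalized for the incoming distribution per se — but only if the prior-score relationship is linear and identical across classrooms, which is exactly the assumption the example above calls into question.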

John Thompson- to your point, check out empirical evidence from TN on their public website: https://tvaas.sas.com/welcome.html?ad=DDm8vljOAeANuo4n Click "Reports" - "Scatterplots" - Pick a subject/grade/year - "Growth vs. % Economically Disadvantaged" - and "Add All" schools. You can view scatterplots of all TN schools ranked by % FRPL and see that high-poverty schools have a fair and equal ability to show growth as compared to affluent schools. In fact, more of these high-poverty schools are above the growth standard (making more than the expected amount of progress with their students as measured by value-added) than not. Full disclosure: I work for SAS, which provides TN's value-added analysis. I think that looking through the actual data is the best way to see the level playing field across socioeconomic, demographic, SPED, and achievement groups. However, I also understand that not all value-added models are the same across states, nor are the policy decisions placed around them the same. Here is one of my blogs where I interviewed two NC teachers to gather their opinions of value-added being used as one component of their evaluation system: http://blogs.sas.com/content/statelocalgov/2012/06/27/nc-teachers-voices-regarding-use-of-student-growth-in-educator-evaluations/

Hi Matt, As always, your balanced and nuanced approach is important. You are quite right that the battle is more about the firing (but also shaming) of teachers. I think if teachers and administrators were simply given the calculations as information for reference, there would be no problem. But when using something for policy and other high-stakes decision making (down to the opening and closing of entire schools), it makes sense to require a high degree (.9+) of reliability and validity, which, in their totality, I don't think these models can make a very strong claim to. But there is one last reason, perhaps most important, that these models should be used for information rather than for strict, automatic decisions. That is the role the tests themselves play in the curriculum. As a teacher, I was not at all worried about being fired. I was worried about how my entire curriculum was designed not to teach conceptual understanding of mathematics but to narrowly produce the scores that would make people look good. As a test prep coach and math major, I was acutely aware of the difference and how much it impacted all decision making--including mine. In my opinion, the overuse of the tests is the disease; the model and the debate are more symptom.

Nadja, You need a password to access it. Can you access it and find evidence that high-poverty NEIGHBORHOOD high schools are not disadvantaged? If so, I'd like to read it. And if so, can you find a single objective study that confirms it? While we're comparing scatterplots, check out Tulsa's web site. Between 85% and 100% of all of their high schools have the identical pattern - high-performing schools have high growth while low-performing schools have low growth. So, maybe, Tennessee can do mass firings of high-poverty high school teachers and not do injustices, but this is a big diverse nation. "Reformers" are imposing it on 90% low-income districts, like mine, as well as places with many times as much per-student funding. By the way, have you taught in a high-poverty NEIGHBORHOOD high school? If you have, you know that they typically have nothing in common with high-poverty magnet and charter schools. Please note that I chose my words carefully and I wrote of "schools where it is harder to raise test scores." If you've been in NEIGHBORHOOD schools, you've seen that the out-of-control violence, the prohibitions on enforcing attendance, disciplinary, and academic policies, and other policies set from above correlate pretty well with poverty, but it's not one-to-one. Maybe you've seen a high-poverty NEIGHBORHOOD high school where teachers were allowed to enforce the rules, but I never have. Who knows? If the two teachers you cite have taught in the inner city, perhaps they've experienced something other than mayhem. I can't imagine a teacher who has taught in schools like mine who would trust a value-added model. And please remember, the issue isn't whether value-added might work some places. The issue is whether it will cause devastating harm to many. And the poorer the district, the more likely that value-added could prove an existential threat.
In a place like my OKC, where there are twenty-some other districts in the county, so that teachers need only extend their commute a few minutes to get away from the perfect storm of value-added and intense concentrations of trauma and generational poverty, I don't think our district has a snowball's chance of surviving an extended period of high-stakes value-added.

Matt- I'm very appreciative of the thoughtful comments you have made. In the abstract, I can agree that value-added estimates might provide a useful but limited signal regarding school or teacher performance. (As already noted, the tests themselves are very limited as measures.) However, as I think you would agree, the major problems arise in how states are using this information. It seems that there has been far too little research done to validate the overall evaluation systems being enacted, not just the role value-added measures will play. (I worry that many policymakers harbor a bias toward what they believe are objective quantitative measures of performance, having little understanding of or appreciation for the many subjective judgments that go into their construction.) I was pleased to see that Ms. Young has entered this conversation. I'll state up front that I'm not convinced that EVAAS is purged of the bias introduced by unmeasured demographic characteristics. I was wondering what SAS's response was to the concerns raised by Dr. Bruce Baker in this post: http://schoolfinance101.wordpress.com/2011/11/06/when-vams-fail-evaluating-ohios-school-performance-measures/ Baker's work suggests that the school-level Ohio value-added scores are negatively related to poverty (free and reduced-price lunch) and special education status. When I looked at school-level EVAAS (PVAAS) indexes published for Pennsylvania, I found that truancy rates (not typically part of state data collections) have a moderate negative (-.4) correlation with PVAAS reading scores. The Ballou, Mokher, and Cavalluzzo paper from last March's AEFP conference suggests that model choice and unmeasured demographic characteristics can have a very substantial impact on which teachers end up in the tails of the value-added distribution. So it seems to me that John Thompson's concerns are very important and not easily dismissed.
I think it is incumbent on states to provide evidence regarding potential VAM bias that is evaluated by qualified third parties. From what I have observed, states don't even seem to bother with putting out RFPs or soliciting competitive bids for their vendors of value-added analyses. Relying solely on vendors for technical reviews of their own products is indefensible. Finally, if you know of any independent organization that is evaluating either the value-added models being sold to states or the overall evaluation systems they are being incorporated into, please pass that along. I recently saw a piece put out by the Center for American Progress. At least in the case of Pennsylvania, they only interviewed a couple of officials of the State Department of Education. I have no doubt they would have come up with a different picture had they bothered to dig deeper. Until these concerns are addressed and the evaluation systems undergo rigorous validation, I will retain my belief that value-added remains a noisy signal of performance and will find much greater value as a powerful research tool.

If value-added ratings based on one year's worth of data are highly volatile and even random, as they seem to be, why should they be included as part of teacher evaluation? Instead they will give teachers a false sense of confidence if they are high, and unfairly wreck their morale if they are low.

Matthew, Would a system be valid if 5 or 10 or 15% of teachers at a high-poverty high school PER YEAR have their careers damaged or destroyed due to a statistical model that can't control for poverty? How many teachers must see how many of their friends damaged by flawed guesstimates before they throw in the towel? How could urban teachers have any peace of mind when they can be fired, in part, because of circumstances beyond their control? Is it valid, in terms of policy, to take such risks in high schools without first doing the research on high schools? Of course, I support the firing of ineffective teachers. My preference would be PAR, but I'd prefer empowered, even unreformed, principals over a system that encourages more bubble-in testing. I'd even reluctantly support the Grand Bargain where peer evaluators consider value-added, although there is a huge difference between using it to complement or check human observations, as opposed to the system which is now incentivized - where value-added can indict a teacher as ineffective. For many principals, the indictment would be tantamount to a conviction. But value-added is systemically different for three reasons. First, it's most likely to produce an exodus of teaching talent from schools where it is harder to raise test scores. Secondly, abusive human evaluators, I bet, are distributed pretty equally across all types of schools, while value-added stacks EVERYTHING against urban schools, where fearful administrators clearly are more willing to sacrifice their teachers and play games with numbers. Steve nailed the biggest reason: value-added will sentence more students to educational malpractice as teachers are pressured to teach to the test. And, given our victories recently, I'm reconsidering my support for the Grand Bargain. Now is the time to drive a stake through the heart of high-stakes test-driven experiments. Finally, you indicated you support 10 to 15% high stakes, which lowers the potential harm.
But graduation rates and attendance rates often are only 10% each of NCLB accountability. How many districts responded by completely fabricating those metrics? I bet they are the most falsified data in existence. In other words, even fairly small metrics are huge for some administrators and thus become their job #1. And that gets us back to why value-added is systemically more dangerous for poor schools. Certainly, that has been my experience, where under-the-gun urban systems react with the most fear and retribution.

I'm stunned. I'm absolutely stunned. I can't even gather myself to reply. So, I'll just ask this. What available evidence do you have that value-added is valid for high-poverty high schools? After I get over my shock, depending on whether you cite a source for high schools (and please don't cite Chetty et al., because it excludes classes where over 25% of students are on IEPs), I'll want to know what evidence you find to be persuasive. Or, are we teachers (and our students, who will be sacrificed to more rote instruction) in the inner city to be seen as pawns to be sacrificed on the off chance that the evidence you cite might prove valid? I just can't believe you wrote that ...

Hi John, I'm not sure why you're stunned, but I guess it's good I wrote this post. In either case, I like the candor. For me to answer your question, you would have to define "valid," and, specifically, "valid for what?" As you know, validity is a property of the inferences one draws from the measures, rather than of the measures themselves. This means that one cannot really address the validity of value-added without reference to how the measures are used. Correct me if I'm wrong, but the language in your comment ("pawns to be sacrificed") makes it sound like you're objecting to firing teachers with value-added. One of my big points in this post, which I may not have expressed clearly enough, is the distinction between value-added as a potentially useful signal and value-added as the basis for high-stakes decisions. Accordingly, I would ask you the following questions: Are you okay with firing teachers based on classroom observations, or on some combination of non-test measures? If that's acceptable to you, can you produce evidence that observation ratings (or the other non-test measures) are "valid" as you define it? If you're not okay with firing, then this becomes a different conversation. Thanks for the comment, MD P.S. There is some research on value-added at the high school level, but it's a little thin, since most states only test in grades 3-8.

DISCLAIMER

This web site and the information contained herein are provided as a service to those who are interested in the work of the Albert Shanker Institute (ASI). ASI makes no warranties, either express or implied, concerning the information contained on or linked from shankerblog.org. The visitor uses the information provided herein at his/her own risk. ASI, its officers, board members, agents, and employees specifically disclaim any and all liability from damages which may result from the utilization of the information provided herein. The content in the Shanker Blog may not necessarily reflect the views or official policy positions of ASI or any related entity or organization.