Accountability

  • Student Sorting And Teacher Classroom Observations

    Written on February 25, 2016

    Although value added and other growth models tend to be the focus of debates surrounding new teacher evaluation systems, the widely known but frequently unacknowledged reality is that most teachers don’t teach in the tested grades and subjects, and won’t even receive these test-based scores. The quality and impact of the new systems therefore will depend heavily upon the quality and impact of other measures, primarily classroom observations.

    These systems have been in use for decades, and yet, until recently, relatively little was known about their properties, such as their association with student and teacher characteristics, and there are, as yet, only a handful of studies of their impact on teachers’ performance (e.g., Taylor and Tyler 2012). The Measures of Effective Teaching (MET) Project, conducted a few years ago, was a huge step forward in this area, though at the time it was perhaps underappreciated that MET’s contribution lay not just in the (very important) reports it produced, but also in the extensive dataset it collected for researchers to use going forward. A new paper, just published in Educational Evaluation and Policy Analysis, is among the many analyses that have used or will use MET data to address important questions surrounding teacher evaluation.

    The authors, Rachel Garrett and Matthew Steinberg, look at classroom observation scores, specifically those from Charlotte Danielson’s widely employed Framework for Teaching (FFT) protocol. Their results are yet another example of how observation scores share most of the widely cited (statistical) criticisms of value added scores, most notably their sensitivity to which students are assigned to teachers.

  • Beyond Teacher Quality

    Written on February 23, 2016

    Beyond PD: Teacher Professional Learning in High-Performing Systems is a recent report from the Learning First Alliance and the International Center for Benchmarking in Education at the National Center on Education and the Economy. The paper describes practices and policies from four high-performing school systems – British Columbia, Hong Kong, Shanghai, and Singapore – where professional learning is believed to be the primary vehicle for school improvement.

    My first reaction was: This sounds great, but where is the ubiquitous discussion of “teacher quality?” Frankly, I was somewhat baffled that a report on school improvement never even mentioned the phrase.* Upon close reading, I found the report to be full of radical (and very good) ideas. It’s not that the report proposed anything that would require an overhaul of the U.S. education system; rather, its ideas were groundbreaking because they did not rely on the typical assumptions about how the youth or the adults in these systems learn and achieve mastery. While things are changing a bit in the U.S. with regard to our understanding of student learning – e.g., we now talk about “deep learning” – we have still not made this transition when it comes to teachers.

    In the U.S., a number of unstated but common assumptions about “teacher quality” suffuse the entire school improvement conversation. As researchers have noted (see here and here), instructional effectiveness is implicitly viewed as an attribute of individuals, a quality that exists in a sort of vacuum (or independent of the context of teachers’ work), and which, as a result, teachers can carry with them, across and between schools. Effectiveness also is often perceived as fairly stable: teachers learn their craft within the first few years in the classroom and then plateau,** but, at the end of the day, some teachers have what it takes and others just don’t. So, the general assumption is that a “good teacher” will be effective under any conditions, and the quality of a given school is determined by how many individual “good teachers” it has acquired.

  • Evidence From A Teacher Evaluation Pilot Program In Chicago

    Written on December 4, 2015

    The majority of U.S. states have adopted new teacher evaluation systems over the past 5-10 years. Although these new systems remain among the most contentious issues in education policy today, there is still only minimal evidence on their impact on student performance or other outcomes. This is largely because good research takes time.

    A new article, published in the journal Education Finance and Policy, is among the handful of analyses examining the preliminary impact of teacher evaluation systems. The researchers, Matthew Steinberg and Lauren Sartain, take a look at the Excellence in Teaching Project (EITP), a pilot program carried out in Chicago Public Schools starting in the 2008-09 school year. A total of 44 elementary schools participated in EITP in the first year (cohort 1), while an additional 49 schools (cohort 2) implemented the new evaluation systems the following year (2009-10). Participating schools were randomly selected, which permits researchers to gauge the impact of the evaluations experimentally.

    The results of this study are important in themselves, and they also suggest some more general points about new teacher evaluations and the growing body of evidence surrounding them.

  • Where Al Shanker Stood: The Importance And Meaning Of NAEP Results

    Written on October 30, 2015

    In this New York Times piece, published on July 29, 1990, Al Shanker discusses the results of the National Assessment of Educational Progress (NAEP), and what they suggested about the U.S. education system at the time.

    One of the things that has influenced me most strongly to call for radical school reform has been the results of the National Assessment of Educational Progress (NAEP) examinations. These exams have been testing the achievement of our 9-, 13- and 17-year-olds in a number of basic areas over the past 20 years, and the results have been almost uniformly dismal.

    According to NAEP results, no 17-year-olds who are still in school are illiterate and innumerate - that is, all of them can read the words you would find on a cereal box or a billboard, and they can do simple arithmetic. But very few achieve what a reasonable person would call competence in reading, writing or computing.

    For example, NAEP's 20-year overview, Crossroads in American Education, indicated that only 2.6 percent of 17-year-olds taking the test could write a good letter to a high school principal about why a rule should be changed. And when I say good, I'm talking about a straightforward presentation of a couple of simple points. Only 5 percent could grasp a paragraph as complicated as the kind you would find in a first-year college textbook. And only 6 percent could solve a multi-step math problem like this one: "Christine borrowed $850 for one year from Friendly Finance Company. If she paid 12% simple interest on the loan, what was the total amount she repaid?"

  • The Magic Of Multiple Measures

    Written on August 6, 2015

    Our guest author today is Cara Jackson, Assistant Director of Research and Evaluation at the Urban Teacher Center.

    Teacher evaluation has become a contentious issue in the U.S.  Some observers see the primary purpose of these reforms as the identification and removal of ineffective teachers; the popular media as well as politicians and education reform advocates have all played a role in the framing of teacher evaluation as such.  But, while removal of ineffective teachers was a criterion under Race to the Top, so too was the creation of evaluation systems to be used for teacher development and support.

    I think most people would agree that teacher development and improvement should be the primary purpose, as argued here.  Some empirical evidence supports the efficacy of evaluation for this purpose (see here).  And given the sheer number of teachers we need, declining enrollment in teacher preparation programs, and the difficulty disadvantaged schools have retaining teachers, school principals are probably none too enthusiastic about dismissing teachers, as discussed here.

    Of course, to achieve the ambitious goal of improving teaching practice, an evaluation system must be implemented well.  Fans of Harry Potter might remember when Dolores Umbridge from the Ministry of Magic takes over as High Inquisitor at Hogwarts and conducts “inspections” of Hogwarts’ teachers in Book 5 of J.K. Rowling’s series.  These inspections pretty much demonstrate how not to approach classroom observations: she dictates the timing, fails to provide any indication of what aspects of teaching practice she will be evaluating, interrupts lessons with pointed questions and comments, and evidently does no pre- or post-conferencing with the teachers.

  • Research On Teacher Evaluation Metrics: The Weaponization Of Correlations

    Written on July 21, 2015

    Our guest author today is Cara Jackson, Assistant Director of Research and Evaluation at the Urban Teacher Center.

    In recent years, many districts have implemented multiple-measure teacher evaluation systems, partly in response to federal pressure from No Child Left Behind waivers and incentives from the Race to the Top grant program. These systems have not been without controversy, largely owing to the perception – not entirely unfounded – that such systems might be used to penalize teachers.  One ongoing controversy in the field of teacher evaluation is whether these measures are sufficiently reliable and valid to be used for high-stakes decisions, such as dismissal or tenure.  That is a topic that deserves considerably more attention than a single post; here, I discuss just one of the issues that arise when investigating validity.

    The diagram below is a visualization of a multiple-measure evaluation system, one that combines information on teaching practice (e.g. ratings from a classroom observation rubric) with student achievement-based measures (e.g. value-added or student growth percentiles) and student surveys.  The system need not be limited to three components; the point is simply that classroom observations are not the sole means of evaluating teachers.

    In validating the various components of an evaluation system, researchers often examine their correlation with other components.  To the extent that each component is an attempt to capture something about the teacher’s underlying effectiveness, it’s reasonable to expect that different measurements taken of the same teacher will be positively related.  For example, we might examine whether ratings from a classroom observation rubric are positively correlated with value-added.

  • Empower Teachers To Lead, Encourage Students To Be Curious

    Written on July 9, 2015

    Our guest author today is Ashim Shanker, a former English Language Arts teacher in public schools in Tokyo, Japan. Ashim has a Master’s Degree in International Education Policy from Harvard University and is the author of three books, including Don’t Forget to Breathe. Follow him on Twitter at @ashimshanker.

    In the 11 years that I was a public school teacher in Japan, I came to view education as a holistic enterprise. Schools in Japan not only imbued students with relevant skills, but also nurtured within them the wherewithal to experience a sense of connection with the larger world, and the exploratory capacity to discover their place within it.

    In my language arts classes, I encouraged students to read about current events and human rights issues around the world. I asked them to make lists of the electronics they used, the garments they wore, and the food products they consumed on a daily basis. I then had them research where these products were made and under what labor conditions.

    The students gave presentations on child laborers and modern-day slavery. They debated government secrecy laws in Japan and cover-ups in the aftermath of the Fukushima nuclear disaster. They read an essay on self-reliance by Emerson and excerpts on civil disobedience by Thoreau, and I asked them how these two activists might have felt about the actions of groups like Anonymous, or about whistleblowers like Edward Snowden. We discussed the Milgram Experiment and the Stanford Prison Experiment, exploring how obedience and situational role conformity might tip even those with the best of intentions toward acts of cruelty. We talked about bullying, and shared anecdotes of instances in which we might unintentionally have hurt others. There were opportunities for self-reflection, engagement, and character building – attributes that I would like to think foster the empathic foundations for better civic engagement and global citizenship.

  • Do We Know How To Hold Teacher Preparation Programs Accountable?

    Written on June 30, 2015

    This piece is co-authored by Cory Koedel and Matthew Di Carlo. Koedel is an Associate Professor of Economics and Public Policy at the University of Missouri, Columbia.

    The United States Department of Education (USED) has proposed regulations requiring states to hold teacher preparation programs accountable for the performance of their graduates. According to the proposal, states must begin assigning ratings to each program within the next 2-3 years, based on outcomes such as graduates’ “value-added” to student test scores, their classroom observation scores, how long they stay in teaching, whether they teach in high-needs schools, and surveys of their principals’ satisfaction.

    In the long term, we are very receptive to, and indeed optimistic about, the idea of outcomes-based accountability for teacher preparation programs (TPPs). In the short to medium term, however, we contend that the evidence base underlying the USED regulations is nowhere near sufficient to guide a national effort toward high-stakes TPP accountability.

    This is a situation in which the familiar refrain of “it’s imperfect but better than nothing” is false, and rushing into nationwide design and implementation could be quite harmful.

  • Will Value-Added Reinforce The Walls Of The Egg-Crate School?

    Written on June 25, 2015

    Our guest author today is Susan Moore Johnson, Jerome T. Murphy Research Professor in Education at Harvard Graduate School of Education. Johnson directs the Project on the Next Generation of Teachers, which examines how best to recruit, develop, and retain a strong teaching force.

    Academic scholars are often dismayed when policymakers pass laws that disregard or misinterpret their research findings. The use of value-added methods (VAMS) in education policy is a case in point.

    About a decade ago, researchers reported that teachers are the most important school-level factor in students’ learning, and that their effectiveness varies widely within schools (McCaffrey, Koretz, Lockwood, & Hamilton 2004; Rivkin, Hanushek, & Kain 2005; Rockoff 2004). Many policymakers interpreted these findings to mean that teacher quality rests with the individual rather than the school and that, because some teachers are more effective than others, schools should concentrate on increasing their number of effective teachers.

    Based on these assumptions, proponents of VAMS began to argue that schools could be improved substantially if they would only dismiss teachers with low VAMS ratings and replace them with teachers who have average or higher ratings (Hanushek 2009). Although panels of scholars warned against using VAMS to make high-stakes decisions because of their statistical limitations (American Statistical Association, 2014; National Research Council & National Academy of Education, 2010), policymakers in many states and districts moved quickly to do just that, requiring that VAMS scores be used as a substantial component in teacher evaluation.

  • Trust: The Foundation Of Student Achievement

    Written on May 21, 2015

    When sharing with me the results of some tests, my doctor once said, "You are a scientist, you know a single piece of data can't provide all the answers or suffice to make a diagnosis. We can't look at a single number in isolation, we need to look at all results in combination." Was my doctor suggesting that I ignore that piece of information we had? No. Was my doctor deemphasizing the result? No. He simply said that we needed additional evidence to make informed decisions. This is, of course, correct.

    In education, however, it is frequently implied or even stated directly that the bottom line when it comes to school performance is student test scores, whereas any other outcomes, such as cooperation between staff or a supportive learning environment, are ultimately "soft" and, at best, of secondary importance. This test-based, individual-focused position is viewed as serious, rigorous, and data driven. Deviation from it -- e.g., equal emphasis on additional, systemic aspects of schools and the people in them -- is sometimes derided as an evidence-free mindset. Now, granted, few people are “purely” in one camp or the other. Most probably see themselves as pragmatists, and, as such, somewhere in between: Test scores are probably not all that matters, but since the rest seems so difficult to measure, we might as well focus on "hard data" and hope for the best.

    Why this narrow focus on individual measures such as student test scores or teacher quality? I am sure there are many reasons, but one is probably lack of familiarity with the growing research showing that we must go beyond the individual teacher and student and examine the social-organizational aspects of schools, which are associated (most likely causally) with student achievement. In other words, all the factors that skeptics and pragmatists might think are a distraction and/or a luxury are actually relevant for the one thing we all care about: Student achievement. Moreover, increasing focus on these factors might actually help us understand what’s really important: Not simply whether testing results went up or down, but why or why not.

DISCLAIMER

This web site and the information contained herein are provided as a service to those who are interested in the work of the Albert Shanker Institute (ASI). ASI makes no warranties, either express or implied, concerning the information contained on or linked from shankerblog.org. The visitor uses the information provided herein at his/her own risk. ASI, its officers, board members, agents, and employees specifically disclaim any and all liability from damages which may result from the utilization of the information provided herein. The content in the Shanker Blog may not necessarily reflect the views or official policy positions of ASI or any related entity or organization.