Recent Evidence On The New Orleans School Reforms

A new study of New Orleans (NOLA) schools since Katrina, published by the Education Research Alliance (ERA), has caused a predictable stir in education circles (the results are discussed in broader strokes in this EdNext article, while the full paper is forthcoming). The study’s authors, Doug Harris and Matthew Larsen, compare testing outcomes before and after the hurricanes that hit the Gulf Coast in 2005, in districts that were affected by those storms. The basic idea, put simply, is to compare NOLA schools to those in other storm-affected districts, in order to assess the general impact of the drastic educational change undertaken in NOLA, using the other schools/districts as a kind of control group.
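To make the comparison logic concrete, here is a minimal sketch in Python of a simple difference-in-differences calculation, which is the general idea behind using other storm-affected districts as a comparison group. The function name and all of the numbers are hypothetical, made up purely for illustration. They are not taken from the Harris and Larsen study, whose actual models are considerably more involved.

```python
def difference_in_differences(treated_before, treated_after,
                              comparison_before, comparison_after):
    """Change in the treated group minus change in the comparison group."""
    treated_change = treated_after - treated_before
    comparison_change = comparison_after - comparison_before
    return treated_change - comparison_change


# Hypothetical average scores (invented for illustration only,
# NOT results from the study discussed above).
estimate = difference_in_differences(
    treated_before=40.0,     # hypothetical NOLA average, pre-storm
    treated_after=55.0,      # hypothetical NOLA average, post-storm
    comparison_before=42.0,  # hypothetical comparison districts, pre-storm
    comparison_after=47.0,   # hypothetical comparison districts, post-storm
)
print(estimate)  # 10.0
```

The point of subtracting the comparison group's change is to net out whatever would have happened anyway in storm-affected districts, so that what remains can more plausibly be attributed to the reforms.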

The results, in brief, indicate that: 1) aggregate testing results after the storms rose more quickly in NOLA vis-à-vis the comparison districts, with the difference in 2012 being equivalent to roughly 15 percentile points; 2) there was, however, little discernible difference in the trajectories of NOLA students who returned after the storm and their peers in other storm-affected districts (though this latter group could only be followed for a short period, all of which occurred during these cohorts' middle school years). Harris and Larsen also address potential confounding factors, including population change and trauma, finding little or no evidence that these factors generate bias in their results.

The response to this study included the typical mix of thoughtful, measured commentary and reactionary advocacy (from both “sides”). And, at this point, so much has been said and written about the study, and about New Orleans schools in general, that I am hesitant to join the chorus (I would recommend in particular this op-ed by Doug Harris, as well as his presentation at our recent event on New Orleans).

The Magic Of Multiple Measures

Our guest author today is Cara Jackson, Assistant Director of Research and Evaluation at the Urban Teacher Center.

Teacher evaluation has become a contentious issue in the U.S. Some observers see the primary purpose of these reforms as the identification and removal of ineffective teachers; the popular media as well as politicians and education reform advocates have all played a role in the framing of teacher evaluation as such. But, while removal of ineffective teachers was a criterion under Race to the Top, so too was the creation of evaluation systems to be used for teacher development and support.

I think most people would agree that teacher development and improvement should be the primary purpose, as argued here.  Some empirical evidence supports the efficacy of evaluation for this purpose (see here).  And given the sheer number of teachers we need, declining enrollment in teacher preparation programs, and the difficulty disadvantaged schools have retaining teachers, school principals are probably none too enthusiastic about dismissing teachers, as discussed here.

Of course, to achieve the ambitious goal of improving teaching practice, an evaluation system must be implemented well. Fans of Harry Potter might remember when Dolores Umbridge from the Ministry of Magic takes over as High Inquisitor at Hogwarts and conducts “inspections” of Hogwarts’ teachers in Book 5 of J.K. Rowling’s series. These inspections pretty much demonstrate how not to approach classroom observations: she dictates the timing, fails to provide any indication of what aspects of teaching practice she will be evaluating, interrupts lessons with pointed questions and comments, and evidently does no pre- or post-conferencing with the teachers.

Where Al Shanker Stood: Policymaking And Innovation

In this piece, which was published in the New York Times on December 24, 1995, Al Shanker uses a creative analogy to argue that policies require experimentation and refinement before they are brought to scale, and that some reformers mistake this process for rigidity and "stifling innovation."

A couple of weeks ago, the New York Times food section ran an article about a French bread that you can make with a food processor (November 22, 1995). The article claimed that the baguette was as delicious as the kind you buy in a good bakery. I was skeptical. I have made bread for my family and friends for a number of years, and I know that a good French loaf is a real accomplishment. I had no trouble believing that the bread would be quick and easy. But delicious? Nevertheless, I tried the recipe for Thanksgiving. It was terrific!

Though making the bread was as painless as the article said, the process by which Charles van Over, a chef and restaurateur, arrived at the recipe was anything but simple. Van Over experimented over a period of several years in order to get a bread with the best possible texture, flavor, and crust - and a recipe that could be made with predictable results by other cooks. It occurred to me as I read the article that there might be some lessons for school reformers in van Over's systematic efforts to perfect his recipe for a food processor baguette.

Research On Teacher Evaluation Metrics: The Weaponization Of Correlations

Our guest author today is Cara Jackson, Assistant Director of Research and Evaluation at the Urban Teacher Center.

In recent years, many districts have implemented multiple-measure teacher evaluation systems, partly in response to federal pressure from No Child Left Behind waivers and incentives from the Race to the Top grant program. These systems have not been without controversy, largely owing to the perception (not entirely unfounded) that such systems might be used to penalize teachers. One ongoing controversy in the field of teacher evaluation is whether these measures are sufficiently reliable and valid to be used for high-stakes decisions, such as dismissal or tenure. That is a topic that deserves considerably more attention than a single post; here, I discuss just one of the issues that arise when investigating validity.

The diagram below is a visualization of a multiple-measure evaluation system, one that combines information on teaching practice (e.g., ratings from a classroom observation rubric) with student achievement-based measures (e.g., value-added or student growth percentiles) and student surveys. The system need not be limited to three components; the point is simply that classroom observations are not the sole means of evaluating teachers.

In validating the various components of an evaluation system, researchers often examine their correlation with other components.  To the extent that each component is an attempt to capture something about the teacher’s underlying effectiveness, it’s reasonable to expect that different measurements taken of the same teacher will be positively related.  For example, we might examine whether ratings from a classroom observation rubric are positively correlated with value-added.
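For readers who want to see what such a check looks like in practice, here is a minimal sketch in Python that computes a Pearson correlation between two components for the same set of teachers. The function name and the data are hypothetical, invented purely for illustration, and a real validation study would involve far more teachers and would have to account for measurement error in both components.

```python
from math import sqrt


def pearson_correlation(xs, ys):
    """Pearson correlation between two equal-length lists of numbers."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two equal-length lists with at least 2 values")
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sums of cross-products and squared deviations from the means.
    cross = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    ss_x = sum((x - mean_x) ** 2 for x in xs)
    ss_y = sum((y - mean_y) ** 2 for y in ys)
    if ss_x == 0 or ss_y == 0:
        raise ValueError("correlation is undefined when a measure has no variation")
    return cross / sqrt(ss_x * ss_y)


# Hypothetical data for five teachers (invented for illustration only).
observation_ratings = [2.1, 2.8, 3.0, 3.4, 3.9]   # rubric scores
value_added = [-0.10, 0.02, -0.03, 0.08, 0.12]    # value-added estimates

r = pearson_correlation(observation_ratings, value_added)
print(round(r, 2))
```

A positive value here would be consistent with the two components capturing something in common about the teacher, though by itself it says little about whether either measure is suitable for any particular use.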

Do We Know How To Hold Teacher Preparation Programs Accountable?

This piece is co-authored by Cory Koedel and Matthew Di Carlo. Koedel is an Associate Professor of Economics and Public Policy at the University of Missouri, Columbia.

The United States Department of Education (USED) has proposed regulations requiring states to hold teacher preparation programs accountable for the performance of their graduates. According to the proposal, states must begin assigning ratings to each program within the next 2-3 years, based on outcomes such as graduates’ “value-added” to student test scores, their classroom observation scores, how long they stay in teaching, whether they teach in high-needs schools, and surveys of their principals’ satisfaction.

In the long term, we are very receptive to, and indeed optimistic about, the idea of outcomes-based accountability for teacher preparation programs (TPPs). In the short to medium term, however, we contend that the evidence base underlying the USED regulations is nowhere near sufficient to guide a national effort toward high-stakes TPP accountability.

This is a situation in which the familiar refrain of “it’s imperfect but better than nothing” is false, and rushing into nationwide design and implementation could be quite harmful.

New Policy Brief: The Evidence On The Florida Education Reform Formula

The State of Florida is well known in the U.S. as a hotbed of education reform. The package of policies spearheaded by then Governor Jeb Bush during the late 1990s and early 2000s focused, in general, on test-based accountability, competition, and choice. As a whole, they have come to be known as the “Florida Formula for education success,” or simply the “Florida Formula.”

The Formula has received a great deal of attention, including a coordinated campaign to advocate (in some cases, successfully) for its export to other states. The campaign and its supporters tend to employ as their evidence changes in aggregate testing results, most notably unadjusted increases in proficiency rates on Florida’s state assessment and/or cohort changes on the National Assessment of Educational Progress. This approach, for reasons discussed in the policy brief, violates basic principles of causal inference and policy evaluation. Using this method, one could provide evidence that virtually any policy or set of policies “worked” or “didn’t work,” often in the same place and time period.

Fortunately, we needn’t rely on these crude methods, as there is quite a bit of high-quality evidence pertaining to several key components of the Formula, and it provides a basis for tentative conclusions regarding their short- and medium-term (mostly test-based) impact. Today we published a policy brief, the purpose of which is to summarize this research in a manner that is fair and accessible to policymakers and the public.

Developing Workplaces Where Teachers Stay, Improve, And Succeed

** Republished here in the Washington Post

Our guest authors today are Matthew A. Kraft and John P. Papay. Kraft is an Assistant Professor of Education at Brown University. Papay is an Assistant Professor of Education and Economics at Brown University. In 2015, they received the American Educational Research Association Palmer O. Johnson Memorial Award for the research discussed in this essay. 

When you study education policy, the inevitable question about what you do for a living always gets the conversation going. Controversies over teachers unions, charter schools, and standardized testing provide plenty of fodder for lively debates. People often are eager to share their own experiences about individual teachers who profoundly shaped their lives or were less than inspiring.

A large body of research confirms this common experience – teachers have large effects on students’ learning, and some teachers are far more effective than others. What is largely absent in these conversations, and in the scholarly literature, is a recognition of how these teachers are also supported or constrained by the organizational contexts in which they teach.

The absence of an organizational perspective on teacher effectiveness leads to narrow dinner conversations and misinformed policy. We tend to ascribe teachers’ career decisions to the students they teach rather than the conditions in which they work. We treat teachers as if their effectiveness is mostly fixed, always portable, and independent of school context. As a result, we rarely complement personnel reforms with organizational reforms that could benefit both teachers and students.

The Purpose And Potential Impact Of The Common Core

I think it makes sense to have clear, high standards for what students should know and be able to do, and so I am generally a supporter of the Common Core State Standards (CCSS). That said, I’m not comfortable with the way CCSS is being advertised as a means for boosting student achievement (i.e., test scores), nor the frequency with which I have heard speculation about whether and when the CCSS will generate a “bump” in NAEP scores.

To be clear, I think it is plausible to argue that, to the degree that the new standards can help improve the coherence and breadth/depth of the content students must learn, they may lead to some improvement over the long term – for example, by minimizing the degree to which student mobility disrupts learning or by enabling the adoption of coherent learning progressions across grade levels. It remains to be seen whether the standards, as implemented, can be helpful in attaining these goals.

The standards themselves, after all, only discuss the level and kind of learning that students should be pursuing at a given point in their education. They do not say what particular content should be taught when (curricular frameworks), how it should be taught (instructional materials), who will be doing the teaching and with what professional development, or what resources will be made available to teachers and students. And these are the primary drivers of productivity improvements. Saying how high the bar should be raised (or what it should consist of) is important, but outcomes are determined by whether or not the tools are available with which to accomplish that raising. The purpose of having better or higher standards is just that – better or higher standards. If you're relying on immediate test-based gratification due solely to CCSS, you're confusing a road map with how to get to your destination.

Why We Defend The Public Square

The following are the texts of the two speeches from the opening session of our recent two-day conference, “In Defense of the Public Square,” which was held on May 1-2 at Georgetown University in Washington, D.C. The introduction was delivered by Leo Casey and the keynote address was delivered by Randi Weingarten. The video of the full event will be available soon here.

Remarks by Leo Casey

We meet here today in “defense of the public square.”

The public square is the place where Americans come together as a people and establish common goals in pursuit of our common good.

The public square is the place where Americans – in all of our rich diversity – promote the general welfare, achieving as a community what we never could do as private individuals.

The public square is the place where Americans weave together our ideal of political equality and our solidarity with community in a democratic political culture, as de Tocqueville saw so well.

Is The Social Side Of Education Touchy Feely?

That's right, measuring social and organizational aspects of schools is just... well, "touchy feely." We all intuitively grasp that social relations are important in our work environments, that having mentors on the job can make a world of difference, that knowing how to work with colleagues matters to the quality of the end product, that innovation and improvement rely on the sharing of ideas, that having a good relationship with supervisors influences both engagement and performance, and so on.

I could go on, but I don't have to; we all just know these things. But is there hard evidence, other than common sense and our personal experiences? Behaviors such as collaboration and interaction or qualities like trust are difficult to quantify. In the end, is it possible that they are just 'soft' and that, even if they’re important (and they are), they just don't belong in policy conversations?

Wrong.

In this post, I review three distinct methodological approaches that researchers have used to understand social-organizational aspects of schools. Specifically, I selected studies that examine the relationship between aspects of teachers' social-organizational environments and their students' achievement growth. I focus both on the methods and on the substantive findings. This is because I think some basic sense of how researchers look at complex constructs like trust or collegiality can deepen our understanding of this work and lead us to embrace its implications for policy and practice more fully.