The Uncertain Future Of Charter School Proliferation

This is the third in a series of three posts about charter schools. Here are the first and second parts.

As discussed in prior posts, high-quality analyses of charter school effects show that there is wide variation in the test-based effects of these schools but that, overall, charter students do no better than their comparable regular public school counterparts. The existing evidence, though very tentative, suggests that the few schools achieving large gains tend to be well-funded, offer massive amounts of additional time, provide extensive tutoring services and maintain strict, often high-stakes discipline policies.

There will always be a few high-flying chains dispersed throughout the nation that get results, and we should learn from them. But there’s also the question of whether a bunch of charter schools with different operators using diverse approaches can expand within a single location and produce consistent results.

Charter supporters typically argue that state and local policies can be leveraged to “close the bad charters and replicate the good ones." Opponents, on the other hand, contend that successful charters can’t expand beyond a certain point because they rely on the selection of the best students into these schools (so-called “cream skimming”), as well as the exclusion of high-needs students.

Given the current push to increase the number of charter schools, these are critical issues, and there is, once again, some very tentative evidence that might provide insights.

Explaining The Consistently Inconsistent Results of Charter Schools

This is the second in a series of three posts about charter schools. Here is the first part, and here is the third.

As discussed in a previous post, there is a fairly well-developed body of evidence showing that charter and regular public schools vary widely in their impacts on achievement growth. This research finds that, on the whole, there is usually not much of a difference between them, and when there are differences, they tend to be very modest. In other words, there is nothing about "charterness" that leads to strong results.

It is, however, the exceptions that are often most instructive to policy. By taking a look at the handful of schools that are successful, we might finally start moving past the “horse race” incarnation of the charter debate, and start figuring out which specific policies and conditions are associated with success, at least in terms of test score improvement (which is the focus of this post).

Unfortunately, this question is also extremely difficult to answer – policies and conditions are not randomly assigned to schools, and it’s very tough to disentangle all the factors (many unmeasurable) that might affect achievement. But the available evidence at this point is sufficient to start drawing a few highly tentative conclusions about “what works."

The Evidence On Charter Schools

** Also posted here on "Valerie Strauss' Answer Sheet" in the Washington Post and here on the Huffington Post

This is the first in a series of three posts about charter schools. Here are the second and third parts.

In our fruitless, deadlocked debate over whether charter schools “work," charter opponents frequently cite the so-called CREDO study (discussed here), a 2009 analysis of charter school performance in 16 states. The results indicated that overall charter effects on student achievement were negative and statistically significant in both math and reading, but both effect sizes were tiny. Given the scope of the study, it’s perhaps more appropriate to say that it found wide variation in charter performance within and between states – some charters did better, others did worse and most were no different. On the whole, the size of the aggregate effects, both positive and negative, tended to be rather small.

Recently, charter opponents’ tendency to cite this paper has been called “cherrypicking." Steve Brill sometimes levels this accusation, as do others. It is supposed to imply that CREDO is an exception – that most of the evidence out there finds positive effects of charter schools relative to comparable regular public schools.

CREDO, while generally well-done given its unprecedented scope, is a bit overused in our public debate – one analysis, no matter how large or good, cannot prove or disprove anything. But anyone who makes the “cherrypicking” claim is clearly unfamiliar with the research. CREDO is only one among a number of well-done, multi- and single-state studies that have reached similar conclusions about overall test-based impacts.

This is important because the endless back-and-forth about whether charter schools “work” – whether there is something about "charterness" that usually leads to fantastic results – has become a massive distraction in our education debates. The evidence makes it abundantly clear that that is not the case, and the goal at this point should be to look at the schools of both types that do well, figure out why, and use that information to improve all schools.

When The Legend Becomes Fact, Print The Fact Sheet

The New Teacher Project (TNTP) just released a "fact sheet" on value-added (VA) analysis. I’m all for efforts to clarify complex topics such as VA, and, without question, there is a great deal of misinformation floating around on this subject, both "pro-" and "anti-."

The fact sheet presents five sets of “myths and facts." Three of the “myths” seem somewhat unnecessary: that there’s no research behind VA; that teachers will be evaluated based solely on test scores; and that VA is useless because it’s not perfect. Almost nobody believes or makes these arguments (at least in my experience). But I guess it never hurts to clarify.

In contrast, the other two are very common arguments, but they are not myths. They are serious issues with concrete policy implications. If there are any myths, they're in the "facts" column.

The Impact Of The Principal In The Classroom

Direct observation is a way of gathering data by watching behavior or events as they occur; for example, a teacher teaching a lesson. This methodology is important to teacher induction and professional development, as well as teacher evaluation. Yet, direct observation has a major shortcoming: it is a rather obtrusive data-gathering technique. In other words, we know the observer can influence the situation and the behavior of those being observed. We also know people do not behave the same way when they know they are being watched. In psychology, these forms of reactivity are known as the Hawthorne effect and the observer- or experimenter-expectancy effect (also here).

Social scientists and medical researchers are well aware of these issues and the fact that research findings don’t mean a whole lot when the researcher and/or the study participants know the purpose of the research and/or are aware that they are being observed or tested. To circumvent these obstacles, techniques like “mild deception” and “covert observation” are frequently used in social science research.

For example, experimenters often take advantage of “cover stories” which give subjects a sensible rationale for the research while preventing them from knowing (or guessing) the true goals of the study, which would threaten the experiment’s internal validity – see here. Also, researchers use double-blind designs, which, in the medical field, mean that neither the research participant nor the researcher knows whether the treatment or the placebo is being administered.
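To make the logic of blinding concrete, here is a minimal sketch in Python. It is purely illustrative – the function name and setup are invented, not drawn from any study discussed here – but it shows the essential mechanic: a third party holds the key linking opaque codes to conditions, so neither participants nor researchers know who is receiving the treatment until all outcomes are recorded.

```python
import random

def assign_double_blind(participant_ids, seed=42):
    """Randomly assign participants to treatment or placebo behind opaque codes.

    Returns (blinded, key): `blinded` maps each participant to a code label
    (all that participants and researchers ever see); `key` maps codes to
    actual conditions and is held sealed by a third party until analysis.
    """
    rng = random.Random(seed)
    blinded, key = {}, {}
    for i, pid in enumerate(participant_ids):
        code = f"ARM-{i:03d}"                       # opaque label, e.g. on a vial
        key[code] = rng.choice(["treatment", "placebo"])
        blinded[pid] = code
    return blinded, key

blinded, key = assign_double_blind(["p01", "p02", "p03", "p04"])
print(blinded)   # {'p01': 'ARM-000', 'p02': 'ARM-001', ...}
# `key` is unsealed only after every outcome has been recorded.
```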

In Research, What Does A "Significant Effect" Mean?

If you follow education research – or quantitative work in any field – you’ll often hear the term “significant effect." For example, you will frequently read research papers saying that a given intervention, such as charter school attendance or participation in a tutoring program, had “significant effects," positive or negative, on achievement outcomes.

This term by itself is usually sufficient to get people who support the policy in question extremely excited, and to compel them to announce boldly that their policy “works." They’re often overinterpreting the results, but it’s an understandable mistake: “significant effect” is a statistical term, and it doesn’t always mean what it appears to mean. As most people understand the words, “significant effects” are often neither significant nor necessarily effects.

Let’s very quickly clear this up, one word at a time, working backwards.
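First, though, a quick simulation may help show how “significant” and “meaningful” can come apart. The numbers below are invented purely for illustration, but the mechanic is general: with a large enough sample, even a trivially small difference between two groups sails under the conventional p < 0.05 threshold. (This sketch assumes Python with NumPy and SciPy installed.)

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)

# Hypothetical test scores: the "treatment" group averages a mere half
# point higher against a standard deviation of 20 points -- an effect of
# about 0.025 standard deviations, educationally negligible.
control = rng.normal(loc=500.0, scale=20.0, size=200_000)
treated = rng.normal(loc=500.5, scale=20.0, size=200_000)

t_stat, p_value = stats.ttest_ind(treated, control)
effect_size = (treated.mean() - control.mean()) / control.std()

print(f"p-value:     {p_value:.2e}")         # far below 0.05: "significant"
print(f"effect size: {effect_size:.3f} SD")  # ~0.025 SD: tiny in practice
```

The flip side also holds: a genuinely large effect in a small sample can fail to reach statistical significance, which is why sample size belongs in any careful reading of “significant effects."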

The Teachers' Union Hypothesis

For the past couple of months, Steve Brill's new book has served to amplify the eternally-beneath-the-surface hypothesis that teachers’ unions are the primary obstacle to improving educational outcomes in the U.S. The general idea is that unions block “needed reforms," such as merit pay and other forms of test-based accountability for teachers, and that they “protect bad teachers” from being fired.

Teachers’ unions are a convenient target. For one thing, a significant proportion of Americans aren’t crazy about unions of any type. Moreover, portraying unions as the villain in the education reform drama facilitates the (mostly false) policy-based distinction between teachers and the organizations that represent them – put simply, “love teachers, hate their unions." Under the auspices of this dichotomy, people can advocate for changes, such as teacher-level personnel policies based partially on testing results, without having to address why most teachers oppose them (a badly needed conversation).

No, teachers’ unions aren’t perfect, because the teachers to whom they give voice aren’t perfect. There are literally thousands of unions, and, just like districts, legislatures and all other institutions, they make mistakes. But I believe strongly in separating opinion and anecdote from actual evidence, and the simple fact is that the pervasive argument that unions are a substantial cause of low student performance has a weak empirical basis, while evidence that they are the primary cause is simply nonexistent.

What Are "Middle Class Schools"?

An organization called “The Third Way” released a report last week, in which they present descriptive data on what they call “middle class schools." The primary conclusion of their analysis is that “middle class schools” aren’t “making the grade," and that they are “falling short on their most basic 21st century mission: To prepare kids to get a college degree." They also argue that “middle class schools” are largely ignored in our debate and policymaking, and we need a “second phase of school reform” in order to address this deficit.

The Wall Street Journal swallowed the report whole, running a story presenting Third Way’s findings under the headline “Middle class schools fail to make the grade."

To be clear, I think that our education policy debates do focus on lower-income schools to a degree that sometimes ignores those closer to the middle of the distribution. So, it’s definitely worthwhile to take a look at “middle class schools’” performance and how it can be improved. In other words, I’m very receptive to the underlying purpose of the report.

That said, this analysis consists mostly of arbitrary measurement and flawed, vague interpretations. As a result, it actually offers little meaningful insight.

The Real Charter School Experiment

The New York Times reports that there is a pilot program in Houston, called the "Apollo 20 Program," in which some of the district’s regular public schools are "mimicking" the practices of high-performing charter schools. According to the Times article, the pilot schools seek to replicate five of the practices commonly used by high-flying charters: extended school time; extensive tutoring; more selective hiring of principals and teachers; “data-driven” instruction, including frequent diagnostic quizzing; and a “no excuses” culture of high expectations.

In theory, this pilot program is a good idea, since a primary mission of charter schools should be to serve as a testing ground for new policies and practices that could help to improve all schools. More than a decade of evidence has made it very clear that there’s nothing about "charterness" that makes a school successful – and indeed, only a handful get excellent results. So instead of arguing along the tired old pro-/anti-charter lines, we should, like Houston, be asking why these schools excel and working to see if we can use this information productively.

I’ll be watching to see how the pilot schools end up doing. I’m also hoping that the analysis (the program is being overseen by Harvard’s EdLabs) includes some effort to separate out the effects of each of the five replicated practices. If so, I’m guessing that we will find that the difference between high- and low-performing urban schools depends more than anything else on two factors: time and money.

Attracting The "Best Candidates" To Teaching

** Also posted here on "Valerie Strauss' Answer Sheet" in the Washington Post

One of the few issues that all sides in the education debate agree upon is the desirability of attracting “better people” into the teaching profession. While this certainly includes the possibility of using policy to lure career-switchers, most of the focus is on attracting “top” candidates right out of college or graduate school.

The common metric used to identify these “top” candidates is their pre-service (especially college) characteristics and performance. Most commonly, people call for attracting teachers from the “top third” of graduating classes, an outcome frequently cited as being the case in high-performing nations such as Finland. Now, it bears noting that “attracting better people," like “improving teacher quality," is a policy goal, not a concrete policy proposal – it tells us what we want, not how to get it. And how to make teaching more enticing for “top” candidates is still very much an open question (as is the equally important question of how to improve the performance of existing teachers).

In order to answer that question, we need to have some idea of whom we’re pursuing – who are these “top” candidates, and what do they want? I sometimes worry that our conception of this group – in terms of the “top third” and similar constructions – doesn’t quite square with the evidence, and that this misconception might actually be misguiding rather than focusing our policy discussions.