Research And Evidence Can Help Guide Teachers During The Pandemic

This post is part of our series entitled Teaching and Learning During a Pandemic, in which we invite guest authors to reflect on the challenges of the Coronavirus pandemic for teaching and learning. Our guests today are Sara Kerr, Vice President of Education Policy Implementation at Results for America, and Nate Schwartz, Professor of Practice at Brown University's Annenberg Institute for School Reform. Other posts in the series are compiled here.

Teachers are used to playing many different roles, but this year they are facing the most complex challenges of their careers. They are being asked to be public health experts. Tech support specialists. Social workers to families reeling from the effects of layoffs and illness. Masters of distance learning and trauma-responsive educational practices. And they are being asked to take on these new responsibilities against a backdrop of rising COVID-19 cases in many parts of the country, looming budget cuts for many school districts, and a hyper-polarized political debate over the return to school.

To make any of this possible, educators need to be armed with the best available science, data, and evidence, not only about the operational challenges of reopening that have dominated the news cycle but also about how to meet the increasingly complex social-emotional and academic needs of students and their families. They don't have time to sift through decades of academic papers for answers. Fortunately, the nation's education researchers are eager and ready to help.

Why School Climate Matters For Teachers And Students

Our guest authors today are Matthew A. Kraft, associate professor of education and economics at Brown University, and Grace T. Falken, a research program associate at Brown’s Annenberg Institute. This article originally appeared in the May 2020 issue of The State Education Standard, the journal of the National Association of State Boards of Education.

Over the past decade, education reformers have focused much of their attention on raising teacher quality. This makes sense, given the well-evidenced, large impacts teachers have on student outcomes and the wide variation in teacher effectiveness, even within the same school (Goldhaber 2015; Jackson et al. 2014). Yet this focus on individual teachers has caused policymakers to lose sight of the importance of the organizational contexts in which teachers work and students learn.

The quality of a school’s teaching staff is greater than the sum of its parts. School environments can enable teachers to perform to their fullest potential or undercut their efforts to do so. 

When we think of work environments, we often envision physical features: school facilities, instructional resources, and the surrounding neighborhood. State and district policies that shape curriculum standards, class size, and compensation also come to mind. These things matter, but so do school climate factors that are less easily observed or measured. Teachers’ day-to-day experiences are influenced most directly by the culture and interpersonal environment of their schools.

Interpreting School Finance Measures

Last week we released the second edition of our annual report, "The Adequacy and Fairness of State School Finance Systems," which presents key findings from the School Finance Indicators Database (SFID). The SFID, released by the Shanker Institute and Rutgers Graduate School of Education (with my colleagues and co-authors Bruce Baker and Mark Weber), is a free collection of sophisticated finance measures that are designed to be accessible to the public. At the SFID website, you can read the summary of our findings, download the full report and datasets, or use our online data visualization tools.

The long and short of the report is that states vary pretty extensively, but most fund their schools either non-progressively (rich and poor districts receive roughly the same amount of revenue) or regressively (rich districts actually receive more revenue), and that, in the vast majority of states, funding levels are inadequate in all but the most affluent districts (in many cases due to a lack of effort).
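
As a rough illustration of what "progressive" and "regressive" mean in this context, the sketch below compares hypothetical per-pupil revenue in a state's high- and low-poverty districts. The figures and the 5 percent threshold are invented for illustration only; the actual SFID progressivity measures are estimated from models far more sophisticated than this.

```python
# Hypothetical per-pupil revenue figures (not from the SFID), used only
# to illustrate the progressive/regressive/non-progressive distinction.
def classify_funding(high_poverty_revenue: float, low_poverty_revenue: float) -> str:
    """Compare per-pupil revenue in high- vs. low-poverty districts."""
    ratio = high_poverty_revenue / low_poverty_revenue
    if ratio > 1.05:
        return f"progressive (ratio {ratio:.2f}: high-poverty districts receive more)"
    if ratio < 0.95:
        return f"regressive (ratio {ratio:.2f}: high-poverty districts receive less)"
    return f"non-progressive (ratio {ratio:.2f}: roughly equal funding)"

print(classify_funding(high_poverty_revenue=11_000, low_poverty_revenue=12_500))
# -> regressive (ratio 0.88: high-poverty districts receive less)
```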

One of the difficulties in producing this annual report is that our "core" measures (effort, adequacy, and progressivity) are state-level, and it's not easy to get attention for a research report when you basically have 51 different sets of results. One option is assigning states grades, like a school report card. Often, this is perfectly defensible and useful. We decided against it, not only because assigning grades would entail many arbitrary decisions (e.g., where to set the thresholds), but also because grades or ratings would risk obscuring some of the most useful conclusions from our data. Let's take a quick look at an example of how this works.

Interpreting Effect Sizes In Education Research

Interpreting “effect sizes” is one of the trickier checkpoints on the road between research and policy. Effect sizes, put simply, are statistics measuring the size of the association between two variables of interest, often controlling for other variables that may influence that relationship. For example, a research study may report that participating in a tutoring program was associated with a 0.10 standard deviation increase in math test scores, even controlling for other factors, such as student poverty, grade level, etc.
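
To make the arithmetic behind a number like "0.10 standard deviations" concrete, here is a minimal sketch in Python using purely hypothetical scores, not data from any actual study. It computes the simplest version of a standardized effect size (the difference in group means divided by the pooled standard deviation); studies like the tutoring example above would typically estimate the effect in a regression that also adjusts for covariates.

```python
import numpy as np

# Hypothetical math scores for tutored and non-tutored students
# (illustrative numbers only, not drawn from any real study).
treatment = np.array([520, 540, 535, 555, 560, 530, 545, 550])
control = np.array([510, 525, 530, 540, 515, 520, 535, 528])

# Pooled standard deviation across both groups
n_t, n_c = len(treatment), len(control)
pooled_sd = np.sqrt(
    ((n_t - 1) * treatment.var(ddof=1) + (n_c - 1) * control.var(ddof=1))
    / (n_t + n_c - 2)
)

# Standardized effect size: mean difference expressed in SD units
effect_size = (treatment.mean() - control.mean()) / pooled_sd
print(f"Effect size: {effect_size:.2f} standard deviations")
```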

But what does that mean, exactly? Is 0.10 standard deviations a large effect or a small effect? This is not a simple question, even for trained researchers, and answering it inevitably entails a great deal of subjective human judgment. Matthew Kraft has an excellent little working paper that pulls together some general guidelines and a proposed framework for interpreting effect sizes in education. 

Before discussing the paper, though, we need to mention what may be one of the biggest problems with the interpretation of effect sizes in education policy debates: They are often ignored completely.

The Offline Implications Of The Research About Online Charter Schools

It’s rare to find an educational intervention with as unambiguous a research track record as online charter schools. Now, to be clear, it’s not a large body of research by any stretch, its conclusions may change in time, and the online charter sub-sector remains relatively small and concentrated in a few states. For now, though, the results seem incredibly bad (Zimmer et al. 2009; Woodworth et al. 2015). In virtually every state where these schools have been studied, across virtually all student subgroups, and in both reading and math, the estimated impact of online charter schools on student testing performance is negative and large in magnitude.

Predictably, and not without justification, those who oppose charter schools in general are particularly vehement when it comes to online charter schools – they should, according to many of these folks, be closed down, even outlawed. Charter school supporters, on the other hand, tend to acknowledge the negative results (to their credit) but make less drastic suggestions, such as greater oversight, including selective closure, and stricter authorizing practices.

Regardless of your opinion on what to do about online charter schools’ poor (test-based) results, they are truly an interesting phenomenon for a few reasons.

We Need To Reassess School Discipline

It has been widely documented that, in American schools, students of color are disproportionately punished for nonviolent behaviors, and are targeted for exclusionary discipline within schools more often than their white peers. Exclusionary discipline is defined as students being removed from their learning environment, whether by in-school suspension, out-of-school suspension, or expulsion. 

In a national study, Sullivan et al. (2013) found that “Black students were more than twice as likely as White students to be suspended, whereas Hispanic and Native American students were 10 and 20 percent more likely to be suspended.” Out of all the racial minority groups, Asians had the lowest suspension rates across the board. Across all the racial groups, “males were twice as likely as female students to be suspended, and Black males had the highest rates of all subgroups.”

One reason that students of color are at a performance disadvantage relative to their White counterparts is, put simply, that they are removed from the classroom much more often. This is true nationally, but it seems to be a particularly pronounced issue in the Commonwealth of Virginia. The Center for Public Integrity released a 2015 study demonstrating that schools in Virginia “referred students to law enforcement agencies at a rate nearly three times the national rate” (Ferriss, 2015). According to the U.S. Department of Education, Virginia’s Black students, who make up 23 percent of all students, received 59 percent of short-term suspensions and 43 percent of expulsions (Lum, 2018).

Weaning Educational Research Off Of Steroids

Our guest authors today are Hunter Gehlbach and Carly D. Robinson. Gehlbach is an associate professor of education and associate dean at the University of California, Santa Barbara’s Gevirtz Graduate School of Education, as well as Director of Research at Panorama Education. Robinson is a doctoral candidate at Harvard’s Graduate School of Education.

Few people confuse academics with elite athletes. As a species, academics are rarely noted for their blinding speed, raw power, or outrageously low resting heart rates. Nobody wants to see a calendar of scantily clad professors. Unfortunately, recent years have surfaced one commonality between these two groups—a commonality no academic will embrace. And one with huge implications for educational policymakers’ and practitioners’ professional lives.

In the same way that a 37-year-old Barry Bonds did not really break the single-season home run record (he relied on performance-enhancing drugs), a substantial amount of educational research has undergone similar “performance enhancements” that make the results too good to be true.

To understand the crux of the issue, we invite readers to wade into the weeds (only a little!) to see what research “on steroids” looks like and why it matters. By doing so, we hope to reveal possibilities for how educational practitioners and policymakers can collaborate with researchers to correct the problem and avoid making practice and policy decisions based on flawed research.

The Teacher Diversity Data Landscape

This week, the Albert Shanker Institute released a new research brief, authored by Klarissa Cervantes and me. It summarizes what we found when we contacted all 51 state education agencies (including the District of Columbia) and asked whether data on teacher race and ethnicity were being collected, and whether and how they were made available to the public. This survey was begun in late 2017 and completed in early 2018.

The primary reason behind this project is the growing body of research suggesting that all students, and especially students of color, benefit from a teaching force that reflects the diverse society in which they must learn to live, work, and prosper. ASI’s previous work has also documented that a great many districts should turn their attention to recruiting and retaining more teachers of color (see our 2015 report). Data are a basic requirement for achieving this goal – without data, states and districts are unable to gauge the extent of their diversity problem, target support and intervention to address that problem, and monitor the effects of those efforts. Unfortunately, the federal government does not require that states collect teacher race and ethnicity data, which means the responsibility falls to individual states. Moreover, statewide data are often insufficient – teacher diversity can vary widely within and between districts. Policymakers, administrators, and the public need detailed data (at least district-by-district and preferably school-by-school), which should be collected annually and made easily available.

The results of our survey are generally encouraging. The vast majority of state education agencies (SEAs), 45 out of 51, report that they collect at least district-by-district data on teacher race and ethnicity (and all but two of these 45 collect school-by-school data). This is good news (and, frankly, better results than we anticipated). There are, however, areas of serious concern.

Why Teacher Evaluation Reform Is Not A Failure

The RAND Corporation recently released an important report on the impact of the Gates Foundation’s “Intensive Partnerships for Effective Teaching” (IPET) initiative. IPET was a very thorough and well-funded attempt to improve teaching quality in schools in three districts and four charter management organizations (CMOs). The initiative was multi-faceted, but its centerpiece was the implementation of multi-measure teacher evaluation systems and the linking of ratings from those systems to professional development and high stakes personnel decisions, including compensation, tenure, and dismissal. This policy, particularly the inclusion in teacher evaluations of test-based productivity measures (e.g., value-added scores), has been among the most controversial issues in education policy throughout the past 10 years.

The report is extremely rich and there are many interesting findings in it, so I would encourage everyone to read it themselves (at least the executive summary), but the headline finding was that IPET had no discernible effect on student outcomes, namely test scores and graduation rates, in the districts that participated, vis-à-vis similar districts that did not. Given that IPET was so thoroughly designed and implemented, and so well funded, it can potentially be viewed as a "best case scenario" test of the type of evaluation reform that most states have enacted. Accordingly, critics of these reforms, who typically focus their opposition on the high-stakes use of value-added and other test-based measures in these evaluations, have portrayed the findings as vindication of their opposition.

This reaction has merit, most importantly because evaluation reform was portrayed by advocates as a means to immediate and drastic improvements in student outcomes. This promise was misguided from the outset, and evaluation reform opponents are (and were) correct in pointing this out. At the same time, however, it would be wise not to dismiss evaluation reform as a whole, for several reasons, a few of which are discussed below.

We Can't Graph Our Way Out Of The Research On Education Spending

The graph below was recently posted by U.S. Education Department (USED) Secretary Betsy DeVos, as part of her response to the newly released scores on the 2017 National Assessment of Educational Progress (NAEP), administered every two years and often called the “nation’s report card.” It seems to show a massive increase in per-pupil education spending, along with a concurrent flat trend in scores on the fourth grade reading version of NAEP. The intended message is that spending more money won’t improve testing outcomes. Or, in the more common phrasing these days, "we can't spend our way out of this problem."

Some of us call it “The Graph.” Versions of it have been used before. And it’s the kind of graph that doesn’t need to be discredited, because it discredits itself. So, why am I bothering to write about it? The short answer is that I might be unspeakably naïve. But we’ll get back to that in a minute.

First, let’s very quickly run through the graph. In terms of how it presents the data, it is horrible practice. The double y-axes, with spending on the left and NAEP scores on the right, are a textbook example of what you might call motivated scaling (and that's being polite). The NAEP scores plotted range from a minimum of 213 in 2000 to a maximum of 222 in 2017, but the graph inexplicably extends all the way up to 275. In contrast, the spending scale extends from just below the minimum observation ($6,000) to just above the maximum ($12,000). In other words, the graph is deliberately scaled to produce the desired visual effect (increasing spending, flat scores). One could very easily rescale the graph to produce the opposite.
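
To see how much the visual impression depends on those scaling choices, here is a small, hypothetical sketch (using matplotlib and made-up score values in roughly the range described above, not the actual NAEP or spending series) that plots the same data twice: once with the y-axis stretched far beyond the data, and once with the axis fit to the observed range.

```python
import matplotlib.pyplot as plt

# Hypothetical NAEP-like scores, roughly 213 to 222 over 2000-2017.
# Illustrative values only -- not the actual NAEP data series.
years = [2000, 2003, 2005, 2007, 2009, 2011, 2013, 2015, 2017]
scores = [213, 216, 217, 220, 220, 220, 221, 221, 222]

fig, (ax1, ax2) = plt.subplots(1, 2, figsize=(10, 4))

# Left panel: "motivated" scaling -- the axis extends well past the data,
# which makes the trend look flat.
ax1.plot(years, scores, marker="o")
ax1.set_ylim(150, 275)
ax1.set_title("Axis stretched to 150-275: looks flat")

# Right panel: axis fit to the observed range -- the same data now
# look like steady improvement.
ax2.plot(years, scores, marker="o")
ax2.set_ylim(212, 223)
ax2.set_title("Axis fit to the data: looks like growth")

plt.tight_layout()
plt.show()
```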