Is ESSER Money Being Spent or Not?

Our guest author today is Jess Gartner, CEO and Founder of Allovue, an education finance technology company.

As part of a series of federal pandemic-relief stimulus packages, K-12 schools received three rounds of funds through the Elementary and Secondary School Emergency Relief Fund (ESSER I, II, and III), totaling nearly $200 billion. Almost immediately, headlines across the nation probed how (or if) schools were spending these dollars. Nearly three years after the initial round of funding ($13 billion) was granted by the CARES Act in March 2020, questions linger about the pace and necessity of spending. Why is it so hard to get a straight answer?

For two years, the prevailing theme in the headlines was that school districts were sitting on stacks of cash, whereas more recent (and far less breathless) stories say the money is now on track to be spent. Why all the confusion? The multi-year process of receiving, planning, spending, and reporting ESSER dollars is more complicated and drawn out than a single soundbite can convey (I’ve tried!). Let’s take a quick look at a few key issues to bear in mind when thinking (or reading) about ESSER funds, and then a couple of conclusions as to what’s really going on.

Interpreting Effect Sizes In Education Research

Interpreting “effect sizes” is one of the trickier checkpoints on the road between research and policy. Effect sizes, put simply, are statistics measuring the size of the association between two variables of interest, often controlling for other variables that may influence that relationship. For example, a research study may report that participating in a tutoring program was associated with a 0.10 standard deviation increase in math test scores, even after controlling for other factors, such as student poverty, grade level, etc.
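To make this concrete, here is a minimal sketch of how such an effect size might be estimated. The data, variable names, and numbers are all invented for illustration (this is not from any actual study); the key step is standardizing the outcome so the coefficient of interest is expressed in standard deviation units:

```python
import numpy as np
import statsmodels.api as sm

# Simulated data for a hypothetical tutoring study (purely illustrative).
rng = np.random.default_rng(0)
n = 2000
tutoring = rng.integers(0, 2, n)   # 1 = participated in the tutoring program
poverty = rng.integers(0, 2, n)    # 1 = low-income (a control variable)
grade = rng.integers(3, 9, n)      # grade level (a control variable)
score = 50 + 2.0*tutoring - 3.0*poverty + 1.5*grade + rng.normal(0, 10, n)

# Standardize the outcome so the tutoring coefficient is in SD units.
z_score = (score - score.mean()) / score.std()

# Regress the standardized score on tutoring plus the controls.
X = sm.add_constant(np.column_stack([tutoring, poverty, grade]))
fit = sm.OLS(z_score, X).fit()

# The coefficient on tutoring is the estimated effect size in SDs.
print(f"Estimated effect size: {fit.params[1]:.3f} SD")
```

One common design choice, not shown here, is to standardize using the control group’s standard deviation rather than the full sample’s, so that the treatment itself does not affect the scale.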

But what does that mean, exactly? Is 0.10 standard deviations a large effect or a small effect? This is not a simple question, even for trained researchers, and answering it inevitably entails a great deal of subjective human judgment. Matthew Kraft has an excellent little working paper that pulls together some general guidelines and a proposed framework for interpreting effect sizes in education. 

Before discussing the paper, though, we need to mention what may be one of the biggest problems with the interpretation of effect sizes in education policy debates: They are often ignored completely.

Weaning Educational Research Off Of Steroids

Our guest authors today are Hunter Gehlbach and Carly D. Robinson. Gehlbach is an associate professor of education and associate dean at the University of California, Santa Barbara’s Gevirtz Graduate School of Education, as well as Director of Research at Panorama Education. Robinson is a doctoral candidate at Harvard’s Graduate School of Education.

Few people confuse academics with elite athletes. As a species, academics are rarely noted for their blinding speed, raw power, or outrageously low resting heart rates. Nobody wants to see a calendar of scantily clad professors. Unfortunately, recent years have surfaced one commonality between these two groups—a commonality no academic will embrace. And one with huge implications for educational policymakers’ and practitioners’ professional lives.

In the same way that a 37-year-old Barry Bonds did not really break the single-season home run record—he relied on performance-enhancing drugs—a substantial amount of educational research has undergone similar “performance enhancements” that make the results too good to be true.

To understand the crux of the issue, we invite readers to wade into the weeds (only a little!) to see what research “on steroids” looks like and why it matters. By doing so, we hope to reveal possibilities for how educational practitioners and policymakers can collaborate with researchers to correct the problem and avoid making practice and policy decisions based on flawed research.
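To give a small taste of the weeds before wading in, consider one widely discussed “performance enhancer”: testing many outcomes and reporting only the ones that come up significant. The toy simulation below is invented for illustration (it is not drawn from any particular study); it shows how a treatment with no true effect can still produce apparent “findings” by chance:

```python
import numpy as np
from scipy import stats

# Simulate a null world: treatment and control drawn from the same
# distribution, measured on 20 different outcomes. With a 0.05 threshold,
# roughly one outcome per study will look "significant" by chance alone.
rng = np.random.default_rng(42)
n_students, n_outcomes = 200, 20
treated = rng.normal(0, 1, (n_students, n_outcomes))
control = rng.normal(0, 1, (n_students, n_outcomes))  # no true effect anywhere

significant = 0
for k in range(n_outcomes):
    t_stat, p_value = stats.ttest_ind(treated[:, k], control[:, k])
    if p_value < 0.05:
        significant += 1
        print(f"Outcome {k}: p = {p_value:.3f}  <- looks like a 'finding'")

print(f"{significant} of {n_outcomes} outcomes 'significant' under a true null")
```

A researcher who reports only the significant rows—and quietly drops the rest—has, in effect, juiced the results.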

The Teacher Diversity Data Landscape

This week, the Albert Shanker Institute released a new research brief, which I coauthored with Klarissa Cervantes. It summarizes what we found when we contacted all 51 state education agencies (including the District of Columbia) and asked whether data on teacher race and ethnicity were being collected, and whether and how they were made available to the public. The survey was begun in late 2017 and completed in early 2018.

The primary reason behind this project is the growing body of research suggesting that all students, and especially students of color, benefit from a teaching force that reflects the diverse society in which they must learn to live, work, and prosper. ASI’s previous work has also documented that a great many districts should turn their attention to recruiting and retaining more teachers of color (see our 2015 report). Data are a basic requirement for achieving this goal – without data, states and districts are unable to gauge the extent of their diversity problem, target support and intervention to address that problem, and monitor the effects of those efforts. Unfortunately, the federal government does not require that states collect teacher race and ethnicity data, which means the responsibility falls to individual states. Moreover, statewide data are often insufficient – teacher diversity can vary widely within and between districts. Policymakers, administrators, and the public need detailed data (at least district-by-district, and preferably school-by-school), which should be collected annually and made easily available.

The results of our survey are generally encouraging. The vast majority of state education agencies (SEAs), 45 out of 51, report that they collect at least district-by-district data on teacher race and ethnicity (and all but two of these 45 collect school-by-school data). This is good news (and, frankly, better results than we anticipated). There are, however, areas of serious concern.

Our Request For Simple Data From The District Of Columbia

For our 2015 report, “The State of Teacher Diversity in American Education,” we requested data on teacher race and ethnicity between roughly 2000 and 2012 from nine of the largest school districts in the nation: Boston; Chicago; Cleveland; District of Columbia; Los Angeles; New Orleans; New York; Philadelphia; and San Francisco.

Only one of these districts failed to provide us with data that we could use to conduct our analysis: the District of Columbia.

To be clear, the data we requested are public record. Most of the eight other districts to which we submitted requests complied in a timely fashion. A couple of them took months to fill the request and required a little follow-up. But all of them gave us what we needed. We were actually able to get charter school data for virtually all of these eight cities (usually through the state).

Even New Orleans, which, during the years for which we requested data, was devastated by a hurricane and underwent a comprehensive restructuring of its entire school system, provided the data.

But not DC.

An Alternative Income Measure Using Administrative Education Data

The relationship between family background and educational outcomes is well documented and the topic, rightfully, of endless debate and discussion. A student’s background is most often measured in terms of family income (even though it is actually the factors associated with income, such as health, early childhood education, etc., that are the direct causal agents).

Most education analyses rely on a single income/poverty indicator – i.e., whether or not students are eligible for federally subsidized lunch (free/reduced-price lunch, or FRL). For instance, income-based achievement gaps are calculated by comparing test scores between students who are eligible for FRL and those who are not, while multivariate models almost always use FRL eligibility as a control variable. Similarly, schools and districts with relatively high FRL eligibility rates are characterized as “high poverty.” The primary advantages of FRL status are that it is simple and collected by virtually every school district in the nation (collecting actual income would not be feasible). Yet it is also a notoriously crude and noisy indicator. In addition to the fact that FRL eligibility is often called “poverty” even though the cutoff is, by design, 85 percent higher than the federal poverty line, FRL rates, like proficiency rates, mask a great deal of heterogeneity. The families of two FRL-eligible students can have quite different incomes, as can the families of two students who are not eligible. As a result, FRL-based estimates, such as achievement gaps, might differ quite a bit from those calculated using actual family income from surveys.
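To see how coarse the binary flag is, consider a toy illustration (all numbers below are hypothetical, chosen only to make the point): a single cutoff at 185 percent of a stylized poverty line collapses a wide income distribution into one yes/no indicator.

```python
import numpy as np

# Toy illustration with hypothetical numbers: FRL eligibility collapses a
# wide income distribution into a single yes/no flag. The cutoff is 185% of
# the poverty line (the "85 percent higher" mentioned above).
rng = np.random.default_rng(1)
poverty_line = 30_000                 # stylized poverty threshold for a family
frl_cutoff = 1.85 * poverty_line      # reduced-price lunch eligibility cutoff

incomes = rng.lognormal(mean=11.0, sigma=0.6, size=10_000)  # hypothetical incomes
frl_eligible = incomes < frl_cutoff

# Two "FRL-eligible" families can sit at very different points of the
# distribution, yet the indicator treats them identically.
eligible = incomes[frl_eligible]
print(f"Eligible family incomes range from ${eligible.min():,.0f} to ${eligible.max():,.0f}")
print(f"Share of families eligible: {frl_eligible.mean():.1%}")
```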

A new working paper by Michigan researchers Katherine Michelmore and Susan Dynarski presents a very clever means of obtaining a more accurate income/poverty proxy using the same administrative data that states and districts have been collecting for years.
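The paper’s full approach is worth reading directly, but the core intuition – that eligibility observed across many years is more informative than a single-year snapshot – can be sketched roughly as follows. The column names and data here are hypothetical, and this is a simplification, not the authors’ actual method:

```python
import pandas as pd

# Hypothetical longitudinal records: one row per student per school year,
# with a binary FRL-eligibility flag (column names are illustrative).
records = pd.DataFrame({
    "student_id": [1, 1, 1, 1, 2, 2, 2, 2],
    "year":       [2010, 2011, 2012, 2013] * 2,
    "frl":        [1, 1, 1, 1, 1, 0, 0, 0],  # student 1 always eligible; student 2 once
})

# A longitudinal proxy: the share of observed years a student was eligible.
# Persistently eligible students (share = 1.0) tend to be much lower-income
# than transiently eligible ones, though a one-year snapshot labels both "FRL."
proxy = records.groupby("student_id")["frl"].mean().rename("share_years_frl")
print(proxy)
```

In real administrative data the same calculation would run over millions of student-year records, but the basic idea is no more complicated than this.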

Perceived Job Security Among Full Time U.S. Workers

In a previous post, we discussed some recent data on contingent work, or alternative employment relationships – those that differ from standard full-time jobs, including temporary help, day labor, independent contracting, and part-time jobs. The prevalence of and trends in contingent work vary widely depending on which types of arrangements one includes in the definition, but most of them are characterized by less security (and inferior wages and benefits) relative to “traditional” full-time employment.

The rise of contingent work is often presented as a sign of deteriorating conditions for workers (see the post mentioned above for more discussion of this claim). Needless to say, however, employment insecurity characterizes many jobs with “traditional” arrangements – sometimes called precarious work – which of course implies that contingent work is an incomplete way to conceptualize the lack of stability that is its core feature.

One interesting way to examine job security is in terms of workers’ views of their own employment situations. In other words, how many workers perceive their jobs as insecure, and how has this changed over time? Perceived job security is, to be sure, a highly incomplete and imperfect indicator of “real” job security, but it also affects several meaningful non-employment outcomes related to well-being, including health (e.g., Burgard et al. 2009). Below, we take a very quick look at perceived job security using data from the General Social Survey (GSS) between 1977 and 2014.
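As a rough sketch of the kind of tabulation involved: the file name, column names, and response codes below are placeholders (not the official GSS codebook), but the calculation itself – a weighted share of respondents reporting insecurity, by survey year – is the standard one:

```python
import pandas as pd

# Sketch under assumptions: a GSS extract with one row per respondent and
# columns for survey year, a sampling weight, and a job-insecurity item.
# The file name, column names, and response codes are placeholders, not
# the official GSS codebook.
gss = pd.read_csv("gss_extract.csv")  # hypothetical file

# Suppose 'joblose' is coded so that 1-2 means the respondent thinks losing
# their job in the coming year is at least fairly likely.
gss["insecure"] = gss["joblose"].isin([1, 2]).astype(int)

# Weighted share of workers reporting insecurity, by survey year.
trend = gss.groupby("year").apply(
    lambda g: (g["insecure"] * g["weight"]).sum() / g["weight"].sum()
)
print(trend)
```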

On Focus Groups, Elections, and Predictions

Focus groups, a method in which small groups of subjects are questioned by researchers, are widely used in politics, marketing, and other areas. In education policy, focus groups, particularly those composed of teachers or administrators, are often used to design or shape policy. And, of course, during national election cycles, they are particularly widespread; there are even television networks that broadcast focus groups as a way to gauge the public’s reaction to debates or other events.

There are good reasons for using focus groups. Analyzing surveys can provide information about stated behaviors and the relative importance of issues at a given point in time, as well as correlations between these responses and certain demographic and social variables of interest. Focus groups, on the other hand, can help map out the issues important to voters (which can inform survey question design), as well as investigate what reactions certain presentations (verbal or symbolic) evoke (which can, for example, help frame messages in political or informational campaigns).

Both polling/surveys and focus groups provide insights that the other method alone could not. Neither of them, however, can answer questions about why certain patterns occur or how likely they are to occur in the future. That said, having heard some of the commentary about focus groups, and particularly having seen them being broadcast live and discussed on cable news stations, I feel strongly compelled to comment, as I do whenever data are used improperly or methodologies are misinterpreted.

Is The Motherhood Penalty Real? The Evidence From Poland

It has long been assumed that the residual gap in earnings between men and women (after controlling for productivity characteristics, occupation and industry segregation, and union membership status) is due to gender discrimination. A growing body of evidence, however, suggests that it may also reflect the effect of having children.

According to this research, employed mothers now account for most of the gender gap in wages (Glass 2004). In the U.S., controlling for work experience, hourly wages of mothers are approximately four percent lower for each child they have, compared to the wages of non-mothers (Budig and England, 2001). The magnitude of these family effects differs across countries, but, in general, men accrue modest earnings premiums for fatherhood, whereas women incur significant earnings penalties for motherhood (Waldfogel, 1998; Harkness and Waldfogel, 2003; Sigle-Rushton and Waldfogel, 2007; Budig and Hodges, 2010; Hodges and Budig, 2010; Smith Koslowski, 2011).

The size of the penalty also seems to vary by whether women and men are toward the top or bottom of the skill and wage hierarchies of employment, as well as across countries (England et al. 2014; Cooke 2014). The findings in this area are sometimes inconsistent, however, and suggest that there is a need to examine skills and wages in combination (England et al. 2014) and to choose measures of job interruptions carefully (Staff and Mortimer, 2012).

Lessons And Directions From The CREDO Urban Charter School Study

Last week, CREDO, a Stanford University research organization that focuses mostly on charter schools, released an analysis of the test-based effectiveness of charter schools in “urban areas” – that is, charters located in cities within 42 urban areas throughout 22 states. The math and reading testing data used in the analysis are from the 2006-07 to 2010-11 school years.

In short, the researchers find that, across all areas included, charters’ estimated impacts on test scores, vis-à-vis the regular public schools to which they are compared, are positive and statistically discernible. The magnitude of the overall estimated effect is somewhat modest in reading, and larger in math. In both cases, as always, results vary substantially by location, with very large and positive impacts in some places and negative impacts in others.

These “horse race” charter school studies are certainly worthwhile, and their findings have useful policy implications. In another sense, however, the public’s relentless focus on the “bottom line” of these analyses is tantamount to continually asking a question (“do charter schools boost test scores?”) to which we already know the answer (some do, some do not). This approach is somewhat inconsistent with the whole idea of charter schools, and with harvesting what is their largest potential contribution to U.S. public education. But there are also a few more specific issues and findings in this report that merit a bit of further discussion, and we’ll start with those.