How Cross-Sectional Are Cross-Sectional Testing Data?
In several posts, I’ve complained about how, in our public discourse, we misinterpret changes in proficiency rates (or actual test scores) as “gains” or “progress,” when they actually represent cohort changes—that is, they are performance snapshots for different groups of students who are potentially quite dissimilar.
For example, the most common way testing results are presented in news coverage and press releases is to present year-to-year testing results across entire schools or districts – e.g., the overall proficiency rate across all grades in one year compared with the next. One reason why the two groups of students being compared (the first versus the second year) are different is obvious. In most districts, tests are only administered to students in grades 3-8. As a result, the eighth graders who take the test in Year 1 will not take it in Year 2, as they will have moved on to the ninth grade (unless they are retained). At the same time, a new cohort of third graders will take the test in Year 2 despite not having been tested in Year 1 (because they were in second grade). That’s a large amount of inherent “turnover” between years (this same situation applies when results are averaged for elementary and secondary grades). Variations in cohort performance can generate the illusion of "real" change in performance, positive or negative.
But there’s another big cause of incomparability between years: Student mobility. Students move in and out of districts every year. In urban areas, mobility is particularly high. And, in many places, this mobility includes students who move to charter schools, which are often run as separate school districts.
I think we all know intuitively about these issues, but I’m not sure many people realize just how different the group of tested students across an entire district can be in one year compared with the next. In order to give an idea of this magnitude, we might do a rough calculation for the District of Columbia Public Schools (DCPS).
A new report from the Urban Institute takes a quick look at student mobility in DCPS between the 2007-08 and 2008-09 school years. Using student records, the authors find that, not counting graduates or students who enter the system at the lowest grade, roughly one-fifth of DCPS students in 2007-08 were not in the district in 2008-09. Now, in fairness, some of this might be due to coding error in the district’s data system. But, presumably, most of it is “real” mobility – students dropping out, moving to charter/private schools or leaving the area entirely (perhaps due to foreclosures, which is the primary focus of the report).
Using this information, we might ballpark the “stability” between samples. For the sake of this illustration, let’s assume that mobility is equally distributed across grades, and that a quarter (five percentage points) of the mobility from the Urban Institute report represents coding error. This means that, right off the bat, 15 percent of the students tested in 2007-08 were not among the sample taking the exams in 2008-09.
Now let’s account for the differences between years that stem from simple grade progression. In DC, the only students who will take the test in both years (not counting those who must repeat eighth or tenth grade) are the students in grades 3-7, who, according to grade-level data from the National Center for Education Statistics, comprise roughly three-quarters of DCPS enrollment in tested grades. Around one-quarter will be in the 2007-08 testing sample but not in the sample the next year.
If we roughly “combine” the estimates based on grade progress and student mobility, about 35-40 percent of DCPS students who took the test in 2007-08 did not take a district assessment in 2008-09. So, when we compare testing results between these two years, as many as two out of five kids are different.
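The rough combination described above can be written out explicitly. This is just a sketch of the back-of-envelope arithmetic, using the post’s assumed figures (the 20 percent mobility rate, the assumed 5-point coding error, and the one-quarter grade-progression turnover) and treating mobility and grade progression as independent—none of these are official DCPS statistics:

```python
# Back-of-envelope estimate of year-to-year testing sample turnover in DCPS,
# using the assumed figures from the post (not official data).

mobility_rate = 0.20   # share of 2007-08 students gone by 2008-09 (Urban Institute)
coding_error = 0.05    # portion assumed to reflect data-system errors
real_mobility = mobility_rate - coding_error  # 0.15

grade_exit = 0.25      # tested students aging out of the grade 3-7 window

# A student appears in both years' samples only if they neither age out nor
# leave the district; assuming the two sources of turnover are independent:
stay = (1 - grade_exit) * (1 - real_mobility)
different = 1 - stay

print(f"Roughly {different:.0%} of tested students differ between years")
# prints "Roughly 36% of tested students differ between years"
```

The independence assumption is doing real work here: 25 percent age out, and mobility then removes 15 percent of the remaining 75 percent, landing the estimate in the 35–40 percent range cited above.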
To be fair, this unusually high figure is inflated in the case of DCPS for a couple of reasons, including the high rate of foreclosures during this time (which will hopefully level off at some point), the unusually rapid proliferation of charter schools in the district and the addition of a tenth grade test, which most states do not have.
Even when the year-to-year incomparability is as high as it was in DCPS, it doesn’t mean we simply cannot make any comparisons. Over larger samples, there is a degree to which differences between cohorts are smoothed out. For instance, when looking at a whole district, third graders in one year might exhibit roughly the same performance levels as third graders the next year. But this is not always the case, and there are other sources of year-to-year differences, such as charter school proliferation and foreclosures, that definitely vary over time and are not random. They may generate significant differences among the groups of students taking the tests, even over very short time periods.
So, the basic takeaway here is that year-to-year comparisons must be made with extreme caution (and they too often are not). Student mobility and simple grade progression mean that a large proportion – sometimes a very large proportion – of students in one year are not “in the data” the next year, and this can influence results substantially. Put differently, the data are sometimes even more cross-sectional than many people realize.
One final note: Massive amounts of public attention were paid to DCPS’ testing results between 2007-08 and 2008-09, since it was Michelle Rhee’s first year. It’s worth noting that about two in five of the students being tested were different from one year to the next, in part due to factors (e.g., foreclosures) that generate incomparability between years. Without a longitudinal analysis, it’s not possible to know how much this affected the results.
- Matt Di Carlo