Why Did Florida Schools' Grades Improve Dramatically Between 1999 and 2005?
** Reprinted here in the Washington Post
Former Florida Governor Jeb Bush was in Virginia last week, helping push for a new law that would install an “A-F” grading system for all public schools in the commonwealth, similar to a system that has existed in Florida for well over a decade.
In making his case, Governor Bush put forth an argument about the Florida system that he and his supporters use frequently. He said that, right after the grades went into place in his state, there was a drop in the proportion of D and F schools, along with a huge concurrent increase in the proportion of A schools. For example, as Governor Bush notes, in 1999, only 12 percent of schools got A's. In 2005, when he left office, the figure was 53 percent. The clear implication: It was the grading of schools (and the incentives attached to the grades) that caused the improvements.
There is some pretty good evidence (also here) that the accountability pressure of Florida’s grading system generated modest increases in testing performance among students in schools receiving F's (i.e., an outcome to which consequences were attached), and perhaps higher-rated schools as well. However, putting aside the serious confusion about what Florida’s grades actually measure, as well as the incorrect premise that we can evaluate a grading policy's effect by looking at the simple distribution of those grades over time, there’s a much deeper problem here: The grades changed in part because the criteria changed.
The graph below presents the distribution of A-F grades for elementary and middle schools between 1999 and 2008 (sample size varies by year, but the trends are extremely similar if limited to schools that received grades in all years).
You can see that Governor Bush’s numbers are essentially correct. The small percentage of F schools, with a couple of exceptions, was relatively constant between 1999 and 2005, but the proportion receiving D's dipped. Moreover, in 1999, about 10 percent of schools received A’s, whereas roughly 55-60 percent got that grade from 2005 on.
However, you might also notice that the vast majority of these shifts occurred either between 1999 and 2000, or between 2001 and 2003. Other than that, the lines are somewhat flat.
This pattern is mostly a direct result of changes to the system in those years. Let’s quickly review a couple of these alterations, with a focus on the massive increase in A grades.
In 1999, in order to receive an A, schools had to meet several criteria (see the 1999 guide). One of them was minimum percentages at or above level 2 (or level 3 in writing) for six different subgroups – economically disadvantaged, black, white, Hispanic, Asian and American Indian.
I would try to piece together what grades schools would have received in 2000 had the system not changed after 1999 (or vice-versa), but the data I would need are only available in a format that would require a prohibitively laborious process to compile.*
Instead, consider the following: A-rated schools needed at least 60 percent at or above level 2 in reading or math for all six of these subgroups (or, conversely, a maximum of 40 percent at level 1). Given the fact that the statewide level 1 rate for a few of the groups was close to or higher than 40 percent in 2000, it seems safe to conclude that this rule, which is based entirely on absolute performance levels (how highly students score), eliminated a great many schools from having any realistic shot at an A.
In 2000, however, the state replaced this criterion. Instead of absolute targets for six subgroups, schools needed to show a decrease in the proportions of students scoring at level 1, regardless of race or income (the rule varied a bit for schools that had too few students at level 1 and/or level 2). Since any school could theoretically exhibit such decreases, even if their rates were very low to start the year, this new rule substantially expanded the pool of schools that could plausibly receive an A. That probably goes a long way toward explaining why the proportion doing so almost tripled in a single year.
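To make the contrast between the two rules concrete, here is a minimal sketch, not the state's actual grading formula. It reduces each year's A requirements to the single criterion described above: the 40 percent level 1 ceiling for every subgroup in 1999, and a year-over-year drop in the overall level 1 share in 2000. The function names, the school names and all of the rates are hypothetical, invented purely for illustration.

```python
# Illustrative sketch only: the schools and rates below are invented,
# and each rule is reduced to the single criterion discussed in the text.

def eligible_1999(subgroup_level1_rates, max_level1=40.0):
    """1999-style rule: no subgroup may exceed 40 percent at level 1."""
    return all(rate <= max_level1 for rate in subgroup_level1_rates.values())


def eligible_2000(overall_level1_prev, overall_level1_curr):
    """2000-style rule: the overall level 1 share must fall from the prior year."""
    return overall_level1_curr < overall_level1_prev


def compare_rules(schools):
    """Return {school: (passes 1999-style rule, passes 2000-style rule)}."""
    return {
        name: (
            eligible_1999(data["subgroup_level1"]),
            eligible_2000(data["overall_prev"], data["overall_curr"]),
        )
        for name, data in schools.items()
    }


# Hypothetical data: percent of students scoring at level 1.
HYPOTHETICAL_SCHOOLS = {
    "School A": {
        "subgroup_level1": {"econ_disadvantaged": 30.0, "black": 35.0,
                            "white": 10.0, "hispanic": 28.0},
        "overall_prev": 20.0,
        "overall_curr": 21.0,
    },
    "School B": {
        "subgroup_level1": {"econ_disadvantaged": 48.0, "black": 52.0,
                            "white": 20.0, "hispanic": 44.0},
        "overall_prev": 42.0,
        "overall_curr": 38.0,
    },
    "School C": {
        "subgroup_level1": {"econ_disadvantaged": 38.0, "black": 39.0,
                            "white": 12.0, "hispanic": 30.0},
        "overall_prev": 25.0,
        "overall_curr": 22.0,
    },
}

if __name__ == "__main__":
    for name, (old_rule, new_rule) in compare_rules(HYPOTHETICAL_SCHOOLS).items():
        print(f"{name}: 1999-style rule={old_rule}, 2000-style rule={new_rule}")
```

In this made-up example, School B has three subgroups above the 40 percent ceiling, so it has no shot at an A under the 1999-style rule, yet it clears the 2000-style rule simply by lowering its overall level 1 share. That is the mechanism by which a change in criteria alone can enlarge the pool of potential A schools.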
Moving on, there were no big changes in the system between 2000 and 2001 (and, as you can see in the graph above, there were only relatively minor fluctuations in the distribution of grades).
In 2002, however, the entire system changed dramatically. Florida moved to a points-based system. Each school was assigned point totals, and these point totals were sorted into grades. The new scheme placed much greater emphasis on "growth" than its predecessors, and was in most respects a better system. But it also led to a larger proportion of A-rated schools.
In this case, we need not speculate. This paper used detailed student-level data to simulate the grades that elementary schools would have received in 2002 under the old 2001 system. About half of them would have received the same grade. Among those that would have gotten different ratings, the changes, put simply, made it "easier" to get both high and low grades.
On the one hand, the new system increased the number of F schools (e.g., the vast majority of the few dozen F schools under the new system would have received C's and D's under the old system). On the other hand, it also massively increased the number of A and B schools. For example, among the schools receiving an A under the new 2002 system, well over half would have gotten a lower grade under the 2001 rules.
In other words, the big jump in A-rated schools between 2001 and 2002 was artificial. The rules changed, so the grades changed.
(For the record, between 2002 and 2003, the grades improved quite a bit. These increases cannot be chalked up to rule changes, as the system did not change much. See here and here for analyses that focus on alternative explanations.)
In summary, then, it is incredibly misleading to compare the distributions of grades between 1999 and 2005 (to say nothing of attributing the increases, even if they're "real," to the system itself). Using a consistent set of criteria, there would almost certainly have been significant improvement in the grades over this time, but ignoring the huge rule changes in 2000 and 2002 severely overstates this positive change.
Again, Governor Bush and supporters of his reforms have some solid evidence to draw upon when advocating for the Florida reforms, particularly the grade-based accountability system. The modest estimated effects in these high-quality analyses are not as good a talking point as the “we quadrupled the number of A-rated schools in six years” argument, but they are far preferable to claiming credit for what’s on the scoreboard after having changed the rules of the game.
- Matt Di Carlo
* The FLDOE data tool requires one to download the data for each school individually. If any of these data are made public (or they are available in a more convenient format on the site, and I missed them), please leave a comment.
Matt, thanks for your exposé. The same has happened in Louisiana: change the rules, and "improve" the scores. My colleague Herb Bassett and I just did an interview for Louisiana Public Broadcasting on the school score inflation in 2012 in Louisiana:
Motorola employees used to say, regarding the Six Sigma system before the company fell apart, "be careful what you measure" and "the metrics become the goal." These warnings apply to educational accountability systems too. Accountability systems, under the guise of presenting a true representation of data, can be extremely misleading when used inappropriately. Unfortunately, the public generally doesn't understand the misuse of these systems.
Matt, I didn’t even realize that Mercedes Schneider had already commented. She and Herb did a great job of pointing out the scam in Louisiana. They can also tell you that LA is following FL, and I suspect other states, in destroying the ability of researchers to utilize databases in order to check the claims of the reformers. All historical databases were dropped from the new website in LA… now it is all propaganda. Thanks for the details on FL!
You can go back pretty far with the FL data at this website; email me if you want more help with this.
Yes, I use those datasets often (probably too often!).
What I was looking for was school-level data that would enable me to backcode data from a given year into a previous grading system. For example, to roughly simulate the 1999 system, I would need level 1 rates by subgroup. The data are available, but only if you download them for each school individually.
Thanks for the comment,