Standardized Tests

  • The False Choice Of Growth Versus Proficiency

    Written on October 1, 2019

    Tennessee is considering changing its school accountability system so that schools can choose whether their test-based performance is judged by status (how highly students score) or by growth (how much progress students make over the course of the year). In other words, if schools do poorly on one measure, they are judged by the other (apparently, Texas already has a similar system in place).

    As we’ve discussed here many times in the past, status measures, such as proficiency rates, are poor measures of school performance, since some students, particularly those living in poverty, enter their schools far behind their more affluent peers. As a result, schools serving larger proportions of poor students will exhibit lower scores and proficiency rates, even if they are very effective in compelling progress from their students. That is why growth models, which focus on individual student gains over time, are a superior measure of school performance per se.
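
    The distinction above can be illustrated with a small sketch. All numbers, school names, and the proficiency threshold below are invented for illustration; the point is only that two schools can produce identical per-student growth while looking very different on status measures.

```python
# Hypothetical illustration: two schools with identical per-student growth
# but very different starting points. All numbers are invented.

def proficiency_rate(scores, threshold=300):
    """Share of students scoring at or above the proficiency threshold."""
    return sum(s >= threshold for s in scores) / len(scores)

def mean_growth(fall_scores, spring_scores):
    """Average individual student gain over the year."""
    return sum(b - a for a, b in zip(fall_scores, spring_scores)) / len(fall_scores)

# School A serves more advantaged students; School B's students start far behind.
school_a_fall = [310, 320, 330, 340]
school_b_fall = [250, 260, 270, 280]

gain = 20  # both schools produce the same growth for every student
school_a_spring = [s + gain for s in school_a_fall]
school_b_spring = [s + gain for s in school_b_fall]

print(proficiency_rate(school_a_spring))  # 1.0  -- looks "good" on status
print(proficiency_rate(school_b_spring))  # 0.25 -- looks "bad" on status
print(mean_growth(school_a_fall, school_a_spring))  # 20.0
print(mean_growth(school_b_fall, school_b_spring))  # 20.0 -- identical growth
```

    A status-based rating would rank School A far above School B, even though, in this toy example, both schools moved their students forward by exactly the same amount.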

    This so-called “growth versus proficiency” debate has resurfaced several times over the years, and it was particularly prevalent during the time when states were submitting proposals for their accountability systems during reauthorization of the Elementary and Secondary Education Act. The policy that came out of these discussions was generally promising, as many states moved at least somewhat toward weighting growth model estimates more heavily. 

    At the same time, however, it is important to mention that the “growth versus proficiency” debate sometimes implies that states must choose between these two types of indicators. This is misleading. And the Tennessee proposal is a very interesting context for discussing this, since they are essentially using these two types of measures interchangeably. The reality, of course, is that both types of measures transmit valuable but different information, and both have a potentially useful role to play in accountability systems.

  • Tests Worth Teaching To

    Written on September 24, 2019

    Our guest authors today are Chester E. Finn, Jr. and Andrew E. Scanlan. Finn is a distinguished senior fellow and president emeritus at the Thomas B. Fordham Institute and a senior fellow at Stanford University’s Hoover Institution. Scanlan is a research and policy associate at the Thomas B. Fordham Institute.

    This year, some 165,000 American educators are teaching Advanced Placement (AP) classes—a veritable army, mobilized to serve some three million students as they embark on coursework leading to the AP program’s rigorous three-hour exams each May. As we explore in our new book, Learning in the Fast Lane: The Past, Present and Future of Advanced Placement, preparing these young people to succeed on the tests (scored from 1 to 5, with 3 or better deemed “qualifying”) is a major instructional objective for teachers as well as for the students (and their families) who recognize the program’s potential to significantly enhance their post-secondary prospects.

    For AP teachers, one might suppose that this objective would be vexing—yet another end-of-year exam that will constrain their curricular choices, stunt their classroom autonomy, and turn their pupils into cram-and-memorize machines rather than eager, deeper learners, creative thinkers, and inquisitive intellectuals.

    One might also suppose that the AP program, as it has infiltrated 70 percent of U.S. public (and half of private) high schools, would be vulnerable to the anti-testing resentments and revolts of recent years. These have been largely driven by government-imposed school accountability regimes that are mostly based on the scores kids get on state-mandated assessments, especially in math and English. That’s led many schools to press teachers to devote more hours to “test prep,” minimize time spent on other subjects, and neglect topics that aren’t included in state standards (and therefore won’t be tested). It’s not unreasonable, then, to expect resistance to AP as well.

  • For Florida's School Grading System, A Smart Change With Unexpected Effects

    Written on July 13, 2017

    Last year, we discussed a small but potentially meaningful change that Florida made to its school grading system, one that might have attenuated a long-standing tendency of its student “gains” measures, by design, to favor schools that serve more advantaged students. Unfortunately, this result doesn’t seem to have been achieved.

    Prior to 2014-15, one of the criteria by which Florida students could be counted as having “made gains” was scoring as proficient or better in two consecutive years, without having dropped a level (e.g., from advanced to proficient). Put simply, this meant that students scoring above the proficiency threshold would be counted as making “gains,” even if they in fact made only average or even below average progress, so long as they stayed above the line. As a result of this somewhat crude “growth” measure, schools serving large proportions of students scoring above the proficiency line (i.e., schools in affluent neighborhoods) were virtually guaranteed to receive strong “gains” scores. Such “double counting” in the “gains” measures likely contributed to a very strong relationship between schools’ grades and their students’ socio-economic status (as gauged, albeit roughly, by subsidized lunch eligibility rates).

    Florida, to its credit, changed this “double counting” rule effective in 2014-15. Students who score as proficient in two consecutive years are no longer automatically counted as making “gains.” They must also exhibit some score growth in order to receive the designation.
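
    To make the rule change concrete, here is a rough sketch of the before-and-after classification logic. The level codes, proficiency cutoff, and scores are hypothetical simplifications, not Florida's actual business rules.

```python
# A rough sketch of the "gains" rule change described above. Level codes,
# the proficiency cutoff, and all scores are hypothetical simplifications.

PROFICIENT_LEVEL = 3  # assume achievement levels 1-5, with 3+ "proficient"

def made_gains_old(prev_level, curr_level, prev_score, curr_score):
    """Pre-2014-15 rule: proficient in two consecutive years without
    dropping a level counted as 'gains', regardless of score growth."""
    return (prev_level >= PROFICIENT_LEVEL
            and curr_level >= PROFICIENT_LEVEL
            and curr_level >= prev_level)

def made_gains_new(prev_level, curr_level, prev_score, curr_score):
    """Post-change rule: the same students must also show score growth."""
    return (made_gains_old(prev_level, curr_level, prev_score, curr_score)
            and curr_score > prev_score)

# A student flat at level 4 with an unchanged score counted as making
# "gains" under the old rule, but not under the new one.
print(made_gains_old(4, 4, 320, 320))  # True
print(made_gains_new(4, 4, 320, 320))  # False
```

    Under the old rule, a school full of students coasting above the proficiency line would rack up "gains" automatically; under the new rule, those students must actually improve their scores.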

  • Improving Accountability Measurement Under ESSA

    Written on May 25, 2017

    Despite the recent repeal of federal guidelines for states’ compliance with the Every Student Succeeds Act (ESSA), states are steadily submitting their proposals, and they are rightfully receiving some attention. The policies in these proposals will have far-reaching consequences for the future of school accountability (among many other types of policies), as well as, of course, for educators and students in U.S. public schools.

    There are plenty of positive signs in these proposals, which are indicative of progress in the role of proper measurement in school accountability policy. It is important to recognize this progress, but impossible not to see that ESSA perpetuates long-standing measurement problems that were institutionalized under No Child Left Behind (NCLB). These issues, particularly the ongoing failure to distinguish between student and school performance, continue to dominate accountability policy to this day. Part of the confusion stems from the fact that school and student performance are not independent of each other. For example, a test score, by itself, gauges student performance, but it also reflects, at least in part, school effectiveness (i.e., the score might have been higher or lower had the student attended a different school).

    Both student and school performance measures have an important role to play in accountability, but distinguishing between them is crucial. States’ ESSA proposals make the distinction in some respects but not in others. The result may end up being accountability systems that, while better than those under NCLB, are still severely hampered by improper inference and misaligned incentives. Let’s take a look at some of the key areas where we find these issues manifested.

  • Do Subgroup Accountability Measures Affect School Ratings Systems?

    Written on October 28, 2016

    The school accountability provisions of No Child Left Behind (NCLB) institutionalized a focus on the (test-based) performance of student subgroups, such as English language learners, racial and ethnic groups, and students eligible for free- and reduced-price lunch (FRL). The idea was to shine a spotlight on achievement gaps in the U.S., and to hold schools accountable for serving all students.

    This was a laudable goal, and disaggregating data by student subgroups is a wise policy, as there is much to learn from such comparisons. Unfortunately, however, NCLB also institutionalized the poor measurement of school performance, and so-called subgroup accountability was not immune. The problem, which we’ve discussed here many times, is that test-based accountability systems in the U.S. tend to interpret how highly students score as a measure of school performance, when it is largely a function of factors out of schools' control, such as student background. In other words, schools (or subgroups of their students) may exhibit higher average scores or proficiency rates simply because their students entered the schools at higher levels, regardless of how effective the school may be in raising scores. Although NCLB’s successor, the Every Student Succeeds Act (ESSA), perpetuates many of these misinterpretations, it still represents some limited progress, as it encourages greater reliance on growth-based measures, which look at how quickly students progress while they attend a school, rather than how highly they score in any given year (see here for more on this).

    Yet this evolution, slow though it may be, presents a somewhat unique challenge for the inclusion of subgroup-based measures in formal school accountability systems. That is, if we stipulate that growth model estimates are the best available test-based way to measure school (rather than student) performance, how should accountability systems apply these models to traditionally lower scoring student subgroups?

  • Thinking About Tests While Rethinking Test-Based Accountability

    Written on August 5, 2016

    Earlier this week, per the late summer ritual, New York State released its testing results for the 2015-2016 school year. New York City (NYC), always the most closely watched set of results in the state, showed a 7.6 percentage point increase in its ELA proficiency rate, along with a 1.2 percentage point increase in its math rate. These increases were roughly equivalent to the statewide changes.

    City officials were quick to pounce on the results, which were called “historic” and “pure hard evidence” that the city’s new education policies are working. This interpretation, while standard in the U.S. education debate, is, of course, inappropriate for many reasons, all of which we’ve discussed here countless times and will not detail again (see here). Suffice it to say that even under the best of circumstances these changes in proficiency rates are only very tentative evidence that students improved their performance over time, to say nothing of whether that improvement was due to a specific policy or set of policies.

    Still, the results represent good news. A larger proportion of NYC students are scoring proficient in math and ELA than did last year. Real improvement is slow and sustained, and this is improvement. In addition, the proficiency rate in NYC is now on par with the statewide rate, which is unprecedented. There are, however, a couple of additional issues with these results that are worth discussing quickly.

  • A Small But Meaningful Change In Florida's School Grades System

    Written on July 28, 2016

    Beginning in the late 1990s, Florida became one of the first states to assign performance ratings to public schools. The purpose of these ratings, which are in the form of A-F grades, is to communicate to the public “how schools are performing relative to state standards.” For elementary and middle schools, the grades are based entirely on standardized testing results.

    We have written extensively here about Florida’s school grading system (see here for just one example), and have used it to illustrate features that can be found in most other states’ school ratings. The primary issue is the heavy reliance that states place on how highly students score on tests, which tells you more about the students the schools serve than about how well they serve those students – i.e., it conflates school and student performance. Put simply, some schools exhibit lower absolute testing performance levels than do other schools, largely because their students enter performing at lower levels. As a result, schools in poorer neighborhoods tend to receive lower grades, even though many of these schools are very successful in helping their students make fast progress during their few short years of attendance.

    Although virtually every state’s school rating system has this same basic structure to varying degrees, Florida’s system warrants special attention, as it was one of the first in the nation and has been widely touted and copied (as well as researched -- see our policy brief for a review of this evidence). It is also noteworthy because it contains a couple of interesting features, one of which exacerbates the aforementioned conflation of student and school performance in a largely unnoticed manner. But, this feature, discussed below, has just been changed by the Florida Department of Education (FLDOE). This correction merits discussion, as it may be a sign of improvement in how policymakers think about these systems.

  • Charter Schools And Longer Term Student Outcomes

    Written on April 28, 2016

    An important article in the Journal of Policy Analysis and Management presents results from one of the first published analyses to look at the long term impact of attending charter schools.

    The authors, Kevin Booker, Tim Sass, Brian Gill, and Ron Zimmer, replicate part of their earlier analysis of charter schools in Florida and Chicago (Booker et al. 2011), which found that students attending charter high schools had a substantially higher chance of graduation and college enrollment (relative to students that attended charter middle schools but regular public high schools). For this more recent paper, they extend the previous analysis, including the addition of two very important, longer term outcomes – college persistence and labor market earnings.

    The limitations of test scores, the current coin of the realm, are well known; similarly, outcomes such as graduation may fail to capture meaningful skills. This paper is among the first to extend the charter school effects literature, which has long relied almost exclusively on test scores, into the longer term postsecondary and even adulthood realms, representing a huge step forward for this body of evidence. It is a development that is likely to become more and more common, as longitudinal data hopefully become available from other locations. And this particular paper, in addition to its obvious importance for the charter school literature, also carries some implications regarding the use of test-based outcomes in education policy evaluation.

  • Where Al Shanker Stood: The Importance And Meaning Of NAEP Results

    Written on October 30, 2015

    In this New York Times piece, published on July 29, 1990, Al Shanker discusses the results of the National Assessment of Educational Progress (NAEP), and what they suggested about the U.S. education system at the time.

    One of the things that has influenced me most strongly to call for radical school reform has been the results of the National Assessment of Educational Progress (NAEP) examinations. These exams have been testing the achievement of our 9, 13 and 17-year olds in a number of basic areas over the past 20 years, and the results have been almost uniformly dismal.

    According to NAEP results, no 17-year-olds who are still in school are illiterate and innumerate - that is, all of them can read the words you would find on a cereal box or a billboard, and they can do simple arithmetic. But very few achieve what a reasonable person would call competence in reading, writing or computing.

    For example, NAEP's 20-year overview, Crossroads in American Education, indicated that only 2.6 percent of 17-year-olds taking the test could write a good letter to a high school principal about why a rule should be changed. And when I say good, I'm talking about a straightforward presentation of a couple of simple points. Only 5 percent could grasp a paragraph as complicated as the kind you would find in a first-year college textbook. And only 6 percent could solve a multi-step math problem like this one: "Christine borrowed $850 for one year from Friendly Finance Company. If she paid 12% simple interest on the loan, what was the total amount she repaid?"
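
    For reference, the simple-interest arithmetic in that sample problem works out as follows:

```python
# Working out the NAEP sample problem quoted above: $850 borrowed for
# one year at 12% simple interest.
principal = 850
rate = 0.12                          # 12% annual simple interest
interest = principal * rate          # one year of interest: $102
total_repaid = principal + interest  # principal plus interest

print(total_repaid)  # 952.0
```

    That is, Christine repaid $850 + $102 = $952 -- a two-step problem that only 6 percent of the tested 17-year-olds could solve.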

  • Trust: The Foundation Of Student Achievement

    Written on May 21, 2015

    When sharing with me the results of some tests, my doctor once said, "You are a scientist, you know a single piece of data can't provide all the answers or suffice to make a diagnosis. We can't look at a single number in isolation, we need to look at all results in combination." Was my doctor suggesting that I ignore that piece of information we had? No. Was my doctor deemphasizing the result? No. He simply said that we needed additional evidence to make informed decisions. This is, of course, correct.

    In education, however, it is frequently implied or even stated directly that the bottom line when it comes to school performance is student test scores, whereas any other outcomes, such as cooperation between staff or a supportive learning environment, are ultimately "soft" and, at best, of secondary importance. This test-based, individual-focused position is viewed as serious, rigorous, and data driven. Deviation from it -- e.g., equal emphasis on additional, systemic aspects of schools and the people in them -- is sometimes derided as an evidence-free mindset. Now, granted, few people are “purely” in one camp or the other. Most probably see themselves as pragmatists, and, as such, somewhere in between: Test scores are probably not all that matters, but since the rest seems so difficult to measure, we might as well focus on "hard data" and hope for the best.

    Why this narrow focus on individual measures such as student test scores or teacher quality? I am sure there are many reasons but one is probably lack of familiarity with the growing research showing that we must go beyond the individual teacher and student and examine the social-organizational aspects of schools, which are associated (most likely causally) with student achievement. In other words, all the factors skeptics and pragmatists might think are a distraction and/or a luxury are actually relevant for the one thing we all care about: Student achievement. Moreover, increasing focus on these factors might actually help us understand what’s really important: Not simply whether testing results went up or down, but why or why not.



DISCLAIMER

This web site and the information contained herein are provided as a service to those who are interested in the work of the Albert Shanker Institute (ASI). ASI makes no warranties, either express or implied, concerning the information contained on or linked from shankerblog.org. The visitor uses the information provided herein at his/her own risk. ASI, its officers, board members, agents, and employees specifically disclaim any and all liability from damages which may result from the utilization of the information provided herein. The content in the Shanker Blog may not necessarily reflect the views or official policy positions of ASI or any related entity or organization.