• Calling Black Men To The Blackboard

    Our guest author today is Travis Bristol, former high school English teacher in New York City public schools, who is currently a clinical teacher educator with the Boston Teacher Residency program, as well as a fifth-year Ph.D. candidate at Teachers College, Columbia University. His research interests focus on the intersection of gender and race in organizations. Travis is a 2013 National Academy of Education/Spencer Dissertation Fellow.

    W.E.B. Du Bois, the preeminent American scholar, suggested that the problem of the twentieth century is the problem of the color-line. Without question, the problem of the twenty-first century continues to be the “color-line,” which is to say race. And so it is understandable why Cabinet members in the Obama administration continue to address the race question head-on, through policies that attempt to decrease systemic disparities between Latino and Black Americans, on the one hand, and White Americans on the other.

    Most recently, in August 2013, U.S. Attorney General Eric Holder announced the Justice Department’s decision to scale back the use of federal mandatory minimum sentences for certain drug offenses. Holder called “shameful” the fact that “black male offenders have received sentences nearly 20 percent longer than those imposed on white males convicted of similar crimes.” Attempts, such as Holder's, to reform the criminal justice system appear to be an acknowledgment that institutionalized racism influences how Blacks and Whites are sentenced.

  • Learning From The 1963 March On Washington For Jobs And Freedom

    Today marks the actual calendar day of the 50th anniversary of the 1963 March on Washington for Jobs and Freedom. In honor of that day, we republish Al Shanker’s tribute to A. Philip Randolph, the director of the March, on the occasion of Randolph’s passing in 1979. One of the themes of Shanker’s comments is the distinctive place of A. Philip Randolph in the African-American freedom struggle: distinguished from Booker T. Washington, W.E.B. Du Bois and Marcus Garvey by his focus on the empowerment of African-American working people and his commitment to non-violent, mass action as the means of empowerment. One of the lesson plans the Shanker Institute has published for teaching the 1963 March focuses precisely on this distinctive contribution of Randolph. Other lesson plans look at Randolph’s close partner, Bayard Rustin, who was the organizing genius behind the March, and examine the alliance between the labor movement and the civil rights movement that made the March a success. All of the Shanker Institute lesson plans can be read here.

    It may be said - I think without exaggeration - that no American in this century has done more to eliminate racial discrimination in our society and to improve the condition of working people than did A. Philip Randolph, who died this week at the age of 90. 

    For A. Philip Randolph, a man of quiet eloquence with dignity in every gesture, freedom and justice were never granted people. They had to be fought for in struggles that were never-ending. And progress was something that had to be measured in terms of tangible improvements in people's lives, in the condition of society generally, and in the quality of human relationships.

  • The Great Proficiency Debate

    A couple of weeks ago, Mike Petrilli of the Fordham Institute made the case that absolute proficiency rates should not be used as measures of school effectiveness, as they are heavily dependent on where students “start out” upon entry to the school. A few days later, Fordham president Checker Finn offered a defense of proficiency rates, noting that how much students know is substantively important, and associated with meaningful outcomes later in life.

    They’re both correct. This is not a debate about whether proficiency rates are at all useful (by the way, I don't read Petrilli as saying that). It’s about how they should be used and how they should not.
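    To make Petrilli's point concrete, here is a minimal sketch, with entirely invented numbers: two hypothetical schools produce identical growth for their students, yet end up with very different proficiency rates simply because their students enter at different levels. The cut score of 300 is an assumption for illustration.

        import numpy as np

        rng = np.random.default_rng(0)
        CUT = 300  # hypothetical proficiency cut score

        entry_a = rng.normal(280, 25, 1000)  # School A: lower-scoring intake
        entry_b = rng.normal(310, 25, 1000)  # School B: higher-scoring intake
        GROWTH = 15  # both schools add the same 15 points per student

        for name, entry in [("A", entry_a), ("B", entry_b)]:
            exit_scores = entry + GROWTH
            rate = (exit_scores >= CUT).mean()
            print(f"School {name}: growth = {GROWTH} pts, proficiency rate = {rate:.0%}")

        # Same growth, very different rates: School A lands near 42 percent
        # proficient, School B near 84 percent, purely because of intake.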

    Let’s keep this simple. Here is a quick, highly simplified list of how I would recommend interpreting and using absolute proficiency rates, and how I would avoid using them.

  • Proficiency Rates And Achievement Gaps

    The change in New York State’s tests, as well as the release of their results, has inevitably generated a lot of discussion of how achievement gaps have changed over the past decade or so (and what they look like on the new tests). In many cases, the gaps, and trends in the gaps, are being presented in terms of proficiency rates.

    I’d like to make one quick point, which is applicable both in New York and beyond: In general, it is not a good idea to present average student performance trends in terms of proficiency rates, rather than average scores, but it is an even worse idea to use proficiency rates to measure changes in achievement gaps.

    Put simply, proficiency rates have a legitimate role to play in summarizing testing data, but the rates are very sensitive to the selection of the cut score, and they provide a very limited, often distorted portrayal of student performance, particularly when viewed over time. There are many ways to illustrate this distortion, but among the more vivid is the fact, which we’ve shown in previous posts, that average scores and proficiency rates often move in different directions. In other words, at the school level, it is frequently the case that the performance of the typical student -- i.e., the average score -- increases while the proficiency rate decreases, or vice-versa.
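    To see how this can happen, here is a tiny worked example (the scores are invented): three students sit just above a hypothetical cut score of 300, and three sit far below it. The next year, the top three slip slightly below the cut while the bottom three improve substantially.

        CUT = 300  # hypothetical cut score
        year1 = [302, 303, 304, 250, 255, 260]
        year2 = [298, 299, 297, 290, 295, 296]

        for label, scores in [("Year 1", year1), ("Year 2", year2)]:
            mean = sum(scores) / len(scores)
            rate = sum(s >= CUT for s in scores) / len(scores)
            print(f"{label}: average score = {mean:.1f}, proficiency rate = {rate:.0%}")

        # Year 1: average 279.0, rate 50%
        # Year 2: average 295.8, rate  0%
        # The typical student improved, yet the proficiency rate collapsed,
        # because three students slipped from just above the cut to just below it.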

    Unfortunately, the situation is even worse when looking at achievement gaps. To illustrate this in a simple manner, let’s take a very quick look at NAEP data (4th grade math), broken down by state, between 2009 and 2011.
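    Before getting to the actual NAEP figures, here is a stylized example of the underlying problem (all parameters invented): hold the gap in average scores fixed at 20 points, let both groups improve by 10 points, and watch what happens to the gap in proficiency rates under two different cut scores.

        from statistics import NormalDist

        SD = 30
        # (mean for group X, mean for group Y); both groups gain 10 points,
        # so the gap in average scores is 20 points in both years.
        years = {"2009": (240, 220), "2011": (250, 230)}

        for cut in (230, 250):
            print(f"Cut score = {cut}")
            for year, (mx, my) in years.items():
                rate_x = 1 - NormalDist(mx, SD).cdf(cut)  # share of X above the cut
                rate_y = 1 - NormalDist(my, SD).cdf(cut)
                print(f"  {year}: proficiency-rate gap = {rate_x - rate_y:.1%}")

        # With a cut of 230 the rate gap "narrows" (26.1% -> 24.8%); with a cut
        # of 250 it "widens" (21.1% -> 24.8%) -- even though the true gap in
        # average scores never changes.

    The exact same score trends, in other words, can make the gap appear to be narrowing or widening, depending on nothing more than where the cut score happens to fall.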

  • New York State Of Mind

    Last week, the results of New York’s new Common Core-aligned assessments were national news. For months, officials throughout the state, including New York City, have been preparing the public for the release of these data.

    Their basic message was that the standards, and thus the tests based upon them, are more difficult, and that they represent an attempt to truly gauge whether students are prepared for college and the labor market. The inevitable consequence of raising standards, officials have been explaining, is that fewer students will be “proficient” than in previous years (which was, of course, the case). This does not mean that students are performing worse, only that they are being held to higher expectations, and that the skills and knowledge being assessed require a new, more expansive curriculum. Therefore, interpretation of the new results versus those of previous years must be extremely cautious, and educators, parents and the public should not jump to conclusions about what they mean.

    For the most part, the main points of this public information campaign are correct. It would, however, be wonderful if similar caution were evident in the roll-out of testing results in past (and, more importantly, future) years.

  • NAEP And Public Investment In Knowledge

    As reported over at Education Week, the so-called “sequester” has claimed yet another victim: the National Assessment of Educational Progress, or NAEP. As most people who follow education know, this highly respected test, often called the “nation’s report card,” is a very useful means of assessing student performance, both in any given year and over time.

    Two of the “main assessments” – i.e., those administered in math and reading every two years to fourth and eighth graders – get most of the attention in our public debate, and these remain largely untouched by the cuts. But, last May, the National Assessment Governing Board, which oversees NAEP, decided to eliminate the 2014 NAEP exams in civics, history and geography for all but 8th graders (the exams were previously administered in grades 4, 8 and 12). Now, in its most recent announcement, the Board has decided to cancel its plans to expand the sample for 12th graders (in math, reading, and science) to make it large enough to allow state-level results. In addition, the 4th and 8th grade science samples will be cut back, making subgroup breakdowns very difficult, and the science exam will no longer be administered to individual districts. Finally, the “long-term trend NAEP,” which has tracked student performance for 40 years, has been suspended for 2016. These are substantial cutbacks.

    Although its results are frequently misinterpreted, NAEP is actually among the few standardized tests in the U.S. that receive rather wide support from all “sides” of the testing debate. And one cannot help but notice that federal and state governments are currently making significant investments in new tests that are used for high-stakes purposes, whereas NAEP, the primary low-stakes assessment, is being scaled back.

  • Under The Hood Of School Rating Systems

    Recent events in Indiana and Florida have resulted in a great deal of attention to the new school rating systems that over 25 states are using to evaluate the performance of schools, often attaching high-stakes consequences and rewards to the results. We have published reviews of several states' systems here over the past couple of years (see our posts on the systems in Florida, Indiana, Colorado, New York City and Ohio, for example).

    Virtually all of these systems rely heavily, if not entirely, on standardized test results, most commonly by combining two general types of test-based measures: absolute performance (or status) measures, or how highly students score on tests (e.g., proficiency rates); and growth measures, or how quickly students make progress (e.g., value-added scores). As discussed in previous posts, absolute performance measures are best seen as gauges of student performance, since they can’t account for the fact that students enter the schooling system at vastly different levels. Growth-oriented indicators, by contrast, can be viewed as more appropriate in attempts to gauge school performance per se, as they seek (albeit imperfectly) to control for students’ starting points (and other characteristics that are known to influence achievement levels) in order to isolate the impact of schools on testing performance.*

    One interesting aspect of this distinction, which we have not discussed thoroughly here, is the idea/possibility that these two measures are “in conflict.” Let me explain what I mean by that.
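    As a preview, here is a bare-bones sketch (all weights and scores invented) of how the conflict can play out when the two measures are combined into a composite rating: a high-status, low-growth school and a low-status, high-growth school swap places depending on the weighting.

        schools = {
            "School A": {"status": 0.85, "growth": 0.30},  # high scores, little progress
            "School B": {"status": 0.40, "growth": 0.90},  # low scores, rapid progress
        }

        for w_status in (0.75, 0.25):  # weight placed on the status component
            w_growth = 1 - w_status
            print(f"Status weight = {w_status:.0%}, growth weight = {w_growth:.0%}")
            for name, m in schools.items():
                composite = w_status * m["status"] + w_growth * m["growth"]
                print(f"  {name}: composite rating = {composite:.2f}")

        # With a 75/25 status-heavy mix, School A wins (0.71 vs. 0.53); with a
        # 25/75 growth-heavy mix, School B wins (0.44 vs. 0.78). The weighting,
        # a pure policy choice, determines which school looks "better."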

  • So Many Purposes, So Few Tests

    In a new NBER working paper, economist Derek Neal makes an important point, one of which many people in education are aware but which is infrequently reflected in actual policy: using the same assessment to measure both student and teacher performance often contaminates the results for both purposes.

    In fact, as Neal notes, some of the very features required to measure student performance are the ones that make this contamination possible when the tests are used in high-stakes accountability systems. Consider, for example, a situation in which a state or district wants to compare the test scores of a cohort of fourth graders in one year with those of fourth graders the next year. One common means of facilitating this comparability is administering some of the same questions to both groups (or to a “pilot” sample of students prior to those being tested). Otherwise, any difference in scores between the two cohorts might simply be due to differences in the difficulty of the questions. Without that check, it’s tough to make meaningful comparisons.
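    Here is a toy sketch of that logic, with invented numbers (this is a caricature of the reasoning behind common items, not an actual equating procedure): if two cohorts perform about the same on the shared questions but differently on the full test, the difference likely reflects the difficulty of the new form rather than the ability of the students.

        import numpy as np

        rng = np.random.default_rng(1)

        # Percent-correct for two cohorts on the items they share...
        common_y1 = rng.normal(65, 10, 500)
        common_y2 = rng.normal(65, 10, 500)  # cohorts look equivalent on shared items

        # ...and on the full test forms.
        full_y1 = rng.normal(64, 10, 500)
        full_y2 = rng.normal(58, 10, 500)    # year-2 scores are lower overall

        print(f"common items: {common_y1.mean():.1f} vs. {common_y2.mean():.1f}")
        print(f"full test:    {full_y1.mean():.1f} vs. {full_y2.mean():.1f}")
        # Because the cohorts perform about the same on the shared questions, the
        # drop on the full test points to a harder year-2 form, not weaker students.
        # With no common items, the two explanations would be indistinguishable.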

    But it’s precisely this need to repeat questions that enables one form of so-called “teaching to the test,” in which administrators and educators use questions from prior assessments to guide their instruction for the current year.

  • The Characteristics Of SIG Schools

    A few years ago, the U.S. Department of Education (USED) launched the School Improvement Grant (SIG) program, which is designed to award grants to “persistently low-achieving schools” to carry out one of four different intervention models.

    States vary in how SIG-eligible schools are selected, but USED guidelines require the use of three basic types of indicators: absolute performance levels (e.g., proficiency rates); whether schools were “making progress” (e.g., rate changes); and, for high schools, graduation rates (specifically, whether the rate is under 60 percent). Two of these measures – absolute performance and graduation rates – tell you relatively little about the actual performance of schools, as they depend heavily on the characteristics (e.g., income) of students/families in the neighborhood served by a given school. It was therefore pretty much baked into the rules that schools awarded SIGs would tend to exhibit certain characteristics, such as higher poverty rates.
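    In rough pseudocode terms, an eligibility screen looks something like the sketch below; the school names and thresholds (other than the 60 percent graduation-rate cutoff) are invented, since the actual formulas vary from state to state.

        schools = [
            # name, proficiency rate, change in rate (pts), graduation rate (or None)
            ("Adams ES",  0.22, -1.5, None),
            ("Baker HS",  0.45,  2.0, 0.55),
            ("Carver MS", 0.60,  4.0, None),
        ]

        PROF_FLOOR = 0.25      # invented "persistently low-achieving" cutoff
        PROGRESS_FLOOR = 0.0   # invented "making progress" cutoff

        for name, prof, change, grad in schools:
            low_achieving = prof < PROF_FLOOR
            not_progressing = change <= PROGRESS_FLOOR
            low_grad = grad is not None and grad < 0.60  # the actual federal threshold
            eligible = (low_achieving and not_progressing) or low_grad
            print(f"{name}: SIG-eligible = {eligible}")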

    Over 800 schools were awarded “Tier 1” or “Tier 2” grants for the 2010-11 school year (“SIG Cohort One”). Let’s take a quick look at a couple of key characteristics of these schools, using data from USED and the National Center for Education Statistics.

  • Charter School Market Share And Performance

    One of the (many) factors that might help explain -- or at least be associated with -- the wide variation in charter schools’ test-based impacts is market share. That is, the proportion of students that charters serve in a given state or district. There are a few reasons why market share might matter.

    For example, charter schools compete for limited resources, including private donations and labor (teachers), and fewer competitors means more resources. In addition, there are a handful of models that seem to get fairly consistent results no matter where they operate, and authorizers who are selective and only allow “proven” operators to open up shop might increase quality (at the expense of quantity). There may be a benefit to very slow, selective expansion (and smaller market share is a symptom of that deliberate approach).

    One way to get a sense of whether market share might matter is simply to check the association between measured charter performance and coverage. It might therefore be interesting, albeit exceedingly simple, to use the recently-released CREDO analysis, which provides state-level estimates based on a common analytical approach (though different tests, etc.), for this purpose.
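    In code, the exercise amounts to nothing more than a simple correlation. The figures below are placeholders rather than actual CREDO estimates or market-share data; they exist only to show the shape of the check.

        import numpy as np

        # (state, charter market share, estimated charter effect in SDs)
        # All values below are invented placeholders, NOT real CREDO results.
        data = [
            ("State A", 0.03,  0.05),
            ("State B", 0.06,  0.01),
            ("State C", 0.10, -0.02),
            ("State D", 0.15, -0.04),
            ("State E", 0.25, -0.06),
        ]

        share = np.array([d[1] for d in data])
        effect = np.array([d[2] for d in data])

        r = np.corrcoef(share, effect)[0, 1]
        print(f"correlation between market share and estimated impact: r = {r:.2f}")
        # A negative r would be (weakly) consistent with the slow-expansion story,
        # but with so few data points and no controls, it would prove nothing.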