The Data-Driven Education Movement

Also reprinted in the Washington Post.

In the education community, many proclaim themselves to be "completely data-driven." Data Driven Decision Making (DDDM) has been a buzz phrase for a while now, and continues to be a badge many wear with pride. And yet, every time I hear it, I cringe.

Let me explain. During my first year in graduate school, I was taught that excessive attention to quantitative data impedes – rather than aids – in-depth understanding of social phenomena. In other words, explanations cannot simply be cranked out of statistical analyses without a precursor theory of some kind, and the attempt to do so – a.k.a. “variable sociology” – constitutes a major obstacle to the advancement of knowledge.

I am no longer in graduate school, so part of me says: Okay, I know what data-driven means in education. But then, at times, I still think: No, really, what does “data-driven” mean even in this context?

At a basic level, it seems to signal a general orientation toward making decisions based on the best information that we have, which is a very good thing. But there are two problems here. First, we tend to have an extremely narrow view of the information that counts – that is, data that can be quantified easily. Second, we seem to operate under the illusion that data, in and of themselves, can tell stories and reveal truth.

But the thing is: (1) numbers are not the only type of data that matter; and (2) all data need to be interpreted before they can be elevated to the status of evidence – and theory should drive this process, not data.

Remember the parable about the drunk man searching for his wallet under a streetlight? When someone comes to help, they ask, “Are you sure you dropped it here?” The drunk says, “I probably dropped it in the street, but the light is bad there, so it’s easier to look over here.” In science, this phenomenon – that is, researchers looking for answers where the data are better, “rather than where the truth is most likely to lie” – has been called the “streetlight effect.”

As David Freedman explains it in a Discover Magazine article that asks why scientific studies are so often wrong, researchers “don’t always have much choice. It is often extremely difficult or even impossible to cleanly measure what is really important, so scientists instead cleanly measure what they can, hoping it turns out to be relevant.”

As Freedman says, “We should fully expect scientific theories to frequently butt heads and to wind up being disproved sometimes as researchers grope their way toward the truth. That is the scientific process: Generate ideas, test them, discard the flimsy, repeat.”

But what if researchers develop ideas to fit the data they have, rather than finding the data to test the most important ideas?

So, as yawn-inducing as the word theory may sound to a lot of people, theory acts to rationalize the search for your wallet or anything else, helping to focus attention on the areas where it is most likely to be found. In education, it often seems like we are too preoccupied with the convenient and well-lit. So, while it seems like we are drowning in education data, are they the data that we need to make sound decisions?

Sociologists Peter Hedström and Richard Swedberg (1996) wrote:

Quantitative research is essential both for descriptive purposes and for testing sociological theories. We do, however, believe that many sociologists have had all too much faith in statistical analysis as a tool for generating theories, and that the belief in an isomorphism between statistical and theoretical models [...] has hampered the development of sociological theories built upon concrete explanatory mechanisms.

Something similar could be said about the data-driven education movement: Excessive faith in data crunching as a tool for making decisions has interfered with the important task of asking the fundamental questions in education, such as whether we are looking for answers in the right places, and not just where it is easy (e.g., standardized test data).

As education scholar (and blogger) Bruce Baker has shown (often humorously), data devoid of theory can suggest ridiculous courses of action:

Let’s say I conducted a study in which I rented a fleet of helicopters and used those helicopters to, on a daily basis, transport a group of randomly selected students from Camden, NJ to elite private day schools around NJ and Philadelphia. I then compared the college attendance patterns of the kids participating in the helicopter program to 100 other kids from Camden who also signed up for the program but were not selected and stayed in Camden public schools. It turns out that I find that the helicopter kids were more likely to attend college – therefore I conclude logically that “helicopters improve college attendance among poor, minority kids.”

As preposterous as this proposal may sound, the Brookings report he mentions argues somewhat along these lines – only the helicopters are vouchers. The study, says Baker, “purports to find [or at least the media spin on it] that vouchers as a treatment, worked especially for black students.” A minimal understanding of the mechanisms involved here should have made it obvious that vouchers are likely no more relevant than helicopters to children’s educational attainment.

A second example: About a year ago at the United Nations Social Innovation Summit, Nicholas Negroponte suggested that the “One Laptop Per Child” program might, “literally or figuratively, drop out of a helicopter with tablets into a village where there is no school,” and then come back after a year to see how children have taught themselves to read.

This faith in the power of new technology to bring about fundamental educational transformation is not new, but I think it could be tempered if we reflected on more basic questions such as: What is it that helicopter-dropped tablets might actually do to increase children’s educational gains?

My colleague recently wrote that NCLB "has helped to institutionalize the improper interpretation of testing data." True. But I would go even further: NCLB has helped to institutionalize not just how we handle data, but also, and more importantly, what counts as data. The law requires schools to rely on scientifically-based research but, as it turns out, case studies, ethnographies, interviews, and other forms of qualitative research seem to fall outside this definition – and, thus, are deemed unacceptable as a basis for making decisions.

Since when are qualitative data unacceptable in social and behavioral science research and as a guide in policy-relevant decision-making?

Our blind faith in numbers has ultimately impoverished both what information we use and how we use it to help address real-world problems. We now apparently believe that numbers are not just necessary, but sufficient, for making research-based decisions.

The irony, of course, is that this notion is actually contrary to the scientific process. Being data-driven is only useful if you have a strong theory by which to navigate; anything else can leave you heading blindly toward a cliff.

- Esther Quintero


Data driven education produces lots of data but little education.

Obviously that conclusion about the helicopters can't be drawn from the "study" you describe - you'd need a control group that also attended the private schools but used another form of transportation. If outcomes for the helicopter group were statistically superior then one could conclude that the helicopters (rather than the private schools) had an impact. You, of course, already know this.
Shame on you for presenting such a flawed example and pretending it's a strong parallel to the voucher study. I have come to expect a more rigorous standard of discussion on this blog. This type of manipulative dialogue will get us nowhere - we would be better served by helping the public understand how statistics actually works.

This is what SMART goals are all about - specific, measurable, attainable, realistic and timely - if you can't measure it, can't compartmentalize it, it's not useful. And as you say, it not only changes what we measure, it changes what we ask. It affects what we focus on and what is considered important. It sacrifices a process, a journey, for a snapshot. It purports to measure growth, but doesn't really acknowledge that growth takes time and growth is not necessarily a linear process. And it reinforces the idea that if it's not recognizably practical before you even begin, it's not worth consideration and as a result stifles growth, experimentation, and innovation.

Because I was told in graduate school that raw data needs to be interpreted, I took the 17 daily data sheets required for one student with autism home with me to attempt to create a scatterplot. For this, I was "written up" for not having the data available the next day for the ABA coach to inspect. This convinced me of something I already suspected; the massive amounts of data teachers are forced to churn out are not meant to drive instruction but to prove that we are not sitting around eating bonbons all day. I would have preferred to have a camera mounted on my shoulder to prove I was doing my job. At least that would have freed me up to actually get some teaching done!

Laura said "you’d need a control group that also attended the private schools but used another form of transportation"

However, proponents of vouchers ARE comparing the education potential of kids who attend one school versus another. The study design described does not conclude anything about helicopters, as you correctly state. However, telling the public that kids who have a voucher-based entrance into an elite private school will do better as compared to students going to the local public school also says nothing about the voucher itself either. In fact, such studies are inherently flawed because the means of attendance is not the only variable.

How many studies out there actually DO look at the college entrance rates of poor minority kids who are placed in an elite private school as compared to other kids at the same school who can afford it? Good luck finding the studies you describe that could have been referenced in this post. It is very difficult to perform such studies as the sample sizes are quite small and variables are many.

Some time ago my school started an independent reading program. Middle school students chose their own book and were required to read for 30 minutes a day, make weekly entries into a journal to which their teacher responded, and give book talks. The purpose of the program was not simply to improve vocabulary and reading comprehension but to create a culture of reading in our school. (Previously we had a lot of students who were self-proclaimed non-readers, as in "I don't read.") One day I saw a student who got kicked out of class sitting in the office. He took his book out of his book bag and sat there reading until the assistant principal was available to deal with him. That's when I knew our reading program was effective, even though that evidence was totally non-measurable.

I can provide several examples of how data is misused by administrators from my own experience:

I was told that my ESL students "suffered" under my teaching because out of the FIVE students I had who were ESL, ONE of mine did not do as well as each of my PLC partners' FIVE ESL students. I was moved out of core LA, and this was one of the reasons cited. Five students is statistically insignificant, but that was never considered. Nor were the qualitative factors that made this student a unique situation.

I also know that my principal gathered data on me from students who were in ISS (in school suspension). He had a theory all right (I wouldn't tamper with grades so he wanted me out of core LA), and was trying to gather evidence to support removing me from core LA from students who were habitually in ISS--not usually sent by me, by the way.

If data driven education is going to be used, administrators need to take a few math courses in how to use it appropriately.

Some other problems with data-driven education I wrote about here: http://robinwilsonjohnston.edublogs.org
See "Fixing Only What is Broken"

This is a very insightful piece. I agree with the overall gist but would suggest two important qualifications.

1. Yes, it is true that those who believe in the power of "data" fail to interpret it correctly--and that this is because they fail to begin with a theory or question. But the problem goes deeper than Quintero suggests. She writes, "Excessive faith in data crunching as a tool for making decisions has interfered with the important task of asking the fundamental questions in education, such as whether we are looking for answers in the right places, and not just where it is easy (e.g., standardized test data)." But before asking such a question, one should ask, "what are we hoping to find out here, and what does each of our key terms mean?" If our main question pertains to 'achievement,' one should ask, "how am I defining 'achievement' in this context, and why?" Give up or ignore such a question, and everything else gets muddled.

2. Instead of expanding the definition of "data" to include ethnographies, interviews, and so forth, I would narrow it but acknowledge that data do not provide all the important information. This is a technicality but a key one. Data, in their traditional sense, are comparable bits of measurable information that, when collected in large numbers and interpreted in relation to a research question, can provide information. Student essays do not count as data, because you cannot treat them as comparable bits or break them down into comparable bits (rubrics aside). Some scholars would disagree with me and say that the term "data" has loosened over time to include things that can't be directly compared with one another. I see problems with such loosening: in particular, methodological confusion and corruption. Calling something "data" often sets the stage for breaking it down into comparable and supposedly measurable bits, when it should not be broken down in such a manner. On the other hand, I see many reasons to consider things that do not fall under the stricter definition of "data." Just call them what they are: essays, interviews, or what have you.

I wrote about this here: http://dianasenechal.wordpress.com/2012/06/30/the-misuse-of-data-the-wo… (non-satirical piece) and here: http://dianasenechal.wordpress.com/2012/11/10/student-shows-23-percent-… (satirical piece).

Also, for an alarming example of data-worship in a second-grade classroom, see this video (thanks to Robert Pondiscio for originally pointing it out and to James O'Keeffe for reminding me of it):

http://blog.coreknowledge.org/2010/12/09/data-is-fabulous/

Note: I posted this comment on Diane Ravitch's blog as well.

Shanker Blog
