The Ethics of Testing Children Solely To Evaluate Adults

The recent New York Times article, “Tests for Pupils, but the Grades Go to Teachers,” alerts us to an emerging paradox in education – the development and use of standardized student testing solely as a means to evaluate teachers, not students. “We are not focusing on teaching and learning anymore; we are focusing on collecting data,” says one mother quoted in the article. Now, let’s see: collecting data on minors that is not explicitly for their benefit – does this ring a bell?

In the world of social/behavioral science research, such an enterprise – collecting data on people, especially on minors – would inevitably require approval from an Institutional Review Board (IRB). For those not familiar, an IRB is a committee that oversees research involving people and is responsible for ensuring that studies are designed in an ethical manner. Even when conducting a seemingly harmless interview on political attitudes or observing a group studying in a public library, the researcher would almost certainly be required to go through a series of steps to safeguard participants and ensure that the norms governing ethical research are observed.

Very succinctly, an IRB’s mission is to see that (1) the risk-benefit ratio of conducting the research is favorable; (2) any suffering or distress that participants may experience during or after the study is understood, minimized, and addressed; and (3) research participants agree to participate freely and knowingly – usually, subjects are asked to sign an informed consent form, which includes a description of the study’s risks and benefits, a discussion of how confidentiality will be guaranteed, a statement on the voluntary nature of involvement, and a clarification that refusal or withdrawal at any time will involve no penalty or loss of benefits. When the research involves minors, parental consent and sometimes child assent are needed.

In short, IRB procedures exist to protect people. To my knowledge, student evaluation procedures and standardized testing are exempt from this sort of scrutiny. So the real question is: Should they be? Perhaps not.

One would think unethical research is a thing of the past, but the truth is that there are still many lapses and areas that lack sufficient regulation – see here for a discussion of the use of human tissue samples in medical research, à la Henrietta Lacks. Is standardized testing an area that has managed to fly under our ethical/IRB radar?

In my experience, the IRB asks the tough questions that researchers – sometimes too enamored with their work – may feel tempted to overlook: Do you really need these data? Why? What is to be gained and learned from the research? What are the downsides? In addition to safeguarding participants, these questions are conducive to rigorous and theory-driven data collection – there’s no “collect data first, ask questions later,” so to speak, but quite the opposite.

This post is not about the appropriate use of student test scores or whether they should play a part in teacher evaluations. This is about looking at these questions from a different angle, using an entirely different framework. If collecting data on kids is what we are doing, then, at a minimum, shouldn’t we be doing so in an ethical manner? This is not to suggest that current practices are unethical; I can’t speak to that. But in the world of social science research – educational tests being one of the few exceptions – any work that involves research on humans is subject to IRB scrutiny and supervision, which acts as a sort of quality assurance that seems to be missing in the case of student testing.

Such a framework should also help structure important questions that many people are already asking about standardized tests – Are they worth my kid’s time? Are they a good measure of student learning? How will they be used to help improve my child’s instruction? What else could my child be doing if he/she weren’t busy taking tests or doing test prep? In what ways might tests harm my kid? Should students be aware that their scores will be used to evaluate their teachers, and why (or why not)? Do parents know that they have the right to opt out of having their children take these tests?

Many of these issues are already being raised; what seems to be missing is a framework under which all of these questions are routinely, inclusively, and simultaneously asked for each individual instance of testing. Only such a schema, I argue here, would be conducive to both improved research designs (i.e., better data for better teacher evaluation models) and a coherent and comprehensive set of answers for parents, kids, and educators.

- Esther Quintero


considering that the tests measure how much students have learned, I think it's OK.


I am glad someone has noticed this. jr, that's not the point (although the assertion that tests measure "how much" learning has happened is very debatable).

While state-mandated educational testing gets a free pass on ethical reviews and oversight, research on education and educational tests by psychometricians does not. There is a weird double standard here. Research on minors and their cognitive abilities, whether or not for their benefit, should have quality oversight that is at least similar to what we expect researchers with a state or federal research grant to employ.

A qualified, trained university researcher could not do this kind of work on minors or adults without getting approval from an ethics committee (sometimes more than one). Most likely, the researcher(s) would need to take a number of steps that are not used with state-mandated testing now. These would include making reliability and validity measurements of the test and results available, obtaining parental and minor consent with an option to opt out, assessing the likelihood of anxiety and distress during the test and offering ways to quit the study early, and promising to keep the data anonymous and/or destroy it. There are also issues of handling learning disabilities and tracking for item bias by gender and race, and usually the researcher is expected to make the experience as beneficial as possible for the people participating.

Honestly, the entire in-school testing enterprise is so weak from a perspective of ethics and rigor that I've always seen it as more of a passing hobby that politicians like to engage in. Now that we are basing school closures, teacher tenure, and pay on such a weak system, the flaws in the rigor of the system are becoming exposed. It's all based on a house of cards.

I do not believe an education researcher could get funded with a US Department of Education grant for work that proposed using the lax ethical approach that the Department of Education is encouraging through its own RTTT program.

Just do a cost-benefit analysis. Is the benefit of knowing which teachers are contributing to student achievement worth the cost of students taking the test (as opposed to doing something else with the same time)?

The answer could be yes, even from the student perspective. For example, if the test revealed that the students were not learning as much as similar students in other classes exposed to the same material, then those students might be offered time with a different teacher or offered extra resources to compensate for the crappy instruction that was not their fault.


any work that involves research on humans is subject to IRB scrutiny and supervision

Any work that is "conducted" or "supported" by the federal government, or "for which a federal department or agency has specific responsibility for regulating as a research activity." See 45 C.F.R. part 46.