Why Tests Should Not Be Used to Measure Teacher Performance

Many states have reformed their teacher evaluation systems to hold public school teachers accountable for the academic achievement of their students. The hope is that if teachers are measured by the improvement — or lack thereof — in their students’ achievement, they will work harder to ensure their students learn and consistently unsuccessful teachers will be identified and eventually penalized or let go.

The federal government has provided financial and policy support for this movement. President Obama’s education agenda includes two key programs that reward states that adopt or expand these test-based reforms of teacher evaluation. The Race to the Top initiative has given large federal grants to 18 states and the District of Columbia on the condition that they put these systems in place. In addition, Secretary of Education Arne Duncan has approved waivers of key provisions of the No Child Left Behind Act for 39 states and the District of Columbia that agreed, among other conditions, to measure teacher performance based on student test scores.

But isn’t it just common sense to hold teachers accountable for their students’ academic performance? Isn’t that a principal purpose of education?

The fly in the ointment is that the tests now in use are ill-suited to making that kind of judgment, which makes these schemes unfair to teachers. Once test results are used to fire teachers, or to reward or penalize them with higher or lower pay, these systems will face legal challenges that they probably will not survive.

The ordinary citizen, as well as the ordinary state legislator, does not seem to understand this. To most people, a test is a test, and test results ought to be used to make major decisions about schools, teachers, and students.

The clearest explanation of the limitations of tests can be found in a new book by W. James Popham, Evaluating America’s Teachers: Mission Possible? (Corwin Books). Professor Popham, who taught at the University of California, Los Angeles, not only clearly explains the basic facts in this controversy but also lays out an alternative system that would more fairly and reliably evaluate teachers.

As Popham describes it, there are two major problems with using current tests for teacher evaluation: defects rooted in how tests are constructed, and the absence of a link between teaching and test results.

As for the first set of problems, the tests now used for student accountability are designed in a way that ignores some types of student achievement and some student characteristics. These tests are constructed to reveal differences in achievement between students. If most students answer a question correctly, that question is dropped, and so student mastery of the knowledge or skills embodied in the question goes unmeasured. These tests are also culturally biased: students from families with greater educational attainment or higher incomes are more likely to answer certain questions correctly. In addition, these tests do not take into account differences in inherent ability among individuals, so that an intellectually gifted student, for example, may score very high on a test regardless of the effectiveness of his or her teacher.

The second major deficiency is that current tests are not instructionally sensitive. In other words, they do not differentiate well between students who are taught effectively and those who are taught poorly, so that student performance on the test does not accurately reflect the quality of instruction provided to help students master what is being tested. Unless a test is instructionally sensitive, it does not inform teachers about how to change their instruction to influence their students’ scores. This defect strikes at the very heart of the argument that teachers ought to be held accountable for student achievement based on these tests. Why should teachers be held accountable when they cannot influence the results by improving how they teach or what they teach?

The Obama administration and state governments are creating problems for themselves by promoting these test-based teacher evaluation systems. Doing so may not cost them political support, because the idea makes sense on the surface, but they will face a peck of trouble in the courts over the next few years. The new tests being developed with federal support do not seem to remedy these defects, and so the problems remain.

The principal benefit of this debate is that it has exposed the shortcomings of the former systems used to evaluate teachers, which often relied on occasional, casual observations by principals and which resulted in high ratings for most teachers. In correcting one wrong, however, we should not create another. Why can’t we do it right this time?

Teacher evaluation should be a comprehensive scheme that draws on a number of factors, but it should not rely heavily on deeply flawed student test results. Our teachers, and our students, deserve better.

This piece by Jack Jennings appeared as a blog post in the Huffington Post on July 9, 2013.