I’m sharing my blog today with Jill Kerper Mora, associate professor emerita at the School of Teacher Education at San Diego State University. She is blogging about why she believes merit pay based on student test scores won’t work. For another viewpoint, check out our earlier Blogger for a Day, local attorney Tyler Cramer.
These are her views, not mine, so if you have comments, questions or counterarguments, please e-mail Jill directly at firstname.lastname@example.org. Don’t forget to tell her if she can use your name to respond to your points in a blog post! — Emily Alpert
President Obama and Secretary of Education Arne Duncan announced this administration’s education initiative called “Race to the Top.” It proposes regulations for state education agencies to qualify for $4.35 billion in economic stimulus money. The administration stipulated that to earn grant awards, states must not have laws prohibiting the use of students’ test scores to evaluate, compensate and promote teachers. Race to the Top gives this definition of an effective teacher: “Effective teacher means a teacher whose students achieve acceptable rates (e.g. at least one grade level in an academic year) of student growth.” (p. 37811)
According to the Race to the Top criteria, laws that prevent school officials from using test scores to document teachers’ effectiveness pose an obstacle to efforts to improve teacher quality. Such a law is what local attorney and Blogger for a Day Tyler Cramer calls the “firewall” that may prevent California schools from receiving federal stimulus money. Mr. Cramer is a proponent of using students’ test scores to make personnel decisions about their teachers. I disagree.
It is important to examine the reasons why teachers unions and many educators oppose the use of test scores in evaluating teachers to determine whether or not they are effective and to award pay bonuses, tenure and/or promotion. The concept of merit pay is based on what can be called the “water glass theory” of teaching and learning. The belief is that teaching is much like pouring water into a glass that gradually fills until it reaches a certain level, such as the knowledge needed to master the curriculum standards for a particular grade level, or the level of knowledge needed to earn a high school diploma.
The theory posits that at each grade level, a teacher pours in a certain amount of knowledge to add to students’ learning from previous years of schooling. The assumption is that growth in knowledge, like the level of water in the glass, can be accurately measured by determining growth in test scores when test scores for individual students are compared from year to year longitudinally.
If Teacher X is effective, she causes the students’ knowledge level to rise more and faster, and she should be rewarded with higher pay. Meanwhile, if Teacher Y’s students’ level of knowledge does not increase to a predicted level as shown by gains in test scores, he is deemed to be ineffective and is subject to sanctions or dismissal.
Mr. Cramer gives us a hypothetical example of just such a scenario, using second-to-third-grade math scores as a comparison. Mr. Cramer claims that since 20 of Mrs. Mathy’s students’ scores rose 15 percentile points on the state math test from last year’s scores, she is deserving of a reward for their outstanding progress. Meanwhile, Mr. Remarc’s students’ scores dropped five percentile points, so he is not deserving of a reward.
There are several fallacies in this hypothetical example that are based on common misunderstandings about standardized test scores. First, we observe that Mr. Remarc actually meets the federal regulation’s definition of an effective teacher, since a score difference of only five points is probably not statistically significant. This is because a five-point difference between second- and third-grade scores is within the normal and predictable range (what test makers call the standard error of measurement) around which percentile rankings will “wobble” for individual students from year to year.
Mr. Remarc’s students, in fact, show that they gained one academic year in achievement for one academic year of instruction, even though their scores dropped slightly. We must keep in mind that standardized test scores are only approximations, based on samplings of students’ knowledge, and are notoriously imprecise.
So what can be said about Mrs. Mathy’s students’ 15 percentile point gain? While this gain may be statistically significant according to a statistician, can we really be sure that this growth in scores is attributable to her “outstanding” effectiveness as a teacher? What if the third-grade test was simply easier than the second-grade test and therefore lower or equivalent raw scores translated into higher percentile rankings when students were compared to all of the others in the “norm group” who took the test?
Or what if Mrs. Mathy just got lucky and guessed accurately what would be on the test so she prepped her students better than her colleague Mr. Remarc? Or what if Mrs. M’s students had a wonderful librarian who turned them on to reading during their twice-a-week visits? Or maybe they had the benefit of an after-school enrichment program. There are many factors over which teachers have no control that can impact their students’ test scores.
Mr. Cramer claims that “…longitudinal data systems can isolate the effects of instructional inputs.” Again, I disagree.
To return to the “water glass” theory, once water comes streaming into the glass from many different sources, such as with secondary schools where students see five or six different teachers in a school day, it becomes impossible to trace their learning back to a single teacher. The more teachers collaborate within schools or with outside education providers, or even with highly involved parents to promote students’ learning, the less likely a single teacher can either take the credit or accept the blame for students’ test score increases or decreases.
We educators are puzzled as to why the Obama administration believes that more and better number crunching will improve teachers’ effectiveness. The model of one year of growth for one academic year of schooling is especially problematic for evaluating teachers who work with student populations that predictably do not or cannot show growth, such as English learners and students who are enrolled in special education classes.
Twenty-five percent of California’s students are classified as limited in English proficiency, which means that they cannot score on grade level on standardized tests because they do not understand, read or write English. Do we want an evaluation system where teachers who have large enrollments of English learners in their classrooms do not have an equal opportunity for earning merit pay or demonstrating their effectiveness because their students are still learning English?
The United States Department of Education’s guidelines under President Obama are designed to entice and/or pressure the states into adopting unsound and inequitable evaluation schemes for teachers, principals and even teacher education programs in order to secure funding for schools. This is an unprecedented encroachment into public education by the federal government, which has no constitutional role in education.
We should return to the wisdom of the Founding Fathers and Just Say No to overturning the law that protects teachers from the unfair use of students’ test scores for evaluating and compensating their professional endeavors in educating our children.