The conversation continues on tying teacher evaluation to test scores! Jill Kerper Mora, professor emerita of education and an opponent of linking teacher evaluation to test scores, responds to a post from former education reporter Vlad Kogan. — Emily Alpert
Vlad,
Thank you for your thoughtful response to the postings. I believe that the history of “value-added” research that attempts to link test scores to teacher effectiveness shows us clearly that the sorts of statistical analysis you talk about cannot accurately measure teacher quality.
Take, for example, the work of Bill Sanders from Tennessee. His work has been largely discredited because his statistical model was based on the assumption that teachers teach in a sort of box: One teacher to 25-30 students with no pull-out programs or supplemental instructional services, etc., such that the model could capture an isolated “teacher effect.”
This is rarely the case in public schools, so the number of variables and the statistical gyrations became so complex and individualized as to be practically useless, not to mention expensive in terms of what the state had to pay statisticians.
Mr. Cramer uses the term “similarly situated,” which, if I remember correctly from my two lawyer parents and my lawyer sister, is a term used to describe litigants in a class-action lawsuit. Our situation in California is hardly “similar” to that of other states because of our large English Language Learner (ELL) population (25 percent ELL, 46 percent speakers of a language other than English in the home).
This also means that very few schools are “similarly situated” in terms of the numbers and types of ELL students they have. For example, here in San Diego we have large numbers of immigrant students who are newly arrived in the United States or who are “transnational” in that they have familial and economic ties in both Mexico and the U.S. These students have a very different learning/schooling context than, say, Chinese immigrants from Hong Kong living in San Francisco.
Although they may share some statistical characteristics in terms of English language proficiency, they can hardly be termed “similarly situated” in terms of their level of contact with native English speakers and a whole host of other language learning variables.
In addition, we know that the language learning curve is not straight up and that different components of language show very different growth rates on language assessment tests like the CELDT. Students’ CELDT scores tend to rise rapidly at the lower levels of proficiency because the tests measure listening and speaking skills, but literacy skills growth is very difficult to measure using this test. Students frequently reach a “plateau” in test scores at proficiency level 3.
I have heard many teachers say that they have a hard time convincing their administrators that this is not evidence that teachers with level 3 learners in their classrooms are less effective, but rather that this is an artifact of the natural progression of language learning and the test’s inability to discriminate forms of learning (literacy, content-area vocabulary) that contribute to total growth over time. The state has mandated that students grow one CELDT level per year for their school to be termed effective, even though research shows that the average growth is 0.75 of a level per year. I give you this example to show how test data can be misinterpreted, misapplied, and misused to the detriment of teachers and schools rather than to support students’ enhanced achievement.
I also agree that “random” student placements are impossible. Again, an example from our linguistically diverse student population: What about students who are placed through the waiver process in transitional bilingual programs or dual immersion programs? These are not random placements. We shouldn’t even consider altering parental choice options to fit a statistical model’s assumptions in evaluating teachers.
In my second posting of the day, I emphasized the impact on the teaching force of imposing an evaluation system that teachers do not “buy into” or are opposed to. I hope that, in weighing the “trade-offs,” policymakers and the public fully understand the implications of an unfair evaluation system for teachers’ sense of professionalism and their ability to collaborate and support each other rather than being artificially placed in competition with each other or judged using unsound and incomplete evidence of their performance.
Thanks for the conversation.