Guest blogger Rick Beach, president of the San Diego Science Alliance, is taking on the topic of how to measure teacher performance. You might remember that a few bloggers debated this issue last month. We’re continuing this conversation with Beach, who questions whether test scores are the right way to gauge teaching.

These are his views, not mine, so if you have comments, questions or counterarguments, please post them directly to the blog or e-mail Rick directly. Don’t forget to tell him if he can use your name to respond to your points in a blog post. And if you have a different view and want to blog on this topic, feel free to contact me.


We all know that teachers matter. Most of us have memories of a teacher who made a difference. We want our children to have more teachers who make a difference.

But how do we measure that difference?

How Many Teachers Do You Know?

Within a school of 500, there are about 20 teachers. Within a district of 5,000 there are about 200 teachers. Within San Diego County of 500,000 students, there are more than 20,000 teachers.

Which teachers make a difference? Can you make a comparison?

If not you, then who does? Do you trust each of those people to use the same measuring gauge?

Let’s Use Standardized Measures

When we have too many things for one observer to measure, it is tempting to use a standard. Then lots of people can compare the many things to the standard and know if each one measures up.

Like standardized test scores.

Educational testing in our state uses the California Standards Tests (CSTs). Each subject area gets a test: mathematics, language arts, science, and so on. They are there, so let’s use them, right?

But do they test all the subjects that our teachers teach? Language arts (English skills) and mathematics are tested in grades 2 through 11. Oops, what about kindergarten, 1st grade and 12th grade? There are no CST scores for those students.

What about science? There are tests for only grades 5, 8 and 10. Same for social science/history. What about electives?

What if the teacher you know who makes a difference doesn’t teach the subject being tested?

CST scores leave lots of gaps in the measures. Some teachers will have several tests of the subjects they teach, some will have none.
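To make those gaps concrete, here is a rough sketch that tabulates the coverage described above. The grade lists come straight from this post; the dictionary layout, the function name `has_cst_score`, and the choice to count kindergarten as grade 0 are illustrative assumptions, not anything the state publishes in this form.

```python
# Grades with a CST in each subject, as described in this post.
# Kindergarten is represented as grade 0 (an assumption for illustration).
TESTED_GRADES = {
    "language arts": set(range(2, 12)),       # grades 2 through 11
    "mathematics": set(range(2, 12)),         # grades 2 through 11
    "science": {5, 8, 10},
    "social science/history": {5, 8, 10},
}

ALL_GRADES = range(0, 13)  # kindergarten through 12th grade


def has_cst_score(subject: str, grade: int) -> bool:
    """Return True if the post says this subject is tested in this grade.

    Subjects not listed above (electives, for example) have no test.
    """
    return grade in TESTED_GRADES.get(subject, set())


# Count the subject/grade combinations that have no test at all.
untested = [
    (subject, grade)
    for subject in TESTED_GRADES
    for grade in ALL_GRADES
    if not has_cst_score(subject, grade)
]
total = len(TESTED_GRADES) * len(ALL_GRADES)
print(f"{len(untested)} of {total} subject/grade combinations have no CST")
```

Even before counting electives, a teacher picked from these four subjects across kindergarten through 12th grade could easily land in a slot with no score to point to.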

Standardized Measures Don’t Compare

Another technical difficulty with standardized tests is that they rarely help compare one year to the next.  Did a teacher make a difference this year compared to last year?

The CST results are for that year in that subject – only. The tests change. So, the test questions in 2008 will not be the same as those in 2009. One may be more difficult than the other. But which one?

And since the topics in each subject change between grades, you can’t compare the previous grade to the next grade. Fourth grade math deals with whole numbers and simple fractions, while fifth grade math introduces formulae and angles to solve problems.

If a student struggles in math, was it the teacher or the subject?


Another technical issue with using test scores is the movement of children between schools. A teacher may have different students at different times of the school year. Some may have more students move than others. Is it a big problem?

My research in evaluating an innovative program at the Lemon Grove School District surprised me. That district experiences a 30-percent mobility rate. That is, in year one, 90 of the 300 students who were in classes at the end of the year hadn’t been there at the beginning. We were studying a three-year program, and after two years, more than half of the students were different, about 180 of 300 students.

Reliable measures of innovation are not possible with so much variation just from students moving in and out of districts.
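The arithmetic behind those mobility figures can be sketched in a few lines. This is only an illustration under stated assumptions: the 300 students and the 30-percent rate come from the Lemon Grove numbers above, while the two ways of projecting turnover (simple addition versus compounding) and the function names are hypothetical choices, not the method used in the study.

```python
COHORT_SIZE = 300       # students in the program, from the post
ANNUAL_MOBILITY = 0.30  # share of students who are new each year, from the post


def new_students_additive(rate: float, years: int, cohort: int) -> int:
    """Assume each year replaces a fresh 30 percent of the original group."""
    share = min(1.0, rate * years)
    return round(share * cohort)


def new_students_compounding(rate: float, years: int, cohort: int) -> int:
    """Assume each year's movers are drawn from everyone currently enrolled,
    so some of the students who leave are themselves newcomers."""
    share = 1 - (1 - rate) ** years
    return round(share * cohort)


for years in (1, 2, 3):
    additive = new_students_additive(ANNUAL_MOBILITY, years, COHORT_SIZE)
    compounding = new_students_compounding(ANNUAL_MOBILITY, years, COHORT_SIZE)
    print(f"after {years} year(s): {compounding} to {additive} "
          f"of {COHORT_SIZE} students are new")
```

Under either assumption, at least half of the original students are gone after two years: roughly 153 of 300 if turnover compounds, and 180 of 300 if it simply adds up. By the third year of a three-year program, only a minority of the starting group would remain to be measured.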

Exclude the Differences

You could simply exclude the students who were not in the class for the whole year.  The state does that for computing standardized test scores for a school.

But what about teaching those students?  Teachers still have to adapt to new students coming into their classes. Some classes have high mobility rates, others do not. Some schools have high mobility rates, others do not. Some districts have high mobility rates, others do not.

How do you compare the workload and challenge for those teachers who have lots of changes with those who have few changes?

Student Disparity

Wouldn’t it be great if students moving from one school to the other were all learning the same thing at the same time? Standardize the instruction.  Business folks often offer this suggestion because standardization works in industry.

But students don’t perform at the same level even when taught the same material. Consider the experience of a 6th grade teacher in that Lemon Grove project. Halfway through the school year, she showed me a report from SuccessMaker, an online supplemental instruction tool that helps accelerate student learning, on the actual grade levels of her students based on ability.

Her 36 students ranged from grade equivalents of 1.4 to 8.2.  Obviously, her class had a mix of special education students at the low end and gifted students at the high end. Teach any single concept to that class and some students will be bewildered and unprepared while some will be well beyond you and ready for more difficult material.

How do you compare a teacher in a class with a wide range of understanding with one who has a narrow range?

What Skills are Tested?

Another oddity about standardized tests is what skills they measure. Most tests are multiple choice. Not messy descriptions like the problems we face in our lives.

Most teachers find students benefit from learning how to take tests. Not the subject matter, just the strategies to select an answer from multiple choices. Are we testing test-taking?

The test sheets contain bubbles that must be filled in completely with a pen or pencil. Even many lottery tickets have gone away from that skill. College courses rarely use bubble marking. Do you use bubble marking in your career?

Are Outcomes the Same for Students, Teachers and Schools?

Standardized tests form the foundation for the state accountability system for schools and districts.  To them, test scores matter.

But do the students face the same consequences as the teachers, schools and districts? The Palo Alto school that my daughter attended recently failed to make adequate yearly progress in its test scores. Palo Alto? How could that high-performing school fail?

Turns out those students want to get into college. Entrance requirements include Advanced Placement courses, which have end-of-course exams held in the same month as the California standardized tests. AP scores get you into college, CST scores don’t. So enough students skipped the CSTs that the school suffered while the students didn’t.

What if we measure teachers with tests that have no consequences for the students?  Don’t like your teacher, then tank your test score!

Measuring Teaching

For all these reasons, simplistic measures of student learning are ineffective for evaluating teachers. Test scores simply have too many problems to rely upon for comparisons about teaching.

What does work? Professional standards of teaching and observation of teaching behaviors consistent with those standards will work.

Things like asking good questions of students, inspiring curiosity and motivating inquiry about the subject. Things like differentiating the material based on what the child already knows, or doesn’t know. Those were the qualities of teachers that made a difference in my life. And I’m a Ph.D. who got 16 percent on my first algebra test!

That is not as easy as scanning test answer sheets for 500,000 students. But the better way to recognize good teaching is to see your child experience it.

