Monday, Oct. 6, 2008 | Year after year, Mary Beth Douglas has complained that students are misplaced in her math classes at De Portola Middle School, forcing her to switch students after school begins. Meanwhile her principal, Elizabeth Gillingham, has slogged through files to figure out which classes flourished and which faltered on standardized tests, spending her summers hunched over printed charts.

But something changed this summer. Using a new computer program that tracks individual students and their scores over time, Douglas could start sorting kids earlier, clicking through their records to see whose scores seemed too low for the class. Gillingham could see test scores broken down classroom by classroom at the click of her mouse, freeing her to deduce which teaching methods worked. She could see which classes struggled with vocabulary and which understood it — even dicing up data to see which teachers saw the largest gains — and could ask those teachers to share their techniques.

“We didn’t even have this information in the past,” Douglas said while Gillingham showed the new tool to teacher Sheila Wiener.

“It tells you the actual needs of each child,” Wiener said.

Proponents say San Diego Unified’s new system is vastly more sophisticated than measuring scores of an entire grade level or school, and helps them decide what works and what doesn’t in the classroom. Educators can zero in on specific teachers and how their students fared on tests. They can pull up scores for an individual student and target the weaknesses their tests reveal. And they can easily follow how a child scored over time — a task that was once as burdensome as a research project.

The system is controversial among educators who question the power of testing and fear their jobs or salaries could be hitched to whether test scores rise or fall in their classes. Such data have been used elsewhere to decide what teachers get paid, a bitterly contested practice called merit pay. Superintendent Terry Grier and the school board members say they want to use the information to help teachers, not to judge them, but suspicions are rife in a school district that is still adjusting to a new superintendent and a battery of changes in a brutal budget year.

“Teachers have been beaten up by data again and again,” said Aimee Guidera, director of the Texas-based Data Quality Campaign, which supports using more data to better teaching. “Rarely have we used it as a tool for improvement. It’s scary. We’re asking people to take risks and learn something new.”

Schools such as De Portola have previously judged their progress by tracking school-wide or grade-wide scores from year to year. That judges apples against oranges because they aren’t comparing the same children. When California schools compare how 2nd graders scored from year to year, they are actually comparing how one group of 2nd graders — this year’s class — scored compared to a previous group of 2nd graders.

“Did our 2nd graders make a year’s progress? Did they spend a year and make hardly any progress?” said former state Sen. Dede Alpert, who advocates using more data to gauge how schools and teachers impact their students. “We haven’t been able to tell the public or parents.”

That apples-to-oranges comparison can warp scores. If more children who struggle with English start attending a school, its scores would likely seem to plummet, even if the school did nothing differently. Critics say such false comparisons are a key problem with No Child Left Behind and its penalties for low scores, flogging schools that serve disadvantaged kids even if they improve scores for each child.
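To make the arithmetic concrete, here is a minimal sketch of how the two kinds of comparison can point in opposite directions. Every number in it is invented for illustration and is not San Diego Unified data, the student labels are hypothetical, and it assumes scores from both years sit on one common scale, which is not a given for real state tests.

```python
from statistics import mean

# Hypothetical scores, invented for illustration only.
# Assumption: both years are reported on a single common scale.

# Last year's class in a given grade (a different group of children).
last_year_class = [350, 340, 360, 330]

# This year's class in the same grade: student -> (own score last year, score this year).
# Students "A" and "B" stand in for newcomers who started well behind.
this_year_class = {
    "A": (300, 320),
    "B": (290, 315),
    "C": (340, 350),
    "D": (350, 365),
}

# Grade-wide comparison: this year's class average versus last year's class average.
this_year_average = mean(current for _, current in this_year_class.values())
cross_sectional_change = this_year_average - mean(last_year_class)

# Student-by-student comparison: each child measured against his or her own earlier score.
gains = {name: current - prior for name, (prior, current) in this_year_class.items()}
mean_gain = mean(gains.values())

print(f"Grade-wide change: {cross_sectional_change:+.1f}")
print(f"Average gain per student: {mean_gain:+.1f}")
```

In this made-up case the grade-wide average falls even though every child improved, which is the distortion critics describe.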

Now San Diego Unified schools can easily track how individual students and classes fare on tests. Gillingham can see which teachers boosted scores among English learners or students with disabilities, and which classes understood the scientific ideas of motion, buoyancy or the periodic table. Math scores reveal which ideas stumped students who have moved on to the next grade. Teachers can pull up reams of data on each student at their computers, even checking on the progress of students they taught years ago, to see what students have actually learned.

“You could see if a kid didn’t understand borrowing and carrying” in 3rd grade mathematics, said school board member and retired teacher John de Beck. “So even if I pass a kid because most of his homework is OK, the 4th grade teacher can see this kid can’t borrow and carry. They’ll work with him on that.”

The difference is dramatic. In years past a principal might have gotten a chart showing what percentage of each grade scored proficient on each test, broken down by race, socioeconomic status and other characteristics. They would also get a full roster with each child and their scores, and months later a summary that sorted out the percentage of each grade that understood specific concepts in math, science, English or social studies. That could tell principals, for instance, that their 7th graders had trouble with math.

But Gillingham said it gave them little help in understanding what to do. Months into the school year, another report might tell them what aspects of math were especially troublesome for 7th graders, but picking out which students had that problem was daunting. It made tests, already an aggravation for many teachers, merely a frustrating chore that rarely helped them improve their lessons.

The new data are also more timely. Teachers have complained that test scores are often calculated months after exams are taken, making them useless for children who have already moved on. Union President Camille Zombro said giving and preparing for exams devoured a whole month of her time at Baker Elementary School and gave her little information to tailor her teaching. Gillingham said the delays make it difficult to catch up if a child has missed a key concept in reading or math. The new system allows San Diego Unified to deliver data sooner and instantly uploads results from school district tests.

“We get scores during the year when we can do some good with the kids,” said Bruce McGirr, president of the administrators association. “It tells us things in October or January instead of waiting until August of the next year.”

The drab world of data has become the grounds for a surprising crusade. California and the federal government are pushing to track students and their scores more precisely. Education reformers demand more data more quickly to see which programs work and which don’t. Some want to impose the same test-based scrutiny to decide which teachers are effective. Frustration with the limited data used by No Child Left Behind to grade and penalize schools has fed the zeal for “longitudinal” systems that follow children and gauge what they gain over time.

“The whole field of education is changing very swiftly and becoming a much more data-driven system,” said Jane Hannaway, director of the Center for Analysis of Longitudinal Data in Education Research at the Urban Institute.

But critics say tests fall short of judging the complexity of what schools and teachers do. Children are also influenced by what happens at home and other factors outside school. Studies show that scores sink when influenza hits a school or even when a dog barks outside, casting doubt on tests’ validity. Testing is a “false measure” that commodifies students and gauges limited, regimented information, said Rich Gibson, professor emeritus of education at San Diego State University.

“It’s tragic that teachers are being forced through instruments of fear to teach to high-stakes standardized tests they know full well are designed to misanalyze their students,” said Gibson, who opposes standardized testing. “The key to education is having the freedom to make good decisions in the classrooms about how to reach into a child’s mind. That freedom is eradicated by people who depend on high-stakes standardized tests to noose children and their teachers and create an atmosphere where freedom is wiped out.”

Even data advocates such as Hannaway caution that tests are prone to errors and misuse. Narrow tests will force teachers to narrow their teaching, she said, and scholars are still learning how to untangle teacher impact from the myriad factors that affect children and their test scores. Nor are data readily transferable from year to year or grade to grade. The California scales that measure how students perform change from one grade to the next, flustering teachers who wonder if their students have improved their scores.

Because scales are different for each grade, “if you scored a 280 one year and a 290 next year, it doesn’t mean that you did better the second year,” said Karen Bachofer, executive director of standards, assessment and accountability. Data “is much more sophisticated than it was 10 years ago. But a lot of people have questions.”

Many of those questions center on merit pay and whether test scores are a fair and realistic measure of teachers. California law bans school districts from using standardized tests to evaluate teachers, and Grier has said data will be used to help teachers, not judge their effectiveness. De Beck said it would be unfair and inaccurate to judge teachers based on tests, and called it “the last use I want to see for” the data.

“Our focus is how we help the teachers,” said school board member Mitz Lee. “… If they think they will get in trouble, they will never share that they need help.”

That fear is palpable among many teachers who see a renewed focus on tests as detrimental to teaching and a baby step toward merit pay. Now that the numbers are available and can be pinned to teachers, they worry principals may use data to punish them. Douglas wondered aloud about whether she would be “tracked” by the school district. Other teachers were too nervous about their jobs to comment.

“These tests are a horrible assessment of what children can actually do,” Zombro said. “The only way to find out what a child can do is sit down and watch them and talk to them about it. Teachers don’t have time to do that kind of authentic assessment because they’re jumping through all these hoops.”

Schools must overcome teachers’ distrust and “use data not as a hammer, but as a flashlight,” said Guidera of the Data Quality Campaign. But Grier has not yet gained widespread trust from teachers and other staffers at San Diego Unified, which is still reeling from budget cuts and an ugly fight over teacher layoffs. Nor has he convinced the union that data will not become a tool for evaluating teachers. He is known as an advocate of merit pay from his work in North Carolina and recently mandated that principals attend a talk by statistician William Sanders, whose work is closely identified with merit pay.

“To bring up the whole idea of merit pay seems like pouring gasoline on the fire,” McGirr said.

Whether or not dollars are involved, linking test scores to teachers and students is already a hot topic. “If you don’t know what you are trying to address, how do you address the problem?” asked Mitz Lee. “It’s like walking blind in the dark.”

Please contact Emily Alpert directly with your thoughts, ideas, personal stories or tips. Or set the tone of the debate with a letter to the editor.
