Sunday, December 18, 2011

Test Scores Often Misused in Policy Decisions

This from the Huffington Post:

Education policies that affect millions of students have long been tied to test scores, but a new paper suggests those scores are regularly misinterpreted.

According to the new research out of Mathematica, a statistical research group, the comparisons sometimes used to judge school performance are more indicative of demographic change than actual learning.

For example: Last week's release of National Assessment of Educational Progress scores led to much finger-pointing about what's working and what isn't in education reform. But according to Mathematica, policy assessments based on raw test data are extremely misleading -- especially because year-to-year comparisons measure different groups of students.

"Every time the NAEP results come out, you see a whole slew of headlines that make you slap your forehead," said Steven Glazerman, an author of the paper and a senior fellow at Mathematica. "You draw all the wrong conclusions over whether some school or district was effective or ineffective based on comparisons that can't be indicators of those changes."

"We had a lot of big changes in DC in 2007," Glazerman continued. "People are trying to render judgments of Michelle Rhee based on the NAEP. That's comparing people who are in the eighth grade in 2010 vs. kids who were in the eighth grade a few years ago. The argument is that this tells you nothing about whether the DC Public Schools were more or less effective. It tells you about the demographic."
Those faulty comparisons, Glazerman said, were obvious to him back in 2001, when he originally wrote the paper. But Glazerman shelved it then because he thought the upcoming implementation of the federal No Child Left Behind Act would make it obsolete.

That expectation turned out to be wrong. NCLB, the country's sweeping education law, which has been up for reauthorization since 2007, mandated regular standardized testing in reading and math and punished schools based on those scores. As Glazerman and his coauthor Liz Potamites wrote, measures of student performance that contain severe and correctable errors are often used to make critical education policy decisions associated with the law.

"It made me realize somebody still needs to make these arguments against successive cohort indicators," Glazerman said, referring to the measurement of growth derived from changes in score averages or proficiency rates in the same grade over time. "That's what brought this about." So he picked up the paper again.

NCLB requires states to report on school status through a method known as "Adequate Yearly Progress." It is widely acknowledged that AYP is so ill-defined that it has depicted an overly broad swath of schools as "failing," making it difficult for states to distinguish truly underperforming schools. Glazerman's paper argues NCLB's methods for targeting failing schools are prone to error.

"Don't compare this year's fifth graders with last year's," Glazerman said. "Don't use the NAEP to measure short-term impacts of policies or schools."

The errors primarily stem from comparing the percentage of students proficient in a given subject from one year to the next -- but that comparison measures different groups of students each year, which can create false impressions of growth or loss.
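To see how that can happen, here is a minimal sketch with made-up numbers (the group names, enrollments, and proficiency counts are all hypothetical, not taken from the Mathematica paper). Each group of students is proficient at exactly the same rate in both years, so nothing about the school's effectiveness changes. Only the mix of students in the grade shifts, yet the grade-level proficiency rate moves.

```python
# Hypothetical illustration of a successive-cohort comparison.
# Each entry maps a student group to (students enrolled, students proficient).
# The numbers are invented for this sketch.

def proficiency_rate(cohort):
    """Share of all students in the cohort who are proficient."""
    enrolled = sum(n for n, _ in cohort.values())
    proficient = sum(p for _, p in cohort.values())
    return proficient / enrolled

def group_rates(cohort):
    """Proficiency rate within each group, taken separately."""
    return {name: p / n for name, (n, p) in cohort.items()}

# Last year's fifth graders: 60 students in group A, 40 in group B.
fifth_grade_last_year = {"group_a": (60, 48), "group_b": (40, 16)}

# This year's fifth graders: a different set of children, different mix.
fifth_grade_this_year = {"group_a": (40, 32), "group_b": (60, 24)}

rate_last = proficiency_rate(fifth_grade_last_year)
rate_this = proficiency_rate(fifth_grade_this_year)

print("Within-group rates, last year:", group_rates(fifth_grade_last_year))
print("Within-group rates, this year:", group_rates(fifth_grade_this_year))
print(f"Grade-level rate, last year: {rate_last:.0%}")
print(f"Grade-level rate, this year: {rate_this:.0%}")
print(f"Apparent change: {rate_this - rate_last:+.0%}")
```

In this toy case the headline number falls even though every group performs exactly as it did the year before, which is the kind of false impression the paper warns about.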

Hat tip to the Commish.

1 comment:

Anonymous said...

As someone who has been saying this for years (you can't make significant instructional decisions by comparing one year's students to the following year's), I can't believe this may finally be taking hold in some circles. The state for years has been ramming this down schools' throats by expecting shortsighted school improvement plans based on state and national assessment scores. Similarly, how can we be canning school administrators based on test scores after only a year or two of leadership?

What is so sad is that we have been playing this year-to-year, chase your instructional tail, false application of data for almost two decades now and what have we got to show for it?

On the same note, if this article's position is to be embraced regarding assessment, why are we spending so much time and money on it? I am sure we could have spent half as much and the commissioner and other superintendents could have still gotten their free trips on the vendor's dime.