New Research on Instability of Teacher Evaluation Metrics

Current Teacher Evaluation Metrics are Unstable

The Education Policy Analysis Archives has just published (Oct 6, 2014) “The Stability of Teacher Performance and Effectiveness: Implication for Policies Concerning Teacher Evaluation,” by Morgan, Hodge, Trepinski and Anderson.  This well-designed and carefully completed study examined how stable value-added scores and teacher observation scores were from year to year, whether the two sets of scores were correlated with each other, and how school ratings affected the results.

Patterns Are Problematic for Evaluation Purposes

The researchers sorted the value-added and observational ratings into four trend categories: invariant, positive trend, negative trend, and scatter.  For value-added trends, 6.1% of teachers had invariant ratings, 18.9% had positive trends, 21.1% had negative trends, and 62.9% were in the scatter category, meaning their ratings were unpredictably high or low from year to year.  The observational ratings were also erratic, though somewhat less so, with 10.6% invariant, 43.9% positive, 14.4% negative, and 31.1% scattered.
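To make the four categories concrete, here is a minimal sketch in Python of how a teacher's year-by-year ratings could be sorted into them. The study's exact classification rules are not given here, so the rule below is my own assumption for illustration: it looks only at whether ratings went up, down, both, or neither. The function names and the sample ratings are hypothetical, not data from the study.

```python
from collections import Counter


def classify_trend(ratings):
    """Label one teacher's year-by-year ratings with a trend category.

    Assumed rule for illustration only (not the study's method):
      invariant - every year has the same rating
      positive  - ratings never fall and rise at least once
      negative  - ratings never rise and fall at least once
      scatter   - ratings move both up and down
    """
    if len(ratings) < 2:
        raise ValueError("need at least two years of ratings")
    # Year-over-year changes between consecutive ratings.
    changes = [later - earlier for earlier, later in zip(ratings, ratings[1:])]
    rose = any(change > 0 for change in changes)
    fell = any(change < 0 for change in changes)
    if rose and fell:
        return "scatter"
    if rose:
        return "positive"
    if fell:
        return "negative"
    return "invariant"


def category_shares(ratings_by_teacher):
    """Return the percentage of teachers that fall in each category."""
    counts = Counter(classify_trend(r) for r in ratings_by_teacher.values())
    total = len(ratings_by_teacher)
    return {category: 100 * n / total for category, n in counts.items()}


# Made-up ratings for four hypothetical teachers over three years.
example = {
    "teacher_a": [3, 3, 3],  # invariant
    "teacher_b": [2, 3, 4],  # positive
    "teacher_c": [4, 3, 2],  # negative
    "teacher_d": [2, 4, 1],  # scatter
}
print(category_shares(example))
```

Under this assumed rule, a teacher lands in the scatter category as soon as the ratings move in both directions, which is the pattern the study found for most value-added ratings.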

On the relationship between school performance and these ratings, they write: “on average the observational ratings were the same as the value-added ratings in schools performing average or above average.  On the other hand, schools that were poor performing tended to have higher mean observational ratings than value-added ratings for their teachers.  This indicates that on average the observational ratings may overestimate teacher effectiveness in lower performing schools.”

Stability of Metrics Is Important for Teacher Evaluation

The value-added measures being forced into teacher evaluation are regularly shown by researchers to be unstable.  This article is not the only research demonstrating this problem, as the references at the end of the article show.  (I have some of this research under my Resources tab, in Think Tanks.)  If 63% of teachers change their effectiveness ratings from year to year, going up and down in a random pattern, does this mean teacher performance is this erratic?  I don’t think so.  I think, rather, that the measures are inadequate to capture all the variables that influence teacher effectiveness from year to year.  Trained teachers are not that erratic.  The measures are simply statistically inadequate for this task, as critics of value-added have suggested for several years.  These measures were never designed for teacher evaluation; they are an adaptation of statistical procedures by folks who found a way to suggest a new use for their mathematical models.  So why are so many policy-makers bent on using such metrics to evaluate teachers?  It’s wrong-headed and detrimental to public education.

This article is a great example of independent research that seeks to clarify one of the greatest movements in education today.  Please check it out.  Then join those of us who oppose the proponents who profit from value-added policies and who seek to denigrate public education by forcing illegitimate measures on public educators, measures that create false impressions of ineffectiveness.  We need to speak up for public education and make our voices heard.

