Value Added and Teacher Evaluation

Now that value-added (VA) measures are commonplace throughout the country, what can schools do to incorporate them effectively into teacher evaluation? This is the central question that classroom teachers and building principals must answer as they meet the federal funding requirements that are driving this model. I’m enthusiastic about using data for school improvement, and it has been central to the last 15 years of my career as data warehouses and analysis tools for classroom teachers have emerged. But states have adopted VA assessments in widely different ways, and the impact of a single round of testing on a teacher’s evaluation also differs from state to state. This variance is problematic at several levels.

Critics of VA point out that student scores can vary from year to year, which is one source of variance. If you read blogs by teachers, assessment critics, or progressive school improvement advocates, you’ll find many examples of students who scored highly one year and significantly lower the next, changes that could not be readily explained by the effects of teachers or schools. Scores of individual teachers also vary across classes in the same year and from year to year, and examples of this variance fill the pages of critical analyses of VA methodology. A teacher of a single subject can show significant variance across multiple sections of the same course in the same year. A teacher might also score highly for a year or two and then plummet the following year. The statistical models do not explain or account for such variance particularly well. I’ve identified several research analyses on these topics in the Research section of this site.

Student Growth Percentiles, a variant of VA analysis, compare each student’s scores with those of similar students and rank students according to their growth. When the student scores are aggregated by class, the teacher can be given a growth score that is intended to account fairly for the population in the teacher’s class. The system weights the aggregated class groupings against similar students, so that teacher X, with 10 ESL students and 20 regular students, has the scores of the ESL students compared to those of other ESL students in the state. This is intended to lessen the impact of second-language students on the teacher’s VA score. However, an informed Internet reader will still find examples of teachers in New York (which is using Student Growth Percentiles for the first time this year) who are rated as outstanding by principals, parents, and colleagues but who have a mediocre VA score. So rating teachers on VA scores, even with Student Growth Percentiles, is fraught with problems.
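
To make the idea concrete, here is a minimal sketch of the peer-comparison logic described above. It is my own simplification, not the model any state actually runs: operational Student Growth Percentile systems rely on more elaborate statistics that condition on prior test scores. The field names (`peer_group`, `gain`, `teacher`) and both function names are hypothetical, and `gain` is assumed to be this year’s score minus last year’s.

```python
from collections import defaultdict
from statistics import median


def growth_percentiles(students):
    """Rank each student's gain against students in the same peer group.

    `students` is a list of dicts with the hypothetical keys
    'id', 'peer_group', 'gain', and 'teacher'.
    Returns {student id: percentile between 0 and 100}.
    """
    gains_by_group = defaultdict(list)
    for s in students:
        gains_by_group[s["peer_group"]].append(s["gain"])

    percentiles = {}
    for s in students:
        peers = gains_by_group[s["peer_group"]]
        below = sum(1 for g in peers if g < s["gain"])
        ties = sum(1 for g in peers if g == s["gain"])
        # Mid-rank percentile: tied students share the same rank.
        percentiles[s["id"]] = 100.0 * (below + 0.5 * ties) / len(peers)
    return percentiles


def teacher_median_growth(students, percentiles):
    """Aggregate student percentiles into one median score per teacher."""
    by_teacher = defaultdict(list)
    for s in students:
        by_teacher[s["teacher"]].append(percentiles[s["id"]])
    return {teacher: median(values) for teacher, values in by_teacher.items()}
```

Because each student is ranked only against peers in the same group, an ESL student’s gain is never compared with the gain of a non-ESL student. Note that with small peer groups or small classes a single student can move the median a long way, which is one way the year-to-year swings critics describe can arise.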

The widely adopted but simplistic solution is to use a VA score (or any other assessment score) as only a part of a teacher evaluation. Using a summary test score in a subject area for only part of the evaluation, while making classroom observations a large or even the largest component, is a commonly proposed means to lessen the impact of assessments, and a tacit recognition that there’s a problem with this unproven system. States have jumped into this complex process without any national research on what works, without agreement on how to use student test results to evaluate teachers, and without any real analysis of whether this has affected the quality of student learning where the programs have been implemented. We are dealing with a core shift that is virtually ungrounded in research. Were we to try this in other fields, we would be stopped by regulators and common sense at every level.
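
The arithmetic behind the partial-weight approach described above is just a weighted average. The sketch below is illustrative only: the function name, the 0 to 100 scales, and the weights are my assumptions, not any state’s actual formula.

```python
def composite_rating(va_score, observation_score, va_weight):
    """Blend a VA score and an observation score, both on a 0-100 scale.

    `va_weight` is the share of the evaluation given to the VA score;
    the remainder goes to classroom observations. All names and scales
    here are hypothetical.
    """
    if not 0.0 <= va_weight <= 1.0:
        raise ValueError("va_weight must be between 0 and 1")
    return va_weight * va_score + (1.0 - va_weight) * observation_score
```

For a teacher with a VA score of 40 and an observation score of 90, a 20 percent VA weight gives a composite of 80, while a 50 percent weight gives 65. The lower the weight, the less a volatile VA score can move the final rating, though the weighting does nothing to make the VA score itself more reliable.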

The theme of ASCD’s November 2012 edition of Educational Leadership is “Teacher Evaluation: What’s Fair? What’s Effective?” This edition presents a general overview of the current state of teacher evaluation, and particularly discusses value-added assessment measures. The range of articles reflects all the concerns I’ve mentioned. I particularly like several articles that offer recommendations about what to do with the results of all the testing and teacher evaluation that is going on. The common threads are very clear: lessen the impact of summative tests, and devise a means to use more formative testing throughout the year in a feedback loop centered on teachers identifying what kids need to learn. Use teacher observations as a means to coach teachers toward improving their instruction. This sounds easy, but in general principals are untrained to do so and have no time to spend in classrooms anyway. Those who have never been principals or building-level administrators with responsibilities for day-to-day operations and teacher evaluation simply don’t have a clue about how problematic new teacher evaluation requirements are.

Educational Leadership isn’t exactly educational research per se, but a few of the articles come from folks with solid research backgrounds, and in the references at the end of the articles are a few meaty morsels worthy of further reading if you have access to a decent library where you can find the books and journals.

My favorite article is “How to Use Value-Added Measures Right,” by Matthew Di Carlo. He offers suggestions on how VA can be an appropriate element of teacher evaluation. The importance of getting this right is summarized by his comments in the first few paragraphs. He writes: “…there is virtually no empirical evidence as to whether using value-added or other growth models–the types of models being used vary from state to state–in high-stakes evaluation can improve teacher performance or student outcomes. The reason is simple: It has never really been tried before.”

Your health insurance is unlikely to pay for experimental treatments. Yet we have mandates all over America to implement a diverse range of experimental treatments for teacher evaluation. It would have made sense to try such programs in small-scale pilots across several states in multiple district contexts. Instead, we are demoralizing the profession by using unproven and unreliable means to evaluate performance, and then often publishing the results and destroying individual teachers.

You can find these articles on the ASCD website, though if you are not an ASCD member you may only be able to read abstracts. If you are a member, you’ll have the journal in your mailbox. Share your copy with colleagues, or check it out online. If you’re interested in or affected by VA assessments, Di Carlo’s bibliography is a good one, and I’ll be adding some of the references he lists to my own research page here at k12edtalk.