Race to the Top – A “Race … in the Right Direction?”

New York schools, despite having one of the largest percentages of students in poverty, consistently rate highly in Quality Counts, from Education Week.  The state was a second-round winner of $700 million in Race to the Top funds from the Feds, and aggressively rammed new legislation through the state legislature to qualify.  The unintended consequences are now being felt by those responsible for implementation.  Teacher morale is down, the costs of implementation are proving to be unfunded mandates several times greater than what was gained from ‘winning’ the Race, and a tax cap pushed through by the Governor is wiping out district capacity to raise taxes even where the public would be willing to pay more to maintain quality schools.  Moreover, some estimates suggest that 40% of New York districts face bankruptcy within the next few years.

Dr. Kenneth Mitchell, the Superintendent of the South Orangetown Central School District in Rockland County, NY, contributed a wonderful analysis of the unintended consequences of Race to the Top in New York, published by the Center for Research, Regional Education and Outreach at the State University of New York at New Paltz.  Open ‘Discussion Brief 8’ from this link to read the full article.  With the cooperation of 18 school districts in Rockland, Westchester and Putnam Counties, the three counties bordering New York City to the north, Dr. Mitchell produced an eye-opening analysis of the impact of RTTT in this relatively wealthy region.  These three counties are among the richest in New York, and the 18 districts are representative of the range of wealth found in the region’s 60 districts, though I think they would average somewhat above the mean in community wealth because some of the larger, lower-income districts did not contribute data to the study.

Financially, Dr. Mitchell reports that the four-year return to the 18 districts from RTTT funds will be $520,415, while the expense to districts is estimated at $6,472,166, a deficit of $5,951,751 and an unfunded mandate of nearly $400 per pupil, to be funded by local taxpayers.  And because of the tax cap, which now requires a 60% supermajority to pass a budget above the 2% limit, very few districts will be able to fund the mandates without substantial cuts to programs somewhere.  The cost of implementing the state’s new Annual Professional Performance Review, or APPR (teacher and principal evaluation), by itself represents a 3% increase in local costs to pay for the required testing and evaluation training.  That figure does not include the costs of redesigning instruction for the Common Core curriculum, which requires new texts and instructional materials, shifts of content from grade to grade, and additional teacher professional development.
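To make the arithmetic concrete, here’s a quick back-of-the-envelope check.  This is a sketch of my own, not from the brief; the enrollment figure in particular is inferred from the per-pupil number, which the brief does not break out here:

```python
# Back-of-the-envelope check of the brief's four-year figures
# for the 18 participating districts.
rttt_revenue = 520_415   # four-year RTTT funds received
rttt_costs = 6_472_166   # estimated four-year implementation costs

deficit = rttt_costs - rttt_revenue
print(f"Unfunded gap: ${deficit:,}")  # $5,951,751

# "Nearly $400 per pupil" implies roughly 15,000 pupils across the
# 18 districts (my inference; enrollment is not stated in the brief).
implied_enrollment = deficit / 400
print(f"Implied enrollment: ~{implied_enrollment:,.0f} pupils")
```

The numbers are internally consistent: the gap divided by about 15,000 pupils lands at just under $400 per student, every dollar of it a local obligation.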

The effects, as gleaned from the 18 districts’ responses, are disturbing.  Despite Race to the Top’s promotion as a school improvement program, districts will see staff cuts and larger class sizes, and non-mandated programs will be cut.  Districts are deferring maintenance despite a slow economy that would otherwise make this a wonderful opportunity for cost-effective pricing.  Priorities are shifting away from instructional services that districts have developed over years of successful local control in order to fund the external mandates.  Internal professional development, and the staff who provide it, are being cut, and training is focusing on teacher evaluation requirements rather than classroom instruction.  Some districts will be required to hire additional supervisory staff to complete the evaluation mandates, which require more time than existing programs.  Curriculum is narrowing as districts prepare for extensive testing that will be used for teacher accountability.  And finally, quoting directly from the brief: “…the hidden costs may be greater than the outlay in dollars.  Teachers and administrators, stressed by the rapid change, the demand for accountability via the new testing and observation requirements, and anxieties about receiving low scores, are very likely to abandon initiatives that may be innovative and beneficial for preparing the next generation, but are out of alignment with a narrowed professional agenda for staying within the ‘Effective’ range on the APPR.”

Dr. Mitchell proposes some reasonable shifts that New York politicians could make to back off and redirect this initiative.  He also identifies several sophisticated research groups that question both the use of tests for teacher evaluation, as Race to the Top demanded, and the massive national shift to the Common Core curriculum.  Both changes are untested and lack a research base to justify their depth.  And since the changes were imposed by the legislature in New York, and by legislatures racing to get federal funds across the nation during the economic slowdown, they are out of the hands of state education departments even if those leaders were inclined to back off in the first place.  Since they were generally the designers of the system, they are unlikely to be responsive; it’s full steam ahead, into the chaos that will ensue.

Download the article PDF at the link above, and see the research citations in the “Works Cited” link as well.

Value Added and Teacher Evaluation

Now that value-added (VA) measures are commonplace throughout the country, what can schools do to incorporate them effectively into teacher evaluation?  This is the central question that classroom teachers and building principals must answer as they meet the requirements of the federal funding that is driving this model.  I’m enthusiastic about using data for school improvement; it has been central to the last 15 years of my career as data warehouses and analysis tools for classroom teachers have emerged.  But the way states have adopted VA assessments varies greatly, and so does the impact of a single round of testing on teacher evaluations.  This variance is problematic at several levels.

The critics of VA point out that student scores can vary from year to year, one source of variance.  Read teacher blogs, assessment critics, or progressive school improvement bloggers on how VA assessments are used, and you’ll find many examples of students who scored highly one year and significantly lower the next, changes that could not be conveniently explained by the effects of teachers or schools.  Scores of individual teachers also vary, both across classes in the same year and from year to year, and examples of this variance fill the pages of critical analyses of VA methodology.  A teacher of a single subject can show significant variance across multiple sections of the same course in the same year, or can score highly for a year or two and then plummet the following year.  Such variance is not explained or accounted for particularly well in the statistical models.  I’ve identified several research analyses on these topics in the Research section of this site.
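To see how much of this churn can come from sampling noise alone, consider a toy simulation of my own.  The class size and noise level are invented for illustration and are not drawn from any study cited here:

```python
import random

# Toy illustration: even if a teacher's true effect never changes,
# the class-average growth bounces around from year to year purely
# because each class is a small sample of noisy student scores.
random.seed(1)

true_teacher_effect = 0.0  # this hypothetical teacher is exactly average
class_size = 25

for year in range(1, 6):
    # Each student's measured growth = teacher effect + student-level noise
    scores = [true_teacher_effect + random.gauss(0, 10) for _ in range(class_size)]
    class_mean = sum(scores) / class_size
    print(f"Year {year}: class mean growth = {class_mean:+.1f}")
```

With student-level noise of 10 points, the class mean swings by several points across years with no change in teaching at all, which is exactly the kind of movement that gets attributed to the teacher.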

The concept of Student Growth Percentiles (SGP), a variant of VA analysis, is to compare similar students’ scores and rank students according to their growth.  When the student scores are aggregated by class, the teacher can then be given a growth score that is intended to fairly account for the population in that teacher’s class.  The system compares each student to similar students, so that teacher X, with 10 ESL students and 20 general education students, has the ESL students’ scores compared to those of other ESL students in the state.  This is intended to lessen the impact of second-language students on the teacher’s VA score.  However, an informed Internet reader will still find examples of teachers in New York (which is using Student Growth Percentiles for the first time this year) who are rated as outstanding by principals, parents, and colleagues yet receive mediocre VA scores.  So rating teachers on VA scores, even with Student Growth Percentiles, is fraught with problems.
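For readers who want to see the mechanics, here is a deliberately simplified sketch of the growth-percentile idea.  This is not New York’s actual model, which uses a far more elaborate statistical procedure; the sample data, the subgroup definitions, and the median aggregation are my own simplifications:

```python
from collections import defaultdict
from statistics import median

# Each record: (teacher, subgroup, prior_score, current_score).
# "Subgroup" stands in for "similar students" (e.g., ESL vs. general ed).
students = [
    ("Teacher X", "ESL", 420, 445), ("Teacher X", "ESL", 410, 430),
    ("Teacher X", "GenEd", 500, 520), ("Teacher X", "GenEd", 510, 515),
    ("Teacher Y", "ESL", 415, 425), ("Teacher Y", "GenEd", 505, 540),
    # ...in practice, every tested student in the state
]

# 1. Compute each student's raw growth.
records = [(t, g, cur - prior) for t, g, prior, cur in students]

# 2. Pool growth values by subgroup so students are ranked against peers.
by_group = defaultdict(list)
for t, g, growth in records:
    by_group[g].append(growth)

def growth_percentile(group, growth):
    """Percentile rank (0-100) of a growth value within its peer group."""
    peers = by_group[group]
    below = sum(1 for p in peers if p < growth)
    return 100 * below / len(peers)

# 3. Aggregate: a teacher's score is the median of their students' SGPs.
by_teacher = defaultdict(list)
for t, g, growth in records:
    by_teacher[t].append(growth_percentile(g, growth))

for teacher, sgps in by_teacher.items():
    print(teacher, "median SGP:", round(median(sgps), 1))
```

Even in this toy version you can see the fragility: a teacher’s score depends entirely on how a handful of students happen to rank against their peer groups in a given year.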

The widely adopted but simplistic solution is to use a VA score (or any other assessment score) as only part of a teacher evaluation.  Using a summary test score in a subject area for only part of the evaluation, with classroom observations as a large, or the largest, component is a commonly proposed means of lessening the impact of assessments, and a tacit recognition that there’s a problem with this unproven system.  States have jumped into this complex process without any national research on what works, without agreement on how to use student test results to evaluate teachers, and without any real analysis of whether the programs have affected the quality of student learning where they have been implemented.  We are dealing with a core shift that is virtually ungrounded in research.  Were we to try this in other fields, we would be stopped by regulators and common sense at every level.
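As a concrete illustration of what “only a part” looks like, here is a hypothetical composite calculation.  The weights and the 0–100 component scales are invented for this sketch; they echo the multi-component structure many states adopted but are not any state’s actual formula:

```python
# Hypothetical composite evaluation score. The weights and the 0-100
# component scales are illustrative only, not any state's actual rule.
WEIGHTS = {
    "state_growth": 0.20,    # VA / growth score from state tests
    "local_measures": 0.20,  # locally selected assessments
    "observations": 0.60,    # classroom observation rubric
}

def composite_score(components: dict) -> float:
    """Weighted sum of component scores, each on a 0-100 scale."""
    return sum(WEIGHTS[name] * score for name, score in components.items())

# A teacher with a mediocre growth score but strong observations:
teacher = {"state_growth": 45.0, "local_measures": 70.0, "observations": 88.0}
print(f"{composite_score(teacher):.1f}")  # 45*0.2 + 70*0.2 + 88*0.6 = 75.8
```

Under these illustrative weights, a teacher with a weak growth score but strong observations still lands at 75.8; that damping effect is exactly what the composite approach is meant to provide, and exactly why it concedes that the growth score alone can’t be trusted.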

ASCD’s November 2012 edition of Educational Leadership is themed Teacher Evaluation: What’s Fair? What’s Effective?  This edition presents a broad overview of the current state of teacher evaluation, with particular attention to value-added assessment measures.  The range of articles reflects all the concerns I’ve mentioned.  I particularly like several articles offering recommendations about what to do with the results of all the testing and teacher evaluation that is going on.  The common threads are very clear: lessen the impact of summative tests, devise a means of using more formative testing throughout the year in a feedback loop centered on teachers identifying what kids need to learn, and use teacher observations to coach teachers toward improving their instruction.  This sounds easy, but in general principals are untrained to do so and have no time to spend in classrooms anyway.  Those who have never been principals or building-level administrators with responsibilities for day-to-day operations and teacher evaluation simply don’t have a clue about how problematic the new teacher evaluation requirements are.

Educational Leadership isn’t exactly educational research, but a few of the articles come from folks with solid research backgrounds, and the references at the end of the articles contain a few meaty morsels worthy of further reading if you have access to a decent library where you can find the books and journals.

My favorite article is “How to Use Value-Added Measures Right,” by Matthew Di Carlo.  He offers suggestions on how VA can be an appropriate element of teacher evaluation.  The importance of getting this right is summarized by his comments in the first few paragraphs.  He writes:  “…there is virtually no empirical evidence as to whether using value-added or other growth models–the types of models being used vary from state to state–in high-stakes evaluation can improve teacher performance or student outcomes.  The reason is simple: It has never really been tried before.”

Your health insurance is unlikely to pay for experimental treatments, yet we have mandates all over America to implement a diverse range of experimental treatments for teacher evaluation.  It would have made sense to try such programs in small-scale pilots across several states and multiple district contexts.  Instead, we are demoralizing the profession by using unproven and unreliable means to evaluate performance, and then often publishing the results and destroying individual teachers.

You can find these articles on the ASCD website, though if you are not an ASCD member you may only be able to read abstracts.  If you are a member, you’ll have the journal in your mailbox.  Share your copy with colleagues, or check it out online.  If you’re interested in or affected by VA assessments, Di Carlo’s bibliography is a good one; I’ll be adding some of the references he lists to my own research page here at k12edtalk.