Why the Opposition to Testing and the Common Core?

I spent the last 20 years of my career dealing with data as a means to improve instruction, and I coached lots of people about how to make sense of data, standardized and local, as a useful component of daily instruction.  As teachers or administrators, we can’t know what our students have mastered without good tests, whether they are formative tests that are not graded, unit tests, interim benchmark tests, or standardized accountability tests.  In my own state of New York, there is a rising movement among some parents, some of whom are teachers, to opt out of state NCLB tests.  Opposition to ‘national’ testing and the Common Core is growing in other states, supported by various groups who are concerned about increasing federal influence on local educational policies, about costs, and about the misuse of tests for purposes other than those they were designed for.

The two national testing consortia are talking about designing the tests to improve their utility as vehicles for instructional improvement, but whether these goals will be a part of the basic package of new tests or available at additional cost to local districts remains a bit fuzzy.  States are beginning to question the cost-benefit balance of the programs–both consortia are designing online testing while many school districts, including the low-income districts that have always been the targets of the accountability movement, are going to struggle to pay for the Internet infrastructure and hardware required to support online testing.

So in broad terms, the growing opposition to the Common Core and the national testing programs represents multiple issues of concern to a growing variety of stakeholders, and they overlap significantly, making this a very complex situation.  Let’s review some of them here.

1.  Use of data for instructional improvement.  In New York, where at one time all state-sponsored tests were released publicly after testing was complete, we could use the results for instructional improvement planning.  We had specific information about how the questions aligned to standards, we could see the question, and we had a p-value which told us how difficult the item was state-wide.  This was very useful for teachers, but that’s gone now that NCLB tests are secure.  Without seeing the questions, the rest of the information (how I did compared to anyone/everyone else) provides very poor instructional information.  I can still see whether I did well or poorly on a particular performance indicator, but I don’t have decent details about exactly what my kids have missed and I have to guess about why.  If I am in a district that has purchased commercial assessments that give me access to the questions, I have this capability during the school year, but many publishers’ tests are also secure and I have to rely on a generic description of academic trends that isn’t particularly useful.  I get far more information on what a student’s weaknesses are when I can review the test question and analyze the wrong-answer responses as I plan interventions.
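
For readers who want to see what that item-level information looks like, here is a minimal sketch in Python.  It assumes the classical definition of an item p-value (the proportion of students who answered the item correctly) and tallies which wrong answers students chose.  The ten responses, the answer key, and the function names are all made up for illustration; they are not New York State data or any publisher’s format.

```python
from collections import Counter

def item_p_value(responses, key):
    """Classical item difficulty: share of students choosing the keyed answer."""
    if not responses:
        raise ValueError("no responses to analyze")
    return sum(1 for r in responses if r == key) / len(responses)

def distractor_counts(responses, key):
    """Tally how often each wrong answer (distractor) was chosen."""
    return Counter(r for r in responses if r != key)

# Hypothetical data: ten students answer one multiple-choice item keyed "B".
responses = ["B", "C", "B", "C", "A", "B", "C", "B", "D", "C"]
key = "B"

p = item_p_value(responses, key)
wrong = distractor_counts(responses, key)

print(f"p-value (proportion correct): {p:.2f}")
print("wrong answers chosen:", dict(wrong))
```

In this invented example, as many students chose one particular wrong answer as chose the right one.  That is exactly the kind of pattern a teacher can act on only when the question and its answer choices are available for review.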

2.  Purpose of the tests.  Critics of NCLB testing write about this regularly.  The tests were not designed for either instructional improvement or teacher evaluation.  They have been co-opted by politicians and businessmen as a means to promote agendas other than school improvement.  Psychometricians who are not involved in developing or marketing these tests have written extensively about their concerns when they are used for high stakes teacher evaluation, or for determining student placement.  We don’t do the latter in NY, as far as I know, though there have been some examples of placement consequences for kids–it’s a local issue.  But in other states, testing occasionally has greater impact–promotion or retention, acceptance in accelerated programs or placement in remediation–based on results.  Often the uses of results don’t match the psychometric properties of the tests, which is problematic.  And in New York, where the state says its own tests count for only 20 of 100 points in a teacher evaluation, the rubric for the 100 points creates situations in which the 20 points on the state test can override the other 80 points.  This means everything about New York’s intricate evaluation systems can be blown away by a weak performance on the state 20% measure, so that this 20% can, for some teachers, amount to the only measure of teacher quality that counts.

3.  Lack of transparency.  I hope this issue will go away with time.  It’s rather difficult to get information about the actual behavior of state tests–how did the results break out by student subgroup?  By district demographics?  By years of teacher experience?  By class size?  By SES factors?  How did low-income kids do in a district where they are 6% of the population compared to a big city, where they are a majority?

Moreover, everyone involved in rolling out new teacher assessments and new academic goals of the Common Core has been overworked and hard-pressed to clarify what is coming.  As we heard repeatedly from the highest levels of New York State Education officials early on, these changes are like an airplane which is being constructed during flight.  Sadly, this is about the closest thing to transparency I can think of early in the process–there wasn’t much more to be said.  They didn’t know how to get from where they started to the intended goal.

The proper answer to such comments from school boards and the public was often something like this:  “Where is the FAA?  Who approved the takeoff of this untested aircraft in the first place?  There’s too much being done here with no piloting, no research, no shakedown flights.”  But this concern was shoved under the rug.  New York adopted these changes to get $700 million of federal funds over 4 years, which was less than half a percent of the per pupil spending on public education in the state.

And this is not just a New York issue, it’s national.  We don’t usually do this kind of take-off in other arenas, but it happens too often in education.  Let’s try the next new idea that some expert says is the solution to our problems, while we ignore the fact that it’s an untested and unproven program or methodology.  Did we know the Common Core was a singular organizational initiative from an educational think tank hired by the Council of Chief State School Officers which has, in effect, unilaterally influenced the direction of public education?  Would truly empowered state education officials around the nation have adopted something this massive and unproven if they were not required to do so by other political forces, or if substantial federal funding during an economic downturn didn’t bribe them into agreement with Race to the Top conditions?  Would the public have found this a good use of funds if it were clear that there’s no evidence it would work?

4.  New teacher and principal evaluation systems.  A few districts in my region of New York that closely monitor the new evaluation regulations are concerned about how their honors class teachers fared.  Those teachers have all the high-performing kids, and the districts think their special ed teachers got higher growth scores on the state testing measures because a point or two of improvement at the lowest level of performance is easier to get than increases among students already at the top.  This is one of the general criticisms of accountability programs that use any version of growth or value-added scoring.  From state to state, have the data been made available to independent researchers for an objective review?  Do we have multivariate analyses available to look deeply at the results?  If not, why not, or when will this happen?  And if we do, so what?  Most states are about to change their tests again, from what they have been using to the new PARCC or Smarter Balanced consortia tests under development.  So the nature of available data is changing from last year, before the Common Core curriculum implementation, to Common Core-based testing, and then in two years, to tests from one of the two consortia.  This fundamentally means three versions of accountability tests in as few as 4 years in many states, making comparisons of student achievement a statistical challenge for accountability purposes, to say the least.

5.  Narrowing of the curriculum.  How many art, music, drama, and similar classes are gone, replaced by supports for ELA and math?  Why does the edu/politico establishment ignore the evidence about the utility of arts and music education and their connection to math and science success?  How many kids have no more recess because they are scheduled for support time?  How many schools have dropped career education options in high schools?  How have the fiscal pressures of the past few years forced districts to make narrowing curricular decisions because of the fear of poor test results?  Is the focus on ELA and math appropriate for all students?

6.  What is college and career ready, anyway?  In the Atlantic Monthly of October 2012, Dana Goldstein wrote an important feature called “The Schoolmaster” about David Coleman, credited by many as the leading creator of the Common Core State Standards.  A nonprofit Coleman founded, Student Achievement Partners, provided the intellectual basis to the Council of Chief State School Officers.  If Goldstein’s work is as accurate as it feels, he is far more important than our Secretary of Education.  He clearly has had more influence on the direction of US education than any other single individual, likely in our history.  And he’s now the head of the College Board, where his concepts of college and career readiness could transform the nature of SAT tests in the near future.

I happen to agree with the concept of Common Core State Standards, and I think their emphasis on critical thinking skills is long overdue, but defining success with a narrowly defined concept of college and career readiness misses some important options for a substantial population of students.  Many educators think this focus rather significantly neglects the notion of career ready–we have been eliminating career and tech education all over, or transforming it into expensive tech honors programs at regional BOCES or Intermediate School or Educational Service Agency locations as career education begins to emphasize forensics and high tech.  So where will we get our plumbers and carpenters and cabinet makers and auto mechanics and house painters and landscape gardeners?

7.  Local control of education.  Here’s the philosophical issue of the day: Should states/local school districts have virtually given up their role in determining the direction of their children’s education?  Constitutionally, this is a state role (which all but one state, Hawaii, turn over largely to local school boards) and not a federal one.  Though the Common Core is an initiative of states working together, it was not an initiative that involved local professional educators from the beginning.  It was an initiative farmed out to Coleman’s nonprofit business and pushed by the business/political wing of K-12 reformers.  It gained support from those concerned about the generic failures in urban centers across the country, and it was then offered to the educational establishment throughout the nation as fundamentally a take-it-or-leave-federal-money-on-the-table proposition.  The claim that all the states were directly involved in these plans is technically correct, but practically speaking the results were top-down impositions, not bottom-up reforms.

8.  Lack of funds for professional development.  In New York, teacher and principal evaluation rules require districts to provide professional development to low-performing teachers before they can terminate them.  Given that schools have to upgrade/update technology infrastructure and hardware to prepare for online testing, and given a new state property tax cap, and given that increasing numbers of districts around New York are facing real economic stress and even bankruptcy, there won’t be any money to provide PD for low-performing teachers.  Even a mediocre lawyer can prevent the termination of a teacher because the district’s required support for a weak teacher will be missing.  So who thought this was a good idea?

Many states claim their test results will be used to guide professional development programs to improve the work of teachers.  However, it’s typical to observe the decline in funding for educational professional development nationwide.  Schools are spending the money on assessments, not on professional development, and the tests themselves, as noted earlier, are both secured and not designed for teacher use to promote instructional objectives.  So two forces disconnect assessments from teacher improvement–funding and inappropriate test design.

Nationally, the movement to tie teacher evaluations to test scores was fueled significantly by Federal Race to the Top eligibility requirements.  This is another ‘reform’ initiative fueled without a quality research base of support.  Proponents of testing, and those wanting to pry public money out of the hands of school boards so that it’s available to alternate commercial programs, have jumped on the novel idea that one can appropriately predict a student’s future success based on the scores of a teacher’s students.  The model is appealing, as it suggests an easy metric on which to judge the performance of schools and individual teachers.  But having followed the arguments closely for 10 years, I can say it isn’t working particularly well anywhere.

Today, many groups are finally responding to some or all of these concerns by pushing back.  Some of the parent arguments arise out of fear for the well-being of their children, some out of frustration at loss of local control, some out of growing awareness of rising assessment costs and little demonstrable efficacy in using tests for instructional improvement.  Some opponents are teachers who are fearful of adverse consequences to themselves.  Academics, politicians and the occasional state education official are more openly questioning the speed of implementation, the lack of piloting, the unintended consequences, the public relations disasters in states where Common Core based testing produces a drop in test scores (teachers are not yet trained in the new curriculum objectives), and the mismatch between test design and the use of test scores.

Does the push-back accomplish much?  To date, not really.  Does it make a statement that might force politicians to take a second look at the unintended consequences of parental opposition?  The consequences of testing?  A real cost-benefit analysis of CCSS and national testing?  I personally hope it does, but I don’t think it will amount to much unless it continues to grow.  Get up to speed, and get involved!

Richard Allington on Literacy and the Common Core

Many teachers are concerned about the literacy implications of the Common Core.  One of the nation’s literacy experts, Dr. Richard Allington, was interviewed about the Common Core in a conversation arranged by SAANYS, the School Administrators Association of New York State.  SAANYS, of which I am a retired member, represents many supervisors, principals, and even superintendents in NY.  The interview is published in the Winter 2013 issue of the SAANYS publication VANGUARD, an issue on Special Education, which is not my expertise.  However, the interview, entitled “The Road to Literacy Instruction,” is worth sharing with others in your district.  The interviewer is Peter DeWitt, an author and elementary principal in Averill Park CSD, in the Albany region of New York.  DeWitt also blogs for Education Week.  Allington was once at U Albany and is now at U of Tennessee.  At Southern Westchester BOCES, we used him as a literacy consultant and presenter while I was a Director of Professional Development.  Here’s the link to the Winter 2013 issue:

http://www.saanys.org/uploads/content/VANGUARDwinter2013.pdf

I’m struck by many of Allington’s points about literacy, about common misunderstandings relating to literacy and the Common Core, and about various forms of supplemental instruction.  Some examples:
-Computerized reading support program research shows little value to students.  Teacher led supportive instruction is what works.  “There isn’t much good to write about computerized instruction.” (p. 26).
-Whole Language also works, when done well, better than phonics only, but later in the article Allington points out that phonics instruction is vital for the 10-20% of students who don’t pick up on phonic awareness without explicit instruction.  These kids need specific help.
-The Common Core is frequently misunderstood — it does not mandate consistently harder texts.  Read his comments carefully to see how he interprets the Core Standards, and how he points out that teachers should be exposing kids to multiple levels of reading.  “There is no evidence that giving kids harder texts will have any positive effects on reading achievement.  Even the CCSS doesn’t suggest this occur….. States did not approve the Advice to Publishers so it is a puzzle to me why so many teachers seems to think harder texts are recommended by anyone.” (p. 24)

Given the reputation that Allington has in literacy instruction, these comments on the Common Core should be somewhat reassuring to teachers concerned that the Core stretches reading levels too high for many students, or that the Core mandates instructional texts at levels higher than appropriate for grade levels.  Allington also suggests some strategies and responsibilities for teachers and districts serving low-income students that can address the vocabulary deficits that are found when children first enter schools.

This is an article worth reading and widely sharing.  I hope you find it as encouraging as I did.  Check out the references at the end as well if you can get them — several are not online for free.

Creativity, the Arts, and the Common Core

Years ago, when I was enrolled in a multi-session new administrators’ professional development workshop series, one of the presenters was Terrence Deal, who, with Lee Bolman, had written one of my organizational leadership texts.  For many years, Deal has been writing with Bolman and others about educational leadership and school culture.  Among the most important insights I recall from that meeting was Deal’s commentary on school culture compared to corporate culture.  He remarked that schools were headed in the wrong direction when they attempted to become more like businesses, and he noted that he had a growing business as a consultant to corporations that were attempting to become more like schools.  Why?  Because schools represented, among other things, a culture that valued creativity and collaboration, that recognized differences among individuals, and that nurtured individuals toward self-improvement.

In the February issue of Educational Leadership, the theme is creativity.  It’s a good issue with many good articles, and you can read some of them without a subscription if you click on the titles without locks.  Of course you can find a colleague with the magazine and read them all!  Some of the authors express their concerns about how the Common Core could pose a challenge to teacher and student creativity.  Nowhere in the Common Core is creativity explicitly valued, nor do we see any clear places where creativity will be tested.  And we know that, too often, what is tested is what is taught.

If one considers the increase in cognitive complexity that the Core implies, it might be easy to suggest that the increased requirement to problem solve, whether in more complex math problems or more textual analysis of reading passages, is the same as creativity.  It isn’t, and I’ve been guilty of making this generalization myself.  Cognitive complexity and creativity are not the same–demonstrating multiple pathways to the solution of a math problem might show a ‘creative’ approach, but it often actually means the math teacher has done a good job of helping students find multiple pathways to understand math concepts.

One particularly interesting article in Educational Leadership is “The Art & Craft of Science,” by Robert and Michele Root-Bernstein.  You can read this article for free, by the way.  The authors have an impressive collection of research showing how the arts support science and mathematics.  They mention their own work, showing that four years of arts or music in high school would “confer a 100-point advantage over the average SAT score, whereas four years of science confer only a 69-point advantage.”  Their entire article is filled with examples from the writings of noted scientists and researchers that suggest the arts are somehow related to good results in science and math occupations.  Their work with graduates from Michigan State University, where the authors are employed, shows that “MSU Honors College STEM grads are 3 to 10 times more likely to be engaged in arts and crafts than the average American.”

I don’t read them as claiming that arts education creates better scientists, but the general evidence of creativity as an element that rounds out the education of successful STEM graduates is quite interesting.  The notion that we should keep explicitly creative programs as a part of our general educational priorities runs through this Educational Leadership issue.

A major fear among critics of our testing accountability mania and of the Common Core’s emphasis on textual analysis (with its feared reduction in more creative ELA elements) is the narrowing of the curriculum, a fear sharpened by national economic woes that have cut funding for public education.  Schools for many years have been cutting teachers and programs in the arts that are not core academic topics, cutting vocational programs, cutting foreign language options, and cutting electives from science and social studies programs to focus on passing tests.  Many writers at all levels lament the decline of STEM programs–science, technology, engineering and mathematics–and suggest our international standing is suffering because of a decline among American students in STEM high school and college programs.  Schools and teachers are caught in the maelstrom of preparing kids for state accountability tests, the results of which are being touted as appropriate measures of whether schools and teachers are effective.  Too often, I think we’re throwing out the child-centered, healthy adult opportunities that a broad curriculum has traditionally offered.  We need to foster creativity and individuality, not neglect them in pursuit of a narrowly aimed test score to define success.

If we look at the educational systems of several nations that outperform us on test scores, we will find many in which an end-of-school-year test is critical.  China and Japan come to mind here, where high student suicide rates speak poorly about pressures on kids.  We should also take note of how much effort these two countries are making in attempting to put creativity into their schools–they have a clear understanding that the inventiveness and entrepreneurial spirit that characterize American education and make America, still, the best source of new inventions and new discoveries, are related to our historically creative and ‘liberal arts’ K-12 educational programs.  And an examination of test scores compared to worker productivity demonstrates that high test scores don’t relate to economic productivity.

As we move to the Common Core, which I like, we must not negate the broad and humanistic elements of our traditional educational system.  Our students’ measures of ability should be more than a number.  We won’t succeed by matching the Chinese model of superior test takers: they have created an economic power based on the imitation and reproduction of intellectual discoveries from elsewhere, and they regularly visit our schools trying to find the secret to our own entrepreneurial success.  Our future lies in how well we continue to have a broadly educated, well-rounded population with the ability to think out of the box and discover new solutions to the challenges around us.

Value-Added Models – New Research

If you’re following educational reform, you are aware that some flavor of value-added statistical modeling is being pushed throughout the nation as a means to identify good/bad teachers or good/bad schools.  Pushed now by the US Department of Education and Race to the Top, and promoted by a specialized cadre of assessment developers who stand to rake in massive profits from the testing required to generate data to be used in value-added analyses, this movement has been sweeping across the country.  It has, on the other hand, always raised the suspicions of educational researchers, some economists, and numerous statisticians who have suggested that the models, and the assessments behind them, simply don’t work well as a basis to make high stakes decisions about educational policy, about schools and teachers, or about children.
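
For readers unfamiliar with the mechanics, the core idea can be sketched in a few lines of Python.  This is a deliberately stripped-down illustration, not any state’s or vendor’s actual model.  It predicts each student’s current score from the prior-year score with a simple straight-line fit, then averages each teacher’s students’ residuals (actual minus predicted) and calls that average the teacher’s “value added.”  Operational models add many more variables and statistical adjustments.  The teacher names, scores, and function names below are invented for illustration.

```python
from collections import defaultdict
from statistics import mean

def fit_line(x, y):
    """Ordinary least squares fit of y = a + b*x; returns (a, b)."""
    mx, my = mean(x), mean(y)
    sxx = sum((xi - mx) ** 2 for xi in x)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    b = sxy / sxx
    a = my - b * mx
    return a, b

def value_added(records):
    """records: list of (teacher, prior_score, current_score) tuples.

    Returns each teacher's mean residual from the pooled regression line.
    This is a toy estimate; it ignores everything but the prior score.
    """
    prior = [r[1] for r in records]
    current = [r[2] for r in records]
    a, b = fit_line(prior, current)
    residuals = defaultdict(list)
    for teacher, p, c in records:
        residuals[teacher].append(c - (a + b * p))
    return {t: mean(res) for t, res in residuals.items()}

# Hypothetical data: three students per teacher, identical prior scores.
records = [
    ("Teacher A", 60, 68), ("Teacher A", 70, 77), ("Teacher A", 80, 86),
    ("Teacher B", 60, 62), ("Teacher B", 70, 71), ("Teacher B", 80, 82),
]

scores = value_added(records)
for teacher, va in scores.items():
    print(f"{teacher}: {va:+.2f}")
```

Even in this toy version, the estimate for each teacher rests on a handful of students, so changing a single student’s score shifts the result noticeably.  Stability with small samples is one of the issues the research below takes up.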

In January, the Education Policy Analysis Archives, a peer-reviewed online policy journal, ran several research articles in a special issue covering value-added, called “Value-Added: What America’s Policymakers Need to Know and Understand.”  The title is very misleading, since my reading of this research suggests a better title might be “Value-Added:  What America’s Policymakers Consistently Ignore.”  I’m highlighting the articles here, with links, so you can pick and choose what interests you.  Each separate article offers a substantial list of references for further exploration.

The opening discussion by the editors is “Value-Added Model (VAM) Research for Educational Policy: Framing the Issue.”  The editors provide background and relevance of six additional papers included in the special issue.

Diana Pullin, of Boston College, a lawyer and legal scholar, offers “Legal Issues in the Use of Student Test Scores and Value-Added Models (VAM) to Determine Educational Quality.”  Given that many states are implementing teacher evaluation models in which VAM measures will be used to dismiss teachers, Pullin analyzes the complexity of predictable court challenges as experts on both sides of the VAM movement offer conflicting testimony about the validity and reliability of tests and the statistics used for employment decisions.

Bruce Baker, Joseph Oluwole and Preston Green of Rutgers, Montclair State, and Penn State respectively, in “The Legal Consequences of Mandating High Stakes Decisions Based on Low Quality Information: Teacher Evaluation in the Race-to-the-Top Era,” review issues around the utility of student growth models and VA models.  Their lengthy appendix outlines the VAM evaluation policies of several states.

Nicole Kersting and Mei-Kuang Chen of U of Arizona, and James Stigler of UCLA, offer a statistically heavy analysis titled “Value-Added Teacher Estimates as Part of Teacher Evaluations: Exploring the Effects of Data and Model Specifications on the Stability of Teacher Value-Added Scores.”  They conclude there are several problems with stability and sample sizes which suggest a need for more work to improve the measures.  If you’re not statistically competent, read the narrative–the supporting details are a second language for many of us.

Elizabeth Graue, Katherine Delaney, and Anne Karch of the U of Wisconsin, Madison provide a qualitative research paper entitled “Ecologies of Education Quality.”  Their analysis suggests that VAMs can’t capture or control for all the variables of effective teaching.

“Sentinels Guarding the Grail: Value-Added Measurement and the Quest for Education Reform,” from Rachel Gabriel of the University of Connecticut and Jessica Nina Lester of Washington State is another qualitative analysis.  This is an interesting review of how proponents of VA suggest the methods are scientific and accurate, while the cautions of educational researchers with significantly opposite points of view are ignored or dismissed.  This analysis of the effective public promotional efforts of VAM and the glib solutions to educational problems that VAM supporters say will come through its adoption highlights how public policy has been shaped by politics and the media.

Finally, Moshe Adler of Columbia University, in “Findings vs. Interpretation in ‘The Long-Term Impacts of Teachers’ by Chetty et al.,” offers a refutation of the claims of an often-cited study suggesting that the positive effects of a single year with a good teacher last a lifetime.  Since the Chetty et al. study is frequently cited as justification of VAM and teacher evaluation changes, this is an important commentary on what Adler says are the false claims of this seminal economic impact study.