Students and teachers at Ballard High School in Seattle

“Beginning this fall, teachers who have been on this system for two school years prior to 2012-2013, and who teach tested subjects and grades, will receive a student growth rating based upon two assessments and a two year rolling average of student assessment data.”

– José Banda, Superintendent, Seattle Public Schools

There has been a push underway to judge a teacher’s performance, or that of a principal, a school, a district or a superintendent based on student test scores.

Someone had the idea to determine how well a teacher or a school was faring based on how a student was doing on a standardized test over a certain period of time. From what I’ve seen, that length of time has been decided in a rather haphazard manner from state to state.

“Value added measures,” or VAM, is the term used when a student’s performance is measured over that particular length of time.

No other factors are included in this measure, such as the student’s socioeconomic situation, family or health factors, or academic status, for example whether the student is an English Language Learner (ELL) or has an Individualized Education Program (IEP).

That is my layman’s explanation of VAM. I hope it will help readers who are not steeped in statistics and mathematics to understand the following post.

What the superintendent in Seattle wants to do, as is the fashion these days thanks to Race to the Top, is determine a teacher’s performance based on VAM. This would then lead to determining a grade for a school, a principal and even the superintendent himself if he’s not careful.

The unfortunate effect is that a school grade can then determine whether a school remains open, is closed or is turned into a charter school.

With that introduction, I would like to share a response that I received from a math teacher in Seattle to a query about the MAP test and VAM and its use in Seattle.

Their response follows:

Why we should Scrap the MAP

No experts anywhere in the field of K-12 educational assessment consider just two years of this type of data as valid.

The idea that the leaders of our LMO (Labor Management Organization, the Seattle Education Association) would agree to this just shows how little concern they have for science, mathematics, statistics, or teachers in classrooms.

Seattle Public Schools should not release the two year Value Added Measures (VAM) data on its teachers as planned.

“A study examining data from five separate school districts [over three years], found for example, that of teachers who scored in the bottom 20% of rankings in one year, only 20-30% had similar ratings the next year, while 25 – 45% of these teachers moved to the top part of the distribution, scoring well above average.”

– National Research Council, Board on Testing and Assessment (2009). Report to the U.S. Department of Education.
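
To see how year-to-year churn of this kind can arise, here is a minimal sketch in Python. It is a toy simulation under stated assumptions, not the model used in the report quoted above or by Seattle Public Schools: each teacher is given a fixed “true” effect, and each year’s rating is that effect plus independent random noise. The function name, the number of teachers and the noise levels are all hypothetical choices made for illustration.

```python
import random


def bottom_quintile_persistence(n_teachers=1000, noise_sd=1.0, seed=0):
    """Return the share of teachers rated in the bottom 20% in year 1
    who are rated in the bottom 20% again in year 2.

    Assumption (illustrative only): rating = fixed true effect + yearly noise.
    """
    rng = random.Random(seed)

    # Hypothetical fixed "true" effect per teacher (standard deviation 1).
    true_effect = [rng.gauss(0, 1) for _ in range(n_teachers)]

    # Each year's observed rating adds fresh, independent noise.
    year1 = [t + rng.gauss(0, noise_sd) for t in true_effect]
    year2 = [t + rng.gauss(0, noise_sd) for t in true_effect]

    cutoff = n_teachers // 5  # size of the bottom 20%

    def bottom_group(scores):
        # Indices of the teachers with the lowest scores.
        order = sorted(range(n_teachers), key=lambda i: scores[i])
        return set(order[:cutoff])

    stayed = bottom_group(year1) & bottom_group(year2)
    return len(stayed) / cutoff


if __name__ == "__main__":
    for sd in (0.0, 1.0, 3.0):
        share = bottom_quintile_persistence(noise_sd=sd)
        print(f"noise_sd={sd}: {share:.0%} stay in the bottom 20%")
```

In this toy setup the teachers themselves never change from one year to the next, so any movement in and out of the bottom 20% comes entirely from the noise. The noisier the yearly measure, the fewer teachers keep the same label.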

Links to the full reports at The National Academy Press for source research publications and books:

The list below of peer-reviewed, academically sound research and reports on the use and abuse of VAM in K-12 is long and compelling.

We don’t understand how or why anyone whose job it is within a school system to collect and meaningfully apply teacher and student assessments to improve student learning is allowed to keep their job without ever doing the needed due diligence to inform themselves about the core facts of their work. Absurd, really.

Virtually all the research on VAM as applied to teacher evaluation indicates that the planned Seattle Public Schools (SPS) action will seriously mislead the public, not inform them as apparently has been falsely assumed.

Economic Policy Institute:

“Analyses of VAM results show that they are often unstable across time, classes and tests; thus, test scores, even with the addition of VAM, are not accurate indicators of teacher effectiveness. Student test scores, even with VAM, cannot fully account for the wide range of factors that influence student learning, particularly the backgrounds of students, school supports and the effects of summer learning loss…. Furthermore, VAM does not take into account nonrandom sorting of teachers to students across schools and students to teachers within schools.”

Annenberg Institute: this is an excellent, recent and major review of the current principles and practices of VAM as relevant to K-12 educational reform.

“At least in the abstract, value-added assessment of teacher effectiveness has great potential to improve instruction and, ultimately, student achievement. However, the promise that value-added systems can provide such a precise, meaningful, and comprehensive picture is not supported by the data.”

The Kappan: PDK International

This article is from “The Kappan,” the excellent magazine of PDK International (a must subscription for SPS board members and administrators, in my view). After reviewing the critical problems with VAM, it does not abandon the idea of improving teacher evaluations as part of the effort to improve K-12 education. Instead, it presents practices that are more likely to actually accomplish those goals.

1. Value-added models of teacher effectiveness are inconsistent…

2. Teachers’ value-added performance is affected by the students assigned to them…

3. Value-Added Ratings Can’t Disentangle the Many Influences on Student Progress…

National Bureau of Economic Research: Student Sorting and Bias in Value Added Estimation:

“The results here suggest that it is hazardous to interpret typical value added estimates as indicative of causal effects… assumptions yield large biases…. More evidence, from studies more directly targeted at the assumptions of value added modeling, is badly needed, as are richer VAMs that can account for real world assignments. In the meantime, causal claims will be tenuous at best.”

Test Score Ceiling Effects of Value Added Measures of School Quality

From: U. of California, U. of Missouri, and the American Statistical Association

This is pure research that is often cited by experts but is not an easy read for a non-educator or layperson. Its critical findings concern test score ceilings and non-random populations of students (think Roosevelt vs. Rainier Beach). These create statistical problems and misconceptions when amalgamating or disaggregating student/teacher data from test scores.

The Problems with Value-Added Assessment – Diane Ravitch

Written from her perspective as an education historian, this is a recent, thoughtful and fact-based review of VAM use.

“I concluded that value-added assessment should not be used at all. Never. It has a wide margin of error. It is unstable. A teacher who is highly effective one year may get a different rating the next year depending on which students are assigned to his or her class. Ratings may differ if the tests differ. To the extent it is used, it will narrow the curriculum and promote teaching to tests. Teachers will be mislabeled and stigmatized. Many factors that influence student scores will not be counted at all.”

Research Calls Data-Driven Education Reforms into Question

Recent reports by the National Academies’ National Research Council and the National Center on Education and the Economy.

“Both organizations are respected for their high quality, comprehensive, and non-ideological research. Together, they reach the undeniable conclusion that today’s array of testing and choice fails to meet the instructional needs of American students and the national goal of broadly-based high academic achievement.”

Why Teacher Evaluation Shouldn’t Rest on Student Test Scores

FairTest.org has a clearly stated agenda, but that does not discount this excellent list of the practical problems with applying VAM (as currently used) to teacher evaluation. It concludes with a list of solid, academically sound research references.


Edutopia: an excellent, unbiased resource on educational issues and the relevant research. The George Lucas Educational Foundation is dedicated to improving the K-12 learning process by documenting, disseminating, and advocating for innovative, replicable, and evidence-based strategies that prepare students to thrive in their future education, careers, and adult lives. Edutopia’s byline is, “What Works in Education.”

“‘Value-added modeling’ is indeed all the rage in teacher evaluation: The Obama administration supports it, and the Los Angeles Times used it to grade more than 6,000 California teachers in a controversial project. States are changing laws in order to make standardized tests an important part of teacher evaluation. Unfortunately, this rush is being done without evidence that it works well.”


And this letter from a Seattle Public Schools parent to our superintendent:

Dear Superintendent Banda,

I am writing to express my support for the teachers boycotting the MAP
test and to urge you to take NO disciplinary action against them.

I have been reflecting on standardized test scores and the uses to which
they have been put. At the same time as these kids are boycotting, I
have received a response to my advanced learning applications for my
kids. I have several years of experience with attempting to test my two
children into the program at the same time – in the hopes that they could
attend the same program at the same school.

You may be pleased to hear that they have been successful. One actually
was rejected, but upon appeal she will qualify based on her spring MAP

Although my children are quite bright and do require some kind of
ALO, the success was based just as much on LUCK. Why
do I say this? Well, for my two children to test into advanced learning,
both of them had to meet a threshold for four standardized tests (two
Cog-AT and two MAP tests). My son is receiving special education
services for a processing delay – he works slowly. So this year’s Cog-AT
scores would have disqualified him for both Spectrum and APP.
Last year, though, he qualified for APP. Why the difference? He’s the
same kid!!! If anything, he’s smarter this year, on account of the
excellent teachers he has.

Meanwhile, my daughter was disqualified despite having Cog-AT scores
in the 98th and 99th percentile, because her spring MAP for reading is in
the 75th percentile. But her more recent winter score is higher so I have
been assured that she will get in based on that.

Once again, this is the same kid. Two wildly different scores – which is
of course the norm for the MAP test.

I’m lucky she didn’t have a bad day for this winter test, aren’t I?

But education shouldn’t be based on luck. Teacher evaluations shouldn’t
be based on luck.

Now, a big community of parents, teachers, and students have been saying
this all along. The district hasn’t listened, and that’s why we don’t trust
your task force. Deal justly with the teachers, and we’ll listen then.

Editor’s note:


Dora Taylor