VAM Analysis and Merit Pay
*
Robert Valiant
Here you will find articles and data
to refute the claims of proponents of VAM and merit pay.
about 10 months ago
*
Robert Valiant
or, if you prefer,
and
Guy Brandenburg analysis of NYC VAM evaluations
John Rogers
Director, UCLA's Institute for
Democracy, Education, and Access
From Huffington Post
Posted: August 24, 2010 11:05 AM
Value Added is No Magic: Assessing
Teacher Effectiveness
That old sorcerer has vanished
And for once has gone away!
Spirits called by him, now banished,
My commands shall soon obey.
In Goethe's classic, the apprentice
uses a sorcerer's spell to ease his daily chores. Chanting the master's words,
he brings a broomstick to life and tells it to fetch water to clean the
workshop. The broomstick obeys, only too well. It races to the well and back
until the workshop begins to flood. Although the apprentice had enough
knowledge to set magic in motion, he could not think ahead to what he did not
know.
I worry about a similar flood of
unintended consequences if the Los Angeles Times moves forward with its plans
to publish a database that places 6,000 Los Angeles third- to fifth-grade
teachers on a spectrum from "least effective" to "most
effective." The Times believes that the data will be a powerful tool to
force better teaching, but it cannot anticipate all of the consequences. For
example, consider that capable prospective teachers might avoid a profession in
which they risk public embarrassment based on an undeveloped science. Consider
the well-documented estimates that 25% of the value-added assessments are
likely to be in error.
Publishing the database might easily
undermine parent and teacher morale and make it more difficult for principals
to advance school improvement. Being told that their child's teacher is
"ineffective," or even marginally less effective than a teacher
across the hall, may lead some parents to pressure the principal to place their
child with a "high-scoring" teacher. Pitting parents against one
another or against their principal is not a recipe for school improvement.
The Times' teacher effectiveness
rankings are based on an elaborate statistical model created by Richard Buddin,
a senior economist and education researcher at the Rand Corporation. (Significantly,
Buddin did not attach teachers' names to his analysis; that was done by the
Times.)
Buddin is one of many researchers
across the country exploring so-called value-added approaches to assessing
teacher quality. The assessments measure gains that students make on standardized
tests from one year to the next. For example, researchers compare test scores
of fourth graders with their scores as third graders to determine the
"value added" by the fourth grade teacher. Proponents believe that
the "value added" reliably distinguishes between more and less
effective teachers. And they think that school officials would use such
comparisons to target support to struggling teachers and motivate them to do
better.
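The gain-score idea described above can be sketched in a few lines. This is only an illustration under stated assumptions: the student records, the teacher labels, and the `value_added_by_teacher` helper are hypothetical, and real value-added models add statistical controls that this sketch leaves out.

```python
from collections import defaultdict
from statistics import mean

# Hypothetical records: (fourth-grade teacher, third-grade score, fourth-grade score)
records = [
    ("Teacher A", 300, 320),
    ("Teacher A", 350, 360),
    ("Teacher B", 310, 315),
    ("Teacher B", 400, 395),
]

def value_added_by_teacher(rows):
    """Average each teacher's student gains, then center on the overall mean gain."""
    gains = defaultdict(list)
    for teacher, prior, current in rows:
        gains[teacher].append(current - prior)
    # Overall mean gain across every student, regardless of teacher.
    overall = mean(g for teacher_gains in gains.values() for g in teacher_gains)
    return {teacher: mean(gs) - overall for teacher, gs in gains.items()}

result = value_added_by_teacher(records)
print(result)
```

With only two students per teacher, a single unusual score moves a teacher's number a great deal, which is one reason small classes make such estimates noisy.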
Yet value-added analyses focus
narrowly on standardized tests, usually in math and English Language Arts.
These tests give important information about student learning, but they ignore
much learning that matters to students, parents, and teachers. That's why
value-added analysis can be a useful tool, but cannot possibly stand alone as a measure of
"effectiveness." The National Academy of Sciences has identified
several of the problems posed by value-added methods. These cautions should be
taken seriously.
* First, student assignments to
schools and classrooms are rarely random. As a consequence it is not possible
to definitively determine whether higher or lower student test scores result
from teacher effectiveness or are an artifact of how students are distributed.
* Second, it is difficult to compare
growth of struggling students with the growth of high performers. In technical
terms, standardized tests do not form equal interval scales. Enabling students
to move from the 20th percentile to the 30th is not the same as helping
students move from the 80th to the 90th percentile. These test score numbers
are not like inches along a tape measure that have the same value regardless of
where they occur.
* Third, estimates of teacher
effectiveness can range widely from year to year. In recent studies, 10-15% of
teachers in the lowest category of effectiveness one year moved to the highest
category the following year while 10-15% of teachers in the highest category
fell to the lowest tier.
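The second caution, about unequal intervals, can be illustrated with a short sketch. It assumes, purely for illustration, that scores follow a normal distribution; the `z_gain` helper is hypothetical and is not drawn from the National Academy of Sciences report.

```python
from statistics import NormalDist

def z_gain(p_from, p_to):
    """Gain in standard-deviation units for a move between two percentiles,
    assuming (hypothetically) normally distributed scores."""
    nd = NormalDist()
    return nd.inv_cdf(p_to / 100) - nd.inv_cdf(p_from / 100)

low = z_gain(20, 30)   # move from the 20th to the 30th percentile
high = z_gain(80, 90)  # move from the 80th to the 90th percentile
print(round(low, 2), round(high, 2))  # prints 0.32 0.44
```

Under that assumption, the same ten-percentile move corresponds to a larger underlying score change near the top of the distribution than near the 20th percentile, so percentile gains cannot be compared like inches on a tape measure.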
The National Academy of Sciences
concluded that value-added analysis "should not be used as the sole or
primary basis for making operational decisions because the extent to which the
measures reflect the contribution of teachers themselves, rather than other
factors, is not understood."
And yet, the Los Angeles Times is
about to publish a database with the teacher effectiveness rankings of 6,000
elementary school teachers. The Times argues that its role is to provide
"parents and the public ... information that would otherwise be
withheld" about the "performance of public employees." The Times
should not believe in the magic of this data, and should realize that it cannot
foresee or control all of the consequences.
Follow John Rogers on Twitter:
www.twitter.com/UCLA_IDEA
about 10 months ago
*
Robert Valiant
Evidence about the use of test
scores to evaluate teachers: Economic Policy Institute, 2010
“…there is broad agreement among
statisticians, psychometricians, and economists that student test scores alone
are not sufficiently reliable and valid indicators of teacher effectiveness to
be used in high-stakes personnel decisions, even when the most sophisticated
statistical applications such as value-added modeling are employed.
For a variety of reasons, analyses
of VAM results have led researchers to doubt whether the methodology can
accurately identify more and less effective teachers. VAM estimates have proven
to be unstable across statistical models, years, and classes that teachers
teach. One study found that across five large urban districts, among teachers
who were ranked in the top 20% of effectiveness in the first year, fewer than a
third were in that top group the next year, and another third moved all the way
down to the bottom 40%. Another found that teachers’ effectiveness ratings in
one year could only predict from 4% to 16% of the variation in such ratings in
the following year. Thus, a teacher who appears to be very ineffective in one
year might have a dramatically different result the following year. The same
dramatic fluctuations were found for teachers ranked at the bottom in the first
year of analysis. This runs counter to most people’s notions that the true
quality of a teacher is likely to change very little over time and raises
questions about whether what is measured is largely a “teacher effect” or the
effect of a wide variety of other factors.”
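To put the 4% to 16% figure in more familiar terms, here is a small sketch. It assumes the quoted figure is the R-squared of a simple one-predictor linear fit, in which case it equals the squared correlation between one year's ratings and the next; that reading is an assumption, and `implied_correlation` is a hypothetical helper.

```python
import math

def implied_correlation(share_of_variance_explained):
    """Correlation implied by an R-squared value in a one-predictor linear fit."""
    return math.sqrt(share_of_variance_explained)

low = implied_correlation(0.04)
high = implied_correlation(0.16)
print(f"{low:.1f} to {high:.1f}")  # prints 0.2 to 0.4
```

On that reading, year-to-year correlations of 0.2 to 0.4 would leave 84% to 96% of the variation in next year's ratings unexplained by this year's.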
about 10 months ago
*
Robert Valiant
Neither Fair Nor Accurate •
Research-Based Reasons Why High-Stakes Tests Should Not Be Used to Evaluate
Teachers
By Wayne Au
A pitched battle raged in my
hometown of Seattle this fall. Superintendent Maria Goodloe-Johnson and the
Seattle Public Schools district fought with the Seattle Education Association over
their most recent teachers’ union contract. At the heart of the dispute: Should
teacher evaluations be based in part on student scores on standardized tests?
Seattle is not unique in this
struggle, and it is clear that Superintendent Goodloe-Johnson takes her cue
from what is happening nationally.
In August, for instance, the Los
Angeles Times printed a massive study in which LA student test scores were used
to rate individual teacher effectiveness. The study was based on a statistical
model referred to as value-added measurement (VAM). As part of the story, the
Times published the names of roughly 6,000 teachers and their VAM ratings.
In October the New York City
Department of Education followed suit, publicizing plans to release the VAM
scores for nearly 12,000 public school teachers. U.S. Secretary of Education
Arne Duncan lauded both the Times study and the NYC Department of Education
plans, a stance consistent with Race to the Top guidelines and President
Obama’s support for using test scores to evaluate teachers and determine merit
pay.
Current and former leaders of many
major urban school districts, including Washington, D.C.’s Michelle Rhee and
New Orleans’ Paul Vallas, have sought to use tests to evaluate teachers. In
fact, the use of high-stakes standardized tests to evaluate teacher performance
à la VAM has become one of the cornerstones of current efforts to reshape
public education along the lines of the free market.
On the surface, the logic of VAM and
using student scores to evaluate teachers seems like common sense: The more
effective a teacher, the better his or her students should do on standardized
tests.
However, although research tells us
that teacher quality has an effect on test scores, this does not mean that a
specific teacher is responsible for how a specific student performs on a
standardized test. Nor does it mean we can equate effective teaching (or actual
learning) with higher test scores.
Given the current attacks on
teachers, teachers’ unions, and public education through the use of educational
accountability schemes based wholly or partly on high-stakes standardized test
scores and VAM, it is important that educators, students, and parents
understand why, based on educational research, such tests should not be used to
evaluate teachers.
Although there are many
well-documented problems with using VAM to evaluate teachers, I’ve chosen to
highlight six critical issues with VAM that are so problematic they alone
should be enough to stop the use of high-stakes standardized tests for such
evaluations. I hope these will be helpful as talking points for op-ed pieces,
blogs, and discussions at school board meetings, PTA meetings, and in the
bleachers at basketball games.
Statistical Error Rates
There is a statistical error
rate of 35 percent when using one year’s worth of test data to measure a
teacher’s effectiveness, and an error rate of 25 percent when using data from
three years, researchers Peter Schochet and Hanley Chiang find in their 2010
report “Error Rates in Measuring Teacher and School Performance Based on Test
Score Gains,” released by the U.S. Department of Education’s National Center
for Education Statistics.
Bruce Baker, finance expert at
Rutgers University, explains that using high-stakes test scores to evaluate
teachers in this manner means there is a one-in-four chance that a teacher
rated as “average” could be incorrectly rated as “below average” and face
disciplinary measures. Because of these error rates, a teacher’s performance
evaluation may pivot on what amounts to a statistical roll of the dice.
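The general pattern, in which more years of data reduce but do not remove classification errors, can be sketched with a toy simulation. This is not Schochet and Chiang's method and does not reproduce their figures: the noise level, the sample size, and the `sign_error_rate` helper are all hypothetical choices made for illustration.

```python
import random

def sign_error_rate(years, noise_sd=1.5, n_teachers=20000, seed=0):
    """Fraction of simulated teachers whose averaged estimate lands on the
    wrong side of 'average' (zero) relative to their true effect."""
    rng = random.Random(seed)
    errors = 0
    for _ in range(n_teachers):
        true_effect = rng.gauss(0, 1)
        # Each year's estimate is the true effect plus independent noise.
        yearly = [true_effect + rng.gauss(0, noise_sd) for _ in range(years)]
        estimate = sum(yearly) / years
        if (estimate < 0) != (true_effect < 0):
            errors += 1
    return errors / n_teachers

one_year = sign_error_rate(years=1)
three_years = sign_error_rate(years=3)
print(one_year, three_years)
```

Averaging three noisy years lowers the error rate in this toy setup, but a substantial share of simulated teachers still land on the wrong side of average.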
Year-to-Year Test Score Instability
As Tim Sass, economics professor at
Florida State University, points out in “The Stability of Value-Added Measures
of Teacher Quality and Implications for Teacher Compensation Policy,” test
scores of students taught by the same teacher fluctuate wildly from year to
year. In one study comparing two years of test scores across five urban
districts, more than two-thirds of the bottom-ranked teachers one year had
moved out of the bottom ranks the next year. Of this group, a full third went
from the bottom 20 percent one year to the top 40 percent the next. Similarly,
only one-third of the teachers who ranked highest one year kept their top
ranking the next, and almost a third of the formerly top-ranked teachers landed
in the bottom 40 percent in year two.
If test scores were an accurate
measurement of teacher effectiveness, “effective” teachers would rate high
consistently from year to year because they are good teachers; and one would
expect “ineffective” teachers to rate low in terms of test scores just as
consistently. Instead, the year-to-year instability that Sass highlights shows
that test scores have very little to do with the effectiveness of a single
teacher and have more to do with the change of students from year to year
(unless, of course, one believes that one-third of the highest ranked teachers
in the first year of the study simply decided to teach poorly in the second).
Day-to-Day Score Instability
Fifty to 80 percent of any
improvement or decline in a student’s standardized test scores can be
attributed to one-time, randomly occurring factors, according to Thomas Kane of
Harvard University and Douglas Staiger of Dartmouth College in their research
report “Volatility in Test Scores.”
This means that factors such as
whether or not a child ate breakfast on test day, whether or not a child got
into an argument with parents or peers on the way to school, which other
students happened to be in attendance while taking the test, and the child’s
feelings about the test administrator account for at least half of any given
student’s standardized test score gains or losses. Some factors, such as a dog
barking outside an open window, can affect an entire class.
Kane and Staiger’s findings
illustrate that using tests to evaluate teachers ignores the reality that a
host of individual daily factors that are completely out of a teacher’s control
contribute to how a student performs on any given test. To reward or punish a
teacher based on such scores could literally mean rewarding or punishing a
teacher based on how well or poorly a student’s morning went.
Nonrandom Student Assignments
The grouping of students—either
within schools through formal and informal tracking or across schools through
race, socioeconomic class, and linguistic (ELL) segregation—greatly influences
VAM test results, as 10 leading researchers in teacher quality and educational
assessment highlight in their policy brief “Problems with the Use of Student
Test Scores to Evaluate Teachers,” published by the Economic Policy Institute.
These researchers note that
“teachers who have chosen to teach in schools serving more affluent students
may appear to be more effective simply because they have students with more
home and school supports for their prior and current learning, and not because
they are better teachers.”
Even when VAM models attempt to take
into account a student’s prior achievement or demographic characteristics, the
models assume that all students will show test gains at an equal rate. This
assumption, however, does not necessarily hold true for groups of students who
historically have performed poorly on tests, for English language learners who
are asked to become proficient in both a new language and a tested subject
area, or for students with disabilities whose test-based rates of progress may
be incomparable to any other student.
Nonrandom student assignment means
that a teacher could be punished, dismissed, or lose tenure purely because the
course they teach or the school they teach in has a significant population of
traditionally low-scoring students who may show variable or slower test score
gains.
Imprecise Measurement
High-stakes, standardized tests are
also unable to account for the complexities of learning (and, by extension,
teaching). For instance, we know from the linguistic research of Steven Pinker
and others that learning often happens in a U-shape—that making mistakes is an
integral part of the learning process. When children are tested, we never quite
know where on the U-shaped learning curve they might be, nor do we realize that
their mistakes could be a vital part of a natural learning process. When tests
are used to evaluate teachers, it is possible that highly effective teachers
who push students out of their cognitive comfort zones are penalized for
provoking the deep learning that requires students to make mistakes on the way
to greater understanding.
Standardized tests are also too
crude to account for the possibility of cognitive transfer of skills that
students learn across different subjects. Using VAM, as the researchers in the
above-mentioned Economic Policy Institute policy brief explain, means that “the
essay writing a student learns from his history teacher may be credited to his
English teacher, even if the English teacher assigns no writing; the
mathematics a student learns in her physics class may be credited to her math
teacher.” In other words, we can never be certain which class and which teacher
contributed to a given student’s test performance in any given subject.
Out-of-School Factors
Out-of-school factors such as
inadequate access to health care, food insecurity, and poverty-related stress,
among others, negatively impact the in-school achievement of students so
profoundly that they severely limit what schools and teachers can do on their
own, explains David Berliner, Regents Professor of Education at Arizona State
University, in his report “Poverty and Potential.”
Although it is clear from the
research of Stanford University’s Linda Darling-Hammond and others that
teachers play an absolutely pivotal role in student success, when we use
high-stakes tests to evaluate teachers, we incorrectly assume that teachers
have the ability to overcome any obstacle in students’ lives to improve
learning. Although good teachers are critically necessary, they are not always
sufficient.
To assume otherwise is to think that
teachers (and schools) can somehow make up for the lack of housing, food,
safety, and living wage employment, among other factors, all on their own. The
social safety net is the responsibility of a much broader socioeconomic
network—not the sole responsibility of the teacher.
Politics, Not Reality
The reality of standardized tests is
that they are too imprecise and inaccurate to measure the effectiveness of
individual teachers. The sad thing is that testing experts, researchers, and
psychometricians have known this for quite some time. In 1999, for instance,
the expert panel that made up the Committee on Appropriate Test Use of the
National Research Council cautioned that “an educational decision that will
have a major impact on a test-taker should not be made solely or automatically
on the basis of a single test score.”
Yet two short years later, a
bipartisan Congress and the presidential administration of George W. Bush
passed No Child Left Behind and its test-and-punish approach to school reform
into law.
Although the Bush administration
seemed to ignore educational research as a matter of policy (as illustrated
through NCLB’s Reading First program and the advocacy of using phonics-only
teaching methods that had little basis in research), many hoped for something
different with the election of President Obama.
Unfortunately, the Obama
administration has sent a clear message: When it comes to high-stakes
standardized testing, the research doesn’t matter.
It hasn’t mattered that, according
to the above-cited U.S. Department of Education report, “More than 90 percent
of the variation in student gain scores is due to the variation in student-level
factors that are not under control of the teacher.”
It hasn’t mattered that the National
Research Council of the National Academy of Sciences has stated that “VAM
estimates of teacher effectiveness should not be used to make operational
decisions because such estimates are far too unstable to be considered fair or
reliable.”
It hasn’t mattered that even the
researchers who completed the Los Angeles Times study acknowledged that VAM
data were too unreliable to use as the sole measure of teacher performance (a
point that the Times neglected to clearly articulate in their article).
Sadly, as with Bush, now with Obama,
politics and ideology trump educational research.
One would think that all of the
policy makers, politicians, pundits, superintendents, talk show hosts,
documentary movie makers, business leaders, and philanthropic foundations so in
love with the idea of using test score data to evaluate teachers would be
equally passionate about accuracy. People’s lives are at stake, and yet the
“data” underlying important decisions about teacher performance couldn’t be
shakier.
The shakiness of test-based VAM data
illustrates that the current fight over teacher “accountability” isn’t really
about effectiveness. The more substantial public conversation we should be
having about rising poverty, the racial resegregation of our schools,
increasing unemployment, lack of health care, and the steady defunding of the
public sector—all factors that have an overwhelming impact on students’
educational achievement—has been buried. Instead, teachers and their unions
have become convenient scapegoats for our social, educational, and economic
woes.
Yes, teachers’ performance needs to
be evaluated, but in a manner that is fair and accurate. Using high-stakes
standardized tests and VAM to make such evaluations is neither.
A former high school teacher, Wayne
Au is a Rethinking Schools editor and assistant professor at the University of
Washington, Bothell Campus.
about 9 months ago
*
School District Citizens
One of the best compendiums of
arguments against VAM can be found here:
<http://rdsathene.blogspot.com/2011/02/are-value-added-methods-vam-new-flat.html>
about 8 months ago
*
School District Citizens
Here is another great source for
arguing against VAM: http://www.njspotlight.com/ets_symposium/
about 8 months ago
*
School District Citizens
Read the EPI study of VAM here.
Their findings: VAM is a SCAM.
http://voices.washingtonpost.com/answer-sheet/teachers/new-study-blasts-popular-teach.html
about 8 months ago
*
School District Citizens
http://www.economics.harvard.edu/faculty/fryer/files/teacher+incentives.pdf
ABSTRACT
Financial incentives for teachers to
increase student performance is an increasingly popular education policy around
the world. This paper describes a school-based randomized trial in over
two-hundred New York City public schools designed to better understand the
impact of teacher incentives on student achievement. I find no evidence that
teacher incentives increase student performance, attendance, or graduation, nor
do I find any evidence that the incentives change student or teacher behavior.
If anything, teacher incentives may decrease student achievement, especially in
larger schools. The paper concludes with a speculative discussion of theories
that may explain these stark results.
Roland G. Fryer, Department of
Economics, Harvard University
about 7 months ago
*
School District Citizens
Of course we should hold teachers accountable, but this does not mean we have to
pretend that mathematical models can do something they cannot. Of course we
should rid our schools of incompetent teachers, but value-added models are an
exceedingly blunt tool for this purpose. In any case, we ought to expect more
from our teachers than what value-added attempts to measure.
John Ewing
I came across this article by mathematician John Ewing and wanted to share it
with you.
Dora
Mathematical Intimidation: Driven by the Data
by John Ewing
Mathematicians occasionally worry about the misuse of their subject. G. H. Hardy
famously wrote about mathematics used for war in his autobiography, A
Mathematician’s Apology (and solidified his reputation as a foe of applied
mathematics in doing so). More recently, groups of mathematicians tried to
organize a boycott of the Star Wars project on the grounds that it was an abuse
of mathematics. And even more recently some fretted about the role of
mathematics in the financial meltdown.
But the most common misuse of mathematics is simpler, more pervasive, and (alas)
more insidious: mathematics employed as a rhetorical weapon—an intellectual
credential to convince the public that an idea or a process is “objective” and
hence better than other competing ideas or processes. This is mathematical
intimidation. It is especially persuasive because so many people are awed by
mathematics and yet do not understand it—a dangerous combination.
The latest instance of the phenomenon is value-added modeling (VAM), used to
interpret test data. Value-added modeling pops up everywhere today, from
newspapers to television to political campaigns. VAM is heavily promoted with
unbridled and uncritical enthusiasm by the press, by politicians, and even by
(some) educational experts, and it is touted as the modern, “scientific” way to
measure educational success in everything from charter schools to individual
teachers.
Yet most of those promoting value-added modeling are ill-equipped to judge
either its effectiveness or its limitations. Some of those who are equipped make
extravagant claims without much detail, reassuring us that someone has checked
into our concerns and we shouldn’t worry. Value-added modeling is promoted
because it has the right pedigree—because it is based on “sophisticated
mathematics”. As a consequence, mathematics that ought to be used to illuminate
ends up being used to intimidate. When that happens, mathematicians have a
responsibility to speak out.
Background
Value-added models are all about tests—standardized tests that have become
ubiquitous in K–12 education in the past few decades. These tests have been
around for many years, but their scale, scope, and potential utility have
changed dramatically. Fifty years ago, at a few key points in their education,
schoolchildren would bring home a piece of paper that showed academic
achievement, usually with a percentile score showing where they landed among a
large group. Parents could take pride in their child’s progress (or fret over
its lack); teachers could sort students into those who excelled and those who
needed remediation; students could make plans for higher education.
Today, tests have more consequences. “No Child Left Behind” mandated that tests
in reading and mathematics be administered in grades 3–8. Often more tests are
given in high school, including high-stakes tests for graduation. With all that
accumulating data, it was inevitable that people would want to use tests to
evaluate everything educational—not merely teachers, schools, and entire states
but also new curricula, teacher training programs, or teacher selection
criteria. Are the new standards better than the old? Are experienced teachers
better than novice? Do teachers need to know the content they teach? Using data
from tests to answer such questions is part of the current “student achievement”
ethos—the belief that the goal of education is to produce high test scores. But
it is also part of a broader trend in modern society to place a higher value on
numerical (objective) measurements than verbal (subjective) evidence. But using
tests to evaluate teachers, schools, or programs has many problems. (For a
readable and comprehensive account, see [Koretz 2008].) Here are four of the
most important problems, taken from a much longer list.
1. Influences. Test scores are affected by many factors, including the incoming
levels of achievement, the influence of previous teachers, the attitudes of
peers, and parental support. One cannot immediately separate the influence of a
particular teacher or program among all those variables.
2. Polls. Like polls, tests are only samples. They cover only a small selection
of material from a larger domain. A student’s score is meant to represent how
much has been learned on all material, but tests (like polls) can be misleading.
3. Intangibles. Tests (especially multiple-choice tests) measure the learning of
facts and procedures rather than the many other goals of teaching. Attitude,
engagement, and the ability to learn further on one’s own are difficult to
measure with tests. In some cases, these “intangible” goals may be more
important than those measured by tests. (The father of modern standardized
testing, E. F. Lindquist, wrote eloquently about this [Lindquist 1951]; a
synopsis of his comments can be found in [Koretz 2008, 37].)
4. Inflation. Test scores can be increased without increasing student learning.
This assertion has been convincingly demonstrated, but it is widely ignored by
many in the education establishment [Koretz 2008, chap. 10]. In fact, the
assertion should not be surprising. Every teacher knows that providing
strategies for test-taking can improve student performance and that narrowing
the curriculum to conform precisely to the test (“teaching to the test”) can
have an even greater effect. The evidence shows that these effects can be
substantial: One can dramatically increase test scores while at the same time
actually decreasing student learning. “Test scores” are not the same as “student
achievement”.
This last problem plays a larger role as the stakes increase. This is often
referred to as Campbell’s Law: “The more any quantitative social indicator is
used for social decision-making, the more subject it will be to corruption
pressures and the more apt it will be to distort and corrupt the social
processes it is intended to measure” [Campbell 1976]. In its simplest form, this
can mean that high-stakes tests are likely to induce some people (students,
teachers, or administrators) to cheat…and they do [Gabriel 2010]. But the more
common consequence of Campbell’s Law is a distortion of the education
experience, ignoring things that are not tested (for example, student engagement
and attitude) and concentrating on precisely those things that are.
The remainder of this paper can be read at Mathematical Intimidation: Driven by
the Data.
about 5 months ago
*
School District Citizens
May 15, 2011
To The New York State Board of Regents:
As researchers who have done
extensive work in the area of testing and measurement, and the use of
value-added methods of analysis, we write to express our concern about the
decision pending before the Board of Regents to require the use of state test
scores as 40% of the evaluation decision for teachers.
As the enclosed report from the
Economic Policy Institute describes, the research literature includes many
cautions about the problems of basing teacher evaluations on student test
scores. These include problems of attributing student gains to specific
teachers; concerns about overemphasis on “teaching to the test” at the expense
of other kinds of learning; and disincentives for teachers to serve high-need
students, for example, those who do not yet speak English and those who have
special education needs.
Reviews of research on value-added
methodologies for estimating teacher “effects” based on student test scores
have concluded that these measures are too unstable and too vulnerable to many
sources of error to be used as a major part of teacher evaluation. A report by
the RAND Corporation concluded that:
The research base is currently
insufficient to support the use of VAM for high-stakes decisions about
individual teachers or schools. [1]
The Board on Testing and Assessment
of the National Research Council of the National Academy of Sciences stated,
...VAM estimates of teacher
effectiveness ... should not be used to make operational decisions because such
estimates are far too unstable to be considered fair or reliable.
Henry Braun, then of the Educational
Testing Service, concluded in his review of research:
VAM results should not serve as the
sole or principal basis for making consequential decisions about teachers.
There are many pitfalls to making causal attributions of teacher effectiveness
on the basis of the kinds of data available from typical school districts. We
still lack sufficient understanding of how seriously the different technical
problems threaten the validity of such interpretations. [2]
According to these studies, the
problems with using value-added testing models to determine teacher
effectiveness include:
[1] Daniel F. McCaffrey, Daniel Koretz, J. R. Lockwood, Laura S. Hamilton (2005).
Evaluating Value-Added Models for Teacher Accountability. Santa Monica: RAND
Corporation.
[2] Henry Braun, Using Student Progress to Evaluate Teachers: A Primer on
Value-Added Models (Princeton, NJ: ETS, 2005), p. 17.
http://epaa.asu.edu/ojs/article/view/1298/1043
This is 2 years old, but still relevant. http://www.washingtonpost.com/blogs/answer-sheet/post/the-myths-of-standardized-testing/2011/04/14/AFNxTggD_blog.html