Regie Stites' Responses
In October 2000, the NIFL Equipped for the Future (EFF) Discussion List
invited Regie Stites, Educational Researcher and consultant to the Equipped for
the Future initiative, to engage in a virtual Q&A on the list. Several list
members posed questions for Dr. Stites about the EFF Standards for Adult
learning and standards-based system reform. The Q&A session has been
organized into the following categories:
Moderator's introduction to the Q&A session and Regie Stites
NIFL-4EFF Colleagues:
I am pleased to announce that Regie Stites has agreed to join us online to
respond to our questions about the EFF standards and standards-based educational
reform.
Regie Stites is an Educational Researcher in the Center for Education and
Human Services of SRI International. Dr. Stites is a specialist in literacy
studies with a background in cultural anthropology and applied linguistics. He
received his Ph.D. in Education from the University of California, Los Angeles
in 1992. Prior to work at SRI, Dr. Stites was a Research Associate in the
National Center for Research on Standards and Student Testing (CRESST) at UCLA
and a Senior Researcher in the National Center on Adult Literacy and
International Literacy Institute at the University of Pennsylvania. He also has
experience as a college instructor and as a teacher of English to speakers of
other languages in the USA and in the People's Republic of China. His current
research focuses on applications of educational technology and on assessment and
policy issues in literacy and adult basic education.
Dr. Stites was an advisor to the Mayor's Commission on Literacy's work on the
EFF Citizen/Community Member Role Map. He is the author of the 1999 Focus on
Basics article "A User's Guide to Standards-Based Educational Reform: From
Theory to Practice." He
is now assisting EFF in planning assessment development and validation
processes.
Here is a short list of topics Regie and I think are of interest to the list
and that he would enjoy discussing with us. Please post your questions for Regie
in these areas:
- The role of assessment in standards-based educational reform
- Ways of judging the validity of an assessment
- Differences between 'traditional' and performance-based assessment
- Aligning instruction and assessment with the EFF Standards
- The role of technology in supporting learning aligned with the EFF
Standards.
Thanks,
Ronna Spacone
NIFL-4EFF Discussion List Moderator
Regie's introductory remarks and common threads in the questions
Hello everyone,
I would like to thank Ronna Spacone for this opportunity to respond to
questions from the 4EFF list. Having been a listener (aka lurker) on the list
for some time, I feel that it is quite an honor to be given a virtual soapbox to
stand on to address the list. As of the end of the day today, there have already
been a number of interesting and challenging questions posted by Debbie Tuler,
Ronna Spacone, Amy Trawick, Sue Barton, Donna Curry, Andy Nash, and Mary
Hannaman. Rather than respond separately to each of the questions I will instead
address my comments to what I see as some common threads that run through this
initial set of questions.
The common threads so far as I see them are the following:
- The role of EFF standards and assessment in systemic reform (including
improving instructional practice)
- The pros and cons of performance-based assessment (and portfolios)
- Assessing ESOL students on the EFF standards
I will be responding to issues in each of these areas in subsequent posts. In
doing so, I hope to touch on most (but probably not all) of the particular
questions that have been addressed to me.
Before I begin to respond, I think it will be a good idea for me to say
something about my role in helping to plan assessment development and validation
processes for EFF. Last month, at a meeting of the EFF National Policy Group, I
presented a "road map" for validating the EFF Assessment Framework. The road map
describes a process and criteria that can be used to develop valid and reliable
performance standards and measures aligned with the EFF Content Standards.
The road map includes recommendations for a behavioral-anchoring process that
can be used to develop descriptions (and examples) of performance at various
levels for each of the EFF standards. It also describes the types of validity
evidence that may be needed to satisfy the concerns of various stakeholders.
From a measurement perspective, the central concern is likely to be construct
validity - the degree to which an assessment system meets technical criteria for
validity and reliability. From a policy perspective, the central concern is
likely to be consequential validity - the degree to which the uses of an
assessment system lead to fair and equitable outcomes for learners, instructors,
programs, and funders. Finally, from a popular perspective, the central concern
is likely to be face validity - the degree to which an assessment system is
meaningful and understandable to all.
I believe that all three general types of validity (construct, consequential,
and face) are equally important. Much of what is wrong with current systems of
assessment in adult basic education can be interpreted as problems in one or
more of these types of validity. The validity concerns (and processes for
avoiding validity problems) that I identified in the road map have recently been
the focus of my attention in thinking about EFF. These concerns are the
background for most of what I will have to say in response to the questions
addressed to me on the list.
On EFF and systemic reform
A number of the questions that have been posted to the list ask about the
mechanisms through which the EFF standards and assessments can contribute to
improvements in instructional quality and outcomes. These questions were asked
at the program level, but I would like to begin my response by working down from
the systemic level. One of the key strengths of EFF is the foundation that it
offers for standards-based reform of the adult language and literacy educational
system. I described the components of an ideal model of standards-based systemic
reform in the article I wrote for Focus on Basics. Here's a snippet from the
conclusion of that piece:
"According to the ideal model of standards-based reform, all forms of
standards -- content, performance, and opportunity-to-learn -- should be
aligned. To bring practice closer to the ideal, we must somehow connect EFF,
NRS, and NAAL as well as state level standards. This will not be easy, but will
offer many benefits. First, coherent content standards can provide a clear
vision of what every adult should know and be able to do. Performance standards
and related assessment matched to this vision provide the tools for individual
learners, literacy programs, and everyone to monitor progress toward goals.
Opportunity-to-learn standards may be especially critical for a system of
education (adult literacy) that is chronically underfunded."
I wrote the FOB article with applications of standards and assessment in
large-scale accountability and reporting systems in mind. But I would argue that
effective alignment of learning goals and instructional objectives (which can be
described in a generic way by content standards) with measures and expectations
for learning outcomes (which can be described in a generic way by performance
standards) is a key indicator of educational quality at all levels. Another way
to put this is to say that teaching to the test is not a bad thing, as long as
the test in question measures knowledge and skills that learners (and others)
recognize as important goals for learning. Opportunity-to-learn is also a
critical piece of the quality equation at the program level.
A good system of educational standards should support teaching and learning
in at least three ways. Content standards should help to clarify long-term
learning goals and to situate particular learning objectives on a
pathway leading to long-term goals. Performance standards should help to
establish milestones on the pathway that both teachers and learners can use to
mark progress and plan further learning. Opportunity-to-learn standards should
help to develop a better understanding of the time and resources that teachers
and learners will need to make reasonable progress toward learning goals.
Debbie Tuler asked about what pieces of the EFF framework should be used for
what purposes. My general response is that the EFF role maps, standards, common
activities, and eventually performance continua and benchmarks of performance
are part of an integrated framework that can help teachers to align learners'
goals with curriculum design and teaching practices and with measures of
learning progress. Admittedly, this is a tall order for the teacher, and the fact
that curriculum and assessments used to inform instruction are poorly aligned
with the standardized tests used for external reporting and accountability makes
the job even harder.
To improve on the current 'misaligned' system, progress needs to be made at
both the program level and the systems level. As indicated by Ronna Spacone's
question, substantial investment in staff development is the starting point for
alignment at the program level. Staff development should provide teachers with
the models, support, and guided practice they need to be able to apply the EFF
framework to aligning learning goals with teaching practice and with measures of
learning at the program level. At the same time, work is needed at the level of
accountability systems to better capture results that matter.
EFF has only recently begun the long and hard work of developing an
assessment framework (elaborating performance continua for each standard,
establishing benchmarks for levels of performance, selecting and developing
tasks to measure performance on the standards, and combining all this into a
qualifications framework). In this context, I would respond to Sue Barton's
question about "the most pressing public policy issues affecting the
implementation of EFF into a program" by saying that the first priority should
be garnering broad-based support (and involvement) for developing an assessment
framework that supports measures of meaningful results in adult learning and
establishes reasonable expectations for resources needed to support such
results. Beyond this, the other public policy supports that need to be in place
to make EFF work at the program level include accountability structures that are
aligned with learners' goals and program curricula, expanded professional
development opportunities for teachers, expanded access to high-quality learning
opportunities for students, and the resources and political will to make all
this possible.
On Performance assessment validity
Hello again,
This message is in response to questions that were asked about the pros and
cons of using performance assessment and portfolios for internal (instructional)
and external (accountability) purposes. I think Amy Trawick hit the nail on the
head when she noted that "multiple customers will need to buy in to a new way of
thinking about assessment" to make EFF's reform vision a reality. That new
thinking includes an understanding of the value of aligning standards and
assessments at the program, state, and national levels -- for reasons that I
described in the last post. It also includes the goal of making assessment an
integral part of instruction and learning, rather than -- as is too often the
case in using standardized tests -- a separate (and often painful) event that
interrupts learning and instruction and distances adult learners from their
instructors and from their motivation to learn.
Amy went on to ask my opinion on the factors that affect the utility of a
performance-based assessment system for "learner, teacher, program, funder,
*and* state/federal purposes." In my view, this is a question about the validity
of performance-based assessment and it is exactly the right way to frame such a
question. The validity of any assessment should be judged in terms of the
purpose of the assessment. For example, I would argue that the methods used to
assess certain types of reading and numeracy skills in the last National Adult
Literacy Survey (NALS) are valid for the purpose of profiling the distribution
of various levels of those reading and numeracy skills in the adult population
of the U.S. On the other hand, I would argue that the NALS measures are not
valid for the purpose of assessing the overall impact of the adult language and
literacy educational system on literacy levels among U.S. adults. The primary
reason that the NALS is not valid for the latter purpose is the narrow range and
poor alignment of the skills it measures relative to what is being taught and
learned in the adult language and literacy educational system.
Mary Hannaman pointed out a similar validity problem (and one that is closer
to home) in her questioning of the appropriateness of using standardized tests
that have little connection to standards that "have been developed based on the
needs or goals of the state." Bringing the issue of validity even closer to home
was Donna Curry's question about whether teachers need to be concerned about
validity in informal and day-to-day assessment. I think teachers should always
be concerned about the validity of any assessment, formal or informal. I would
also argue that validity is much easier to achieve when assessment is closely
aligned with instructional goals and integrated into instructional activities.
This is why many assessment specialists see performance-based assessment as a
potentially more valid alternative to traditional testing.
To understand why alternative assessment systems (performance tasks,
portfolio assessments, and other integrated measures of knowledge and skills)
may be more valid than traditional forms of testing (multiple-choice,
fill-in-the-blank, and other discrete measures of knowledge and skills) for
various purposes you may want to look back at the three general areas of
validity that I described in my "Introductory remarks" -- construct validity,
consequential validity, and face validity.
In terms of construct validity, the advantage of alternative assessment is
the opportunity to create more direct and more authentic measures of desired
knowledge, skills, and abilities than is typically possible with traditional
testing. On the other hand, construct validity also includes concerns about
reliability (consistency of scores/ratings over time and among raters).
Standardized tests are strong on reliability. Performance-based assessments are
scored more subjectively and therefore reliability must be strengthened by use
of well-structured scoring guidelines and training of teachers to make effective
and consistent use of scoring guidelines.
In terms of consequential validity, alternative assessment systems again have
the advantage over traditional testing in many situations because
performance-based assessments, and especially portfolio assessment systems,
give learners more opportunities (in more "real-world" contexts) to demonstrate
desirable knowledge, skills, and abilities. I recently heard Sri Ananda make the
argument that the relatively high cost of using performance assessment (because
of the training and process needed to achieve reliability) is justified in cases
where only direct measures of performance will do. She used the example of the
behind-the-wheel test required to get a driver's license. A paper and pencil
test alone will clearly not suffice to guide this high-stakes decision. Even in
low-stakes testing situations, performance assessment (particularly when results
are collected and regularly reviewed in a portfolio) has the advantage of
providing more feedback to learners and instructors that is more directly
applicable to improving learning activities and opportunities than the guidance
that standardized tests can provide.
On the issue of face validity, performance assessment is again a clear
winner. Scoring criteria used in performance-based assessment are more easily
communicated (and often more meaningful) to learners and teachers than is the
case in traditional forms of assessment. Within an alternative assessment
system, learning and assessment activities are combined. A well-structured
performance task should also be a learning activity. Good use of a portfolio is
one way to capitalize on the potential for strong face validity in performance
assessment. The primary purpose of the portfolio should be to aid communication
between the learner and the instructor so that learning goals and progress can
be reviewed and evaluated in an ongoing dialogue. In the end, the portfolio can
become a richly textured and substantial piece of evidence of learning
achievement.
The challenge is to convince policy makers and funding agencies that such
evidence is as valid (and reliable) as standardized test results. Basically,
this means changing the ways that policy makers think about validity. The face
validity of a standardized test rests mostly on the authority of the experts who
design the test and analyze its results. This is seen as having advantages for
high-stakes testing because the judgements of experts are viewed as legitimate
(even though the bases for arriving at judgements are not widely understood). In
my opinion, education is a different sort of system than law or medicine. In
medicine and to some extent in law, the public puts its trust in authoritative
judgements above popular understanding. In education, it is relatively more
important to work toward achieving a balance (and making connections) between
expertise and popular understanding.
On EFF and ESOL (English for speakers of other languages)
Hi again,
This is the last of the three general responses that I promised. This one
will be shorter (a good thing, no?) than the previous posts and I am hoping that
it leads into a more open discussion.
The topic is (as Debbie Tuler put it) "the meaningfulness and use of [an EFF
Standard] for planning instruction and assessment for ESOL." Debbie was
particularly interested in the "speak so others can understand" standard. Andy
Nash asked a similar question about using EFF to guide assessment and
instruction for beginning English learners. I think that an assessment expert
and policy wonk (like me) is less likely to provide useful guidance in this area
than teachers who have worked through this problem in practice. But, of course,
I do have some advice to offer.
First, there is the issue of how much guidance the EFF Standards can provide
to teachers in planning instruction and assessment in any case. The EFF
Standards (like any Content Standards) should be "visionary and not at all
prescriptive" (to borrow a phrase from Andy Porter). In other words, the role of
the standards should be to help organize and frame instruction and assessment,
but never to set limits on what should be taught and assessed. The EFF Standards
cannot be the sum total of any program or classroom curriculum, but they can
help learners, teachers, and program managers to see the "big picture" of
learning goals and perhaps point to gaps where new curriculum development is
needed.
Second, let me repeat the mantra of the assessment specialists -- "multiple
measures." The guidance that EFF can provide for developing assessments and for
aligning instruction and assessment is also limited. If we want to assess an ESL
learner on the "speak so others can understand" or "reflect and evaluate"
standards, the best and most direct approach would be some form of performance
assessment that provides an opportunity to evaluate the learner's ability to
"perform" in an authentic situation. However, constructing such a performance
task that is appropriate for use with beginning English learners will be a
challenge. It will probably require a degree of scaffolding by the instructor
that makes the situation somewhat less than authentic (for example, simulating a
conversation with a librarian rather than sending students out to the library to
find something). In this case and many others, developing performance tasks to
measure progress on the EFF standards should not be seen as replacing other
forms of assessment used to guide instruction.
Thanks for the great questions (and I realize that I have not responded to
all of them yet). I am looking forward to the discussion,
Regie
Regie Stites, Ph.D.
Education Researcher
Center for Education and Human Services
SRI International
Menlo Park, CA