National Institute for Literacy
 

Regie Stites's Responses

In October 2000, the NIFL Equipped for the Future (EFF) Discussion List invited Regie Stites, Educational Researcher and consultant to the Equipped for the Future initiative, to engage in a virtual Q&A on the list. Several list members posed questions for Dr. Stites about the EFF Standards for Adult learning and standards-based system reform. The Q&A session has been organized into the categories below.


Moderator's introduction to the Q&A session and Regie Stites

NIFL-4EFF Colleagues:

I am pleased to announce that Regie Stites has agreed to join us online to respond to our questions about the EFF standards and standards-based educational reform.

Regie Stites is an Educational Researcher in the Center for Education and Human Services of SRI International. Dr. Stites is a specialist in literacy studies with a background in cultural anthropology and applied linguistics. He received his Ph.D. in Education from the University of California, Los Angeles in 1992. Prior to work at SRI, Dr. Stites was a Research Associate in the National Center for Research on Standards and Student Testing (CRESST) at UCLA and a Senior Researcher in the National Center on Adult Literacy and International Literacy Institute at the University of Pennsylvania. He also has experience as a college instructor and as a teacher of English to speakers of other languages in the USA and in the People's Republic of China. His current research focuses on applications of educational technology and on assessment and policy issues in literacy and adult basic education.

Dr. Stites was an advisor to the Mayor's Commission on Literacy's work on the EFF Citizen/Community Member Role Map. He is the author of the 1999 Focus on Basics article: A User's Guide to Standards-Based Educational Reform: From Theory to Practice. He is now assisting EFF in planning assessment development and validation processes.

Here is a short list of topics Regie and I think are of interest to the list and that he would enjoy discussing with us. Please post your questions for Regie in these areas:

  • The role of assessment in standards-based educational reform
  • Ways of judging the validity of an assessment
  • Differences between 'traditional' and performance-based assessment
  • Aligning instruction and assessment with the EFF Standards
  • The role of technology in supporting learning aligned with the EFF Standards.

Thanks,

Ronna Spacone
NIFL-4EFF Discussion List Moderator

Regie's introductory remarks and common threads in the questions

Hello everyone,

I would like to thank Ronna Spacone for this opportunity to respond to questions from the 4EFF list. Having been a listener (aka lurker) on the list for some time, I feel that it is quite an honor to be given a virtual soap box to stand on to address the list. As of the end of the day today, there have already been a number of interesting and challenging questions posted by Debbie Tuler, Ronna Spacone, Amy Trawick, Sue Barton, Donna Curry, Andy Nash, and Mary Hannaman. Rather than respond separately to each of the questions, I will address my comments to what I see as some common threads that run through this initial set of questions.

The common threads so far as I see them are the following:

  • The role of EFF standards and assessment in systemic reform (including improving instructional practice)
  • The pros and cons of performance-based assessment (and portfolios)
  • Assessing ESOL students on the EFF standards

I will be responding to issues in each of these areas in subsequent posts. In doing so, I hope to touch on most (but probably not all) of the particular questions that have been addressed to me.

Before I begin to respond, I think it will be a good idea for me to say something about my role in helping to plan assessment development and validation processes for EFF. Last month, at a meeting of the EFF National Policy Group, I presented a "road map" for validating the EFF Assessment Framework. The road map describes a process and criteria that can be used to develop valid and reliable performance standards and measures aligned with the EFF Content Standards.

The road map includes recommendations for a behavioral-anchoring process that can be used to develop descriptions (and examples) of performance at various levels for each of the EFF standards. It also describes the types of validity evidence that may be needed to satisfy the concerns of various stakeholders.

From a measurement perspective, the central concern is likely to be construct validity - the degree to which an assessment system meets technical criteria for validity and reliability. From a policy perspective, the central concern is likely to be consequential validity - the degree to which the uses of an assessment system lead to fair and equitable outcomes for learners, instructors, programs, and funders. Finally, from a popular perspective, the central concern is likely to be face validity - the degree to which an assessment system is meaningful and understandable to all.

I believe that all three general types of validity (construct, consequential, and face) are equally important. Much of what is wrong with current systems of assessment in adult basic education can be interpreted as problems in one or more of these types of validity. The validity concerns (and processes for avoiding validity problems) that I identified in the road map have recently been the focus of my attention in thinking about EFF. These concerns are the background for most of what I will have to say in response to the questions addressed to me on the list.

On EFF and systemic reform

A number of the questions that have been posted to the list ask about the mechanisms through which the EFF standards and assessments can contribute to improvements in instructional quality and outcomes. These questions were asked at the program level, but I would like to begin my response by working down from the systemic level. One of the key strengths of EFF is the foundation that it offers for standards-based reform of the adult language and literacy educational system. I described the components of an ideal model of standards-based systemic reform in the article I wrote for Focus on Basics. Here's a snippet from the conclusion of that piece:

"According to the ideal model of standards-based reform, all forms of standards -- content, performance, and opportunity-to-learn -- should be aligned. To bring practice closer to the ideal, we must somehow connect EFF, NRS, and NAAL as well as state level standards. This will not be easy, but will offer many benefits. First, coherent content standards can provide a clear vision of what every adult should know and be able to do. Performance standards and related assessment matched to this vision provide the tools for individual learners, literacy programs, and everyone to monitor progress toward goals. Opportunity-to-learn standards may be especially critical for a system of education (adult literacy) that is chronically underfunded."

I wrote the FOB article with applications of standards and assessment in large-scale accountability and reporting systems in mind. But I would argue that effective alignment of learning goals and instructional objectives (which can be described in a generic way by content standards) with measures and expectations for learning outcomes (which can be described in a generic way by performance standards) is a key indicator of educational quality at all levels. Another way to put this is to say that teaching to the test is not a bad thing, as long as the test in question measures knowledge and skills that learners (and others) recognize as important goals for learning. Opportunity-to-learn is also a critical piece of the quality equation at the program level.

A good system of educational standards should support teaching and learning in at least three ways. Content standards should help to clarify long-term goals for learning and to situate particular learning objectives on a pathway leading to those goals. Performance standards should help to establish milestones on the pathway that both teachers and learners can use to mark progress and plan further learning. Opportunity-to-learn standards should help to develop a better understanding of the time and resources that teachers and learners will need to make reasonable progress toward learning goals.

Debbie Tuler asked about what pieces of the EFF framework should be used for what purposes. My general response is that the EFF role maps, standards, common activities, and eventually performance continua and benchmarks of performance are part of an integrated framework that can help teachers to align learners' goals with curriculum design and teaching practices and with measures of learning progress. Admittedly, this is a tall order for the teacher, and the fact that curriculum and assessments used to inform instruction are poorly aligned with the standardized tests used for external reporting and accountability makes the job even harder.

To improve on the current 'misaligned' system, progress needs to be made at both the program level and the systems level. As indicated by Ronna Spacone's question, substantial investment in staff development is the starting point for alignment at the program level. Staff development should provide teachers with the models, support, and guided practice they need to be able to apply the EFF framework to aligning learning goals with teaching practice and with measures of learning at the program level. At the same time, work is needed at the level of accountability systems to better capture results that matter.

EFF has only recently begun the long and hard work of developing an assessment framework (elaborating performance continua for each standard, establishing benchmarks for levels of performance, selecting and developing tasks to measure performance on the standards, and combining all this into a qualifications framework). In this context, I would respond to Sue Barton's question about "the most pressing public policy issues affecting the implementation of EFF into a program" by saying that the first priority should be garnering broad-based support (and involvement) for developing an assessment framework that supports measures of meaningful results in adult learning and establishes reasonable expectations for resources needed to support such results. Beyond this, the other public policy supports that need to be in place to make EFF work at the program level include accountability structures that are aligned with learners' goals and program curricula, expanded professional development opportunities for teachers, expanded access to high-quality learning opportunities for students, and the resources and political will to make all this possible.

On performance assessment validity

Hello again,

This message is in response to questions that were asked about the pros and cons of using performance assessment and portfolios for internal (instructional) and external (accountability) purposes. I think Amy Trawick hit the nail on the head when she noted that "multiple customers will need to buy in to a new way of thinking about assessment" to make EFF's reform vision a reality. That new thinking includes an understanding of the value of aligning standards and assessments at the program, state, and national levels -- for reasons that I described in the last post. It also includes the goal of making assessment an integral part of instruction and learning, rather than -- as is too often the case in using standardized tests -- a separate (and often painful) event that interrupts learning and instruction and distances adult learners from their instructors and from their motivation to learn.

Amy went on to ask my opinion on the factors that affect the utility of a performance-based assessment system for "learner, teacher, program, funder, *and* state/federal purposes." In my view, this is a question about the validity of performance-based assessment and it is exactly the right way to frame such a question. The validity of any assessment should be judged in terms of the purpose of the assessment. For example, I would argue that the methods used to assess certain types of reading and numeracy skills in the last National Adult Literacy Survey (NALS) are valid for the purpose of profiling the distribution of various levels of those reading and numeracy skills in the adult population of the U.S. On the other hand, I would argue that the NALS measures are not valid for the purpose of assessing the overall impact of the adult language and literacy educational system on literacy levels among U.S. adults. The primary reason that the NALS is not valid for the latter purpose is the narrow range and poor alignment of the skills it measures relative to what is being taught and learned in the adult language and literacy educational system.

Mary Hannaman pointed out a similar validity problem (and one that is closer to home) in her questioning of the appropriateness of using standardized tests that have little connection to standards that "have been developed based on the needs or goals of the state." Bringing the issue of validity even closer to home was Donna Curry's question about whether teachers need to be concerned about validity in informal and day-to-day assessment. I think teachers should always be concerned about the validity of any assessment, formal or informal. I would also argue that validity is much easier to achieve when assessment is closely aligned with instructional goals and integrated into instructional activities. This is why many assessment specialists see performance-based assessment as a potentially more valid alternative to traditional testing.

To understand why alternative assessment systems (performance tasks, portfolio assessments, and other integrated measures of knowledge and skills) may be more valid than traditional forms of testing (multiple-choice, fill-in-the-blank, and other discrete measures of knowledge and skills) for various purposes you may want to look back at the three general areas of validity that I described in my "Introductory remarks" -- construct validity, consequential validity, and face validity.

In terms of construct validity, the advantage of alternative assessment is the opportunity to create more direct and more authentic measures of desired knowledge, skills, and abilities than is typically possible with traditional testing. On the other hand, construct validity also includes concerns about reliability (consistency of scores/ratings over time and among raters). Standardized tests are strong on reliability. Performance-based assessments are scored more subjectively, and therefore their reliability must be strengthened by using well-structured scoring guidelines and by training teachers to apply those guidelines effectively and consistently.

In terms of consequential validity, alternative assessment systems again have the advantage over traditional testing in many situations because performance-based assessments, and especially portfolio assessment systems, give learners more opportunities (in more "real-world" contexts) to demonstrate desirable knowledge, skills, and abilities. I recently heard Sri Ananda make the argument that the relatively high cost of using performance assessment (because of the training and process needed to achieve reliability) is justified in cases where only direct measures of performance will do. She used the example of the behind-the-wheel test required to get a driver's license. A paper-and-pencil test alone will clearly not suffice to guide this high-stakes decision. Even in low-stakes testing situations, performance assessment (particularly when results are collected and regularly reviewed in a portfolio) has the advantage of giving learners and instructors more feedback, and feedback that is more directly applicable to improving learning activities and opportunities, than standardized tests can provide.

On the issue of face validity, performance assessment is again a clear winner. Scoring criteria used in performance-based assessment are more easily communicated (and often more meaningful) to learners and teachers than is the case in traditional forms of assessment. Within an alternative assessment system, learning and assessment activities are combined. A well-structured performance task should also be a learning activity. Good use of a portfolio is one way to capitalize on the potential for strong face validity in performance assessment. The primary purpose of the portfolio should be to aid communication between the learner and the instructor so that learning goals and progress can be reviewed and evaluated in an ongoing dialogue. In the end, the portfolio can become a richly textured and substantial piece of evidence of learning achievement.

The challenge is to convince policy makers and funding agencies that such evidence is as valid (and reliable) as standardized test results. Basically, this means changing the ways that policy makers think about validity. The face validity of a standardized test rests mostly on the authority of the experts who design the test and analyze its results. This is seen as having advantages for high-stakes testing because the judgements of experts are viewed as legitimate (even though the bases for arriving at judgements are not widely understood). In my opinion, education is a different sort of system than law or medicine. In medicine and to some extent in law, the public puts its trust in authoritative judgements above popular understanding. In education, it is relatively more important to work toward achieving a balance (and making connections) between expertise and popular understanding.

On EFF and ESOL (English for speakers of other languages)

Hi again,

This is the last of the three general responses that I promised. This one will be shorter (a good thing, no?) than the previous posts and I am hoping that it leads into a more open discussion.

The topic is (as Debbie Tuler put it) "the meaningfulness and use of [an EFF Standard] for planning instruction and assessment for ESOL." Debbie was particularly interested in the "speak so others can understand" standard. Andy Nash asked a similar question about using EFF to guide assessment and instruction for beginning English learners. I think that an assessment expert and policy wonk (like me) is less likely to provide useful guidance in this area than teachers who have worked through this problem in practice. But, of course, I do have some advice to offer.

First, there is the issue of how much guidance the EFF Standards can provide to teachers in planning instruction and assessment in any case. The EFF Standards (like any Content Standards) should be "visionary and not at all prescriptive" (to borrow a phrase from Andy Porter). In other words, the role of the standards should be to help organize and frame instruction and assessment, but never to set limits on what should be taught and assessed. The EFF Standards cannot be the sum total of any program or classroom curriculum, but they can help learners, teachers, and program managers to see the "big picture" of learning goals and perhaps point to gaps where new curriculum development is needed.

Second, let me repeat the mantra of the assessment specialists -- "multiple measures." The guidance that EFF can provide for developing assessments and for aligning instruction and assessment is also limited. If we want to assess an ESL learner on the "speak so others can understand" or "reflect and evaluate" standards, the best and most direct approach would be some form of performance assessment that provides an opportunity to evaluate the learner's ability to "perform" in an authentic situation. However, constructing such a performance task that is appropriate for use with beginning English learners will be a challenge. It will probably require a degree of scaffolding by the instructor that makes the situation somewhat less than authentic (for example, simulating a conversation with a librarian rather than sending students out to the library to find something). In this case and many others, developing performance tasks to measure progress on the EFF standards should not be seen as replacing other forms of assessment used to guide instruction.

Thanks for the great questions (and I realize that I have not responded to all of them yet). I am looking forward to the discussion,

Regie

Regie Stites, Ph.D.
Education Researcher
Center for Education and Human Services
SRI International
Menlo Park, CA



Last updated: Wednesday, 31-Jan-2007 10:50:41 EST