From: Regie Stites <regie.stites@sri.com>
To: Multiple recipients of list <nifl-4eff@literacy.nifl.gov>
Date: Sat, 21 Oct 2000 17:51:40 -0400 (EDT)
Subject: [NIFL-4EFF:1226] Performance assessment validity

Hello again,

I apologize for the scrambling of my last message. In case you missed it, the message was continued below my name and contact information. I'll try to do better this time.

This message is in response to questions that were asked about the pros and cons of using performance assessment and portfolios for internal (instructional) and external (accountability) purposes.

I think Amy Trawick hit the nail on the head when she noted that "(m)ultiple customers will need to buy in to a new way of thinking about assessment" to make EFF's reform vision a reality. That new thinking includes an understanding of the value of aligning standards and assessments at the program, state, and national levels -- for reasons that I described in the last post. It also includes the goal of making assessment an integral part of instruction and learning, rather than -- as is too often the case in using standardized tests -- a separate (and often painful) event that interrupts learning and instruction and distances adult learners from their instructors and from their motivation to learn.
Amy went on to ask my opinion on the factors that affect the utility of a performance-based assessment system for "learner, teacher, program, funder, *and* state/federal purposes." In my view, this is a question about the validity of performance-based assessment, and it is exactly the right way to frame such a question. The validity of any assessment should be judged in terms of the purpose of the assessment. For example, I would argue that the methods used to assess certain types of reading and numeracy skills in the last National Adult Literacy Survey (NALS) are valid for the purpose of profiling the distribution of various levels of those reading and numeracy skills in the adult population of the U.S. On the other hand, I would argue that the NALS measures are not valid for the purpose of assessing the overall impact of the adult language and literacy educational system on literacy levels among U.S. adults. The primary reason that the NALS is not valid for the latter purpose is the narrow range and poor alignment of the skills it measures relative to what is being taught and learned in the adult language and literacy educational system.

Mary Hannaman pointed out a similar validity problem (and one that is closer to home) in her questioning of the appropriateness of using standardized tests that have little connection to standards that "have been developed based on the needs or goals of the state."

Bringing the issue of validity even closer to home was Donna Curry's question about whether teachers need to be concerned about validity in informal and day-to-day assessment. I think teachers should always be concerned about the validity of any assessment, formal or informal. I would also argue that validity is much easier to achieve when assessment is closely aligned with instructional goals and integrated into instructional activities.
This is why many assessment specialists see performance-based assessment as a potentially more valid alternative to traditional testing. To understand why alternative assessment systems (performance tasks, portfolio assessments, and other integrated measures of knowledge and skills) may be more valid than traditional forms of testing (multiple-choice, fill-in-the-blank, and other discrete measures of knowledge and skills) for various purposes, you may want to look back at the three general areas of validity that I described in my "Introductory remarks" -- construct validity, consequential validity, and face validity.

In terms of construct validity, the advantage of alternative assessment is the opportunity to create more direct and more authentic measures of desired knowledge, skills, and abilities than is typically possible with traditional testing. On the other hand, construct validity also includes concerns about reliability (consistency of scores/ratings over time and among raters). Standardized tests are strong on reliability. Performance-based assessments are scored more subjectively, and therefore reliability must be strengthened through well-structured scoring guidelines and by training teachers to apply those guidelines effectively and consistently.

In terms of consequential validity, alternative assessment systems again have the advantage over traditional testing in many situations because performance-based assessments, and especially portfolio assessment systems, give learners more opportunities (in more "real-world" contexts) to demonstrate desirable knowledge, skills, and abilities. I recently heard Sri Ananda make the argument that the relatively high costs of using performance assessment (because of the training and process needed to achieve reliability) are justified in cases where only direct measures of performance will do. She used the example of the behind-the-wheel test required to get a driver's license.
A paper-and-pencil test alone will clearly not suffice to guide this high-stakes decision. Even in low-stakes testing situations, performance assessment (particularly when results are collected and regularly reviewed in a portfolio) has the advantage of giving learners and instructors feedback that is more directly applicable to improving learning activities and opportunities than the guidance that standardized tests can provide.

On the issue of face validity, performance assessment is again a clear winner. Scoring criteria used in performance-based assessment are more easily communicated (and often more meaningful) to learners and teachers than is the case in traditional forms of assessment. Within an alternative assessment system, learning and assessment activities are combined. A well-structured performance task should also be a learning activity. Good use of a portfolio is one way to capitalize on the potential for strong face validity in performance assessment. The primary purpose of the portfolio should be to aid communication between the learner and the instructor so that learning goals and progress can be reviewed and evaluated in an ongoing dialogue. In the end, the portfolio can become a richly textured and substantial piece of evidence of learning achievement.

The challenge is to convince policy makers and funding agencies that such evidence is as valid (and reliable) as standardized test results. Basically, this means changing the ways that policy makers think about validity. The face validity of a standardized test rests mostly on the authority of the experts who design the test and analyze its results. This is seen as having advantages for high-stakes testing because the judgements of experts are viewed as legitimate (even though the bases for arriving at judgements are not widely understood). In my opinion, education is a different sort of system than law or medicine.
In medicine and to some extent in law, the public puts its trust in authoritative judgements above popular understanding. In education, it is relatively more important to work toward achieving a balance (and making connections) between expertise and popular understanding.

Regie

Regie Stites, Ph.D.
Education Researcher
Center for Education and Human Services
SRI International
Menlo Park, CA
e-mail: regie.stites@sri.com
voice: (650) 859-3768