From: "Eileen Eckert" <eileeneckert@hotmail.com>
To: Multiple recipients of list <nifl-assessment@literacy.nifl.gov>
Subject: [NIFL-ASSESSMENT:496] Re: why "valid and reliable"?
Date: Sat, 10 Apr 2004 17:10:16 -0400 (EDT)

David,

Perhaps I erred in thinking that your and Robert's messages had enough in common to address both messages with the same reply.

You said, "The definition I gave of validity, perhaps oversimplified, was 'If an assessment is valid it measures what it says it measures, not something else.'" Then you said, "I see nothing in this definition which requires an 'isolat(ion) of knowledge and skills and test(ing) them apart from the context in which they are used.'"

How can we measure what we say we're measuring, and not something else, without separating or isolating what we say we're measuring from everything else? Can you give a specific example of how to achieve a valid assessment? And address reliability as well?

By the way, all this <still> leaves unanswered the original question about the assumptions upon which the whole concept of validity and reliability is based, the same assumptions upon which the concepts of validity and reliability in social sciences research are based, and open to the same critiques.

Eileen

From: David Rosen <DJRosen@TheWorld.com>
Reply-To: nifl-assessment@nifl.gov
To: Multiple recipients of list <nifl-assessment@literacy.nifl.gov>
Subject: [NIFL-ASSESSMENT:495] Re: why "valid and reliable"?
Date: Sat, 10 Apr 2004 15:19:08 -0400 (EDT)

Eileen and others,

I did not assume anything about your understanding of validity. The definition I gave says nothing about standardized testing, or even testing, which is only one kind of assessment. The definition I gave does not preclude direct, "authentic" assessment -- it's often what is needed to reach the gold standard in validity. I also said the problem is that holding this standard requires more resources than our field -- including programs -- often has.

I might add that I am concerned that holding these standards for assessment, validity and reliability, but not significantly increasing resources results in sacrifice of instruction time, compromises the standards in practice, and drives teachers nuts. Raising standards, especially assessment standards, has a cost. Congress has raised the standards but so far has not paid for the cost. It's an unfunded mandate.

David J. Rosen
djrosen@comcast.net

On Saturday, April 10, 2004, at 02:02 PM, Eileen Eckert wrote:

>Hi Robert and David,
>You’ve both given definitions, but those don’t get to the underlying
>assumptions. And it looks to me like you’re both assuming that if I really
>understood validity and reliability, then their appropriateness would be
>self-evident. If that’s the case, could you try to set that assumption
>aside for a while if you continue reading? I do know about validity and
>reliability, and I still don’t think they are meaningful or useful criteria
>for judging assessment of adult learning.
>
>Robert, you said: A valid assessment tests what you want it to test: the
>specific skills and knowledge being taught.
>
>Underlying that definition of validity is the assumption that you can
>isolate knowledge and skills and test them apart from the context in which
>they are used, and that doing so will tell you something meaningful about
>what the student has learned. Robert gave the example, “If I teach a class
>in number skills to a group of limited English speakers and then assess
>their skills with a series of word problems that use vocabulary the
>students don't understand, then the assessment is not valid.”
>
>Using that example, in order to have valid test items, you’d have to either
>construct word problems using vocabulary the students know, or test the
>number skills without the words to confound the issue. If you construct
>word problems the students know, then you likely introduce problems of
>reliability -- will students in every testing situation share that particular
>vocabulary so that the test functions the same from group to group? Then,
>to address that, you need to have standardized teaching -- make sure the
>students in every test situation share the same vocabulary by teaching that
>same vocabulary. Or you could test the math skills without the words, but
>we like to use word problems because they address the concern that students
>don’t encounter numbers in isolation; they encounter them in context.
>
>But there’s the rub. They encounter numbers in different contexts, and
>transfer of skills from one to another is one of the most difficult issues
>in teaching and learning, so being able to choose the “right” answer on a
>test doesn’t necessarily mean they can use the skill to do something that
>matters to them personally. And being able to use the skill in a way that
>matters to them personally doesn’t necessarily mean they can
>de-contextualize that skill and re-contextualize it to pick the right
>answer on a test. So what does the test score mean?
>
>In order to get a valid assessment, you have to isolate the knowledge
>and/or skill you are trying to assess and remove other, confounding
>knowledge and skills (or else make sure you have taught all the knowledge
>and skills being tested by standardizing the teaching). This leads to
>narrowly focused test items (or performance tasks). In order to get a
>reliable assessment, you have to make sure the items are not open to
>different interpretations; they have to function the same way with every
>group of students tested. Essentially, you have to separate what’s been
>learned from the person (or people) who have learned it, and the more we
>know about how people learn, the less sense this makes. To do valid and
>reliable assessment, you’re always trying to make up for the fact that each
>person is a unique individual.
>
>The criteria of validity and reliability in educational assessment share
>their roots with the criteria of validity and reliability in social
>sciences research; they’re open to the same critiques and you can find
>those critiques in any number of books and articles on grounded theory,
>qualitative research, and naturalistic research. There are possible
>alternatives to this view of assessment, as there are to the positivist
>view of research. There are also all sorts of other issues we haven't
>talked about, like using a single measure to assess learning, as we do with
>mandates to use CASAS, TABE, or other standardized tests, but this message
>is probably too long already. I think that to choose validity and
>reliability as the standard we endorse, we have to look at the assumptions
>on which that standard is based, how well it matches and represents what we
>know about learning, how it works in practice, and alternatives to it.
>
>Eileen

This archive was generated by hypermail 2b30 : Thu Dec 23 2004 - 09:46:14 EST