From: "Eileen Eckert" <eileeneckert@hotmail.com>
To: Multiple recipients of list <nifl-assessment@literacy.nifl.gov>
Reply-To: nifl-assessment@literacy.nifl.gov
Date: Mon, 12 Apr 2004 13:47:32 -0400 (EDT)
Subject: [NIFL-ASSESSMENT:501] RE: why "valid and reliable"?

Hi Allan,

I'm structuring your statements and my responses as a kind of dialogue. I hope that doing so will make it easier to follow.

Allan: Eileen, I believe that your examples do not negate the usefulness of the characteristics validity and reliability, but show the limitations of each.

Eileen: It's the limitations that concern me. In order to accommodate or overcome the limitations, I think we move in a direction that is counterproductive for students. I think this is illustrated in David's example of performance assessment in voc ed, in which the circumstances of the task are controlled to increase reliability. While his example illustrates a "good" assessment, I don't think controlling the environment serves any purpose <except> increasing reliability, and to me, that's not a compelling reason to do it when there are other quality criteria that could be used (read on, please, I will get to that).

Allan: Validity and reliability need to be considered, not as absolutes, but in relative terms, or in degrees.

Eileen: Agreed.

Allan: Assessment instruments are not simply valid or invalid, but more or less valid in comparison to each other, and in comparison to some ideal of those terms or some model assessment.

Eileen: Yes, and they are not only valid or invalid, but they are valid or not valid for the use to which they are put. Say, for example, someone comes to a literacy program with a specific goal of passing the written portion of the test for a driver's license. The student takes a CASAS test upon entry. Then the tutor works with the student on the goal of passing the written test for a driver's license, and the student passes the test. Then the program coordinator administers the "correct" level CASAS test to assess progress, and the pre- to post-test scores indicate very little progress. The CASAS hasn't changed, and it hasn't changed in relation to the TABE or BEST; its validity as an assessment of the "competencies" it measures is the same. But it is not a valid assessment of what that person has learned. In this case, the written driver's license exam the student passed is the valid assessment--it measures what we want to measure, that is, did the student meet her or his goal? But that assessment would only be valid for students who shared that goal, not for others who want to enter vocational training, pass the GED, or read to their kids.

Allan: Still, they are necessary and useful criteria for choosing or creating assessment tools.

Eileen: No, in many cases they're not. I think that instead of pursuing validity and reliability we should be looking at the criteria for building trustworthiness. Trustworthiness encompasses credibility, dependability, transferability, and generalizability. In research, it's an alternative to the positivist paradigm (from which validity and reliability come).
I think it can be adapted to assessment, and that to do so would be more worthwhile, more meaningful, and more consistent with the aims of adult literacy and basic education than pursuing validity and reliability. I will try to make that case, but probably not all in one message!

Allan: Surely, you cannot contest the argument that a test of math ability which gives low scores to Vietnamese students because they can't read the English is not an accurate (i.e., valid) assessment of the math skills of those test-takers. Some level of validity is an important and even necessary characteristic for any assessment, but it is not the only characteristic and it is not an absolute one.

Eileen: Allan, I'm not even trying to make the argument that the test you posit above is valid. When you say "validity is not the only characteristic and it is not an absolute one," I think your concern for quality is consistent with the idea of credibility and dependability. That is, the assessment has "truth value" to the subjects and the assessment reflects their math skills, not their language. But whereas a "valid" assessment would be one that tries to be free of bias, a credible and dependable one might be one in which the students document their math skills, and that documentation might take various forms. This would bring us back to assessments such as portfolios, which have kind of dropped off the radar with the focusing of accountability requirements on standardized tests.

Allan: Many have argued that assessment instruments are only estimates of the knowledge and abilities of those they assess. In assessing human endeavors, any assessment is fraught with uncontrollable and nonquantifiable conditions and variables.** But that doesn't mean we don't assess or we don't make judgements of our students' abilities.
And it doesn't mean we simply reject what is not perfect, but try in many cases to make the imperfect better, and to keep the discussion going about what constitutes good assessment.

Eileen: Again, I agree, but I think that "better" in terms of validity and reliability is not "better" in terms of education and learning. As your note at the end illustrates, validity and reliability are better suited to scientific experiments in which the variables can be controlled to a much greater extent than people and their learning environments can (or should be). "Better" in terms of trustworthiness <is> consistent with improving learning and education. The "uncontrollable and nonquantifiable conditions and variables" you speak of are a huge concern, but instead of seeing them as barriers to validity and reliability, we can see them as part of "complex systems"* and make our assessments work with them, instead of trying to control for them.

I think I'll stop there. There's much more to your message, and to this discussion, but this is getting too long.

Eileen

*A good article on complex systems is Davis, B. & Sumara, D. (2001). Learning communities: Understanding the workplace as a complex system. In Imel, S. (Series Ed.) & Fenwick, T. (Vol. Ed.), Sociocultural perspectives on learning through work: No. 92. New Directions for Adult and Continuing Education (pp. 85-95). San Francisco: Jossey-Bass.