From: "John Makay" <makay00@hotmail.com>
To: Multiple recipients of list <nifl-assessment@literacy.nifl.gov>
Date: Tue, 16 Jul 2002 14:13:56 -0400 (EDT)
Subject: [NIFL-ASSESSMENT:152] RE: norm vs criterion

Thanks for the clarification on norm- and criterion-referenced tests. My concern, though, is selecting a placement test for my school, and I have some questions in this area.

Is the CASAS, which is a criterion-referenced test, appropriate for placing students in different levels of a program if the program's curriculum is not predominantly CASAS-based? Also, when evaluating tests to see which has the best psychometric properties, how important is the population (sampling) factor if the test is criterion-referenced, like the CASAS?

We have been considering giving the Diagnostic Assessment of Reading (DAR) to place students in the various levels of our Pre-GED program, but discovered that it was intended for a K-12 population. What are the issues in using a test like this with an adult population?

Also, what are the strongest variables in assessing a test? In other words, are some of the test-quality criteria more important than others?
For example, several reviewers in the Buros Mental Measurements Yearbook point out that many test manuals are lax in providing information on content validity; giving great emphasis to content validity when judging tests may therefore not be the best approach, since this information is rarely available.

I have found a few sources that identify criteria for test evaluation. According to Popham (2000, pp. 195-196), the following factors should be considered when reviewing a set of comparative data for norm-referenced tests.

1. Sample size. Is the sample in the norm group large enough to assure a reasonable degree of stability in the database from which educators must draw interpretations?
2. Representativeness. Is the sample drawn in such a way as to represent the kinds of students for whom interpretations must be made?
3. Recency. Were the normative data gathered in the last few years, or is the information out of date because it was collected too long ago?
4. Description of procedures. Are the procedures associated with the gathering of the normative data sufficiently well described so that those procedures can be properly evaluated?

In addition to the criteria above, “key standards should be considered from the Standards for Educational and Psychological Testing established by the American Educational Research Association, the American Psychological Association, and the National Council on Measurement” (Rudner, 2000, v-13). These seem to have a broader range and appear to cover both norm- and criterion-referenced tests.

Assessment Standards for Selection of a Test

Test Coverage and Usage - There must be a clear statement of recommended uses and a description of the population for which the test is intended. The use intended by the test developer must be justified by the publisher on technical grounds.
Appropriate Samples for Test Validation and Norming - The samples used for test validation and norming must be of adequate size and representative of the group for which the test is intended in terms of age, experience, and background.
Reliability - Test publishers should be able to demonstrate that the test is sufficiently reliable to permit stable estimates of individual ability.
Predictive Validity - Evidence of the predictive validity of the test must include a comparison of performance on the test being validated against performance on some outside criteria such as course grades, class rank, other tests, teacher ratings, or other related criteria.
Content Validity - Content validity can be evaluated by examining the plan and procedures reportedly used in the construction of the test.
Construct Validity - Test publishers are in a position to demonstrate that the test adequately measures a particular construct.
Test Administration - All test administration specifications, such as instructions to test takers, time limits, use of reference materials, use of calculators, lighting, equipment, assigning seats, monitoring, room requirements, testing sequence, and time of day, should be fully described.
Test Reporting - Test publishers are responsible for fully describing the methods used to report test results, including scaled scores, subtest results, and combined test results.
Test and Item Bias - Test developers are expected to exhibit a sensitivity to the demographic characteristics of test takers, and steps should be taken during test development, validation, standardization, and documentation to minimize the influence of cultural factors on individual test scores.

So, which of the variables above are more important, and for which situations? Are there any books or references out there that answer this question, or can someone share their experience on this?
Also, more specifically, what kind of test is appropriate as a placement instrument in an adult basic education program with many levels? Can both criterion-referenced and norm-referenced tests do the job? If anyone can give me some insight into how best to evaluate a test for broad use in our school for placement purposes, please, please step forward.

John Makay
Literacy Instructor
Baltimore City Community College

REFERENCES

Mueller, R. O. and Freitag, P. K. (n.d.). Comprehensive Adult Student Assessment System [Review of the CASAS test]. Mental Measurements Yearbook (13th ed.). Lincoln: University of Nebraska Press.

Popham, W. J. (2000). Modern educational measurement: Practical guidelines for the education leader. Needham: Allyn & Bacon.

Rudner, L. E. (2000). Assessing Student Learning. Newark: Delaware Education Research and Development Center. (Originally from ERIC/AE, 12/93)