NotesFAQContact Us
Collection
Advanced
Search Tips
What Works Clearinghouse Rating
Showing 1 to 15 of 631 results Save | Export
Peer reviewed Peer reviewed
Direct linkDirect link
Confrey, Jere; Toutkoushian, Emily; Shah, Meetal – Applied Measurement in Education, 2019
Fully articulating validation arguments in the context of classroom assessment requires connecting evidence from multiple sources and addressing multiple types of validity in a coherent chain of reasoning. This type of validation argument is particularly complex for assessments that function in close proximity to instruction, address the fine…
Descriptors: Test Validity, Item Response Theory, Middle School Students, Mathematics Instruction
Peer reviewed Peer reviewed
Direct linkDirect link
Ketterlin-Geller, Leanne R.; Perry, Lindsey; Adams, Elizabeth – Applied Measurement in Education, 2019
Despite the call for an argument-based approach to validity over 25 years ago, few examples exist in the published literature. One possible explanation for this outcome is that the complexity of the argument-based approach makes implementation difficult. To counter this claim, we propose that the Assessment Triangle can serve as the overarching…
Descriptors: Validity, Educational Assessment, Models, Screening Tests
Peer reviewed Peer reviewed
Direct linkDirect link
Jacobson, Erik; Svetina, Dubravka – Applied Measurement in Education, 2019
Contingent argument-based approaches to validity require a unique argument for each use, in contrast to more prescriptive approaches that identify the common kinds of validity evidence researchers should consider for every use. In this article, we evaluate our use of an approach that is both prescriptive "and" argument-based to develop a…
Descriptors: Test Validity, Test Items, Test Construction, Test Interpretation
Peer reviewed Peer reviewed
Direct linkDirect link
Carney, Michele; Crawford, Angela; Siebert, Carl; Osguthorpe, Rich; Thiede, Keith – Applied Measurement in Education, 2019
The "Standards for Educational and Psychological Testing" recommend an argument-based approach to validation that involves a clear statement of the intended interpretation and use of test scores, the identification of the underlying assumptions and inferences in that statement--termed the interpretation/use argument, and gathering of…
Descriptors: Inquiry, Test Interpretation, Validity, Scores
Peer reviewed Peer reviewed
Direct linkDirect link
Krupa, Erin Elizabeth; Carney, Michele; Bostic, Jonathan – Applied Measurement in Education, 2019
This article provides a brief introduction to the set of four articles in the special issue. To provide a foundation for the issue, key terms are defined, a brief historical overview of validity is provided, and a description of several different validation approaches used in the issue are explained. Finally, the contribution of the articles to…
Descriptors: Test Items, Program Validation, Test Validity, Mathematics Education
Peer reviewed Peer reviewed
Direct linkDirect link
Cohen, Dale J.; Zhang, Jin; Wothke, Werner – Applied Measurement in Education, 2019
Construct-irrelevant cognitive complexity of some items in the statewide grade-level assessments may impose performance barriers for students with disabilities who are ineligible for alternate assessments based on alternate achievement standards. This has spurred research into whether items can be modified to reduce complexity without affecting…
Descriptors: Test Items, Accessibility (for Disabled), Students with Disabilities, Low Achievement
Peer reviewed Peer reviewed
Direct linkDirect link
Kim, Seonghoon; Kolen, Michael J. – Applied Measurement in Education, 2019
In applications of item response theory (IRT), fixed parameter calibration (FPC) has been used to estimate the item parameters of a new test form on the existing ability scale of an item pool. The present paper presents an application of FPC to multiple examinee groups test data that are linked to the item pool via anchor items, and investigates…
Descriptors: Item Response Theory, Item Banks, Test Items, Computation
Peer reviewed Peer reviewed
Direct linkDirect link
Thompson, W. Jake; Clark, Amy K.; Nash, Brooke – Applied Measurement in Education, 2019
As the use of diagnostic assessment systems transitions from research applications to large-scale assessments for accountability purposes, reliability methods that provide evidence at each level of reporting are needed. The purpose of this paper is to summarize one simulation-based method for estimating and reporting reliability for an…
Descriptors: Test Reliability, Diagnostic Tests, Classification, Computation
Peer reviewed Peer reviewed
Direct linkDirect link
Wise, Steven L. – Applied Measurement in Education, 2019
The identification of rapid guessing is important to promote the validity of achievement test scores, particularly with low-stakes tests. Effective methods for identifying rapid guesses require reliable threshold methods that are also aligned with test taker behavior. Although several common threshold methods are based on rapid guessing response…
Descriptors: Guessing (Tests), Identification, Reaction Time, Reliability
Peer reviewed Peer reviewed
Direct linkDirect link
Michaelides, Michalis P. – Applied Measurement in Education, 2019
The Student Background survey administered along with achievement tests in studies of the International Association for the Evaluation of Educational Achievement includes scales of student motivation, competence, and attitudes toward mathematics and science. The scales consist of positively- and negatively keyed items. The current research…
Descriptors: International Assessment, Achievement Tests, Mathematics Achievement, Mathematics Tests
Peer reviewed Peer reviewed
Direct linkDirect link
Zwick, Rebecca; Ye, Lei; Isham, Steven – Applied Measurement in Education, 2019
In US colleges, the scarcity of students from low-income families is a major concern. We present a novel way of boosting the percentage of qualified low-income students using constrained optimization (CO), an operations research technique. CO allows incorporation of both academic requirements and diversity goals in college admissions. The incoming…
Descriptors: Low Income Students, Operations Research, College Admission, Admission Criteria
Peer reviewed Peer reviewed
Direct linkDirect link
Quesen, Sarah; Lane, Suzanne – Applied Measurement in Education, 2019
This study examined the effect of similar vs. dissimilar proficiency distributions on uniform DIF detection on a statewide eighth grade mathematics assessment. Results from the similar- and dissimilar-ability reference groups with an SWD focal group were compared for four models: logistic regression, hierarchical generalized linear model (HGLM),…
Descriptors: Test Items, Mathematics Tests, Grade 8, Item Response Theory
Peer reviewed Peer reviewed
Direct linkDirect link
Haladyna, Thomas M.; Rodriguez, Michael C.; Stevens, Craig – Applied Measurement in Education, 2019
The evidence is mounting regarding the guidance to employ more three-option multiple-choice items. From theoretical analyses, empirical results, and practical considerations, such items are of equal or higher quality than four- or five-option items, and more items can be administered to improve content coverage. This study looks at 58 tests,…
Descriptors: Multiple Choice Tests, Test Items, Testing Problems, Guessing (Tests)
Peer reviewed Peer reviewed
Direct linkDirect link
Sinharay, Sandip; Zhang, Mo; Deane, Paul – Applied Measurement in Education, 2019
Analysis of keystroke logging data is of increasing interest, as evident from a substantial amount of recent research on the topic. Some of the research on keystroke logging data has focused on the prediction of essay scores from keystroke logging features, but linear regression is the only prediction method that has been used in this research.…
Descriptors: Scores, Prediction, Writing Processes, Data Analysis
Peer reviewed Peer reviewed
Direct linkDirect link
Finch, Holmes; French, Brian F. – Applied Measurement in Education, 2019
The usefulness of item response theory (IRT) models depends, in large part, on the accuracy of item and person parameter estimates. For the standard 3 parameter logistic model, for example, these parameters include the item parameters of difficulty, discrimination, and pseudo-chance, as well as the person ability parameter. Several factors impact…
Descriptors: Item Response Theory, Accuracy, Test Items, Difficulty Level
Previous Page | Next Page »
Pages: 1  |  2  |  3  |  4  |  5  |  6  |  7  |  8  |  9  |  10  |  11  |  ...  |  43