Showing 61 to 75 of 694 results
Peer reviewed
Mattern, Krista; Sanchez, Edgar; Ndum, Edwin – Educational Measurement: Issues and Practice, 2017
In the context of college admissions, the current study examined whether differential prediction of first-year grade point average (FYGPA) by gender could be explained by an omitted variable problem--namely, academic discipline, or the amount of effort a student puts into schoolwork and the degree to which a student sees him/herself as hardworking…
Descriptors: Females, Academic Achievement, Predictive Validity, Grade Point Average
Peer reviewed
Allalouf, Avi; Gutentag, Tony; Baumer, Michal – Educational Measurement: Issues and Practice, 2017
Quality control (QC) in testing is paramount. QC procedures for tests can be divided into two types. The first type, one that has been well researched, is QC for tests administered to large population groups on few administration dates using a small set of test forms (e.g., large-scale assessment). The second type is QC for tests, usually…
Descriptors: Quality Control, Scoring, Computer Assisted Testing, Error Patterns
Peer reviewed
Castellano, Katherine E.; McCaffrey, Daniel F. – Educational Measurement: Issues and Practice, 2017
Mean or median student growth percentiles (MGPs) are a popular measure of educator performance, but they lack rigorous evaluation. This study investigates the error in MGP due to test score measurement error (ME). Using analytic derivations, we find that errors in the commonly used MGP are correlated with average prior latent achievement: Teachers…
Descriptors: Teacher Evaluation, Teacher Effectiveness, Value Added Models, Achievement Gains
Peer reviewed
Feinberg, Richard A.; Jurich, Daniel P. – Educational Measurement: Issues and Practice, 2017
Recent research has proposed a criterion to evaluate the reportability of subscores. This criterion is a value-added ratio ("VAR"), where values greater than 1 suggest that the true subscore is better approximated by the observed subscore than by the total score. This research extends the existing literature by quantifying statistical…
Descriptors: Guidelines, Scores, Research Reports, Value Added Models
Peer reviewed
Furtak, Erin Marie; Ruiz-Primo, Maria Araceli; Bakeman, Roger – Educational Measurement: Issues and Practice, 2017
Formative assessment is a classroom practice that has received much attention in recent years for its established potential at increasing student learning. A frequent analytic approach for determining the quality of formative assessment practices is to develop a coding scheme and determine frequencies with which the codes are observed; however,…
Descriptors: Sequential Approach, Formative Evaluation, Alternative Assessment, Incidence
Peer reviewed
Wise, Steven L. – Educational Measurement: Issues and Practice, 2017
The rise of computer-based testing has brought with it the capability to measure more aspects of a test event than simply the answers selected or constructed by the test taker. One behavior that has drawn much research interest is the time test takers spend responding to individual multiple-choice items. In particular, very short response…
Descriptors: Guessing (Tests), Multiple Choice Tests, Test Items, Reaction Time
Peer reviewed
Wind, Stefanie A.; Schumacker, Randall E. – Educational Measurement: Issues and Practice, 2017
The term measurement disturbance has been used to describe systematic conditions that affect a measurement process, resulting in a compromised interpretation of person or item estimates. Measurement disturbances have been discussed in relation to systematic response patterns associated with items and persons, such as start-up, plodding, boredom,…
Descriptors: Measurement, Testing Problems, Writing Tests, Performance Based Assessment
Peer reviewed
Wyse, Adam E. – Educational Measurement: Issues and Practice, 2017
This article illustrates five different methods for estimating Angoff cut scores using item response theory (IRT) models. These include maximum likelihood (ML), expected a posteriori (EAP), modal a posteriori (MAP), and weighted maximum likelihood (WML) estimators, as well as the most commonly used approach based on translating ratings through the test…
Descriptors: Cutting Scores, Item Response Theory, Bayesian Statistics, Maximum Likelihood Statistics
Peer reviewed
Clark, A. K.; Nash, B.; Karvonen, M.; Kingston, N. – Educational Measurement: Issues and Practice, 2017
The purpose of this study was to develop a standard-setting method appropriate for use with a diagnostic assessment that produces profiles of student mastery rather than a single raw or scale score value. The condensed mastery profile method draws from established holistic standard-setting methods to use rounds of range finding and pinpointing to…
Descriptors: Diagnostic Tests, Standard Setting (Scoring), Cutting Scores, Performance
Peer reviewed
Wyse, Adam E.; Babcock, Ben – Educational Measurement: Issues and Practice, 2017
This article provides an overview of the Hofstee standard-setting method and illustrates several situations where the Hofstee method will produce undefined cut scores. The situations where the cut scores will be undefined involve cases where the line segment derived from the Hofstee ratings does not intersect the score distribution curve based on…
Descriptors: Cutting Scores, Evaluation Methods, Standard Setting (Scoring), Comparative Analysis
Peer reviewed
Sheehan, Kathleen M. – Educational Measurement: Issues and Practice, 2017
Automated text complexity measurement tools (also called readability metrics) have been proposed as a way to help teachers, textbook publishers, and assessment developers select texts that are closely aligned with the new, more demanding text complexity expectations specified in the Common Core State Standards. This article examines a critical…
Descriptors: Reading Material Selection, Difficulty Level, Common Core State Standards, Validity
Peer reviewed
Shewach, Oren R.; Shen, Winny; Sackett, Paul R.; Kuncel, Nathan R. – Educational Measurement: Issues and Practice, 2017
The literature on differential prediction of college performance of racial/ethnic minority students for standardized tests and high school grades indicates the use of these predictors often results in overprediction of minority student performance. However, these studies typically involve native English-speaking students. In contrast, a smaller…
Descriptors: Prediction, Minority Group Students, Standardized Tests, High School Students
Peer reviewed
Ferrara, Steve – Educational Measurement: Issues and Practice, 2017
Test security is not an end in itself; it is important because we want to be able to make valid interpretations from test scores. In this article, I propose a framework for comprehensive test security systems: prevention, detection, investigation, and resolution. The article discusses threats to test security, roles and responsibilities, rigorous…
Descriptors: Testing Programs, Educational Practices, Educational Policy, Program Improvement
Peer reviewed
Evans, Carla M.; Lyons, Susan – Educational Measurement: Issues and Practice, 2017
The purpose of this study was to test methods that strengthen the comparability claims about annual determinations of student proficiency in English language arts, math, and science (Grades 3-12) in the New Hampshire Performance Assessment of Competency Education (NH PACE) pilot project. First, we examined the literature in order to define…
Descriptors: Academic Achievement, Language Arts, Mathematics Achievement, Science Achievement
Peer reviewed
Davis, Laurie; Morrison, Kristin; Kong, Xiaojing; McBride, Yuanyuan – Educational Measurement: Issues and Practice, 2017
The use of tablets for large-scale testing programs has transitioned from concept to reality for many state testing programs. This study extended previous research on score comparability between tablets and computers with high school students to compare score distributions across devices for reading, math, and science and to evaluate device…
Descriptors: Computer Assisted Testing, Handheld Devices, Telecommunications, Scoring