Showing 106 to 120 of 694 results
Peer reviewed
Monroe, Scott; Cai, Li – Educational Measurement: Issues and Practice, 2015
Student growth percentiles (SGPs; Betebenner, 2009) are used to locate a student's current score in a conditional distribution based on the student's past scores. Currently, following Betebenner (2009), quantile regression (QR) is most often used operationally to estimate the SGPs. Alternatively, multidimensional item response theory (MIRT) may…
Descriptors: Item Response Theory, Reliability, Growth Models, Computation
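For readers unfamiliar with the quantile-regression approach to SGPs mentioned in this abstract, the sketch below illustrates the basic idea only: fit a family of conditional-quantile models of the current score given a prior score, then report the highest percentile whose fitted value falls at or below the student's observed score. It is a deliberately simplified, hypothetical example (a single linear prior-score predictor, simulated data, invented column names), not the operational SGP methodology, which uses B-spline quantile regression and multiple prior scores.

```python
# Illustrative sketch (not the authors' code): a student growth percentile (SGP)
# from quantile regressions of the current score on one prior score.
# Data and column names are hypothetical.
import numpy as np
import pandas as pd
import statsmodels.formula.api as smf

rng = np.random.default_rng(0)
df = pd.DataFrame({"prior": rng.normal(500, 50, 2000)})
df["current"] = 0.8 * df["prior"] + rng.normal(100, 30, 2000)

quantiles = np.arange(0.01, 1.00, 0.01)
fits = {q: smf.quantreg("current ~ prior", df).fit(q=q) for q in quantiles}

def sgp(prior_score, current_score):
    """Highest percentile whose fitted conditional quantile lies at or
    below the student's observed current score."""
    preds = {q: fit.predict(pd.DataFrame({"prior": [prior_score]}))[0]
             for q, fit in fits.items()}
    below = [int(round(q * 100)) for q, p in preds.items() if p <= current_score]
    return max(below) if below else 1

print(sgp(prior_score=520, current_score=540))
```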
Peer reviewed
Wise, Steven L.; Kingsbury, G. Gage; Webb, Norman L. – Educational Measurement: Issues and Practice, 2015
The alignment between a test and the content domain it measures represents key evidence for the validation of test score inferences. Although procedures have been developed for evaluating the content alignment of linear tests, these procedures are not readily applicable to computerized adaptive tests (CATs), which require large item pools and do…
Descriptors: Computer Assisted Testing, Adaptive Testing, Alignment (Education), Test Content
Peer reviewed
Davenport, Ernest C.; Davison, Mark L.; Liou, Pey-Yan; Love, Quintin U. – Educational Measurement: Issues and Practice, 2015
This article uses definitions provided by Cronbach in his seminal paper on coefficient alpha to show that the concepts of reliability, dimensionality, and internal consistency are distinct but interrelated. The article begins with a critique of the definition of reliability and then explores mathematical properties of Cronbach's alpha. Internal consistency…
Descriptors: Reliability, Definitions, Mathematics, Test Interpretation
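Because the abstract centers on mathematical properties of Cronbach's alpha, a minimal numerical sketch of the standard formula may be helpful; the simulated response matrix below is purely illustrative and is not drawn from the article.

```python
# Minimal sketch of the usual coefficient alpha formula:
#   alpha = k / (k - 1) * (1 - sum of item variances / variance of total score)
import numpy as np

def cronbach_alpha(items: np.ndarray) -> float:
    """items: persons-by-items score matrix."""
    k = items.shape[1]
    item_vars = items.var(axis=0, ddof=1)
    total_var = items.sum(axis=1).var(ddof=1)
    return k / (k - 1) * (1 - item_vars.sum() / total_var)

rng = np.random.default_rng(1)
ability = rng.normal(size=(500, 1))
scores = ability + rng.normal(scale=1.0, size=(500, 8))  # 8 roughly parallel items
print(round(cronbach_alpha(scores), 3))
```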
Peer reviewed
Beatty, Adam S.; Walmsley, Philip T.; Sackett, Paul R.; Kuncel, Nathan R.; Koch, Amanda J. – Educational Measurement: Issues and Practice, 2015
Little is known about the reliability of college grades relative to how prominently they are used in educational research, and the results to date tend to be based on small sample studies or are decades old. This study uses two large databases (N > 800,000) from over 200 educational institutions spanning 13 years and finds that both first-year…
Descriptors: Reliability, Grades (Scholastic), College Students, Grade Point Average
Peer reviewed
Sijtsma, Klaas – Educational Measurement: Issues and Practice, 2015
I discuss the contribution by Davenport, Davison, Liou, & Love (2015) in which they relate reliability represented by coefficient alpha to formal definitions of internal consistency and unidimensionality, both proposed by Cronbach (1951). I argue that coefficient alpha is a lower bound to reliability and that concepts of internal consistency and…
Descriptors: Reliability, Mathematics, Validity, Test Construction
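The lower-bound claim in this abstract corresponds to the familiar classical-test-theory inequality, stated here in generic notation (not Sijtsma's):

\[
\alpha \;=\; \frac{k}{k-1}\left(1 - \frac{\sum_{i=1}^{k}\sigma^2_{X_i}}{\sigma^2_X}\right) \;\le\; \rho_{XX'} \;=\; \frac{\sigma^2_T}{\sigma^2_X},
\]

where \(X = \sum_{i=1}^{k} X_i\) is the total score, \(T\) its true score, and \(\rho_{XX'}\) the reliability; equality holds when the items are essentially tau-equivalent.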
Peer reviewed
Green, Samuel B.; Yang, Yanyun – Educational Measurement: Issues and Practice, 2015
In the lead article, Davenport, Davison, Liou, & Love demonstrate the relationship among homogeneity, internal consistency, and coefficient alpha, and also distinguish among them. These distinctions are important because too often coefficient alpha--a reliability coefficient--is interpreted as an index of homogeneity or internal consistency.…
Descriptors: Reliability, Factor Analysis, Computation, Factor Structure
Peer reviewed
Anderson, Daniel; Irvin, Shawn; Alonzo, Julie; Tindal, Gerald A. – Educational Measurement: Issues and Practice, 2015
The alignment of test items to content standards is critical to the validity of decisions made from standards-based tests. Generally, alignment is determined based on judgments made by a panel of content experts with either ratings averaged or via a consensus reached through discussion. When the pool of items to be reviewed is large, or the…
Descriptors: Test Items, Alignment (Education), Standards, Online Systems
Peer reviewed
Plake, Barbara S.; Wise, Lauress L. – Educational Measurement: Issues and Practice, 2014
With the 2014 publication of the 5th revision of the "Standards for Educational and Psychological Testing," the cochairs of the Joint Committee for the revision process were asked to consider the role and importance of the "Standards" for the educational testing community, and in particular for members of the National Council…
Descriptors: Standards, Educational Testing, Psychological Testing, Role
Peer reviewed
Bradshaw, Laine; Izsák, Andrew; Templin, Jonathan; Jacobson, Erik – Educational Measurement: Issues and Practice, 2014
We report a multidimensional test that examines middle grades teachers' understanding of fraction arithmetic, especially multiplication and division. The test is based on four attributes identified through an analysis of the extensive mathematics education research literature on teachers' and students' reasoning in this content…
Descriptors: Middle School Teachers, Numbers, Arithmetic, Multiplication
Peer reviewed
Penfield, Randall David – Educational Measurement: Issues and Practice, 2014
A polytomous item is one for which the responses are scored according to three or more categories. Given the increasing use of polytomous items in assessment practices, item response theory (IRT) models specialized for polytomous items are becoming increasingly common. The purpose of this ITEMS module is to provide an accessible overview of…
Descriptors: Item Response Theory, Test Items, Models, Equations (Mathematics)
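As a concrete example of the kind of model this ITEMS module surveys, the sketch below computes category probabilities under Samejima's graded response model, one widely used polytomous IRT model; the item parameters are invented for illustration and are not taken from the module.

```python
# Category probabilities under the graded response model:
#   P(X >= j | theta) = 1 / (1 + exp(-a * (theta - b_j))),  j = 1..m
#   P(X = j) = P(X >= j) - P(X >= j + 1)
import numpy as np

def grm_category_probs(theta, a, b):
    """theta: latent trait; a: discrimination; b: increasing thresholds
    (length m for an item scored 0..m). Returns probabilities for 0..m."""
    b = np.asarray(b, dtype=float)
    cum = 1.0 / (1.0 + np.exp(-a * (theta - b)))   # P(X >= j) for j = 1..m
    cum = np.concatenate(([1.0], cum, [0.0]))       # add P(X >= 0) and P(X >= m+1)
    return cum[:-1] - cum[1:]                        # P(X = j)

probs = grm_category_probs(theta=0.5, a=1.2, b=[-1.0, 0.0, 1.5])
print(np.round(probs, 3), probs.sum())  # probabilities sum to 1
```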
Peer reviewed
Sinharay, Sandip; Haberman, Shelby J. – Educational Measurement: Issues and Practice, 2014
Standard 3.9 of the "Standards for Educational and Psychological Testing" (1999) demands evidence of model fit when item response theory (IRT) models are fit to test data. Hambleton and Han (2005) and Sinharay (2005) recommended the assessment of the practical significance of misfit of IRT models, but…
Descriptors: Item Response Theory, Goodness of Fit, Models, Tests
Peer reviewed
Margolis, Melissa J.; Clauser, Brian E. – Educational Measurement: Issues and Practice, 2014
This research evaluated the impact of a common modification to Angoff standard-setting exercises: the provision of examinee performance data. Data from 18 independent standard-setting panels across three different medical licensing examinations were examined to investigate whether and how the provision of performance information impacted judgments…
Descriptors: Cutting Scores, Standard Setting (Scoring), Data, Licensing Examinations (Professions)
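For context, the toy sketch below shows how a (modified) Angoff cut score is commonly aggregated from panelists' item-level probability judgments; the ratings are random placeholders, and the study itself concerns how providing examinee performance data changes such judgments rather than the aggregation step shown here.

```python
# Toy sketch of the usual (modified) Angoff aggregation: each panelist rates,
# for every item, the probability that a minimally competent examinee answers
# correctly; a panelist's cut score is the sum of their ratings, and the panel
# cut score is the mean across panelists.
import numpy as np

rng = np.random.default_rng(2)
ratings = rng.uniform(0.3, 0.9, size=(10, 40))   # 10 panelists x 40 items

panelist_cuts = ratings.sum(axis=1)              # expected raw score per panelist
panel_cut = panelist_cuts.mean()
print(f"recommended cut score: {panel_cut:.1f} of 40")
```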
Peer reviewed
Camara, Wayne – Educational Measurement: Issues and Practice, 2014
This article reviews the intended uses of these college- and career-readiness assessments with the goal of articulating an appropriate validity argument to support such uses. These assessments differ fundamentally from today's state assessments employed for state accountability. Current assessments are used to determine if students have…
Descriptors: College Readiness, Career Readiness, Aptitude Tests, Test Use
Peer reviewed
Sheehan, Kathleen M. – Educational Measurement: Issues and Practice, 2014
Many proposed cohesion metrics focus on the number and types of explicit cohesive ties detected within a text without also considering differences in the ease or difficulty of required referential and connective inferences. A new cohesion measure structured to address this limitation is proposed. Empirical analyses confirm that this new measure…
Descriptors: Connected Discourse, Measurement, Sentences, Difficulty Level
Peer reviewed
Sinharay, Sandip – Educational Measurement: Issues and Practice, 2014
Brennan (2012) noted that users of test scores often want (indeed, demand) that subscores be reported, along with total test scores, for diagnostic purposes. Haberman (2008) suggested a method based on classical test theory (CTT) to determine if subscores have added value over the total score. According to this…
Descriptors: Scores, Test Theory, Test Interpretation
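The added-value criterion referenced here can be summarized, in generic notation rather than Haberman's full development, as a comparison of proportional reductions in mean squared error (PRMSE):

\[
\mathrm{PRMSE}_s = \rho^2(s, s_t), \qquad \mathrm{PRMSE}_x = \rho^2(x, s_t),
\]

where \(s\) is the observed subscore, \(s_t\) the corresponding true subscore, and \(x\) the observed total score; the subscore is judged to have added value only when \(\mathrm{PRMSE}_s > \mathrm{PRMSE}_x\), that is, when the observed subscore predicts the true subscore more accurately than the total score does.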