Raykov, Tenko; Marcoulides, George A. – Educational and Psychological Measurement, 2019

This note discusses the merits of coefficient alpha, and the conditions under which they hold, in light of recent critical publications that overlook significant research findings from the past several decades. That earlier research has demonstrated the empirical relevance and utility of coefficient alpha under certain empirical circumstances. The article highlights…

Descriptors: Test Validity, Test Reliability, Test Items, Correlation
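
For orientation, coefficient alpha for k components is conventionally defined as k/(k-1) times one minus the ratio of the summed component variances to the variance of the total score. The sketch below is not taken from the article; the function name and the data layout (one list of scores per component, with persons in the same order) are assumptions made for illustration.

```python
from statistics import variance


def coefficient_alpha(items):
    """Coefficient alpha from a list of k score lists (one per component).

    Hypothetical helper for illustration: every inner list holds the scores
    of the same n persons, in the same order.
    """
    k = len(items)
    if k < 2:
        raise ValueError("alpha needs at least two components")
    # Total score for each person: sum across components.
    totals = [sum(person_scores) for person_scores in zip(*items)]
    # Sum of the sample variances of the individual components.
    item_variance_sum = sum(variance(scores) for scores in items)
    return k / (k - 1) * (1 - item_variance_sum / variance(totals))
```

Two identical components give an alpha of 1, and two uncorrelated components with equal variances give an alpha of 0, which is a quick sanity check on the formula.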

Raykov, Tenko; Marcoulides, George A.; Harrison, Michael; Menold, Natalja – Educational and Psychological Measurement, 2019

This note confronts the common use of a single coefficient alpha as an index of the reliability of a multicomponent measurement instrument in a heterogeneous population. Two or more alpha coefficients could instead be meaningfully associated with a given instrument in finite mixture settings, and this may be increasingly likely the…

Descriptors: Statistical Analysis, Test Reliability, Measures (Individuals), Computation

Raykov, Tenko; Marcoulides, George A.; Li, Tenglong – Educational and Psychological Measurement, 2017

The measurement error in principal components extracted from a set of fallible measures is discussed and evaluated. It is shown that as long as one or more measures in a given set of observed variables contains error of measurement, so also does any principal component obtained from the set. The error variance in any principal component is shown…

Descriptors: Error of Measurement, Factor Analysis, Research Methodology, Psychometrics

Li, Wei; Konstantopoulos, Spyros – Educational and Psychological Measurement, 2017

Field experiments in education frequently assign entire groups, such as schools, to treatment or control conditions. These experiments sometimes incorporate a longitudinal component in which, for example, students are followed over time to assess differences in the average rate of linear change or the rate of acceleration. In this study, we provide methods…

Descriptors: Educational Experiments, Field Studies, Models, Randomized Controlled Trials

Wilcox, Rand R.; Serang, Sarfaraz – Educational and Psychological Measurement, 2017

The article provides perspectives on p values, null hypothesis testing, and alternative techniques in light of modern robust statistical methods. Null hypothesis testing and p values can provide useful information provided they are interpreted in a sound manner, which includes taking into account insights and advances that have…

Descriptors: Hypothesis Testing, Bayesian Statistics, Computation, Effect Size

Miller, Jeff – Educational and Psychological Measurement, 2017

Critics of null hypothesis significance testing suggest that (a) its basic logic is invalid and (b) it addresses a question that is of no interest. In contrast to (a), I argue that the underlying logic of hypothesis testing is actually extremely straightforward and compelling. To substantiate that, I present examples showing that hypothesis…

Descriptors: Hypothesis Testing, Testing Problems, Test Validity, Relevance (Education)

García-Pérez, Miguel A. – Educational and Psychological Measurement, 2017

Null hypothesis significance testing (NHST) has been the subject of debate for decades and alternative approaches to data analysis have been proposed. This article addresses this debate from the perspective of scientific inquiry and inference. Inference is an inverse problem and application of statistical methods cannot reveal whether effects…

Descriptors: Hypothesis Testing, Statistical Inference, Effect Size, Bayesian Statistics

Cousineau, Denis; Laurencelle, Louis – Educational and Psychological Measurement, 2017

Assessing global interrater agreement is difficult as most published indices are affected by the presence of mixtures of agreements and disagreements. A previously proposed method was shown to be specifically sensitive to global agreement, excluding mixtures, but also negatively biased. Here, we propose two alternatives in an attempt to find what…

Descriptors: Interrater Reliability, Evaluation Methods, Statistical Bias, Accuracy

Raykov, Tenko; Marcoulides, George A. – Educational and Psychological Measurement, 2016

The frequently neglected and often misunderstood relationship between classical test theory and item response theory is discussed for the unidimensional case with binary measures and no guessing. It is pointed out that popular item response models can be directly obtained from classical test theory-based models by accounting for the discrete…

Descriptors: Test Theory, Item Response Theory, Models, Correlation

Devlieger, Ines; Mayer, Axel; Rosseel, Yves – Educational and Psychological Measurement, 2016

In this article, an overview is given of four methods to perform factor score regression (FSR), namely regression FSR, Bartlett FSR, the bias-avoiding method of Skrondal and Laake, and the bias-correcting method of Croon. The bias-correcting method is extended to include a reliable standard error. The four methods are compared with each other and…

Descriptors: Regression (Statistics), Comparative Analysis, Structural Equation Models, Monte Carlo Methods

Cousineau, Denis; Laurencelle, Louis – Educational and Psychological Measurement, 2015

Existing tests of interrater agreement have high statistical power; however, they lack specificity. If the ratings of the two raters do not show agreement but are not random, the current tests, some of which are based on Cohen's kappa, will often reject the null hypothesis, leading to the wrong conclusion that agreement is present. A new test of…

Descriptors: Interrater Reliability, Monte Carlo Methods, Measurement Techniques, Accuracy
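
As background to the kappa-based tests mentioned in this abstract, Cohen's kappa compares the observed proportion of agreement with the agreement expected by chance from the two raters' marginal frequencies. The sketch below computes only that statistic under the standard two-rater definition; it is not the new test the article proposes, and the function name is a hypothetical one chosen for illustration.

```python
from collections import Counter


def cohens_kappa(ratings_a, ratings_b):
    """Cohen's kappa for two raters who rated the same targets.

    Hypothetical helper for illustration; ratings are categorical labels
    given in the same target order for both raters.
    """
    if len(ratings_a) != len(ratings_b) or not ratings_a:
        raise ValueError("need two non-empty rating lists of equal length")
    n = len(ratings_a)
    # Observed agreement: share of targets on which the raters coincide.
    p_observed = sum(a == b for a, b in zip(ratings_a, ratings_b)) / n
    # Chance agreement: product of the raters' marginal proportions,
    # summed over categories.
    counts_a, counts_b = Counter(ratings_a), Counter(ratings_b)
    p_chance = sum(counts_a[c] * counts_b[c] for c in counts_a) / n ** 2
    if p_chance == 1:
        raise ValueError("kappa is undefined when chance agreement is 1")
    return (p_observed - p_chance) / (1 - p_chance)
```

Kappa is 1 under perfect agreement and 0 when observed agreement equals chance agreement.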

Raykov, Tenko; Marcoulides, George A.; Patelis, Thanos – Educational and Psychological Measurement, 2015

A critical discussion of the assumption of uncorrelated errors in classical psychometric theory and its applications is provided. It is pointed out that this assumption is essential for a number of fundamental results and underlies the concept of parallel tests, the Spearman-Brown prophecy and correction for attenuation formulas, as well as…

Descriptors: Psychometrics, Correlation, Validity, Reliability
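
The two formulas named in this abstract have standard textbook forms, both of which rest on the uncorrelated-errors assumption the article examines. The sketch below states them under that assumption; the function and parameter names are hypothetical, and the code is not drawn from the article.

```python
from math import sqrt


def spearman_brown(reliability, length_factor):
    """Predicted reliability of a test lengthened by `length_factor`.

    Standard Spearman-Brown prophecy formula, which assumes the added
    parts are parallel to the existing ones.
    """
    return (length_factor * reliability) / (1 + (length_factor - 1) * reliability)


def correct_for_attenuation(r_xy, reliability_x, reliability_y):
    """Observed correlation divided by the root of the product of reliabilities.

    Standard correction for attenuation; both reliabilities must be positive.
    """
    if reliability_x <= 0 or reliability_y <= 0:
        raise ValueError("reliabilities must be positive")
    return r_xy / sqrt(reliability_x * reliability_y)
```

For example, doubling a test with reliability 0.5 gives a predicted reliability of 2/3, and an observed correlation of 0.4 between two measures that each have reliability 0.64 is corrected to 0.625.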

Van Horn, M. Lee; Jaki, Thomas; Masyn, Katherine; Howe, George; Feaster, Daniel J.; Lamont, Andrea E.; George, Melissa R. W.; Kim, Minjung – Educational and Psychological Measurement, 2015

Research increasingly emphasizes understanding differential effects. This article focuses on understanding regression mixture models, which are relatively new statistical methods for assessing differential effects, by comparing their results with those obtained using an interaction term in linear regression. The research questions which each model answers, their…

Descriptors: Regression (Statistics), Models, Statistical Analysis, Comparative Analysis

Hayduk, Leslie – Educational and Psychological Measurement, 2014

Researchers using factor analysis tend to dismiss the significant ill fit of factor models by presuming that if their factor model is close-to-fitting, it is probably close to being properly causally specified. Close fit may indeed result from a model being close to properly causally specified, but close-fitting factor models can also be seriously…

Descriptors: Factor Analysis, Goodness of Fit, Factor Structure, Structural Equation Models

Raykov, Tenko; Dimitrov, Dimiter M.; von Eye, Alexander; Marcoulides, George A. – Educational and Psychological Measurement, 2013

A latent variable modeling method for evaluation of interrater agreement is outlined. The procedure is useful for point and interval estimation of the degree of agreement among a given set of judges evaluating a group of targets. In addition, the approach allows one to test for identity in underlying thresholds across raters as well as to identify…

Descriptors: Interrater Reliability, Models, Statistical Analysis, Computation