Psychometric issues confronted when implementing a system of item response theory (IRT) tools for test development at the Educational Testing Service (ETS) are discussed. These issues include selecting and assessing the appropriateness of IRT models, choosing methods of IRT scaling for item pools, considering test scoring strategies, and applying IRT tools within the context of different test scoring strategies. Since all items currently used in ETS testing programs are scored either right or wrong, this paper deals only with models for binary-scored items and focuses on unidimensional models. The use of IRT tools with theta-hat scoring, number-right or formula scoring, and scaled scoring is addressed. The following test development steps are listed: (1) collect pretest data; (2) select an IRT model; (3) assess the IRT model's appropriateness; (4) place all item parameters on a single scale; (5) choose IRT tools appropriate for the test scoring method used; (6) develop one or more forms of the test; and (7) assess the test development effort's success. (TJH)
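The unidimensional binary-item models referenced above are standard logistic IRT models. As a minimal illustrative sketch (not the paper's own software), the three-parameter logistic (3PL) item response function and a grid-search maximum-likelihood ability estimate (theta-hat) might look like this in Python; the item parameters below are hypothetical:

```python
import math

def p_3pl(theta, a, b, c):
    """3PL item response function: probability of a correct response
    given ability theta, discrimination a, difficulty b, and lower
    asymptote (guessing) c, with the conventional scaling constant 1.7."""
    return c + (1.0 - c) / (1.0 + math.exp(-1.7 * a * (theta - b)))

def theta_hat(responses, items, lo=-4.0, hi=4.0, steps=801):
    """Maximum-likelihood ability estimate for a binary (right/wrong)
    response vector, found by grid search over [lo, hi]."""
    best_theta, best_ll = lo, float("-inf")
    for i in range(steps):
        theta = lo + (hi - lo) * i / (steps - 1)
        ll = 0.0
        for u, (a, b, c) in zip(responses, items):
            p = p_3pl(theta, a, b, c)
            ll += math.log(p) if u == 1 else math.log(1.0 - p)
        if ll > best_ll:
            best_theta, best_ll = theta, ll
    return best_theta

# Hypothetical five-item pool: (a, b, c) triples on a common scale.
items = [(1.0, -1.0, 0.20), (1.2, -0.5, 0.20), (0.8, 0.0, 0.25),
         (1.5, 0.5, 0.20), (1.0, 1.0, 0.20)]
responses = [1, 1, 1, 0, 0]  # binary right/wrong scoring
print(round(theta_hat(responses, items), 2))
```

The grid search stands in for the Newton-type iterations a production scoring program would use; it makes the likelihood-maximization idea behind theta-hat scoring explicit without distracting numerical detail.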
Paper presented at the Annual Meeting of the American Psychological Association (Washington, DC, August 22-26, 1986).
Educational Testing Service; Rights and Formula Scoring; Unidimensional Scaling