|
|
Pub Date: |
2013-00-00 |
Pub Type(s): |
Journal Articles; Reports - Research |
Peer Reviewed: |
Yes |
|
|
|
Descriptors:
Standard Setting (Scoring); Cutting Scores; Validity; Reliability; Medicine; Licensing Examinations (Professions); Performance; Data; Correlation; Judges
Abstract:
This study investigated the extent to which the performance data format impacted data use in Angoff standard setting exercises. Judges from two standard settings (a total of five panels) were randomly assigned to one of two groups. The full-data group received two types of data: (1) the proportion of examinees selecting each option and (2) plots showing the proportion of examinees selecting the correct answer by deciles defined by total test score. The options-only group received only the option data. Results indicated that judgments in the full-data group were in substantially closer alignment with the empirical data than those in the options-only group. This suggests that either the decile data alone or the combination of both pieces of data leads to a greater reliance on the data. The results are discussed from the perspective of the validity/credibility of the resulting cut scores. (Contains 6 figures and 6 tables.)
|
|
Pub Date: |
2013-00-00 |
Pub Type(s): |
Journal Articles; Reports - Descriptive |
Peer Reviewed: |
Yes |
|
|
|
Descriptors:
Standard Setting (Scoring); Evidence; Validity; Cutting Scores; Testing Programs; Case Studies; Licensing Examinations (Professions)
Abstract:
A critical component of the standard setting process is collecting evidence to evaluate the recommended cut scores and their use for making decisions and classifying students based on test performance. Kane (1994, 2001) proposed a framework by which practitioners can identify and evaluate evidence of the results of the standard setting from (1) the procedural elements of the study, (2) the internal consistency of the recommendations, and (3) the external consistency of the impact or results of other measures of examinee performance. For many programs, the availability of external validity evidence is limited due to the nature of the testing program. This is particularly the case for national testing programs in developing nations or international programs that span diverse populations across the world. In this article, we review two plausible approaches for identifying and evaluating external validity evidence in settings where other national or international benchmarks may not be available to guide policymakers. Each approach is presented along with a demonstration of how it could be applied in a case study from a national testing program. (Contains 3 tables.)
Author(s): |
McDougle, Leon; Mavis, Brian E.; Jeffe, Donna B.; Roberts, Nicole K.; Ephgrave, Kimberly; Hageman, Heather L.; Lypson, Monica L.; Thomas, Lauree; Andriole, Dorothy A. |
Source: |
Advances in Health Sciences Education, v18 n2 p279-289 May 2013 |
|
Pub Date: |
2013-05-00 |
Pub Type(s): |
Journal Articles; Reports - Research |
Peer Reviewed: |
Yes |
|
|
|
Descriptors:
College Graduates; Medical Education; Medical Schools; Physicians; Graduates; Primary Health Care; Grouping (Instructional Purposes); Medical Students; Professional Education; Licensing Examinations (Professions)
Abstract:
This study sought to determine the academic and professional outcomes of medical school graduates who failed the United States Medical Licensing Examination Step 1 on the first attempt. This retrospective cohort study was based on pooled data from 2,003 graduates of six Midwestern medical schools in the classes of 1997-2002. Demographic, academic, and career characteristics of graduates who failed Step 1 on the first attempt were compared to those of graduates who initially passed. Fifty medical school graduates (2.5%) initially failed Step 1. Compared to graduates who initially passed Step 1, graduates who initially failed Step 1 were more likely to become primary care physicians (26/49 [53%] vs. 766/1,870 [40.9%]), to report at graduation an intent to practice in underserved areas (28/50 [56%] vs. 419/1,939 [21.6%]), and to take 5 or more years to graduate (11/50 [22.0%] vs. 79/1,953 [4.0%]). The relative risk of first-attempt Step 1 failure for medical school graduates was 13.4 for African Americans, 7.4 for Latinos, 3.6 for matriculants older than 22 years of age, 3.2 for women, and 2.3 for first-generation college graduates. The relative risk of not being specialty board certified for those graduates who initially failed Step 1 was 2.2. Our observations regarding characteristics of graduates in our study cohort who initially failed Step 1 can inform efforts by medical schools to identify and assist students who are at particular risk of failing Step 1.
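Editorial illustration (not part of the published abstract): relative risk is the ratio of an outcome's proportion in one group to its proportion in a comparison group. The sketch below applies that definition to the three outcome counts quoted in the abstract. The function name `relative_risk` is hypothetical, and the resulting ratios are simple arithmetic on the quoted counts; they are not figures reported by the authors, whose stated relative risks (13.4, 7.4, and so on) rest on counts the abstract does not give.

```python
def relative_risk(events_a: int, total_a: int, events_b: int, total_b: int) -> float:
    """Ratio of the outcome proportion in group A to that in group B."""
    return (events_a / total_a) / (events_b / total_b)

# Counts as quoted in the abstract: initially failed Step 1 (A) vs. initially passed (B).
outcomes = {
    "primary care physician": (26, 49, 766, 1870),
    "intent to practice in underserved areas": (28, 50, 419, 1939),
    "5 or more years to graduate": (11, 50, 79, 1953),
}

ratios = {name: relative_risk(*counts) for name, counts in outcomes.items()}

for name, rr in ratios.items():
    print(f"{name}: ratio of proportions = {rr:.2f}")

# Share of the pooled cohort that initially failed Step 1 (50 of 2,003 graduates).
initial_failure_rate = 50 / 2003
print(f"initial failure rate = {initial_failure_rate:.1%}")
```

A ratio above 1 means the outcome was more common among graduates who initially failed; the code makes no claim about statistical significance.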
|
|
Pub Date: |
2012-10-00 |
Pub Type(s): |
Journal Articles; Reports - Evaluative |
Peer Reviewed: |
Yes |
|
|
|
Descriptors:
Licensing Examinations (Professions); Test Items; Dentistry; Minimum Competency Testing; Standard Setting; Scoring; Test Construction; Test Reliability
Abstract:
The consequences associated with the uses and interpretations of scores for many credentialing testing programs have important implications for a range of stakeholders. Within licensure settings specifically, results from examination programs are often one of the final steps in the process of assessing whether individuals will be allowed to enter practice. This article focuses on the concept of domain critical errors and suggests a framework for considering their use in practice. Domain critical errors are defined here as knowledge, skills, abilities, or judgments that are essential to the definition of minimum qualifications in a testing program's pass-fail decision-making process. Using domain critical errors has psychometric and policy implications, particularly for licensure programs that are mandatory for entry-level practice. Because these errors greatly influence pass-fail decisions, the measurement community faces an ongoing challenge to promote defensible practices while concurrently providing assessment literacy development about the appropriate design and use of testing methods like domain critical errors. (Contains 2 tables and 1 figure.)
|
|
Pub Date: |
2012-12-00 |
Pub Type(s): |
Journal Articles; Reports - Evaluative |
Peer Reviewed: |
Yes |
|
|
|
Descriptors:
Certification; Tests; Foreign Countries; Structural Equation Models; Undergraduate Students; Models; Behavior Change; Attitude Change; Comparative Analysis; Metacognition; Business Education; Management Development; Behavior Standards; Correlation; Student Attitudes; Licensing Examinations (Professions)
Abstract:
Previous research on professional certification has primarily focused on graduate certificates in intensive care nursing, writing certificates for practitioners, maintenance of certification in radiation oncology, and the certification of teachers and surgeons. Research on certification in the domain of business and management from an attitudinal-behavioral approach has been lacking. Social psychological theories provide potentially useful tools for explaining how attitudes, intentions, and behaviors are changed. The current study compared four intention-based models--the theory of planned behavior, the theory of self-regulation (TSR), the revised TSR (in which desire is a partial mediator), and the other revised TSR (in which desire is a full mediator)--in terms of their ability to predict the intentions of business and management students to obtain certification in their fields. Participants were drawn from the southern, middle, and northern areas of Taiwan. A structural equation model applied to a sample of 273 undergraduates demonstrated that attitudes, subjective norms, perceived behavioral controls, desires, intentions, and behaviors were associated with certification in business and management domains. The explanatory power of the revised TSR in which desire was a full mediator was superior to that of the competing models. Implications and future directions are discussed.
|
|
Pub Date: |
2012-10-00 |
Pub Type(s): |
Journal Articles; Reports - Research |
Peer Reviewed: |
Yes |
|
|
|
Descriptors:
Correlation; Gender Differences; Tests; Physicians; Females; Interviews; Scores; Interaction; Performance; Licensing Examinations (Professions)
Abstract:
Multiple studies examining the relationship between physician gender and performance on examinations have found consistent significant gender differences, but relatively little information is available related to any gender effect on interviewing and written communication skills. The United States Medical Licensing Examination (USMLE[R]) Step 2 Clinical Skills[R] (CS[R]) examination is a multi-station examination where examinees (physicians in training) interact with, and are rated by, standardized patients (SPs) portraying cases in an ambulatory setting. Data from a recent complete year (2009) were analyzed via a series of hierarchical linear models to examine the impact of examinee gender on performance on the data gathering (DG) and patient note (PN) components of this examination. Results from both components show that not only do women have higher scores on average, but women continue to perform significantly better than men when other examinee and case variables are taken into account. Generally, the effect sizes are moderate, reflecting an approximately 2% score advantage by encounter. The advantage for female examinees increased for encounters that did not require a physical examination (for the DG component only) and for encounters that involved a Women's Health issue (for both components). The gender of the SP did not have an impact on the examinee gender effect for DG, indicating a desirable lack of interaction between examinee and SP gender. The implications of the findings, especially with respect to the validity of the use of the examination outcomes, are discussed.
Author(s): |
Moser, Kelly |
Source: |
Current Issues in Education, v15 n2 Aug 2012 |
|
Pub Date: |
2012-08-15 |
Pub Type(s): |
Journal Articles; Reports - Research |
Peer Reviewed: |
Yes |
|
|
|
Descriptors:
Spanish; Language Teachers; Preservice Teachers; Licensing Examinations (Professions); Teacher Competency Testing; Test Preparation; Test Content; Test Format; Test Anxiety; Student Attitudes; Interviews
Abstract:
Researchers (Sandarg & Schomber, 2009; Wilkerson, Schomber, & Sandarg, 2004) have urged the profession to develop a new subject-matter licensure test to reflect the best practices in the foreign language classroom. In October 2010, the Praxis II: World Language Test joined the Praxis Series. Given that this standards-driven test differs significantly from its previous versions, the Content Knowledge and Productive Language Skills tests, it is unknown how teacher candidates will respond to its unique challenges. This qualitative study examines the perspectives of five prospective foreign language teachers who took one of the versions of the Praxis II subject-matter test. The data revealed that two groups, Surprised Prevailers and Frustrated Forgoers, perceived the Praxis II differently. Their test experiences may provide foreign language teacher educators with strategies to overcome test challenges and improve curricula. (Contains 1 table.)
|
|
Pub Date: |
2012-08-00 |
Pub Type(s): |
Journal Articles; Reports - Research |
Peer Reviewed: |
Yes |
|
|
|
Descriptors:
Evidence; Medical Students; Grade Point Average; Medical Schools; Physical Examinations; Licensing Examinations (Professions); Physical Sciences; Biological Sciences; Sciences; Clinical Experience; Measures (Individuals); Communication Skills
Abstract:
Medical schools employ a variety of preadmission measures to select students most likely to succeed in the program. The Medical College Admission Test (MCAT) and the undergraduate college grade point average (uGPA) are two academic measures typically used to select students for medical school. The assumption that presently used preadmission measures can predict clinical skill performance on a medical licensure examination was evaluated within a validity argument framework (Kane 1992). A hierarchical generalized linear model tested relationships between the log-odds of failing a high-stakes medical licensure performance examination and matriculant academic and non-academic preadmission measures, controlling for student- and school-level variables. Data include 3,189 matriculants from 22 osteopathic medical schools tested in 2009-2010. The unconditional unit-specific model's expected average log-odds of failing the examination across medical schools is -3.05 (SE = 0.11), or about 5%. Student-level estimated coefficients for MCAT Verbal Reasoning scores (0.03), Physical Sciences scores (0.05), Biological Sciences scores (0.04), science uGPA (0.07), and non-science uGPA (0.26) lacked association with the log-odds of failing the COMLEX-USA Level 2-PE, controlling for all other predictors in the model. Evidence from this study shows that present preadmission measures of academic ability are not related to later clinical skill performance. Given that clinical skill performance is an important part of medical practice, selection measures should be developed to identify students who will be successful in communication and be able to demonstrate the ability to systematically collect a medical history, perform a physical examination, and synthesize this information to diagnose and manage patient conditions.
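Editorial illustration (not part of the published abstract): the abstract equates an average log-odds of -3.05 with a failure rate of 5%. A minimal sketch of that conversion, assuming the standard inverse-logit transform p = 1 / (1 + exp(-x)); the function name `log_odds_to_probability` is hypothetical.

```python
import math

def log_odds_to_probability(log_odds: float) -> float:
    """Inverse logit: map a log-odds value to a probability in (0, 1)."""
    return 1.0 / (1.0 + math.exp(-log_odds))

# Average log-odds of failing, as quoted in the abstract.
p_fail = log_odds_to_probability(-3.05)
print(f"probability of failing = {p_fail:.3f}")  # roughly 0.045
```

Under this assumption the log-odds corresponds to a failure probability of about 4.5%, which matches the abstract's 5% after rounding to the nearest whole percent.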
|
|
Pub Date: |
2012-08-00 |
Pub Type(s): |
Journal Articles; Reports - Research |
Peer Reviewed: |
Yes |
|
|
|
Descriptors:
Evidence; Generalizability Theory; Error of Measurement; Clinical Experience; English; High Stakes Tests; Medical Students; Measures (Individuals); Licensing Examinations (Professions)
Abstract:
Examinees who initially fail and later repeat a standardized patient (SP)-based clinical skills exam typically exhibit large score gains on their second attempt, suggesting the possibility that examinees were not well measured on one of those attempts. This study evaluates score precision for examinees who repeated an SP-based clinical skills test administered as part of the US Medical Licensing Examination sequence. Generalizability theory was used as the basis for computing conditional standard errors of measurement (SEMs) for individual examinees. Conditional SEMs were computed for approximately 60,000 single-take examinees and 5,000 repeat examinees who completed the Step 2 Clinical Skills Examination[R] between 2007 and 2009. The study focused exclusively on ratings of communication and interpersonal skills. Conditional SEMs for single-take and repeat examinees were nearly indistinguishable across most of the score scale. US graduates and international medical graduates (IMGs) were measured with equal levels of precision at all score levels, as were examinees with differing levels of skill speaking English. There was no evidence that examinees with the largest score changes were measured poorly on either their first or second attempt. The large score increases for repeat examinees on this SP-based exam probably cannot be attributed to unexpectedly large errors of measurement.
|