Equipercentile equating software testing

This book provides an introduction to test equating, scaling, and linking, including those concepts and practical issues that are critical for developers and all other testing professionals. Equipercentile equating involves finding the percentile rank of every score on each form; the equated scores from all forms are then pooled to generate a merit list. For practitioners, the book provides a splendid introduction to the topics considered. In large-scale testing programs, various equating methods are available to ensure the ...

The computer programs listed below can be used to conduct many of the equating analyses described in Kolen and Brennan (2004). There are three general approaches to IRT equating. The major testing companies of course have the software they need for scaling and equating, but software available to researchers and graduate students is very limited. Unlike with item response theory, equating based on classical test theory is somewhat distinct from scaling. Annual meeting of the National Council on Measurement in Education. In addition to statistical procedures, successful equating, scaling, and linking involve many aspects of testing, including procedures to develop tests, to administer and score tests, and to interpret.

An equipercentile version of the Levine linear observed-score equating function using the methods of kernel equating, Alina A. A comparison of linear, equipercentile, and FIPC equating methods across multidimensional test forms for nonequivalent groups. Patrick Meyer is an assistant professor in the Curry School of Education at the University of Virginia. The results show that the 2PL and the TRT approaches produce comparable results that agree more closely with the results of the equipercentile method than the GRM does. Effect of item response theory (IRT) model selection on. Equating scores from adaptive to linear tests (IACAT).

The book is appealing to anyone interested in the topic of equating, scaling, and linking. IRTEQ: Windows application that implements IRT scaling and. The scores on these multiple forms were equated using the equipercentile equating method (the legally sanctioned format), and the merit list was created. An equipercentile version of the Levine linear observed. The test form to which we are equating the new form. An analytical procedure for the equipercentile method of. Equipercentile equating was not necessary for the European continent, because all contributing studies administered versions with 30-point totals. A common approach is known as equipercentile equating. Impact of group differences on equating accuracy and the.

Software enabling complex machine learning has become widely. Since the turn of the century, much has been written on score equating and linking. Unlike polytomous IRT models, the TRT model yielded quite stable equating results across the different equating methods investigated in their study. Methods and Practices (Statistics for Social and Behavioral Sciences), Kolen, Michael J. Equipercentile equating produced scores on a 30-point scale in all studies. Equipercentile equating is typically done by computer, though it is relatively easily done by. The merit list combines different batches of examinees, each taking different test forms and coming from different districts.

An R package for observed-score linking and equating. Test equating is the statistical process that accounts for differences in test difficulty and then adjusts the scale of the current test administration so that the same criterion standard can be used. Hypothesis testing of equating differences in the KE. The most complete coverage of the entire field of score equating and score linking in general has been provided by Kolen and Brennan (2004). An investigation into the test equating methods used during 2006. Finally, examples of currently available software will be inventoried.

A comparison of linear, equipercentile, and FIPC equating. Equating in small-scale language testing programs, SAGE Journals. Equating determines, for each score on the new form, the corresponding score on the reference form. Equating is an important step in the process of collecting, analyzing, and reporting test scores in any program of assessment. Center for Advanced Studies in Measurement and Assessment. Several other studies, including a generalizability study and an equipercentile equating study, were conducted to determine the equivalency between the two forms. Equipercentile equating defines the equating relationship as one in which a score on one form and its equivalent on the other form have the same percentile rank.
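The percentile-rank definition above can be illustrated with a short sketch. It assumes integer scores from 0 to K on both forms, the usual midpoint convention for percentile ranks, unsmoothed frequencies, and a nonzero frequency at the top score of the reference form. The function names are hypothetical and do not come from any of the programs mentioned here.

```python
import numpy as np

def percentile_ranks(freq):
    """Percentile rank of each integer score 0..K (midpoint convention)."""
    freq = np.asarray(freq, dtype=float)
    rel = freq / freq.sum()          # relative frequency of each score
    below = np.cumsum(rel) - rel     # proportion scoring strictly below
    return 100.0 * (below + rel / 2.0)

def equipercentile_equate(freq_x, freq_y):
    """Form Y equivalent of each integer Form X score.

    Assumes unsmoothed counts and a nonzero count at the top Form Y score.
    """
    rel_y = np.asarray(freq_y, dtype=float) / np.sum(freq_y)
    cum_y = np.cumsum(rel_y)
    equated = []
    for p in percentile_ranks(freq_x) / 100.0:
        # Smallest Form Y score whose cumulative proportion exceeds p.
        upper = int(np.searchsorted(cum_y, p, side="right"))
        upper = min(upper, len(cum_y) - 1)
        below = cum_y[upper - 1] if upper > 0 else 0.0
        # Interpolate within the interval [upper - 0.5, upper + 0.5].
        equated.append((p - below) / rel_y[upper] + upper - 0.5)
    return np.array(equated)

# Illustrative (made-up) score frequencies for scores 0..4 on each form.
form_x = [5, 10, 20, 10, 5]
form_y = [10, 20, 10, 5, 5]
print(equipercentile_equate(form_x, form_y))
```

When the two frequency distributions are identical, the function returns each score unchanged, which is a quick sanity check on the interpolation.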

Test score equating is used to compare test scores from different test forms. We give the assumptions for the two methods in order to emphasize that all equating methods require some untestable assumptions to be fulfilled. Equating, UNL Digital Commons, University of Nebraska-Lincoln. Fair and equitable measurement of student learning in MOOCs. While equating methods research has flourished because of the need for technically sound designs and analyses, software development has been limited.

The class consists of illustrated lectures, interspersed with self-tests for the participants. A handful of statistical packages are available for linking and equating test forms. Statistical equating with measures of oral reading fluency. However, one of the reasons that IRT was invented was that equating with CTT was very weak. In the descriptions that follow, forms are referred to as X and Y, where scores on X will be equated to the scale of Y. The assumptions and the formulas for the chain equipercentile equating function. You can equate forms with classical test theory (CTT) or item response theory (IRT). ABD in applied mathematics and computational science, 2008. The advantages and disadvantages of each equating method are discussed along with the conditions conducive to satisfactory equating. Approach 1 includes a common-item linking strategy using item response theory (IRT), with external anchor sets embedded in the new test administration. A didactic approach to the use of IRT true-score equating.

Frequently asked questions: equating of scores on multiple. A SAS program for calculating equivalent scores using the equipercentile method. Comparison of approaches for equating different versions of. Estimates bootstrap standard errors of linear equating and equipercentile equating under the random groups design. A query was sent seeking a district-wise merit list. The class is a nonmathematical introduction to the topic. The table below shows how the test equating process works. Ir provides unlimited scoring and report generation after hand-entry of DRS-2 and DRS-2. Considering that IRT data simulation might unequally favor IRT equating methods, pseudo tests and pseudo groups were also constructed to make equating results comparable with those. All of them can be accomplished with our industry-leading software Xcalibre, though conversion equating requires an additional program called IRTEQ. An equipercentile version of the Levine linear observed-score.
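As a rough illustration of the bootstrap standard errors mentioned above for the random groups design, the sketch below resamples each group independently and takes the standard deviation of the equated score across replications. It is not the code of any program listed here: `bootstrap_se` and `mean_equate` are hypothetical names, and mean equating is used only as a simple stand-in for the linear or equipercentile function one would actually pass in.

```python
import random
import statistics

def bootstrap_se(equate, x, scores_x, scores_y, reps=500, seed=0):
    """Bootstrap standard error of the equated value of score x.

    equate(x, sample_x, sample_y) is any equating function; the two
    groups are resampled independently, as in a random groups design.
    """
    rng = random.Random(seed)
    estimates = []
    for _ in range(reps):
        boot_x = rng.choices(scores_x, k=len(scores_x))  # resample Form X group
        boot_y = rng.choices(scores_y, k=len(scores_y))  # resample Form Y group
        estimates.append(equate(x, boot_x, boot_y))
    return statistics.stdev(estimates)

def mean_equate(x, scores_x, scores_y):
    """Stand-in equating function: shift x by the difference in form means."""
    return x - statistics.mean(scores_x) + statistics.mean(scores_y)

# Simulated (not real) score samples for two randomly equivalent groups.
rng = random.Random(1)
group_x = [rng.randint(0, 40) for _ in range(200)]
group_y = [rng.randint(0, 40) for _ in range(200)]
print(bootstrap_se(mean_equate, 20, group_x, group_y))
```

Passing the equating function as an argument keeps the resampling loop identical whichever equating method is being evaluated.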

In the case of the common pupils design, NFER developed its own software to. Aug 2014: The traditional equipercentile method was used as an evaluation baseline. Impact of group differences on equating accuracy and the adequacy of equating assumptions. Statistical methods for test equating: computer software manual.

Therefore, the construction and administration of alternate forms of the same test is a necessary requirement for operating these testing programs (Cook, 2007). This booklet grew out of a half-day class on equating that I teach for new statistical staff at Educational Testing Service (ETS). It is based on a flexible family of equipercentile-like equating functions and contains the linear equating function as a special case. Approach 2 is an equipercentile linking method (Kolen and Brennan, 2004) where the scale scores from the new administration are linked to those of the old administration through percentile ranks. Conducts linear and equipercentile equating under the common-item nonequivalent groups design. In observed score equating, the characteristics of score distributions are set equal for a specified population of examinees (Angoff, 1971). The proposed procedure requires (a) approximating the empirical score distributions of the two forms by means of the first terms of an infinite series, and (b) contrasting the results obtained when only the first two moments are used (i.e. PIE for PC console, PIE for PC GUI, PIE for Mac OS9, PIE for Mac OS10: conducts IRT true and observed score equating for dichotomously scored tests. The R package equate (Albano, 2014) is free, open-source software for conducting observed-score linking and equating under single-group, equivalent-groups, and nonequivalent-groups designs with one. For the equipercentile equating property (EEP), the converted scores on Form X have the same distribution as scores on Form Y. Methods of equating use functions to transform scores on two or more versions of a test, so that they can be compared and used interchangeably. Dementia Rating Scale-2 (DRS-2) publisher, online testing. CTT methods include Tucker, Levine, and equipercentile. The general form of the Levine function will soon be available in KE software at.
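For the linear equating function mentioned above, a minimal sketch follows. It assumes a random groups design in which the two samples come from the same population, so that setting the mean and standard deviation of the converted Form X scores equal to those of Form Y gives l_Y(x) = (sd_Y / sd_X)(x - mean_X) + mean_Y. The function name and the data are illustrative, and the Tucker and Levine methods for nonequivalent groups are not shown.

```python
import statistics

def linear_equate(x, scores_x, scores_y):
    """Form Y equivalent of Form X score x under linear equating.

    Matches the first two moments of the two observed score
    distributions; assumes scores_x is not constant.
    """
    mean_x = statistics.mean(scores_x)
    mean_y = statistics.mean(scores_y)
    sd_x = statistics.pstdev(scores_x)
    sd_y = statistics.pstdev(scores_y)
    return sd_y / sd_x * (x - mean_x) + mean_y

# Made-up scores: Form Y scores are twice as spread out and shifted upward.
form_x = [10, 12, 14, 16, 18]
form_y = [20, 24, 28, 32, 36]
print(linear_equate(16, form_x, form_y))
```

When the two forms have equal standard deviations, the slope is 1 and the function reduces to mean equating.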

The kernel Levine equipercentile observed-score equating. Item response theory (IRT) observed score kernel equating was evaluated and compared with equipercentile equating, IRT observed score equating, and kernel equating methods by varying the sample size and test length. This booklet grew out of a half-day class on equating that author Samuel Livingston teaches for new statistical staff at Educational Testing Service (ETS). Kernel equating (KE) is a powerful, modern, and unified approach to test equating. Web table 2 provides score equivalencies for each equated point-total version of the MMSE. Two local methods for observed-score equating are applied to the problem of equating an adaptive test to a linear test. He is the inventor of jMetrik, an open source psychometric software program. Frequently asked questions: equating of scores on multiple forms. Prior use of the equipercentile method of test equating was based on a graphic procedure which is tedious, subject to smoothing errors, and nonanalytical. PDF: Equating in small-scale language testing programs. A comparison of IRT observed score kernel equating and. This article presents a SAS program that uses equipercentile equating to derive equated scores on two. Artificial intelligence, advanced numerical analysis, and.

Any equipercentile equating method has five steps or parts. The class is a nonmathematical introduction to the topic, emphasizing conceptual understanding and practical applications. Equipercentile equating with equal interval scores, Brad Hanson, February 10, 1993, revised 5295. Let X and Y be discrete random variables representing the distribution of scores on two forms of a test, labeled Form X and Form Y respectively, in the same population. The new edition of Test Equating, Scaling, and Linking. Equating in small-scale language testing programs, Geoffrey. Methods and Practices is a welcome update to a book which has become a classic in equating and linking. Computer programs, College of Education, University of Iowa. An analytical procedure for the equipercentile method of equating tests. Several methods have been developed to conduct equating.
