Computerized adaptive testing

[3] A related methodology called multistage testing (MST), or computer-adaptive sequential testing (CAST), is used in the Uniform Certified Public Accountant Examination.

An adaptive test can typically be shortened by 50% and still maintain a higher level of precision than a fixed version.

Large target populations are generally found in scientific and research-based fields.

Once not accepted in medical facilities and laboratories, CAT is now encouraged for diagnostic use.

However, it may increase the exposure of other items (namely the medium or medium/easy items presented to most examinees at the beginning of the test).

This is a serious security concern because groups sharing items may well have a similar functional ability level.

Alternatively, test-takers could be coached to deliberately choose more wrong answers, leading to an increasingly easy test.

After tricking the adaptive test into building a maximally easy exam, they could then review the items and answer them correctly—possibly achieving a very high score.

[10] The large sample sizes (typically hundreds of examinees) required by IRT calibrations must be present.

Psychometricians experienced with IRT calibrations and CAT simulation research are necessary to provide validity documentation.

There are five technical components in building a CAT (the following is adapted from Weiss & Kingsbury, 1984[2]).

This list does not include practical issues, such as item pretesting or live field release.

In CAT, items are selected based on the examinee's performance up to a given point in the test.
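One common way to make this selection concrete is to pick the unused item that is most informative at the current ability estimate. The sketch below assumes a two-parameter logistic (2PL) IRT model and a maximum-information rule; neither is specified in the text above, and the function names and the small item bank are hypothetical.

```python
import math

def p_correct(theta, a, b):
    """2PL probability of a correct response (assumed model)."""
    return 1.0 / (1.0 + math.exp(-a * (theta - b)))

def item_information(theta, a, b):
    """Fisher information of a 2PL item at ability theta."""
    p = p_correct(theta, a, b)
    return a * a * p * (1.0 - p)

def select_next_item(theta, bank, administered):
    """Return the index of the unused item with the most information at theta."""
    candidates = [i for i in range(len(bank)) if i not in administered]
    return max(candidates, key=lambda i: item_information(theta, *bank[i]))

# Hypothetical bank of (discrimination a, difficulty b) pairs.
bank = [(1.0, -2.0), (1.0, -1.0), (1.0, 0.0), (1.0, 1.0), (1.0, 2.0)]

# With equal discriminations, the item whose difficulty is closest to the
# current estimate is chosen.
first = select_next_item(0.0, bank, administered=set())
second = select_next_item(0.9, bank, administered={first})
```

Here the examinee starts at an estimate of 0.0 and receives the item of difficulty 0.0; once the estimate rises to 0.9, the next item is the one of difficulty 1.0.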

After an item is administered, the CAT updates its estimate of the examinee's ability level.

If the examinee answered the item correctly, the CAT will likely estimate their ability to be somewhat higher, and vice versa.

[8] Maximum likelihood is asymptotically unbiased, but cannot provide a theta estimate for an unmixed (all correct or all incorrect) response vector, in which case a Bayesian method may have to be used temporarily.
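The difference between the two estimators can be illustrated with a small grid-based sketch. It assumes a 2PL model, a standard normal prior, and expected a posteriori (EAP) estimation as the Bayesian method; these are illustrative choices, and all names below are hypothetical.

```python
import math

def p_correct(theta, a, b):
    """2PL probability of a correct response (assumed model)."""
    return 1.0 / (1.0 + math.exp(-a * (theta - b)))

def log_likelihood(theta, items, responses):
    """Log-likelihood of a 0/1 response vector at ability theta."""
    total = 0.0
    for (a, b), u in zip(items, responses):
        p = p_correct(theta, a, b)
        total += math.log(p) if u == 1 else math.log(1.0 - p)
    return total

GRID = [g / 10.0 for g in range(-40, 41)]  # theta from -4.0 to 4.0

def mle_theta(items, responses):
    """Grid-search ML estimate; None when the response vector is unmixed."""
    if len(set(responses)) < 2:
        return None  # likelihood is monotone in theta, so no finite maximum
    return max(GRID, key=lambda t: log_likelihood(t, items, responses))

def eap_theta(items, responses):
    """Posterior mean of theta under a standard normal prior (assumption)."""
    weights = [math.exp(log_likelihood(t, items, responses) - 0.5 * t * t)
               for t in GRID]
    return sum(t * w for t, w in zip(GRID, weights)) / sum(weights)

items = [(1.0, 0.0), (1.2, 0.5)]          # hypothetical (a, b) parameters
ml_unmixed = mle_theta(items, [1, 1])     # no ML estimate: both correct
eap_unmixed = eap_theta(items, [1, 1])    # finite Bayesian estimate
ml_mixed = mle_theta(items, [1, 0])       # ML works once responses are mixed
```

The prior keeps the Bayesian estimate finite for the all-correct vector, which is why it can stand in until the response vector becomes mixed.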

[2] The CAT algorithm is designed to repeatedly administer items and update the estimate of examinee ability.
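The administer-and-update cycle can be sketched as a loop. This is a minimal illustration, not a reference implementation: it assumes a 2PL model, a Bayesian posterior-mean estimate on a grid, maximum-information item selection, and a stopping rule based on a target standard error or a maximum length. The bank, the `answer` callback, and the thresholds are all hypothetical.

```python
import math

def p_correct(theta, a, b):
    """2PL probability of a correct response (assumed model)."""
    return 1.0 / (1.0 + math.exp(-a * (theta - b)))

GRID = [g / 10.0 for g in range(-40, 41)]  # theta from -4.0 to 4.0

def estimate(items, responses):
    """Return (posterior mean, posterior SD) under a standard normal prior."""
    weights = []
    for t in GRID:
        w = math.exp(-0.5 * t * t)
        for (a, b), u in zip(items, responses):
            p = p_correct(t, a, b)
            w *= p if u == 1 else 1.0 - p
        weights.append(w)
    total = sum(weights)
    mean = sum(t * w for t, w in zip(GRID, weights)) / total
    var = sum((t - mean) ** 2 * w for t, w in zip(GRID, weights)) / total
    return mean, math.sqrt(var)

def run_cat(bank, answer, se_target=0.4, max_items=20):
    """Administer items until the SD target or the length cap is reached."""
    used, items, responses = set(), [], []
    theta, se = estimate(items, responses)
    while len(used) < min(max_items, len(bank)) and se > se_target:
        def info(i):
            a, b = bank[i]
            p = p_correct(theta, a, b)
            return a * a * p * (1.0 - p)
        # Select the most informative unused item at the current estimate.
        nxt = max((i for i in range(len(bank)) if i not in used), key=info)
        used.add(nxt)
        items.append(bank[nxt])
        responses.append(answer(bank[nxt]))
        # Update the ability estimate with the new response.
        theta, se = estimate(items, responses)
    return theta, se, len(items)

# Hypothetical bank: 25 items with difficulties from -3.0 to 3.0.
bank = [(1.5, b / 4.0) for b in range(-12, 13)]
# Deterministic stand-in examinee: answers correctly when difficulty < 1.0.
theta_hat, se_hat, n_items = run_cat(bank, lambda item: 1 if item[1] < 1.0 else 0)
```

A real examinee would respond probabilistically; the deterministic stand-in is used only so the sketch behaves the same on every run.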

[2][12] In many situations, the purpose of the test is to classify examinees into two or more mutually exclusive and exhaustive categories.

For example, a new termination criterion and scoring algorithm must be applied that classifies the examinee into a category rather than providing a point estimate of ability.

A confidence interval approach is also used, where after each item is administered, the algorithm determines the probability that the examinee's true score is above or below the passing score.

Otherwise, it would be possible for an examinee with ability very close to the cutscore to be administered every item in the bank without the algorithm making a decision.
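A pass/fail decision rule of this kind, together with a cap on test length, might look like the following sketch. It assumes a normal-approximation interval around the ability estimate and a forced decision from the point estimate once the cap is reached; the function name, the default `z` value, and the forced-decision rule are illustrative assumptions rather than a documented procedure.

```python
def classify(theta_hat, se, cutscore, n_administered, max_items, z=1.96):
    """Return 'pass', 'fail', or 'continue' for a pass/fail CAT (sketch)."""
    lower = theta_hat - z * se
    upper = theta_hat + z * se
    if lower > cutscore:
        return "pass"   # whole interval lies above the cutscore
    if upper < cutscore:
        return "fail"   # whole interval lies below the cutscore
    if n_administered >= max_items:
        # Length cap reached: force a decision from the point estimate
        # (assumed rule) instead of exhausting the item bank.
        return "pass" if theta_hat >= cutscore else "fail"
    return "continue"

clear_pass = classify(1.0, 0.3, 0.0, 10, 50)
undecided = classify(0.1, 0.3, 0.0, 10, 50)
forced = classify(0.1, 0.3, 0.0, 50, 50)
```

An examinee whose estimate sits at 0.1 with a cutscore of 0.0 stays undecided while items remain under the cap, and receives a forced decision only when the cap is hit.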

The item selection algorithm utilized depends on the termination criterion.

MCATs seek to maximize the test's accuracy by estimating multiple abilities simultaneously (unlike a computerized adaptive test, or CAT, which evaluates a single ability), using the sequence of items previously answered (Piton-Gonçalves & Aluísio 2012).