Sequential probability ratio test

The sequential probability ratio test (SPRT) is a specific sequential hypothesis test, developed by Abraham Wald[1] and later proven to be optimal by Wald and Jacob Wolfowitz.[2]

Neyman and Pearson's 1933 result inspired Wald to reformulate it as a sequential analysis problem.

The Neyman-Pearson lemma, by contrast, offers a rule of thumb for when all the data is collected (and its likelihood ratio known).

While originally developed for use in quality control studies in the realm of manufacturing, SPRT has been formulated for use in the computerized testing of human examinees as a termination criterion.[3][4][5]

As in classical hypothesis testing, SPRT starts with a pair of hypotheses, say $H_0$ and $H_1$ for the null hypothesis and the alternative hypothesis respectively.
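In the usual textbook formulation, the two hypotheses fix a parameter at two values, and the test tracks a running sum of log-likelihood ratios that is compared against two thresholds after every observation. The summary below is a sketch of that standard setup; the symbols $p_0$, $p_1$, $S_i$, $\Lambda_i$, $a$, $b$, $\alpha$ (type I error rate) and $\beta$ (type II error rate) are notation assumed here for illustration.

```latex
% Hypotheses (simple versus simple)
H_0 : p = p_0, \qquad H_1 : p = p_1

% Cumulative log-likelihood ratio after the i-th observation
S_i = S_{i-1} + \log \Lambda_i, \qquad S_0 = 0

% Stopping rule
a < S_i < b \;\Rightarrow\; \text{continue sampling}, \qquad
S_i \ge b \;\Rightarrow\; \text{accept } H_1, \qquad
S_i \le a \;\Rightarrow\; \text{accept } H_0

% Wald's approximate thresholds for target error rates \alpha and \beta
a \approx \log \frac{\beta}{1 - \alpha}, \qquad
b \approx \log \frac{1 - \beta}{\alpha}
```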

The threshold values are only an approximation because, in the discrete case, the signal (the cumulative log-likelihood ratio) may cross a threshold between samples.

Thus, depending on the penalty of making an error and the sampling frequency, one might set the thresholds more aggressively.
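To make the role of the thresholds concrete, here is a minimal Python sketch. It assumes Wald's usual approximations for the thresholds in terms of the target error rates; the function names `wald_thresholds` and `sprt` are hypothetical and chosen only for this illustration.

```python
import math
from typing import Iterable, Optional, Tuple


def wald_thresholds(alpha: float, beta: float) -> Tuple[float, float]:
    """Return approximate (lower, upper) thresholds on the log-likelihood-ratio scale.

    Assumption: Wald's approximations a = log(beta / (1 - alpha)) and
    b = log((1 - beta) / alpha), where alpha and beta are the target
    type I and type II error rates.
    """
    a = math.log(beta / (1.0 - alpha))
    b = math.log((1.0 - beta) / alpha)
    return a, b


def sprt(log_lr_increments: Iterable[float], a: float, b: float) -> Tuple[Optional[str], int]:
    """Accumulate per-sample log-likelihood ratios until a threshold is crossed.

    Returns (decision, number of samples used); decision is None if the
    data ran out before either threshold was reached.
    """
    s = 0.0
    n = 0
    for increment in log_lr_increments:
        n += 1
        s += increment
        if s >= b:
            return "accept H1", n
        if s <= a:
            return "accept H0", n
    return None, n


a, b = wald_thresholds(alpha=0.05, beta=0.05)
decision, n_used = sprt([1.0, 1.0, 1.0], a, b)
```

Smaller values of `alpha` and `beta` push the thresholds further apart, so the test samples longer before deciding. Because the sum is only checked after each whole sample, it can overshoot a threshold, which is why the realized error rates only approximately match the targets.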

A textbook example is parameter estimation of a probability distribution function.

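Consider the exponential distribution with density

$$f_\theta(x) = \theta^{-1} e^{-x/\theta}, \qquad x, \theta > 0.$$

The hypotheses are

$$H_0 : \theta = \theta_0, \qquad H_1 : \theta = \theta_1, \qquad \theta_1 > \theta_0.$$

Then the log-likelihood function (LLF) for one sample is

$$\log \Lambda(x) = \log\left(\frac{\theta_1^{-1} e^{-x/\theta_1}}{\theta_0^{-1} e^{-x/\theta_0}}\right) = -\log\frac{\theta_1}{\theta_0} + \frac{\theta_1 - \theta_0}{\theta_0 \theta_1}\, x.$$

The cumulative sum of the LLFs for all x is

$$S_n = \sum_{i=1}^{n} \log \Lambda(x_i) = -n \log\frac{\theta_1}{\theta_0} + \frac{\theta_1 - \theta_0}{\theta_0 \theta_1} \sum_{i=1}^{n} x_i.$$

Accordingly, with lower and upper thresholds $a$ and $b$, the stopping rule is to continue sampling while

$$a < -n \log\frac{\theta_1}{\theta_0} + \frac{\theta_1 - \theta_0}{\theta_0 \theta_1} \sum_{i=1}^{n} x_i < b.$$

After re-arranging we finally find

$$a + n \log\frac{\theta_1}{\theta_0} < \frac{\theta_1 - \theta_0}{\theta_0 \theta_1} \sum_{i=1}^{n} x_i < b + n \log\frac{\theta_1}{\theta_0}.$$

The thresholds are simply two parallel lines in $n$ with slope $\log(\theta_1/\theta_0)$.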
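The exponential example can be sketched in Python as follows. This is an illustrative sketch rather than a reference implementation: it assumes the density $\theta^{-1} e^{-x/\theta}$ with $\theta_1 > \theta_0$, compares the scaled running sum of the samples against two parallel lines in $n$, and uses a hypothetical function name `exponential_sprt`. The thresholds `a` and `b` are supplied by the caller.

```python
import math
from typing import Optional, Sequence, Tuple


def exponential_sprt(
    samples: Sequence[float],
    theta0: float,
    theta1: float,
    a: float,
    b: float,
) -> Tuple[Optional[str], int]:
    """SPRT for the mean of an exponential distribution (assumes theta1 > theta0).

    Uses the rearranged stopping rule: the scaled running sum of the samples
    is compared with two parallel lines of slope log(theta1 / theta0).
    """
    slope = math.log(theta1 / theta0)                # common slope of both boundary lines
    scale = (theta1 - theta0) / (theta0 * theta1)    # factor multiplying the sample sum
    total = 0.0
    for n, x in enumerate(samples, start=1):
        total += x
        statistic = scale * total
        if statistic >= b + n * slope:               # crossed the upper line
            return "accept H1", n
        if statistic <= a + n * slope:               # crossed the lower line
            return "accept H0", n
    return None, len(samples)                        # no decision with the data given


# Assumed thresholds for 5% error rates on both sides (Wald's approximation).
b = math.log(0.95 / 0.05)
a = -b
large_values = exponential_sprt([5.0] * 10, theta0=1.0, theta1=2.0, a=a, b=b)
small_values = exponential_sprt([0.1] * 10, theta0=1.0, theta1=2.0, a=a, b=b)
```

Consistently large samples push the scaled sum above the upper line and favour the larger mean, while consistently small samples let the rising lower line overtake the sum and favour the smaller mean.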

For example, suppose you are performing a quality control study on a factory lot of widgets.

In this example, p1 = 0.01 and p2 = 0.03, and the region between them is the indifference region (IR), because management considers these lots to be marginal and is content for them to be classified either way.
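A short Python sketch of the widget example follows. It treats each inspected widget as a Bernoulli trial and tests a defect rate of p1 against a defect rate of p2. The error rates of 5% on each side, the decision labels, and the function name `widget_lot_sprt` are assumptions made for illustration; they are not given in the example itself.

```python
import math
from typing import Iterable, Tuple


def widget_lot_sprt(
    defective_flags: Iterable[bool],
    p1: float = 0.01,
    p2: float = 0.03,
    alpha: float = 0.05,   # assumed type I error rate
    beta: float = 0.05,    # assumed type II error rate
) -> Tuple[str, int]:
    """Classify a lot by inspecting widgets one at a time.

    H0: defect rate = p1 (acceptable lot); H1: defect rate = p2 (unacceptable lot).
    """
    lower = math.log(beta / (1 - alpha))
    upper = math.log((1 - beta) / alpha)
    step_defective = math.log(p2 / p1)            # evidence added by a defective widget
    step_good = math.log((1 - p2) / (1 - p1))     # small negative step for a good widget
    s = 0.0
    n = 0
    for is_defective in defective_flags:
        n += 1
        s += step_defective if is_defective else step_good
        if s >= upper:
            return "reject lot", n
        if s <= lower:
            return "accept lot", n
    return "continue sampling", n


all_good = widget_lot_sprt([False] * 200)
three_bad = widget_lot_sprt([True, True, True])
```

Under these assumed error rates a single defective widget moves the sum far more than a single good one, so a lot is rejected after only a few defects but accepted only after a long run of good widgets.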

The SPRT is currently the predominant method of classifying examinees in a variable-length computerized classification test (CCT).

The test then evaluates the likelihood that an examinee's true score on that metric is equal to one of those two points.

A cutscore should always be set with a legally defensible method, such as a modified Angoff procedure.

The upper parameter p2 is conceptually the highest level that the test designer is willing to accept for a fail (because everyone below it has a good chance of failing), and the lower parameter p1 is the lowest level that the test designer is willing to accept for a pass (because everyone above it has a decent chance of passing).

While the SPRT was first applied to testing in the days of classical test theory, as is applied in the previous paragraph, Reckase (1983) suggested that item response theory be used to determine the p1 and p2 parameters.
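One way to picture the item response theory variant is sketched below in Python. The sketch assumes a Rasch (one-parameter logistic) model and places the two comparison points symmetrically at a distance `delta` around the cutscore on the ability scale, so that p1 and p2 are recomputed for every item from its difficulty. The model choice, the 5% error rates, and the names `rasch_prob` and `irt_sprt` are all assumptions for illustration, not details taken from Reckase (1983).

```python
import math
from typing import Sequence, Tuple


def rasch_prob(theta: float, difficulty: float) -> float:
    """Probability of a correct response under a Rasch model (assumed here)."""
    return 1.0 / (1.0 + math.exp(-(theta - difficulty)))


def irt_sprt(
    responses: Sequence[bool],
    difficulties: Sequence[float],
    cutscore: float,
    delta: float,
    alpha: float = 0.05,   # assumed error rates
    beta: float = 0.05,
) -> Tuple[str, int]:
    """Pass/fail classification with item-specific p1 and p2."""
    lower = math.log(beta / (1 - alpha))
    upper = math.log((1 - beta) / alpha)
    theta_lo = cutscore - delta   # ability point just below the cutscore
    theta_hi = cutscore + delta   # ability point just above the cutscore
    s = 0.0
    n = 0
    for correct, difficulty in zip(responses, difficulties):
        n += 1
        p_lo = rasch_prob(theta_lo, difficulty)   # per-item p1
        p_hi = rasch_prob(theta_hi, difficulty)   # per-item p2
        if correct:
            s += math.log(p_hi / p_lo)
        else:
            s += math.log((1 - p_hi) / (1 - p_lo))
        if s >= upper:
            return "pass", n
        if s <= lower:
            return "fail", n
    return "undecided", n


result = irt_sprt([True] * 10, [0.0] * 10, cutscore=0.0, delta=0.5)
```

The test length varies with the examinee: response patterns that clearly favour one side of the cutscore end the test early, while mixed patterns keep the sum between the thresholds for longer.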

Research on CCT since then has applied this methodology for several reasons.

Spiegelhalter et al.[6] have shown that SPRT can be used to monitor the performance of doctors, surgeons and other medical practitioners in such a way as to give early warning of potentially anomalous results.

More recently, in 2011, an extension of the SPRT method called Maximized Sequential Probability Ratio Test (MaxSPRT)[7] was introduced.

The salient feature of MaxSPRT is the allowance of a composite, one-sided alternative hypothesis, and the introduction of an upper stopping boundary.