The ability of a test to make a difference between the pre- and post-test probabilities of various conditions is a major factor in the indication for medical tests.
The most important systematic, reference group-based methods for estimating post-test probability include those summarized and compared in the following table and described further in the individual sections below.
Predictive values can be used to estimate the post-test probability of an individual if the pre-test probability of the individual can be assumed to be roughly equal to the prevalence in a reference group for which both test results and knowledge of the presence or absence of the condition (for example a disease, such as may be determined by a "gold standard") are available.
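As a minimal sketch of this idea, the following Python snippet derives the prevalence and the predictive values from the four counts of a reference group. The function name `predictive_values` and all counts are hypothetical and chosen only for illustration; they do not come from any real study.

```python
def predictive_values(tp, fp, fn, tn):
    """Return (prevalence, PPV, NPV) from reference-group counts.

    tp/fp: true/false positives, fn/tn: false/true negatives,
    as judged against the reference ("gold standard") diagnosis.
    """
    total = tp + fp + fn + tn
    prevalence = (tp + fn) / total   # fraction of the group with the condition
    ppv = tp / (tp + fp)             # P(condition | positive test)
    npv = tn / (tn + fn)             # P(no condition | negative test)
    return prevalence, ppv, npv


# Hypothetical reference group of 1,000 people (made-up counts).
prevalence, ppv, npv = predictive_values(tp=90, fp=45, fn=10, tn=855)

# If the individual's pre-test probability is roughly the prevalence,
# the post-test probability is the PPV after a positive result
# and 1 - NPV after a negative result.
post_test_if_positive = ppv
post_test_if_negative = 1 - npv

print(f"prevalence (assumed pre-test probability): {prevalence:.3f}")
print(f"post-test probability, positive result:    {post_test_if_positive:.3f}")
print(f"post-test probability, negative result:    {post_test_if_negative:.3f}")
```

The estimate is only as good as the assumption that the individual resembles the reference group.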
The above methods are inappropriate if the pre-test probability differs from the prevalence in the reference group used to establish, among other things, the positive predictive value of the test.
Such a difference can occur if another test preceded this one, or if the person performing the diagnostics considers that another pre-test probability must be used because of knowledge of, for example, specific complaints, other elements of the medical history, or signs in a physical examination. This can be done either by treating each finding as a test in itself with its own sensitivity and specificity, or at least by making a rough estimate of the individual pre-test probability.
In these cases, the prevalence in the reference group is not completely accurate in representing the pre-test probability of the individual, and, consequently, the predictive value (whether positive or negative) is not completely accurate in representing the post-test probability of the individual of having the target condition.
The relation between pre-test probability, likelihood ratio and post-test probability can also be estimated with a so-called Fagan nomogram (shown at right): a straight line drawn from the given pre-test probability through the given likelihood ratio, each on its own scale, crosses the post-test probability scale at the estimated post-test probability.
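The relation that the nomogram encodes graphically is conventionally written as post-test odds = pre-test odds × likelihood ratio. The following Python sketch applies it numerically; the function names and the example sensitivity, specificity and pre-test probability are hypothetical illustrations, not values from a particular test.

```python
def likelihood_ratios(sensitivity, specificity):
    """Return (LR+, LR-) for a test with the given sensitivity and specificity."""
    lr_positive = sensitivity / (1 - specificity)
    lr_negative = (1 - sensitivity) / specificity
    return lr_positive, lr_negative


def post_test_probability(pre_test_probability, likelihood_ratio):
    """Convert probability to odds, apply the likelihood ratio, convert back."""
    pre_test_odds = pre_test_probability / (1 - pre_test_probability)
    post_test_odds = pre_test_odds * likelihood_ratio
    return post_test_odds / (1 + post_test_odds)


# Hypothetical test: sensitivity 90%, specificity 95%.
lr_pos, lr_neg = likelihood_ratios(sensitivity=0.90, specificity=0.95)

# Hypothetical individual with an estimated pre-test probability of 10%.
p_after_positive = post_test_probability(0.10, lr_pos)
p_after_negative = post_test_probability(0.10, lr_neg)

print(f"LR+ = {lr_pos:.1f}, LR- = {lr_neg:.3f}")
print(f"post-test probability after a positive result: {p_after_positive:.3f}")
print(f"post-test probability after a negative result: {p_after_negative:.3f}")
```

Unlike predictive values, this calculation accepts any pre-test probability, so it can be used when the individual differs from the reference group.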
On the other hand, the effect of interference can potentially improve the efficacy of subsequent tests as compared to usage in the reference group, such as some abdominal examinations being easier when performed on underweight people.
Another method to overcome such inaccuracies is by evaluating the test result in the context of diagnostic criteria, as described in the next section.
In clinical practice, this is usually applied in the evaluation of an individual's medical history, where the "test" is usually a question (or even an assumption) regarding various risk factors, for example sex, tobacco smoking or weight, but it can potentially be a substantial test, such as putting the individual on a weighing scale.
Subsequently, it can be estimated that a woman in the United Kingdom who is aged between 55 and 59 and who has been exposed to high-dose ionizing radiation should have a risk of developing breast cancer over a period of one year of between 588 and 1,120 in 100,000 (that is, between 0.6% and 1.1%).
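The arithmetic behind such an estimate is a multiplication of the baseline risk in the reference group by the relative risk of the risk factor. The Python sketch below uses an assumed baseline of 280 per 100,000 per year and an assumed relative-risk range of 2.1 to 4.0. These inputs are not stated in the text here; they are illustrative values chosen because they reproduce the range of 588 to 1,120 per 100,000 given above.

```python
def risk_with_factor(baseline_risk, relative_risk):
    """Risk for an individual with the risk factor, given the baseline risk."""
    return baseline_risk * relative_risk


# ASSUMED inputs (illustrative; chosen to match the range quoted in the text).
BASELINE_PER_100K = 280          # yearly cases per 100,000 in the reference group
RR_LOW, RR_HIGH = 2.1, 4.0       # relative-risk range for the exposure

low = risk_with_factor(BASELINE_PER_100K, RR_LOW)
high = risk_with_factor(BASELINE_PER_100K, RR_HIGH)

# Dividing a per-100,000 rate by 1,000 expresses it as a percentage.
print(f"{low:.0f} to {high:.0f} per 100,000 "
      f"({low / 1000:.1f}% to {high / 1000:.1f}%)")
```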
However, this does not compensate for the previously mentioned effect of any difference between the pre-test probability of an individual and the prevalence in the reference group.
A method to compensate for both sources of inaccuracy above is to establish the relative risks by multivariate regression analysis.
Such establishment can draw on predictive values, likelihood ratios and relative risks.
For example, the ACR criteria for systemic lupus erythematosus define the diagnosis as the presence of at least 4 out of 11 findings, each of which can be regarded as a target value of a test with its own sensitivity and specificity.
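A count-based rule of this kind can be sketched in a few lines of Python. The function name `meets_criteria` and the generic labels `finding_1` to `finding_11` are hypothetical placeholders; the sketch only illustrates the "at least 4 out of 11" threshold and is not a clinical tool.

```python
def meets_criteria(findings, required=4):
    """Return True if at least `required` findings are present.

    `findings` maps a finding label to True (present) or False (absent).
    """
    present = sum(1 for is_present in findings.values() if is_present)
    return present >= required


# Hypothetical individual: 11 generic findings, 4 of them present.
example = {f"finding_{i}": False for i in range(1, 12)}
for label in ("finding_1", "finding_3", "finding_7", "finding_11"):
    example[label] = True

print(meets_criteria(example))  # threshold of 4 is reached
```

The rule deliberately counts findings instead of multiplying their individual likelihood ratios, which matters when the findings are not independent of one another.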
In this case, there has been evaluation of the tests for these target parameters when used in combination in regard to, for example, interference between them and overlap of target parameters, thereby striving to avoid inaccuracies that could otherwise arise if attempting to calculate the probability of the disease using likelihood ratios of the individual tests.
Another factor is the pre-test probability: a lower pre-test probability results in a lower absolute difference. As a consequence, even very powerful tests achieve a low absolute difference for conditions that are very unlikely in an individual (such as rare diseases in the absence of any other indicating sign); on the other hand, even tests with low power can make a great difference for highly suspected conditions.
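The effect can be illustrated numerically with the conventional odds form of the likelihood-ratio relation (post-test odds = pre-test odds × likelihood ratio). In the Python sketch below, the likelihood ratios and pre-test probabilities are hypothetical values picked to contrast a powerful test for a very unlikely condition with a weak test for a highly suspected one.

```python
def post_test_probability(pre_test_probability, likelihood_ratio):
    """Post-test probability via odds: post-odds = pre-odds * likelihood ratio."""
    pre_test_odds = pre_test_probability / (1 - pre_test_probability)
    post_test_odds = pre_test_odds * likelihood_ratio
    return post_test_odds / (1 + post_test_odds)


def absolute_difference(pre_test_probability, likelihood_ratio):
    """Absolute change in probability produced by the test result."""
    post = post_test_probability(pre_test_probability, likelihood_ratio)
    return post - pre_test_probability


# Powerful test (hypothetical LR+ of 100) for a very unlikely condition (0.01%).
strong_test_rare = absolute_difference(0.0001, 100)

# Weak test (hypothetical LR+ of 3) for a highly suspected condition (50%).
weak_test_suspected = absolute_difference(0.50, 3)

print(f"strong test, rare condition:      +{strong_test_rare:.4f}")
print(f"weak test, suspected condition:   +{weak_test_suspected:.4f}")
```

Under these assumed inputs, the weak test shifts the probability by 25 percentage points, while the powerful test shifts it by less than one.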
The absolute difference can be put in relation to the benefit that a medical test achieves for an individual, which can be roughly estimated as:
In this formula, what constitutes benefit or harm varies largely with personal and cultural values, but general conclusions can still be drawn.