The term contingency table was first used by Karl Pearson in "On the Theory of Contingency and Its Relation to Association and Normal Correlation",[1] part of the Drapers' Company Research Memoirs Biometric Series I published in 1904.
A crucial problem of multivariate statistics is finding the (direct-)dependence structure underlying the variables contained in high-dimensional contingency tables.
Suppose there are two variables, sex (male or female) and handedness (right- or left-handed).
Further suppose that 100 individuals are randomly sampled from a very large population as part of a study of sex differences in handedness.
The numbers of male, female, right-handed, and left-handed individuals are called marginal totals; the total number of individuals sampled is the grand total.
If the proportions of individuals in the different columns vary significantly between rows (or vice versa), it is said that there is a contingency between the two variables.
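As a minimal sketch, the marginal totals and row-wise proportions described above can be computed in plain Python; the cell counts below are illustrative assumptions, not figures taken from the source.

```python
# Hypothetical counts for the sex-by-handedness example (illustrative only).
table = {
    ("male",   "right"): 43, ("male",   "left"): 9,
    ("female", "right"): 44, ("female", "left"): 4,
}

rows = sorted({r for r, _ in table})
cols = sorted({c for _, c in table})

# Marginal totals: sums over each row and over each column.
row_totals = {r: sum(table[(r, c)] for c in cols) for r in rows}
col_totals = {c: sum(table[(r, c)] for r in rows) for c in cols}
grand_total = sum(table.values())

# Row-wise proportions: if these differ markedly between rows,
# the two variables are said to be contingent.
proportions = {r: {c: table[(r, c)] / row_totals[r] for c in cols}
               for r in rows}
```

Here each marginal total is just a sum over one dimension of the table, and the grand total is the sum of all cells.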
There may also be more than two variables, but higher-order contingency tables are difficult to represent visually.
For more on the use of a contingency table for the relation between two ordinal variables, see Goodman and Kruskal's gamma.
For a more complete discussion of their uses, see the main articles linked under each subsection heading.
The odds ratio has a simple expression in terms of probabilities. Given the joint probability distribution $p_{ij}$ of the two binary variables, the odds ratio is

$OR = \dfrac{p_{11}\,p_{22}}{p_{12}\,p_{21}}.$

A simple measure, applicable only to the case of 2 × 2 contingency tables, is the phi coefficient (φ), defined by

$\phi = \pm\sqrt{\dfrac{\chi^2}{N}},$

where χ² is computed as in Pearson's chi-squared test, and N is the grand total of observations.
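As a sketch, the odds ratio and phi coefficient for a 2 × 2 table can be computed directly from the cell counts; the counts used here are illustrative assumptions.

```python
# Illustrative 2x2 cell counts (assumptions, not data from the source).
n11, n12, n21, n22 = 43, 9, 44, 4
N = n11 + n12 + n21 + n22  # grand total of observations

# Odds ratio from the cell counts (equivalently, from the joint probabilities).
odds_ratio = (n11 * n22) / (n12 * n21)

# Pearson chi-squared statistic: sum over cells of (observed - expected)^2 / expected,
# with expected counts formed from the products of the marginal totals.
observed = [[n11, n12], [n21, n22]]
row_tot = [sum(row) for row in observed]
col_tot = [sum(col) for col in zip(*observed)]
chi2 = sum(
    (observed[i][j] - row_tot[i] * col_tot[j] / N) ** 2
    / (row_tot[i] * col_tot[j] / N)
    for i in range(2)
    for j in range(2)
)

# Phi coefficient (magnitude); the sign is taken from the cross-product n11*n22 - n12*n21.
phi = (chi2 / N) ** 0.5
```

For a 2 × 2 table this χ² equals N(n11·n22 − n12·n21)² divided by the product of the four marginal totals, which gives a quick cross-check of the computation.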
Pearson's contingency coefficient, $C = \sqrt{\chi^2 / (N + \chi^2)}$, does not reach a maximum of 1.0 even under complete association.[3] C can be adjusted so it reaches a maximum of 1.0 when there is complete association in a table of any number of rows and columns by dividing C by $\sqrt{(k-1)/k}$, where k is the number of rows or columns when the table is square, and more generally the smaller of the two.
Tetrachoric correlation assumes that the variable underlying each dichotomous measure is normally distributed.
Asymmetric lambda measures the percentage improvement in predicting the dependent variable.
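A sketch of asymmetric lambda (Goodman and Kruskal's lambda), read as the proportional reduction in errors when predicting the dependent (column) variable once the row variable is known; the table counts are illustrative assumptions.

```python
# Illustrative 2x2 table of counts (rows = predictor, columns = dependent variable).
table = [[30, 10], [5, 25]]
N = sum(sum(row) for row in table)

# Prediction errors ignoring the row variable: always guess the modal column.
col_totals = [sum(col) for col in zip(*table)]
errors_without = N - max(col_totals)

# Prediction errors knowing the row: guess the modal column within each row.
errors_with = N - sum(max(row) for row in table)

# Asymmetric lambda: proportional reduction in prediction errors.
lam = (errors_without - errors_with) / errors_without
```

A lambda of 0 means knowing the row variable does not improve prediction at all; 1 means it makes prediction error-free.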