In statistics, the correlation ratio is a measure of the curvilinear relationship between the statistical dispersion within individual categories and the dispersion across the whole population or sample.
The measure is defined as the ratio of two standard deviations representing these types of variation.
Suppose each observation is $y_{xi}$, where $x$ indicates the category the observation is in and $i$ labels the particular observation within that category. Let $n_x$ be the number of observations in category $x$, and let

$$\overline{y}_x = \frac{\sum_i y_{xi}}{n_x} \qquad \text{and} \qquad \overline{y} = \frac{\sum_x n_x \overline{y}_x}{\sum_x n_x}$$

be the mean of category $x$ and the mean of the whole population, respectively. The correlation ratio η (eta) is defined so as to satisfy

$$\eta^2 = \frac{\sum_x n_x \left(\overline{y}_x - \overline{y}\right)^2}{\sum_{x,i} \left(y_{xi} - \overline{y}\right)^2},$$

which can be written as

$$\eta^2 = \frac{\sigma_{\overline{y}}^2}{\sigma_y^2}, \qquad \text{where} \qquad \sigma_{\overline{y}}^2 = \frac{\sum_x n_x \left(\overline{y}_x - \overline{y}\right)^2}{\sum_x n_x} \qquad \text{and} \qquad \sigma_y^2 = \frac{\sum_{x,i} \left(y_{xi} - \overline{y}\right)^2}{\sum_x n_x},$$

i.e. the weighted variance of the category means divided by the variance of all samples.
If the relationship between values of $\overline{y}_x$ and values of $x$ is linear (which is certainly true when there are only two possibilities for $x$), this will give the same result as the square of Pearson's correlation coefficient; otherwise the correlation ratio will be larger in magnitude. It can therefore be used for judging non-linear relationships.
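The two-category case can be checked numerically: with the categories coded 0 and 1, the category means necessarily lie on a line in $x$, so $\eta^2$ coincides exactly with the squared Pearson correlation. A minimal sketch, using a small hypothetical data set chosen only for illustration:

```python
# Hypothetical two-category data set: x-code -> list of y values.
groups = {0: [2.0, 3.0, 5.0], 1: [6.0, 7.0, 9.0, 10.0]}

def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(xs, ys))
    vx = sum((a - mx) ** 2 for a in xs)
    vy = sum((b - my) ** 2 for b in ys)
    return cov / (vx * vy) ** 0.5

def eta_squared(categories):
    """eta^2: weighted variance of category means over total variance."""
    all_y = [v for vals in categories.values() for v in vals]
    gm = sum(all_y) / len(all_y)  # grand mean
    between = sum(len(v) * (sum(v) / len(v) - gm) ** 2
                  for v in categories.values())
    total = sum((y - gm) ** 2 for y in all_y)
    return between / total

# Flatten the groups into paired (x, y) observations.
xs = [x for x, vals in groups.items() for _ in vals]
ys = [y for vals in groups.values() for y in vals]

print(abs(eta_squared(groups) - pearson_r(xs, ys) ** 2) < 1e-12)  # True
```

With more than two categories the category means generally do not lie on a line, and $\eta^2 \geq r^2$ with equality only in the linear case.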
The correlation ratio η takes values between 0 and 1. The limit η = 0 represents the special case of no dispersion among the means of the different categories, while η = 1 refers to no dispersion within the respective categories. η is undefined when all data points of the complete population take the same value.
Suppose there is a distribution of test scores in three topics (categories):

Algebra: 45, 70, 29, 15 and 21 (5 scores)
Geometry: 40, 20, 30 and 42 (4 scores)
Statistics: 65, 95, 80, 70, 85 and 73 (6 scores)

Then the subject averages are 36, 33 and 78, with an overall average of 52. The sums of squares of the differences from the subject averages are 1952 for Algebra, 308 for Geometry and 600 for Statistics, adding to 2860. The overall sum of squares of the differences from the overall average is 9640. The difference of 6780 between these two is also the weighted sum of the squares of the differences between the subject averages and the overall average: 5(36 − 52)² + 4(33 − 52)² + 6(78 − 52)² = 6780. This gives η² = 6780/9640 = 0.7033, suggesting that most of the overall dispersion is a result of differences between topics rather than within topics; taking the square root gives η = 0.8386.
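The arithmetic can be reproduced directly from the definition. A minimal Python sketch, using the three score lists (45, 70, 29, 15, 21; 40, 20, 30, 42; 65, 95, 80, 70, 85, 73) that yield the stated within-subject sums of squares of 1952, 308 and 600:

```python
# Correlation ratio eta from its definition:
# eta^2 = sum_x n_x (ybar_x - ybar)^2 / sum_{x,i} (y_xi - ybar)^2

def correlation_ratio(categories):
    """categories: dict mapping category name -> list of values."""
    all_values = [v for vals in categories.values() for v in vals]
    grand_mean = sum(all_values) / len(all_values)
    # Weighted sum of squared deviations of category means from the grand mean.
    between = sum(
        len(vals) * ((sum(vals) / len(vals)) - grand_mean) ** 2
        for vals in categories.values()
    )
    # Total sum of squared deviations from the grand mean.
    total = sum((v - grand_mean) ** 2 for v in all_values)
    return (between / total) ** 0.5

scores = {
    "Algebra":    [45, 70, 29, 15, 21],
    "Geometry":   [40, 20, 30, 42],
    "Statistics": [65, 95, 80, 70, 85, 73],
}
eta = correlation_ratio(scores)
print(round(eta, 4))  # 0.8386
```

The intermediate quantities match the worked figures: the between-subjects sum of squares is 6780 and the total is 9640, so η² = 6780/9640 ≈ 0.7033.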
The trivial requirement for the extreme η = 0 is that all category means are the same, in which case the weighted variance of the category means vanishes.
The correlation ratio was introduced by Karl Pearson as part of analysis of variance.
Ronald Fisher commented: "As a descriptive statistic the utility of the correlation ratio is extremely limited. It will be noticed that the number of degrees of freedom in the numerator of η² depends on the number of the arrays",[1] to which Egon Pearson (Karl's son) responded: "Again, a long-established method such as the use of the correlation ratio [§45 The "Correlation Ratio" η] is passed over in a few words without adequate description, which is perhaps hardly fair to the student who is given no opportunity of judging its scope for himself."