Watterson estimator

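In population genetics, the Watterson estimator is a method for describing the genetic diversity in a population.[1][2] It is estimated by counting the number of polymorphic (segregating) sites in a sample of sequences.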

It is a measure of the "population mutation rate" (the product of the effective population size and the neutral mutation rate) from the observed nucleotide diversity of a population.

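The population mutation rate is $\theta = 4N_e\mu$ for a diploid population ($2N_e\mu$ for a haploid one), where $N_e$ is the effective population size and $\mu$ is the per-generation mutation rate of the population of interest (Watterson, 1975).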

The assumptions made are that there is a sample of $n$ haploid individuals from the population of interest with effective size $N_e$, and that there are infinitely many sites capable of varying (so that mutations never overlay or reverse one another).

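Because the number of segregating sites counted will increase with the number of sequences looked at, the correction factor $a_n$ is used. The estimate of $\theta$, often denoted $\hat{\theta}_w$, is $\hat{\theta}_w = K / a_n$, where $K$ is the number of segregating sites in the sample and $a_n = \sum_{i=1}^{n-1} 1/i$ is the $(n-1)$th harmonic number.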
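As a rough illustration, Watterson's estimator divides the number of segregating sites $K$ by the harmonic correction factor $a_n = \sum_{i=1}^{n-1} 1/i$. The following is a minimal Python sketch, assuming aligned sequences of equal length with no gaps or missing data. The function names and the toy sample are hypothetical and are not taken from any particular library.

```python
def harmonic_correction(n):
    """Return a_n = sum_{i=1}^{n-1} 1/i for a sample of n sequences."""
    if n < 2:
        raise ValueError("need at least two sequences")
    return sum(1.0 / i for i in range(1, n))


def count_segregating_sites(sequences):
    """Count alignment columns in which more than one allele is observed."""
    lengths = {len(s) for s in sequences}
    if len(lengths) != 1:
        raise ValueError("sequences must be aligned to the same length")
    return sum(1 for column in zip(*sequences) if len(set(column)) > 1)


def watterson_theta(sequences):
    """Estimate theta for the whole locus as K / a_n."""
    k = count_segregating_sites(sequences)
    return k / harmonic_correction(len(sequences))


# Toy alignment (assumed data): four sequences, two variable columns.
sample = ["ACGTACGT", "ACGTACGA", "ACCTACGT", "ACGTACGT"]
theta_w = watterson_theta(sample)
```

This returns an estimate for the whole locus; dividing by the sequence length would give a per-site value.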

When its assumptions are met, the estimator is unbiased and the variance of the estimator decreases with increasing sample size or recombination rate.
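For reference, the dependence on sample size can be made explicit in the case of no recombination. Under the standard neutral model, Watterson (1975) gives the variance of the number of segregating sites $K$ in a sample of $n$ sequences, from which the variance of the estimator follows. Here $a_n = \sum_{i=1}^{n-1} 1/i$ and $b_n = \sum_{i=1}^{n-1} 1/i^2$, and $\theta$ refers to the whole locus:

$$
\operatorname{Var}(K) = a_n\theta + b_n\theta^2,
\qquad
\operatorname{Var}\!\left(\frac{K}{a_n}\right) = \frac{\theta}{a_n} + \frac{b_n\theta^2}{a_n^2}.
$$

Since $a_n$ grows roughly like $\ln n$ while $b_n$ stays bounded (it converges to $\pi^2/6$), both terms shrink as $n$ increases, though only slowly.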

However, the estimator can be biased by population structure.

For example, Watterson's estimator is downwardly biased in an exponentially growing population.

It can also be biased by violation of the infinite-sites mutational model; if multiple mutations can overwrite one another, Watterson's estimator will be biased downward.

Comparing the value of Watterson's estimator to nucleotide diversity is the basis of Tajima's D, which allows inference of the evolutionary regime of a given locus.
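The comparison can be sketched in Python as follows. This is an illustrative implementation of the standard Tajima's D formula, assuming aligned sequences of equal length with no gaps or missing data. The function names and the toy sample are hypothetical.

```python
from itertools import combinations
from math import sqrt


def nucleotide_diversity(sequences):
    """Mean number of pairwise differences (pi) across the whole locus."""
    pairs = list(combinations(sequences, 2))
    diffs = sum(
        sum(x != y for x, y in zip(seq_a, seq_b)) for seq_a, seq_b in pairs
    )
    return diffs / len(pairs)


def tajimas_d(sequences):
    """Return Tajima's D, or None when it is undefined for the sample."""
    n = len(sequences)
    # Segregating sites: alignment columns with more than one allele.
    s = sum(1 for column in zip(*sequences) if len(set(column)) > 1)
    if n < 4 or s == 0:
        # The variance term vanishes for n < 4, and D is undefined when S = 0.
        return None

    # Constants from Tajima's variance approximation.
    a1 = sum(1.0 / i for i in range(1, n))
    a2 = sum(1.0 / i**2 for i in range(1, n))
    b1 = (n + 1) / (3.0 * (n - 1))
    b2 = 2.0 * (n**2 + n + 3) / (9.0 * n * (n - 1))
    c1 = b1 - 1.0 / a1
    c2 = b2 - (n + 2) / (a1 * n) + a2 / a1**2
    e1 = c1 / a1
    e2 = c2 / (a1**2 + a2)

    theta_w = s / a1  # Watterson's estimator
    pi = nucleotide_diversity(sequences)
    return (pi - theta_w) / sqrt(e1 * s + e2 * s * (s - 1))


# Toy alignment (assumed data): both variable sites are singletons.
sample = ["ACGTACGT", "ACGTACGA", "ACCTACGT", "ACGTACGT"]
d = tajimas_d(sample)
```

In this toy sample both variants are singletons, so nucleotide diversity falls below Watterson's estimator and D is negative. A sample of four sequences is far too small for real inference; it is used here only to keep the arithmetic easy to follow.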