A diversity index is a method of measuring how many different types (e.g. species) there are in a dataset (e.g. a community).
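The equation is:[3][4]

$${}^qD = \frac{1}{M_{q-1}} = \frac{1}{\sqrt[q-1]{\sum_{i=1}^{R} p_i \, p_i^{q-1}}}$$

The denominator $M_{q-1}$ equals the average proportional abundance of the types in the dataset as calculated with the weighted generalized mean with exponent $q - 1$. Here $R$ is the number of types and $p_i$ is the proportional abundance of the $i$th type.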
The general equation of diversity is often written in the form[1][2]

$${}^qD = \left(\sum_{i=1}^{R} p_i^{q}\right)^{1/(1-q)}$$

and the term inside the parentheses is called the basic sum.
Some popular diversity indices correspond to the basic sum as calculated with different values of q.
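As a minimal sketch of how the basic sum turns into a diversity value for different orders q, the following Python function evaluates the general equation directly. The name `true_diversity` and the example abundances are illustrative assumptions. The special case at q = 1 uses the standard limit of the formula (the exponential of the Shannon entropy), because the exponent 1/(1 − q) is undefined there.

```python
import math

def true_diversity(proportions, q):
    """Effective number of types (true diversity) of order q.

    `proportions` are the proportional abundances p_i, assumed to sum to 1.
    """
    # Types with zero abundance do not contribute to the basic sum.
    p = [x for x in proportions if x > 0]
    if abs(q - 1.0) < 1e-12:
        # The general formula is undefined at q = 1; its limit is
        # exp(-sum(p_i * ln p_i)), the exponential of the Shannon entropy.
        return math.exp(-sum(x * math.log(x) for x in p))
    basic_sum = sum(x ** q for x in p)
    return basic_sum ** (1.0 / (1.0 - q))

abundances = [0.5, 0.25, 0.25]  # hypothetical community with three types
for q in (0, 1, 2):
    print(q, round(true_diversity(abundances, q), 4))
# 0 3.0
# 1 2.8284
# 2 2.6667
```

With q = 0 the result is simply the number of types, and larger values of q give progressively more weight to the most abundant type, so the effective number of types falls.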
The order $q$ is generally limited to non-negative values, because negative values of $q$ would give rare species so much more weight than abundant ones that ${}^qD$ would exceed $R$.[3][4]
Richness $R$ simply quantifies how many different types the dataset of interest contains.
Richness is a simple measure, so it has been a popular diversity index in ecology, where abundance data are often not available.
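Because richness needs only presence data, it can be computed from a plain list of observations. The sketch below assumes each observation is a type label. The function name `richness` and the sample data are hypothetical.

```python
from collections import Counter

def richness(observations):
    """Number of distinct types among the observed entities."""
    # Counter tallies how often each type occurs; richness ignores the
    # tallies and only counts how many distinct types were seen.
    return len(Counter(observations))

sample = ["oak", "pine", "oak", "birch", "oak"]  # hypothetical survey
print(richness(sample))  # 3
```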
The Shannon index is most often calculated as follows:

$$H' = -\sum_{i=1}^{R} p_i \ln p_i$$

where $p_i$ is the proportion of characters belonging to the $i$th type of letter in the string of interest.
In ecology, $p_i$ is often the proportion of individuals belonging to the $i$th species in the dataset of interest.
Then the Shannon entropy quantifies the uncertainty in predicting the species identity of an individual that is taken at random from the dataset.
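A short sketch of this calculation, assuming the natural logarithm and an input that is either a string of characters or a list of species labels. The helper name `shannon_entropy` is illustrative only.

```python
import math
from collections import Counter

def shannon_entropy(items):
    """Shannon entropy (natural log) of the types found in `items`."""
    counts = Counter(items)
    total = sum(counts.values())
    # p_i = n_i / total is the proportion of entities of the i-th type.
    return -sum((n / total) * math.log(n / total) for n in counts.values())

# Two equally common letters: the entropy equals ln 2.
print(round(shannon_entropy("aabb"), 4))  # 0.6931

# The same function applies to species labels.
print(round(shannon_entropy(["oak", "oak", "pine", "birch"]), 4))
```

When every entity belongs to the same type the entropy is zero, because the identity of a randomly drawn individual is then certain.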
[9] The Shannon index (H') is related to the weighted geometric mean of the proportional abundances of the types:

$$H' = -\sum_{i=1}^{R} p_i \ln p_i = -\ln\left(\prod_{i=1}^{R} p_i^{p_i}\right)$$

The more unequal the abundances of the types, the larger the weighted geometric mean of the $p_i$ values, and the smaller the corresponding Shannon entropy.
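The link between the Shannon index and the weighted geometric mean can be checked numerically. The sketch below assumes natural logarithms and two hypothetical communities, one perfectly even and one dominated by a single type. The function names are illustrative.

```python
import math

def weighted_geometric_mean(p):
    """Geometric mean of the p_i, each weighted by its own value."""
    # prod(p_i ** p_i); zero proportions are skipped (they contribute 1).
    return math.prod(x ** x for x in p if x > 0)

def shannon(p):
    """Shannon entropy of the proportions p, using the natural log."""
    return -sum(x * math.log(x) for x in p if x > 0)

even = [0.25, 0.25, 0.25, 0.25]
uneven = [0.7, 0.1, 0.1, 0.1]

for p in (even, uneven):
    g = weighted_geometric_mean(p)
    # The entropy equals the negative log of the weighted geometric mean.
    print(round(g, 4), round(shannon(p), 4), round(-math.log(g), 4))
```

For the uneven community the weighted geometric mean is larger and the entropy smaller than for the even one, in line with the statement above.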
The Simpson index equals the probability that two entities taken at random from the dataset of interest represent the same type. It equals:

$$\lambda = \sum_{i=1}^{R} p_i^2$$
If the dataset is small, and sampling without replacement is assumed, the probability of obtaining the same type with both random draws is:

$$\frac{\sum_{i=1}^{R} n_i (n_i - 1)}{N (N - 1)}$$

where $n_i$ is the number of entities belonging to the $i$th type and $N$ is the total number of entities in the dataset.
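To illustrate how much the sampling assumption matters in a small dataset, the sketch below computes the probability of drawing the same type twice, both with and without replacement, from raw counts. The function names and the counts are hypothetical, and the second function assumes at least two entities in total.

```python
def simpson_with_replacement(counts):
    """Probability that two draws with replacement give the same type."""
    total = sum(counts)
    return sum((n / total) ** 2 for n in counts)

def simpson_without_replacement(counts):
    """Probability that two draws without replacement give the same type.

    Assumes the dataset holds at least two entities in total.
    """
    total = sum(counts)
    return sum(n * (n - 1) for n in counts) / (total * (total - 1))

counts = [3, 2, 1]  # hypothetical dataset: six entities, three types
print(round(simpson_with_replacement(counts), 4))     # 0.3889
print(round(simpson_without_replacement(counts), 4))  # 0.2667
```

The gap between the two values shrinks as the dataset grows, since removing one entity then changes the proportions only slightly.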
[1][2] Both the inverse Simpson index (1/λ) and the transformation 1 − λ have also been called the Simpson index in the ecological literature, so care is needed to avoid accidentally comparing the different indices as if they were the same.
The inverse Simpson index equals:

$$\frac{1}{\lambda} = \frac{1}{\sum_{i=1}^{R} p_i^2} = {}^2D$$

This simply equals true diversity of order 2, i.e. the effective number of types that is obtained when the weighted arithmetic mean is used to quantify average proportional abundance of types in the dataset of interest.
The original Simpson index λ equals the probability that two entities taken at random from the dataset of interest (with replacement) represent the same type.
Its transformation 1 − λ, therefore, equals the probability that the two entities represent different types.
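Since the three quantities are easily confused, a small sketch that computes λ, its inverse and 1 − λ side by side may help. It assumes proportional abundances that sum to 1. The function names are illustrative and not taken from any library.

```python
def simpson_lambda(p):
    """Original Simpson index: chance that two draws (with replacement) match."""
    return sum(x * x for x in p)

def inverse_simpson(p):
    """Inverse Simpson index 1 / lambda, an effective number of types."""
    return 1.0 / simpson_lambda(p)

def simpson_complement(p):
    """1 - lambda: chance that two draws represent different types."""
    return 1.0 - simpson_lambda(p)

p = [0.5, 0.25, 0.25]  # hypothetical proportional abundances
print(simpson_lambda(p))                # 0.375
print(round(inverse_simpson(p), 4))     # 2.6667
print(simpson_complement(p))            # 0.625
```

For a perfectly even community with four types, λ is 0.25 and the inverse Simpson index is 4, the actual number of types.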