Incomplete lineage sorting

Incomplete lineage sorting (ILS)[1][2][3] (also referred to as hemiplasy, deep coalescence, retention of ancestral polymorphism, or trans-species polymorphism) is a phenomenon in evolutionary biology and population genetics that results in discordance between species and gene trees.

[4][5] By contrast, complete lineage sorting results in concordant species and gene trees.

This is a simplified example of incomplete lineage sorting; real studies are usually more complex and involve more genes and species.

Because of incomplete lineage sorting, a reconstructed phylogenetic tree may not reflect the actual relationships among the taxa.

[10] One way to reduce the impact of incomplete lineage sorting is to use multiple genes when constructing species or population phylogenies.
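Why sampling many genes helps can be illustrated with a small simulation. The sketch below assumes a three-species tree ((A,B),C) under a simple neutral coalescent model with no gene flow: on the internal branch (length T in coalescent units) the A and B lineages coalesce with probability 1 - exp(-T); otherwise all three lineages reach the root population and any pair is equally likely to join first. The function names, the branch length of 0.5, and the simple "most frequent topology" vote are illustrative assumptions rather than a method described in the cited sources.

```python
import math
import random
from collections import Counter


def simulate_gene_tree(branch_length: float, rng: random.Random) -> str:
    """Return the sister pair of one gene tree for the species tree ((A,B),C).

    branch_length is the internal branch in coalescent units (assumed input).
    """
    # A and B coalesce on the internal branch with probability 1 - exp(-T).
    if rng.random() < 1.0 - math.exp(-branch_length):
        return "AB"
    # Otherwise three lineages enter the root population and the first
    # pair to coalesce is chosen uniformly at random.
    return rng.choice(["AB", "AC", "BC"])


def majority_topology(n_loci: int, branch_length: float, rng: random.Random) -> str:
    """Most frequent gene-tree topology across independent loci.

    Ties are broken by first-seen order (the behaviour of Counter.most_common).
    """
    counts = Counter(simulate_gene_tree(branch_length, rng) for _ in range(n_loci))
    return counts.most_common(1)[0][0]


def recovery_rate(n_loci: int, branch_length: float,
                  replicates: int = 2000, seed: int = 1) -> float:
    """Fraction of replicates in which the vote recovers the true pair AB."""
    rng = random.Random(seed)
    hits = sum(majority_topology(n_loci, branch_length, rng) == "AB"
               for _ in range(replicates))
    return hits / replicates


# Hypothetical settings, chosen only to show the trend with more loci.
for n_loci in (1, 10, 100):
    rate = recovery_rate(n_loci, branch_length=0.5)
    print(f"{n_loci:>3} loci: species tree recovered in {rate:.1%} of replicates")
```

With a single locus the species tree is recovered only as often as that gene happens to be concordant, whereas pooling many independent loci makes the concordant topology the clear majority. For more than three or four taxa a simple vote can be misleading, so this should be read as an intuition aid only.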

[8] Incomplete lineage sorting is common in sexually reproducing species because a species cannot be traced back to a single individual or breeding pair.

When ancestral populations are large and speciation events are closely spaced in time, these different pieces of DNA retain conflicting affiliations.
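The joint effect of population size and speciation timing can be made concrete with a standard coalescent result for three species: if the internal branch of the species tree spans t generations and the ancestral population has a constant diploid effective size Ne, the probability that a gene tree disagrees with the species tree is (2/3) · exp(-t / (2 Ne)). The following is a minimal sketch under those assumptions (neutral locus, no recombination within the locus, no gene flow); the function name and the parameter values are hypothetical and are not taken from the text.

```python
import math


def discordance_probability(t_generations: float, n_e: float) -> float:
    """Probability that a three-taxon gene tree conflicts with the species tree.

    t_generations: time between the two speciation events, in generations.
    n_e: diploid effective size of the ancestral population (assumed constant).
    """
    if t_generations < 0 or n_e <= 0:
        raise ValueError("t_generations must be >= 0 and n_e must be > 0")
    branch_length = t_generations / (2.0 * n_e)  # internal branch in coalescent units
    # With probability exp(-T) the two sister lineages fail to coalesce on the
    # internal branch; two of the three equally likely outcomes are then discordant.
    return (2.0 / 3.0) * math.exp(-branch_length)


# Hypothetical values chosen only to illustrate the trend.
for n_e in (10_000, 100_000):
    for t in (1_000, 100_000):
        p = discordance_probability(t, n_e)
        print(f"Ne={n_e:>7}, t={t:>7} generations: P(discordant) = {p:.3f}")
```

The probability approaches its maximum of 2/3 when the interval between speciation events is short relative to the population size, and falls toward zero when the interval is long or the population is small.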

Still, for 1.6% of the bonobo genome, sequences are more closely related to their human homologues than to those of chimpanzees, which is probably a result of incomplete lineage sorting.

[12] Lineage sorting is a method that allows paleoanthropologists to explore the genetic relationships and divergences that may not fit with their previous speciation models based on phylogeny alone.

[15] Incomplete lineage sorting is a common feature in viral phylodynamics. The phylogeny represented by the transmission of a disease from one person to the next (that is, the population-level tree) often does not correspond to the tree inferred from genetic analysis, owing to the population bottlenecks that are an inherent feature of viral transmission.

[16] Jacques and List (2019)[17] show that the concept of incomplete lineage sorting can be applied to account for non-treelike phenomena in language evolution.

Figure 1. Incomplete lineage sorting: see the text for an explanation.
Figure 2. Apparent incomplete lineage sorting: see the text for an explanation.
Figure 3. The pre-transmission interval and incomplete lineage sorting in the phylogeny of a human-transmissible virus. The shaded tree represents a transmission chain in which each region represents the pathogen population in one of three patients. The width of the shaded regions corresponds to the genetic diversity. In this scenario, A infects B with an imperfect transmission bottleneck, and then B infects C. The genealogy at the bottom is reconstructed from a sample of a single lineage from each patient at three distinct time points. When diversity exists in donor A, a pre-transmission interval will occur at each inferred transmission event (MRCA(A,B) precedes transmission from A to B), and the order of transmission events may become randomized in the virus genealogy. Note that the pre-transmission interval is also a random variable determined by the donor's diversity at the time of each transmission. Terminal branch lengths are also elongated by these processes.