Phylogenetic reconciliation

Phylogenetic reconciliation can account for the diversity of evolutionary trajectories that make up life's history and that are intertwined at every scale, from molecules to populations or cultures.

A recent avatar of the importance of interactions between levels of organization is the holobiont concept, where a macro-organism is seen as a complex partnership of diverse species.

Because all levels essentially deal with the same object, a phylogenetic tree, the same models of reconciliation—in particular those based on duplication-transfer-loss events, which are central to this article—can be transposed, with slight modifications, to any pair of connected levels:[13] an "inner", "lower", or "associate" entity (e.g. gene, symbiont species, population) evolves inside an "upper", or "host" one (respectively species, host, or geographical area).

The upper and lower entities are partially bound to the same history, leading to similarities in their phylogenetic trees, but the associations can change over time, become more or less strict or switch to other partners.

The first reconciliation models that explicitly took the two topologies into account and used a mechanistic, event-based approach were proposed for host and symbiont and for biogeography.

[24] The progressive development of phylogenetic reconciliation was thus possible through exchanges between multiple research communities studying phylogenies at the host and symbiont, gene and species, or biogeography levels.

However, the use of this principle is debated,[4] and it is commonly accepted that in molecular evolution it is more accurate to fit a probabilistic model, such as a random walk, which does not necessarily produce parsimonious scenarios.

[35] Host switch, i.e. inheritance of a symbiont from a kin lineage, is a crucial event in the evolution of parasitic or symbiotic relationships between species.

[43] Unlike LCA mapping, DTL reconciliation typically yields several scenarios of minimal cost, in some cases an exponential number.

The strength of the dynamic programming approach is that it makes it possible to compute a minimum cost of coevolution of the input upper and lower trees in quadratic time,[44] and to obtain a most parsimonious scenario through backtracking.
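
The recurrence behind such a dynamic program can be sketched as follows. This is a minimal illustration for undated trees, not the quadratic-time algorithm cited above: it uses a naive search for transfer recipients, does not check time feasibility, and returns only the minimum cost without backtracking. The nested-tuple tree encoding, the function names (`index_tree`, `dtl_min_cost`) and the default costs are assumptions made for this example.

```python
import math
from itertools import count

def index_tree(tree):
    """Index a nested-tuple tree; return (postorder nodes, children map, root)."""
    children, order, ids = {}, [], count()
    def visit(t):
        if isinstance(t, str):            # leaf: identified by its name
            node, kids = t, ()
        else:                             # internal node: fresh identifier
            kids = tuple(visit(child) for child in t)
            node = ("internal", next(ids))
        children[node] = kids
        order.append(node)
        return node
    root = visit(tree)
    return order, children, root

def descendants(children, node):
    """Set containing node and every node below it."""
    result = {node}
    for child in children[node]:
        result |= descendants(children, child)
    return result

def dtl_min_cost(lower, upper, leaf_map, D=2.0, T=3.0, L=1.0):
    """Minimum duplication-transfer-loss cost (illustrative, undated trees)."""
    g_order, g_kids, g_root = index_tree(lower)
    s_order, s_kids, _ = index_tree(upper)
    below = {s: descendants(s_kids, s) for s in s_order}
    above = {s: {a for a in s_order if s in below[a]} for s in s_order}
    c, inn, out = {}, {}, {}
    for g in g_order:                     # postorder: children of g first
        for s in s_order:                 # postorder: children of s first
            if not g_kids[g]:
                best = 0.0 if leaf_map[g] == s else math.inf
            else:
                g1, g2 = g_kids[g]
                best = D + inn[g1, s] + inn[g2, s]               # duplication
                if s_kids[s]:
                    s1, s2 = s_kids[s]
                    best = min(best,
                               inn[g1, s1] + inn[g2, s2],        # co-divergence
                               inn[g1, s2] + inn[g2, s1])
                best = min(best,
                           T + inn[g1, s] + out[g2, s],          # transfer of g2
                           T + inn[g2, s] + out[g1, s])          # transfer of g1
            c[g, s] = best
            # g placed at s or lower, paying one loss per upper branch crossed
            inn[g, s] = min([best] + [inn[g, k] + L for k in s_kids[s]])
        for s in s_order:
            # cheapest placement of g in a lineage neither above nor below s
            outside = [c[g, x] for x in s_order
                       if x not in below[s] and x not in above[s]]
            out[g, s] = min(outside, default=math.inf)
    return min(c[g_root, s] for s in s_order)

upper = (("A", "B"), "C")
lower = (("a", "c"), "b")
leaf_map = {"a": "A", "b": "B", "c": "C"}
print(dtl_min_cost(lower, upper, leaf_map))          # one transfer is cheapest
print(dtl_min_cost(lower, upper, leaf_map, T=10.0))  # duplication and losses win
```

In this toy input the lower tree groups `a` with `c` against the upper topology, so the cheapest explanation depends on the relative cost of a transfer versus a duplication followed by losses.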

CoRe-PA[46] explores in a recursive manner the space of cost vectors, searching for a good matching with the event frequencies in reconciliations.
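
The sensitivity of parsimony to the chosen cost vector can be illustrated with a small sketch. It does not reproduce the search performed by CoRe-PA or any other tool: it only scores two hypothetical candidate scenarios, described by their event counts, under several cost vectors. The scenario names, the counts and the helper functions are assumptions made for this example.

```python
def scenario_cost(counts, costs):
    """Total cost of a scenario given its event counts and a cost vector."""
    return sum(costs[event] * counts.get(event, 0) for event in costs)

def most_parsimonious(scenarios, costs):
    """Names of the scenarios reaching the minimum cost (ties are all returned)."""
    totals = {name: scenario_cost(counts, costs)
              for name, counts in scenarios.items()}
    best = min(totals.values())
    return sorted(name for name, total in totals.items() if total == best)

# Two hypothetical explanations of the same incongruence between two trees.
scenarios = {
    "one_transfer": {"T": 1},
    "dup_three_losses": {"D": 1, "L": 3},
}

for costs in ({"D": 2, "T": 3, "L": 1},
              {"D": 2, "T": 5, "L": 1},
              {"D": 2, "T": 10, "L": 1}):
    print(costs, most_parsimonious(scenarios, costs))
```

Raising the transfer cost moves the optimum from the transfer scenario to the duplication and loss scenario, with a tie in between, which is why the cost vector has to be chosen or estimated with care.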

Alternatively, COALA[47] is a preprocessing step that uses approximate Bayesian computation with sequential Monte Carlo: parameters are simulated, then statistically rejected or accepted, with successive refinement.

For example, the software Angst[49] chooses the costs that minimize the variation of genome size, in number of genes, between parent and children species.

Most software taking undated trees does not check temporal feasibility. Exceptions are Jane,[56] which explores the space of total orders via a genetic algorithm, and Notung[57] and Eucalypt,[58] which, as a post-process, search the set of optimal solutions for time-consistent ones.

[63] Originally, DTL reconciliation methods did not recognize this phenomenon and only allowed for transfer between contemporaneous branches of the tree, hence ignoring most plausible solutions.

However, methods working on undated upper trees can be seen as implicitly handling the unknown diversity by allowing transfers "to the future" from the point of view of one phylogeny, that is, the donor is more ancient than the recipient.

Finding and presenting structure among the multitude of possible reconciliations has been at the center of recent methodological developments, especially for methods aimed at hosts and symbionts.

[77] This can be achieved by giving support values to specific events based on all optimal (or suboptimal) reconciliations,[78] or with the use of a consensus reconciled tree.
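
A minimal sketch of event support values is given below. It assumes that each reconciliation has already been enumerated and is represented as a set of events, each event being a hypothetical `(lower node, event type, upper node)` triple; this encoding, the function names and the majority threshold are assumptions made for illustration, not the format of any specific tool.

```python
from collections import Counter

def event_support(reconciliations):
    """Fraction of the given reconciliations in which each event appears."""
    total = len(reconciliations)
    counts = Counter(event for rec in reconciliations for event in set(rec))
    return {event: n / total for event, n in counts.items()}

def majority_events(reconciliations, threshold=0.5):
    """Events whose support is strictly above the threshold."""
    support = event_support(reconciliations)
    return sorted(event for event, value in support.items() if value > threshold)

# Three hypothetical equally parsimonious reconciliations of the same trees.
optimal = [
    {("g1", "S", "AB"), ("g2", "T", "A")},
    {("g1", "S", "AB"), ("g2", "D", "A")},
    {("g1", "S", "AB"), ("g2", "T", "A")},
]

for event, value in sorted(event_support(optimal).items()):
    print(event, round(value, 2))
print(majority_events(optimal))
```

Events shared by all scenarios receive full support, while events that differ between scenarios receive partial support. Keeping only well-supported events is one simple way to summarize the set, although it is not guaranteed to yield a valid reconciliation on its own.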

The space of most parsimonious reconciliations can be expanded or reduced by increasing or decreasing the distance allowed for horizontal transfers,[58] which is easily done by dynamic programming.

[96] The sample of lower trees can similarly reflect their likelihood according to the aligned sequences, as obtained from Bayesian Markov chain Monte Carlo methods as implemented for example in Phylobayes.

The dynamic programming framework, like usual birth and death models, works under the hypothesis of independent evolution of children lineages in the lower tree.

In this latter case, a polynomial algorithm which does not use dynamic programming and is an extension of the LCA method can find all optimal solutions, including gene conversions.

It is a probabilistic model with a parsimony translation,[134] proposing two sequential LCA-type heuristics handled via an intermediate locus tree between gene and species.

An iconic example is that of blood-feeding or sap-feeding insects, which often depend on one or several bacterial symbionts to thrive on a resource that is abundant in sugar but lacks essential amino acids or vitamins.

[154] As in genetics, where sharing a host promotes HGTs between symbionts, linguistic barriers can foreclose the transmission of folktales or language elements.

[161] The link between two consecutive genes can also be modeled as an evolving character, subject to gain, loss, origination, breakage, duplication and transfer.

[167][168][169][170] Similarly, a study used reconciliation methods to differentiate the effect of diet evolution and phylogenetic inertia on the composition of mammalian gut microbiomes.

By reconstructing ancestral diets and microbiome composition onto a mammalian phylogeny, the study revealed that both effects contribute but at different time scales.

To address the limitations of standard two-level reconciliations for systems involving inter-dependencies at multiple levels, a methodological effort has been undertaken over the last decade to construct and use multi-level models.

The model can for example give higher likelihoods to reconciliation scenarios where horizontal gene transfers happen between entities sharing the same habitat.

A phylogenetic reconciliation between an upper phylogenetic tree (blue) and a lower one (red), annotated with the most often used evolutionary events (S, D, T, L) and their respective names in the contexts of phylogeography, host/symbiont and gene/species. For instance, the S event is called allopatric speciation when reconciling geographical areas and species, cospeciation between host and symbiont, and speciation for gene and species, but always corresponds to the same co-diversification pattern.
Some of the levels of biological organization commonly conceptualized as phylogenetic trees and to which phylogenetic reconciliation has been applied.
Tanglegram and two proposed reconciliation scenarios for pocket gophers and their chewing lice symbionts. For the host, O. stands for Orthogeomys, G. for Geomys and T. for Thomomys; for the symbiont, G. stands for Geomydoecus and T. for Thomomydoecus.
Graphical overview of reconciliation events, inputs, outputs, and computational difficulties.[30]
Phylogenetic reconciliations in Duplication Loss and Duplication Transfer Loss
Different cost assignments can give different most parsimonious solutions.
Not all scenarios including transfers are time feasible: some might include time constraints incompatible with the species tree.
A transfer can go from a species to one of its descendants via a sister lineage that went extinct.
In biogeography, a tree-like structure can be constructed to account for the possible migrations between different geographical areas.
An exponential number of scenarios might be most parsimonious, for example when two equivalent patterns have the same cost.
The lower tree can be unrooted, multifurcating, or given as a sample of potential trees, and reconciliation can be used to resolve those uncertainties to get a binary rooted lower tree.
A reconciliation score can be used to help construct an upper tree.
Events such as replacing transfer or gene conversion cannot be modeled with independent children lineages.
Failure to diverge and Incomplete Lineage Sorting are two population level events resulting in a particular reconciliation pattern.
Illustration of the inputs, outputs and events of published methods that can be identified as 3-level methods.[135]
A higher level of organization can structure two lower levels in the context of phylogenetic reconciliation.