Phylogenetic footprinting is a technique used to identify transcription factor binding sites (TFBS) within a non-coding region of DNA of interest by comparing it to the orthologous sequence in different species.[1]
Researchers have found that non-coding pieces of DNA contain binding sites for regulatory proteins that govern the spatiotemporal expression of genes.
These binding sites, or regulatory motifs, have proven hard to identify, primarily because they are short and can vary in sequence.
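The core comparison can be illustrated with a minimal sketch. It assumes the orthologous sequences have already been aligned to equal length (with `-` for gaps) and simply reports runs of columns that are identical in every species. The function name `conserved_windows`, the `min_len` threshold, and the toy alignment are hypothetical, chosen for illustration only; real footprinting tools score conservation against a phylogenetic tree rather than requiring perfect identity.

```python
def conserved_windows(alignment, min_len=6):
    """Return (start, end, sequence) for runs of fully conserved columns.

    `alignment` is a list of equal-length aligned sequences, one per species.
    A column counts as conserved only if every species has the same
    non-gap character there (a deliberately strict, illustrative rule).
    """
    length = len(alignment[0])
    conserved = [
        len({seq[i] for seq in alignment}) == 1 and alignment[0][i] != "-"
        for i in range(length)
    ]

    windows = []
    start = None
    # The trailing False closes a run that reaches the end of the alignment.
    for i, flag in enumerate(conserved + [False]):
        if flag and start is None:
            start = i
        elif not flag and start is not None:
            if i - start >= min_len:
                windows.append((start, i, alignment[0][start:i]))
            start = None
    return windows


# Toy alignment of one non-coding region from four hypothetical species.
toy_alignment = [
    "AAAATGACGTCAAAAA",
    "ACAATGACGTCAACAA",
    "AAGATGACGTCAAAGA",
    "CAAATGACGTCATAAA",
]
candidate_sites = conserved_windows(toy_alignment, min_len=6)
```

Here the flanking columns vary between species while the central block does not, so only the central block is reported as a candidate footprint.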
Phylogenetic footprinting relies upon two major concepts:
Phylogenetic footprinting was first used and published by Tagle et al. in 1988; it allowed researchers to predict evolutionarily conserved cis-regulatory elements responsible for embryonic ε and γ globin gene expression in primates.
Therefore, detecting these sites by phylogenetic footprinting is likely impossible unless a large number of closely related species are available.
To eliminate false positives, a statistical analysis must show that the reported motifs have a mutation rate meaningfully lower than that of the surrounding nonfunctional sequence.
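One simple way to frame such a test is sketched below. This is an illustrative assumption, not the method of any particular footprinting tool: it uses mismatches against the first (reference) sequence as a crude proxy for mutation rate, estimates a background rate from the flanking columns, and computes a one-sided binomial probability of seeing so few mismatches inside the motif by chance. It ignores the phylogenetic tree and treats sites as independent. The function names and the toy alignment are hypothetical.

```python
from math import comb


def count_mismatches(alignment, columns):
    """Count positions where a non-reference sequence differs from the reference."""
    ref = alignment[0]
    return sum(1 for seq in alignment[1:] for c in columns if seq[c] != ref[c])


def conservation_p_value(alignment, start, end):
    """One-sided binomial p-value that columns [start, end) mutate less than the flanks.

    Assumes an ungapped, equal-length alignment whose first row is the reference.
    """
    length = len(alignment[0])
    n_others = len(alignment) - 1
    motif_cols = range(start, end)
    flank_cols = [c for c in range(length) if c < start or c >= end]
    if not flank_cols or n_others < 1:
        raise ValueError("need flanking columns and at least two sequences")

    k = count_mismatches(alignment, motif_cols)   # mismatches inside the motif
    n = len(motif_cols) * n_others                # comparable sites in the motif
    p = count_mismatches(alignment, flank_cols) / (len(flank_cols) * n_others)

    # P(X <= k) for X ~ Binomial(n, p): chance of so few mismatches at the flank rate.
    return sum(comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k + 1))


toy_alignment = [
    "AAAATGACGTCAAAAA",
    "ACAATGACGTCAACAA",
    "AAGATGACGTCAAAGA",
    "CAAATGACGTCATAAA",
]
p_motif = conservation_p_value(toy_alignment, 4, 12)   # candidate motif
p_flank = conservation_p_value(toy_alignment, 0, 4)    # ordinary flanking window
```

In this toy case the motif window has no mismatches while a quarter of the flanking sites differ, so `p_motif` is small, whereas the flanking window is not distinguishable from background.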
This type of information could help us to identify regulatory elements that are not adequately conserved but occur in several copies in the input sequence.
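As a rough illustration of finding elements that occur in several copies within a single input sequence, the sketch below counts every k-mer and keeps those that reach a minimum copy number. The function name `repeated_kmers`, the parameters `k` and `min_copies`, and the toy sequence are hypothetical; the sketch requires exact matches on one strand only, whereas real motif finders allow mismatches and also consider the reverse complement.

```python
from collections import Counter


def repeated_kmers(sequence, k=6, min_copies=3):
    """Return {kmer: count} for k-mers occurring at least `min_copies` times.

    Occurrences may overlap; only exact matches on the given strand are counted.
    """
    counts = Counter(sequence[i:i + k] for i in range(len(sequence) - k + 1))
    return {kmer: n for kmer, n in counts.items() if n >= min_copies}


# Toy input containing three copies of the same 8-mer in different contexts.
toy_sequence = "TTGACGTCATTCCGGTGACGTCAAAGGCCTGACGTCAGG"
repeats = repeated_kmers(toy_sequence, k=8, min_copies=3)
```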