Origin of replication

[4] Incomplete, erroneous, or untimely DNA replication events can give rise to mutations, chromosomal polyploidy or aneuploidy, and gene copy number variations, each of which in turn can lead to diseases, including cancer.

[2][7][8][9] Additionally, origin sequences commonly have high AT-content across all kingdoms: repeats of adenine and thymine are easier to separate because their base-stacking interactions are weaker than those of guanine and cytosine.
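
As a rough illustration of how AT-rich stretches might be located computationally, the following minimal sketch computes the AT fraction in a sliding window. The function names, the window size, the threshold, and the toy sequence are all assumptions made for this example and do not come from the cited sources.

```python
def at_content(seq: str) -> float:
    """Fraction of A/T bases in seq (case-insensitive); 0.0 for an empty string."""
    seq = seq.upper()
    if not seq:
        return 0.0
    return (seq.count("A") + seq.count("T")) / len(seq)


def at_rich_windows(seq: str, window: int = 20, threshold: float = 0.75):
    """Return (start, at_fraction) for every window whose AT fraction meets the threshold.

    window and threshold are illustrative defaults, not established cut-offs.
    """
    hits = []
    for start in range(0, len(seq) - window + 1):
        frac = at_content(seq[start:start + window])
        if frac >= threshold:
            hits.append((start, frac))
    return hits


# Toy sequence (made up for illustration): GC-rich flanks around an AT-rich core.
toy = "GCGCGGCCGC" + "ATTATAATTTATATAAATAT" + "GGCCGCGCGC"
print(at_content(toy))
print(at_rich_windows(toy, window=20, threshold=0.9))
```

High AT-content alone does not identify an origin; a scan like this would only flag candidate regions for further analysis.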

[2][6][9][13][14][15][16][17] More than five decades ago, Jacob, Brenner, and Cuzin proposed the replicon hypothesis to explain the regulation of chromosomal DNA synthesis in E.

[2] A fundamental feature of the replicon hypothesis is that it relies on positive regulation to control DNA replication onset, which can explain many experimental observations in bacterial and phage systems.

[25][26][27][28][29][30][31][32][33][34] Eukaryotic chromosomes are also much larger than their bacterial counterparts, raising the need for initiating DNA synthesis from many origins simultaneously to ensure timely replication of the entire genome.
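
The need for many origins can be made concrete with a back-of-the-envelope calculation. The sketch below assumes evenly spaced origins that all fire at the same time and forks that move at a constant speed, so each inter-origin interval is copied by two converging forks. The genome sizes and fork speeds are illustrative order-of-magnitude assumptions, not values from the cited sources, and the model ignores replication timing programs and chromosome ends.

```python
def replication_time_s(length_bp: float, fork_speed_bp_s: float, n_origins: int = 1) -> float:
    """Idealized time (seconds) to replicate length_bp of DNA.

    Assumes n_origins evenly spaced origins firing simultaneously, each
    launching two forks that travel at fork_speed_bp_s until they meet.
    """
    if n_origins < 1:
        raise ValueError("n_origins must be at least 1")
    return length_bp / (2 * n_origins * fork_speed_bp_s)


# Assumed, illustrative values (not from the source text):
bacterial_bp = 4_600_000      # roughly the size of a bacterial chromosome
bacterial_fork = 1_000        # bp per second
eukaryotic_bp = 150_000_000   # roughly the size of a large eukaryotic chromosome
eukaryotic_fork = 50          # bp per second

print(replication_time_s(bacterial_bp, bacterial_fork) / 60, "minutes, one origin")
print(replication_time_s(eukaryotic_bp, eukaryotic_fork) / 86_400, "days, one origin")
print(replication_time_s(eukaryotic_bp, eukaryotic_fork, n_origins=1_000) / 60, "minutes, 1000 origins")
```

Under these assumptions a single origin would take weeks to copy a large eukaryotic chromosome, whereas a thousand origins bring the time down to well under an hour.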

Taken together, the discovery and isolation of origin sequences in various organisms represent a significant milestone towards gaining mechanistic understanding of replication initiation.

In addition, these accomplishments had profound biotechnological implications for the development of shuttle vectors that can be propagated in bacterial, yeast and mammalian cells.

[47][48][49][50][51][52][53] While the sequence, number, and arrangement of origin-associated DnaA-boxes vary throughout the bacterial kingdom, their specific positioning and spacing in a given species are critical for oriC function and for productive initiation complex formation.
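
To illustrate how the number and arrangement of DnaA-boxes could be surveyed in a sequence, the following sketch scans both strands for the commonly cited 9-bp consensus 5'-TTATNCACA-3' of high-affinity E. coli DnaA-boxes. The use of this consensus, the function names, and the toy sequence are assumptions for the example; low-affinity sites diverge from the consensus and would not be found by such a strict match.

```python
import re

# Assumed consensus for high-affinity DnaA-boxes (N = any base).
# The lookahead lets overlapping matches be reported.
DNAA_BOX = re.compile(r"(?=(TTAT[ACGT]CACA))")
BOX_LEN = 9
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Reverse complement of an uppercase or lowercase DNA string."""
    return seq.upper().translate(COMPLEMENT)[::-1]


def find_dnaa_boxes(seq: str):
    """Return sorted (start, strand, match) tuples; start is 0-based on the given strand."""
    seq = seq.upper()
    hits = [(m.start(), "+", m.group(1)) for m in DNAA_BOX.finditer(seq)]
    rc = reverse_complement(seq)
    for m in DNAA_BOX.finditer(rc):
        # Convert the position on the reverse strand back to forward coordinates.
        start = len(seq) - (m.start() + BOX_LEN)
        hits.append((start, "-", m.group(1)))
    return sorted(hits)


# Toy sequence (made up): one box on each strand, in opposite orientations.
toy_origin = "GG" + "TTATCCACA" + "GGG" + "TGTGGATAA" + "CC"
for start, strand, box in find_dnaa_boxes(toy_origin):
    print(start, strand, box)
```

Reporting strand and position together makes it possible to compare the spacing and orientation of boxes between species, which the text identifies as critical for origin function.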

E. coli oriC comprises an approximately 260 bp region containing four types of initiator binding elements that differ in their affinities for DnaA and their dependencies on the co-factor ATP.

[41][59][60][61][62][63] By contrast, the I, τ, and C-sites, which are interspersed between the R-sites, are low-affinity DnaA-boxes and associate preferentially with ATP-bound DnaA, although ADP-DnaA can substitute for ATP-DnaA under certain conditions.

[47][67][68][69] DNA strand separation is additionally aided by direct interactions of DnaA's AAA+ ATPase domain with triplet repeats, so-called DnaA-trios, in the proximal DUE region.

[70] After melting, the DUE provides an entry site for the E. coli replicative helicase DnaB, which is deposited onto each of the single DNA strands by its loader protein DnaC.

[2] Although the different DNA binding activities of DnaA have been extensively studied biochemically and various apo, ssDNA-, or dsDNA-bound structures have been determined,[50][51][52][68] the exact architecture of the higher-order DnaA-oriC initiation assembly remains unclear.

Archaeal genomes typically encode multiple paralogs of Orc1/Cdc6 that vary substantially in their affinities for distinct ORB elements and that differentially contribute to origin activities.

[79][86][87][88] In Sulfolobus solfataricus, for example, three chromosomal origins have been mapped (oriC1, oriC2, and oriC3), and biochemical studies have revealed complex binding patterns of initiators at these sites.

[91][92] Interestingly, the DUE-flanking ORB or miniORB elements often have opposite polarities,[74][79][88][96][97] which predicts that the AAA+ lid subdomains and the winged-helix domains of Orc1/Cdc6 are positioned on either side of the DUE in a manner where they face each other.

[2] Origin organization, specification, and activation in eukaryotes are more complex than in bacterial or archaeal domains and significantly deviate from the paradigm established for prokaryotic replication initiation.

Nonetheless, DNA replication does initiate at discrete sites that are not randomly distributed across eukaryotic genomes, arguing that alternative means determine the chromosomal location of origins in these systems.

This region undergoes DNA-replication-dependent gene amplification at a defined stage during oogenesis and relies on the timely and specific activation of chorion origins, which in turn is regulated by origin-specific cis-elements and several protein factors, including the Myb complex, E2F1, and E2F2.

[148][149][150][151][152] This combinatorial specification and multifactorial regulation of metazoan origins has complicated the identification of unifying features that determine the location of replication start sites across eukaryotes more generally.

In addition to the recognition of certain DNA or epigenetic features, ORC also associates directly or indirectly with several partner proteins that could aid initiator recruitment, including LRWD1, PHIP (or DCAF14), and HMGA1a, among others.

Likewise, whether and how different epigenetic factors contribute to initiator recruitment in metazoan systems is poorly defined and is an important question that needs to be addressed in more detail.

[180] Understanding the molecular and biochemical mechanisms that orchestrate this complex interplay between 3D genome organization, local and higher-order chromatin structure, and replication initiation is an exciting topic for further studies.

Observations that metazoan origins often co-localize with promoter regions in Drosophila and mammalian cells and that replication-transcription conflicts due to collisions of the underlying molecular machineries can lead to DNA damage suggest that proper coordination of transcription and replication is important for maintaining genome stability.

[183][147] Sequence-independent (but not necessarily random) initiator binding to DNA additionally allows for flexibility in specifying helicase loading sites and, together with transcriptional interference and the variability in activation efficiencies of licensed origins, likely determines origin location and contributes to the co-regulation of DNA replication and transcriptional programs during development and cell fate transitions.

Computational modeling of initiation events in S. pombe and the identification of cell-type-specific and developmentally regulated origins in metazoans are both in agreement with this notion.

For instance, polyomaviruses utilize host cell DNA polymerases, which attach to a viral origin of replication if the T antigen is present.

[190][191][192][193][194] Nonetheless, despite the ability of cells to sustain viability under these exceptional circumstances, origin-dependent initiation is a common strategy universally adopted across different domains of life.

The extensively studied fungi and metazoa are both members of the opisthokont supergroup and exemplify only a small fraction of the evolutionary landscape in the eukaryotic domain.

[2] This article was adapted from the following source under a CC BY 4.0 license (2019) (reviewer reports): Babatunde Ekundayo; Franziska Bleichert (12 September 2019).

Models for bacterial (A) and eukaryotic (B) DNA replication initiation. A) Circular bacterial chromosomes contain a cis-acting element, the replicator, that is located at or near replication origins. i) The replicator recruits initiator proteins in a DNA sequence-specific manner, which results in melting of the DNA helix and loading of the replicative helicase onto each of the single DNA strands (ii). iii) Assembled replisomes bidirectionally replicate DNA to yield two copies of the bacterial chromosome. B) Linear eukaryotic chromosomes contain many replication origins. Initiator binding (i) facilitates replicative helicase loading (ii) onto duplex DNA to license origins. iii) A subset of loaded helicases is activated for replisome assembly. Replication proceeds bidirectionally from origins and terminates when replication forks from adjacent active origins meet (iv).
Origin organization and recognition in bacteria. A) Schematic of the architecture of E. coli origin oriC, Thermotoga maritima oriC, and the bipartite origin in Helicobacter pylori. The DUE is flanked on one side by several high- and weak-affinity DnaA-boxes as indicated for E. coli oriC. B) Domain organization of the E. coli initiator DnaA. Magenta circle indicates the single-strand DNA binding site. C) Models for origin recognition and melting by DnaA. In the two-state model (left panel), the DnaA protomers transition from a dsDNA binding mode (mediated by the HTH-domains recognizing DnaA-boxes) to an ssDNA binding mode (mediated by the AAA+ domains). In the loop-back model, the DNA is sharply bent backwards onto the DnaA filament (facilitated by the regulatory protein IHF)[38] so that a single protomer binds both duplex and single-stranded regions. In either instance, the DnaA filament melts the DNA duplex and stabilizes the initiation bubble prior to loading of the replicative helicase (DnaB in E. coli). HTH – helix-turn-helix domain, DUE – DNA unwinding element, IHF – integration host factor.
Origin organization and recognition in archaea. A) The circular chromosome of Sulfolobus solfataricus contains three different origins. B) Arrangement of initiator binding sites at two S. solfataricus origins, oriC1 and oriC2. Orc1-1 association with ORB elements is shown for oriC1. Recognition elements for additional Orc1/Cdc6 paralogs are also indicated, while WhiP binding sites have been omitted. C) Domain architecture of archaeal Orc1/Cdc6 paralogs. The orientation of ORB elements at origins leads to directional binding of Orc1/Cdc6 and MCM loading in between opposing ORBs (in B). (m)ORB – (mini-)origin recognition box, DUE – DNA unwinding element, WH – winged-helix domain.
Origin organization and recognition in eukaryotes. Specific DNA elements and epigenetic features involved in ORC recruitment and origin function are summarized for S. cerevisiae, S. pombe, and metazoan origins. A schematic of the ORC architecture is also shown, highlighting the arrangement of the AAA+ and winged-helix domains into a pentameric ring that encircles origin DNA. Ancillary domains of several ORC subunits involved in targeting ORC to origins are included. Other regions in ORC subunits may also be involved in initiator recruitment, by associating either directly or indirectly with partner proteins. A few examples are listed. Note that the BAH domain in S. cerevisiae Orc1 binds nucleosomes[102] but does not recognize H4K20me2.[103]
BAH – bromo-adjacent homology domain, WH – winged-helix domain, TFIIB – transcription factor II B-like domain in Orc6, G4 – G quadruplex, OGRE – origin G-rich repeated element. ORC gene names are indicated by a single number; e.g. 3 refers to ORC3.
HHV-6 genome
Genome of human herpesvirus-6, a member of the Herpesviridae family. The origin of replication is labeled as "OOR."