[3] However, a limitation of this method is that it generates only short sequences of DNA, which presents challenges to mapping its reads to a reference genome.
Bioinformatics tools are used to analyze the fluorescence data and make base calls, and to map or quantify the 50 bp, 100 bp, or 150 bp single- or paired-end reads.
Bioinformatic mapping of the sequencing reads is most efficient when the sample DNA fragments fall within a narrow length range.
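The idea of making a base call from fluorescence data can be illustrated with a minimal sketch. It assumes one intensity value per base channel per cycle and simply picks the brightest channel. The channel order, the `min_ratio` threshold, and the function names `call_base` and `call_read` are hypothetical choices for illustration only, not the algorithm used by any actual sequencer software.

```python
# Toy base caller: choose the channel with the strongest signal in each cycle.
# Assumptions (not from the source): channels are ordered A, C, G, T, and a
# call is rejected as 'N' unless the top signal is min_ratio times the second.
CHANNELS = "ACGT"


def call_base(intensities, min_ratio=1.5):
    """Return the base whose channel is brightest, or 'N' if ambiguous."""
    if len(intensities) != len(CHANNELS):
        raise ValueError("expected one intensity per channel")
    # Rank channel indices from brightest to dimmest.
    ranked = sorted(range(len(CHANNELS)), key=lambda i: intensities[i], reverse=True)
    best = intensities[ranked[0]]
    second = intensities[ranked[1]]
    # Reject calls where the top channel does not clearly dominate.
    if best <= 0 or best < min_ratio * second:
        return "N"
    return CHANNELS[ranked[0]]


def call_read(cycles, min_ratio=1.5):
    """Concatenate per-cycle base calls into a read string."""
    return "".join(call_base(c, min_ratio) for c in cycles)


# Hypothetical intensities for four sequencing cycles of one nanoball.
cycles = [
    (900.0, 50.0, 40.0, 30.0),   # clear A
    (20.0, 30.0, 850.0, 60.0),   # clear G
    (400.0, 390.0, 10.0, 5.0),   # A and C too close to separate
    (10.0, 15.0, 20.0, 700.0),   # clear T
]
print(call_read(cycles))
```

In this sketch the third cycle is reported as `N` because two channels are nearly equal. Production base callers typically also correct for channel crosstalk and assign a quality score to each call, which this example omits.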
An exonuclease is added to remove all remaining linear single-stranded and double-stranded DNA products.
[10] The resulting nanoparticle self-assembles into a tight ball of DNA approximately 300 nanometers (nm) across.
Nanoballs remain separated from each other because they are negatively charged and naturally repel each other, reducing any tangling between different single-stranded DNA lengths.
The emission of fluorescence from each DNA nanoball is captured on a high-resolution CCD camera.
In the FASTQ file created by BGI/MGI sequencers using DNA nanoballs on a patterned array flowcell, the read names look like this:
Because DNA nanoballs remain confined to their spots on the patterned array, there are no optical duplicates to contend with during bioinformatics analysis of sequencing reads.
Other advantages of DNA nanoball sequencing include the use of high-fidelity Phi 29 DNA polymerase[10] to ensure accurate amplification of the circular template, the several hundred copies of the circular template compacted into a small area, which produce an intense signal, and the attachment of the fluorophore to the probe at a long distance from the ligation point, which improves ligation.
This can introduce PCR bias and possibly amplify contaminants in the template construction phase.
[2] However, these disadvantages are common to all short-read sequencing platforms and are not specific to DNA nanoballs.
The cost of sequencing an entire human genome fell from about one million dollars in 2008 to $4,400 in 2010 with DNA nanoball technology.