Massively parallel Next-Generation Sequencing (NGS) technologies have paved the way for cost-effective high-throughput sequencing of genomic data. However, an important issue is tackling the level of noise in sequenced data, which can have a significant impact on downstream analysis. For example, Illumina HiSeq 2000, one of the leading NGS platforms, has a reported error rate of 0.2–0.4%, dominated mostly by substitution errors (Schirmer et al., 2016). This error rate, although very low, is an issue for detecting single nucleotide variations (SNV) in some challenging datasets, such as cancer genomes. Indeed, cancer is a complex disease which involves rapidly evolving cells, often with multiple distinct clones. In order to effectively understand the progression of a patient-specific tumor, comprehensive sampling of the tumor DNA at several time points is required. The traditional approach for this would require tissue biopsies at different time points. However, performing a tissue biopsy is an invasive, expensive and time-consuming procedure, and conducting it at multiple time points is not possible in many cases. Alternatively, it is now possible to sample and sequence circulating cell-free DNA (cfDNA) from the patient's bodily fluids (e.g. blood), circumventing the aforementioned drawbacks. cfDNA arises in the blood or urine from dying cells as a result of apoptosis or necrosis (Schwarzenbach et al., 2011). A proportion of these cells, and consequently their DNA, may derive from a tumor; this DNA is known as circulating tumor DNA (ctDNA). Because the proportion of tumor DNA could be very low, deep (ultra-deep) sequencing of ctDNA/cfDNA needs to be performed in order to monitor the progression of the tumor in patients.

Most of the available approaches using ctDNA/cfDNA are performed by targeting specific regions of the genome (a panel of cancer genes) that may potentially be involved in tumor progression. Polymerase chain reaction (PCR) techniques are then used to enrich the genomic fragments before capturing the regions of interest. A second round of PCR may be applied to the captured material to amplify the amount of DNA before sequencing.

One of the main challenges involved in working with ctDNA/cfDNA is the low proportion of the ctDNA involved (it can be as low as 0.01%), which could be lower than the sequencing error rate of the commonly used sequencing technologies. Furthermore, all DNA polymerase enzymes are known to have some point mutation error rate. For example, Taq, a popular polymerase used in PCR, has a point mutation error rate of 10⁻⁵–10⁻⁶ mutations per duplicated base pair (Clarke et al., 2001). Thus, errors in the sequenced reads can be introduced during PCR or during sequencing, with a PCR error being more severe than a sequencing error, since errors introduced in a given PCR cycle are propagated to all the descendant products of that cycle. PCR errors have the potential to cause false positive SNV predictions in downstream variation discovery analysis. Available variation discovery tools (DePristo et al., 2011; Garrison and Marth, 2012; Kockan et al., 2017; Rimmer et al., 2014) may have to deal with two major issues with this type of data: (i) a significant number of PCR duplicates and (ii) low alternative allele frequency. These two issues increase the false positive rate of variation prediction significantly (Newman et al., 2016). Therefore, handling such errors is an important problem in NGS protocols, for which several library preparation techniques have been developed (Davidsson et al., 2016; Lou et al., 2013). One such technique is tagging DNA molecules with short DNA barcodes, also known as unique molecular identifiers (UMI) (Kou et al., 2016).

Given a set of molecules to be sequenced, the barcoding process works as follows.
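
As a numerical aside on the figures quoted in this introduction (it is not a description of the barcoding procedure), the following minimal Python sketch compares how many reads would be expected to support a true ctDNA variant with how many would be expected to carry a sequencing error at the same position. It also shows why an early PCR error is so damaging. The function names, the sequencing depth, the even split of substitution errors over the three other bases, and the ideal-doubling model of PCR starting from a single template are all illustrative assumptions, not values or methods taken from the text.

```python
def expected_read_counts(depth, variant_fraction, error_rate):
    """Expected read counts at a single genomic position.

    Returns a tuple:
      (reads supporting the true variant,
       reads carrying any sequencing error at this position,
       reads whose error mimics one specific alternative base)
    """
    true_reads = depth * variant_fraction
    error_reads_total = depth * error_rate
    # Assumption: substitution errors are spread evenly over the
    # three possible wrong bases, so one third mimic a given allele.
    error_reads_same_allele = error_reads_total / 3
    return true_reads, error_reads_total, error_reads_same_allele


def pcr_error_fraction(cycle):
    """Fraction of final PCR products carrying an error that arose on
    one newly synthesized copy during the given cycle (1-based).

    Assumptions: a single starting template, perfect doubling in every
    cycle, and no further errors. After `cycle` cycles there are
    2**cycle molecules and exactly one carries the new error; perfect
    doubling then keeps that fraction constant.
    """
    if cycle < 1:
        raise ValueError("cycle must be >= 1")
    return 1.0 / (2 ** cycle)


if __name__ == "__main__":
    # Hypothetical depth of 10,000x; 0.01% ctDNA fraction and 0.2%
    # error rate are the figures quoted in the text.
    true_reads, err_total, err_same = expected_read_counts(10_000, 0.0001, 0.002)
    print(f"variant-supporting reads: {true_reads:.2f}")
    print(f"erroneous reads (any base): {err_total:.2f}")
    print(f"erroneous reads (same base): {err_same:.2f}")
    for k in (1, 2, 5, 10):
        print(f"error in cycle {k}: fraction of products = {pcr_error_fraction(k):.6f}")
```

Under these assumptions, a variant at the lowest quoted ctDNA fraction is supported by fewer reads than sequencing errors alone would produce at the same position, and an error made in the first PCR cycle ends up in half of all copies of the affected molecule.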