Rapid low-cost assembly of the Drosophila melanogaster reference genome using low-coverage, long-read sequencing

2018 
Accurate and comprehensive characterization of genetic variation is essential for deciphering the genetic basis of diseases and other phenotypes. A vast amount of genetic variation stems from large-scale sequence changes arising from the duplication, deletion, inversion, and translocation of sequences. In the past 10 years, high-throughput short reads have greatly expanded our ability to assay sequence variation due to single nucleotide polymorphisms. However, a recent de novo assembly of a second Drosophila melanogaster reference genome has revealed that short-read genotyping methods miss hundreds of structural variants, including those affecting phenotypes. While genomes assembled using high-coverage long reads can achieve high levels of contiguity and completeness, concerns about cost, errors, and low yield have limited the widespread adoption of such sequencing approaches. Here we resequence the reference strain of D. melanogaster (ISO1) on a single Oxford Nanopore MinION flow cell run for 24 hours. Using only reads longer than 1 kb, or 30x coverage, we de novo assemble a highly contiguous genome. The addition of inexpensive paired reads and subsequent scaffolding using an optical map technology achieved an assembly with completeness and contiguity comparable to the D. melanogaster reference assembly. Surprisingly, comparison of our assembly to the reference assembly of ISO1 uncovered a number of structural variants, including novel LTR transposable element insertions and duplications affecting genes with developmental, behavioral, and metabolic functions. Collectively, these structural variants provide a rare snapshot of the dynamics of metazoan genome evolution. Furthermore, our assembly and its comparison to the D. melanogaster reference genome demonstrate that reference-quality de novo assembly of metazoan genomes and comprehensive variant discovery using such assemblies are now possible for under $1,000 USD.
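
As a rough illustration of the quantities mentioned above (the 1 kb read-length cutoff, fold coverage, and contiguity), the following minimal Python sketch shows how they are conventionally computed from a list of read or contig lengths. It is not the pipeline used in this work: the function names, the toy lengths, and the genome size passed in are all hypothetical, and N50 is used here only as one common summary of contiguity.

```python
from typing import Iterable, List


def filter_reads(lengths: Iterable[int], min_len: int = 1000) -> List[int]:
    """Keep only read lengths strictly greater than min_len (1 kb by default)."""
    return [n for n in lengths if n > min_len]


def coverage(lengths: Iterable[int], genome_size: int) -> float:
    """Fold coverage: total sequenced bases divided by the genome size."""
    if genome_size <= 0:
        raise ValueError("genome_size must be positive")
    return sum(lengths) / genome_size


def n50(lengths: Iterable[int]) -> int:
    """Length L such that sequences of length >= L hold at least half the bases."""
    ordered = sorted(lengths, reverse=True)
    half = sum(ordered) / 2
    running = 0
    for n in ordered:
        running += n
        if running >= half:
            return n
    return 0  # empty input


if __name__ == "__main__":
    # Toy read lengths in bases; illustrative values only, not real data.
    reads = [500, 1000, 2000, 3000, 5000]
    kept = filter_reads(reads)
    # genome_size is a made-up toy value, not the D. melanogaster genome size.
    print(kept, coverage(kept, genome_size=1000), n50(kept))
```

In practice the lengths would be read from a FASTQ or FASTA file, and the genome size would be supplied by the user for the organism in question.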