Novel metrics for quantifying bacterial genome composition skews
2018
Bacterial genomes have characteristic compositional skews, which are differences in nucleotide frequency between the leading and lagging DNA strands across a segment of a genome. It is thought that these strand asymmetries arise as a result of mutational biases and selective constraints, particularly for energy efficiency. Analysis of compositional skews in a diverse set of bacteria provides a comparative context in which mutational and selective environmental constraints can be studied. These analyses typically require finished and well-annotated genomic sequences. We present three novel metrics for examining genome composition skews; all three metrics can be computed for unfinished or partially-annotated genomes. The first two metrics (dot-skew and cross-skew) depend on sequence and gene annotation of a single genome, while the third metric (residual skew) highlights unusual genomes by subtracting a GC content-based model of a library of genome sequences. We applied these metrics to 7738 available bacterial genomes, including partial drafts, and identified outlier species. A phylogenetically diverse set of these outliers (i.e., Borrelia, Ehrlichia, Kinetoplastibacterium, and Phytoplasma) display similar skew patterns and share lifestyle characteristics, such as intracellularity and biosynthetic dependence on their hosts. Our novel metrics appear to reflect the effects of biosynthetic constraints and adaptations to life within one or more hosts on genome composition. We provide results for each analyzed genome, software and interactive visualizations at http://db.systemsbiology.net/gestalt/skew_metrics.
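
For readers unfamiliar with compositional skew, the following is a minimal sketch of the conventional GC skew, (G - C) / (G + C), computed on one strand in sliding windows. It is not the dot-skew, cross-skew, or residual skew metric presented here, whose definitions are not given in this abstract; the function names and the window and step parameters are assumptions chosen for illustration.

```python
from collections import Counter
from typing import List


def gc_skew(seq: str) -> float:
    """Conventional GC skew of one strand: (G - C) / (G + C).

    Returns 0.0 when the sequence contains no G or C.
    Illustrative only; not one of the paper's three metrics.
    """
    counts = Counter(seq.upper())
    g, c = counts["G"], counts["C"]
    return (g - c) / (g + c) if (g + c) else 0.0


def windowed_gc_skew(seq: str, window: int, step: int) -> List[float]:
    """GC skew for each full window of length `window`, advancing by `step`."""
    return [
        gc_skew(seq[i:i + window])
        for i in range(0, len(seq) - window + 1, step)
    ]


if __name__ == "__main__":
    # Toy sequence: a G-rich half followed by a C-rich half.
    print(windowed_gc_skew("GGGGCCCC", window=4, step=4))
```

In a typical bacterial chromosome, a sign change in such a windowed profile is commonly used to locate the replication origin and terminus, which is the kind of strand asymmetry the abstract refers to.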