HierCC: A multi-level clustering scheme for population assignments based on core genome MLST

2020
Motivation: Routine infectious disease surveillance is increasingly based on large-scale whole genome sequencing databases. Real-time surveillance would benefit from immediate assignments of each genome assembly to hierarchical population structures. Here we present HierCC, a scalable clustering scheme based on core genome multi-locus sequence typing (cgMLST) that allows incremental, static, multi-level cluster assignments of genomes. We also present HCCeval, which identifies optimal thresholds for assigning genomes to cohesive HierCC clusters. HierCC was implemented in EnteroBase in 2018, and has since genotyped >400,000 genomes from Salmonella, Escherichia, Yersinia and Clostridioides.

Availability and implementation: http://enterobase.warwick.ac.uk/; source code: https://github.com/zheminzhou/HierCC

Contact: zhemin.zhou@warwick.ac.uk

Supplementary information: Supplementary data are available at Bioinformatics online.
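To make the idea of multi-level cluster assignment concrete, the following is a minimal sketch and not the HierCC implementation. It assumes that cgMLST profiles are lists of integer allele numbers, that the distance between two genomes is the number of loci with differing alleles, and that clusters at each level are formed by single-linkage at a fixed allelic-distance threshold; the abstract does not specify any of these details. The function names `allelic_distance` and `multilevel_clusters` are hypothetical. Each cluster is labelled by the index of its earliest member, one simple way to keep labels stable as later genomes are added.

```python
from itertools import combinations


def allelic_distance(a, b):
    """Number of loci at which two allelic profiles differ (assumed metric)."""
    return sum(1 for x, y in zip(a, b) if x != y)


def multilevel_clusters(profiles, thresholds):
    """Single-linkage cluster labels at each threshold (illustrative sketch).

    Returns {threshold: [label per genome]}, where a label is the index of
    the earliest genome in that cluster.
    """
    n = len(profiles)
    # Pairwise distances are computed once and reused for every level.
    pairs = [
        (allelic_distance(profiles[i], profiles[j]), i, j)
        for i, j in combinations(range(n), 2)
    ]
    levels = {}
    for t in thresholds:
        parent = list(range(n))  # union-find forest, one tree per cluster

        def find(i):
            while parent[i] != i:
                parent[i] = parent[parent[i]]  # path halving
                i = parent[i]
            return i

        for d, i, j in pairs:
            if d <= t:
                ri, rj = find(i), find(j)
                if ri != rj:
                    # Keep the smaller index as root so the label is the
                    # earliest genome in the merged cluster.
                    parent[max(ri, rj)] = min(ri, rj)
        levels[t] = [find(i) for i in range(n)]
    return levels


# Toy profiles over five loci (made-up allele numbers, for illustration only).
profiles = [
    [1, 1, 1, 1, 1],
    [1, 1, 1, 1, 2],
    [1, 1, 1, 3, 2],
    [5, 5, 5, 5, 5],
]
levels = multilevel_clusters(profiles, thresholds=[0, 1, 2])
for t, labels in levels.items():
    print(f"threshold {t}: {labels}")
```

In this toy example the first three genomes chain together at a threshold of one allele even though the first and third differ at two loci, which is the characteristic behaviour of single-linkage clustering.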