Variant-driven multi-wave pattern of COVID-19 via Machine Learning clustering of spike protein mutations

2021
Never before such a vast amount of data has been collected for any viral pandemic than for the current case of COVID-19. This offers the possibility to answer a number of highly relevant questions, regarding the evolution of the virus and the role mutations play in its spread among the population. We focus on spike proteins, as they bear the main responsibility for the effectiveness of the virus diffusion by controlling the interactions with the host cells. Using the available temporal structure of the sequencing data for the SARS-CoV-2 spike protein in the UK, we demonstrate that every wave of the pandemic is dominated by a different variant. Consequently, the time evolution of each variant follows a temporal structure encoded in the epidemiological Renormalisation Group approach to compartmental models. Machine learning is the tool of choice to determine the variants at play, independent of (but complementary to) the virological classification. Our Machine Learning algorithm on spike protein sequencing provides a simple and unbiased way to identify, classify and track relevant virus variants without any prior knowledge of their characteristics. Hence, we propose a new tool that can help preventing and forecasting the emergence of new waves, and that can be used by decision makers to define short and long term strategies to curb the current COVID-19 pandemic or future ones.
    • Correction
    • Source
    • Cite
    • Save
    38
    References
    1
    Citations
    NaN
    KQI
    []
    Baidu
    map