Adversarial domain translation networks enable fast and accurate large-scale atlas-level single-cell data integration

2021
The rapid emergence of large-scale atlas-level single-cell RNA-sequencing (scRNA-seq) datasets from various sources presents remarkable opportunities for broad and deep biological investigations through integrative analyses. However, harmonizing such datasets requires integration approaches that are not only computationally scalable but also capable of preserving a wide range of fine-grained cell populations. We created Portal, a unified framework of adversarial domain translation that learns harmonized representations of datasets. With innovations in model and algorithm design, Portal achieves superior performance in preserving biological variation during integration while significantly reducing running time and memory usage compared with existing approaches, integrating millions of cells in minutes with low memory consumption. We demonstrate the efficiency and accuracy of Portal using diverse datasets, including mouse brain atlas projects, the Tabula Muris project, and the Tabula Microcebus project. Portal has broad applicability: in addition to integrating multiple scRNA-seq datasets, it can also integrate scRNA-seq with single-nucleus RNA-sequencing (snRNA-seq) data. Finally, we demonstrate the utility of Portal by applying it to the integration of cross-species datasets with limited shared information between them, yielding biological insights into the similarities and divergences in the spermatogenesis process between mouse, macaque, and human.