devCellPy is a machine learning-enabled pipeline for automated annotation of complex multilayered single-cell transcriptomic data.
| Publication Type | Academic Article |
| Authors | Galdos F, Xu S, Goodyer W, Duan L, Huang Y, Lee S, Zhu H, Lee C, Wei N, Lee D, Wu S |
| Journal | Nat Commun |
| Volume | 13 |
| Issue | 1 |
| Pagination | 5271 |
| Date Published | 2022-09-07 |
| ISSN | 2041-1723 |
| Keywords | Induced Pluripotent Stem Cells, Transcriptome |
| Abstract | A major informatic challenge in single cell RNA-sequencing analysis is the precise annotation of datasets where cells exhibit complex multilayered identities or transitory states. Here, we present devCellPy, a highly accurate and precise machine learning-enabled tool that enables automated prediction of cell types across complex annotation hierarchies. To demonstrate the power of devCellPy, we construct a murine cardiac developmental atlas from published datasets encompassing 104,199 cells from E6.5-E16.5 and train devCellPy to generate a cardiac prediction algorithm. Using this algorithm, we observe a high prediction accuracy (>90%) across multiple layers of annotation and across de novo murine developmental data. Furthermore, we conduct a cross-species prediction of cardiomyocyte subtypes from in vitro-derived human induced pluripotent stem cells and unexpectedly uncover a predominance of left ventricular (LV) identity that we confirmed by an LV-specific TBX5 lineage tracing system. Together, our results show devCellPy to be a useful tool for automated cell prediction across complex cellular hierarchies, species, and experimental systems. |
| DOI | 10.1038/s41467-022-33045-x |
| PubMed ID | 36071107 |
| PubMed Central ID | PMC9452519 |
