
APPLICATION OF MACHINE LEARNING FOR MASS CYTOMETRY DATA ANALYSIS OF CHRONIC LYMPHOCYTIC LEUKAEMIA

Muizdeen Raji
Published: May 10, 2026
Mass cytometry is a powerful technique for quantifying intracellular and membrane proteins at single-cell resolution. However, the large volume of data it generates requires advanced analytical methods to extract meaningful biological insights. While machine learning (ML) has emerged as a key analytical tool in many fields, its application to mass cytometry remains underexplored. In this study, ML was employed to analyse mass cytometry data from primary chronic lymphocytic leukaemia (CLL) samples, focusing on batch effect correction, classification, and marker association analysis. An ML-based batch correction method was developed and compared with the non-ML tool CytofRUV. The ML-based approach performed better, reducing the Earth Mover’s Distance (EMD) between reference samples, with statistically significant p-values of 0.003 and 0.004 for anchor and validation samples, respectively. Additionally, classical ML algorithms were used to identify associations between cellular markers and CLL immunoglobulin gene mutational status. Phenotypic analysis of CLL cells using FlowSOM clustering based on cell surface markers revealed two major clusters, 10 and 1, which differed significantly between mutated (M-CLL) and unmutated (UM-CLL) cases. A classification model trained on 20 FlowSOM-generated clusters from 51 CLL cases achieved 75% accuracy in distinguishing M-CLL from UM-CLL, indicating that mutational status influences cell surface marker expression. Furthermore, XGBoost was used to predict Ki67 and MYC mRNA expression levels (high or low) in CLL cells using PLAYRCyTOF data. The model achieved an accuracy of 94% when intracellular markers were included and 80% without them. Key determinants of Ki67 and MYC mRNA expression included TCL1A, TXNIP, and HSPA5 mRNAs, as well as CD27, CD5, and IgM proteins. These findings highlight the potential of ML to enhance mass cytometry data analysis, surpassing traditional methods in extracting valuable biological insights.
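The abstract evaluates batch correction by the reduction in EMD between reference samples. As a rough illustration of that metric only, the sketch below computes a one-dimensional EMD for a single marker on synthetic data, before and after a toy median alignment. The function name `emd_1d`, the simulated values, the size of the batch shift, and the median-alignment step are all assumptions made for illustration; this is not the study's ML-based correction method, nor CytofRUV.

```python
import random
import statistics


def emd_1d(a, b):
    """1-D Earth Mover's Distance between two equal-sized samples.

    For equal-sized samples with uniform weights, the EMD equals the
    mean absolute difference between the sorted values.
    """
    if len(a) != len(b):
        raise ValueError("this sketch assumes equal sample sizes")
    return sum(abs(x - y) for x, y in zip(sorted(a), sorted(b))) / len(a)


random.seed(0)
n_cells = 2000

# Synthetic stand-in for one marker of the same reference sample
# measured in two batches (hypothetical batch shift of 0.8).
batch1 = [random.gauss(2.0, 1.0) for _ in range(n_cells)]
batch2 = [random.gauss(2.8, 1.0) for _ in range(n_cells)]

emd_before = emd_1d(batch1, batch2)

# Toy correction: subtract the difference in medians from batch 2.
# This is a placeholder, NOT the ML-based method described above.
offset = statistics.median(batch2) - statistics.median(batch1)
batch2_corrected = [x - offset for x in batch2]

emd_after = emd_1d(batch1, batch2_corrected)

print(f"EMD before: {emd_before:.3f}, after: {emd_after:.3f}")
```

A lower EMD after correction means the marker distributions of the two reference runs sit closer together. In practice the comparison would presumably be made per marker across real anchor and validation samples rather than on simulated values.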
Keywords: Cluster analysis; Flow cytometry; Computational biology; Machine learning; Artificial intelligence