Abstract
Many hematological diseases are characterized by altered abundance and morphology of blood cells and their progenitors. Myelodysplastic syndromes (MDS), for example, are a type of blood cancer manifesting via a range of cytopenias and dysplastic changes of blood and bone marrow cells. While experts analyze cytomorphology to diagnose MDS, similar alterations can be observed in other conditions such as haematinic deficiency anemias, and definitive diagnosis requires complementary information such as blood counts, karyotype and molecular testing. However, recent work has demonstrated that computational analysis of bone marrow slides can predict not only MDS or acute myeloid leukemia (AML) but also the presence of specific mutations. Here, we present and make available Haemorasis, a computational method that detects and characterizes white and red blood cells (WBC and RBC, respectively) in peripheral blood slides. We apply it to over 300 individuals with different conditions (SF3B1-mutant and SF3B1-wildtype MDS, megaloblastic anemia and iron deficiency anemia), in whom Haemorasis detects over half a million WBC and millions of RBC. We then show how these large sets of cell images can be used in diagnosis and prognosis, while identifying novel associations between computational morphotypes and disease. We find that hypolobulated neutrophils and large RBC are characteristic of SF3B1-mutant MDS and that hyperlobulated neutrophils, while prevalent in both iron deficiency and megaloblastic anemia, are larger in the latter. Finally, we externally validate these methods, showing that they generalize to other centers and scanners.
Competing Interest Statement
G.S.V. is a consultant for AstraZeneca and STRM.BIO. The other authors declare no competing interests.
Funding Statement
GSV is supported by a Cancer Research UK Senior Cancer Fellowship (C22324/A23015) and work in his lab is also funded by the European Research Council, Kay Kendall Leukaemia Fund, Blood Cancer UK, and the Wellcome Trust. JGA receives funding from EMBL.
Author Declarations
I confirm all relevant ethical guidelines have been followed, and any necessary IRB and/or ethics committee approvals have been obtained.
Yes
The details of the IRB/oversight body that provided approval or exemption for the research described are given below:
All Munich Leukemia Laboratory data provided for this investigation were reviewed and approved by Munich Leukemia Laboratory's internal institutional review board and follow the European Union's General Data Protection Regulation (GDPR). This research was conducted in line with the European Molecular Biology Laboratory's internal policy 53 (Internal Policy regarding the Use of Human Biological Material).
I confirm that all necessary patient/participant consent has been obtained and the appropriate institutional forms have been archived, and that any patient/participant/sample identifiers included were not known to anyone (e.g., hospital staff, patients or participants themselves) outside the research group so cannot be used to identify individuals.
Yes
I understand that all clinical trials and any other prospective interventional studies must be registered with an ICMJE-approved registry, such as ClinicalTrials.gov. I confirm that any such study reported in the manuscript has been registered and the trial registration ID is provided (note: if posting a prospective study registered retrospectively, please provide a statement in the trial ID field explaining why the study was not registered in advance).
Yes
I have followed all appropriate research reporting guidelines and uploaded the relevant EQUATOR Network research reporting checklist(s) and other pertinent material as supplementary files, if applicable.
Yes
Data Availability
The different cohorts of digitized peripheral blood slides can be made available upon request. The annotated datasets for tile quality classification, white blood cell segmentation and red blood cell filtering can be found at https://doi.org/10.6084/m9.figshare.19153760. The machine-learning model parameters are available at https://doi.org/10.6084/m9.figshare.19164209. The data necessary to run the morphotype analysis are available at https://doi.org/10.6084/m9.figshare.19372292. The output of the morphotype analysis, as well as the expert-annotated cells, and the data necessary for downstream analysis are available at https://doi.org/10.6084/m9.figshare.19369391 and https://doi.org/10.6084/m9.figshare.19371008, respectively.