RT Journal Article
SR Electronic
T1 Distributed Statistical Analyses: A Scoping Review and Examples of Operational Frameworks Adapted to Healthcare
JF medRxiv
FD Cold Spring Harbor Laboratory Press
SP 2023.12.21.23300389
DO 10.1101/2023.12.21.23300389
A1 Lemyre, Félix Camirand
A1 Lévesque, Simon
A1 Domingue, Marie-Pier
A1 Herrmann, Klaus
A1 Ethier, Jean-François
YR 2023
UL http://medrxiv.org/content/early/2023/12/24/2023.12.21.23300389.abstract
AB Data from multiple organizations are crucial for advancing learning health systems. However, ethical, legal, and social concerns may restrict the use of standard statistical methods that rely on pooling data. Although distributed algorithms offer alternatives, they may not always be suitable for healthcare research frameworks. This paper aims to support researchers and data custodians in three ways: (1) providing a concise overview of the literature on statistical inference methods for horizontally partitioned data; (2) describing the methods applicable to generalized linear models (GLM) and assessing their underlying distributional assumptions; (3) adapting existing methods to make them fully usable in healthcare research. A scoping review methodology was employed for the literature mapping, from which methods presenting a methodological framework for GLM analyses with horizontally partitioned data were identified and assessed from the perspective of applicability in healthcare research. From the review, 41 articles were selected, and six approaches were extracted for conducting standard GLM-based statistical analysis. However, these approaches assumed evenly and identically distributed data across nodes. Consequently, statistical procedures were derived to accommodate uneven node sample sizes and heterogeneous data distributions across nodes.
Workflows and detailed algorithms were developed to highlight information-sharing requirements and operational complexity.

Competing Interest Statement
The authors have declared no competing interest.

Funding Statement
This work was funded by Health Data Research Network Canada, Natural Sciences and Engineering Research Council of Canada, Fonds de recherche du Québec - Nature et Technologie, the Chaire en informatique de la santé de l'Université de Sherbrooke and the Chaire MEIE Québec - Le numérique au service des systèmes de santé apprenants.

Author Declarations
I confirm all relevant ethical guidelines have been followed, and any necessary IRB and/or ethics committee approvals have been obtained. Yes
I confirm that all necessary patient/participant consent has been obtained and the appropriate institutional forms have been archived, and that any patient/participant/sample identifiers included were not known to anyone (e.g., hospital staff, patients or participants themselves) outside the research group so cannot be used to identify individuals. Yes
I understand that all clinical trials and any other prospective interventional studies must be registered with an ICMJE-approved registry, such as ClinicalTrials.gov. I confirm that any such study reported in the manuscript has been registered and the trial registration ID is provided (note: if posting a prospective study registered retrospectively, please provide a statement in the trial ID field explaining why the study was not registered in advance). Yes
I have followed all appropriate research reporting guidelines, such as any relevant EQUATOR Network research reporting checklist(s) and other pertinent material, if applicable. Yes

Abbreviations
CC: Coordinating centre
GLM: Generalized linear model
HPSA: Horizontally Partitioned Statistical Analytics
ICES: Institute for Clinical Evaluative Sciences
LHS: Learning health system
MCHP: Manitoba Centre for Health Policy
MLE: Maximum likelihood estimator
PACS: Picture archiving and communication system