RT Journal Article SR Electronic T1 A unified data infrastructure to support large-scale rare disease research JF medRxiv FD Cold Spring Harbor Laboratory Press SP 2023.12.20.23299950 DO 10.1101/2023.12.20.23299950 A1 Johansson, Lennart F. A1 Laurie, Steve A1 Spalding, Dylan A1 Gibson, Spencer A1 Ruvolo, David A1 Thomas, Coline A1 Piscia, Davide A1 de Andrade, Fernanda A1 Been, Gerieke A1 Bijlsma, Marieke A1 Brunner, Han A1 Cimerman, Sandi A1 Yavari Dizjikan, Farid A1 Ellwanger, Kornelia A1 Fernandez, Marcos A1 Freeberg, Mallory A1 van de Geijn, Gert-Jan A1 Kanninga, Roan A1 Maddi, Vatsalya A1 Mehtarizadeh, Mehdi A1 Neerincx, Pieter A1 Ossowski, Stephan A1 Rath, Ana A1 Roelofs-Prins, Dieuwke A1 Stok-Benjamins, Marloes A1 van der Velde, K. Joeri A1 Veal, Colin A1 van der Vries, Gerben A1 Wadsley, Marc A1 Warren, Gregory A1 Zurek, Birte A1 Keane, Thomas A1 Graessner, Holm A1 Solve-RD consortium A1 Beltran, Sergi A1 Swertz, Morris A. A1 Brookes, Anthony J. YR 2023 UL http://medrxiv.org/content/early/2023/12/20/2023.12.20.23299950.abstract AB The Solve-RD project brings together clinicians, scientists, and patient representatives from 51 institutes spanning 15 countries to collaborate on genetically diagnosing (“solving”) rare diseases (RDs). The project aims to significantly increase the diagnostic success rate by co-analysing data from thousands of RD cases, including phenotypes, pedigrees, exome/genome sequencing and multi-omics data. Here we report on the data infrastructure devised and created to support this co-analysis. This infrastructure enables users to store, find, connect, and analyse data and metadata in a collaborative manner. Pseudonymised phenotypic and raw experimental data are submitted to the RD-Connect Genome-Phenome Analysis Platform and processed through standardised pipelines. 
Resulting files and newly produced omics data are sent to the European Genome-phenome Archive, which adds unique file identifiers and provides long-term storage and controlled access services. MOLGENIS “RD3” and Café Variome “Discovery Nexus” connect data and metadata and offer discovery services, and secure cloud-based “Sandboxes” support multi-party data analysis. This proven infrastructure design provides a blueprint for other projects that need to analyse large amounts of heterogeneous data. Competing Interest Statement: The authors have declared no competing interest. Funding Statement: The Solve-RD project has received funding from the European Union's Horizon 2020 research and innovation programme under grant agreement No 779257. The RD‐Connect Genome‐Phenome Analysis Platform received funding from EU projects RD‐Connect, Solve-RD and EJP-RD (Grant Numbers FP7 305444, H2020 779257, H2020 825575), Instituto de Salud Carlos III (Grant Numbers PT13/0001/0044, PT17/0009/0019; Instituto Nacional de Bioinformatica, INB) and ELIXIR Implementation Studies. 
The UMCG VRE and RD3 received funding from the EU projects Solve-RD, EJP-RD and CINECA Project (H2020 779257, H2020 825575, H2020 825775, respectively) and NWO VIDI grant number 917.164.455. Author Declarations: I confirm all relevant ethical guidelines have been followed, and any necessary IRB and/or ethics committee approvals have been obtained. Yes. The details of the IRB/oversight body that provided approval or exemption for the research described are given below: The Ethics committee of the Eberhard Karl University of Tubingen gave ethical approval for this work. I confirm that all necessary patient/participant consent has been obtained and the appropriate institutional forms have been archived, and that any patient/participant/sample identifiers included were not known to anyone (e.g., hospital staff, patients or participants themselves) outside the research group so cannot be used to identify individuals. Yes. I understand that all clinical trials and any other prospective interventional studies must be registered with an ICMJE-approved registry, such as ClinicalTrials.gov. I confirm that any such study reported in the manuscript has been registered and the trial registration ID is provided (note: if posting a prospective study registered retrospectively, please provide a statement in the trial ID field explaining why the study was not registered in advance). Yes. I have followed all appropriate research reporting guidelines, such as any relevant EQUATOR Network research reporting checklist(s) and other pertinent material, if applicable. Yes. Data will be deposited at EGA. Accession numbers to be provided. Pseudonymised phenotypic information for all individuals and their genetic variants are accessible through the RD-Connect GPAP (https://platform.rd-connect.eu/) upon validated registration. All raw and processed data files will be made available at the EGA (Solve-RD study EGAS00001003851) upon approval by data access committee. 