Abstract
Objectives Many meta-research studies have investigated rates and predictors of data and code sharing in medicine. However, most of these studies have been narrow in scope and modest in size. We aimed to synthesise the findings of this body of research to provide an accurate picture of how common data and code sharing is, how this frequency has changed over time, and what factors are associated with sharing.
Design Systematic review with meta-analysis of individual participant data (IPD) from meta-research studies.
Data sources Ovid MEDLINE, Ovid Embase, MetaArXiv, medRxiv, and bioRxiv were searched from inception to July 1st, 2021.
Eligibility criteria Studies that investigated data or code sharing across a sample of scientific articles presenting original medical and health research.
Data extraction and synthesis Two authors independently screened records, assessed risk of bias, and extracted summary data from study reports. IPD were requested from authors when not publicly available. Key outcomes of interest were the prevalence of statements that declared data or code were publicly available, or ‘available on request’ (declared availability), and the success rates of retrieving these products (actual availability). The associations between data and code availability and several factors (e.g., journal policy, data type, study design, research subjects) were also examined. A two-stage approach to IPD meta-analysis was performed, with proportions and risk ratios pooled using the Hartung-Knapp-Sidik-Jonkman method for random-effects meta-analysis. Three-level random-effects meta-regressions were also performed to evaluate the influence of publication year on sharing rate.
Results 105 meta-research studies examining 2,121,580 articles across 31 specialties were included in the review. Eligible studies examined a median of 195 primary articles (IQR: 113-475), with a median publication year of 2015 (IQR: 2012-2018). Only eight studies (8%) were classified as low risk of bias. Useable IPD were assembled for 100 studies (2,121,197 articles), of which 94 datasets passed independent reproducibility checks. Meta-analyses revealed declared and actual public data availability rates of 8% (95% CI: 5-11%, 95% PI: 0-30%, k=27, o=700,054) and 2% (95% CI: 1-3%, 95% PI: 0-11%, k=25, o=11,873) respectively since 2016. Meta-regression indicated that only declared data sharing rates have increased significantly over time. For public code sharing, both declared and actual availability rates were estimated to be less than 0.5% since 2016, and neither demonstrated any meaningful increases over time. Only 33% of authors (95% CI: 5-69%, k=3, o=429) were estimated to comply with mandatory data sharing policies of journals.
Conclusion Code sharing remains persistently low across medicine and health research. Declarations of data sharing are also low but, unlike code sharing, are increasing over time. However, they do not always correspond to the actual sharing of data. Mandatory data sharing policies of journals may also not be as effective as expected, and may vary in effectiveness according to data type - a finding that may be informative for policymakers when designing policies and allocating resources to audit compliance.
Data collection, analysis, and curation each play integral roles in the research lifecycle across most scholarly fields, including medicine and health. It is also well recognised that archived research products like raw data and analytic code are valuable commodities to the broader medical research community. Among other things, greater access to raw data, analytic code and other materials that underlie research findings provides researchers with opportunities to strengthen their methods, validate discovered findings, answer questions not originally considered by the data creators, accelerate research through the synthesis of existing datasets, and educate new generations of medical researchers [1]. While there are many valid challenges with sharing research materials (particularly navigating privacy considerations and time and resource burdens), in recognition of the benefits, funders and publishers of medical research have been carefully and continuously increasing the pressure on medical researchers over the last two decades to maximise the availability of such products for other researchers [2-6]. Recent examples include the United States government advising its federal funding agencies to update their public access policies before the end of 2025 to require that all federally funded research publications and supporting data are freely and immediately available [7].
While policy changes have fuelled optimism that data and code sharing rates in medicine will increase, important questions remain around what the culture of sharing is like currently, how it has evolved over time, how successful stakeholder policies are at instigating sharing, and when researchers are observed to share, how often useful data are made available. Many meta-research studies in medicine have aimed to address these questions; however, most have been small in size and narrow in scope, focussing on specific research participants (e.g., human participants [8], animals [9]), data types (e.g., gene expression data [10], modelling data [11]), study designs (e.g., clinical trials [12], systematic reviews [13]), and outcomes (e.g., data and code sharing declarations [14], data ‘FAIRness’ [15]). Therefore, the objectives of this review are to synthesise the findings of this research to establish an accurate picture of how common data and code sharing is in medicine, assess compliance with stakeholder policies on data and code availability, as well as explore what factors are associated with sharing. We anticipate that the findings of this review will highlight several areas for future policymaking and meta-research activities.
METHODS
Protocol and registration
We registered our systematic review on May 28th, 2021 on the Open Science Framework (OSF), prior to commencing the literature search [16], and subsequently prepared a detailed review protocol [17]. We report seven deviations from the protocol in Supplementary Table 1. As the research subjects of interest were scientific publications, ethics approval was not required for this research. The findings of this review are reported in accordance with the Preferred Reporting Items for Systematic reviews and Meta-Analyses (PRISMA) 2020 statement [18] and its IPD extension [19]. We summarise key aspects of the methods below; for further details, please refer to the review protocol [17].
Eligibility criteria
Any study in which researchers investigated the prevalence of, or factors associated with, data or code sharing (termed “meta-research studies”) across a sample of published scientific articles presenting original medical or health-related research findings (termed “primary articles”) was eligible for inclusion in the review. No restrictions were placed on the publication location (e.g., preprint server, peer-reviewed journal) or the format (e.g., conference abstract, research letter) of either group. Nor were restrictions placed on the strategy used to identify and select primary articles, the type of data assessed (e.g., trial data, review data) or the level of sharing assessed (e.g., partial versus complete sharing). Furthermore, we included studies that used either manual or automated methods to assess data and code sharing provided it involved some examination of the body text of sampled primary articles. Exclusion criteria for this review included meta-research studies that investigated data or code sharing: as a routine part of a systematic review and IPD meta-analysis; among scientific articles outside of medicine and health; or via avenues other than journal articles (e.g., clinical trial registries).
Information sources and search strategy
On July 1st, 2021, we searched Ovid MEDLINE, Ovid Embase, and the medRxiv, bioRxiv, and MetaArXiv preprint servers to identify potentially relevant studies indexed from database inception up to the search date. The full search strategies, bibliographic citation files, as well as snapshots of the medRxiv and bioRxiv databases are available on the project’s OSF page [20]. Details on the development of the search strategy are outlined in the review protocol [17]. In addition to the database searches, other preprint servers (PeerJ, Research Square) and relevant online resources (Open Science Framework, aspredicted.org and connectedpapers.com) were searched to locate additional published, unpublished and registered studies of relevance to the review. Backward and forward citation searches of meta-research studies meeting the inclusion criteria were also performed using citationchaser on August 30th, 2022 [21]. Finally, potentially relevant studies recommended by colleagues, discovered through collaborations, and seen at meta-research conferences were also screened for eligibility. No language restrictions were imposed on any of the searches.
Study selection
Results from all main database and preprint server searches were imported into Covidence (Covidence systematic review software, Veritas Health Innovation, Melbourne, Australia) and deduplicated. For the preprint searches, if a version of an eligible meta-research study was discovered in a peer-reviewed journal, it was included in place of the original preprint. All titles, abstracts, and full-text articles were then screened for eligibility in Covidence by DGH and another author (HF, ARF, or KH) independently, with disagreements resolved via discussion between authors, or by a third author if necessary (MJP). All literature identified by the additional preprint and online searches was screened against the eligibility criteria by one author (DGH). When multiple reports on the same dataset were identified, we used data from the most up-to-date report. A spreadsheet containing all screening decisions is available on the project’s OSF page [20].
Data collection
Once a meta-research study was found to be eligible, one member of the team (DGH) determined whether sufficiently unprocessed article-level IPD and article identifiers (e.g., digital object identifiers (DOIs), PubMed identifiers (PMIDs), article titles) for the included primary articles were publicly available. For meta-research studies where complete IPD were not available (i.e., no data or partial data had been shared), the corresponding author was contacted and asked if they would provide the complete or remaining IPD. If meta-research authors responded that they were either unable or unwilling to share, we then asked whether they would calculate the summary statistics necessary for the review. For meta-research authors who were unable or refused to provide summary data for the review, did not respond, or did not provide the promised IPD by the census date of December 31st, 2022, summary data reported in the meta-research papers were independently extracted by two authors (DGH; MJP), with discrepancies resolved through discussion. A list of all the data that were extracted from each meta-research study for the review can be found on the project’s OSF page [20].
Assessments of risk of bias
The risk of bias of included meta-research studies was assessed using a tool designed based on methods used in previous Cochrane Methodology reviews [22, 23]. The tool included four domains: i) sampling bias, ii) selective reporting bias, iii) article selection bias, and iv) the risk of errors in the accuracy of reported estimates (Supplementary Table 2). Each meta-research article was independently assessed by DGH and one other author (KH or ARF), with discrepancies resolved via discussion, or a third author (MJP) if necessary. Where domains were rated as unclear, clarification was sought from meta-research authors. Given the purpose of the tool was to differentiate studies at high risk of bias from those at low risk, a study was only classified as low risk of bias if all criteria were assessed as low risk. We did not assess the likelihood of publication bias affecting the findings of the review (e.g., using a funnel plot), nor did we assess certainty in the body of evidence, as available methods are not well suited for methodology reviews such as ours.
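The overall classification rule described above can be illustrated with a minimal Python sketch. The domain keys and the three rating labels are hypothetical names chosen for illustration; only the "low if all domains are low" rule comes from the text, and the way 'high' and 'unclear' are separated here is an assumption.

```python
# Hypothetical keys for the four domains named in the text.
DOMAINS = ("sampling", "selective_reporting", "article_selection", "accuracy")


def overall_risk_of_bias(judgements: dict) -> str:
    """Return the overall judgement for one meta-research study.

    A study is 'low' only when every domain is rated 'low' (rule from the
    text). Splitting the remainder into 'high' and 'unclear' is an
    assumption made for this sketch.
    """
    ratings = [judgements[domain] for domain in DOMAINS]
    if all(rating == "low" for rating in ratings):
        return "low"
    if any(rating == "high" for rating in ratings):
        return "high"
    return "unclear"
```

Under this rule a single non-low domain is enough to prevent an overall low-risk classification.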
IPD integrity checks and harmonisation
When complete IPD were obtained for a meta-research study, one author (DGH) performed the following integrity checks prior to harmonising the data: i) an evaluation of the completeness of the dataset (e.g., whether any variables or values were missing), ii) a check of the validity of the dataset (e.g., presence of out-of-range values, incorrectly coded values) and iii) a check that the overall sample size and data and/or code sharing rates as stated in the report could be exactly reproduced (note that the checks for an included study led by the first author of this review (Hamilton et al 2022 [15]) were performed by another author (HF)). In instances where any of these checks failed, clarification was sought from the meta-research authors. We also checked for, and removed, duplicate rows in datasets (i.e., checked if the same primary articles were sampled more than once). Additionally, for meta-research studies that sampled primary articles across multiple scientific disciplines, Digital Science’s Dimensions platform (https://app.dimensions.ai) was used to identify which were medical and health-related using their automated 2020 Australia and New Zealand Standard Research Classification (ANZSRC) Fields of Research (FOR) Codes classification service [24]. When primary articles were not indexed in Dimensions, the first author (DGH), who has close to a decade of experience working as an allied health professional, clinical trial coordinator and medical researcher, classified articles as being medical or health-related or not. Furthermore, for meta-research studies with sample sizes less than 500, primary articles not assigned medical FOR codes by the Dimensions platform were manually reviewed and recoded if deemed false negatives.
Once the IPD checks were complete, one author (DGH) then manually extracted and reclassified required data in line with the study’s codebook. When all available IPD had been assembled and harmonised, datasets were then merged and the extent of overlapping primary articles between meta-research studies was assessed for each outcome of interest by checking for duplicate DOIs and PMIDs in R (R Foundation for Statistical Computing, Vienna, Austria, v4.2.1) using the duplicated function. We decided to keep data originating from primary articles that were flagged as having been sampled by more than one meta-research study only for the study with the highest score for the fourth risk of bias domain (i.e., lowest risk of errors in the accuracy of reported estimates), or in the event of a tie, the overall lowest risk of bias judgement, or the most recent publication date. More details on the scoring system developed to resolve overlap can be found on the project’s OSF page. For eligible meta-research studies where summary data were only available from study reports, but primary study identifiers were known, information from overlapping primary articles was removed from the meta-research studies that shared complete IPD. For meta-research studies where both primary study identifiers and article-level data were unavailable, we assessed the likelihood of overlap with other meta-research studies by comparing: i) outcome data collected, ii) primary article date range and iii) sampled journals.
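The overlap rule described above can be sketched as follows. This is an illustration in Python rather than the R code used for the review, and it makes several assumptions: each primary article is represented by a single identifier string (the review matched on both DOIs and PMIDs), and the `Study` fields and their numeric scales are hypothetical stand-ins for the scoring system documented on the project's OSF page.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Study:
    name: str
    accuracy_score: int    # hypothetical scale: higher = lower risk of coding errors
    overall_rob_rank: int  # hypothetical scale: lower = lower overall risk of bias
    pub_year: int


def priority(study: Study) -> tuple:
    """Sort key: accuracy score first, then overall risk of bias, then recency."""
    return (study.accuracy_score, -study.overall_rob_rank, study.pub_year)


def resolve_overlap(rows, studies):
    """Map each article identifier to the one study whose record is kept.

    rows    : iterable of (study_name, article_id) pairs
    studies : dict of study_name -> Study
    """
    kept = {}
    for study_name, article_id in rows:
        key = article_id.strip().lower()  # DOIs are case-insensitive
        current = kept.get(key)
        if current is None or priority(studies[study_name]) > priority(studies[current]):
            kept[key] = study_name
    return kept
```

For an article sampled by several studies, only the record from the highest-priority study is retained; articles sampled once are unaffected.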
Outcomes of interest
The following four pre-specified outcome measures for both research data and code availability were of primary interest to the review:
the prevalence of primary articles where authors declared that their data or code are publicly available (‘declared public availability’);
the prevalence of primary articles in which meta-researchers verified that data or code were indeed publicly available (‘actual public availability’);
the prevalence of primary articles where authors declared their data or code are privately available (i.e., “available on request” statements) (‘declared private availability’), and;
the prevalence of primary articles in which meta-researchers confirmed that study data or code were released in response to a private request (‘actual private availability’).
‘Actual public availability’ represented the results of the most intensive investigation of an availability statement by meta-researchers (e.g., checks that reported URLs were functional, that data could be freely downloaded and opened, that datasets were complete, that reported results could be independently reproduced). We also required data to be immediately available for it to be classified as actually publicly available (i.e., did not accept ‘intention to share’ and ‘under embargo’ statements), and took the strictest definition of actual availability when alternatives were available (i.e., if a study assessed both partial and complete sharing, we took the results of the ‘full’ data availability). Further information on how we defined ‘actual availability’ as well as all our other outcome measures can be found in the review protocol and the study codebook on the project’s OSF page [20].
In addition to the primary outcome measures, we also included eight secondary outcome measures:
the prevalence of formalised sections within primary articles dedicated to addressing data and/or code availability;
the association between the presence of a data availability statement and public sharing of data in primary articles;
the association between the presence of a code availability statement and public sharing of research code in primary articles;
the association between a journal’s policy on data sharing (any ‘mandatory posting’ policy versus other policy) and public sharing of research data in primary articles;
the association between a journal’s policy on data sharing (‘make available on request’ policy versus other non-mandatory policy) and private sharing of research data in primary articles;
the association between study design (clinical trial versus non-trial) and public sharing of data in primary articles;
the association between the subjects of the research (human participants versus non-human participants) and public sharing of data in primary articles, and;
the association between public sharing of research data and the sharing of code in primary articles.
Statistical analysis
A ‘two-stage’ approach to IPD meta-analysis was used, whereby summary statistics were computed from available IPD, abstracted from included study reports, or obtained directly from meta-research authors, then pooled using conventional meta-analysis techniques. We calculated proportions and 95% confidence intervals (CI) for all prevalence outcomes. Where possible, we calculated risk ratios with 95% confidence intervals for all association outcomes. For primary outcome measures, we considered the methodological characteristics of the included studies to determine which were appropriate for aggregation and decided that we would pool studies that met the following criteria: i) did not use non-random sampling methods, ii) did not restrict primary article evaluations to specific journals, preprint servers, funders, institutions, or data types, and iii) reported outcome data on primary articles published after 2016. These criteria were specifically chosen to minimise biasing of estimates (i.e., reduce upward or downward biasing of pooled estimates due to the overrepresentation of studies of journals with mandatory sharing policies, certain study designs, etc), and to provide a modern picture of data and code sharing (i.e., an estimate of sharing since the introduction of the FAIR principles [25]). The same criteria were applied to secondary outcome measures and subgroup analyses unless specified otherwise.
We pooled prevalence estimates by first stabilising the variances of the raw proportions using arcsine square root transformations, then applied random-effects models using the Hartung-Knapp-Sidik-Jonkman method, which has been shown to be preferable to the DerSimonian and Laird method when including a small number of studies, and when including studies with differing sample sizes [26]. The same approach was also used for meta-analyses of risk ratios; however, no transformations were used, and the ‘treatment arm’ continuity correction proposed by Sweeting et al 2004 [27] was applied to studies reporting zero events in a single group (double zero-cell events were excluded from the main analysis). Statistical heterogeneity was assessed via visual inspection of forest plots, the size of the I2 statistics and their 95% confidence intervals, and via 95% prediction intervals (PI) where more than four studies were included. Data deduplication, preparation, analysis and visualisation were performed in R (R Foundation for Statistical Computing, Vienna, Austria, v4.2.1) using the meta (v5.5) [28], metafor (v3.8) [29] and altmeta (v4.1) [30] packages. Risk of bias plots were created using robvis [31]. The Python (v3.10.7) client Dimcli (v0.9.9.1) was used to access the Dimensions Analytics API and retrieve required primary article meta-data (e.g., DOIs, PMIDs, ANZSRC FOR codes). All R and Python scripts are publicly available on the project’s OSF page [20].
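To make the pooling steps concrete, the following is a minimal standard-library Python sketch of an arcsine square root transformation followed by a random-effects model with the Hartung-Knapp-Sidik-Jonkman variance adjustment. It is not the review's code (which used the R packages listed above). The between-study variance is estimated here with the DerSimonian-Laird moment estimator purely for brevity, which is an assumption, as the estimator used in the review is not stated in this section. The function names and the numerical t quantile helper are likewise illustrative.

```python
import math


def t_quantile(p, df, steps=2000):
    """Quantile of Student's t for 0.5 < p < 1, found by bisection on a
    Simpson-rule CDF (adequate for the usual 95% level in this sketch)."""
    const = math.exp(math.lgamma((df + 1) / 2) - math.lgamma(df / 2)) / math.sqrt(df * math.pi)

    def pdf(x):
        return const * (1 + x * x / df) ** (-(df + 1) / 2)

    def cdf(x):
        h = x / steps
        total = pdf(0.0) + pdf(x)
        for i in range(1, steps):
            total += (4 if i % 2 else 2) * pdf(i * h)
        return 0.5 + total * h / 3

    lo, hi = 0.0, 100.0
    for _ in range(60):
        mid = (lo + hi) / 2
        if cdf(mid) < p:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2


def pool_proportions(events, totals, level=0.95):
    """Pool proportions; returns (estimate, ci_low, ci_high, tau2).
    tau2 is on the arcsine-transformed scale."""
    k = len(events)
    if k < 2:
        raise ValueError("at least two studies are required")
    # Arcsine square root transformation; the variance depends only on n.
    y = [math.asin(math.sqrt(x / n)) for x, n in zip(events, totals)]
    v = [1 / (4 * n) for n in totals]
    # DerSimonian-Laird estimate of tau^2 (assumed for this sketch).
    w = [1 / vi for vi in v]
    y_fixed = sum(wi * yi for wi, yi in zip(w, y)) / sum(w)
    q = sum(wi * (yi - y_fixed) ** 2 for wi, yi in zip(w, y))
    c = sum(w) - sum(wi ** 2 for wi in w) / sum(w)
    tau2 = max(0.0, (q - (k - 1)) / c)
    # Random-effects pooled mean.
    w_re = [1 / (vi + tau2) for vi in v]
    mu = sum(wi * yi for wi, yi in zip(w_re, y)) / sum(w_re)
    # Hartung-Knapp-Sidik-Jonkman variance with a t reference distribution.
    var_hk = sum(wi * (yi - mu) ** 2 for wi, yi in zip(w_re, y)) / ((k - 1) * sum(w_re))
    half = t_quantile(0.5 + level / 2, k - 1) * math.sqrt(var_hk)

    def back(z):
        # Back-transform, clipping to the valid range of the transformation.
        return math.sin(min(max(z, 0.0), math.pi / 2)) ** 2

    return back(mu), back(mu - half), back(mu + half), tau2
```

Because the interval uses a t distribution with k-1 degrees of freedom and a variance based on the observed spread of study estimates, it is typically wider than a normal-based interval when few studies are pooled.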
Subgroup and sensitivity analyses
We planned to conduct the following subgroup analyses to investigate whether prevalence estimates of public data sharing differed depending on i) the data type, or whether primary articles: ii) were subject to any mandatory sharing policies by the funders of the research or not, or iii) posted a preprint prior to publication or not. Furthermore, we also investigated the influence of publication year on data and code sharing rates by fitting three-level mixed-effects meta-regression models on arcsine-transformed proportions. A multi-level model was used to account for dependencies between effect estimates due to some studies contributing multiple yearly estimates. Due to substantially differing levels of variation between the pre- and post-2014 periods, to preserve assumptions of homoscedasticity we only modelled changes in sharing rates from 2014 onwards.
We also performed sensitivity analyses to assess changes in pooled estimates when excluding meta-research studies that i) were rated as high or unclear risk of bias, ii) did not provide IPD for the review, iii) were at high risk of overlap with other meta-research studies, iv) did not assess compliance with the FAIR principles, v) did not manually assess primary articles and vi) did not examine COVID-19-related research. Finally, we also examined differences in pooled proportions and risk ratios when using generalised linear mixed models (GLMMs) to aggregate findings, which have been specifically recommended in situations when the probability of the event of interest is rare [32,33]. Such methods also circumvent the need to add arbitrary continuity corrections to zero events, which can produce biased results when most cases are zero events, and group sample sizes are highly imbalanced [27]. For meta-analyses of risk ratios, we report the results of analyses both excluding and including studies with no events in both groups.
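For readers unfamiliar with the continuity correction that the GLMM approach avoids, the sketch below illustrates a 'treatment arm' correction for a single study's risk ratio, on our understanding that each group's correction is proportional to its own sample size and that the two corrections sum to one. The function name is hypothetical, and adding the correction once (rather than twice) to each group's denominator is an assumption of this sketch rather than a detail taken from the review.

```python
import math


def risk_ratio_tacc(events_a, n_a, events_b, n_b):
    """Risk ratio (group A versus group B) and the standard error of its log.

    Returns None for double-zero studies, which the review excluded from its
    main analysis. The correction is applied only when exactly one group has
    zero events.
    """
    if events_a == 0 and events_b == 0:
        return None
    k_a = k_b = 0.0
    if events_a == 0 or events_b == 0:
        # Corrections proportional to each group's own size; they sum to 1.
        k_a = n_a / (n_a + n_b)
        k_b = n_b / (n_a + n_b)
    risk_a = (events_a + k_a) / (n_a + k_a)  # assumption: k added once to n
    risk_b = (events_b + k_b) / (n_b + k_b)
    se_log_rr = math.sqrt(
        1 / (events_a + k_a) - 1 / (n_a + k_a)
        + 1 / (events_b + k_b) - 1 / (n_b + k_b)
    )
    return risk_a / risk_b, se_log_rr
```

When group sizes are balanced the corrections reduce to the familiar 0.5 per group; with imbalanced groups the smaller group receives the smaller correction, which is the motivation for this variant.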
RESULTS
Study selection and IPD retrieval
The search of Ovid MEDLINE, Ovid Embase and the medRxiv, bioRxiv and MetaArXiv preprint servers, once deduplicated, identified 4,952 potentially eligible articles for the review, of which 4,736 were excluded following the screening of titles and abstracts. Of the remaining 216 articles, full-text articles were retrieved for all papers, and 70 were adjudicated as eligible for the review. Furthermore, the additional searches revealed another 44 eligible reports for inclusion, resulting in a total of 114 eligible meta-research studies examining a combined total of 2,254,031 primary articles for the review [8-15,34-142]. Following confirmation of eligibility, we searched for publicly available IPD for the 114 meta-research studies. Of these studies, 70 had already made complete IPD publicly available (61%), 20 studies had posted partial IPD (18%), and 24 had not publicly shared any IPD (21%), with three of the latter articles declaring upfront that IPD could not be shared. Of the 70 complete datasets that were originally posted publicly, 60 (86%) were deposited into data repositories, 36 (51%) had a DOI, 26 (37%) provided a data dictionary, and 14 (20%) applied a license to the data. Most data were archived in Microsoft Excel (N=33, 47%) or CSV (N=25, 36%) formats, with a minority of meta-researchers storing their data in PDFs (N=5, 7%) and Microsoft Word documents (N=3, 4%).
We contacted the authors of all 40 meta-research studies that had not posted complete IPD, did not state in the study report that data could not be released, and had publicly available contact information, and asked them to share article-level IPD for the review. We received 32 responses to our 40 requests (80%), of which 20 meta-researchers (50%) shared the required IPD by the census date. The median time taken to receive IPD was 7 days (range: 0-216 days). For the 20 articles where complete IPD were not assembled, 10 studies had useable IPD and/or summary data. The nine studies that were eligible for the review but could not be included in the quantitative analysis are outlined in Supplementary Table 3. These studies are shown in the relevant forest plots but contribute no usable data to the meta-analyses. Ultimately, 108 reports of 105 meta-research studies collecting information from a total of 2,121,580 primary articles were included in the quantitative analysis [8-15,34-133], with complete IPD available for 90 studies, a combination of partial IPD and summary data for 10 studies, and only summary data available for 5 studies. Refer to Figure 1 for the full PRISMA flow diagram.
Study characteristics
Summary information on the 105 meta-research studies that are included in the quantitative analysis of this review is outlined in Table 1. Eligible meta-research studies examined a median of 195 primary articles (IQR: 113-475; sample size range: 10-1,475,401), with a median publication year of 2015 (IQR: 2012-2018, publication date range: 1781-2022). Meta-research studies assessed data and code sharing across 31 specialties. Most commonly, studies were interdisciplinary, examining several medical fields simultaneously (N=17, 16%), followed by biomedicine and infectious disease (each N=10, 10%), general medicine (N=9, 9%), addiction medicine, clinical psychology, and oncology (each N=5, 5%). Eleven studies (10%) examined COVID-19-related articles. Additionally, most meta-research studies did not set any restrictions concerning data types (N=63) or journals of interest (N=56). However, when data restrictions were imposed, they were most often limited to trial data (N=16), sequence data (N=6), gene expression data and review data (each N=5). When journal restrictions were incorporated, the scope was most often limited to papers published in ‘high impact’ journals (variably defined by authors) (N=18), one or two journals of interest (N=10 and 5 respectively), or multiple journals subjectively deemed relevant to a field (N=7). Of the 105 meta-research studies, 31 and 4 also evaluated compliance with journal data and code sharing policies, respectively. However, none of the meta-research studies examined compliance with policies instituted by medical research funders or institutions.
In total, 95 and 58 meta-research studies, respectively, examined the prevalence of public data and code sharing in primary articles, with five studies examining how compliant publicly shared data were with the FAIR principles. In contrast, 10, 4 and 2 studies, respectively, assessed whether study data, code, or both data and code could be retrieved in response to a private request (i.e., actual private availability). Of these 16 studies, the stated reasons underpinning requests were: to perform a re-analysis (N=6), for a meta-research study (N=5), to populate a registry (N=1), to validate their findings (N=1) and for interest and coursework (N=1), with the remaining two not reporting what reason they gave. Of the 14 meta-research articles that shared the request templates they used, 12 meta-researchers provided primary article authors with an honest account of why they wished to source data and/or code, whereas two used deception.
Risk of bias assessment
The overall and individual results of the risk of bias assessments are reported in Supplementary Figures 1 & 2. Most eligible meta-research studies were judged favourably on the first risk of bias domain (sampling bias), having randomly sampled primary articles from populations of interest, or assessed all eligible articles identified by their literature searches (N=95, 90%). In contrast, a minority of meta-research studies were judged to be at low risk of selective reporting bias (N=45, 42%) and article selection bias (N=24, 23%) (i.e., shared study protocols and information on which primary articles were excluded and why). Similarly, only half of meta-research studies (N=54, 51%) were judged to have used a primary article coding strategy considered to be at low risk of errors. Ultimately, only eight studies (8%) were classified as low risk of bias for all four domains.
IPD integrity checks
In total, 100 meta-researchers’ datasets (90 complete and 10 partial) were obtained for the review. For the 90 complete datasets, sample sizes, as well as data and/or code sharing rates reported in study reports, were reproduced in all but five cases (94%), with irreproducibility attributable to simple typographical errors in the report (N=2), unclear data filtering steps (N=2) and an error in the meta-researchers’ code (N=1). For the ten partial datasets, we were able to independently verify sample sizes and sharing estimates in all but one case, for which we had received an incorrect version of the data.
Of the 105 included meta-research studies examining 2,121,580 primary articles, we were able to retrieve identifying details (i.e., DOIs, PMIDs) for 2,121,197 primary articles (99.98%) from 100 studies (95%). After the removal of non-medical articles and duplicate articles observed within each of the 100 datasets, we were left with 1,849,828 primary articles with which to explore the extent of overlap between eligible studies. Of these 1,849,828 primary articles, we observed that 704,310 (38%) were flagged as having been sampled by more than one included meta-research study (some articles being repeatedly sampled by up to five studies). Notably, articles examined by the three largest studies by Serghiou et al [14], Colavizza et al [43] and Federer et al [50] were implicated in 681,595 of the 704,310 flagged cases (96.77%). Further, for some studies, all sampled primary articles had been completely assessed by other included studies (e.g., Sumner et al [122], Strcic et al [121]), whereas others demonstrated zero overlap (e.g. Rufiange et al [9]) (see Supplementary Figure 3 for further details).
For the five meta-research studies where identifying details for the primary articles were unavailable, only a single study was deemed to be at high risk of overlap [73]. Furthermore, for the nine meta-research studies excluded from the quantitative analysis, 127,985 of the 132,451 observations (97%) would have come from two meta-research studies of articles published in PLOS One, which would have had a high risk of overlap with the included studies by Serghiou et al [14], Colavizza et al [43] and Federer et al [50]. Given the likelihood of high overlap, our inability to include these nine meta-research studies in the quantitative analyses is unlikely to have influenced our results.
Public data and code sharing rates
Combination of eligible studies in a random-effects meta-analysis suggests that 8% of medical articles published since 2016 declare data to be publicly available (95% CI: 5-11%, 95% PI: 0-30%, k = 27 studies, o = 700,054 primary articles, I2 = 96%; Figure 2) and 2% actually share data publicly (95% CI: 1-3%, 95% PI: 0-11%, k = 25, o = 11,873, I2 = 90%; Figure 3). Despite the included meta-research studies following similar methodologies, we do note high statistical heterogeneity for both analyses, with influence analyses showing that the greatest contributors to between-study heterogeneity for declared data sharing were the high precision findings of Uribe et al [125] and Serghiou et al [14], who used automated coding strategies. For actual data sharing, the high estimate by Hamilton et al [15], who assessed partial sharing of data rather than complete, was also a large contributor to between-study heterogeneity.
For public code sharing, declared and actual code sharing rates since 2016 are estimated to be 0.3% (95% CI: 0-1%, 95% PI: 0-8%, k = 26, o = 707,943, I2 = 89%; Figure 4) and 0.1% (95% CI: 0-0.3%, 95% PI: 0-1%, k = 21, o = 3,843, I2 = 52%; Figure 5), respectively. As with declared data sharing rates, declared code sharing estimates were associated with high statistical heterogeneity despite the similar methodologies of the included studies. Again, influence analyses revealed that the high precision estimates from Uribe et al [125] and Serghiou et al [14], together with the high estimate by Kobres et al [78], who evaluated the sharing of model code from Zika virus forecasting and prediction research, were the biggest contributors to between-study heterogeneity.
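To make the reported quantities (pooled proportion, CI, PI and I2) concrete, the second-stage pooling can be sketched as follows. This is a simplified illustration rather than the review's analysis code: it assumes a plain arcsine square-root transformation with variance 1/(4n), a DerSimonian-Laird estimate of the between-study variance, the Hartung-Knapp variance for the confidence interval, and a prediction interval based on a t distribution with k - 2 degrees of freedom. The t quantiles are passed in by the caller, and the example counts are hypothetical.

```python
import math


def pool_proportions(events, totals, t_crit, t_crit_pred=None):
    """Random-effects pooling of proportions on the arcsine scale.

    t_crit: 97.5% t quantile with k - 1 degrees of freedom (for the CI).
    t_crit_pred: optional 97.5% t quantile with k - 2 df (for the PI).
    """
    k = len(events)
    if k < 2:
        raise ValueError("need at least two studies")
    y = [math.asin(math.sqrt(e / n)) for e, n in zip(events, totals)]
    v = [1.0 / (4.0 * n) for n in totals]  # approximate sampling variance

    # Fixed-effect weights and Cochran's Q.
    w = [1.0 / vi for vi in v]
    mu_fe = sum(wi * yi for wi, yi in zip(w, y)) / sum(w)
    q = sum(wi * (yi - mu_fe) ** 2 for wi, yi in zip(w, y))
    df = k - 1

    # DerSimonian-Laird between-study variance (an assumption of this sketch).
    c = sum(w) - sum(wi ** 2 for wi in w) / sum(w)
    tau2 = max(0.0, (q - df) / c)
    i2 = max(0.0, (q - df) / q) if q > 0 else 0.0

    # Random-effects estimate with the Hartung-Knapp variance.
    ws = [1.0 / (vi + tau2) for vi in v]
    mu = sum(wi * yi for wi, yi in zip(ws, y)) / sum(ws)
    var_hk = sum(wi * (yi - mu) ** 2 for wi, yi in zip(ws, y)) / (df * sum(ws))
    half = t_crit * math.sqrt(var_hk)

    def back(x):
        # Clamp to the valid range of the transform before inverting.
        return math.sin(min(max(x, 0.0), math.pi / 2)) ** 2

    result = {"pooled": back(mu), "ci": (back(mu - half), back(mu + half)),
              "tau2": tau2, "i2": i2}
    if t_crit_pred is not None:
        half_pi = t_crit_pred * math.sqrt(tau2 + var_hk)
        result["pi"] = (back(mu - half_pi), back(mu + half_pi))
    return result


# Hypothetical counts for three studies (not taken from the review).
# For k = 3: t(0.975, df=2) = 4.303 and t(0.975, df=1) = 12.706.
res = pool_proportions([4, 30, 12], [200, 400, 300],
                       t_crit=4.303, t_crit_pred=12.706)
```

The prediction interval adds the between-study variance to the uncertainty of the pooled estimate, which is why it can be much wider than the confidence interval when heterogeneity is high.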
Private data and code sharing rates
In contrast to declarations of public availability, ‘available upon request’ declarations were not commonly observed in primary articles published since 2016 for data (2%, 95% CI: 1-4%, 95% PI: 0-10%, k = 23, o = 3,058, I2 = 80%) or code (0%, 95% CI: 0-0.1%, 95% PI: 0-0.5%, k = 22, o = 2,825, I2 = 0%) (refer to Supplementary Figures 4 & 5 for forest plots). For actual private data and code availability rates, we could not combine the findings of eligible meta-research studies due to methodological differences, particularly in journal restrictions (i.e., policy differences), as well as the type of data being requested, both of which are explored via subgroup analyses below.
Overall, we observed that success rates in privately obtaining data and code from authors of published medical research ranged between 0-37% (k = 12, I2 = 88%) and 0-23% (k = 5, I2 = 94%) respectively (Figure 6). However, we note that when authors who declared data and code to be ‘available on request’ were asked for these products by meta-researchers, the upper limits of success increased to 100% (k = 7, I2 = 83%) and 43% (k = 4, I2 = 86%) respectively. In comparison, when requests for data and code were made to authors who did not include a statement concerning availability, success rates dropped to between 0-30% (k=7, I2 = 65%) and 0-12% (k=3, I2 = 89%) respectively. Lastly, and unsurprisingly, we also note that attempts to obtain data from authors explicitly declaring it to be unavailable were associated with a 0% sharing rate (k = 2, I2 = 0%). See Supplementary Figure 6 for the full results. Interestingly, we also noted during the IPD deduplication process that two of four primary article authors who were asked to share data by two independent meta-research teams on two separate occasions responded differently, providing some anecdotal evidence that requestor and requestee characteristics likely also play a role in success.
Secondary outcomes
Insufficient data were available to evaluate the first three secondary outcome measures (i.e., outcomes concerning data and code availability statements), as only a single study recorded information about both the prevalence of statements and journal policies across a random sample of articles [15]. Similarly, very few meta-research studies recorded information on compliance with multiple data sharing policies across random samples of primary articles. This review was therefore also unable to evaluate the fourth and fifth secondary outcome measures (i.e., direct comparison of mandatory and ‘share on request’ policies with non-mandatory data sharing policies).
However, for journals implementing mandatory data sharing policies, we estimate that 65% of primary articles (95% CI: 36-88%, 95% PI: 2-100%, k = 5, o = 28,499, I2 = 99%) declared data to be publicly available and 33% actually shared data (95% CI: 5-69%, k = 3, o = 429, I2 = 93%). In contrast, we estimate the success rate for retrieving data from authors subject to ‘share on request’ policies to be 21% (95% CI: 4-47%, k = 3, o = 166, I2 = 30%). For comparison, declared and actual data sharing rates under ‘encourage’ systems are estimated to be 17% (95% CI: 0-62%, k = 6, o = 1,010, I2 = 98%) and 8% (95% CI: 0-48%, k = 3, o = 284, I2 = 90%) respectively. Similarly, declared and actual sharing rates for articles published in journals with no sharing policy are estimated to be 17% (95% CI: 0-59%, k = 4, o = 686, I2 = 95%) and 4% (95% CI: 0-95%, k = 2, o = 198, I2 = 83%) respectively. Refer to Supplementary Figure 7 for the results of declared and actual public code sharing rates according to journal policies.
We were able to assess the last three secondary outcomes. Our data suggest that triallists are 31% less likely to declare data are publicly available in comparison to non-triallists (RR: 0.69, 95% CI: 0.45-1.07, 95% PI: 0.12-4.13, k = 23, I2 = 0%), although the confidence interval includes no difference. When examining actual data sharing, neither group appears more or less likely to share their data than the other (RR: 0.96, 95% CI: 0.53-1.72, 95% PI: 0.15-5.95, k = 19, I2 = 0%) (see Figure 7). We also estimate that researchers using data derived from human participants are 35% less likely to declare data to be publicly available than researchers working with non-human participants (RR: 0.65, 95% CI: 0.42-0.99, 95% PI: 0.12-3.61, k = 19, I2 = 57%). This decreased likelihood became more pronounced when examining actual data sharing rates (RR: 0.44, 95% CI: 0.24-0.81, 95% PI: 0.05-3.57, k = 16, I2 = 28%) (see Figure 8). Lastly, we estimate that researchers who declare that their data are publicly available are eight times more likely to also declare code to be available (RR: 8.03, 95% CI: 2.86-22.53, 95% PI: 0.33-194.43, k = 12, I2 = 32%). Additionally, researchers who are verified to have made data available are estimated to be 42 times more likely than researchers who withheld data to share code as well (RR: 42.05, 95% CI: 12.15-145.52, 95% PI: 0.94-1879.62, k = 7, I2 = 0%) (Supplementary Figure 8).
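The "less likely" percentages quoted above follow directly from the risk ratios, as the short sketch below shows. The function names are hypothetical, and the interval is a single-study Wald interval on the log scale (assuming a normal approximation with z = 1.96), not the pooled Hartung-Knapp intervals reported in the text; the 2x2 counts are invented for illustration.

```python
import math


def risk_ratio(events_a, total_a, events_b, total_b, z=1.96):
    """Risk ratio of group A versus group B with a Wald CI on the log scale."""
    rr = (events_a / total_a) / (events_b / total_b)
    # Standard error of log(RR) for two independent binomial samples.
    se = math.sqrt(1 / events_a - 1 / total_a + 1 / events_b - 1 / total_b)
    lo = math.exp(math.log(rr) - z * se)
    hi = math.exp(math.log(rr) + z * se)
    return rr, lo, hi


def percent_change(rr):
    """Relative change in likelihood implied by a risk ratio, in percent."""
    return (rr - 1.0) * 100.0


# Risk ratios reported in the text, converted to relative changes.
reported = {rr: round(percent_change(rr)) for rr in (0.69, 0.65, 0.44)}

# Hypothetical 2x2 counts: 10/100 sharers in group A, 20/100 in group B.
rr, lo, hi = risk_ratio(10, 100, 20, 100)
```

Here `reported` maps 0.69, 0.65 and 0.44 to -31, -35 and -56 respectively, i.e. 31%, 35% and 56% less likely.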
Subgroup analyses
Insufficient data were available to evaluate whether prevalence estimates of public data sharing differed depending on whether primary articles were subject to any mandatory sharing policies by the funders of the research or posted as a preprint prior to publication. However, we did observe that both declared and actual public data sharing rates significantly differed according to the data type, with the highest rates of actual data sharing occurring among authors working with sequence data (57%, 95% CI: 12-96%, k = 3, o = 444, I2 = 86%), followed by review data (6%, 95% CI: 0-77%, k = 2, o = 372, I2 = 75%) and then trial data (1%, 95% CI: 0-6%, k = 3, o = 235, I2 = 6%) (Supplementary Figures 9 & 10). Additionally, we observed substantial differences in compliance rates with journal policies depending on the data type (Table 2). For example, estimates from a single study by Page et al [97] showed that actual data sharing rates among systematic review authors decreased from 28% for mandatory sharing policies, to 1% and 0% for encourage and no policy systems, respectively. In contrast, for sequence and gene expression data, decreases in actual sharing rates between mandatory policies (67% and 43%), encourage policies (57% and 43%) and no policy (46% and NA) were much less apparent.
Finally, changes in public data and code sharing rates over time were investigated by fitting three-level mixed-effect meta-regression models to arcsine-transformed data (refer to Supplementary Table 4 for the full results). Publication year was found to be a significant moderator of declared data sharing rates (β=0.017, 95% CI: 0.008-0.025, p=0.0001, between-study I2 = 91%, within-study I2 = 9%) but not actual data sharing rates (β = 0.004, 95% CI: −0.005-0.013, p = 0.3589, between-study I2 = 75%, within-study I2 = 3%). Specifically, we note an estimated rise in declared data sharing rates from 4% in 2014 (95% CI: 2-6%, 95% PI: 0-18%) to 9% in 2020 (95% CI: 6-12%, 95% PI: 0-26%). Refer to Figure 9 for a bubble plot comparing declared data sharing rates and actual sharing rates over time. Comparatively, both declared and actual code sharing rates did not appear to have meaningfully increased over time.
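Because the model is fitted on the arcsine scale, the slope is not a change in percentage points per year. The sketch below shows how a slope on that scale maps back to proportions; it assumes a simple arcsine square-root transformation and a purely linear trend, uses the rounded values quoted above, and ignores the random effects and the uncertainty in the intercept, so it is a rough consistency check rather than a reproduction of the fitted model. The function name is hypothetical.

```python
import math


def project_proportion(p_start, slope_per_year, years):
    """Project a proportion forward along a linear trend on the arcsine scale."""
    y = math.asin(math.sqrt(p_start)) + slope_per_year * years
    y = min(max(y, 0.0), math.pi / 2)  # stay within the transform's range
    return math.sin(y) ** 2


# Start from the estimated 4% in 2014 and apply the reported slope of 0.017
# for the six years to 2020.
projected_2020 = project_proportion(0.04, 0.017, 2020 - 2014)
```

With these rounded inputs the projection comes out at a little under 9%, in line with the 9% estimated for 2020.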
Sensitivity analyses
The results of the sensitivity analyses of the primary outcomes are reported in Table 3. For public data and code sharing outcomes, meta-analysis of prevalence rates using GLMMs did not result in any substantial changes to combined estimates in comparison to the standard inverse-variance aggregation methods. Similarly, limiting analyses to meta-research studies in which authors manually coded articles (i.e., removal of meta-research studies that used automated or unclear coding methods) did not result in any meaningful changes. When limiting analyses to meta-research studies where summary data were only derived from available IPD, no changes were observed to the declared data availability analysis. Insufficient data were available to evaluate whether findings from meta-research studies that assessed compliance with FAIR or were classified as low risk of bias resulted in meaningful changes to pooled estimates. Similarly, with respect to the impact of overlapping primary articles, removing the only meta-research study that was deemed to be at risk of overlapping with other included meta-research studies had no impact on any of the analyses. Lastly, we estimate declared and actual public data sharing rates for studies investigating COVID-19 (including both preprints and peer-reviewed publications) to be 9% (95% CI: 0-57%, k = 3, o = 7,804, I2 = 95%) and 11% (95% CI: 0-76%, k = 3, o = 934, I2 = 84%) respectively, both of which compare favourably to our best estimates for declared (8%) and actual data sharing (2%) since 2016. The findings of the sensitivity analyses of secondary outcomes and subgroup analyses are reported in Supplementary Table 5. Most notably, we observed stronger associations between data and code sharing when including studies with no events in both groups.
DISCUSSION
Principal findings of the review
In this, the first systematic review and IPD meta-analysis of this topic, we used multiple data sources and analytic methods to investigate public and private availability of data and code in the medical and health literature. We also examined several factors associated with sharing. Aggregation of the findings of 27 meta-research studies (which themselves examined 700,054 primary articles) suggests that, on average, only 8% of medical papers published since 2016 declare that their data are publicly available. Additionally, meta-analysis of 25 meta-research studies (examining 11,873 articles) suggests that only 2% of medical papers published since 2016 have verifiably shared their data. In comparison, estimated declared and actual code sharing rates since 2016 were even lower, both being less than 0.5%, with little change over time. The prediction intervals from our analyses are also relatively precise, suggesting that we now have very good estimates of data and code sharing for medical and health research since 2016.
In contrast to public availability rates, the overall success rate of privately obtaining data from authors of published medical research was observed to range between 0-37%. However, the range widened to 0-100% when the scope was limited to requests made to authors who declared their data to be ‘available on request’. For private requests for code, overall success rates ranged between 0-23%, and increased to 0-43% when examining requests that were made to authors who declared code to be available on request. These findings are consistent with similar research conducted in fields outside of medicine [140, 143-147]. Finally, while data were not available to assess compliance with funder and institutional data sharing policies, we did observe varying compliance rates with journal data sharing policies, particularly depending on the data type.
Review findings in context
When examining similar research conducted in other scientific fields, declared data and code sharing rates in medicine appear to be higher than some fields (e.g., Humanities, Earth Sciences and Engineering [14]), but lower than others (e.g., Experimental Biology [144, 148], Hydrology [149]). One common explanation for the differences between these sharing rates is that researchers in fields outside of the medical, health, behavioural and social sciences are more likely to make data available as they do not typically need to navigate privacy protections associated with the collection and sharing of data from human participants [150]. Our results support this notion, finding that medical researchers studying data from human participants were 56% less likely (RR: 0.44) to actually make their data publicly available than those using data derived from non-human participants. This discrepancy has likely become more pronounced over time since the implementation of national and international protection laws imposing strong restrictions on the processing of personal medical data, like the US’ Health Insurance Portability and Accountability Act (HIPAA) and the European Union’s 2018 General Data Protection Regulation (GDPR) [151, 152].
We also note examples of differences in data sharing rates between medical and non-medical researchers working with the same human-derived data types. For example, in the context of human mitochondrial and Y-chromosomal data, Anagnostou and colleagues [153] observed different data sharing rates between medical genetics (64%) and forensic genetics (90%) researchers. Follow up work by the same authorship team reported that the discrepancies between sharing rates may be due to differences in cultures concerning the value of openness and transparency, as opposed to burdens associated with navigating privacy constraints [153].
Potential implications of our findings
Our findings raise some important implications for researchers and policymakers. At a journal policymaking level, our findings suggest that overall, blanket mandatory data sharing policies do not appear to work well in medicine and health [8,146,154]. Other research suggests that suboptimal compliance with mandatory sharing policies may also apply to research funders [155]. Furthermore, we note that such policies may vary in their effectiveness according to the data type. For example, our findings suggest that mandatory sharing policies might be an effective measure in incentivising triallists and systematic reviewers to share data but may be less effective at motivating researchers working with sequence and gene expression data to share, given the high levels of sharing under non-mandatory policies. Consequently, it may be in policymakers’ interest to periodically audit compliance with such policies, possibly triaging audits by data type, and strengthening policing if substantial non-compliance is detected. Enforcement of policies in this setting could range from simple checks of common issues (e.g., that links are present and functional [156]), up to confirming that data can be freely downloaded and are well-annotated, complete and sufficiently unprocessed.
Given average data sharing rates remain low, the medical research community could consider trialling additional incentives to increase the rate and quality of data sharing. For example, some commonly proposed strategies, beyond implementation of policies mandating sharing, include: open science badges, data embargoes, data publications, novel altmetrics, as well as changes to funding schemes to allow applicants to budget for data archival costs, and academic hiring and promotion criteria to reward sharing [87,157,158]. While such strategies have long been suggested by important medical research stakeholders such as the United States National Academy of Medicine [159], as previous research has noted, in medicine, there are more opinion pieces on the lack of incentives for researchers to share data than there have been empirical tests of these incentives. Consequently, the effectiveness of most of these strategies in medicine remains an open area of inquiry [157].
Finally, we also observed that useful data are not only difficult to retrieve from medical researchers, but also difficult to retrieve from meta-researchers who are interested in studying the very topic of data sharing, a phenomenon that we are not the first to lament [160]. As such, the results of our review also extend concerns with regard to suboptimal archival practices beyond the medical research community to the meta-research community as well. We have also uncovered substantial amounts of research duplication among meta-research studies examining open science practices in medicine and health. Consequently, we recommend that less research attention be paid to estimating overall data and code sharing rates in medicine, particularly between 2014 and 2020.
Strengths and limitations of the study
Our review has many methodological advantages over previous research in this area. First, as data and code sharing are relatively rare events, IPD meta-analysis allowed us to bring together many imprecise findings to yield more precise estimates. Furthermore, retrieval of useful IPD from 95% of included studies has allowed us to conduct several data quality checks, identify and remove substantial amounts of redundant assessments, perform subgroup analyses not possible when conducting a meta-analysis of aggregate data, as well as minimise the risk of data availability biases. Second, the meta-analyses of our primary and secondary outcomes included more studies than the average meta-analysis of prevalence and rare events [161,162], reducing the risk of power issues, as well as making our review the largest analysis of ‘actual’ data and code sharing rates to our knowledge. Similarly, we had more than double the recommended number of estimates per covariate for our meta-regression analyses, minimising the risk of issues such as overfitting [163]. Third, the review included robustness checks using generalised linear mixed models, which have been recommended over conventional meta-analyses of arcsine-transformed proportions [164]. The use of GLMMs also allowed for analysis of studies with zero events in both groups, which can alter conclusions in some circumstances [165], and circumvented the need to add arbitrary continuity corrections when meta-analysing risk ratios, which can bias results [27].
Nevertheless, our review was not without limitations. First, we may have missed relevant literature due to challenges in designing the search strategies (e.g., lack of controlled vocabulary, variations in the way meta-research studies described themselves) and limiting searches to predominantly English-language databases. Second, we were unable to include the findings of nine studies due to the inability to source IPD or useable summary data. However, given 97% of primary articles examined by the excluded studies were at high risk of overlap with studies that were included in the analysis, we do not think their omission would have substantially altered our findings. Third, only one author performed IPD checks and harmonisation. Fourth, we assumed that authors will always declare in-text when data and/or code have been made publicly available, which previous studies have shown is not always the case [92]. However, this appears to be an uncommon practice, and therefore was unlikely to have significantly impacted our results. Finally, despite efforts to ensure studies were clinically homogeneous, our meta-analyses of proportions demonstrated high levels of statistical heterogeneity. However, given three-quarters of published meta-analyses of proportions report I2 values greater than 90% [161], the statistic’s usefulness for assessing heterogeneity in this context is debated. Consequently, evidence synthesis researchers have recommended greater priority be placed on visually inspecting forest plots and prediction interval widths instead [161]. Therefore, while we acknowledge these high I2 values, given the consistency of study methods and reported estimates (refer to Figure 5 for an illustrative example), we do not believe these values indicate concerning levels of variability in this context.
Unanswered questions and future research
We note several questions that have not been answered by our review. Our review was unable to comment on compliance rates with mandatory sharing policies introduced by medical research institutions and funders as we could not find relevant meta-research on this question. Furthermore, while we were able to explore how sharing rates differed according to data type, our analyses were restricted to a limited number of studies effectively examining four types of data (trial data, systematic review data, gene expression data and sequence data). Consequently, we were unable to establish precise estimates for compliance rates, nor comment on sharing rates for the myriad of other types of data that medical researchers have discussed with respect to data sharing, such as model data [166], imaging data [167], flow cytometry data [168], spectroscopic data [169], diffraction data [170,171], and qualitative data [172]. Both are areas worthy of future empirical meta-research.
With regard to future research, we hope that the data that we have collected and harmonised for this review will serve as a useful resource to track changes in data and code sharing in medicine beyond 2020, as well as to explore other factors that we were unable to assess (e.g., association between preprinting practices and sharing) or had not considered (e.g., association between career stage and sharing [173]). However, as we have been able to establish precise estimates of public data and code sharing rates, we do not think additional research that examines high-level data and code sharing rates in medicine between 2014 and 2020 is warranted.
CONCLUSION
The results of the current review suggest that while increasing numbers of medical and health researchers are stating that their data are publicly available, such declarations remain uncommon, and not all declarations lead to the stated data. In contrast, code sharing rates remain persistently low across medicine. We also note large variability in success rates in privately obtaining data and code from authors of published medical research. While no data were available to evaluate the effectiveness of funder and institutional policies on data sharing, assessments of journal policies suggest that mandatory sharing policies are more effective than non-mandatory policies, and that their success rates may vary according to the data type, a finding that may be informative for policymakers when designing policies and allocating resources to audit compliance.
Data Availability
Summary level data and the code required to reproduce all the findings of the review are freely available on the Open Science Framework (DOI: 10.17605/OSF.IO/U3YRP) under a Creative Commons Zero v1.0 Universal (CC0 1.0) license. Harmonised versions of IPD that were originally made publicly available can be shared on request, whereas to preserve the rights of data owners, harmonised versions of IPD that were shared privately with the review team will only be released with the permission of the data guarantor of the relevant meta-research study. To request harmonised IPD please follow the instructions on the project's Open Science Framework page (https://osf.io/stnk3).
Disclosures
Ethics approval
Not applicable.
Competing interests
The authors declare that they have no competing interests.
Funding
No funding was received for this study. DGH is a PhD candidate supported by an Australian Commonwealth Government Research Training Program Scholarship. MJP is supported by an Australian Research Council Discovery Early Career Researcher Award (DE200101618).
Acknowledgements
We thank Steve McDonald for his assistance with the literature searches, A/Prof Sue Finch for her advice on the statistical analyses and all the meta-researchers who took the time to prepare data and address clarifications for the review.