Abstract
Purpose To assess the trustworthiness and impact of preprint trial reports during the COVID-19 pandemic.
Data sources WHO COVID-19 database and the L-OVE COVID-19 platform by the Epistemonikos Foundation (up to August 3rd, 2021)
Design We compare the characteristics of COVID-19 trials with and without preprints, estimate time to publication of COVID-19 preprint reports, describe discrepancies in key methods and results between preprint and published trial reports, report the number of retracted preprints and publications, and assess whether including versus excluding preprint reports affects meta-analytic estimates and the certainty of evidence. For the effects of eight therapies on mortality and mechanical ventilation, we performed meta-analyses including preprints and excluding preprints at 1 month, 3 months, and 6 months after the first trial addressing the therapy became available either as a preprint or publication (120 meta-analyses in total).
Results We included 356 trials, 101 of which are only available as preprints, 181 as journal publications, and 74 as preprints first and subsequently published in journals. Half of all preprints remain unpublished at six months and a third at one year. There were few important differences in key methods and results between trial preprints and their subsequent published reports. We identified four retracted trials, three of which were published in peer-reviewed journals. With two exceptions (2/60; 3.3%), point estimates were consistent between meta-analyses including versus excluding preprints as to whether they indicated benefit, no appreciable effect, or harm. There were nine comparisons (9/60; 15%) for which the rating of the certainty of evidence differed when preprints were included versus excluded, for four of these comparisons the certainty of evidence including preprints was higher and for five of these comparisons the certainty of evidence including preprints was lower.
Limitations The generalizability of our results is limited to COVID-19. Preprints that are subsequently published in journals may be the most rigorous and may not represent all trial preprints.
Conclusion We found no compelling evidence that preprints provide less trustworthy results than published papers. We show that preprints remain the only source of findings of many trials for several months, a length of time that is unacceptable in a health emergency. We show that including preprints may affect the results of meta-analyses and the certainty of evidence. We encourage evidence users to consider data from preprints in contexts in which decisions are being made rapidly and evidence is being produced faster than can be peer-reviewed.
Summary Box 1
What is already known on this topic
What is already known on this topic
Clinicians and decision-makers need rapidly available and credible data addressing the comparative effectiveness of treatments and prophylaxis for COVID-19.
Investigators have adopted preprint servers, which allow the rapid dissemination of research findings before publication in peer-reviewed journals.
What this study adds
What this study adds
We found no compelling evidence that preprints provide less trustworthy results than published papers.
We show that including preprints may affect the results of meta-analyses and the certainty of evidence and we encourage evidence users to consider data from preprints in contexts in which decisions are being made rapidly and evidence is being produced faster than can be peer-reviewed.
Background
During the COVID-19 pandemic, the scientific community adopted preprint servers, which allow investigators to disseminate research findings before publication in peer-reviewed journals. Authors of seminal COVID-19 trials, for example, representing massive international collaborations, such as RECOVERY (1-4) and SOLIDARITY (5, 6), reported their results in preprints before subsequent publication in journals.
Growing interest in preprints predates the COVID-19 pandemic (7, 8). Researchers and evidence users have raised concerns that the traditional publication model is slow, peer review may not always improve the quality of manuscripts, journals impede dissemination due to paywalls and high publication fees and encourage publication bias by prioritizing statistically significant or anomalous findings—issues preprints may avoid (9-15). Despite these concerns, and the potential of preprints to address them, because preprints may result in the dissemination of provisional findings that contain important errors that, presumably, published papers do not, the medical community has been cautious regarding their adoption (16, 17).
Authors of systematic reviews, guideline developers, and other decision makers face a trade-off when considering preprints: on the one hand, including preprints could reduce the credibility of evidence syntheses and risk serious errors if important differences appear in published reports; on the other, including preprints may increase the precision of estimates; allow timely dissemination of research, and may minimize the effects of publication bias.
Knowledge of the extent to which preprints may accelerate the dissemination of findings, the frequency and nature of discrepancies between pre-prints and subsequent published reports, frequency of retractions of preprints compared to publications, and the impact preprints on meta-analytic estimates could inform the trade-off that evidence users face. Our study capitalizes on our living systematic reviews and network meta-analyses (SRNMAs) of drug treatments, antiviral antibodies and cellular therapies, and prophylaxis for COVID-19—an initiative launched in July 2020 that provides real-time summaries addressing the comparative effectiveness of treatments and prophylaxis for COVID-19—to report on the characteristics, trustworthiness—that is, complete and consistent reporting of key aspects of the methods and results between preprint and published trial reports—and impact of COVID-19 trial preprint reports (18-20).
Methods
We submitted a protocol for this study for publication on September 7th, 2021. Because the protocol was still under review at the time the study was completed, we withdrew the protocol for publication and present it in Supplement 1.
Search
Our study uses the search strategy of our living SRNMA that includes daily searches in the World Health Organization (WHO) COVID-19 database—a comprehensive multilingual source of global published and preprint literature on COVID-19 (https://search.bvsalud.org/global-literature-on-novel-coronavirus-2019-ncov/). Prior to its merge with the WHO COVID-19 database on 9 October 2020, we searched the US Centers for Disease Control and Prevention (CDC) COVID-19 Research Articles Downloadable Database. A validated machine learning model facilitates efficient identification of randomized trials (21).
Our search is supplemented by ongoing surveillance of living evidence retrieval services, including the Living Overview of the Evidence (L-OVE) COVID-19 platform by the Epistemonikos Foundation (https://app.iloveevidence.com/loves/5e6fdb9669c00e4ac072701d) and the Systematic and Living Map on COVID-19 Evidence by the Norwegian Institute of Public Health (https://www.fhi.no/en/qk/systematic-reviews-hta/map/). Using the above sources, we also monitor for retraction notices and concerns regarding trial integrity. Supplement 2 includes additional details of our search strategy.
Study selection
As part of the living SRNMA, pairs of reviewers, following calibration exercises to ensure sufficient agreement, work independently and in duplicate to screen titles and abstracts of search records and subsequently the full texts of records determined potentially eligible at the title and abstract screening stage. Reviewers also link preprint reports with their subsequent publications based on trial registration numbers, the names of investigators, recruiting centres and countries, dates of recruitment, and other trial characteristics. When links between preprints and subsequent publications are unclear, we contact the author for confirmation. Reviewers resolve discrepancies by discussion or, when necessary, by adjudication with a third-party reviewer.
Eligible preprint and peer reviewed articles report trials that randomize patients with suspected, probable, or confirmed COVID-19 to drug treatments, antiviral antibodies and cellular therapies, placebo, or standard care or trials that randomize healthy participants exposed or unexposed to COVID-19 to prophylactic drugs, standard care, or placebo. We do not apply any restrictions on severity of illness, setting, or language of publication but do exclude trials that report on nutritional interventions, traditional Chinese herbal medicines without standardization in formulations and dosing across batches, and non-drug supportive care interventions.
We did not perform a sample size calculation since we included all eligible trial reports identified through our living SRNMAS up to August 3rd, 2021.
Data collection
As part of the living SRNMA, for each eligible trial, pairs of reviewers, following training and calibration exercises, independently extract trial characteristics, methods, and results using a standardized, pilot tested data extraction form. To assess risk of bias, reviewers, following training and calibration exercises, use a revision of the Cochrane tool for assessing risk of bias in randomized trials (RoB 2.0) (22) (Supplement 3). Reviewers resolve discrepancies by discussion and, when necessary, by adjudication with a third party.
For the current study, pairs of trained and calibrated reviewers, working independently and in duplicate and using a pilot-tested form, collected data on differences in key methods and results between preprint and published trial reports. Key methods included description of the randomization process and allocation concealment, blinding of patients and healthcare providers, extent of and handling of missing outcome data, blinding of outcome assessors and adjudicators, and prespecification of outcomes and analyses. Key results included number of participants analyzed and number of events in each trial arm for dichotomous outcomes and number of participants analyzed, means or medians and measures of variability for continuous outcomes. We focused on the same outcomes as our living SRNMA: mortality, mechanical ventilation, adverse events leading to discontinuation, viral clearance, admission to hospital, viral clearance, hospital length of stay, ICU length of stay, duration of mechanical ventilation, time to symptom resolution or clinical improvement, time to viral clearance, days free from mechanical ventilation, and time to viral clearance. For preprints with more than one version, we extracted data from the first version of the preprint, which is the least likely to have been modified in response to peer review.
Because risk of bias may vary across outcomes, for this analysis we present risk of bias judgements corresponding to the following hierarchy of outcomes for therapy trials: mortality, mechanical ventilation, duration of hospitalization, time to symptom resolution or clinical improvement, and virologic outcomes. For prophylaxis trials, we used the following hierarchy: mortality, laboratory confirmed and suspected COVID-19 infection, and laboratory confirmed COVID-19 infection. These hierarchies represent the relative importance of outcomes based on rankings made by the linked WHO guideline panel (23).
Data synthesis and analysis
Characteristics and risk of bias of trials
We compare the characteristics and risk of bias of trials with preprints, trials with publications, and trials first posted as a preprint and subsequently published by calculating differences in proportions and associated confidence intervals.
Time to publication of trial preprints
We calculated the median time from a trial being posted on a preprint server to its eventual publication in a journal and used Kaplan-Meier curves and log-rank tests to assess whether source of funding, number of centers and participants, early termination for benefit, intensity of care (inpatient versus outpatient), and severity (mild/moderate versus severe/critical COVID-19), statistically significant primary or secondary outcomes (based on cut-offs defined by the authors or, when no cut-offs were defined, based on a cut-off of p<0.05 or confidence intervals not including the null), and risk of bias (trials rated at low versus high risk of bias) are predictive of time to publication of trial preprints.
Differences between preprint and published trial reports
Among trial preprints that were subsequently published in a peer reviewed journal, we described the number and types of discrepancies in key methods and results between preprint and published trial reports. For discrepancies in the reporting of key methods, we report the number and percentage of the changes between preprints and publications that affected risk of bias judgements—changes that we considered to be critical.
Retractions
We compare the number of preprint and published trials that have been retracted.
Comparison of meta-analyses including versus excluding evidence from preprints
For trials that report on interventions that have been addressed by the linked WHO living guideline (23) up to August 3rd 2021 (i.e., corticosteroids, remdesivir, lopinavir-ritonavir, hydroxychloroquine, ivermectin, IL-6 receptor blockers, and convalescent plasma for treatment and hydroxychloroquine for prophylaxis) and the two most commonly reported outcomes (i.e., mortality, mechanical ventilation), we conducted pairwise frequentist random-effects meta-analyses with the restricted maximum likelihood estimator including versus excluding evidence from preprints at one, three, and six months after the first trial addressing the drug of interest was made public, either via preprint or publication. We also conducted an analysis including versus excluding evidence from preprints at August 3rd —the longest timepoint at which we collected data. For hydroxychloroquine for prophylaxis, because mechanical ventilation was not an outcome of interest for prophylaxis trials, we report only on mortality.
To facilitate interpretation, we calculated absolute effects. To calculate absolute effects, for drug treatments, we used mortality data from the CDC and data on ventilation from the International Severe Acute Respiratory and Emerging Infection COVID-19 database (24-26). For prophylaxis, we used the event rate among all participants randomized to standard care or placebo to calculate the baseline risk.
We compared the direction of effect between meta-analyses including versus excluding preprints. We considered the direction of effect different if one point estimate suggested no effect and another suggested a benefit or harm or if one point estimate suggested benefit and another suggested harm. For treatment, we considered an effect to be beneficial if the point estimate indicated a reduction in risk of mortality of 1% or greater or a reduction in risk of mechanical ventilation of 2% or greater for treatment. For prophylaxis, we considered an effect to be beneficial if the point estimate indicated a reduction in risk of mortality of 0.5% or greater. For treatment, we considered an effect to be harmful if the point estimate indicated that there was an increase in risk of mortality of 1% or greater or an increase in risk of mechanical ventilation of 2% or greater. For prophylaxis, we considered an effect to be harmful if the point estimate indicated an increase in risk of mortality of 0.5% or greater for prophylaxis. Otherwise, we inferred that there was no important effect.
The GRADE approach provided the framework for assessing certainty of evidence and informed whether including versus excluding preprint reports led to differences in ratings of the overall certainty of evidence, judgments related to specific GRADE domains, and whether differences in ratings are likely to impact decision making (i.e., evidence rated as high/moderate versus low/very low) (27). A minimally contextualized approach informed ratings of imprecision (28). We considered any effect on mortality and mechanical ventilation to be important. Thresholds of 1% risk difference for mortality and a 2% risk difference for mechanical ventilation informed judgements of minimal or no treatment effect. For prophylaxis and mortality, we used a 0.5% risk difference (28).
We performed all statistical analysis in R (version 4.03, R Foundation for Statistical Computing), using the meta, forestplot, survival, and survminer packages.
Patient/Public Involvement
Patients were involved in outcome selection, interpretation of results, and the generation of parallel recommendations, as part of the parallel SRNMA and guidelines (23).
Results
Trial characteristics
As of August 3rd, we identified 356 eligible trials, 101 of which were only available as preprints, 181 only available as journal publications, and 74 first available as preprints and subsequently published as journal articles. Supplement 4 presents additional details on the results of the search and Table 1 presents overall trial characteristics overall and trial characteristics stratified by publication status.
Typical trials were registered, completed at the time of reporting, addressed drug therapies, enrolled less than 250 participants, reported one or more secondary outcomes that were statistically significant, and were funded by governments and/or institutions. Nearly two thirds of trials were at high risk of bias, primarily due to their open-label design.
Compared to published trials without preprints, trials only available as preprints and trials first available as preprints and then subsequently published were more likely to be registered; trials only available as preprints were more likely to report on interim results, describe drug therapies compared to antiviral antibodies and cellular therapies or prophylaxis, and to have received industry funding; and trials first posted as preprints and subsequently published were more likely to have received government funding.
Predictors of publication and time to publication
During the 1.5 year span of this study, of 175 preprints, 74 (42.3%) were subsequently published in peer-reviewed journals. Table 2 presents the proportion of preprints published up to one year and table 3 presents predictors of publication. A quarter of preprints were published in peer reviewed journals by 3 months after the preprint first became available and half by 6 months. At one year, a third of preprints remained unpublished.
Preprints that received government funding, reported on inpatients, or reported on patients with severe disease were published faster than preprints that did not receive government funding, reported on outpatients, or patients with mild/moderate disease, respectively.
Differences between preprint and published trial reports
Forty-two trials (56.8%) had one or more discrepancies in the reporting of key methods and results between the preprint and the later published trial report. Supplement 5 describes these discrepancies.
Thirty trials (40.5%) had one or more discrepancies in the reporting of key methods--all of which had to do with information either missing from the preprint or the publication. The most common discrepancy in the reporting of key methods was the description of allocation concealment, which occurred in eight trials. Box 1 presents an example. Our judgement of risk of bias for the randomization domain changed from ‘probably high risk of bias’ to ‘low risk of bias’ for four of these eight trials due to additional details reported in the published report.
Box 1: Example of a trial that reported additional information on allocation concealment in the published report
The PANAMO trial, which was initially available as a preprint on SSRN and later published in Lancet Rheumatology, provided additional details in the publication on allocation concealment. The publication describes central randomization with an online tool and the development of the randomization list by a third party—all of which were not reported in the preprint (29, 30).
Preprint
“Patients were randomly assigned in a 1:1 ratio to receive IFX-1, at a dose of 800 mg intravenously, for a maximum of seven doses, plus best supportive care, or best supportive care only. [..] Randomization was performed with an online tool within the eCRF [electronic case report form] and was stratified by study site.”
Publication
“Patients were randomly assigned in a 1:1 ratio to IFX-1 plus best supportive care (the IFX-1 group) or to best supportive care only (the control group). Randomisation was done by investigators centrally with an online tool within the electronic case report form and was stratified by study site. The tool used a randomised variable block length of either 2 or 4. The randomisation list was only available to contract research organisation (Metronomia) staff involved in the production of the randomisation list and set-up of the online randomisation tool.”
Other differences in the reporting of key methods were the publication reporting one or more additional statistics important for meta-analysis (e.g., IQR or SD) that were not previously reported in the preprint (n=6; 8.1%), the preprint reporting on interim results and the publication on completed trial results (n=4; 5.4%), and the publication including a protocol and/or statistical analysis plan as a supplementary that was not previously included with the preprint (n=3; 4.1%). The overall trial rating of risk of bias, however, changed only for one trial based on additional information provided in the published report.
Thirty-one trials (41.9%) had one or more differences in the reporting of key results between preprints and publications. The most common discrepancy in the reporting of key results were changes in outcome data between preprints and publications, which was seen in 20 trials (27.0%)—though most of these discrepancies may be attributed to events accumulating in trials from the time at which the preprint was posted to when the trial was published. Box 2 presents an example.
Box 2: Example of a trial that reported different outcome data between the preprint and publication
The RECOVERY publication on dexamethasone, for example, reported different results for mortality and mechanical ventilation—likely not because of an error in the preprint but because of events accumulating in the trial from when the preprint was posted to when it was published (2, 31).
Despite discrepancies in outcome data being common, the magnitude and precision of effects were similar between preprints and publications. Figure 1 shows differences in results on mortality and mechanical ventilation between preprints and publications. Among all preprints with differences in outcomes, differences in relative effects did not exceed 15%, except for one trial with very few events that included just one additional event in the publication (32, 33). Other differences between preprints and publications in key results included the publication reporting an outcome that was not included in the preprint (n=11; 14.9%).
Retractions
We identified four retracted trials (34-41). Two trials reported on (hydroxy)chloroquine (34, 35)(36, 37), two on favipiravir (36, 37)(38, 39), and two on ivermectin (34, 35)(40, 41). One of the four trials was single-centre (40, 41) and the remainder were multi-centre trials. One of the trials was retracted when the authors noticed an error in their analysis (40, 41) and the remainder were retracted due to concerns about data fabrication or falsification (e.g., inconsistencies between the eligibility criteria and patients included in the trial, discrepancies between when the trial was reported to have been conducted and when patients were recruited, inconsistencies between the dataset and the results reported in the preprint, inconsistencies between the distribution of baseline variables and the described randomization procedure). We compared the number of retractions between preprints and journal publications. One of the retracted trials was posted as a preprint (34, 35) and the remainder were published in peer-reviewed journals.
Evidence synthesis including versus excluding preprint reports
Table 4 presents results of meta-analyses including and excluding data from unpublished preprints for the comparison of placebo or standard care with corticosteroids, remdesivir, lopinavir-ritonavir, hydroxychloroquine, ivermectin, IL-6 receptor blockers, and convalescent plasma and mortality and mechanical ventilation at 1 month, 3 months, and 6 months after the first trial addressing the intervention was made public, either as a preprint or a publication, and at the longest point of follow-up of the trials (August 3rd). In total, we performed and assessed the certainty of evidence of 120 meta-analyses, 60 of which included preprints and 60 of which excluded preprints. Supplement 6 presents forest plots for meta-analyses.
Because of insufficient data, we could not perform meta-analyses for six comparisons without preprints: mortality and mechanical ventilation, at one month, for ivermectin vs. placebo/standard care and IL-6 receptor blockers vs. placebo/standard care and mortality and mechanical ventilation, at three months, for ivermectin vs. placebo/standard care.
Differences in estimates from meta-analyses including versus excluding preprints
Except for two cases, across all meta-analyses including and excluding results from unpublished preprints, the point estimates were consistent as to whether they indicated benefit, no appreciable effect, or harm. The meta-analysis of corticosteroids at one month suggested a reduction in risk of mechanical ventilation when preprints were excluded (43 fewer per 1,000 people [95% CI 59.24 fewer to 22.12 fewer]; moderate certainty) and no appreciable effect when preprints were included (1.2 more per 1,000 people [95% CI 60.3 fewer to 131.1 more]; very low certainty). The meta-analysis of ivermectin at 6 months suggested no appreciable effect on risk of mechanical ventilation when preprints were excluded (2.3 fewer per 1,000 people [95% CI 52.2 fewer to 83.5 more]; low certainty) and a reduction in risk of mechanical ventilation (26.7 fewer per 1,000 people [95% CI 74.2 fewer to 75.4 more]; very low certainty) when preprints were included.
Differences in ratings of the certainty of evidence from meta-analyses including versus excluding preprints
There were nine cases for which the rating of the certainty of evidence was different when preprints were included versus excluded. For four of these nine cases, we rated the certainty of evidence including preprints higher than the evidence excluding preprints due to no serious concerns in imprecision in analyses including preprints. For five of these nine cases, we rated the certainty of evidence excluding preprints higher than the evidence including preprints, due to serious imprecision for four of five cases and due to serious imprecision and serious risk of bias for one of five cases. In six of these cases, differences in ratings of the certainty of evidence may have impacted decision-making (i.e., evidence including preprints is rated as high/moderate whereas evidence excluding preprints is rated as low/very low or vice versa).
Differences in ratings of the GRADE risk of bias domain from meta-analyses including versus excluding preprints
Between meta-analyses including versus excluding preprints, judgements related to the GRADE risk of bias domain differed only for the meta-analysis of remdesivir and mechanical ventilation at 6 months. We rated the meta-analysis excluding preprints to not have any concerns related to risk of bias and downgraded the meta-analysis including preprints due to serious risk of bias.
Differences in ratings of the GRADE imprecision domain from meta-analyses including versus excluding preprints
Between meta-analyses including versus excluding preprints, judgements related to the GRADE imprecision domain differed for 13 meta-analyses. We judged nine meta-analyses excluding preprints to have more serious concerns related to imprecision than their counterparts including preprints (i.e., additional data from preprints narrowed confidence intervals) and we judged four meta-analyses excluding preprints to have less serious concerns related to imprecision that their counterparts including preprints (i.e., additional data from preprints increased statistical heterogeneity and hence imprecision).
Discussion
Main findings
Our study presents a detailed assessment of the trustworthiness and impact of COVID-19 trial preprint reports. We show that preprints remain the only source of findings of many trials for several months. Half of all preprints, for example, remain unpublished at six months and a third at one year—a length of time that may be unacceptable in a health emergency like COVID-19. Preprints can importantly accelerate the time to dissemination of trial findings.
We did not find compelling evidence of important differences between preprint and published reports of trials—though preprint reports of trials that are subsequently published in journals may not be representative of all trial preprints. Further, we found retractions to occur for both preprints and publications, suggesting that publication in a peer-reviewed journal alone does not indicate trustworthiness of a trial report.
We also found that in most cases, meta-analyses including versus excluding evidence from preprints yielded consistent results and the same certainty of evidence. In a minority of circumstances, however, including preprints improved the certainty of evidence. At six months after trial data first became available, when preprints were excluded, we found low certainty evidence that IL-6 receptor blockers may reduce mortality—downgraded due to risk of bias and imprecision. Conversely, we found moderate certainty evidence IL-6 receptor blockers probably reduce mortality when preprints were included— downgraded due to risk of bias. For the case of IL-6 receptor blockers, considering evidence from preprints may have importantly accelerated the time to incorporation of IL-6 receptor blockers as part of standard care (23). In a minority of circumstances, including preprints also reduced the certainty of evidence. At one month, for example, when preprints were excluded, we found moderate certainty evidence that corticosteroids probably reduce mechanical ventilation—downgraded due to serious risk of bias—and very low certainty evidence when we included preprints—downgraded due to serious risk of bias and very serious imprecision. For the case of corticosteroids, including preprints increased heterogeneity (and hence increased imprecision), and may have misinformed evidence users.
Implications
Our findings have implications for evidence users who are concerned with the trustworthiness of preprints and for systematic reviewers and guideline developers deciding whether to consider preprint reports in systematic reviews and guideline recommendations. Our results support the overall trustworthiness of preprints and preprints as a venue through which the dissemination of trial findings may be accelerated. Peer review likely only addresses transparency of the trial reports and interpretation of results because at the time of submission the trial has already been conducted and major methodological decisions (such as whether to collect data on an outcome or whether to blind investigators) cannot be changed. We encourage systematic reviewers and guideline developers to consider trial evidence from preprints, especially in circumstances in which decisions are being made rapidly and evidence is being produced faster than can be peer reviewed and published.
We caution, however, that preprints (and publications) may describe untrustworthy trials with fabricated or falsified data. Evidence users may consider scrutinizing both preprint and published trials for anomalies that may suggest fabrication and falsification (Examples of which are reported in Box 3) (42, 43). While such methods cannot be used to definitively identify untrustworthy trials and require subjective judgements, they may be useful to identify trials that are at high risk of such issues. Evidence users may subsequently investigate such trials further or systematic reviewers may consider sensitivity analyses excluding such trials from meta-analyses. We direct readers to other sources that describe these methods (44).
Box 3: Methods to assess for fabrication and falsification in clinical trials (44)
Judge whether the reported recruitment speed is feasible given local disease patterns, trial eligibility criteria, and capacity of the recruiting centres (45, 46).
Review the trial registration or protocol and assess the consistency between the registration and the manuscript in aspects of the trial that cannot be modified after completion (e.g., blinding status).
Review profiles of investigators and/or institutions involved for a history of research misconduct.
Review baseline patient characteristics and test whether reported distributions are consistent with randomization (42, 43, 47).
Review primary data, when available, for duplicate records or inconsistencies between the data and reported statistics in the trial manuscript.
Review primary data, when available, and assess whether correlations between variables are plausible (45).
Relation to previous work
Our study is the first to present data addressing the contribution of preprints to the body of evidence addressing the comparative effectiveness of COVID-19 therapies and prophylaxis.
Two studies have reported on differences between preprint and published study reports and citations and Altmetric attention metrics (48, 49). One study addressed publication characteristics and dissemination of COVID-19 preprints and the other spin in interpretation of results. Both studies were, however, restricted to only publications up to August and October 2020—which is not representative of the current landscape of COVID-19 research and which does not include the majority of evidence being currently used to guide COVID-19 care, including critical trials addressing the effects of corticosteroids and IL-6 receptor blockers (1, 2). These studies did not compare the effects of including preprints on meta-analytic estimates and the certainty of the body of evidence, which is particularly important because evidence users use the totality of the body of evidence, rather than single studies, to make treatment decisions and recommendations (48).
The COVID-19 pandemic highlighted the need for rapid dissemination of research and incited increased interest in preprint servers, which yielded to an incredible amount of research that was published on preprint servers and which made this study possible. We are not aware of studies addressing the trustworthiness or impact of preprints in other areas—though such research in other areas would also be useful.
Strengths and limitations
The strengths of this study include the comprehensive search for, and inclusion of preprint and published COVID-19 trial reports and rigorous data collection. The generalizability of our results is, however, limited to COVID-19. Journals have expedited the publication of COVID-19 research and have been publishing more prolifically on COVID-19 than in other areas, which may reduce opportunity for revisions between preprints and their subsequent publications and may mean time to and predictors of publication may be different than in other research areas.
Although the WHO COVID-19 database is a comprehensive source of published and preprint literature, it does not include all preprint servers—though preprint servers not covered by our search address other subjects and are unlikely to include COVID-19 trials.
It is likely that preprint reports of trials that are subsequently published in journals represent the most rigorous or transparently reported preprints and that they are not representative of all trial preprints.
To assess preprint trustworthiness, we compared reporting of key aspects of the methods and results between preprint and published trial reports. We acknowledge, however, that published trial reports may still contain errors and that posting trial reports as preprints may allow more errors to be identified prior to final publication.
We report on the number of publications and preprints that were retracted. Preprints, however, may be less likely to be retracted because they may draw less attention and because preprint servers may be less likely than journals to have formal policies addressing research integrity.
Our assessment of the contribution of trial preprint reports to meta-analytic estimates and their effect on the certainty of evidence was undertaken in the context of pairwise meta-analysis and the minimally contextualized approach for assessing the certainty of evidence whereas parallel guideline recommendations have been based on network meta-analyses and the fully contextualized approach (23, 28). There are no compelling reasons that our results will not be generalizable to network meta-analyses. On the other hand, there will be more differences in judgements related to the certainty of evidence in a fully contextualized framework where judgements are more dependent on the magnitude and precision of estimates.
We limited our assessment of the impact of including versus excluding preprint reports on meta-analytic estimates to only interventions that have been addressed by the WHO living guideline at the time of analysis (23). While the effects of including or excluding preprints in meta-analyses may vary across interventions, our analysis addresses the interventions for which there has been sufficient interest and research to instigate guideline recommendations. Interventions informed by fewer trials may be more sensitive to including or excluding evidence from preprints.
Our estimate of the time to publication of preprint reports may be overestimated if some preprint authors do not attempt to subsequently publish in peer-reviewed journals—although evidence shows that most authors of COVID-19 preprints intend to publish their findings (48). Time to publication may also be underestimated if preprints are made public later in the submission process.
Finally, although we describe discrepancies in the reporting of key methods and results, we did not assess differences in the discussion or conclusion sections of trial reports.
Conclusions
We found no compelling evidence that preprint trial reports provide less trustworthy results than published trials. We show that including preprints may affect the results of meta-analyses and the certainty of evidence and encourage evidence users to consider data from preprints in contexts in which decisions are being made rapidly and evidence is being produced faster than can be peer reviewed and published. Skepticism may still be warranted when suspicion arises regarding falsified data (for which we provide criteria).
Acknowledgements
None.
Footnotes
Disclaimers: None.
Funding: This study was supported by the Canadian Institutes of Health Research CIHR-IRSC:057900132 and Coronavirus Rapid Research Funding Opportunity - OV2170359. DZ is funded by a Banting Postdoctoral Fellowship.
Competing interests: None
Data: Data is available at https://osf.io/9adxb/.