Abstract
Background The number of research papers on COVID-19 has grown explosively. We aimed to assess the academic value of preprints by comparing them with peer-reviewed publications, and to synthesize the parameter estimates from the two kinds of literature.
Method We collected papers estimating four key epidemiological parameters of COVID-19 in China: the basic reproduction number (R0), incubation period, infectious period, and case-fatality-rate (CFR). PubMed, Google Scholar, medRxiv, bioRxiv, arXiv, and SSRN were searched up to 20 March, 2020. The distributions of the parameters and the timeliness of preprints and peer-reviewed papers were compared. Further, the four parameters were synthesized by bootstrap, and their validity was verified with a susceptible-exposed-infectious-recovered-dead-cumulative (SEIRDC) model in the context of China.
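The bootstrap synthesis described above can be illustrated with a minimal sketch. It assumes a plain percentile bootstrap of the mean over per-paper point estimates; the abstract does not specify the resampling scheme, so this is not necessarily the authors' procedure. The function name `bootstrap_mean_ci` and the input values are hypothetical and are not data from the study.

```python
import random
import statistics


def bootstrap_mean_ci(estimates, n_resamples=2000, alpha=0.05, seed=0):
    """Return (mean, lower, upper) using a percentile bootstrap of the mean.

    Assumption: each paper contributes one point estimate with equal weight.
    """
    rng = random.Random(seed)  # fixed seed so the sketch is reproducible
    n = len(estimates)
    resampled_means = []
    for _ in range(n_resamples):
        # Draw n estimates with replacement and record their mean.
        resample = [estimates[rng.randrange(n)] for _ in range(n)]
        resampled_means.append(statistics.fmean(resample))
    resampled_means.sort()
    lower = resampled_means[int((alpha / 2) * n_resamples)]
    upper = resampled_means[int((1 - alpha / 2) * n_resamples) - 1]
    return statistics.fmean(estimates), lower, upper


# Hypothetical R0 point estimates for illustration only (NOT from the study).
r0_estimates = [2.2, 2.6, 3.1, 3.3, 2.9, 4.0, 3.6, 2.5, 3.8, 3.0]
point, lower, upper = bootstrap_mean_ci(r0_estimates)
print(f"R0 = {point:.2f} (95% CI {lower:.2f}-{upper:.2f})")
```

Pooling the estimates from both literature groups before resampling is one simple way to synthesize over the whole parameter space; weighting by sample size or study quality would be an alternative design.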
Findings 106 papers were included for analysis. The distributions of the four parameters in the two literature groups were close, although the preprints were timelier. The estimates of all four parameters changed over time. Synthesized estimates of R0 (3·18, 95% CI 2·85-3·53), incubation period (5·44 days, 95% CI 4·98-5·99), infectious period (6·25 days, 95% CI 5·09-7·51), and CFR (4·51%, 95% CI 3·41%-6·29%) were obtained from the whole parameter space, all with p<0·05. Their validity was evaluated with the cumulative cases simulated by the SEIRDC model, which matched the onset cases in China well.
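To show how the synthesized estimates could drive a compartmental simulation, the following is a minimal sketch of an SEIRDC-type model integrated with a simple Euler scheme. The abstract does not give the model equations, so the structure here is an assumption: transmission rate beta = R0 / infectious period, onset rate 1 / incubation period, removal rate 1 / infectious period, a fraction CFR of removals dying, and C counting cumulative onsets. The population size, initial conditions, and time horizon are hypothetical, and no interventions are modeled, so the output is not expected to reproduce the case curve in China.

```python
def simulate_seirdc(r0, incubation_days, infectious_days, cfr,
                    population, initial_infectious, days, dt=0.1):
    """Euler integration of an assumed SEIRDC model; returns daily states."""
    beta = r0 / infectious_days      # transmission rate (assumed form)
    sigma = 1.0 / incubation_days    # E -> I rate
    gamma = 1.0 / infectious_days    # I -> removed rate
    s = float(population - initial_infectious)
    e, i = 0.0, float(initial_infectious)
    r, d = 0.0, 0.0
    c = float(initial_infectious)    # cumulative onset cases
    steps_per_day = round(1 / dt)
    history = [(s, e, i, r, d, c)]
    for _day in range(days):
        for _step in range(steps_per_day):
            new_exposed = beta * s * i / population * dt
            new_onsets = sigma * e * dt
            removals = gamma * i * dt
            s -= new_exposed
            e += new_exposed - new_onsets
            i += new_onsets - removals
            r += (1.0 - cfr) * removals   # survivors
            d += cfr * removals           # deaths
            c += new_onsets
        history.append((s, e, i, r, d, c))
    return history


# Synthesized point estimates quoted in the Findings; everything else is hypothetical.
history = simulate_seirdc(r0=3.18, incubation_days=5.44, infectious_days=6.25,
                          cfr=0.0451, population=10_000_000,
                          initial_infectious=10, days=60)
final = history[-1]
print(f"Cumulative onset cases after 60 days: {final[5]:.0f}")
```

Tracking C as a separate state makes the simulated curve directly comparable with cumulative case counts by onset date, which is the comparison the Findings describe.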
Interpretation Preprints can sensitively reflect changes in the epidemic situation, and their academic value should not be neglected. Synthesized results from the literature can reduce uncertainty and be used for epidemic decision making.
Funding The National Natural Science Foundation of China and Beijing Municipal Natural Science Foundation.
Evidence before this study Since the outbreak began, the number of scientific articles about COVID-19 has surged, a significant portion of them non-peer-reviewed preprints. Although preprints attracted great attention, their credibility was widely debated. We searched PubMed and Google on March 20, 2020, for publications that discussed preprints during the COVID-19 pandemic, using the terms (“preprints” AND “COVID-19”). We identified 12 papers and news articles, and found that scientists were skeptical of preprints mainly because rigorous peer review is absent and thus the conclusions of preprints may not be reliable. However, scientists’ opinions could have been biased by limited data, and little is known about the validity of the results reported in preprints. Further, to examine how scientists use the results of preprints, taking epidemiological parameter estimation as an example, we searched for reviews on Google using the terms (“epidemiology” AND (“meta-analysis” OR “reviews”) AND “COVID-19”) on May 23, 2020. Nine papers were identified. We found that existing meta-analyses and reviews included few preprints. This may be because the quality of preprints was not recognized, and thus their academic value was underestimated. Overall, the validity of the results reported in preprints should be further examined, and the potential of synthesizing preprints with formally published papers should be explored.
Added value of this study Our study adds value in four main ways. First, we collected preprints and peer-reviewed papers estimating the four most important epidemiological parameters (the basic reproduction number, incubation period, infectious period, and case-fatality-rate) for the COVID-19 outbreak in China. 106 papers were included and the available data were extracted. Second, we quantitatively compared the differences and timeliness between preprints and peer-reviewed publications in the estimation of the four parameters, and found that the validity of the preprints’ estimates was largely consistent with that of the peer-reviewed group. Third, we synthesized the estimates of the two literature groups using the bootstrap method, and found that the values of the infectious period and case-fatality-rate decreased over time, indicating that the synthesized results reflected the changing trend of COVID-19 in China in a timely manner. Finally, the practicability of the synthesized parameter estimates was verified against the data on confirmed cases in China. The cumulative infection curve simulated using the synthesized parameters fitted the real data well.
Implications of all the available evidence The results of our study indicate that the validity of the COVID-19 parameter estimates in preprints is on par with that of peer-reviewed publications, and that preprints are relatively timelier. Further, the synthesized parameters of the two literature groups can effectively reduce uncertainty and capture the patterns of the epidemic. These results provide data-driven insights into the academic value of preprints, which has arguably been underestimated. The scientific community should actively capitalize on the collective wisdom generated by the large number of preprints, particularly during emerging infectious disease outbreaks such as COVID-19.
Competing Interest Statement
The authors have declared no competing interest.
Funding Statement
This study was funded by the National Natural Science Foundation of China (Nos. 72042018, 91546112, 71621002) and the Beijing Municipal Natural Science Foundation (No. L192012).
Author Declarations
I confirm all relevant ethical guidelines have been followed, and any necessary IRB and/or ethics committee approvals have been obtained.
Yes
The details of the IRB/oversight body that provided approval or exemption for the research described are given below:
State Key Laboratory of Management and Control for Complex Systems, Institute of Automation, Chinese Academy of Sciences
All necessary patient/participant consent has been obtained and the appropriate institutional forms have been archived.
Yes
I understand that all clinical trials and any other prospective interventional studies must be registered with an ICMJE-approved registry, such as ClinicalTrials.gov. I confirm that any such study reported in the manuscript has been registered and the trial registration ID is provided (note: if posting a prospective study registered retrospectively, please provide a statement in the trial ID field explaining why the study was not registered in advance).
Yes
I have followed all appropriate research reporting guidelines and uploaded the relevant EQUATOR Network research reporting checklist(s) and other pertinent material as supplementary files, if applicable.
Yes