Data Availability
I use the COVID-19 Open Research Dataset (CORD-19) to calculate the conversion rate of the COVID-19 preprint corpus to peer-reviewed articles. Arguably the most ambitious bibliometric COVID-19 project, CORD-19 is a collaborative effort between the Allen Institute for AI and half a dozen organizations, including the NIH and the White House (for more details, see Wang et al., 2020). This is an open-source dataset. I also use a bioRxiv API pipeline to determine whether COVID-19 preprints are associated with a peer-reviewed final counterpart, and I scrape the NIH's PubMed and PMC websites for the same purpose. Finally, I use the Python wrapper package "arxiv" to query the arXiv API to determine, again, whether given COVID-19 arXiv preprints have also been published in a peer-reviewed journal.
https://www.kaggle.com/allen-institute-for-ai/CORD-19-research-challenge
https://api.biorxiv.org/details/biorxiv/
https://pubmed.ncbi.nlm.nih.gov/
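
The bioRxiv step described above can be sketched roughly as follows. This is a minimal illustration, not the pipeline actually used for the paper. It assumes that the bioRxiv "details" endpoint returns JSON with a "collection" list whose entries carry a "published" field, set to "NA" when no journal version is recorded and to the journal DOI otherwise. The function names (fetch_details, published_doi, conversion_rate) and the DOIs in the sample data are hypothetical and introduced only for this example.

```python
import json
import urllib.request

# Base URL taken from the link list above; the preprint DOI is appended to it.
BIORXIV_DETAILS_URL = "https://api.biorxiv.org/details/biorxiv/"


def fetch_details(doi, timeout=30):
    """Download the 'details' JSON for one preprint DOI (requires network access)."""
    with urllib.request.urlopen(BIORXIV_DETAILS_URL + doi, timeout=timeout) as resp:
        return json.load(resp)


def published_doi(details):
    """Return the DOI of the journal version, or None if none is recorded.

    Assumption: each entry in 'collection' is one preprint version, and its
    'published' field is 'NA' until a peer-reviewed counterpart is linked.
    """
    for version in details.get("collection", []):
        value = str(version.get("published") or "NA").strip()
        if value.upper() != "NA":
            return value
    return None


def conversion_rate(details_by_preprint):
    """Share of preprints that have a recorded peer-reviewed counterpart."""
    if not details_by_preprint:
        return 0.0
    converted = sum(
        1 for details in details_by_preprint.values()
        if published_doi(details) is not None
    )
    return converted / len(details_by_preprint)


# Offline usage with made-up records shaped like the assumed API response.
sample = {
    "10.1101/example.0001": {
        "collection": [{"published": "NA"}, {"published": "10.1000/journal.0001"}]
    },
    "10.1101/example.0002": {"collection": [{"published": "NA"}]},
}
print(conversion_rate(sample))  # 0.5
```

In practice, fetch_details would be called once per bioRxiv DOI found in CORD-19 and the responses collected into a dictionary like sample. The parsing is kept separate from the download so that the rate can be recomputed from cached responses without querying the API again.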