Summary
Background: Genomic surveillance is essential for monitoring the emergence and spread of SARS-CoV-2 variants. SARS-CoV-2 diagnostic testing is the starting point for SARS-CoV-2 genomic sequencing. However, testing rates in many low- and middle-income countries (LMICs) are low (mean = 27 tests/100,000 people/day) and global testing rates are falling in the post-crisis phase of the pandemic, leading to spatiotemporal biases in sample collection. Various public health agencies and academic groups have produced recommendations on sample sizes and sequencing strategies for effective genomic surveillance. However, these recommendations assume very high volumes of diagnostic testing that are currently well beyond reach in most LMICs.
Methods: To investigate how testing rates, sequencing strategies and the degree of spatiotemporal bias in sample collection impact variant detection and monitoring outcomes, we used an individual-based model to simulate COVID-19 epidemics in a prototypical LMIC. Within the model, we simulated a range of testing rates, accounted for likely testing demand and applied various genomic surveillance strategies, including sentinel surveillance.
Findings: Diagnostic testing rates play a substantially larger role in monitoring the prevalence and emergence of new variants than the proportion of samples sequenced. To enable timely detection and monitoring of emerging variants, programs should achieve average testing rates of at least 100 tests/100,000 people/day and sequence 5-10% of test-positive specimens, which may be accomplished through sentinel or other routine surveillance systems. Under realistic assumptions, this averages to ∼10 samples for sequencing/1,000,000 people/week.
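As a rough check of the arithmetic behind the ∼10 samples figure, the sketch below converts a daily testing rate into a weekly sequencing volume per million people. The 2% test positivity is an illustrative assumption and is not a value reported in this summary; the function name is hypothetical and not taken from the PATAT code.

```python
def weekly_sequences_per_million(tests_per_100k_per_day, positivity, sequenced_fraction):
    """Expected number of samples sequenced per 1,000,000 people per week."""
    # 100,000 people -> 1,000,000 people is a factor of 10; day -> week is a factor of 7.
    tests_per_million_per_week = tests_per_100k_per_day * 10 * 7
    positives_per_million_per_week = tests_per_million_per_week * positivity
    return positives_per_million_per_week * sequenced_fraction


# ASSUMPTION: 2% of tests are positive (illustrative only).
ASSUMED_POSITIVITY = 0.02

# Recommended testing rate (100 tests/100,000 people/day), sequencing 5% and 10% of positives.
low = weekly_sequences_per_million(100, ASSUMED_POSITIVITY, 0.05)
high = weekly_sequences_per_million(100, ASSUMED_POSITIVITY, 0.10)
print(f"{low:.0f} to {high:.0f} samples per 1,000,000 people per week")
```

Under this assumed positivity, the recommended testing rate and a 5-10% sequencing proportion give roughly 7 to 14 samples per 1,000,000 people per week, which is consistent with the ∼10 figure. A different positivity would scale the result proportionally.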
Interpretation: For countries where testing capacities are low and sample collection is spatiotemporally biased, surveillance programs should prioritize investments in wider access to diagnostic testing to enable more representative sampling, ahead of simply increasing quantities of sequenced samples.
Funding: European Research Council, the Rockefeller Foundation, and the Governments of Germany, Canada, UK, Australia, Norway, Saudi Arabia, Kuwait, Netherlands and Portugal.
Evidence before this study: Genomic sequencing has been an integral part of the COVID-19 pandemic response, critical to monitoring the evolution of SARS-CoV-2 and identifying novel variants of interest and variants of concern (VOCs). As of March 2022, more than 10 million unique sequences had been submitted to GISAID. However, SARS-CoV-2 sequences have been disproportionately submitted from high-income countries (HICs), with large surveillance gaps existing in most LMICs. To strengthen genomic surveillance of SARS-CoV-2, previous studies focused on estimating a minimal number of positive SARS-CoV-2 tests to reflex for sequencing for effective variant detection and monitoring. We searched PubMed and Google Scholar using combinations of search terms (i.e., “SARS-CoV-2”, “COVID-19”, “diagnostic”, “genomic surveillance”, “sequencing”, “LIC”, “LMIC”) and critically considered published articles and preprints that studied or reviewed SARS-CoV-2 testing and genomic surveillance, especially in the LMIC context. We also reviewed SARS-CoV-2 sequencing recommendations published by the World Health Organization (WHO) and European Centre for Disease Prevention and Control (ECDC). We reviewed all studies and the latest recommendations published in English up to February 2022. We found that prevailing recommendations for estimating sequencing sample size to identify or monitor the prevalence of new variants assume that COVID-19 testing is performed at high rates per capita and in high absolute numbers, such that the sequenced samples are largely representative of the circulating SARS-CoV-2 viral diversity. This is, however, not the case in many countries, particularly in many LMICs, and can vary dramatically depending on the epidemiological situation.
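To make the representativeness assumption concrete, the sketch below uses the standard binomial calculation for variant detection under simple random sampling. It is a textbook illustration and not the individual-based model used in this study; the function names are hypothetical. It holds only if every sequenced sample is an independent, unbiased draw from circulating infections, which is the condition that low and spatiotemporally biased testing violates.

```python
import math


def detection_probability(prevalence, n_sequenced):
    """Probability that at least one of n_sequenced samples is the variant.

    ASSUMPTION: samples are independent, unbiased draws from all infections.
    """
    return 1.0 - (1.0 - prevalence) ** n_sequenced


def min_sequences_for_detection(prevalence, confidence=0.95):
    """Smallest n giving at least `confidence` probability of one or more detections."""
    return math.ceil(math.log(1.0 - confidence) / math.log(1.0 - prevalence))


# Sequences needed to detect a variant at 1% prevalence with 95% confidence.
n_required = min_sequences_for_detection(0.01, 0.95)
print(n_required)
print(round(detection_probability(0.01, n_required), 3))
```

Because the calculation treats the sequenced samples as representative, it says nothing about how many diagnostic tests are needed to obtain such samples in the first place.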
Added value of this study: To our knowledge, this is the first study that quantitatively estimates the joint impact of COVID-19 testing rates and sequencing strategies on SARS-CoV-2 variant detection and monitoring. We developed an individual-based COVID-19 transmission model that was specifically designed to simulate VOC emergence in LMICs under a wide range of test availability and sampling strategies for sequencing. We showed that given the current average COVID-19 testing rate of 27 tests per 100,000 people per day across LMICs, the sequencing sample size recommendations for early variant detection from WHO/ECDC and other academic groups would likely result in delayed detection of a new VOC until it had spread through a substantial portion of the population. We quantitatively demonstrated that increasing COVID-19 testing rates to at least 100 tests per 100,000 people per day, including through sentinel surveillance sites, and sampling as broadly as possible, yields far earlier VOC detection and greater accuracy of variant prevalence estimates than simply increasing the proportion of samples to be sequenced.
Implications of the available evidence: The key features of an effective genomic surveillance program aimed at detecting and monitoring novel SARS-CoV-2 variants are spatiotemporal representativeness of the SARS-CoV-2-positive samples being sequenced, which can be achieved by increasing diagnostic testing rates and widening the geographic coverage of sample collection, and short sequencing turnaround times. Only once these areas have been strengthened does increasing the volume of sequenced samples have a significant impact.
Competing Interest Statement
A.T., E.H., S.C., B.R. and B.E.N. declare that they are employed by FIND, the global alliance for diagnostics.
Funding Statement
A.X.H. and C.A.R. were supported by ERC NaviFlu (No. 818353). C.A.R. was also supported by NIH R01 (5R01AI132362-04) and an NWO Vici Award (09150182010027).
Author Declarations
I confirm all relevant ethical guidelines have been followed, and any necessary IRB and/or ethics committee approvals have been obtained.
Yes
I confirm that all necessary patient/participant consent has been obtained and the appropriate institutional forms have been archived, and that any patient/participant/sample identifiers included were not known to anyone (e.g., hospital staff, patients or participants themselves) outside the research group so cannot be used to identify individuals.
Yes
I understand that all clinical trials and any other prospective interventional studies must be registered with an ICMJE-approved registry, such as ClinicalTrials.gov. I confirm that any such study reported in the manuscript has been registered and the trial registration ID is provided (note: if posting a prospective study registered retrospectively, please provide a statement in the trial ID field explaining why the study was not registered in advance).
Yes
I have followed all appropriate research reporting guidelines and uploaded the relevant EQUATOR Network research reporting checklist(s) and other pertinent material as supplementary files, if applicable.
Yes
Data Availability
All data relevant to the study are included in the Article, the Supplementary Appendix and the GitHub repository (https://github.com/AMC-LAEB/PATAT-sim), which also hosts the PATAT model source code.