SARS-CoV-2 Community Transmission During Shelter-in-Place in San Francisco
==========================================================================

* Gabriel Chamie
* Carina Marquez
* Emily Crawford
* James Peng
* Maya Petersen
* Daniel Schwab
* Joshua Schwab
* Jackie Martinez
* Diane Jones
* Douglas Black
* Monica Gandhi
* Andrew D. Kerkhoff
* Vivek Jain
* Francesco Sergi
* Jon Jacobo
* Susana Rojas
* Valerie Tulier-Laiwa
* Tracy Gallardo-Brown
* Ayesha Appa
* Charles Chiu
* Mary Rodgers
* John Hackett, Jr.
* CLIAhub Consortium
* Amy Kistler
* Samantha Hao
* Jack Kamm
* David Dynerman
* Joshua Batson
* Bryan Greenhouse
* Joe DeRisi
* Diane V. Havlir

## ABSTRACT

**Background** We characterized SARS-CoV-2 infections in a densely-populated, majority Latinx San Francisco community six-weeks into the city’s shelter-in-place order.

**Methods** We offered SARS-CoV-2 reverse transcription-PCR and antibody (Abbott ARCHITECT IgG) testing, regardless of symptoms, to all residents (>4 years) and workers in a San Francisco census tract (population: 5,174) at outdoor, community-mobilized events over four days. We estimated SARS-CoV-2 point prevalence (PCR-positive) and cumulative incidence (antibody or PCR-positive) in the census tract and evaluated risk factors for recent (PCR-positive/antibody-negative) versus prior infection (antibody-positive/PCR-negative). SARS-CoV-2 genome recovery and phylogenetics were used to measure viral strain diversity, establish viral lineages present, and estimate number of introductions.

**Results** We tested 3,953 persons: 40% Latinx; 41% White; 9% Asian/Pacific Islander; and 2% Black. Overall, 2.1% (83/3,871) tested PCR-positive: 95% were Latinx and 52% asymptomatic when tested. 1.7% of residents and 6.0% of workers (non-census tract residents) were PCR-positive. Among 2,598 census tract residents, estimated point prevalence of PCR-positives was 2.3% (95%CI: 1.2-3.8%): 3.9% (95%CI: 2.0-6.4%) among Latinx vs. 0.2% (95%CI: 0.0-0.4%) among non-Latinx persons. Estimated cumulative incidence among residents was 6.1% (95%CI: 4.0-8.6%). Prior infections were 67% Latinx, 16% White, and 17% other ethnicities. Among recent infections, 96% were Latinx. Risk factors for recent infection were Latinx ethnicity, inability to shelter-in-place and maintain income, frontline service work, unemployment, and household income <$50,000/year. Five SARS-CoV-2 phylogenetic lineages were detected.

**Conclusion** SARS-CoV-2 infections from diverse lineages continued circulating among low-income, Latinx persons unable to work from home and maintain income during San Francisco’s shelter-in-place ordinance.

## Introduction

In early 2020, multiple introductions of SARS-CoV-2 into the United States laid the foundation for the ongoing epidemic that has claimed over 100,000 U.S. lives in less than 6 months.1 Some of the earliest clinical cases of COVID-19 were recognized in California,2 and the state led the nation in issuing a state-wide shelter-in-place mandate on March 19.3 San Francisco declared a local emergency on February 25 and issued a series of increasingly restrictive mandates on sizes of gatherings culminating in a shelter-in place order on March 16. Although peak hospitalization and death rates in San Francisco over the ensuing month were nearly 10-fold lower than hard-hit cities such as New York,4 the pattern of disproportionately higher hospitalizations among communities of color was similar.5 In San Francisco, 45% of reported COVID-19 cases are among Latinx people, who represent 15% of the city’s population.6

Hospitalizations and deaths represent a small fraction of the total SARS-CoV-2 infections in a community.7 Estimates of the burden of community SARS-CoV-2 infections from direct measurements have been difficult to obtain and compare because symptomatic testing programs capture only a proportion of cases,8 the recognized symptoms associated with COVID-19 expanded over time,9 the assays used to identify infection have variable performance characteristics, and easily accessible testing programs are not in place for some of the most highly affected communities. Data on community transmission and ethnic disparities in SARS-CoV-2 infection, as opposed to COVID-19 disease10, as well as systematic efforts to determine factors driving these disparities remain limited.11

To characterize ongoing community transmission during a citywide shelter-in-place mandate, we offered population-based, universal testing for SARS-CoV-2 infection to all residents of a densely-populated census tract within a majority Latinx community in San Francisco.

## Methods

Unidos en Salud is a longitudinal study to characterize SARS-CoV-2 epidemiology and assess impact of public health measures within a US census tract in San Francisco. Six-weeks into the city’s shelter-in-place ordinance, we offered SARS-CoV-2 reverse transcription-PCR (RT-PCR) and antibody testing, regardless of symptoms, to residents (>4 years) and people who work but may not reside in the census tract.

### Study Setting and Community Mobilization

U.S. census tract 022901 is a population-dense, 16-square-block (0.1 square-mile) area in San Francisco’s Mission District, with 5,174 residents of whom 58% are Latinx, 34% White/Caucasian, 5% Asian/Pacific Islander, and 1% Black/African American. Median per capita income was $40,420/year in 2018, with 34% of households earning <$50,000/year and 20% earning >$200,000/year.12 In partnership with the Latino Task Force for COVID-19, an umbrella organization coordinating local Latinx community-based organizations, we distributed flyers, mobilized the community on local and social media, and offered online and door-to-door pre-registration for testing appointments during the week prior to the testing campaign.

### Testing Campaign

From April 25-28, 2020, we offered outdoor testing at public parks and schools to those who provided an address in the tract or worked in the tract. On April 28, we expanded eligibility to residents of neighboring city blocks, responding to high community demand. One week later, we offered testing to home-bound residents who could not reach campaign sites. During pre-registration, we conducted a brief survey. At the time of testing, we obtained verbal consent for participation and conducted COVID-19 symptom screening. Medical staff performed a fingerstick blood collection (500µL) for antibody testing and an oropharyngeal/mid-turbinate nasal swab for quantitative RT-PCR. Participants could opt out of either test. We contacted all PCR-positive persons to disclose results and perform a clinical assessment. We provided household support via a community-led team for PCR-positive participants and evaluated symptoms among all PCR-positive participants over 2 weeks following testing.

### Laboratory assays

Swabs were collected in DNA/RNA Shield (Zymo Research) to inactivate virus and preserve RNA stability. RT-PCR of viral N and E genes and human RNAse P gene was performed on extracted RNA at a CLIA-certified laboratory operated by UCSF and the Chan Zuckerberg Biohub using a Laboratory Developed Test with a limit of detection of log10 4.5 viral genome copies/mL. SARS-CoV-2-positive RNA samples were subjected to Primal-Seq Nextera XT version 2.0,13 using the ARTIC Network V3 primers,14 followed by paired-end 2 × 150bp sequencing on an Illumina NovaSeq platform. For antibody testing, the ARCHITECT SARS-CoV-2 IgG Emergency Use Authorization (EUA) assay (Abbott Laboratories, Abbott Park, IL, USA)15 was performed on participants’ plasma from the fingerstick collections, which is a research use of the test.

### Study Outcomes

Outcomes included the estimated point prevalence of all PCR-positive infections, recent infections (PCR-positive/antibody-negative) and prior infections (antibody-positive/PCR-negative). Cumulative incidence of infection was defined as any PCR or antibody-positive result. Phylogenetics were used to measure strain diversity.

### Statistical analyses

Proportions were compared using chi-squared tests and medians compared using Wilcoxon rank-sum tests. Cumulative incidence of infection was adjusted for RT-PCR and antibody test characteristics; 95% confidence intervals incorporating uncertainty in test characteristics were based on bootstrap. For census tract residents, we further adjusted for differences in age, sex, and race/ethnicity of participants compared to 2018 census estimates (**Supplementary Methods**). We used multivariate logistic regression for the dependent outcome of PCR-positivity among participants tested. We did not adjust confidence intervals for multiple testing.

### Bioinformatics and genomic analyses

A phylogenetic tree was constructed containing all 123 SARS-CoV-2 genomes from San Francisco County on GISAID16 on May 22, 2020 together with the high-quality consensus genomes assembled from this study, using the nextstrain toolkit.17 Global clade identification and naming follows the Nextstrain proposal,18 and significance of population structure was computed by a permutation test was used for Hudson’s FST (**Supplementary Methods**).19

### Ethics Statement & Funding

The UCSF Committee on Human Research determined that the study met criteria for public health surveillance. The study was supported by the Chan Zuckerberg Biohub, UCSF, and a Program for Breakthrough Biomedical Research award. ARCHITECT SARS-CoV-2 test kits were provided by Abbott Laboratories.

## Results

### Testing Uptake and Coverage

From April 25-28, 2020, we tested 3,913 people. From May 6-7, we tested an additional 40 home-bound residents, for a total of 3,953 persons tested. Of all persons tested, 53% were male, and 40% identified as Latinx, 41% White, 9% Asian/Pacific Islander, 2% Black, and 7% other/mixed ethnicity (**Table 1**). Estimated census tract testing coverage of adult residents (age >20 years) was 60%.

View this table:
[Table 1.](http://medrxiv.org/content/early/2020/06/17/2020.06.15.20132233/T1)

Table 1. 
Characteristics of persons participating in a population-based SARS-CoV-2 testing campaign in the study census tract (US Census tract 022901).

### SARS-CoV-2 Infection by PCR testing

Among 3,871 tested by SARS-CoV-2 RT-PCR, 2.1% (83 people) tested PCR-positive: 1.7% (43/2,598; 95%CI: 1.2%-2.2%) of census tract residents, 6.0% (27/450; 95%CI: 4.0%-8.6%) of tract workers and 1.6% (13/823; 95%CI: 0.8%-2.7%) of residents of neighboring city blocks. Among all persons tested, 237 (6.1%) reported symptoms compatible with COVID-19 of whom 31 (13.1%) tested PCR-positive.

Among PCR-positive persons, 95% identified as Latinx, median age was 38 years (interquartile range [IQR]: 28-50 years), and 76% were male. Persons testing PCR-positive were significantly more likely than persons testing PCR-negative to identify as Latinx, report inability to shelter-in-place and maintain income, work frontline-service jobs or be unemployed, and live in households with income <$50,000/year and >3 persons/household (**Table 2**). Given that 95% of PCR-positive persons were Latinx, we limited our multivariate model to Latinx participants to evaluate risk factors for PCR-positivity within this group, and found significantly higher odds of PCR-positive infection if male (OR: 2.0, 95% CI: 1.1-3.6, p=0.02), working a frontline service job (OR: 2.6, 95% CI: 1.4-5.1, p=0.004, ref: non-frontline), household income <$50,000/year (OR: 8.9, 95% CI: 1.9-158, p=0.03, ref: >$100,000/year), or reporting a COVID-19 contact (OR: 3.6, 95%CI: 2.0-6.3, p<0.001).

View this table:
[Table 2.](http://medrxiv.org/content/early/2020/06/17/2020.06.15.20132233/T2)

Table 2. 
Characteristics of SARS-CoV-2 testing campaign participants who tested PCR-positive compared to PCR-negative, and factors associated with increased odds of PCR-positivity.

Estimated point prevalence of PCR-positive infection in the census tract after adjusting for age and sex of participants in the testing campaign versus census demographics was 2.3% (95% CI: 1.2%-3.8%): 3.9% (95%CI: 2.0%-6.4%) among Latinx vs. 0.2% (95%CI: 0.0%-0.4%) among non-Latinx tract residents. Among Latinx people who worked in the census tract, unadjusted point prevalence of PCR-positive infection was 10.4% (95% CI: 7.0%-14.8%), compared to 0.0% (95% CI: 0.0%-2.0%) among non-Latinx workers.

### Clinical characteristics of PCR-positive persons

Among 83 PCR-positive persons, 43 (52%) were asymptomatic at the time of testing. Two-week follow-up was obtained for 41 participants who were asymptomatic at the time of testing: 8/41 (20%) recalled mild symptoms that had resolved by the time of testing, 10 (24%) developed symptoms after testing (pre-symptomatic) and 23 (56%) remained asymptomatic. Based on reclassified symptom status among PCR-positive people, 39/80 (49%) were symptomatic at time of testing, 8 (10%) were previously symptomatic, 10 (12.5%) were pre-symptomatic, and 23 (29%) remained asymptomatic throughout infection. One PCR-positive person (1.3%) required hospitalization.

### SARS-CoV-2 Cumulative Incidence and Recent vs. Prior Infections

Among 3,861 participants tested for SARS-CoV-2 antibodies, 3.4% (131) tested Ab-positive: 3.1% (80/2,545; 95%CI: 2.5%-3.9%) among census tract residents compared to 7.7% (34/442; 95%CI: 5.4%-10.6%) among tract workers and 2.1% (17/829; 95%CI: 1.2%-3.3%) among adjacent city block residents. Estimated cumulative incidence (Ab or PCR-positive) among tract residents was 4.4% (95% CI: 3.2%-5.6%) after adjusting for test characteristics and 6.1% (95%CI: 4.0%-8.6%) after further adjusting for participation (**Table S1**).

Among all infections detected by PCR or Ab, 26% (48/182) were recent infection. 53% (96/182) were prior infection. Of the remaining infections, 18% (32/182) were PCR-positive/Ab-positive, and 3% (6/182) had PCR or antibody testing alone (**Table S2**). Compared to individuals with prior infection, people with recent infection were significantly more likely to be of Latinx ethnicity, report inability to shelter-in-place and maintain income, work frontline service jobs or be unemployed, and live in households with income <$50,000/year (**Table 3**). Moreover, 47% vs. 89% of prior vs. recent infections, respectively, occurred in households with income <$50,000/year (**Table S2**).

View this table:
[Table 3.](http://medrxiv.org/content/early/2020/06/17/2020.06.15.20132233/T3)

Table 3. 
Comparison of prior (Ab+/PCR-) vs. recent infection (Ab-/PCR+) infection among persons participating in a population-based SARS-CoV-2 testing campaign.

### SARS-CoV-2 RNA levels and phylogenetic analysis

Median levels of virus as estimated by RT-PCR cycle thresholds were significantly higher among PCR-positive/Ab-negative persons compared to PCR-positive/antibody-positive persons, supporting our classification of recent infection (**Figure 1**). Among recently-infected individuals, median levels of virus by RT-PCR cycle threshold did not differ significantly between symptomatic (24, IQR: 19-25, range 11-35; N=27) and asymptomatic (24, IQR: 19-26, range 16-32; N=10; p=0.98) persons (**Figure 1**); additional comparisons by subgroup are in **Figure S1**.

![Figure 1:](http://medrxiv.org/https://www.medrxiv.org/content/medrxiv/early/2020/06/17/2020.06.15.20132233/F1.medium.gif)

[Figure 1:](http://medrxiv.org/content/early/2020/06/17/2020.06.15.20132233/F1)

Figure 1: Quantitative levels of virus among participants with PCR-positive SARS-CoV-2 infections (N=80) by classification as asymptomatic, pre-symptomatic, and symptomatic for COVID-19 disease as determined over longitudinal follow-up (2 weeks post-testing), and stratified by antibody status with PCR+/Abpersons considered consistent with recent infection.

We recovered SARS-CoV-2 genomes from 59% (49/83) of the PCR-positive RNA samples. The recovered genomes were diverse and phylogenetically intermixed with samples from across San Francisco, including representatives from five globally circulating clades, showing multiple independent introductions (**Figure 2, Panel A**). Overall, 58% of PCR-positive participants shared a home with another PCR-positive participant identified in the testing campaign: sequences from such households were consistent with within-household transmission (**Figure 2, Panel B)**, with no variants detected in 65% of household links (N = 11/17, 95% CI 41%-83%). We found no significant population structure separating the Mission district samples from the rest of San Francisco (p=0.19).

![Figure 2:](http://medrxiv.org/https://www.medrxiv.org/content/medrxiv/early/2020/06/17/2020.06.15.20132233/F2.medium.gif)

[Figure 2:](http://medrxiv.org/content/early/2020/06/17/2020.06.15.20132233/F2)

Figure 2: Viral genomic diversity among PCR-positive participants. Panel A: phylogenetic tree containing Mission district samples (red) and other San Francisco samples (grey), x-axis marks the number of mutations with respect to the reference genome from Wuhan. Yellow arrows mark introductions of five major global clades (right brackets) to the study population. Panel B: tree subset to Mission district samples. Shape indicates district resident or worker and color indicates antibody status. Households with multiple PCR-positive persons are drawn in green and include markers for samples from which genomes could not be recovered. Asterisk marks a household outside of the district in which an unhoused person in the district spent time.

## Discussion

We found stark ethnic and economic disparities in who is at risk for ongoing infection six-weeks into San Francisco’s shelter-in-place ordinance. The estimated point prevalence of PCR-positive infection among Latinx residents (3.9%) was twenty times that of non-Latinx residents (0.2%). Perhaps even more striking was that recent infections were concentrated almost exclusively among low-income, Latinx people working frontline jobs, whereas prior infections occurred among more ethnically and economically diverse individuals. In addition, the majority of PCR-positive infections were asymptomatic at the time of testing, and recently infected individuals had high levels of virus regardless of symptoms. These data show that San Francisco’s COVID-19 epidemic has continuing transmission in subgroups of the city population that require urgent attention.

Heterogeneity among populations most affected by the COVID-19 epidemic exists across the U.S., within states and cities, and even within neighborhoods as shown here. Population-level epidemiologic data coupled with phylogenetic analyses can help identify, track and inform testing strategies, public health policies and measures to mitigate health and economic effects. Low-barrier, community-mobilized testing is foundational to these efforts. We sought to overcome testing barriers in this census tract through our partnership with the community-led Latino Task Force in San Francisco, who provided trusted explanations about COVID-19 and messaging on the importance of testing to the community. Through this approach, we were able to test a large proportion of the population in a short period of time.

We determined that during shelter-in-place, COVID-19 transmission became increasingly concentrated in Latinx community members. The risk factors driving recent transmission among Latinx residents were largely economic and highly correlated: low-income residents working frontline jobs who could not shelter-in-place and maintain their income. Transmission was amplified in Latinx multi-generational and multi-family households – a byproduct of skyrocketing rental costs in the city. These economic drivers and ethnic disparities observed at the community level here are reflected in COVID-19 hospitalizations and deaths widely reported in the US.10,20

We observed high sequence diversity of SARS-CoV-2 infections in the census tract, similar to the diversity seen in San Francisco more broadly, suggestive of multiple independent introductions over time. Our data suggest that most recent infections during shelter-in-place are due to acquisition of virus when working or seeking work in the census tract, with subsequent within-home transmission in high-density, low-income households. These findings should help dispel common, dangerous pitfalls in interpreting ethnic disparities in infection, such as biological explanations, supposed community behaviors or stigmatizing communities as transmission “hot spots” about which others have cautioned.11

Our results also highlight the importance of SARS-CoV-2 testing in both symptomatic and asymptomatic individuals. Symptom-based testing would have failed to detect over 40% of PCR-positive infections in this community, many of whom had high levels of virus. Overall, 29% of PCR-positive infections never developed symptoms, slightly lower than the proportion found (43%) in a population-based SARS-CoV-2 screening study from Iceland.21 In our study, recently infected people had high levels of virus that did not differ significantly based on symptoms. In addition, only one infected individual required hospitalization, suggesting the vast majority would not have been diagnosed without community-based testing. The clear implication of these findings is that testing strategies limited to symptomatic individuals and those seeking testing at health centers alone, will fail to limit transmission.

These results have several implications moving forward as shelter-in-place restrictions are lifted. First, more efforts are needed to address uncontrolled epidemics among sub-populations, especially vulnerable populations such as the Latinx community highlighted here. Expanded targeted, but community-led and mobilized, low-barrier testing not based on symptoms is needed. Testing need to be coupled with social protection of job security and economic support for self-isolation and quarantine (i.e. “test and respond”) and culturally responsive contact tracing. Our testing campaign contributed to policy change in San Francisco, with the mayor announcing on May 4, 2020 that essential workers would be eligible for free SARS-CoV-2 testing regardless of symptoms,22 and then on May 28, 2020 that low-wage workers with COVID-19 would be provided funds to stay home and isolate (“Right to Recover”).23 In parallel, longitudinal, population-based cohorts that couple epidemiologic data with PCR and antibody testing and viral sequencing can provide evidence of effectiveness of public health measures and viral introductions over time, enabling evidence-based responses in a dynamic landscape.

Our study has several limitations. SARS-CoV-2 PCR tests do not detect all cases and antibody sensitivity may be lower in asymptomatic infection which could have resulted in underestimation of cumulative incidence. False positive antibody results could result in overestimation of cumulative incidence and misclassification of prior infections. Fingerstick sampling could also impact antibody test performance. However, the EUA antibody test we used has been shown to have a high sensitivity (96.9-100%) at ≥17-22 days post symptom onset, and high specificity (≥99.6%) with venous drawn plasma,24,25 and our estimates of cumulative incidence accounted for sensitivity and specificity of the PCR and antibody assays used. Second, selection bias in who chose to test may have affected our estimates. Although we adjusted for demographic differences between participants and community composition based on 2018 American Community Survey data, these data may not fully reflect tract demographics in 2020. However, population-based testing in a census tract allowed for greater clarity in understanding who did not participate. Lastly, we relied on self-reported symptoms and survey responses, which may have resulted in misclassification. With follow-up of PCR-positive participants over two weeks, we were able to further explore symptom status, allowing for monitoring and reclassification.

In conclusion, improving access to SARS-CoV-2 testing, regardless of symptoms, through community-led, low-barrier testing programs in vulnerable communities, coupled with economic support and protections for low-income workers during isolation and quarantine are urgently needed to reduce community transmission and address the massive disparities in SARS-CoV-2 infection observed in the U.S.

## Data Availability

The data that support the findings will be available following publication.

## Supplementary Appendix

Supplement to manuscript: SARS-CoV-2 Community Transmission During Shelter-in-Place in San Francisco

### Supplementary Methods

Statistical approach to point estimates and 95% Confidence intervals for cumulative incidence, adjusted for test characteristics and sample participation

#### Overview of Approach

We take as our target of inference the cumulative probability of infection by SAR-COV-2 up to three days prior to the testing period.

Cumulative incidence unadjusted for test characteristics is estimated as the empirical proportion of participants with either a positive antibody test, a positive PCR test, or both. Corresponding unadjusted 95% CIs are estimated using an exact binomial approach. In this supplement, we describe the approach used to adjust both the point estimate and 95% CI to account for imperfect test sensitivity and specificity, as well as uncertainty in the estimates of these test characteristics. We further describe how, among census tract residents, additional adjustment for demographic differences between 2018 census tract composition (Manson, 2019) and study participation is incorporated. Uncertainty due to both forms of adjustment is estimated based on a bootstrap-based approach, related to the methods described in Bendavid, et al (2020), Cherain (2020), and Gelman (2020), and described in further detail below.

Use of the bootstrap approach requires estimates of test sensitivity and specificity, together with measures of the uncertainty in those estimates. Because our estimate of cumulative incidence is based on a composite outcome of two assays (PCR and antibody), we thus require an estimate of the specificity and sensitivity of the composite test and corresponding uncertainty estimates.

#### Estimation of specificity (and uncertainty of estimate) for composite cumulative incidence outcome

Specificity of the composite testing outcome (Antibody-positive or PCR-positive) can be estimated based on estimates of each of these tests in turn. For PCR specificity, we use data from a recent study in Bolinas in which 1845/1845 persons tested with the identical PCR assays under identical conditions tested negative. Determining specificity requires knowledge of the true number of uninfected persons, which is unknown but very unlikely to be less than 18271, PCR point estimate specificity is estimated using a conservative approach applied to these data: 0.51/1827 = 99.96% (Marill, 2017). For Antibody specificity, we use test characteristics provided by the manufacturer in the FDA package insert, in which 1066/1070 persons tested negative, implying a specificity point estimate of 99.63% (95% CI: 99.05%, 99.90%).

We assume that for an uninfected person, the probability of a negative PCR test is independent of the probability of a negative Antibody test, so joint specificity = PCR specificity * Antibody specificity. Together, these data provide a point estimate of the specificity of testing Antibody-positive and/or PCR-positive of 99.59% (95% CI 99.16%, 99.91%). The confidence interval is generated by a bootstrap procedure:

1.  Repeat 5000 times:
    
    1.  Create a vector of 1827 Bernoulli variables with probability of correct 0.51/1827. Take a bootstrap sample of size 1827 with replacement (Marill, 2017)
    
    2.  Take an independent bootstrap sample of 1066 correct and 4 incorrect.
    
    3.  The joint result is correct if both bootstrap samples are correct.

2.  Find the 2.5th and 97.5th quantile of the joint result.

#### Estimation of sensitivity (and uncertainty of estimate) for composite cumulative incidence outcome

Estimation of the sensitivity of the composite test (i.e., probability of testing Antibody-positive and/or PCR-positive given an infection up to three days before testing) is considerably more complex for several reasons.

*   - First, sensitivity varies profoundly for both assays as a function of time since infection. Based on current best available data, PCR sensitivity likely peaks around Day 8 after infection (∼Day 3 after symptom onset), and declines quickly; however, it remains greater than 0 for at least three weeks after infection (Kucirka, et al, 2020). Antibody sensitivity may be zero (0 positive, 4 negative) during the first two days of symptoms, then increases to close to perfect sensitivity (88 positive, 0 negative) after day 14 following symptom onset (per FDA package insert). In estimates below, we use these sources for estimated time-varying assay-specific sensitivity. We further assume a sensitivity of zero in the initial three days following infection. Any individuals in our sample who were infected within the three days prior to testing and tested PCR-positive on PCR will thus contribute to an overestimate of cumulative incidence; however, the impact of such events, if any, is expected to be very small.

*   - Second, because assay sensitivity varies by time, the sensitivity of the combined composite assay on any given calendar date depends on the distribution of time since infection on that date among persons who have been infected with SARS-COV-2 in the underlying population. This distribution in turn depends on the epidemic course to date in the underlying population. For example, high early growth rates followed by lower recent growth rates will increase the proportion of true positives with longstanding infection; recent high growth rates will increase the proportion of true positives who have been recently infected. The true historical underlying growth rate in this population is unknown. Here, we assume a set of time-varying growth rates generally consistent with regional hospitalization curves, and perform sensitivity analysis under variations in these assumed rates.

*   - Third, available data on time dependent-sensitivity curves have, to date, been drawn almost exclusively from symptomatic individuals. It is possible that sensitivity for either or both assays may be lower among asymptomatic persons. In the absence of data, we assume equivalent time-specific sensitivity among symptomatic and asymptomatic persons for each assay. The approach presented can be readily adopted (and we provide code to do so) if additional data on differential sensitivity become available.

With these basic assumptions, we implement the following approach to estimating the sensitivity of the joint testing outcome, together with uncertainty.

Repeat 5000 times

1.  Calculate Antibody sensitivity:
    
    1.  For Antibody sensitivity by day since symptoms, sample with replacement using data from the FDA insert for days 1-13 of symptoms: (0/4 positive antibody tests on days 1-2 of symptoms; 2/8 positive antibody tests on days 3-7 of symptoms; 19/22 positive antibody tests on days 8-13 of symptoms).
    
    2.  For day ≥14 of symptoms, the FDA insert reports 88/88 positive antibody tests2. Create a vector of 88 Bernoulli variables with probability of correct 0.51/88. Take a bootstrap sample of size 88 with replacement (Marill, 2017)
    
    3.  Antibody sensitivity by day since symptoms is estimated as the mean of the bootstrap sample

2.  Calculate PCR sensitivity
    
    1.  For PCR sensitivity by day since infection, we use the results of Kucirka et al (2020), posted on [https://github.com/HopkinsIDD/covidRTPCR](https://github.com/HopkinsIDD/covidRTPCR). Sample one of the 5000 posterior draws of these results. Kucirka, et al provide estimates of PCR sensitivity from day 1 to 21 since infection.
    
    2.  We assume sensitivity on day 31 is zero and use linear interpolation to estimate PCR sensitivity on days 22 to 30.

3.  Assume that infections start on Feb 20, grow 23% per day (R0 ∼ 3.5), and that the growth rate than decreases linearly starting March 7 to 5% per day on March 23 (Re ∼ 1.2), reflecting the imposition of a Shelter In Place ordinance in San Francisco. These reproductive numbers are supported by fitting an SEIR model to San Francisco hospitalization data. Calculate the fraction of infections by day under these growth rates. Assume that symptoms start on 5th day of infection (as in Kucirka, et al).

4.  Assume that for a truly infected person, the probability of a positive PCR test is independent of the probability of a positive Antibody test. Then joint sensitivity by day since infection = PCR sensitivity + Antibody sensitivity - PCR sensitivity * Antibody sensitivity

5.  As noted above, we take as our target of inference cumulative probability of infection by SAR-COV-2 up to three days prior to the testing period, and thus assume sensitivity of zero in the initial three days following infection. Kucirka, et al estimate sensitivity of less than 0.01% on day 1, 0.4% on day 2, 6.6% on day 3, but confidence intervals include zero on all three days.

6.  Joint sensitivity integrated over days since infection is calculated by summing the product of fraction of infections by day and joint sensitivity by day.

The 5000 repetitions give a mean joint sensitivity of the combined Antibody and PCR assays of 88.4% with a standard deviation of 2.4%. Only this final mean and standard deviation are used in all adjusted estimates in main text of the paper. Because there were many assumptions in the estimation of these two quantities, we conducted sensitivity analysis considering a range of point and variance estimates for composite test sensitivity; results are shown in Table.

#### Bootstrap-based approach to adjust point estimate and confidence interval

Given the above inputs – estimated sensitivity and corresponding 95% CI of the joint test, and specificity estimates of the PCR and Antibody Tests with corresponding numbers of tests used to generate these estimates – we implemented the following bootstrap-based procedure to adjust the point estimates and generate confidence intervals accounting for uncertainty in test characteristics (and optionally allowing for adjustment for participation).

For each bootstrap iteration *j* in 1 to 5000

1.  Draw composite test sensitivity, *sens**j*, from a normal distribution with mean 88.4% and standard deviation of 2.4%

2.  Draw composite test specificity:
    
    1.  Create a vector of 1827 Bernoulli variables with probability of correct 0.51/1827. PCR specificity = mean of a bootstrap sample of size 1827 with replacement. (Marill, 2017)
    
    2.  Antibody specificity = mean of a bootstrap sample of 1066 correct and 4 incorrect.
    
    3.  Composite test specificity = *spec**j* = PCR specificity * Antibody specificity

3.  Draw *H* household at random (with replacement), and add all members of that household to the bootstrap sample, where *H* is the total number of households. Define *qgj* as the proportion of people with positive test results in the *j*th iteration in the race-sex-age cell *g*.

4.  For each race-sex-age cell *g*, convert test results to expected rate of actual prevalence using the following formula (Bendavid et al, 2020): *pgj* = max(0, (*qgj* + *spec**j* - 1) / (*sens**j* + *spec**j* - 1))

5.  Calculate the expected prevalence by summing over all of the groups, weighting group *g* by the group’s census population (*Ng*) divided by the total census population (*N*). ![Formula][1]</img> 

The 95 percent confidence interval is defined by the 2.5th and 97.5th percentiles of the observed values of *p*.

Table. Sensitivity analysis, showing adjusted point estimates and 95% CIs under varying assumptions on time-dependent sensitivity of the time-dependent Antibody and PCR tests. (Primary estimated, reported in main text, shown in blue.)

View this table:
[Table4](http://medrxiv.org/content/early/2020/06/17/2020.06.15.20132233/T4)

## Supplementary Methods

Bioinformatics and Genomic Analyses

Consensus genomes were generated by first using minimap21 to align raw reads to the reference genome MN908947.3, then using samtools,2 mpileup and ivar3 to generate consensus genomes. We used the augur/auspice4 toolkit with mafft,5 iqtree,6 and treetime7 to construct and visualize maximum likelihood phylogenies. The inferred topology had a lower bound of 9 transmission events into and out of the Mission, based on a min-cut separating the Mission district from other San Francisco samples. Branches on the topology with zero inferred mutations were collapsed into polytomies for the min-cut analysis. To test for population structure, we used the Python package scikit-allel to compute Hudson’s F ST8 between the Mission District study census tract and sequences from GISAID9 representing the rest of San Francisco (Supplemental Table S3), including at most 1 sample per household (the first lexicographically), and including all variant sites in the multiple alignment with at most 5 missing alleles. We failed to reject the null hypothesis of no population structure, computing FST=.007 with a p-value of 0.192 based on a permutation test with 10,000 random rearrangements of the sample labels.

View this table:
[Table S1:](http://medrxiv.org/content/early/2020/06/17/2020.06.15.20132233/T5)

Table S1: 
Summary of estimated point prevalence of PCR-positive SARS-CoV-2 infection and estimated cumulative incidence of SARS-CoV-2 infection among census tract residents, persons who work but do not live in the census tract, and residents from city blocks adjacent to the study census tract.

View this table:
[Table S2.](http://medrxiv.org/content/early/2020/06/17/2020.06.15.20132233/T6)

Table S2. 
Characteristics of participants with SARS-CoV-2 infection diagnosed by PCR or Antibody (N=144 total with classification as recent vs. prior infection), and antibody positivity by participant characteristics.

View this table:
[Table S3.](http://medrxiv.org/content/early/2020/06/17/2020.06.15.20132233/T7)

Table S3. GISAID sequences used for phylogenetic analysis.
We gratefully acknowledge the following Authors from the Originating laboratories responsible for obtaining the specimens and the Submitting laboratories where genetic sequence data were generated and shared via the GISAID Initiative, on which this research is based. All submitters of data may be contacted directly via [www.gisaid.org](http://www.gisaid.org).

![Figure S1:](http://medrxiv.org/https://www.medrxiv.org/content/medrxiv/early/2020/06/17/2020.06.15.20132233/F3.medium.gif)

[Figure S1:](http://medrxiv.org/content/early/2020/06/17/2020.06.15.20132233/F3)

Figure S1: Levels of virus as estimated by RT-PCR cycle thresholds stratified by sub-group.
RT-PCR cycle thresholds by age <18 vs. ≥18 years (Panel A), sex (Panel B), SARS-CoV-2 antibody status (Panel C), reported prior COVID-19 diagnosis (Panel D), and reported underlying medical conditions (Panel E) among PCR-positive participants (N=83).

## Acknowledgements

We thank the community members from the census tract studied for their generous participation in our study during San Francisco’s shelter-in-place ordinance. We gratefully acknowledge the contributions of all of the study volunteers who made community-based testing possible, and all the members of the Latino Task Force for COVID-19 who led community mobilization efforts, community outreach and education and dissemination of our findings to the community. We thank Chesa Cox, MPH for logistical and administrative support, Stacie Powers for her generous donation of the Brava Theater space to operate as headquarters for the study volunteers and the Community Wellness Team. We thank San Francisco Recreation and Parks and the San Francisco Unified School District for generously allowing us to offer community-based testing at two public parks and two public schools. We thank Supervisor Hillary Ronen, the San Francisco Mayor’s Office, Dr. Grant Colfax Director of San Francisco Department of Public Health, Elaine Forbes, Executive Director, Port of San Francisco as well as Naomi Kelly, SF City Administrator, Dr. Susan Ehrlich CEO of Zuckerberg San Francisco General Hospital and the Ward 86 Clinic staff for their efforts in support of this community-based testing project. We thank Andrew Kobylinski and Primary tech for assistance in their virtual support platform and online results reporting. We also wish to thank Ana Vallari, Ana Olivo, Barb Harris, and Chris Lark from Abbott Laboratories for helping conduct the antibody assays for this study.

## Footnotes

*   1 1827 true negatives (18 true negatives) would represent a true recent prevalence of 1% in Bolinas. Even if sensitivity is 30% and specificity is 100%, the probability of zero test positives in Bolinas given 18 true positives is less than 0.2%. If sensitivity is higher or specificity is lower, the probability of zero test positives in Bolinas is even lower.

*   2 In the package insert it is noted: “Five specimens from 1 immunocompromised patient were excluded from the study. … When the results from these specimens were included, the PPA at ≥ 14 days post-symptom onset was 96.77% (95% CI: 90.86, 99.33).”

*   Received June 15, 2020.
*   Revision received June 15, 2020.
*   Accepted June 17, 2020.


*   © 2020, Posted by Cold Spring Harbor Laboratory

This pre-print is available under a Creative Commons License (Attribution-NonCommercial-NoDerivs 4.0 International), CC BY-NC-ND 4.0, as described at [http://creativecommons.org/licenses/by-nc-nd/4.0/](http://creativecommons.org/licenses/by-nc-nd/4.0/)

## References

1.  1.Schuchat A, Team CC-R. Public Health Response to the Initiation and Spread of Pandemic COVID-19 in the United States, February 24-April 21, 2020. Mmwr 2020;69:551–6.
    
    [CrossRef](http://medrxiv.org/lookup/external-ref?access_num=10.15585/mmwr.mm6918e2&link_type=DOI) 
    
    [PubMed](http://medrxiv.org/lookup/external-ref?access_num=32379733&link_type=MED&atom=%2Fmedrxiv%2Fearly%2F2020%2F06%2F17%2F2020.06.15.20132233.atom) 

2.  2.Heinzerling A, Stuckey MJ, Scheuer T, et al. Transmission of COVID-19 to Health Care Personnel During Exposures to a Hospitalized Patient - Solano County, California, February 2020. Mmwr 2020;69:472–6.
    
    [PubMed](http://medrxiv.org/lookup/external-ref?access_num=http://www.n&link_type=MED&atom=%2Fmedrxiv%2Fearly%2F2020%2F06%2F17%2F2020.06.15.20132233.atom) 

3.  3.[https://covid19.ca.gov/stay-home-except-for-essential-needs/](https://covid19.ca.gov/stay-home-except-for-essential-needs/). Last accessed on June 1, 2020. 2020.
    
    
4.  4.[https://data.sfgov.org/stories/s/San-Francisco-COVID-19-Data-and-Reports/fjki-2fab/](https://data.sfgov.org/stories/s/San-Francisco-COVID-19-Data-and-Reports/fjki-2fab/) Last accessed on June 1, 2020. 2020.
    
    
5.  5.Wadhera RK, Wadhera P, Gaba P, et al. Variation in COVID-19 Hospitalizations and Deaths Across New York City Boroughs. JAMA 2020.
    
    
6.  6.Palomino J, Sanchez T. Latinos’ coronavirus burden. [https://www.sfchronicle.com/bayarea/article/Bay-Area-Latinos-hit-hardest-by-coronavirus-15252632.php](https://www.sfchronicle.com/bayarea/article/Bay-Area-Latinos-hit-hardest-by-coronavirus-15252632.php). Last accessed on June 4, 2020. San Francisco Chronicle 2020 May 8, 2020.
    
    
7.  7.Verity R, Okell LC, Dorigatti I, et al. Estimates of the severity of coronavirus disease 2019: a model-based analysis. The Lancet infectious diseases 2020;20:669–77.
    
    [CrossRef](http://medrxiv.org/lookup/external-ref?access_num=10.1016/S1473-3099(20)30243-7&link_type=DOI) 
    
    [PubMed](http://medrxiv.org/lookup/external-ref?access_num=32240634&link_type=MED&atom=%2Fmedrxiv%2Fearly%2F2020%2F06%2F17%2F2020.06.15.20132233.atom) 

8.  8.Gandhi M, Yokoe DS, Havlir DV. Asymptomatic Transmission, the Achilles’ Heel of Current Strategies to Control Covid-19. The New England journal of medicine 2020;382:2158–60.
    
    
9.  9.[https://www.cdc.gov/coronavirus/2019-ncov/symptoms-testing/symptoms.html](https://www.cdc.gov/coronavirus/2019-ncov/symptoms-testing/symptoms.html); Last accessed on June 4, 2020. 2020.
    
    
10. 10.Price-Haywood EG, Burton J, Fort D, Seoane L. Hospitalization and Mortality among Black Patients and White Patients with Covid-19. The New England journal of medicine 2020.
    
    
11. 11.Chowkwanyun M, Reed AL, Jr.. Racial Health Disparities and Covid-19 -Caution and Context. The New England journal of medicine 2020.
    
    
12. 12.[https://censusreporter.org/profiles/14000US06075022901-census-tract-22901-san-francisco-ca/](https://censusreporter.org/profiles/14000US06075022901-census-tract-22901-san-francisco-ca/) Last accessed on June 1, 2020. 2020.
    
    
13. 13.Quick J, Grubaugh ND, Pullan ST, et al. Multiplex PCR method for MinION and Illumina sequencing of Zika and other virus genomes directly from clinical samples. Nat Protoc 2017;12:1261–76.
    
    [CrossRef](http://medrxiv.org/lookup/external-ref?access_num=10.1038/nprot.2017.066&link_type=DOI) 
    
    [PubMed](http://medrxiv.org/lookup/external-ref?access_num=28538739&link_type=MED&atom=%2Fmedrxiv%2Fearly%2F2020%2F06%2F17%2F2020.06.15.20132233.atom) 

14. 14.[https://artic.network/resources/ncov/ncov-amplicon-v3.pdf](https://artic.network/resources/ncov/ncov-amplicon-v3.pdf); Last accessed on June 3, 2020.
    
    
15. 15.[https://www.fda.gov/media/137383/download](https://www.fda.gov/media/137383/download); Last accessed on June 4, 2020. 2020.
    
    
16. 16.Elbe S, Buckland-Merrett G. Data, disease and diplomacy: GISAID’s innovative contribution to global health. Glob Chall 2017;1:33–46.
    
    [CrossRef](http://medrxiv.org/lookup/external-ref?access_num=10.1002/gch2.1018&link_type=DOI) 
    
    [PubMed](http://medrxiv.org/lookup/external-ref?access_num=31565258&link_type=MED&atom=%2Fmedrxiv%2Fearly%2F2020%2F06%2F17%2F2020.06.15.20132233.atom) 

17. 17.Neher RA, Bedford T. nextflu: real-time tracking of seasonal influenza virus evolution in humans. Bioinformatics 2015;31:3546–8.
    
    [CrossRef](http://medrxiv.org/lookup/external-ref?access_num=10.1093/bioinformatics/btv381&link_type=DOI) 
    
    [PubMed](http://medrxiv.org/lookup/external-ref?access_num=26115986&link_type=MED&atom=%2Fmedrxiv%2Fearly%2F2020%2F06%2F17%2F2020.06.15.20132233.atom) 

18. 18.[https://virological.org/t/year-letter-genetic-clade-naming-for-sars-cov-2-on-nextstain-org/498](https://virological.org/t/year-letter-genetic-clade-naming-for-sars-cov-2-on-nextstain-org/498); Last accessed on June 5, 2020.
    
    
19. 19.Hudson RR, Slatkin M, Maddison WP. Estimation of levels of gene flow from DNA sequence data. Genetics 1992;132:583–9.
    
    [Abstract/FREE Full Text](http://medrxiv.org/lookup/ijlink/YTozOntzOjQ6InBhdGgiO3M6MTQ6Ii9sb29rdXAvaWpsaW5rIjtzOjU6InF1ZXJ5IjthOjQ6e3M6ODoibGlua1R5cGUiO3M6NDoiQUJTVCI7czoxMToiam91cm5hbENvZGUiO3M6ODoiZ2VuZXRpY3MiO3M6NToicmVzaWQiO3M6OToiMTMyLzIvNTgzIjtzOjQ6ImF0b20iO3M6NTA6Ii9tZWRyeGl2L2Vhcmx5LzIwMjAvMDYvMTcvMjAyMC4wNi4xNS4yMDEzMjIzMy5hdG9tIjt9czo4OiJmcmFnbWVudCI7czowOiIiO30=) 

20. 20.CDC COVID-NET. Characteristics of Laboratory-Confirmed COVID-19-Associated Hospitalizations. [https://gis.cdc.gov/grasp/COVIDNet/COVID19\_5.html](https://gis.cdc.gov/grasp/COVIDNet/COVID19_5.html). Last accessed on June 1, 2020. 2020.
    
    
21. 21.Gudbjartsson DF, Helgason A, Jonsson H, et al. Spread of SARS-CoV-2 in the Icelandic Population. The New England journal of medicine 2020.
    
    
22. 22.[https://www.nbcbayarea.com/news/local/san-francisco/san-francisco-now-offering-free-covid-19-testing-for-all-essential-workers/2284712/](https://www.nbcbayarea.com/news/local/san-francisco/san-francisco-now-offering-free-covid-19-testing-for-all-essential-workers/2284712/); Last accessed on June 4, 2020. 2020.
    
    
23. 23.Fracassa D. SF to pay low-wage workers who get COVID-19 to stay home and isolate. [https://www.sfchronicle.com/bayarea/article/SF-rolls-out-new-program-to-help-low-income-15299045.php](https://www.sfchronicle.com/bayarea/article/SF-rolls-out-new-program-to-help-low-income-15299045.php); Last accessed on June 4, 2020. San Francisco Chronicle 2020 May 28, 2020.
    
    
24. 24.Bryan A, Pepper G, Wener MH, et al. Performance Characteristics of the Abbott Architect SARS-CoV-2 IgG Assay and Seroprevalence in Boise, Idaho. ournal of clinical microbiology 2020.
    
    
25. 25.Ng DL, Goldgof GM, Shy BR, et al. SARS-CoV-2 seroprevalence and neutralizing activity in donor and patient blood from the San Francisco Bay Area. medRxiv (Preprint server) 2020.
    
    
## References

1.  Bureau UC. American Community Survey (ACS) [Internet]. The United States Census Bureau. [accessed 2020 May 8]; Available from: [https://www.census.gov/programs-surveys/acs](https://www.census.gov/programs-surveys/acs).
    
    
2.  Diggle, Peter. Hindawi Publishing Corporation, Epidemiology Research International. Volume 2011, Article ID 608719, 5 pages. [http://doi:10.1155/2011/608719](http://doi.org/10.1155/2011/608719)
    
    
3.   Steven Manson,  Jonathan Schroeder,  David Van Riper, and  Steven Ruggles. IPUMS National Historical Geographic Information System: Version 14.0 [Database]. Minneapolis, MN: IPUMS. 2019. [http://doi.org/10.18128/D050.V14.0](http://doi.org/10.18128/D050.V14.0)
    
    
4.  Marill, K. A., Chang, Y., Wong, K. F., & Friedman, A. B. (2017). Estimating negative likelihood ratio confidence when test sensitivity is 100%: A bootstrapping approach. Statistical methods in medical research, 26(4), 1936–1948.
    
    
5.  @jjcherain (John Cherain). “Ok, so what’s wrong with the confidence intervals in this preprint?” Twitter, 17 Apr., 2020. [https://twitter.com/jjcherian/status/1251272333177880576](https://twitter.com/jjcherian/status/1251272333177880576)
    
    
6.  Gelman Andrew. “Concerns with that Stanford study of coronavirus prevalence.” Blog post. Statistical Modeling, Causal Inference, and Social Science. 19 Apr. 2020. [https://statmodeling.stat.columbia.edu/2020/04/19/fatal-flaws-in-stanford-study-of-coronavirus-prevalence/](https://statmodeling.stat.columbia.edu/2020/04/19/fatal-flaws-in-stanford-study-of-coronavirus-prevalence/)
    
    
7.  Kucirka Lauren M., et al. “Variation in False-Negative Rate of Reverse Transcriptase Polymerase Chain Reaction–Based SARS-CoV-2 Tests by Time Since Exposure.” Annals of Internal Medicine (2020).
    
    
8.  Bendavid, Eran, et al. “COVID-19 Antibody Seroprevalence in Santa Clara County, California.” MedRxiv (2020).
    
    
## References

1.  1.Li H. Minimap2: pairwise alignment for nucleotide sequences. Bioinformatics 2018;34:3094–100.
    
    [CrossRef](http://medrxiv.org/lookup/external-ref?access_num=10.1093/bioinformatics/bty191&link_type=DOI) 
    
    [PubMed](http://medrxiv.org/lookup/external-ref?access_num=29750242&link_type=MED&atom=%2Fmedrxiv%2Fearly%2F2020%2F06%2F17%2F2020.06.15.20132233.atom) 

2.  2.Li H, Handsaker B, Wysoker A, et al. The Sequence Alignment/Map format and SAMtools. Bioinformatics 2009;25:2078–9.
    
    [CrossRef](http://medrxiv.org/lookup/external-ref?access_num=10.1093/bioinformatics/btp352&link_type=DOI) 
    
    [PubMed](http://medrxiv.org/lookup/external-ref?access_num=19505943&link_type=MED&atom=%2Fmedrxiv%2Fearly%2F2020%2F06%2F17%2F2020.06.15.20132233.atom) 
    
    [Web of Science](http://medrxiv.org/lookup/external-ref?access_num=000268808600014&link_type=ISI) 

3.  3.Grubaugh ND, Gangavarapu K, Quick J, et al. An amplicon-based sequencing framework for accurately measuring intrahost virus diversity using PrimalSeq and iVar. Genome Biol 2019;20:8.
    
    [CrossRef](http://medrxiv.org/lookup/external-ref?access_num=10.1186/s13059-018-1618-7&link_type=DOI) 
    
    [PubMed](http://medrxiv.org/lookup/external-ref?access_num=30621750&link_type=MED&atom=%2Fmedrxiv%2Fearly%2F2020%2F06%2F17%2F2020.06.15.20132233.atom) 

4.  4.Neher RA, Bedford T. nextflu: real-time tracking of seasonal influenza virus evolution in humans. Bioinformatics 2015;31:3546–8.
    
    [CrossRef](http://medrxiv.org/lookup/external-ref?access_num=10.1093/bioinformatics/btv381&link_type=DOI) 
    
    [PubMed](http://medrxiv.org/lookup/external-ref?access_num=26115986&link_type=MED&atom=%2Fmedrxiv%2Fearly%2F2020%2F06%2F17%2F2020.06.15.20132233.atom) 

5.  5.Katoh K, Standley DM. MAFFT multiple sequence alignment software version 7: improvements in performance and usability. Mol Biol Evol 2013;30:772–80.
    
    [CrossRef](http://medrxiv.org/lookup/external-ref?access_num=10.1093/molbev/mst010&link_type=DOI) 
    
    [PubMed](http://medrxiv.org/lookup/external-ref?access_num=23329690&link_type=MED&atom=%2Fmedrxiv%2Fearly%2F2020%2F06%2F17%2F2020.06.15.20132233.atom) 
    
    [Web of Science](http://medrxiv.org/lookup/external-ref?access_num=000317002300004&link_type=ISI) 

6.  6.Minh BQ, Schmidt HA, Chernomor O, et al. IQ-TREE 2: New Models and Efficient Methods for Phylogenetic Inference in the Genomic Era. Mol Biol Evol 2020;37:1530–4.
    
    
7.  7.Sagulenko P, Puller V, Neher RA. TreeTime: Maximum-likelihood phylodynamic analysis. Virus Evol 2018;4:vex042.
    
    [CrossRef](http://medrxiv.org/lookup/external-ref?access_num=10.1093/ve/vex042&link_type=DOI) 
    
    [PubMed](http://medrxiv.org/lookup/external-ref?access_num=29340210&link_type=MED&atom=%2Fmedrxiv%2Fearly%2F2020%2F06%2F17%2F2020.06.15.20132233.atom) 

8.  8.Hudson RR, Slatkin M, Maddison WP. Estimation of levels of gene flow from DNA sequence data. Genetics 1992;132:583–9.
    
    [Abstract/FREE Full Text](http://medrxiv.org/lookup/ijlink/YTozOntzOjQ6InBhdGgiO3M6MTQ6Ii9sb29rdXAvaWpsaW5rIjtzOjU6InF1ZXJ5IjthOjQ6e3M6ODoibGlua1R5cGUiO3M6NDoiQUJTVCI7czoxMToiam91cm5hbENvZGUiO3M6ODoiZ2VuZXRpY3MiO3M6NToicmVzaWQiO3M6OToiMTMyLzIvNTgzIjtzOjQ6ImF0b20iO3M6NTA6Ii9tZWRyeGl2L2Vhcmx5LzIwMjAvMDYvMTcvMjAyMC4wNi4xNS4yMDEzMjIzMy5hdG9tIjt9czo4OiJmcmFnbWVudCI7czowOiIiO30=) 

9.  9.Elbe S, Buckland-Merrett G. Data, disease and diplomacy: GISAID’s innovative contribution to global health. Glob Chall 2017;1:33–46.
    
    [CrossRef](http://medrxiv.org/lookup/external-ref?access_num=10.1002/gch2.1018&link_type=DOI) 
    
    [PubMed](http://medrxiv.org/lookup/external-ref?access_num=31565258&link_type=MED&atom=%2Fmedrxiv%2Fearly%2F2020%2F06%2F17%2F2020.06.15.20132233.atom)

 [1]: /embed/graphic-11.gif