Inferring the multiplicity of founder variants initiating HIV-1 infection: a systematic review and individual patient data meta-analysis
========================================================================================================================================

* James Baxter
* Sarah Langhorne
* Ting Shi
* Damien C. Tully
* Ch. Julián Villabona-Arenas
* Stéphane Hué
* Jan Albert
* Andrew Leigh Brown
* Katherine E. Atkins

## 1. Summary

**Background** HIV-1 infections initiated by multiple founder variants are characterised by a higher viral load and a worse clinical prognosis, yet little is known about the routes of exposure through which multiple variant transmission is most likely, and whether methods of quantifying the number of founder variants differ in their accuracy.

**Methods** We conducted a systematic review of studies that estimated founder variant multiplicity in HIV-1 infection, searching MEDLINE, EMBASE and Global Health databases for papers published between 1st January 1990 and 14th September 2020 (PROSPERO study CRD42020202672). Leveraging individual patient estimates from these studies, we performed a logistic meta-regression to estimate the probability that an HIV infection is initiated by multiple founder variants. We calculated a pooled estimate using a random effects model, subsequently stratifying this estimate across nine transmission routes in a univariable analysis. We then extended our model to adjust for different study methods in a multivariable analysis, recalculating estimates across the nine transmission routes.

**Findings** We included 71 publications in our analysis, comprising 1664 individual patients. Our pooled estimate of the probability that an infection is initiated by multiple founder variants was 0·25 (95% CI: 0·21-0·30), with moderate heterogeneity (*Q* = 137·1, *p* < 0·001, *I*² = 65·3%). Our multivariable analysis uncovered differences in the probability of multiple variant infection by transmission route.
Relative to a baseline of male-to-female transmission, the probability for female-to-male multiple variant transmission was significantly lower at 0·10 (95% CI: 0·05-0·21), while the probability for people-who-inject-drugs (PWID) transmission was significantly higher at 0·29 (0·13-0·52). There was no significant difference in the probability of multiple variant transmission between male-to-female transmission (0·16 (0·08-0·29)), post-partum mother-to-child (0·12 (0·02-0·51)), pre-partum mother-to-child (0·13 (0·05-0·32)), intrapartum mother-to-child (0·21 (0·08-0·44)) and men-who-have-sex-with-men (MSM) transmission (0·23 (0·03-0·7)).

**Interpretation** We found that PWID transmissions are significantly more likely to result in an infection initiated by multiple founder variants, whilst female-to-male infections are significantly less likely. Quantifying how the routes of HIV infection impact the transmission of multiple variants allows us to better understand how the evolution and epidemiology of HIV-1 determine the clinical picture.

**Funding** This study was supported by the MRC Precision Medicine Doctoral Training Programme (ref: 2259239) and an ERC Starting Grant awarded to KEA (award number 757688).

**Evidence before this study** The majority of HIV-1 infections are initiated by a single, genetically homogeneous founder variant. Infections initiated by multiple founders, however, are associated with a significantly faster decline of CD4+ T cells in untreated individuals, ultimately leading to an earlier onset of AIDS. Through our systematic search of MEDLINE, EMBASE and Global Health databases, we identified 82 studies that classify the founder variant multiplicity of acute HIV infections. As these studies vary in the methodology used to calculate the number of founder variants, it is difficult to evaluate the multiplicity of founder variants across routes of exposure.

**Added value of this study** Using meta-regression, we estimated the probability of multiple founder infections across exposure routes by accounting for variability in methodology between studies. Our multivariable meta-regression adjusted for heterogeneity across study methodology and uncovered differences in the probability that an infection is initiated by multiple founder variants by transmission route, with the probability for female-to-male transmission significantly lower than for male-to-female transmission. By contrast, the probability for transmission among people-who-inject-drugs (PWID) was significantly higher. There was no difference in the probability of multiple founder variant transmission for mother-to-child transmission or men-who-have-sex-with-men (MSM) when compared with male-to-female.

**Implications of all the available evidence** Because HIV-1 infections initiated by multiple founders are associated with a poorer prognosis, determining whether the route of infection affects the probability of transmission of multiple variants will facilitate an improved understanding of how the evolution and epidemiology of HIV-1 determine clinical progression. Our results identify that PWID transmissions are significantly more likely to result in an infection initiated by multiple founder variants compared to male-to-female. This reiterates the need for focussed public health programmes that reduce the burden of HIV-1 in this vulnerable risk group.

## 3. Introduction

Transmission of HIV-1 results in a dramatic reduction in genetic diversity, with a large proportion of infections initiated by a single founder variant.1,2 An appreciable minority of infections, however, appear to be the result of multiple founder variants simultaneously transmitted in a single exposure.3 Importantly, these multiple founder infections are associated with both significant increases in set point viral load and the rate of CD4+ T lymphocyte decline.4–7

HIV-1 infections initiated via different routes of exposure are subject to different virological, cellular and physiological environments, which likely influence the probability of acquiring infection.8–10 For example, the probability of transmission upon exposure increases six-fold between heterosexual transmission and transmission between people who inject drugs (PWID), and up to eighteen-fold for men who have sex with men (MSM).11 Despite these differences in the probability of HIV-1 acquisition by route of exposure, there is currently no consensus about the effect of route of exposure on the transmission of multiple founder variants.
Differences in selection pressure during transmission have been observed between sexual exposure routes, with less selection occurring during transmission from males to females than vice versa, and less selection occurring between men who have sex with men (MSM) relative to heterosexual exposure overall.12,13 However, studies quantifying the number of founder variants are inconsistent with these findings, which may be due to differences in methodology and study population.3,12,14,15 In sexual transmission, the probability of both transmission and founder variant multiplicity may also be influenced by inflammation, genital ulcerative disease and hormonal contraception, perhaps suggesting that the integrity of the mucosal barrier underpins this process.14,16 However, a significantly higher proportion of multiple founder infections in PWID transmissions, which bypass mucosal barriers altogether, has not been consistently observed, and so the role of exposure route on the risk of acquiring a multiple founder infection remains unclear.17,18

To estimate the role of exposure route on the acquisition of multiple HIV-1 founder variants, we conducted a meta-regression leveraging all available individual patient data, accounting for heterogeneity across methodology and study population.

## 4. Methods

### 4.1. Search Strategy and Eligibility Criteria

We searched MEDLINE, EMBASE and Global Health databases for papers published between 1 January 1990 and 14 September 2020 (S2: Supplementary Methods). To be included, studies must have reported original estimates of founder variant multiplicity in people acutely infected with HIV-1, be written in English and document ethical approval. Studies were excluded if they did not distinguish between single and multiple founder variants, if they did not detail the methods used, or if the study was conditional on having identified multiple founder variants.
Additionally, studies were excluded if they solely reported data concerning people living with HIV-1 who had known or suspected superinfection, who were documented as having received pre-exposure prophylaxis, or if the transmitting partner was receiving antiretroviral treatment. No restrictions were placed on study design, geographic location, or age of participants. Publications were screened independently by SL and JB. Reviewers were blinded to the publication authors during the title and abstract screens, and full text reviews were conducted independently before a consensus was reached, in consultation with other co-authors when necessary.

### 4.2. Data Extraction

Individual patient data (IPD) were collated from all studies, with authors contacted if these data were not available. Studies were excluded from further analysis if no IPD were obtained. Only individuals for whom a route of exposure was known were included. Additionally, we removed any entries for individuals with known or suspected superinfection, who were receiving pre-exposure prophylaxis or for whom the transmitting partner was receiving antiretroviral therapy. For this final individual patient dataset for analysis, we recorded whether an infection was initiated by one or multiple variants and nine predetermined covariates:

1. *Route of exposure*. Female-to-male (HSX-FTM), male-to-female (HSX-MTF), men-who-have-sex-with-men (MSM), pre-partum, intrapartum and post-partum mother-to-child (MTC), or people who inject drugs (PWID).
2. *Method of quantification*. Methodological groupings were defined by the properties of each approach, resulting in six levels: phylogenetic, haplotype, distance, model, or molecular (Table 1).
Molecular methods interpret the formation of heteroduplexes during gel electrophoresis of viral RNA; haplotype methods identify linkage patterns of individual polymorphisms; distance and model-based methods assume a threshold or distribution of diversity that is reasonably expected to occur under a hypothesis of neutral exponential growth from a single founder and determine whether the observed diversity is consistent with the modelled values; and phylogenetic methods either use recipient sequences only, in which case a star-like topology is expected to be observed for single founder infections, or use source and recipient sequences from known transmission pairs, such that the number of distinct clades of recipient sequences nested within the source sequences corresponds to the number of founder variants.
3. *HIV subtype*. Canonical geographically delimited subtypes (A-D, F-H, J and K) and circulating recombinant forms (e.g. CRF01_AE).19,20 IPD where subtyping was unclear or not conducted were assigned ‘unknown,’ while putative recombinants not recognised as circulating recombinant forms were assigned ‘recombinant.’
4. *Delay between infection and sampling*. For sexual or injection drug use exposure, the delay was classified as either less than or equal to 21 days if the patient was seronegative at time of sampling (Fiebig stages I-II) or more than 21 days if the patient was seropositive (Fiebig stages III-VI). For mother-to-child infections, if infection was confirmed at birth, or within 21 days of birth, the delay was classified as less than or equal to 21 days. A positive mRNA or antibody test definitively reported after this period was classified as a delay of greater than 21 days.
5. *Number of genomes analysed per participant*.
6. *Genomic region analysed*. Classified as envelope (Env), pol, gag or near full length genome (NFLG).
7. *Alignment length analysed*.
Measured in base pairs, discretised into intervals at 250, 500, 1000, 2000, 4000 and 8000 base pairs, and near full length genome (NFLG).
8. *Use of single genome amplification (SGA) to generate viral sequences*. A binary classification (yes or no) as to whether the viral genomic data were generated using SGA. Regular bulk or near endpoint polymerase chain reaction (PCR) amplification can generate significant errors such as Taq-polymerase mediated template switching, nucleotide misincorporation or unequal amplicon resampling.21,22 In SGA, serial dilutions of viral nucleic acids are made, which, assuming the proportion of positive PCR reactions at each dilution follows a null Poisson distribution, reduces the final reactions to contain a single variant that can be cloned, sequenced and then analysed.22,23
9. *Study cohort*. The epidemiological cohort from which the patient was sampled.

[Table 1](http://medrxiv.org/content/early/2021/07/18/2021.07.14.21259809/T1): Methods of quantification. Groupings of methods used to infer the founder variant multiplicity of HIV-1 infections. Model and phylogenetic methods may report similar metrics, such as the time to the most recent common ancestor (tMRCA) and topology, but model-based approaches, unlike phylogenetic methods, do not use genealogical information in their calculation and instead are statistical models applied directly to the genomic data.

If information from any of these nine covariates was missing or could not be inferred from the study, we classified its value as unknown. We excluded covariate levels for which there were fewer than 6 data points. For our base case analysis, we removed repeat measurements for the same individual, and used only those from the earliest study or, where the results of different methods were reported by the same study, the conclusive method used for each individual.

### 4.3. Pooled Meta-Analysis

We calculated a pooled estimate of the probability of multiple founder variant infection from our base case model: a ‘one-step’ generalised linear mixed model (GLMM) assuming an exact binomial distribution, with a normally distributed random effect on the intercept for within-study clustering and fitted by approximate maximum likelihood.25 Heterogeneity was measured in terms of τ², the between-study variance; I², the percentage of variance attributable to study heterogeneity; and Cochran’s Q, an indicator of larger variation between studies than of subjects within studies.26 Publication bias was assessed using funnel plots and Egger’s regression test.27

Whilst pooled estimates obtained through a ‘one-step’ approach are usually congruent with the canonical ‘two-step’ meta-analysis model, discrepancies may arise due to differences in likelihood specification, weighting schemes, and specification of the intercept or estimation of residual variances.28 We compared the results from our base case model with a two-step binomial-normal model to confirm our estimates were consistent.

We performed additional sensitivity analyses to test the robustness of our pooled estimate to our exclusion criteria: iteratively excluding single studies, excluding studies that contained fewer than 10 participants, excluding studies that consisted solely of single founder infections, excluding IPD that did not use single genome amplification, and including only those data that matched our reference methodology of haplotype-based methods and whole genome analysis. In each of these sensitivity analyses, the base case model was refitted as previously described. To investigate the impact of our treatment of repeated measurements, we created 1000 datasets in which the included datapoint for each individual was sampled at random from a pool of their possible measurements.
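
The resampling of repeated measurements described here can be sketched as follows. This is a minimal illustration rather than the code used in the study: the record layout (`participant_id`, `multiple_founders`) and the function names are assumptions, the toy records are hypothetical, and a simple proportion stands in for refitting the GLMM to each resampled dataset.

```python
import random
from collections import defaultdict


def resample_one_per_individual(records, rng):
    """Return one randomly chosen measurement for each participant."""
    by_participant = defaultdict(list)
    for rec in records:
        by_participant[rec["participant_id"]].append(rec)
    return [rng.choice(measurements) for measurements in by_participant.values()]


def resampled_proportions(records, n_datasets=1000, seed=0):
    """Proportion of multiple founder infections in each resampled dataset.

    In a real analysis the model would be refitted to every dataset;
    the raw proportion is used here only to keep the sketch short.
    """
    rng = random.Random(seed)
    proportions = []
    for _ in range(n_datasets):
        sample = resample_one_per_individual(records, rng)
        n_multiple = sum(rec["multiple_founders"] for rec in sample)
        proportions.append(n_multiple / len(sample))
    return proportions


# Hypothetical toy data: participant "A" was measured twice with
# conflicting results, "B" and "C" once each.
toy_records = [
    {"participant_id": "A", "multiple_founders": 0},
    {"participant_id": "A", "multiple_founders": 1},
    {"participant_id": "B", "multiple_founders": 0},
    {"participant_id": "C", "multiple_founders": 1},
]
toy_proportions = resampled_proportions(toy_records, n_datasets=1000, seed=1)
```

The spread of `toy_proportions` then shows how much the estimate depends on which repeated measurement is retained.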
Each of these 1000 datasets thus contained a single datapoint per individual and we refitted the base case model to calculate a distribution of pooled estimates.

### 4.4. Meta-regression

We extended our base case model by conducting a univariable meta-regression with each covariate contributing a fixed effect, assuming normally distributed random effects of publication. Pooled heterogeneity measures were calculated for each covariate level. We extended the base case model in a multivariable analysis, where we defined publication and cohort as crossed random effects before sequentially adding fixed effects covariates and evaluating interactions, assessing convergence, singularity and multicollinearity between fixed effects. The fixed effects were selected according to a ‘keep it maximal’ principle, in which covariates were only removed to facilitate a non-singular fit.29 We defined our reference case as heterosexual male-to-female transmission, evaluated through haplotype-based methods, analysis of whole genome sequences and a sampling delay of less than 21 days. Stratified predictions of the proportion of infections initiated by multiple founders and bootstrapped 95% confidence intervals, conditioned on the reference case, were calculated.

We performed sensitivity analyses to test the robustness of the selected multivariable meta-regression model: iteratively excluding single studies, excluding studies that contained fewer than 10 participants, excluding studies that consisted solely of single founder infections and excluding IPD that did not use single genome amplification. The re-sampling sensitivity analysis was repeated on our selected multivariable model as described above.

## 5. Results

### 5.1. Study and Patient Selection

Our search found 7416 unique papers, of which 7334 were excluded. Of the remaining 380 results, 207 were further excluded after abstract screening, leaving a total of 82 eligible studies for individual patient data (IPD) collation.
We successfully extracted IPD from 80 of these studies, comprising 3251 data points. The 80 selected studies from which IPD were collated were published between 1992 and 2020. Of the 3251 data points extracted, 1477 were excluded from our base case analysis to avoid repeated measurements, arising either between different studies that analysed the same individuals (resulting in the exclusion of five studies), or from repeat analysis of individuals within the same study. After excluding participants for whom the route of exposure was unknown or for whom one or more covariate values belonged to a level with fewer than the minimum number of observations across all participants, our final dataset for our base case analysis comprised estimates from 1664 unique patients across 71 studies.

### 5.2. Study and Patient Characteristics

Our base case dataset includes a median of 13 participants per study (range 2-124) and represents infections associated with heterosexual transmission (42·2%, n = 703), MSM transmission (37·3%, n = 621), mother-to-child (MTC) transmission (14·1%, n = 234), and PWID transmission (6·4%, n = 106) (Fig. S3). Among heterosexual transmissions, 67·6% (n = 475) were male-to-female transmissions, 30% (n = 211) were female-to-male transmissions, with the remainder undisclosed (n = 17). Similarly, we subdivided MTC transmission according to the timing of infection with 44·4% (n = 104) pre-partum, 24·4% (n = 57) intrapartum, 4·7% (n = 11) post-partum, with the remainder undisclosed (n = 62). Our dataset spanned geographical regions and dominant subtypes, capturing the diversity of the HIV epidemic (Figs 2, S3). Across the base case dataset, phylogenetic methods constituted 37·1% (n = 618) of estimates, 26·7% (n = 445) were estimated using haplotype methods, 20·9% (n = 347) using molecular methods, and 12·9% (n = 215) and 2·34% (n = 39) of estimates were inferred using distance and model-based methods respectively (Table 2, Fig 2).
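
As a quick consistency check on the composition figures in this section, the percentages can be recomputed from the counts quoted in the text. This is a minimal sketch; the dictionary and function names are illustrative and do not come from the study's repository.

```python
def percentages(counts):
    """Convert a mapping of category counts into percentages (1 d.p.)."""
    total = sum(counts.values())
    return {name: round(100 * n / total, 1) for name, n in counts.items()}


# Counts as quoted in the text for the base case dataset (n = 1664).
route_counts = {"heterosexual": 703, "MSM": 621, "MTC": 234, "PWID": 106}
method_counts = {
    "phylogenetic": 618,
    "haplotype": 445,
    "molecular": 347,
    "distance": 215,
    "model": 39,
}

route_pct = percentages(route_counts)
method_pct = percentages(method_counts)

# Both breakdowns should account for every participant in the dataset.
route_total = sum(route_counts.values())
method_total = sum(method_counts.values())
```

Both totals equal the 1664 participants of the base case dataset, and the recomputed percentages agree with those reported to one decimal place.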

[Table 2](http://medrxiv.org/content/early/2021/07/18/2021.07.14.21259809/T2): Studies selected for inclusion from our systematic literature search. We record the route of transmission: female-to-male (HSX:FTM), male-to-female (HSX:MTF), men-who-have-sex-with-men (MSM), mother-to-child pre-partum (MTC:PreP), intrapartum (MTC:IntP) and post-partum (MTC:PostP), people who inject drugs (PWID), or nosocomial (NOSO). Additionally, we tabulate the method grouping used to infer founder multiplicity, the genomic region analysed, the number of participants analysed and the proportion of infections initiated by multiple founders reported by each study. We note the number of single and multiple founder infections included within our base case dataset.

[Figure 1](http://medrxiv.org/content/early/2021/07/18/2021.07.14.21259809/F1): PRISMA flowchart outlining our systematic literature search and the application of exclusion criteria for the individual patient data meta-analysis.

[Figure 2](http://medrxiv.org/content/early/2021/07/18/2021.07.14.21259809/F2): Individual patient data characteristics from the included studies that were tested for inclusion as fixed effects in the multivariable meta-regression model.

### 5.3. Meta-analyses

#### 5.3.1. Pooled Estimate

Our base case analysis using a GLMM estimated the probability that an infection is initiated by multiple founder variants to be 0·25 (95% CI: 0·21-0·29), identifying significant heterogeneity (*Q* = 137·1, *p* < 0·001, *I*² = 65·3%).
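
For readers less familiar with how a pooled proportion relates to Q, I² and τ², the following is a minimal sketch of a classical two-step random-effects pooling on the logit scale. It is not the one-step GLMM used in this study: it assumes a DerSimonian-Laird estimator for τ², a 0·5 continuity correction and Wald confidence limits, and the study counts are hypothetical.

```python
import math


def logit_and_variance(events, n):
    """Logit of a proportion and its approximate variance.

    A 0.5 continuity correction is always applied so that studies with
    zero (or all) multiple founder infections stay finite.
    """
    e = events + 0.5
    m = n - events + 0.5
    return math.log(e / m), 1 / e + 1 / m


def expit(x):
    return 1 / (1 + math.exp(-x))


def pool_random_effects(studies):
    """studies: list of (multiple_founder_count, participants) tuples."""
    effects = [logit_and_variance(e, n) for e, n in studies]
    y = [eff for eff, _ in effects]
    v = [var for _, var in effects]

    # Step 1: inverse-variance fixed-effect estimate and Cochran's Q.
    w = [1 / vi for vi in v]
    fixed = sum(wi * yi for wi, yi in zip(w, y)) / sum(w)
    q = sum(wi * (yi - fixed) ** 2 for wi, yi in zip(w, y))
    df = len(studies) - 1

    # Step 2: DerSimonian-Laird between-study variance and I^2.
    c = sum(w) - sum(wi ** 2 for wi in w) / sum(w)
    tau2 = max(0.0, (q - df) / c) if c > 0 else 0.0
    i2 = max(0.0, (q - df) / q) * 100 if q > 0 else 0.0

    # Step 3: random-effects weights and pooled estimate.
    w_re = [1 / (vi + tau2) for vi in v]
    mu = sum(wi * yi for wi, yi in zip(w_re, y)) / sum(w_re)
    se = math.sqrt(1 / sum(w_re))
    return {
        "pooled": expit(mu),
        "ci": (expit(mu - 1.96 * se), expit(mu + 1.96 * se)),
        "Q": q,
        "I2": i2,
        "tau2": tau2,
    }


# Hypothetical counts: (multiple founder infections, participants).
toy_studies = [(3, 20), (10, 30), (2, 25), (8, 15)]
result = pool_random_effects(toy_studies)
```

A one-step GLMM instead models the binomial counts directly, which avoids the continuity correction and the normal approximation used here.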
Our sensitivity analyses revealed the pooled estimate is robust to the choice of model, the inclusion of estimates from repeat participants, and to the exclusion of studies that contained fewer than 10 participants (Fig. S4, S5). While analysing only data that matched our reference case study methodology did not change our estimate, it widened the confidence interval (0·25 (95% CI: 0·05-0·67)). We did not identify any studies that significantly influenced the pooled estimate (Fig. S6). Visual inspection of a funnel plot and a non-significant Egger’s test (t = -0·2663, df = 56, p = 0·7910) were consistent with an absence of publication bias in our dataset (Fig. S7).

#### 5.3.2. Meta-Regression

We extended our base case binomial GLMM using uni- and multivariable fixed effects. Relative to a reference exposure route of male-to-female transmission, our univariable analysis found significantly lower odds of female-to-male transmission being initiated by multiple founder variants (Odds Ratio (OR): 0·56 (95% CI: 0·33-0·87)), while other exposure routes were not significantly different. The univariable analyses also indicated significantly greater odds of multiple founder variants if the envelope genomic region was analysed (OR: 2·06 (95% CI: 1·16-3·98)), relative to the whole genome. Other methodological covariates, however, such as method of quantification and sampling delay, were not significantly associated with the odds that HIV-1 infection is initiated by multiple founder variants.

Our base case multivariable model calculated the probability of multiple founder variants across the seven routes of transmission controlling for method, genomic region and sampling delay (Fig. 3). Compared to a male-to-female transmission probability of 0·16 (95% CI: 0·08-0·29), there was no evidence that the probability of multiple founder variants differed across MSM (0·23 (0·03-0·7)) or MTC transmission.
Stratifying MTC transmissions by the putative timing of infection, we calculated that pre-partum transmissions were initiated by multiple founders with probability 0·13 (0·05-0·32), post-partum transmissions with probability 0·12 (0·02-0·51), and intrapartum transmissions with probability 0·21 (0·08-0·44).

[Figure 3](http://medrxiv.org/content/early/2021/07/18/2021.07.14.21259809/F3): Predictions and coefficients obtained from the multivariable model. A) Predicted probabilities of an infection being initiated by multiple founder variants, stratified by the route of exposure. B-D) Inferred odds ratios of fixed effects variables. Blue denotes that a covariate level significantly decreases the odds of an infection being initiated by multiple founders, whilst red indicates covariate levels for which the odds are significantly greater. For each plot, the reference case is marked at the top of the y axis, with the dotted line at x=1 demarcating the reference plane.

By contrast, we found that female-to-male transmissions were less likely to be initiated by multiple founders than male-to-female transmissions, with probability 0·10 (95% CI: 0·05-0·21) (OR: 0·61 (95% CI: 0·36-0·94)). Conversely, PWID transmission was more likely to be initiated by multiple founders (0·29 (0·13-0·52)), compared to male-to-female (OR: 2·19 (1·10-4·42)).

We calculated the accuracy of estimating the probability of multiple founder variants relative to a gold-standard methodological reference scenario: haplotype-based methods applied to whole genome sequences from individuals with a delay of less than 21 days between infection and sampling.
Our base case analysis indicates that using model-based methods underestimates the chance of multiple founder variants (OR: 0·32 (95% CI: 0·05-0·82)), while using the gag or envelope genomic regions overestimates the chance of detecting multiple founder variants (OR: 4·32 (95% CI: 1·03-20·47) and 1·78 (0·99-3·86), respectively). Our sensitivity analyses revealed the odds ratios calculated using the uni- and multivariable models are robust to inclusion of data from repeated participants, and to the exclusion of studies that contained fewer than 10 participants, of studies that consisted solely of single founder infections, and of individual data that did not use single genome amplification (Fig. S7).

[Table 3](http://medrxiv.org/content/early/2021/07/18/2021.07.14.21259809/T3): Odds ratios that an HIV-1 infection is initiated by multiple founder variants, inferred from fixed effects coefficients from the univariable and selected multivariable meta-regression models. Significant effects in bold. MSM - men who have sex with men; PWID - people who inject drugs; NFLG - near full length genome.

## 6. Discussion

Using data from 71 previous studies, we estimated that a quarter of HIV-1 infections are initiated by multiple founder variants. When controlling for different methodologies across studies, the probability that an infection is initiated by multiple founders was 37·5% lower, in relative terms, for female-to-male infections than for the baseline of male-to-female infections, but 81·25% higher for infections transmitted between people who inject drugs. Further, we found that model-based methods, representing a group of approaches that determine founder multiplicity by comparing the observed distribution of diversity with that expected under neutral exponential outgrowth from single variant transmission, were less likely to identify multiple founder infections.
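
The relative changes quoted in this paragraph are ratios of predicted probabilities, which are distinct from the odds ratios reported in the Results. A minimal sketch of the arithmetic is given below, using the rounded probabilities from the multivariable model; the function and constant names are illustrative, and odds ratios recomputed from rounded probabilities need not match the fitted coefficients exactly.

```python
def odds(p):
    """Convert a probability into odds."""
    return p / (1 - p)


def relative_change(p, p_ref):
    """Relative change in probability with respect to a reference."""
    return (p - p_ref) / p_ref


def odds_ratio(p, p_ref):
    """Odds ratio implied by two probabilities."""
    return odds(p) / odds(p_ref)


# Rounded predicted probabilities reported for the multivariable model.
P_MTF = 0.16   # male-to-female (reference)
P_FTM = 0.10   # female-to-male
P_PWID = 0.29  # people who inject drugs

ftm_change = relative_change(P_FTM, P_MTF)    # about -0.375, i.e. 37.5% lower
pwid_change = relative_change(P_PWID, P_MTF)  # about 0.8125, i.e. 81.25% higher

# Odds ratios implied by the same rounded probabilities; these are close
# to, but not identical with, the fitted odds ratios in the Results.
ftm_or = odds_ratio(P_FTM, P_MTF)
pwid_or = odds_ratio(P_PWID, P_MTF)
```

Because odds and probabilities diverge as the probability grows, a change expressed as an odds ratio should not be read directly as a change in probability.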
Together these results suggest that while the exposure route probably influences the number of founder variants, previous comparison has been difficult due to different study methodologies.

Our pooled estimate is consistent with the seminal study of Keele et al., who found 23·5% (24/102) of their participants had infections initiated by multiple founders.3 Our stratified predicted probabilities are also in line with those of previous smaller studies. A nine-study meta-analysis of 354 subjects found 0·34 of PWID infections were initiated by multiple founders compared with 0·29 (95% CI: 0·13-0·52) in our study; 0·2 for heterosexual infections compared to 0·23 (0·06-0·56); and 0·25 for MSM infections, for which we calculated 0·23 (0·03-0·7).12 Likewise, an earlier meta-analysis of five studies and 235 subjects found PWID infections were at significantly greater odds than heterosexual infections of being initiated by multiple founders, with the frequency of founder variant multiplicity increasing 3-fold, while a smaller, non-significant 1·5-fold increase was observed with respect to MSM transmissions.17 In both instances, these studies restricted the number of participants so that the methodology in estimating founder variant multiplicity was consistent across all subjects. In this study, in contrast, we were able to extend our meta-analysis by leveraging individual level data to control for methodological sources of heterogeneity.

We did not identify any significant effect of sampling delay on the probability that an infection is identified to be initiated by multiple founders. While previous work has shown a negative association between detection of multiple founder variants and the delay from infection to sampling, this discrepancy is likely due to the range of the delay analysed.
Specifically, Leitner and Romero-Severson found a reduced chance of multiple founder variants over a period of 8 years, while our study compares delays of less than or greater than 3 weeks.89

Certain routes of transmission that our analysis found to be associated with a higher or lower probability of multiple founder variants have previously been identified as having higher or lower probabilities of transmission, respectively. For example, we estimated that female-to-male multiple variant transmission is 39% less likely than that of male-to-female, while the per exposure transmission probability has been estimated at half as likely.11 Similarly, MSM infections are 46% more likely to be initiated by multiple founders, but here the probability of infection following a given exposure can be up to 33-fold greater than that in male-to-female infections. By contrast, although PWID infections were found to be the most likely to be initiated by multiple founders, the per exposure probability of infection for PWID is lower than for MSM, although 14-fold greater than for male-to-female infections. Further, mother-to-child infections are not significantly more likely than MSM or heterosexual infections to be initiated by multiple founders, but the probability of infection for mother-to-child exposures is 16 times and 565 times greater, respectively. Our results suggest a complicated relationship between the probability of transmission and the probability of multiple founder transmission.

Our analysis has some limitations. First, our definition of single and multiple founder variants is determined by the individual studies; however, questions remain concerning the definition of a founder variant.
Recent studies have suggested that a continuum of genotypic diversity exists, rather than discrete variants that give rise to distinct phylogenetic diversification trajectories, and that this continuum may not be reflected by a binary classification.90,98 Indeed, although a threshold is specified for distance-based methods, above which the observed diversity is defined to be too great to be explained by neutral exponential growth, this threshold often varies between publications.100,101 For example, both Keele et al. and Li et al. analysed the diversity of the envelope protein, but whilst the former classified populations with less than 0·47% diversity as homogeneous, Li et al. included samples up to 0·75%.3,15 The distinction between single and multiple founder variants may further be blurred by non-coalescent sources of variation such as recombination and APOBEC mediated hypermutation, which would erroneously inflate diversity measures unless accounted for.102,103 Ultimately, the classification of multiple/single founders is subjective and may also be informed by cognitive biases of the authors. This is pertinent to studies which recruit participants from specific, often marginalised risk groups (e.g. MSM, PWID), where authors may have been more likely to classify multiple founder infections based on their prior assumptions.

Second, we acknowledge that under the hypothesis that the proportion of infections initiated by multiple founders varies by transmission route, our point estimate will be influenced by the relative proportion of transmission routes in our dataset. Globally, it is estimated that 70% of infections are transmitted heterosexually, compared to 42·2% in our dataset, which reflects the longstanding geographical bias of research towards patients in the global north.104 Therefore, our point estimate should be considered a summary of the published data over the course of the HIV-1 epidemic, and not a global estimate at any fixed point in time.
Third, we were unable to account for the stage of infection in the transmitter, despite recent findings that transmitters with acute infections are more likely to initiate multiple variant infections, because our dataset contained insufficient data regarding the transmitting partner.99

Finally, we acknowledge that the bootstrapped confidence intervals are wide, indicating considerable uncertainty in our estimates. These arise from small sample sizes for certain observations and from the crossed random effects of publication and cohort used in the meta-regression. In particular, our finding that infections analysed using gag are significantly more likely to be initiated by multiple founders is subject to substantial uncertainty, and is arguably unlikely considering that the mutation rate of envelope is significantly higher than that of gag during primary infection.105 We note that in this case, the results of our univariable analysis of genomic region analysed are more consistent with our prior expectations.

This systematic review and meta-analysis has demonstrated that infections initiated by multiple founders account for a quarter of HIV-1 infections across all known routes of transmission. We find that transmissions involving people who inject drugs are significantly more likely to be initiated by multiple founder variants, whilst female-to-male infections are significantly less likely, relative to male-to-female infections. Quantifying how the routes of HIV infection impact the transmission of multiple variants allows us to better understand the evolution, epidemiology and clinical picture of HIV transmission.

## 7. Contributors

KEA conceived the study. JB, SL, DT and KEA designed the study. JB and SL extracted the data. JB performed the experiments and analysed the data. All authors interpreted the data. JB and KEA drafted the manuscript, with critical revisions from all authors.
All authors approved the final version of the manuscript.

## Data Availability

Previously published data will be available alongside the code used for this study in a GitHub repository: [https://github.com/J-Baxter/foundervariantsHIV\_sysreview](https://github.com/J-Baxter/foundervariantsHIV_sysreview)

## 8. Declaration of Interests

The authors declare no competing interests.

## 9. Data Sharing

Code used in this study is available at [https://github.com/J-Baxter/foundervariantsHIV\_sysreview](https://github.com/J-Baxter/foundervariantsHIV_sysreview)

## 12. Supplementary

### 12.1. S1: PRISMA Checklist

View this table: [Table4](http://medrxiv.org/content/early/2021/07/18/2021.07.14.21259809/T4)

### 12.2. S2: Supplementary Methods

#### 12.2.1. Full search query submitted to MEDLINE, EMBASE and Global Health databases

(((((transmi*.af. or found*.af. or bottleneck.af. or single.af. or multiple.af. or multiplicity.af. or breakthrough.ti. or TF.af.) and (virus*.af. or variant*.af. or strain.af. or lineage.af. or phenotyp*.af.)) and (HIV.ti. or HIV-1.ti. or human immunodeficiency virus.ti. or env.ti. or envelope.ti or gag.ti. or pol.ti.)) and ((single genome amplification.af. or sga.af. or sgs.af. or ((sequencing.af. or characterized.af.) and (single genome.af. or deep.af. or whole genome.af. or full length.af. or full-length.af.))) or divers*.af. or distance.af. or poisson-fitter.af. or fitness.af. or (monophyletic.af. or paraphyletic.af. or polyphyletic.af.) or (phylogenetic*.af. and (clade.af. or topology.af. or tree.af. or linked.af. or diver*.af. or distance.af. or sieve.af. or molecular dating.af.)))) not ((SIV.ti,ab. or simian immunodeficiency.ti,ab. or fiv.ti,ab. or feline immunodeficiency virus.ti,ab. or exp Hepacivirus/ or Hepatitis.ti,ab. or exp Flaviviridae/ or Tuberculosis.ti,ab. or Enterovirus.ti,ab. or exp Spumavirus/ or diarrhoea.ti,ab. or diarrhea.ti,ab. or superinfection.ti. or exp Malaria/ or CMV.ti,ab. or HPV.ti,ab. or SHIV.ti,ab.
OR exp HIV-2/ or phylogeo*.af. or network.ti. or exp HIV Protease Inhibitors/ or exp HIV Integrase Inhibitors/)))

Set to these databases:

* Ovid MEDLINE(R) and Epub Ahead of Print, In-Process & Other Non-Indexed Citations, Daily and Versions(R)
* Global Health 1910 to 2020 Week 36
* EMBASE & EMBASE Classic 1947 – Sep 11

#### 12.2.2. Software and Computational Methods

* All code associated with this study is available under GNU General Public License v3.0 at the following GitHub repository: foundervariantsHIV_sysreview.
* The analyses were conducted in R 3.6.1, using the following packages: lme4 1.1-23 (Bates et al. 2007); metafor 2.4-0 (Viechtbauer 2010); performance 0.6.1; cowplot 1.0.0; ggplot2 3.3.2; dplyr 1.0.3 (Wickham et al. 2015); forcats 0.5.0; mltools 0.3.5; parallel 3.6.1; reshape2 1.4.3; stringr 1.4.0; tidyr 1.0.

### 12.3. S3: Time Structure of Route of Exposure and Method

![Figure S3:](http://medrxiv.org/https://www.medrxiv.org/content/medrxiv/early/2021/07/18/2021.07.14.21259809/F4.medium.gif)

[Figure S3:](http://medrxiv.org/content/early/2021/07/18/2021.07.14.21259809/F4)

Figure S3: Distributions of transmission route (A) and grouped method (B) over time, highlighting the epidemiologic and methodological step-changes that occurred over the three decades in which the selected studies were published. Importantly, this means that earlier methods may be biased towards those transmission routes that were more common in earlier studies.

### 12.4. S4: Sensitivity Analyses for Pooling

![Figure S4:](http://medrxiv.org/https://www.medrxiv.org/content/medrxiv/early/2021/07/18/2021.07.14.21259809/F5.medium.gif)

[Figure S4:](http://medrxiv.org/content/early/2021/07/18/2021.07.14.21259809/F5)

Figure S4: A visual comparison of the pooled estimates of the probability that an infection is initiated by multiple founders by the one-step (GLMM) and two-step (Binomial-Normal (B-N)) models and respective sensitivity analyses.
Plot (A) shows that both models calculate concordant estimates and are robust to sensitivity analyses designed to test our inclusion/exclusion criteria and biases introduced by small or minimal-effect studies. Plot (B) reports the distribution of estimates, recalculated from 1000 datasets in which the representative datapoint for each individual was sampled at random from a pool of their possible measurements. The dashed lines and shaded areas denote the original point estimate and confidence intervals, respectively.

### 12.5. S5: Leave-One-Out Cross Validation

![Figure S5:](http://medrxiv.org/https://www.medrxiv.org/content/medrxiv/early/2021/07/18/2021.07.14.21259809/F6.medium.gif)

[Figure S5:](http://medrxiv.org/content/early/2021/07/18/2021.07.14.21259809/F6)

Figure S5: For both one-step and two-step models, we visually inspect the influence of each study included in our analysis on the pooled estimate of the probability that an infection is initiated by multiple founders. We find that iteratively excluding individual studies has no discernible impact on the overall pooled estimate.

### 12.6. S6: Evaluation of Publication Bias

![Figure S6:](http://medrxiv.org/https://www.medrxiv.org/content/medrxiv/early/2021/07/18/2021.07.14.21259809/F7.medium.gif)

[Figure S6:](http://medrxiv.org/content/early/2021/07/18/2021.07.14.21259809/F7)

Figure S6: Funnel plot to visually evaluate the presence of publication bias. In the absence of publication bias, study estimates are distributed symmetrically with respect to the pooled estimate (vertical solid black line). Here, the log odds of an infection being initiated by multiple founders for each study, plotted against the standard error for each study, indicate an absence of publication bias. This conclusion was supported by Egger’s regression test (t = -0.2663, df = 56, p = 0.7910).

### 12.7. S7: Sensitivity Analyses for Meta-regression (i)

![Figure S7:](http://medrxiv.org/https://www.medrxiv.org/content/medrxiv/early/2021/07/18/2021.07.14.21259809/F8.medium.gif)

[Figure S7:](http://medrxiv.org/content/early/2021/07/18/2021.07.14.21259809/F8)

Figure S7: Odds ratios that an infection is initiated by multiple founders, stratified by route of transmission, as calculated in the main analysis (A), following the iterative exclusion of individual studies (B) and bootstrapped estimates recalculated from 1000 datasets in which the representative datapoint for each individual was sampled at random from a pool of their possible measurements (C). Panel (D) plots the odds ratios of all covariate levels included in the meta-regression, stratifying by previously defined sensitivity analyses. The overly wide confidence intervals in (D), particularly under the condition of single genome analysis (SGA) only data, are likely due to small sample sizes at those levels (n<10).

## 10. Acknowledgements

JB was supported by the MRC Precision Medicine Doctoral Training Programme (ref: 2259239); CJV-A and KEA were funded by an ERC Starting Grant (award number 757688) awarded to KEA. We are grateful to Kamini Gounder, Mary Kearney, Vladimir Novitsky, Morgane Rolland and Sodsai Tovanabutra for agreeing to share additional individual patient data with the authors in order to complete this study.

* Received July 14, 2021.
* Revision received July 14, 2021.
* Accepted July 18, 2021.
* © 2021, Posted by Cold Spring Harbor Laboratory

This pre-print is available under a Creative Commons License (Attribution-NonCommercial 4.0 International), CC BY-NC 4.0, as described at [http://creativecommons.org/licenses/by-nc/4.0/](http://creativecommons.org/licenses/by-nc/4.0/)

## 11. References

1. Zhu T, Mo H, Wang N, et al. Genotypic and phenotypic characterization of HIV-1 patients with primary infection. Science 1993; 261: 1179–81.
2.
Zhang LQ, MacKenzie P, Cleland A, Holmes EC, Brown AJ, Simmonds P. Selection for specific sequences in the external envelope protein of human immunodeficiency virus type 1 upon primary infection. J Virol 1993; 67: 3345–56.
3. Keele BF, Giorgi EE, Salazar-Gonzalez JF, et al. Identification and characterization of transmitted and early founder virus envelopes in primary HIV-1 infection. Proc Natl Acad Sci 2008; 105: 7552–7.
4. Sagar M, Lavreys L, Baeten JM, et al. Infection with multiple human immunodeficiency virus type 1 variants is associated with faster disease progression. J Virol 2003; 77: 12921–6.
5. Cornelissen M, Pasternak AO, Grijsen ML, et al. HIV-1 Dual Infection Is Associated With Faster CD4+ T-Cell Decline in a Cohort of Men With Primary HIV Infection. Clin Infect Dis 2012; 54: 539–47.
6. Janes H, Herbeck JT, Tovanabutra S, et al. HIV-1 infections with multiple founders are associated with higher viral loads than infections with single founders. Nat Med 2015; 21: 1139.
7. Macharia GN, Yue L, Staller E, et al. Infection with multiple HIV-1 founder variants is associated with lower viral replicative capacity, faster CD4+ T cell decline and increased immune activation during acute infection. PLoS Pathog 2020; 16: e1008853.
8. Kariuki SM, Selhorst P, Ariën KK, Dorfman JR. The HIV-1 transmission bottleneck. Retrovirology 2017; 14: 22.
9. Joseph SB, Swanstrom R, Kashuba ADM, Cohen MS. Bottlenecks in HIV-1 transmission: insights from the study of founder viruses. Nat Rev Microbiol 2015; 13: 414–25.
10. Talbert-Slagle K, Atkins KE, Yan K-K, et al. Cellular Superspreaders: An Epidemiological Perspective on HIV Infection inside the Body. PLoS Pathog 2014; 10: e1004092.
11. Patel P, Borkowf CB, Brooks JT, Lasry A, Lansky A, Mermin J. Estimating per-act HIV transmission risk: a systematic review. AIDS 2014; 28.
12. Tully DC, Ogilvie CB, Batorsky RE, et al.
Differences in the Selection Bottleneck between Modes of Sexual Transmission Influence the Genetic Composition of the HIV-1 Founder Virus. PLoS Pathog 2016; 12: e1005619.
13. Carlson JM, Schaefer M, Monaco DC, et al. HIV transmission. Selection bias at the heterosexual HIV-1 transmission bottleneck. Science 2014; 345: 1254031.
14. Haaland RE, Hawkins PA, Salazar-Gonzalez J, et al. Inflammatory Genital Infections Mitigate a Severe Genetic Bottleneck in Heterosexual Transmission of Subtype A and C HIV-1. PLoS Pathog 2009; 5: e1000274.
15. Li H, Bar KJ, Wang S, et al. High multiplicity infection by HIV-1 in men who have sex with men. PLoS Pathog 2010; 6: e1000890.
16. Sagar M, Kirkegaard E, Long EM, et al. Human immunodeficiency virus type 1 (HIV-1) diversity at time of infection is not restricted to certain risk groups or specific HIV-1 subtypes. J Virol 2004; 78: 7279–83.
17. Bar KJ, Li H, Chamberland A, et al. Wide variation in the multiplicity of HIV-1 infection among injection drug users. J Virol 2010; 84: 6241–7.
18. Masharsky AE, Dukhovlinova EN, Verevochkin SV, et al. A Substantial Transmission Bottleneck among Newly and Recently HIV-1-Infected Injection Drug Users in St Petersburg, Russia. J Infect Dis 2010; 201: 1697–702.
19. Robertson DL, Anderson JP, Bradac JA, et al. HIV-1 nomenclature proposal. Science 2000; 288: 55.
20. Archer J, Robertson DL. Understanding the diversification of HIV-1 groups M and O. AIDS 2007; 21: 1693–700.
21. Meyerhans A, Vartanian J-P, Wain-Hobson S. DNA recombination during PCR. Nucleic Acids Res 1990; 18: 1687–91.
22. Simmonds P, Balfe P, Peutherer JF, Ludlam CA, Bishop JO, Brown AJ. Human immunodeficiency virus-infected individuals contain provirus in small numbers of peripheral mononuclear cells and at low copy numbers. J Virol 1990; 64: 864–72.
23. Salazar-Gonzalez JF, Bailes E, Pham KT, et al. Deciphering human immunodeficiency virus type 1 transmission and early envelope diversification by single-genome amplification and sequencing. J Virol 2008; 82: 3952–70.
24. Giorgi EE, Funkhouser B, Athreya G, Perelson AS, Korber BT, Bhattacharya T. Estimating time since infection in early homogeneous HIV-1 samples using a poisson model. BMC Bioinformatics 2010; 11: 532.
25. Riley RD, Legha A, Jackson D, et al. One-stage individual participant data meta-analysis models for continuous and binary outcomes: Comparison of treatment coding options and estimation methods. Stat Med 2020; 39: 2536–55.
26. Borenstein M, Hedges LV, Higgins JPT, Rothstein HR. Introduction to meta-analysis. John Wiley & Sons, 2011.
27. Egger M, Smith GD, Schneider M, Minder C. Bias in meta-analysis detected by a simple, graphical test. BMJ 1997; 315: 629–34.
28. Burke DL, Ensor J, Riley RD. Meta-analysis using individual participant data: one-stage and two-stage approaches, and why they may differ. Stat Med 2017; 36: 855–75.
29. Barr DJ, Levy R, Scheepers C, Tily HJ. Random effects structure for confirmatory hypothesis testing: Keep it maximal. J Mem Lang 2013; 68. DOI: 10.1016/j.jml.2012.11.001.
30. Wolinsky SM, Wike CM, Korber BT, et al. Selective transmission of human immunodeficiency virus type-1 variants from mothers to infants. Science 1992; 255: 1134–7.
31. Briant L, Wade CM, Puel J, Brown AJ, Guyader M. Analysis of envelope sequence variants suggests multiple mechanisms of mother-to-child transmission of human immunodeficiency virus type 1. J Virol 1995; 69: 3778–88.
32. Poss M, Martin HL, Kreiss JK, et al. Diversity in virus populations from genital secretions and peripheral blood from women recently infected with human immunodeficiency virus type 1. J Virol 1995; 69: 8118–22.
33. Wade CM, Lobidel D, Brown AJ.
Analysis of human immunodeficiency virus type 1 env and gag sequence variants derived from a mother and two vertically infected children provides evidence for the transmission of multiple sequence variants. J Gen Virol 1998; 79: 1055–68.
34. Long EM, Martin HL, Kreiss JK, et al. Gender differences in HIV-1 diversity at time of infection. Nat Med 2000; 6: 71–5.
35. Dickover RE, Garratty EM, Plaeger S, Bryson YJ. Perinatal transmission of major, minor, and multiple maternal human immunodeficiency virus type 1 variants in utero and intrapartum. J Virol 2001; 75: 2194–203.
36. Delwart E, Magierowska M, Royz M, et al. Homogeneous quasispecies in 16 out of 17 individuals during very early HIV-1 primary infection. AIDS 2002; 16. [https://journals.lww.com/aidsonline/Fulltext/2002/01250/Homogeneous\_quasispecies\_in\_16\_out\_of\_17.7.aspx](https://journals.lww.com/aidsonline/Fulltext/2002/01250/Homogeneous_quasispecies_in_16_out_of_17.7.aspx).
37. Learn GH, Muthui D, Brodie SJ, et al. Virus population homogenization following acute human immunodeficiency virus type 1 infection. J Virol 2002; 76: 11953–9.
38. Long EM, Rainwater SMJ, Lavreys L, Mandaliya K, Overbaugh J. HIV type 1 variants transmitted to women in Kenya require the CCR5 coreceptor for entry, regardless of the genetic complexity of the infecting virus. AIDS Res Hum Retroviruses 2002; 18: 567–76.
39. Nowak P, Karlsson AC, Naver L, Bohlin AB, Piasek A, Sönnerborg A. The selection and evolution of viral quasispecies in HIV-1 infected children. HIV Med 2002; 3: 1–11.
40. Renjifo B, Chung M, Gilbert P, et al. In-utero transmission of quasispecies among human immunodeficiency virus type 1 genotypes. Virology 2003; 307: 278–82.
41. Verhofstede C, Demecheleer E, De Cabooter N, et al. Diversity of the human immunodeficiency virus type 1 (HIV-1) env sequence after vertical transmission in mother-child pairs infected with HIV-1 subtype A. J Virol 2003; 77: 3050–7.
42. Derdeyn CA, Decker JM, Bibollet-Ruche F, et al. Envelope-Constrained Neutralization-Sensitive HIV-1 After Heterosexual Transmission. Science 2004; 303: 2019–22.
43. Ritola K, Pilcher CD, Fiscus SA, et al. Multiple V1/V2 env variants are frequently present during primary infection with human immunodeficiency virus type 1. J Virol 2004; 78: 11208–18.
44. Sagar M, Wu X, Lee S, Overbaugh J. HIV-1 V1-V2 envelope loop sequences expand and add glycosylation sites over the course of infection and these modifications affect antibody neutralization sensitivity. J Virol 2006; 80: 9586–98.
45. Gottlieb GS, Heath L, Nickle DC, et al. HIV-1 variation before seroconversion in men who have sex with men: analysis of acute/early HIV infection in the multicenter AIDS cohort study. J Infect Dis 2008; 197: 1011–5.
46. Kwiek JJ, Russell ES, Dang KK, et al. The molecular epidemiology of HIV-1 envelope diversity during HIV-1 subtype C vertical transmission in Malawian mother-infant pairs. AIDS 2008; 22: 863–71.
47. Abrahams M-R, Anderson JA, Giorgi EE, et al. Quantitating the multiplicity of infection with human immunodeficiency virus type 1 subtype C reveals a non-poisson distribution of transmitted variants. J Virol 2009; 83: 3556–67.
48. Kearney M, Maldarelli F, Shao W, et al. Human immunodeficiency virus type 1 population genetics and adaptation in newly infected individuals. J Virol 2009; 83: 2715–27.
49. Novitsky V, Lagakos S, Herzig M, et al. Evolution of proviral gp120 over the first year of HIV-1 subtype C infection. Virology 2009; 383: 47–59.
50. Salazar-Gonzalez JF, Salazar MG, Keele BF, et al. Genetic identity, biological phenotype, and evolutionary pathways of transmitted/founder viruses in acute and early HIV-1 infection. J Exp Med 2009; 206: 1273–89.
51. Fischer W, Ganusov VV, Giorgi EE, et al. Transmission of single HIV-1 genomes and dynamics of early immune escape revealed by ultra-deep sequencing. PLoS One 2010; 5: e12303.
52. Zhang H, Tully DC, Hoffmann FG, He J, Kankasa C, Wood C. Restricted genetic diversity of HIV-1 subtype C envelope glycoprotein from perinatally infected Zambian infants. PLoS One 2010; 5: e9294.
53. Boeras DI, Hraber PT, Hurlston M, et al. Role of donor genital tract HIV-1 diversity in the transmission bottleneck. Proc Natl Acad Sci U S A 2011; 108: E1156–63.
54. Collins-Fairclough AM, Charurat M, Nadai Y, et al. Significantly longer envelope V2 loops are characteristic of heterosexually transmitted subtype B HIV-1 in Trinidad. PLoS One 2011; 6.
55. Herbeck JT, Rolland M, Liu Y, et al. Demographic processes affect HIV-1 evolution in primary infection before the onset of selective processes. J Virol 2011; 85: 7523–34.
56. Kishko M, Somasundaran M, Brewster F, Sullivan JL, Clapham PR, Luzuriaga K. Genotypic and functional properties of early infant HIV-1 envelopes. Retrovirology 2011; 8: 67.
57. Nofemela A, Bandawe G, Thebus R, et al. Defining the human immunodeficiency virus type 1 transmission genetic bottleneck in a region with multiple circulating subtypes and recombinant forms. Virology 2011; 415: 107–13.
58. Novitsky V, Wang R, Margolin L, et al. Transmission of single and multiple viral variants in primary HIV-1 subtype C infection. PLoS One 2011; 6.
59. Rachinger A, Groeneveld PHP, van Assen S, Lemey P, Schuitemaker H. Time-measured phylogenies of gag, pol and env sequence data reveal the direction and time interval of HIV-1 transmission. AIDS 2011; 25. https://journals.lww.com/aidsonline/Fulltext/2011/05150/Time_measured_phylogenies_of_gag,_pol_and_env.3.aspx.
60. Rieder P, Joos B, Scherrer AU, et al. Characterization of Human Immunodeficiency Virus Type 1 (HIV-1) Diversity and Tropism in 145 Patients With Primary HIV-1 Infection. Clin Infect Dis 2011; 53: 1271–9.
61. Rolland M, Tovanabutra S, DeCamp AC, et al. Genetic impact of vaccination on breakthrough HIV-1 sequences from the STEP trial. Nat Med 2011; 17: 366–71.
62. Henn MR, Boutwell CL, Charlebois P, et al. Whole genome deep sequencing of HIV-1 reveals the impact of early minor variants upon immune recognition during acute infection. PLoS Pathog 2012; 8: e1002529–e1002529.
63. Kiwelu IE, Novitsky V, Margolin L, et al. HIV-1 subtypes and recombinants in Northern Tanzania: distribution of viral quasispecies. PLoS One 2012; 7.
64. Rossenkhan R, Novitsky V, Sebunya TK, Musonda R, Gashe BA, Essex M. Viral diversity and diversification of major non-structural genes vif, vpr, vpu, tat exon 1 and rev exon 1 during primary HIV-1 subtype C infection. PLoS One 2012; 7: e35491–e35491.
65. Sturdevant CB, Dow A, Jabara CB, et al. Central nervous system compartmentalization of HIV-1 subtype C variants early and late in infection in young children. PLoS Pathog 2012; 8: e1003094–e1003094.
66. Baalwa J, Wang S, Parrish NF, et al. Molecular identification, cloning and characterization of transmitted/founder HIV-1 subtype A, D and A/D infectious molecular clones. Virology 2013; 436: 33–48.
67. Frange P, Meyer L, Jung M, et al. Sexually-transmitted/founder HIV-1 cannot be directly predicted from plasma or PBMC-derived viral quasispecies in the transmitting partner. PLoS One 2013; 8: e69144–e69144.
68. Chaillon A, Gianella S, Wertheim JO, Richman DD, Mehta SR, Smith DM. HIV migration between blood and cerebrospinal fluid or semen over time. J Infect Dis 2014; 209: 1642–52.
69. Sterrett S, Learn GH, Edlefsen PT, et al. Low multiplicity of HIV-1 infection and no vaccine enhancement in VAX003 injection drug users. In: Open forum infectious diseases. Oxford University Press, 2014.
70. Wagner GA, Pacold ME, Kosakovsky Pond SL, et al. Incidence and prevalence of intrasubtype HIV-1 dual infection in at-risk men in the United States. J Infect Dis 2014; 209: 1032–8.
71. Chen Y, Li N, Zhang T, et al. Comprehensive Characterization of the Transmitted/Founder env Genes From a Single MSM Cohort in China. J Acquir Immune Defic Syndr 1999 2015; 69: 403–12.
72. Danaviah S, de Oliveira T, Bland R, et al. Evidence of long-lived founder virus in mother-to-child HIV transmission. PLoS One 2015; 10: e0120389–e0120389.
73. Deymier MJ, Ende Z, Fenton-May AE, et al. Heterosexual Transmission of Subtype C HIV-1 Selects Consensus-Like Variants without Increased Replicative Capacity or Interferon-α Resistance. PLoS Pathog 2015; 11: e1005154–e1005154.
74. Gounder K, Padayachi N, Mann JK, et al. High frequency of transmitted HIV-1 Gag HLA class I-driven immune escape variants but minimal immune selection over the first year of clade C infection. PLoS One 2015; 10: e0119886–e0119886.
75. Le AQ, Taylor J, Dong W, et al. Differential evolution of a CXCR4-using HIV-1 strain in CCR5wt/wt and CCR5Δ32/Δ32 hosts revealed by longitudinal deep sequencing and phylogenetic reconstruction. Sci Rep 2015; 5: 17607.
76. Zanini F, Brodin J, Thebo L, et al. Population genomics of intrapatient HIV-1 evolution. Elife 2015; 4: e11282.
77. Chaillon A, Gianella S, Little SJ, et al. Characterizing the multiplicity of HIV founder variants during sexual transmission among MSM. Virus Evol 2016; 2. DOI:10.1093/ve/vew012.
78. Love TMT, Park SY, Giorgi EE, Mack WJ, Perelson AS, Lee HY. SPMM: estimating infection duration of multivariant HIV-1 infections. Bioinforma Oxf Engl 2016; 32: 1308–15.
79. Novitsky V, Moyo S, Wang R, Gaseitsiwe S, Essex M. Deciphering multiplicity of HIV-1C infection: transmission of closely related multiple viral lineages. PLoS One 2016; 11.
80. Oberle CS, Joos B, Rusert P, et al. Tracing HIV-1 transmission: envelope traits of HIV-1 transmitter and recipient pairs. Retrovirology 2016; 13: 62.
81. Park SY, Mack WJ, Lee HY. Enhancement of viral escape in HIV-1 Nef by STEP vaccination. AIDS Lond Engl 2016; 30: 2449–58.
82. Salazar-Gonzalez JF, Salazar MG, Tully DC, et al. Use of Dried Blood Spots to Elucidate Full-Length Transmitted/Founder HIV-1 Genomes. Pathog Immun 2016; 1: 129–53.
83. Smith SA, Burton SL, Kilembe W, et al. Diversification in the HIV-1 Envelope Hyper-variable Domains V2, V4, and V5 and Higher Probability of Transmitted/Founder Envelope Glycosylation Favor the Development of Heterologous Neutralization Breadth. PLoS Pathog 2016; 12: e1005989–e1005989.
84. DeCamp AC, Rolland M, Edlefsen PT, et al. Sieve analysis of breakthrough HIV-1 sequences in HVTN 505 identifies vaccine pressure targeting the CD4 binding site of Env-gp120. PLoS One 2017; 12: e0185959–e0185959.
85. Iyer SS, Bibollet-Ruche F, Sherrill-Mix S, et al. Resistance to type 1 interferons is a major determinant of HIV-1 transmission fitness. Proc Natl Acad Sci U S A 2017; 114: E590–9.
86. Kijak GH, Sanders-Buell E, Chenine A-L, et al. Rare HIV-1 transmitted/founder lineages identified by deep viral sequencing contribute to rapid shifts in dominant quasispecies during acute and early infection. PLoS Pathog 2017; 13: e1006510–e1006510.
87. Ashokkumar M, Aralaguppe SG, Tripathy SP, Hanna LE, Neogi U. Unique phenotypic characteristics of recently transmitted HIV-1 subtype C envelope glycoprotein gp120: use of CXCR6 coreceptor by transmitted founder viruses. J Virol 2018; 92: e00063–18.
88. Dukhovlinova E, Masharsky A, Vasileva A, et al. Characterization of the Transmitted Virus in an Ongoing HIV-1 Epidemic Driven by Injecting Drug Use. AIDS Res Hum Retroviruses 2018; 34: 867–78.
89. Leitner T, Romero-Severson E. Phylogenetic patterns recover known HIV epidemiological relationships and reveal common transmission of multiple variants. Nat Microbiol 2018; 3: 983–8.
90. Lewitus E, Rolland M. A non-parametric analytic framework for within-host viral phylogenies and a test for HIV-1 founder multiplicity. Virus Evol 2019; 5: vez044.
91. Sivay MV, Grabowski MK, Zhang Y, et al. Phylogenetic Analysis of Human Immunodeficiency Virus from People Who Inject Drugs in Indonesia, Ukraine, and Vietnam: HPTN 074. Clin Infect Dis 2019; published online Dec. DOI:10.1093/cid/ciz1081.
92. Todesco E, Wirden M, Calin R, et al. Caution is needed in interpreting HIV transmission chains by ultradeep sequencing. AIDS 2019; 33: 691–9.
93. Tovanabutra S, Sirijatuphat R, Pham PT, et al. Deep Sequencing Reveals Central Nervous System Compartmentalization in Multiple Transmitted/Founder Virus Acute HIV-1 Infection. Cells 2019; 8: 902.
94. Brooks K, Jones BR, Dilernia DA, et al. HIV-1 variants are archived throughout infection and persist in the reservoir. PLoS Pathog 2020; 16: e1008378.
95. Leda AR, Hunter J, Castro de Oliveira U, et al. HIV-1 genetic diversity and divergence and its correlation with disease progression among antiretroviral naïve recently infected individuals. Virology 2020; 541: 13–24.
96. Liu Y, Jia L, Su B, et al. The genetic diversity of HIV-1 quasispecies within primary infected individuals. AIDS Res Hum Retroviruses 2020.
97. Martinez DR, Tu JJ, Kumar A, et al. Maternal Broadly Neutralizing Antibodies Can Select for Neutralization-Resistant, Infant-Transmitted/Founder HIV Variants. mBio 2020; 11: e00176–20.
98. Rolland M, Tovanabutra S, Dearlove B, et al. Molecular dating and viral load growth rates suggested that the eclipse phase lasted about a week in HIV-1 infected adults in East Africa and Thailand. PLoS Pathog 2020; 16: e1008179–e1008179.
99. Villabona-Arenas ChJ, Hall M, Lythgoe KA, et al. Number of HIV-1 founder variants is determined by the recency of the source partner infection. Science 2020; 369: 103–8.
100. Lee HY, Giorgi EE, Keele BF, et al. Modeling sequence evolution in acute HIV-1 infection. J Theor Biol 2009; 261: 341–60.
101. Slatkin M, Hudson RR. Pairwise comparisons of mitochondrial DNA sequences in stable and exponentially growing populations. Genetics 1991; 129: 555–62.
102. Simon V, Zennou V, Murray D, Huang Y, Ho DD, Bieniasz PD. Natural variation in Vif: differential impact on APOBEC3G/3F and a potential role in HIV-1 diversification. PLoS Pathog 2005; 1: e6.
103. Bourara K, Liegler TJ, Grant RM. Target cell APOBEC3C can induce limited G-to-A mutation in HIV-1. PLoS Pathog 2007; 3: e153.
104. Shaw GM, Hunter E. HIV transmission. Cold Spring Harb Perspect Med 2012; 2: a006965.
105. Novitsky V, Wang R, Rossenkhan R, Moyo S, Essex M. Intra-host evolutionary rates in HIV-1C env and gag during primary infection. Infect Genet Evol 2013; 19: 361–8.

## 12.8. S8: Supplementary References

1. Bates D, Sarkar D, Bates MD, Matrix L. 2007. The lme4 package. R Package Version 2:74.
2. Lüdecke D. 2018. ggeffects: Tidy data frames of marginal effects from regression models. J. Open Source Softw. 3:772.
3. Viechtbauer W. 2010. Conducting meta-analyses in R with the metafor package. J. Stat. Softw. 36:1–48.
4. Wickham H. 2016. ggplot2: elegant graphics for data analysis. Springer.
5. Wickham H, Francois R, Henry L, Müller K. 2015. dplyr: A grammar of data manipulation. R Package Version 04 3.