SUMMARY
Objectives We sought to understand the spread of SARS-CoV-2 infection in urban India, which has surprisingly low COVID-19 deaths.
Design Cross-sectional and trend analyses of seroprevalence in self-referred test populations, and of reported cases and COVID mortality data.
Participants 448,518 self-referred individuals using a nationwide chain of private laboratories with central testing of SARS-CoV-2 antibodies and publicly available case and mortality data.
Setting 12 populous cities with nearly 92 million total population.
Main outcome measures Seropositivity trends and predictors (using a Bayesian geospatial model) and prevalence derived from mortality data and infection fatality rates (IFR).
Results For the whole of India, 31% of the self-referred individuals undergoing antibody testing were seropositive for SARS-CoV-2 antibodies. Seropositivity was higher in females (35%) than in males (30%) overall and in nearly every age group. In these 12 cities, seroprevalence rose from about 18% in July to 41% by December, with steeper increases at ages <20 and 20-44 years than at older ages. The “M-shaped” age pattern is consistent with intergenerational transmission. Areas of higher childhood measles vaccination in earlier years had lower seropositivity. The patterns of increase in seropositivity and in peak cases and deaths varied substantially across cities. In Delhi, death rates and cases first peaked in June and again in November; Chennai had a single peak in July. Based local IFRs and COVID deaths (adjusted for undercounts), we estimate that 43%-65% of adults above age 20 had been infected (range of mid-estimates of 12%-77%) corresponding 26 to 36 million infected adults in these cities, or an average of 9-12 infected adults per confirmed case.
Conclusion Even with relatively low death rates, the large cities of India had remarkably high levels of SARS-CoV-2 infection. Vaccination strategies need to consider widespread intergenerational transmission.
Introduction
As of early March 2021, India had over 11 million confirmed SARS-CoV-2 infections (COVID cases), second only to the United States (US), and nearly 160,000 reported COVID deaths. Cases and deaths in India constitute about one in nine and one in thirteen, respectively, of worldwide totals.1 India’s COVID mortality rate adjusted for population size and age structure is substantially lower than that in the US or Europe.2 Within India, the reported cases have been concentrated in cities, with New Delhi, Bengaluru, Pune, Mumbai, Thane, and Chennai together reporting more than two million individuals with positive PCR tests.3 Reported COVID deaths are similarly concentrated, although both cases and deaths in rural areas are likely under-reported due to lower access to viral testing and lower levels of registration and certification of deaths than in urban areas. Most rural deaths occur at home without medical attention at the time of death.4
The transmission patterns of SARS-CoV-2 in India are likely to differ substantially from high-income countries or even from China,5 given that multiple generations typically live together. Moreover, high population density, particularly in urban areas, complicates physical distancing both at home and outside.
Serological surveys in focal areas of India report SARS-CoV-2 antibody seropositivity of 20-50% of adults.6-9 However, standardized seroprevalence estimates across the many diverse settings of urban India have not yet been documented. Here, we report results of serological testing from nearly 450,000 self-referred patients using a nationwide chain of private laboratories with central testing of IgG antibodies. We combine these data with recorded case counts and use demographic analyses of baseline and COVID-19 mortality to estimate mortality-derived infection rates in 12 of the most populous cities that have reasonably complete recording of all deaths. We also examine, on an ecological level, the association of past childhood vaccination10 and other risk factors11 to adult SARS-CoV-2 infection.
Methods
Population tested
We included 448,518 people of all ages who underwent testing for SARS-CoV-2 antibodies between June 12, 2020 and December 31, 2020 by Thyrocare Laboratories.12 Thyrocare conducts central laboratory testing in Navi Mumbai from over 2200 franchised collection centers in all major cities in India. Thyrocare charges 600-750 Rupees (US$8-10) for an antibody test. The available data included the type of assay used, sample collection date, age, sex, and area postal pincode, but no personal identifiers. Motives for testing were not ascertained. Based on Indian income distribution in urban areas,13 it is likely that only the higher socioeconomic strata (SES) can afford these costs.
SARS-CoV-2 IgG serological assays
The SARS-CoV-2 antibody assays involved collection of 3 ml of serum, shipment overnight to Navi Mumbai, and central high-throughput analyses using one or both of two widely-used assays that detect antibodies to the SARS-Cov-2 nucleocapsid protein: the Abbott SARS-CoV-2 chemiluminescent microparticle immunoassay for IgG (positive cutoff: >=1.40),14 and the Cobas, Roche Elecsys Anti-SARS-CoV-2 electrochemiluminescence immunoassay for total immunoglobulin (most of which is IgG; positive cutoff: >=1.0).15 For both assays, the laboratory followed the manufacturers’ quality control procedures.12
SARS-CoV-2 PCR and rapid antigen testing, and mortality in 12 metropolitan areas
We compared seropositivity from the laboratory results to cases and deaths in 12 cities with reasonably high completeness (>90%) of reporting deaths in pre-pandemic years, as defined by the Registrar General of India (RGI).16 These cities have a collective population nearly 92 million (61 million adults above age 20 years), including the megacities of Mumbai, Delhi, and Kolkata, as well as Chennai, Bangalore, Ahmedabad, Pune, Surat, Jaipur, Nagpur, Coimbatore, and Visakhapatnam. For all these cities, we obtained SARS-CoV-2 polymerase chain reaction (PCR) or rapid antigen testing results (some states used rapid tests in June and then switched to PCR testing). We do not distinguish between the two methods. COVID death reporting follows World Health Organization classification guidelines.17 Both cases and death data were from the COVID19India.org,3 a volunteer group that collects data daily from all official data sources. We compared these data with daily official statistics globally,1 or published by the Government of India,18 filling gaps through web searches of local bulletins and social media. We derived population and death rates from vital statistics.16
Data Analysis
The main calculations are of age- and sex-specific seropositivity nationally, and confirmed cases and COVID deaths per population over the age of 20 years, of the assayed participants. About 40% of the population tested by Thyrocare were in the 12 included cities. Because the tested population was not representative, we conducted a mortality-based estimate to calculate the prevalence of infection in each city. The relation of the age-specific SARS-CoV-2 prevalence (P) to COVID attributable mortality is P=DR/IFR where DR is the COVID death rates and IFR is the assumed infection fatality rate. The IFRs used values of 0.29%, 0.263% and 0.21%, drawn, respectively, from detailed serosurveys in Bangalore, Mumbai, and Pune.6-8 We applied a reasonably large 30% correction factor based on RGI data on death reporting completeness, and previous analyses of under-reporting in selected low and middle-income countries2,19 to arrive at adjusted COVID deaths which we then used to calculate the infection prevalence. Sensitivity analyses examined 0 to 50% undercount scenarios.
We linked the 1415 pincodes (83% of the total) from the serology data to a commercial pincode geocoding dataset,20 representing a wide geographic distribution across India (363 pincodes for the 12 cities, and 1054 for all other areas, supplementary figure 4), and to other spatial datasets. We used a geostatistical Bayesian logistic regression model to estimate the spatially smoothed probability of seropositivity, with the seropositive cases (based on positive results for either antibody assay) as the outcome and the total tested as the denominator for each pincode (yielding odds ratios [OR]). The model examined migrants from another state, population less than 20 years old, mean of particulate matter of 2.5 micron size (PM2.5; drawn from 2015-17 satellite-based exposure data21), and nationally representative estimates of childhood polio and measles vaccination from the National Family Health Survey 4 conducted in 2015-16.10 Because childhood vaccination was correlated, we examined polio and measles separately. The model also adjusted for sex, non-linear effect of age, percentage of urban settlement, population size, population density in the pincode, subdistrict-level solid-fuel use (as a measure of rural residence and lower socioeconomic status), female illiteracy, and language (as a proxy for geographic regions). We derived co-variates mostly from 2011 Indian census or WorldPop gridded data.22 To reduce the “noise” in the resulting geographic estimates, we added smoothly varying spatial random effects (i.e., spatial autocorrelation or clustering) and independent unit-level random effects.23 Additional details of the analyses are provided in the supplementary appendix.
Patient and public involvement
Owing to the nature of this research, no patients or members of the public were involved in the design or reporting of this study.
Results
For the whole of India, 31% (140,631) of the 448,518 self-referred people who tested were seropositive for SARS-CoV-2 antibodies on either assay. Seropositivity to SARS-CoV-2 rose from about 16% in July until December, by which time nearly 50% of the tested population was seropositive. Of those tested, 218, 111 used the Abbott assay, 103,221 used the Roche assay, and 127,186 used both. We present results for either assay interchangeably as the concordance of the two assays was high: arbitrarily choosing the more widely used Abbot assay as the reference, the Roche assay showed a sensitivity and specificity of 87% and 97%, respectively (Kappa agreement 0.879; 95%CI 0.876, 0.882). Analyses using dual positive individuals yielded similar results (data not shown). The standardized titers of the two assays yielded comparable age-specific results, with wider divergence at older ages (data not shown).
Seropositivity from July 1 to December 31, 2020 was higher in females (35%) than in males (30%) in nearly every age group (Figure 1 panel A, supplementary table 1). The “M-shaped” age pattern is broadly consistent with widespread intergenerational transmission: the sex-specific OR for seropositivity, using age 20 as the reference age and adjusted for location and various other covariates, rose in children to age 14, fell by age 20, and then rose to peaks at age 60 in females and 70 in males after which it declined (panel B). The seropositivity rates rose from July to December for each of the age groups <20, 20-44, 45-69 and 70+ (panel C), with the fastest growth at ages <20 and 20-44 years. Over 61% of the tested population were aged 20-44 years (supplementary table 1). Logistic regression of the daily new seropositive cases showed fluctuations nationally in seropositivity, with the peak of daily cases broadly synchronous with a September peak of PCR confirmed cases and deaths, followed by declining daily counts (panel D).
By December, seropositivity was 41% in the 12 cities but with marked variation in monthly growth from July across the cities (supplementary table 1 and supplementary figures 1 and 2). There was a steady increase in seropositivity in all cities, except in Surat, Vishakhapatnam, and Nagpur, which showed earlier peaks in August, September, and October, respectively. Daily changes in seropositive case were similar (supplementary figure 3). Table 1 presents the cumulative totals of PCR/rapid test confirmed cases and COVID deaths in the 12 cities through December 31, 2020. The 12 cities comprised just under a third of national totals of COVID cases, and over 40% of all COVID deaths. The highest COVID death rate per 100,000 population was in Pune (82) and Nagpur (69) and the lowest was in Vishakhapatnam (13) and Jaipur (8).
The overall patterns for all of India revealed a mid-September peak for cases and deaths (Figure 1 panel D), but with marked spatial variation (supplementary figure 5). Figure 2 illustrates this for Delhi, Pune, and Chennai. Delhi had two peaks of death rates and case rates, one in June and another in November, with seroprevalence in the self-referred population rising slowly to December. Pune’s case rate peaked in September, after deaths had peaked in August with seroprevalence in the self-referred population peaking in October. Chennai’s death and case rates peaked two months earlier in July with a continuous increase in seroprevalence in the self-referred population. The Pearson correlations were strong between city-specific COVID deaths and cases (0.84, p <.0001) and by week within each of the cities (data not shown) but only moderately between seropositivity and deaths (0.45, p <.0001).
An increase in measles vaccination among children in 2015-16 at the district level was associated with a marginally significant 6% reduction in seropositivity nationally (95%CI 0-12%), and there was a non-significant 2% reduced risk with increased polio vaccination (Figure 3). By contrast, PM2.5 per 10 unit increase, younger population structure or higher migrant history were each unassociated with increased seropositivity. The spatially-distributed risks of seropositivity at the pincode level varied, and at the state level, case and death rates peaked in most states in September-October (supplementary figure 5).
Because the tested individuals are not a representative sample of the urban Indian population, we also calculated mortality-based infection rates with an assumed infection fatality rate ranging from 0.29% to 0.21% based on independent sero studies in urban India, and adjusted COVID death totals in each city (Table 1, Figure 4, supplementary table 2). This suggests that overall in 12 cities about 43%-65% of adults above age 20 experienced SARS-CoV-2 infection in 2020, with a wide range of infection: Pune had the highest at 69% to an improbable 97%, whereas Jaipur had the lowest at 11%-15%. For these 12 cities, this suggests about 26 to 36 million adults above age 20 years had been infected. This corresponds to about 9 to 12 infected adults for every confirmed case, with variability in this ratio across cities (most confirmed cases were >20 years3). The mortality-derived prevalence varied considerably in the hypothetical scenarios of zero or 50% undercounts of COVID deaths (supplementary table 2). A subset of five cities had SARS-COV-2 antibody serosurveys on more randomly-selected populations, but there were no proper nationally-representative serosurveys. (Figure 4). These mostly showed a similarly high level of seropositivity (albeit with a wide range from 4-56%) and broadly similar patterns to mortality-derived prevalence (supplementary figure 6).
Discussion
We found a remarkably high seropositivity rate to SARS-CoV-2, approaching half of all adults in self-referred populations in selected major cities in India. The results were broadly consistent with the high prevalence estimated from COVID mortality data, as well as with a limited number of serological surveys done in more randomly-selected populations.
India’s urban COVID pandemic has been characterized as a paradox of widespread infection but low mortality.8,24,25 Urban India still has a young age structure and young adults have low IFRs in nearly all countries.24,25 The age-specific and temporal patterns suggesting substantial intergenerational transmission are consistent with Indian household composition. Similar seropositivity patterns by age are seen elsewhere where multiple generations live in one household, such as in Brazil and Iran.26,27 The most rapid increases in seropositivity through December at younger ages (<20) are consistent with younger cohorts in India being important transmitters of infection to older household members. The lowest OR of infection was at ages 20-24 years, when young adults are most likely to be living away from multigenerational households.28
Asymptomatic infection is more commonly reported in Indian serosurveys, exceeding 90% in some,6,29 in contrast to high-income countries, where about one-third of infections are asymptomatic.30,31 Asymptomatic infection may be less transmissible than from symptomatic index cases, and lead those infected also having milder disease.32 In addition, there has been speculation that non-specific effects of measles, polio, and BCG vaccination (each use live, attenuated viruses or bacteria) in the population may reduce the probability of SARS-CoV-2 infection or its severity.33,34 While our finding of reduced seropositivity with earlier childhood measles vaccine is based only on ecologic associations, it suggests that background immunity factors might help explain widespread infection and low mortality rates in India.
The alternative explanation, that urban India has a larger number of hidden deaths is certainly true in some settings, such as in the cities of Bihar and Uttar Pradesh, which have low levels of routine civil registration of deaths. However, we restricted our analyses only to the dozen cities with good registration in recent years,16 and hospitalization data from these cities3 show broadly similar levels and trends to COVID deaths. Under-reporting of COVID deaths of about 30-40% is now documented worldwide,2,19 and we incorporated such levels of under-reporting in our analyses. However, accounting for under-reporting resulted in even higher and sometimes implausibly high mortality-derived seroprevalence (supplementary table 2).
Widespread SARS-CoV-2 infection in urban India has several implications. First, subsequent viral waves remain possible. Indeed, Delhi and some other cities have had at least two viral waves and as of early March, modest increases in case counts were reported in various cities.3 Repeat waves could be expected with indoor crowding in many parts of India during April to June with extreme heat driving more people to cooler indoor settings. Second, the delivery of COVID vaccines practicable for widespread use in India would need to consider if immunity is already established, and the age-specific differences in durability of vaccine versus naturally-acquired immunity.35 Third, the reasons for the widespread variation in infection, including the role of cellular immunity, needs exploration.36
Notwithstanding the self-selected nature of the serosurvey population, our study has several advantages. The self-referred populations are likely biased to those seeking diagnosis and higher SES.13 Thus, the actual prevalence may be even greater, as serological studies in focal areas find that higher SES have lower prevalence than the lower SES.6-9 However, these biases were unlikely to have changed over the six-month study period. We capture the diverse regions of India, using standardized antibody assays with similar performance. Local mortality data enable a city-specific prevalence estimate, including if low prevalence was a consequence of under-reporting of deaths. Nonetheless, the study has some limitations. First, we could not provide reliable estimates of COVID infection or mortality in rural areas, where most of India’s 10 million annual deaths occur.4 In rural areas, SARS-CoV-2 infection and deaths might already be substantial. The Registrar General of India needs to re-start its Million Death Study,4,37 a mortality surveillance that provides cause of death information in over 8000 randomly-selected villages and urban blocks, as only maternal death data have been published after 2013. Repeat properly-representative large serosurveys that avoid some of the sampling limitations of current surveys29 are needed to monitor changes in natural and vaccine-induced immunity. Second, we had to rely on published estimates of IFR from three cities in Western India. IFRs may well vary by city and on variables such as background rates of obesity and other risk factors which are not well understood.5 The mortality-derived seroprevalence, while crude, is transparent and can be updated with improved IFR estimates.
Conclusions
India’s COVID experience until early 2021 has been substantially different than that in high-income countries, with widespread intragenerational transmission. In every country, direct sero-epidemiological studies and improved COVID mortality reporting38 can document the path of the pandemic and assess the impact of vaccination and other control efforts.
SUMMARY BOX
What is already known on this topic
Overall COVID mortality rates in urban India (where SARS-CoV-2 infection is concentrated) are substantially lower than those observed in high-income countries.
Previous reporting on the seroprevalence of SARS-CoV-2 antibodies has been in focal urban areas, and show infection to be common.
What this study adds
For the whole of India, 31% of the self-referred people tested in over 2,200 collection points tested seropositive for SARS-CoV-2 antibodies using centralized assays and by December, nearly half of the tested population were seropositive.
We observed higher seropositivity in females (35%) than in males (30%) in nearly every age group, and an “M-shaped” age pattern of seropositivity broadly consistent with widespread intergenerational transmission.
Seropositivity was reduced in areas with higher childhood measles vaccination, perhaps reflecting non-specific effects on SARS-CoV-2 immunity.
Based on local mortality data in the 12 cities, we estimate that 43%-65% of adults above age 20 were infected, corresponding 26 to 36 million adults in these cities, or an average of 9 to 12 infected per confirmed case.
Data Availability
The data analysis code is available from the corresponding author.
Contributors
Conceived the idea and developed the study design: AV, CN, PJ. Data analysis: WS, SHF, PB, Literature review: WS, HG. PJ and AV wrote the initial draft, and all authors were involved in commenting on subsequent revisions. The corresponding author attests that all listed authors meet authorship criteria and that no others meeting the criteria have been omitted.
Funding
External funding was provided by Canadian Institutes of Health Research Foundation Grant (FDN 154277) and Emergency COVID-19 Research Grant, and the Connaught Global Challenge program of the University of Toronto.
Competing interests
Thyrocare is a commercial laboratory. We declare no competing interests relevant to this serological study. All authors have completed the ICMJE uniform disclosure form at www.icmje.org/coi_disclosure.pdf and declare no financial relationships with any organisations that might have an interest in the submitted work in the previous three years; no other relationships or activities that could appear to have influenced the submitted work.
Ethical approval
The study used anonymized data with no personal identifiers. Ethical approval was provided under an overall IRB for use of anonymized mortality and serosurveillance data by Unity Health REB, Toronto.
Data sharing
The data analysis code is available from the corresponding author.
The corresponding author (PJ) affirms that the manuscript is an honest, accurate, and transparent account of the study being reported; that no important aspects of the study have been omitted; and that any discrepancies from the study as planned (and, if relevant, registered) have been explained.
Dissemination to participants and related patient and public communities: Direct dissemination to study participants is not possible. Preprint versions of this analysis are available on MedRxiv.
Acknowledgements
We thank Thyrocare Laboratories for providing the anonymized data to enable this analyses.