Abstract
Background It is important to understand how the BNT162b2, mRNA-1273, and JNJ-78436735 COVID-19 vaccines, as well as prior infection, protect against breakthrough cases and reinfections. Real-world evidence on acquired immunity from vaccines and from SARS-CoV-2 infection can help public health decision-makers understand disease dynamics and viral escape, and can inform resource allocation for curbing the spread of the pandemic.
Methods This retrospective cohort study presents demographic information, survival functions, and probability distributions for 2,627,914 patients who received recommended doses of COVID-19 vaccines, and 63,691 patients who had a prior COVID-19 infection. In addition, patients receiving different vaccines were matched by age, sex, ethnic group, state of residency, and the quarter of the year in 2021 the COVID-19 vaccine was completed, to support survival analysis on pairwise matched cohorts.
Findings All three vaccines and infection-induced immunity showed a high probability of survival against breakthrough or reinfection cases (mRNA-1273: 0.997, BNT162b2: 0.997, JNJ-78436735: 0.992, previous infection: 0.965 at 180 days). The incidence rate of reinfection among those unvaccinated and previously infected was higher than that of breakthrough among the vaccinated population (reinfection: 0.9%; breakthrough: 0.4%). In addition, 280 vaccinated patients died (0.01% all-cause mortality) within 21 days of the last vaccine dose, and 5898 (3.1%) died within 21 days of a positive COVID-19 test.
Conclusions Despite a gradual decline in vaccine-induced and infection-induced immunity, both acquired immunities were highly effective in preventing breakthrough and reinfection. In addition, unvaccinated patients with COVID-19 who did not die within 90 days of their initial infection (9565 deaths, 5.0% all-cause mortality rate) had an asymptotic infection pattern comparable to that of breakthrough infection in those who acquired immunity from a vaccine. Overall, the risks associated with COVID-19 infection are far greater than the marginal advantages of immunity acquired by prior infection.
Introduction
As of November 5, 2021, more than 192 million people in the United States (US) have been fully vaccinated against severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2), the virus that causes coronavirus disease 2019 (COVID-19) (1). Vaccination against COVID-19 with the mRNA vaccines BNT162b2 (Pfizer-BioNTech) and mRNA-1273 (Moderna), and with the adenoviral vectored vaccine JNJ-78436735 (Janssen), has significantly reduced the incidence of COVID-19 infection and associated severe outcomes (2). The Centers for Disease Control and Prevention (CDC) reported that the risks of infection, hospitalization, and death among unvaccinated persons were 4.6, 10.4, and 11.3 times higher, respectively, than those among fully vaccinated persons during the period when the Delta variant was predominant (2). Studies to date suggest that COVID-19 vaccines authorized for use in the US protect against most COVID-19 variants in the US (2). However, with the spread of the Delta variant (B.1.617.2), vaccine protection was reduced and breakthrough infections, hospitalizations, and deaths increased among vaccine recipients (3–6). In addition to vaccine-induced immunity, the infection-induced immunity acquired from natural COVID-19 infection has shown high effectiveness in preventing reinfection (7). According to the World Health Organization (WHO), most people who have recovered from COVID-19 develop a strong protective immune response that remains for at least 6 to 8 months after infection (8).
Previous studies that examined vaccine- and infection-induced immunity have established the effectiveness of vaccines in preventing new infections; however, they are limited by small sample sizes, short study periods, and insufficient consideration of spatiotemporal variables (9–15). Here, we address this gap and examine COVID-19 infection post-vaccination in the western US from December 12, 2020 to November 5, 2021, a period encompassing the emergence and dominance of newer variants including Delta. Additionally, we examine reinfection among unvaccinated patients to investigate infection-induced immunity.
Methods
This study was a retrospective cohort study on electronic health record (EHR) data from Providence-St. Joseph Health (PSJH). PSJH is a community health system with 51 hospitals and 1085 clinics across five states in the western US: Alaska, California, Montana, Oregon, and Washington. In this study, we defined two cohorts, the vaccine-induced and infection-induced immunity cohorts (Figure 1). The vaccine-induced cohort period was from 12/12/2020 to 11/05/2021, and the infection-induced cohort period was from the beginning of the pandemic to 11/05/2021. The source population was patients with at least one medical encounter during the study period (N=7,620,084). Our outcomes of interest were breakthrough infection and reinfection for the vaccine- and infection-induced cohorts, respectively. Both were defined by a positive SARS-CoV-2 PCR test result.
For the vaccine-induced immunity cohort (Figure 1A), we included patients vaccinated with at least two doses of BNT162b2 or mRNA-1273, or one dose of JNJ-78436735, during the study period (N=3,236,036). Patients who received boosters or died after their last dose of vaccine were right-censored. We excluded patients who were not fully vaccinated at least 21 days before 11/05/2021, who had a positive test for COVID-19 before or within 21 days of the last vaccine dose, or who received invalid test results. We also excluded patients younger than 14 years old. From this population (N=2,627,914), we categorized patients into three groups. The vaccine breakthrough group was defined by a positive SARS-CoV-2 PCR test after the recommended initial doses of COVID-19 vaccine: one for JNJ-78436735 and two for BNT162b2 and mRNA-1273 (N=9,321). We identified the negative group as patients who received only negative results (N=601,326). Patients who were not tested were defined as the not-tested group (N=2,017,267).
For the infection-induced immunity cohort (Figure 1B), we included patients who received at least one positive COVID-19 test result during the study period (N=191,722). We split this population into two cohorts: 1) those with multiple COVID-19 test results (N=79,779) and 2) those with one COVID-19 test result (N=111,943). Our common exclusion criteria were death within 90 days of initial infection, age younger than 14 years, and vaccination during the study period. The cohort with multiple COVID-19 tests was split again into two groups: 1) patients with multiple positive test results (N=13,393) and 2) patients with one positive test result (N=66,386). We applied the common exclusion criteria to both groups. After exclusion, the group with one positive test result was defined as the positive-negative group (N=11,067). The reinfection group was defined by at least one additional positive SARS-CoV-2 PCR test more than 90 days after the initial positive test, following CDC guidelines (16,17). We identified the positive-not tested group (N=52,507) by applying the common exclusion criteria to the cohort with one COVID-19 test result.
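The grouping rule for the infection-induced immunity cohort can be sketched as follows. This is a minimal illustration and not the study's extraction code: the function name `classify_infection_group` and the input format (a list of `(date, is_positive)` pairs for one unvaccinated patient) are hypothetical, and the handling of repeat positives within 90 days is an assumption.

```python
from datetime import date

REINFECTION_GAP_DAYS = 90  # threshold stated in the text, following CDC guidance


def classify_infection_group(tests):
    """Classify one patient from a list of (test_date, is_positive) PCR results.

    Returns "reinfection", "positive-negative", "positive-not tested",
    or None when the patient does not fit any of the three groups.
    """
    positives = sorted(d for d, is_pos in tests if is_pos)
    if not positives:
        return None  # never tested positive, so not part of this cohort
    first = positives[0]
    if len(positives) > 1:
        # A later positive counts only if it is more than 90 days after the first.
        late = any((d - first).days > REINFECTION_GAP_DAYS for d in positives[1:])
        # Assumption: repeat positives inside the 90-day window are left unclassified.
        return "reinfection" if late else None
    if len(tests) > 1:
        return "positive-negative"  # one positive plus at least one negative result
    return "positive-not tested"  # a single positive test and nothing else
```

For example, a patient with positive tests on `date(2020, 6, 1)` and `date(2020, 10, 1)` (122 days apart) would be classified as a reinfection under this sketch.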
We present descriptive analyses of both cohorts as frequencies and percentages for categorical variables, and as mean and standard deviation (std) for numerical variables (Supplementary Tables 1 and 2). The variables include medical conditions reported in the literature as risk factors for poor COVID-19 outcomes, as well as coarse-grained geographical (by state) and temporal (quarter of the year 2021) information. For biomedical precision we used the Systematized Nomenclature of Medicine Clinical Terms (SNOMED–CT©) hierarchy (Supplementary Table 4). For each SNOMED code listed, all descendant SNOMED codes were included. Differences between the distributions of days to breakthrough were assessed using the Mann-Whitney U test. Results were considered statistically significant at a (2-tailed) p-value < 0.05.
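As an illustration of the test named above, the following is a minimal, dependency-free sketch of a two-sided Mann-Whitney U test using the normal approximation with a tie correction. It is not the authors' code; in practice a library routine such as `scipy.stats.mannwhitneyu` would normally be used, and the sample values in the usage lines are made up.

```python
import math


def mann_whitney_u(x, y):
    """Two-sided Mann-Whitney U test via the normal approximation.

    Returns (U statistic for sample x, two-sided p-value). Tied values get
    average ranks and the variance is tie-corrected; no continuity
    correction is applied.
    """
    n1, n2 = len(x), len(y)
    pooled = sorted((v, g) for g, sample in enumerate((x, y)) for v in sample)
    values = [v for v, _ in pooled]
    ranks = [0.0] * len(pooled)
    tie_term = 0.0
    i = 0
    while i < len(values):
        j = i
        while j < len(values) and values[j] == values[i]:
            j += 1
        avg_rank = (i + 1 + j) / 2.0  # mean of the 1-based ranks i+1 .. j
        for k in range(i, j):
            ranks[k] = avg_rank
        t = j - i
        tie_term += t**3 - t
        i = j
    rank_sum_x = sum(r for r, (_, g) in zip(ranks, pooled) if g == 0)
    u_x = rank_sum_x - n1 * (n1 + 1) / 2.0
    n = n1 + n2
    mean_u = n1 * n2 / 2.0
    var_u = n1 * n2 / 12.0 * ((n + 1) - tie_term / (n * (n - 1)))
    if var_u == 0:
        return u_x, 1.0  # all values identical: no evidence of a difference
    z = (u_x - mean_u) / math.sqrt(var_u)
    p_value = math.erfc(abs(z) / math.sqrt(2))  # two-sided standard normal tail
    return u_x, p_value


# Hypothetical days-to-breakthrough samples for two vaccines (illustrative only)
u_stat, p = mann_whitney_u([160, 172, 149, 181, 166], [120, 131, 118, 140, 125])
print(u_stat, p < 0.05)
```

The normal approximation is reasonable for large samples such as those in this study; for very small samples an exact method would be preferable.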
Additionally, we compared the immunity acquired through each of the COVID-19 vaccines using propensity score matched survival functions. Propensity score matching is a statistical method that defines a control group by matching control and treatment groups on the probability of treatment conditioned on a fixed set of covariates (the propensity score) (18,19). These covariates include age, sex, ethnic group, state of residency, and the quarter of 2021 when the last dose of COVID-19 vaccine was administered. We matched the control and treatment groups in a 1:1 ratio with replacement using nearest neighbor matching. We used the Python package scikit-learn v0.23.2 (20) to calculate and match propensity scores. Missing values for continuous variables were imputed using the median value across the cohort. We confirmed that covariates were balanced across control and treatment groups using standardized mean differences (SMD). Covariates were considered balanced when the standardized mean difference was below 0.10 (19).
This study was approved by the Institutional Review Board (IRB) at PSJH with Study Number STUDY2020000196. Consent was waived because disclosure of protected health information for the study was determined to involve no more than a minimal risk to the privacy of individuals. We followed STROBE reporting guidelines (Supplementary Table 5).
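The matching and balance-check steps can be sketched as below. This is a simplified illustration under stated assumptions, not the study's scikit-learn pipeline: propensity scores are assumed to have been estimated already (for example with a logistic regression), and the function names `match_with_replacement` and `standardized_mean_difference` are hypothetical.

```python
import bisect
import math


def match_with_replacement(treated_scores, control_scores):
    """1:1 nearest-neighbor matching on propensity score, with replacement.

    Returns, for each treated unit, the index of the control whose score is
    closest. A control may be chosen more than once. control_scores must be
    non-empty.
    """
    order = sorted(range(len(control_scores)), key=lambda i: control_scores[i])
    sorted_scores = [control_scores[i] for i in order]
    matches = []
    for s in treated_scores:
        pos = bisect.bisect_left(sorted_scores, s)
        # The nearest score is one of the two neighbors of the insertion point.
        candidates = [p for p in (pos - 1, pos) if 0 <= p < len(sorted_scores)]
        best = min(candidates, key=lambda p: abs(sorted_scores[p] - s))
        matches.append(order[best])
    return matches


def standardized_mean_difference(a, b):
    """Absolute SMD: |mean(a) - mean(b)| / sqrt((var(a) + var(b)) / 2)."""
    mean_a, mean_b = sum(a) / len(a), sum(b) / len(b)
    var_a = sum((v - mean_a) ** 2 for v in a) / (len(a) - 1)
    var_b = sum((v - mean_b) ** 2 for v in b) / (len(b) - 1)
    pooled_sd = math.sqrt((var_a + var_b) / 2.0)
    return abs(mean_a - mean_b) / pooled_sd if pooled_sd > 0 else 0.0


# Hypothetical scores and ages, for illustration only
treated = [0.20, 0.21, 0.80]
controls = [0.10, 0.22, 0.90]
idx = match_with_replacement(treated, controls)  # control index per treated unit
balanced = standardized_mean_difference([54, 61, 47], [55, 60, 49]) < 0.10
```

Matching with replacement lets the same control serve several treated patients, which tends to improve match quality at the cost of reusing information from the control group.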
Results
Vaccine-induced immunity cohort
We present descriptive analyses of vaccinated patients in Supplementary Table 1 and Supplementary Table 2. A total of 2,627,914 patients were eligible for the vaccine-induced immunity cohort. The mean age of this cohort was 53.57 years (±19.56). This cohort was enriched with patients of female sex, non-Hispanic/Latino ethnicity, White/Caucasian race, and California residence. Of these patients, 1,350,422 (51.4%) were administered two doses of BNT162b2, 1,087,796 (41.4%) were administered two doses of mRNA-1273, and 189,696 (7.2%) were administered one dose of JNJ-78436735. A majority of patients completed their initial vaccination during the second quarter of 2021 (51.4%). This cohort was categorized into three groups based on outcome (Figure 1): 9,321 (0.355%) patients in the vaccine breakthrough group, 601,326 (22.88%) patients in the test-negative group, and 2,017,267 patients in the not-tested group. The vaccine breakthrough group was enriched with patients between 18 and 44 years (25.6%), females (55.7%), and patients vaccinated during the first quarter of 2021 (52.0%). Figure 2 shows the distribution of time until breakthrough. The time to breakthrough had a mean (std) of 161.8 (±55.4) and 152.5 (±55.9) days for mRNA-1273 and BNT162b2, respectively, with approximately normal distributions, whereas JNJ-78436735 showed a more uniform distribution with a decaying tail and a mean of 123.6 (±53.7) days (Figure 2A). In addition, Figure 2B presents the normalized frequency distribution of the breakthrough cases for each vaccine. We obtained this from Figure 2A by dividing the count in each bin by the product of the total number of counts and the bin width. Furthermore, the Mann-Whitney U test showed statistically significant differences between all three distributions, with p-values <0.001 for each pairwise comparison. Figure 3 presents the estimated survival functions for the outcome of breakthrough infection.
In addition, Figure 4 presents the estimated survival functions for breakthrough infection in pairwise matched cohorts using propensity score matching. Note that the flat segments in the post-vaccination COVID-19 breakthrough curves indicate the absence of new positive cases after that time point.
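The normalization used for Figure 2B, dividing each bin count by the total count times the bin width, can be written as a short sketch. The function name `normalize_histogram` and the example counts are hypothetical; the same result is what `numpy.histogram` returns with `density=True` for equal-width bins.

```python
def normalize_histogram(counts, bin_width):
    """Convert bin counts to a density: count / (total count * bin width)."""
    total = sum(counts)
    if total == 0:
        raise ValueError("histogram has no counts")
    return [c / (total * bin_width) for c in counts]


# Hypothetical weekly bins of breakthrough counts (illustrative values only)
density = normalize_histogram([10, 30, 60], bin_width=7)
area = sum(d * 7 for d in density)  # a normalized histogram integrates to 1
```

Normalizing in this way lets distributions from vaccine groups of very different sizes be compared on the same vertical scale.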
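The text does not name the estimator behind these survival functions. Assuming the usual choice for right-censored data, the Kaplan-Meier product-limit estimator, a minimal sketch looks like this (the function name `kaplan_meier` and the example data are hypothetical):

```python
def kaplan_meier(durations, events):
    """Kaplan-Meier product-limit estimate of the survival function.

    durations: follow-up time in days for each patient.
    events: True if the patient had the outcome (e.g. a positive PCR test),
            False if right-censored (e.g. booster, death, end of study).
    Returns a list of (time, survival probability) at each event time.
    """
    if len(durations) != len(events):
        raise ValueError("durations and events must have the same length")
    # Count outcomes and censorings at each distinct follow-up time.
    by_time = {}
    for t, had_event in zip(durations, events):
        n_events, n_censored = by_time.get(t, (0, 0))
        if had_event:
            by_time[t] = (n_events + 1, n_censored)
        else:
            by_time[t] = (n_events, n_censored + 1)
    at_risk = len(durations)
    survival = 1.0
    curve = []
    for t in sorted(by_time):
        n_events, n_censored = by_time[t]
        if n_events:
            survival *= 1.0 - n_events / at_risk
            curve.append((t, survival))
        # Patients with an event or censoring at time t leave the risk set.
        at_risk -= n_events + n_censored
    return curve


# Four hypothetical patients: two breakthroughs, two censored
curve = kaplan_meier([5, 10, 10, 20], [True, False, True, False])
```

The curve only steps down at times when an outcome occurs, which is why a stretch with no new positive tests appears as a flat line.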
Infection-induced immunity cohort
Descriptive analyses for the infection-induced immunity cohort are given in Supplementary Table 3. Of 191,722 patients who tested positive for COVID-19 during the study period, 64,424 patients were eligible for the infection-induced immunity cohort. The mean age of this cohort was 43.33 (±18.63) years. This cohort was enriched with the 18-44 age group, female, non-Hispanic/Latino, and White/Caucasian patients, and patients with invalid geographical information. This cohort was categorized into three groups based on outcome: 567 (0.88%) patients in the reinfection subgroup, 11,067 (17.18%) in the positive-negative subgroup, and 52,507 (81.50%) in the positive-not tested subgroup. As in the vaccine breakthrough group, patients between the ages of 18 and 44 had the highest rate of reinfection; however, they made up a larger fraction of the reinfection subgroup than of the breakthrough group (55.73% vs. 25.6%). The reinfection group was enriched with the 18-44 age group, female, non-Hispanic/Latino, and White/Caucasian patients, and California residents.
Patients who died within 90 days of their initial infection were excluded. Patients who died after 90 days were considered unlikely to still be infected from their initial case and were right-censored. The estimated survival functions (Figure 5A and 5B) for this cohort display an initial exponential decay followed by a linear decrease. The distribution of reinfection times itself also displayed approximately exponential behavior that asymptotically approaches a uniform distribution (Figure 5D). However, we only displayed cases with reinfection times up to 350 days, to allow comparison with the vaccine-induced immunity cohort. About 5% of the total cases fall between 350 and 600 days and are not displayed here but are included in the overall statistics and survival analysis. Including cases beyond 350 days gives more pronounced exponential behavior for both the distribution and the survival functions. Additionally, if we relax the assumption that reinfection occurs more than 90 days after the initial infection, we also see a more pronounced exponential distribution and survival function, which drastically reduces the mean and median reinfection times (Supplementary Figure 1). We find that the probability of avoiding reinfection is 0.995 at 180 days, 0.989 at 350 days, and 0.987 at 600 days. However, this applies only to patients who survived their initial infection. Adjusting for the mortality rate, these probabilities become 0.945 at 180 days, 0.939 at 350 days, and 0.937 at 600 days.
Discussion
In this study, we examined the immunity acquired through three vaccines (mRNA-1273, BNT162b2, JNJ-78436735) that have been administered in the United States by characterizing breakthrough cases in a large, vaccinated population across California, Oregon, Washington, Alaska, and Montana. We also conducted pairwise comparisons of the acquired immunity of the vaccines, matching individual vaccine cohorts on sex, ethnicity, age, and timing of the completion of the recommended initial dose(s). As of November 5th, 2021, 2,627,914 patients (mRNA-1273: 1,087,796; BNT162b2: 1,350,422; JNJ-78436735: 189,696) were fully vaccinated and 0.36% (N=9,321) tested positive after vaccination. All three vaccines showed a high probability of survival against breakthrough cases (mRNA-1273: 0.997, BNT162b2: 0.997, JNJ-78436735: 0.992 at 180 days). In terms of individual performance, two doses of mRNA-1273 were the most effective and a single dose of JNJ-78436735 was the least effective among the three administered immunizations (168, 155, and 130 median days to breakthrough for mRNA-1273, BNT162b2, and JNJ-78436735, respectively; P<0.05). These results remained robust in propensity score analyses. The present results are consistent with the evidence of previous studies (9–11).
Additionally, we examined the immunity acquired through prior COVID-19 infection for preventing reinfection. We found a median survival time of 162 days and a mean of 191 days, with an asymptotic probability of avoiding reinfection of 0.995 at 180 days, which is comparable to that of vaccine-induced immunity against breakthrough. These numbers must be compared with care, since the exclusion window is longer for reinfection (90 days after the first positive test) than for breakthrough (21 days after the last vaccine dose). There is also a substantially greater risk of mortality subsequent to SARS-CoV-2 infection than subsequent to vaccination. We adjust for this by multiplying the survival probability for reinfection by the probability of surviving the first 90 days after infection. This gives an adjusted probability of avoiding reinfection of 0.945 at 180 days.
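The adjustment described above is a single multiplication. The sketch below reproduces it from the rounded figures reported in the text (0.995 probability of avoiding reinfection at 180 days and 5.0% all-cause mortality within 90 days); the function name `adjust_for_mortality` is hypothetical.

```python
def adjust_for_mortality(p_no_reinfection, mortality_90d):
    """Probability of surviving the initial infection and avoiding reinfection."""
    return p_no_reinfection * (1.0 - mortality_90d)


mortality_90d = 0.05  # 90-day all-cause mortality reported in the abstract (5.0%)
adjusted_180 = adjust_for_mortality(0.995, mortality_90d)
print(round(adjusted_180, 3))  # 0.945
```

Because the inputs here are rounded, applying the same formula to the 350-day and 600-day values may differ from the reported figures in the third decimal place.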
Of note, people aged 18-44 years make up a significantly larger portion of the reinfected cohort than of the breakthrough cohort (55.7% vs. 25.6%). This demographic also made up over half of the total workforce in the US in 2020, which could contribute to increased exposure. Another interesting difference between the demographics is the incidence of comorbidities. There seems to be a larger proportion of patients with comorbidities in the vaccinated cohort than in the reinfected cohort. Because chronic comorbidities have been reported as a risk factor for more severe COVID-19, patients with these comorbidities may have been more likely than the general population to get vaccinated. Further, we found that the incidence rate of reinfection in the infection-induced immunity cohort was higher than the incidence rate of breakthrough in the vaccine-induced immunity cohort (0.9% vs. 0.4% of the eligible populations, respectively). This finding differs from that of a previous study (21), which suggested that the vaccine-induced immunity population had a six-fold higher risk of COVID-19 infection than the infection-induced immunity population. That study differed from ours in that it was conducted in Israel and BNT162b2 was the only vaccine administered. Additionally, genomic mutations of SARS-CoV-2 (22) are distributed differently according to geographic location and pandemic time window.
Interestingly, the survival curve for infection-induced immunity contains no inflection point (Figure 5), unlike the survival curve for breakthrough infection (Figure 3). Since these curves are cumulative distributions as a function of time, their inflection points correspond to extrema of the underlying probability distributions. For breakthroughs, the probability distributions are approximately normal, leading to a maximum, and therefore an inflection point, around the mean, whereas the distribution for infection-induced immunity is approximately exponential, which indicates that the probability of reinfection is highest towards the start of the interval. Exponential distributions possess a property called “memorylessness”. Formally, if T is survival time, this means Pr(T > t + a | T > a) = Pr(T > t), implying that the probability of being reinfected within a fixed number of days is always the same, regardless of how much time has already elapsed. This does not hold for vaccine-induced immunity, where the distribution is approximately normal, leading to a complementary error function for its survival function. In general, it is difficult to say whether Pr(T > t + a | T > a) is increasing or decreasing as a function of a.
Another interesting aspect of these distributions is their tails. Probability distributions must have a total measure of 1 and tend to zero at infinity. This is slightly counterintuitive in the current context, since one would expect the probability of reinfection to increase over time due to waning immunity. This could be attributed to i) a lack of data over an extended period of time and ii) a lack of access to positive tests outside of PSJH. The first arises because a majority of vaccinated individuals received their vaccine in the last year and may still get infected, especially as new variants spread through the population. As time progresses, we would expect the mean and median times to increase as patients in our cohort who have not been infected experience breakthroughs. As for the second, home and rapid tests make up a significant portion of all COVID-19 tests taken and are not accessible through PSJH. Still, the actual asymptotic survival probabilities are very high.
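The contrast between the two distribution shapes can be checked numerically. The sketch below evaluates Pr(T > t + a | T > a) = S(t + a) / S(a) for an exponential and a normal survival function. The rate of 0.01 per day is an arbitrary illustrative value, the normal parameters reuse the mRNA-1273 mean and std of days to breakthrough purely as an example, and the function names are hypothetical.

```python
import math


def exponential_survival(t, rate):
    """S(t) = exp(-rate * t) for an exponential time-to-event distribution."""
    return math.exp(-rate * t)


def normal_survival(t, mean, sd):
    """S(t) = 0.5 * erfc((t - mean) / (sd * sqrt(2))) for a normal distribution."""
    return 0.5 * math.erfc((t - mean) / (sd * math.sqrt(2)))


def conditional_survival(survival, t, a):
    """Pr(T > t + a | T > a) = S(t + a) / S(a)."""
    return survival(t + a) / survival(a)


def exp_s(u):
    return exponential_survival(u, rate=0.01)  # illustrative rate, not fitted


def norm_s(u):
    return normal_survival(u, mean=161.8, sd=55.4)  # example parameters only


for a in (0, 60, 120):
    # The exponential column stays constant; the normal column changes with a.
    print(a, conditional_survival(exp_s, 30, a), conditional_survival(norm_s, 30, a))
```

For the exponential case the conditional probability equals S(t) for every a, which is the memoryless property; for the normal case it depends on how much time has already elapsed.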
Strengths and Limitations
In this study, we characterized both the vaccine-induced immunity cohort and the infection-induced immunity cohort. Direct comparison between the two cohorts is unreasonable because it is difficult to establish the time at which patients in the infection-induced immunity cohort recovered from COVID-19 and gained immunity. We instead conducted parallel analyses of the vaccine-induced and infection-induced immunity cohorts.
We analyzed a large sample that represents the general population of the U.S. west coast. Previously published studies in the U.S. (9,23–27) are limited to cohorts with specific characteristics, such as health-care workers (6,15,23,27), patients with specific comorbidities (28–30), veterans (9,24,29,31), or patients of academic hospitals (32,33). PSJH is a community health system that covers five states of the western US in both rural and urban areas. Because our results on vaccine-induced immunity (mRNA-1273 > BNT162b2 > JNJ-78436735) are consistent with previous studies (9,10,15) conducted in cohorts with special characteristics, they add validity to those studies and extend their generalizability. To our knowledge, our study had the longest observed duration for time to breakthrough/reinfection, encompassing both waning vaccine- and infection-induced immunity (21,34,35) and evolving variants (36).
We performed pairwise propensity score matching analyses between individual vaccines. We had a total of 2,627,914 patients (mRNA-1273: 1,087,796; BNT162b2: 1,350,422; JNJ-78436735: 189,696). We matched on sex, age, ethnicity, geographical location, and time of the last dose of vaccination. Geographical location and the time of the last dose of vaccination were proxies for environmental exposures. We adjusted for time to address how vaccination eligibility changed over time based on type of job. High-risk critical workers were eligible for vaccination in the first quarter of 2021 (37–42). People who were able to work remotely were eligible in the second quarter of 2021 (37,39–43). Patients were also matched by state of residence, in consideration of geographic differences in policies and vaccination percentages.
A potential weakness of this study is that, although PSJH captures vaccination records through bidirectional linkage with state immunization registries, it does not necessarily have results for COVID-19 tests performed outside of the PSJH system. Future analyses would benefit from access to test results from these outside sources, as well as sequencing, titer level, and other deep immunophenotyping data (44,45). Two other limitations are that variant types are infrequently tracked in the EHR, and that we cannot precisely determine whether a reinfection as defined in this study is an actual reinfection, as opposed to an unusually long lingering initial infection. However, we chose a conservative threshold, 90 days after the first positive test, to determine the reinfection cases. Studies to date suggest that the maximum duration of SARS-CoV-2 RNA detection in upper respiratory tract specimens is 12 weeks (84 days) (46,47) after symptom onset and that reinfection does not occur within 90 days of first COVID-19 infection or illness (48). Moreover, a sensitivity analysis that included patients with reinfection times between 21 and 90 days drastically reduced the median reinfection time to 37 days, which is too short to be considered reinfection, and lowered the asymptotic survival probability from 0.987 to 0.952 (Supplementary Figure 1). As a note, our choice to use nearest neighbor matching with replacement on the propensity scores improves the average quality of matching, but because controls are used multiple times, the control group contains less information.
Additionally, our data are limited by incongruencies in rollout among the three vaccines. Initially, only healthcare and other essential workers had access to the vaccine, with distribution to the general public coming only later. This means that a majority of patients who received the vaccine have an upper bound on the follow-up time during which a breakthrough could be observed. As time progresses, these people may experience breakthroughs at later times, which would skew the distribution to the right, increasing the mean and median times reported. As such, the numbers reported in Figures 2, 3, 4 and 5 may increase over time as more people become eligible to be infected at later times.
Conclusion
Here we presented demographic information, survival functions, and probability distributions for patients experiencing vaccination breakthrough or post-infection reinfection within the PSJH network, showing that all three vaccines (BNT162b2, mRNA-1273, and JNJ-78436735) have a very low risk of breakthrough in the real world over 350 days. We also observed infection-induced immunity, though we did not make a direct comparison with vaccine-induced immunity for the reasons discussed above. However, the risks associated with COVID-19 infection are far greater than any marginal advantages acquired by prior infection.
Data Availability
All data produced in the present study are available upon reasonable request to the authors.
Conflict of Interest Disclosures
None of the authors have a conflict of interest with this study.
Funding/Support
This work was funded in part by the Swedish Medical Center Foundation.
Author contributions
JJH, VD, AMB, YH, SM conceptualized the study. SM, AMB were involved in the EHR data extraction, data cleaning, and codification. SM, AMB, YH performed data analysis including statistical and survival analysis. JJH supervised implementation and provided administrative and material support. AMB, YH, SM prepared the manuscript with critical revision of the manuscript for important intellectual content provided by JJH and JDG. All authors reviewed and approved the final version of the manuscript.
Acknowledgement
We are grateful to Providence St. Joseph Health for sharing their data engineering expertise and computational resources. We would also like to acknowledge SNOMED International for developing and maintaining SNOMED-CT.