Abstract
While antibodies provide significant protection from SARS-CoV-2 infection and disease sequelae, the specific attributes of the humoral response that contribute to immunity are incompletely defined. In this study, we employ machine learning to relate characteristics of the polyclonal antibody response raised by natural infection to diverse antibody effector functions and neutralization potency with the goal of generating both accurate predictions of each activity based on antibody response profiles as well as insights into antibody mechanisms of action. To this end, antibody-mediated phagocytosis, cytotoxicity, complement deposition, and neutralization were accurately predicted from biophysical antibody profiles in both discovery and validation cohorts. These predictive models identified SARS-CoV-2-specific IgM as a key predictor of neutralization activity whose mechanistic relevance was supported experimentally by depletion. Validated models of how different aspects of the humoral response relate to antiviral antibody activities suggest desirable attributes to recapitulate by vaccination or other antibody-based interventions.
Introduction
The SARS-CoV-2 pandemic has resulted in over 127 million cases, 2.7 million deaths, and unprecedented social, economic, and educational impact despite interventions that have included quarantines, shutdowns, social distancing, and masking requirements. However, the pandemic has also led to international collaborations working toward understanding the disease and developing novel therapeutics and vaccines. To date, these efforts have resulted in several novel therapies and several vaccines approved for widespread deployment under emergency use authorization (EUA)1.
The success of these vaccines is thought to result in no small part from the potent antiviral activities of the antibodies they induce. While reinfections have been documented2,3, seropositivity and levels of neutralizing antibody are associated with highly reduced rates of reinfection4-6, and passive transfer of plasma from convalescent donors has shown therapeutic efficacy in some studies7-14 but not others15-18. The inconsistent results with convalescent plasma studies suggest that the variables that contribute to passive antibody efficacy in polyclonal preparations are not completely understood. Additionally, built on strong preclinical data showing the ability of antibodies to prevent infection5, monoclonal antibody therapies have been developed, including combination products19,20. Each of the three vaccines currently under emergency use in the United States induces neutralizing antibodies, often to levels exceeding those detected following natural infection21-23.
However, whether elicited by vaccination or infection, antibody responses between individuals are highly variable24-27, both in titer and in composition. This variability suggests that monoclonal antibody and convalescent plasma therapy, as well as vaccine design, can be improved by determining the factors that contribute to a functionally protective antibody response. Beyond neutralization, which has been established as a correlate of protection in diverse studies20,28-30, evidence has accrued supporting both protective and pathogenic roles of antibody effector functions in infection resistance and disease severity. These functions include activities mediated by both soluble factors and diverse innate immune effector cell types. For example, initiation of the complement cascade can result in direct viral or infected cell lysis31, or modification of other activities including neutralization31,32. Similarly, antibodies can induce phagocytosis, drive release of cytotoxic factors such as perforin and granzyme B, or drive secretion of inflammatory mediators such as cytokines and reactive oxygen species32,33. In studies of SARS-CoV-2, extra-neutralizing functions have been shown to play an important role in antiviral activity of antibodies34-40. The importance of these functions has been defined in vivo in animal models using both Fc engineering to modulate binding of the Fc domain to Fcγ Receptors (FcγR) and depletion of effector cells. In contrast, in correlative studies some extra-neutralizing functions have also been linked to disease severity41,42. These findings suggest the importance of understanding the role of both neutralization and extra-neutralizing functions in antibody responses to SARS-CoV-2 infection.
Given these observations, better understanding of the relationship between the magnitude and character of the humoral immune response and diverse antibody activities may offer key insights to further the development of successful therapeutics and vaccines for SARS-CoV-2.
Results
Characterization of antibody responses following SARS-CoV-2 infection
Antibody functions, including neutralization assessed by either an authentic virus assay or a luciferase-based pseudovirus assay, antibody-dependent cell-mediated phagocytosis (ADCP) mediated by monocytes, deposition of the complement cascade component C3b (ADCD), and FcγRIIIa ligation as a proxy for NK cell mediated antibody dependent cellular cytotoxicity (ADCC) induced by antibodies in response to recombinant antigen were previously reported26 for a set of convalescent samples collected from a discovery cohort of 126 eligible convalescent plasma donors from the Baltimore/Washington D.C. area (Johns Hopkins Medical Institutions, JHMI)27 and serum samples from 15 naïve controls and a validation cohort of 20 convalescent subjects from New Hampshire (Dartmouth-Hitchcock Medical Center, DHMC)43 (Supplemental Table 1). Biophysical antibody features were defined by a customized multiplexed Fc array assay that characterizes both variable fragment (Fv) and Fc domain attributes across a panel of SARS-CoV-2 antigens, consisting of: nucleocapsid (N) protein, stabilized (S-2P)44 and unstabilized trimeric spike protein, spike subdomains including S1 and S2, the receptor binding domain (RBD), and the fusion peptide (FP) from SARS-CoV-2; in addition, the panel included diverse pathogenic, zoonotic, and endemic coronavirus spike proteins and subdomains. Influenza hemagglutinin (HA) and herpes simplex virus glycoprotein E (gE) were evaluated as controls. The Fc domain characteristics evaluated for each antigen specificity included antibody isotype, subclass, and propensity to bind Fc receptors (FcRs) (Supplemental Table 2).
To understand how the different facets of the Ab response relate to one another, hierarchical clustering was performed on the biophysical antibody profiles of convalescent plasma donors (JHMI) and compared to the serum profiles of SARS-CoV-2 naïve subjects. Extensive variability in the SARS-CoV-2-specific Ab response magnitude and character was noted (Figure 1A). High levels of IgG were observed in many individuals, particularly those who had been hospitalized, while a small number of convalescent donors appeared not to seroconvert despite documented infection via nucleic acid amplification. Similarly, there was considerable variability in the IgA and IgM responses in SARS-CoV-2-convalescent subjects. IgG2, IgG4, and IgD responses were less commonly observed. Distinctions in antibody responses between subjects were apparent among antigen specificities. For example, perhaps due to its high homology with endemic CoV, FP responses were isotype switched, consistent with an anamnestic response, whereas IgM responses to S were reliably observed.
A weighted network plot depicting Pearson’s correlation coefficients between Fc array features and functional measurements was created to elucidate correlative relationships more directly between aspects of humoral responses (Figure 1B). As was apparent in the heatmap (Figure 1A), features were often more strongly grouped by Fc domain characteristics than antigen-specificity. Nodes representing antibody effector functions clustered more tightly with RBD-, S1-, S- and N-specific FcγR-binding levels, IgG3, and total IgG responses than with IgG1 responses or those directed at S2 or FP. Though most closely linked to IgG-associated features, neutralization potency appeared as a hub that connected to IgA and IgM responses. Based on both hierarchical clustering and correlation analysis (Figure 1B), the ability of antigen-specific antibodies to interact with diverse FcγR was well correlated to multiple antibody effector functions.
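The edge list behind a correlation network of this kind can be sketched as follows. This is a minimal illustration in standard-library Python, not the authors' R code; the feature names and values are hypothetical, and the 0.5 cutoff on Pearson's r is the threshold stated in the Materials and Methods.

```python
import math

def pearson(x, y):
    """Pearson's correlation coefficient between two equal-length sequences."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        return 0.0  # a constant measurement carries no correlation information
    return cov / math.sqrt(var_x * var_y)

def correlation_edges(measurements, threshold=0.5):
    """Return (name_a, name_b, r) for every pair of measurements with r >= threshold."""
    names = sorted(measurements)
    edges = []
    for i, name_a in enumerate(names):
        for name_b in names[i + 1:]:
            r = pearson(measurements[name_a], measurements[name_b])
            if r >= threshold:
                edges.append((name_a, name_b, r))
    return edges

# Hypothetical values for four subjects; these are NOT data from the study.
example = {
    "RBD_FcgR3a": [1.0, 2.0, 3.0, 4.0],
    "ADCC": [1.1, 1.9, 3.2, 3.8],
    "HA_IgG": [2.0, 1.0, 2.0, 1.0],
}
edges = correlation_edges(example)
```

Each returned tuple corresponds to one weighted edge; in this toy input only the ADCC and RBD_FcgR3a pair passes the cutoff, so the control-antigen node would remain unconnected.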
Multivariate modelling methods to predict functional responses
With the dual goals of better understanding the humoral response features that may drive complex antibody functions and enabling robust predictions from surrogate measures, we applied supervised machine learning methods to this (JHMI) dataset, while using the DHMC cohort as validation to determine whether the models could predict activity in a generalized manner. A regularized generalized linear modeling approach trained to utilize Fc Array features to predict each antibody function with minimal mean squared error was selected based on prior success in identifying interpretable factors that contribute to functional activity while avoiding overfitting45. Five-fold cross-validation was employed to evaluate generalizability within the JHMI cohort, and comparison to models trained on permuted functional data established model robustness (Figure 2A). The cross-validated models trained on diverse data subsets showed similar accuracy (measured by mean squared error) when applied to held out subjects as when used to predict effector function and neutralization activity that was observed in the validation cohort (DHMC). Model quality was also evaluated in terms of the degree of correlation between predicted and observed activity for a representative cross-validation replicate, allowing for better visualization of model performance (Figure 2B).
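The evaluation logic described here, held-out error under five-fold cross-validation compared against the same procedure run on permuted outcomes, can be illustrated with a small sketch. Everything below is an assumption made for illustration: the data are synthetic, and a one-feature least-squares fit stands in for the regularized model actually used in the study.

```python
import random

def k_fold_indices(n, k, rng):
    """Shuffle subject indices and deal them into k roughly equal folds."""
    order = list(range(n))
    rng.shuffle(order)
    return [order[i::k] for i in range(k)]

def fit_line(x, y):
    """Ordinary least squares with one feature (stand-in for the penalized model)."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxx = sum((a - mean_x) ** 2 for a in x)
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    slope = sxy / sxx if sxx else 0.0
    return slope, mean_y - slope * mean_x

def cross_validated_mse(x, y, k=5, seed=0):
    """Mean squared error over subjects, each predicted while held out."""
    rng = random.Random(seed)
    squared_errors = []
    for fold in k_fold_indices(len(x), k, rng):
        held_out = set(fold)
        train = [i for i in range(len(x)) if i not in held_out]
        slope, intercept = fit_line([x[i] for i in train], [y[i] for i in train])
        squared_errors += [(y[i] - (slope * x[i] + intercept)) ** 2 for i in fold]
    return sum(squared_errors) / len(squared_errors)

def permuted_mse(x, y, k=5, seed=0):
    """Same evaluation after shuffling outcomes, which breaks any real association."""
    rng = random.Random(seed)
    shuffled = list(y)
    rng.shuffle(shuffled)
    return cross_validated_mse(x, shuffled, k, seed)

# Synthetic example: one informative feature plus noise (not study data).
data_rng = random.Random(1)
feature = [data_rng.gauss(0, 1) for _ in range(60)]
outcome = [2.0 * value + data_rng.gauss(0, 0.3) for value in feature]
observed_error = cross_validated_mse(feature, outcome)
permuted_error = permuted_mse(feature, outcome)
```

A model that has learned a real relationship should show a clearly lower held-out error on the observed outcomes than on the permuted ones, which is the comparison the permutation control is meant to make.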
The model consistently selected a subset of features for each function (Figure 3A). The features that appeared with high frequency in repeated modeling were likely to have relatively high coefficients, and inversely, biophysical features with relatively small coefficients were prone to be influenced by the selected sample subset and to be removed by chance across the replicates. Collectively, the frequently contributing features were exclusively related to spike recognition and were primarily driven by IgG and FcγR-binding antibodies.
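Selection frequency across repeated models can be tallied as in the sketch below. The coefficient dictionaries are hypothetical placeholders for the output of repeated fits, and the feature names are invented for illustration.

```python
from collections import Counter

def selection_frequency(coefficient_sets):
    """Fraction of repeated models in which each feature has a nonzero coefficient."""
    counts = Counter()
    for coefficients in coefficient_sets:
        counts.update(name for name, value in coefficients.items() if value != 0.0)
    total = len(coefficient_sets)
    return {name: count / total for name, count in counts.items()}

# Hypothetical coefficients from four replicate models (not study results).
replicates = [
    {"RBD_FcgR3a": 0.8, "S1_IgG3": 0.10, "OC43_IgA": 0.0},
    {"RBD_FcgR3a": 0.7, "S1_IgG3": 0.0, "OC43_IgA": 0.0},
    {"RBD_FcgR3a": 0.9, "S1_IgG3": 0.05, "OC43_IgA": -0.02},
    {"RBD_FcgR3a": 0.8, "S1_IgG3": 0.0, "OC43_IgA": 0.0},
]
frequency = selection_frequency(replicates)
```

In this toy input the feature with the largest coefficients is selected in every replicate, while features with small coefficients drop in and out, mirroring the pattern described above.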
To evaluate the magnitudes of feature contributions, a representative model for each function demonstrating the identity and relative coefficients of the contributing features is presented (Figure 3B). Again, despite their sparseness compared to the control antigen, endemic CoV, and other epidemic CoV features, these models relied almost exclusively on antibody responses to the SARS-CoV-2 spike. Consistent with the experimental approach evaluating functions elicited specifically against RBD, ADCC and ADCP models depended principally on antibodies specific to RBD or more broadly to S1. In contrast, the lead feature for virus neutralization was recognition of stabilized spike (S-2P). Similarly, complement deposition against whole spike was best predicted by a single feature related to spike trimer recognition. Responses to the S2 domain were not observed to contribute to functional predictions. Intriguingly, IgA responses against other CoV were observed to make inverse contributions to ADCC predictions. While these contributions were of small magnitude, this result suggests the possibility that cross-reactive, potentially S2-specific IgAs may inhibit the activity of S-reactive IgGs, as has been observed in the context of the HIV envelope glycoprotein46.
Beyond specificity, distinct antibody Fc characteristics contributed to model predictions. The most frequent Fc characteristic of features contributing to the final model of neutralization potency was the magnitude of IgG response, consistent with neutralization being FcR-independent. In contrast, the most frequent Fc characteristics in modeling ADCC and ADCP were FcγRIII- and FcγRII-binding responses, respectively – the receptors most relevant to each function. Further, despite comprising a relatively small fraction of circulating IgG, but consistent with its enhanced ability to drive effector functions47,48, IgG3 antibodies specific to RBD made a substantial contribution to models of both ADCP and ADCC activity, suggesting the potential importance of this subclass. Intriguingly, S1-specific IgM contributed to models of neutralization potency. IgM is typically associated with initial exposures49, and our data suggest the possibility that this feature represents de novo rather than recalled cross-reactive lineages that may exhibit superior neutralization activity, as has been observed in the context of influenza responses50,51. Overall, while functions were predicted with differing degrees of accuracy, each generalized well to the independent validation cohort and relied upon features with established biological relevance.
Experimental validation of predictive models of antibody function
Given the somewhat surprising appearance of an IgM feature in predictions of neutralization activity, we sought to evaluate the mechanistic relevance of this isotype in particular. In a select group of individuals with both high IgM and neutralization levels (n=11), IgM was depleted from serum to determine whether the loss of CoV-2-specific IgM resulted in a reduction in neutralization of SARS-CoV-2 (Supplemental Figure 1). Both total (not shown) and RBD-specific IgM were depleted (97-fold) (Figure 4A). Minimal effects on total (not shown) and RBD-specific IgG (2.0-fold) and IgA (2.3-fold) levels were observed in the IgM-depleted samples. Following IgM depletion, samples showed 1.6- to 73-fold decreases in neutralization titer (Figure 4B). Though the magnitude of changes in Ig levels and neutralization before and after depletion varied per donor, only IgM and not IgG or IgA levels showed a statistically significant correlation with neutralization titer in these individuals (Figure 4C). This result demonstrates that mechanistically relevant features can be discovered from unbiased data analysis and modeling processes.
Discussion
It is now well established that SARS-CoV-2-specific antibodies can drive varied antiviral functions beyond neutralization34,43,52. These responses have been less well characterized, but accumulating evidence suggests their importance to protection from infection and disease. Both ADCC and phagocytosis have been reported to contribute to antibody-mediated antiviral activity against other coronaviruses53-55. Collectively, these functions have been suggested to play an important role in defense against SARS-CoV-2; they have been implicated in in vivo protection in diverse studies, including passive transfer studies that have demonstrated that effector functions play a role in the antiviral activity of monoclonal antibodies and correlates of protection analysis carried out on vaccine candidates34-40. Fc engineering approaches that either knocked out or enhanced antibody effector functions and studies of the depletion of effector cells in the context of diverse antibodies have provided convincing evidence of the mechanistic relevance of these observations.
In this work, antibody functions measured in two cohorts of convalescent subjects were modeled using biophysical antibody profiles comprised of tandem attributes representing Fv-specificity and Fc characteristics. Multivariate linear regression identified distinct biophysical features that predicted antibody functions such as ADCC, ADCP, ADCD, and neutralization, showing the unique dependencies of each activity on different aspects of humoral responses. Although responses toward both endemic and pathogenic CoV were considered, models were almost exclusively reliant on SARS-CoV-2-specific responses in predicting functional activity. These predictions were robust and generalizable, performing similarly well in training and testing data subsets across cross-validation runs as in an independent validation cohort. The consistency between antibody features contributing to each modeled function and expected biological relevance suggests that modeling approaches such as that employed here can identify mechanisms of antibody activity, as has been observed in other studies56-58.
Spike-specific FcγR-binding antibodies made frequent contributions to models of effector functions, with FcγRIIa contributing strongly to phagocytosis and FcγRIIIa contributing strongly to NK cell activity. Among subclasses, IgG3 made an outsized contribution, consistent with prior studies in the context of other infections59-61, and monoclonal antibody subclass-switching studies47,48. In contrast, virus-specific IgM contributed to predictions of neutralization activity. SARS-CoV-2-specific IgM has attracted interest because of its association with lower risk of death from COVID-1924. Consistent with our experimental results, another study in which IgM was selectively depleted also observed a resulting reduction in neutralization activity, but additionally confirmed the activity of the isolated IgM fraction62,63. Interestingly, SARS-CoV-2-specific IgM administered intranasally has been shown to be effective in treating novel SARS-CoV-2 variants of concern, including the alpha, beta, and gamma variants, in a mouse model64. The finding that so much of the neutralizing activity of convalescent plasma against SARS-CoV-2 resides in the IgM fraction raises concern that gamma globulin preparations may lose much of their antiviral activity as this isotype is removed. Similarly, the faster clearance profile of IgM as compared to IgG may hold implications for both frequency of dosing and timing of plasma donation.
While features contributing to functional predictions have both prior support from other studies and experimental validation within this cohort, other feature sets are likely to provide similar performance. Given high feature dimensionality and relatively fewer subjects, regularization was used to increase the quality of prediction. This approach simplified the resulting models, resulting in improved interpretability of the selected variables at the cost of eliminating features that are highly correlated to selected variables in the established model. Collectively, this modeling choice can result in a trade-off between model simplification and obscuring potential biological mechanisms. Other limitations include the use of surrogate functional assays that bear advantages in terms of throughput and reproducibility but pose limitations in terms of their biological relevance. As further functional assays reliant on free virions and infected cells are developed, it will be of interest to compare and contrast both the degree of correlation with these convenient proxy assays as well as to model those activities in pursuit of insights into unique subpopulations of antibodies that may be responsible for their induction, or to define general characteristics of a response that is highly polyfunctional.
As viral variants continue to emerge, rapid binding profiling may be an important complement to functional breadth assessments. Insights into how Fc characteristics of cross-reactive responses relate to diverse functions may provide accelerated insights into population-level susceptibility and support prioritization among candidate vaccine regimens. Numerous randomized clinical trials of convalescent plasma for COVID-19 are in the process of completion and it is likely that plasma remnants will be available for retrospective detailed serological analysis and correlation with clinical outcome15. This multivariate analysis provides a blueprint for carrying out such investigation, which could provide information on the antibody functions that contribute to clinical efficacy. The discovery of antibody functions associated with passive antibody efficacy could allow optimization of serological characteristics of mAbs, plasma and gamma globulin products for prevention and therapy of COVID-19.
Materials and Methods
Human subjects
The discovery cohort comprised 126 adult eligible convalescent plasma donors diagnosed with SARS-CoV-2 infection by nucleic acid amplification in the Baltimore, MD and Washington DC area (Johns Hopkins Medical Institutions, JHMI cohort) and has been previously described27. The validation cohort comprised 20 SARS-CoV-2 convalescent individuals from the Hanover, New Hampshire area (Dartmouth Hitchcock Medical Center, DHMC cohort)43. Infection with SARS-CoV-2 was confirmed in all convalescent subjects by nasopharyngeal swab PCR. Plasma (JHMI) or serum (DHMC) was collected from each donor approximately one month after symptom onset or first positive PCR test in the case of mild or asymptomatic disease. Samples from 15 naïve subjects collected from the Hanover, New Hampshire area served as negative controls. Supplemental Table 1 provides basic clinical and demographic information for each cohort.
Human subject research was approved by both the Johns Hopkins University School of Medicine’s Institutional Review Board and the Dartmouth-Hitchcock Medical Center Committee for the Protection of Human Subjects. All participants provided written informed consent.
Antibody features and functions
The magnitude, Fv specificity, and Fc domain characteristics of antibody responses to diverse coronavirus and control antigens were profiled by multiplexed Fc Array assay22, as previously described26,43,65. Supplemental Table 2 reports the complete list of antigen specificities and Fc domain characteristics that were assayed. Fc Array data reported in median fluorescent intensity (MFI) was log transformed prior to analysis.
Antibody functions were assayed as previously described26,43. Briefly, neutralization of authentic virus27,66,67 was determined for samples from the JHMI cohort, whereas a pseudovirus neutralization assay68 was employed for evaluation of the DHMC cohort. Phagocytic activity was defined as the level of uptake of antigen-conjugated beads by THP-1 monocytes (ADCP)69,70 or primary neutrophils (ADNP)71. ADCC activity was modeled using a reporter cell line that expresses luciferase in response to FcγRIIIa ligation72. Antibody-dependent complement deposition was assessed by measuring C3b levels on antigen-conjugated beads following incubation in complement serum73. For each assay, SARS-CoV-2 naïve samples were employed as negative controls, and data were collected in replicate.
IgM depletion
IgM was depleted from serum as described previously62. Briefly, 200 μL of NHS HP SpinTrap resin (Cytiva) was equilibrated and used to immobilize anti-human IgM (μ-chain specific, Sigma I0759) at 850 μg/mL for 30 minutes with end-over-end mixing at room temperature. The resin was washed, quenched with 50 mM Tris HCl, 1 M NaCl pH 8.0 and 0.1 M sodium acetate, 0.5 M NaCl pH 4, and incubated with serum diluted 1:5 in DMEM overnight at 4°C with end-over-end mixing. Flow-through was subsequently collected by centrifugation. IgG, IgA, and IgM levels of each selected sample were evaluated with and without IgM depletion by multiplex assay as described above26,43,74. Neutralization was measured by pseudovirus reporter assay as described above68.
Data analysis and visualization
Basic analysis and visualization were performed using GraphPad Prism. Heatmaps, correlation plots, and other graphs were generated in R (supported by R packages pheatmap75, igraph76, and ggplot277). Fc Array features were filtered by elimination of features for which the samples exhibited signal within 10 standard deviations (SD) of the technical blank. Log transformed SARS-CoV-2-related Fc Array features and selected functions were scaled and centered by their standard deviation from the mean (z-score) per cohort and visualized following hierarchical clustering according to Manhattan distance. A weighted correlation network of pairs of SARS-CoV-2-related features and selected functions for which Pearson’s correlation coefficient ≥0.5 was graphed.
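The scaling and distance steps can be sketched as below. This is a hedged illustration rather than the authors' R pipeline: the base-10 logarithm and the sample (n-1) standard deviation are assumptions, since the text specifies only "log transformed" and z-scoring, and the feature names and MFI values are hypothetical. The resulting distance matrix is what a hierarchical clustering routine would take as input.

```python
import math

def z_score(values):
    """Center on the mean and scale by the sample standard deviation (n - 1)."""
    mean = sum(values) / len(values)
    sd = math.sqrt(sum((v - mean) ** 2 for v in values) / (len(values) - 1))
    return [(v - mean) / sd if sd else 0.0 for v in values]

def manhattan(a, b):
    """Manhattan (city-block) distance between two subject profiles."""
    return sum(abs(p - q) for p, q in zip(a, b))

def subject_distance_matrix(mfi_by_feature):
    """Log-transform, z-score each feature, then compare subjects pairwise."""
    scaled = {
        name: z_score([math.log10(v) for v in values])  # log10 is an assumption
        for name, values in mfi_by_feature.items()
    }
    names = sorted(scaled)
    n_subjects = len(scaled[names[0]])
    profiles = [[scaled[name][i] for name in names] for i in range(n_subjects)]
    return [[manhattan(p, q) for q in profiles] for p in profiles]

# Hypothetical MFI values for three subjects (not study data).
mfi = {
    "S_IgG": [100.0, 1000.0, 10000.0],
    "RBD_IgM": [10.0, 100.0, 1000.0],
}
distances = subject_distance_matrix(mfi)
```

Scaling each feature before computing distances keeps high-signal features from dominating the clustering purely because of their dynamic range.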
Multivariate linear regression was employed to predict antibody functions based on biophysical features with the R package “glmnet”78, as previously described56-58. Regularization by L1-penalization (LASSO) was applied to eliminate variables that were less relevant to the outcome by imposing a penalty on the absolute value of the feature coefficient in order to reduce overfitting and reinforce performance generalizability79. Functional measurements of ADCP, neutralization, and S1-specific ADCD were log10 transformed to reduce the prediction error of the models based on the assumption that better fitting models were more likely to rely on biologically relevant features. The lambda parameter (λ) was tuned using five-fold cross-validation to minimize mean squared error (MSE). Modeling was repeated 200 times to investigate the potential of different combinations of the biophysical features for modeling. A final model was established in the JHMI cohort, selected based on the median MSE obtained across the repeated runs in that cohort. The selected features and their coefficients were reported at a value of λ at which median model performance fell one standard error above the minimum to optimize the generalizability and provide more regularization to the model. In the permutation test procedure, the penalized multivariate regression was performed against randomized functional outcomes in the JHMI cohort, repeated 200 times. The correlation network was constructed from the biophysical features that were repeatedly selected within the repeated modeling process.
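To make the L1 penalty and the one-standard-error choice of λ concrete, the sketch below implements a LASSO fit by coordinate descent and a helper that picks the largest λ whose cross-validated error lies within one standard error of the minimum. It is a standard-library Python stand-in for illustration, not the glmnet implementation used in the study; the synthetic data, function names, and λ grid values are all hypothetical.

```python
import random

def soft_threshold(value, penalty):
    """Shrink toward zero by `penalty`; values inside the band become exactly 0."""
    if value > penalty:
        return value - penalty
    if value < -penalty:
        return value + penalty
    return 0.0

def lasso_coordinate_descent(X, y, lam, n_sweeps=200):
    """Minimize (1/2n) * sum((y - b0 - X.b)^2) + lam * sum(|b|)."""
    n, p = len(X), len(X[0])
    intercept = sum(y) / n
    coefs = [0.0] * p
    residual = [value - intercept for value in y]
    col_sq = [sum(X[i][j] ** 2 for i in range(n)) / n for j in range(p)]
    for _ in range(n_sweeps):
        for j in range(p):
            if col_sq[j] == 0.0:
                continue
            # Covariance of feature j with the residual that excludes its own term.
            rho = sum(X[i][j] * (residual[i] + coefs[j] * X[i][j]) for i in range(n)) / n
            updated = soft_threshold(rho, lam) / col_sq[j]
            delta = updated - coefs[j]
            if delta != 0.0:
                for i in range(n):
                    residual[i] -= delta * X[i][j]
                coefs[j] = updated
        shift = sum(residual) / n  # unpenalized intercept update
        intercept += shift
        residual = [r - shift for r in residual]
    return intercept, coefs

def lambda_one_se(lambdas, mean_mse, se_mse):
    """Largest lambda whose mean CV error is within one SE of the minimum error."""
    best = min(range(len(lambdas)), key=lambda i: mean_mse[i])
    limit = mean_mse[best] + se_mse[best]
    return max(lam for lam, mse in zip(lambdas, mean_mse) if mse <= limit)

# Synthetic data: only the first of three features drives the outcome.
rng = random.Random(0)
X = [[rng.gauss(0, 1) for _ in range(3)] for _ in range(50)]
y = [3.0 * row[0] + rng.gauss(0, 0.1) for row in X]
intercept, coefs = lasso_coordinate_descent(X, y, lam=0.5)
_, coefs_heavy = lasso_coordinate_descent(X, y, lam=100.0)
chosen = lambda_one_se([1.0, 0.5, 0.1, 0.01], [2.0, 1.05, 1.0, 1.2], [0.2, 0.1, 0.1, 0.15])
```

The soft-threshold step is what sets weakly informative coefficients exactly to zero, which is why correlated features can be dropped in favor of a single representative, a trade-off noted in the Discussion.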
Data Availability
Data and code to reproduce analyses are available at (link pending).
Author Contributions
Contributed samples – E.M.B., R.S., O.A., A.A.R.T., D.S., S.S., P.F.W.
Collected experimental data – H.N., A.R.C., S.E.B., K.L., R.S., O.A., W.W.-A., A.P., R.I.C.
Performed data analysis – H.N., S.X., R.I.C.
Drafted the manuscript – H.N., S.X.
Reviewed and edited the manuscript – all authors
Supervised research – M.E.A., P.F.W., A.D.R., A.A.R.T., A.C., A.P.
Conceived of work – M.E.A.
Conflict of Interest
The authors declare that they have no conflict of interest.
Supplemental Figures and Tables
Acknowledgements
The authors would like to thank the participants who generously agreed to contribute to this study, as well as the full study team. This work was supported in part by the Division of Intramural Research, NIAID, NIH.