Abstract
Convalescent plasma (CP) recurs as a frontline treatment in epidemics because it is available as soon as there are survivors. The COVID-19 pandemic represented the first large-scale opportunity to shed light on the mechanisms of action, safety and efficacy of convalescent plasma using modern evidence-based medicine approaches. Studies ranging from observational case series to randomized controlled trials (RCTs) have reported highly variable efficacy results for COVID-19 CP (CCP), resulting in more doubt than certainty. Reasons for CCP success and failure may be hidden in study details, which are usually difficult to explain to physicians and the public but provide fertile ground for designing next-generation studies. In this paper we analyzed variables associated with efficacy such as clinical settings, disease severity, CCP SARS-CoV-2 antibody levels and function, dose, timing of administration (variously defined as time from onset of symptoms, molecular diagnosis, diagnosis of pneumonia, or hospitalization, or by serostatus), outcomes (defined as hospitalization, requirement for ventilation, clinical improvement or mortality), CCP provenance and time for collection, and criteria for efficacy. Focusing only on the results from the 23 available RCTs, we noted that these were more likely to show signals of efficacy, including reductions in mortality, if the plasma neutralizing titer was ≥ 160 and the time to randomization was ≤ 9 days, consistent with passive antibody therapy efficacy requiring dosing with sufficient antibody. The fact that most studies revealed signals of efficacy despite variability in CCP and its use suggests robust therapeutic effects that become apparent despite the data noise.
Introduction
In the first 21 years of the 21st century humanity has experienced six major epidemics. The agents involved were SARS-CoV, MERS, influenza A(H1N1), Ebola, Zika and SARS-CoV-2 viruses. For five of these outbreaks the response included the use of convalescent plasma (CP) (reviewed in (1, 2)) and it was considered for the sixth (Zika virus). The attraction of CP is that it is readily available as soon as there are convalescing survivors, that unlike drugs or monoclonal antibodies it needs no development, and it is polyclonal, cheap and deployable even in resource poor countries. CP has been proposed as a first line response to new pandemics (3) and was deployed during the COVID-19 pandemic in March 2020 in countries that experienced the early waves of disease such as China (4, 5) and Italy (6).
While in early 2020 most clinical use was reported in case series or small phase II clinical trials (7), beginning in late March 2020 the US expanded access program (EAP) generated a large and robust treatment dataset, with insights on safety and optimal use. This database provided the first clear evidence that CP is safe, which was important given that early in the pandemic there were significant concerns about antibody-dependent enhancement (8). Later, an analysis of the first 3082 patients within the EAP database provided evidence that associated early administration of high-titer CCP to non-ventilated hospitalized patients with reduced mortality (9). Before the FDA granted emergency use authorization (EUA), the US EAP provided CCP to as many as 94,287 patients. During the past year, many studies employing either randomized controls (RCT) or propensity score-matched (PSM) controls have been published. RCTs and PSM studies reported so far have had largely opposite outcomes, with most but not all RCTs finding little overall effect on mortality, while the PSM studies and many smaller trials reported mortality benefits. Several RCTs did not have mortality as a primary endpoint or it was part of a composite endpoint (5, 10-12). These disparate results have led to confusion for both the public and clinicians, leading to reduced enthusiasm for the use of CP, in part because RCT data are more influential in shaping the opinion of many physicians, specialty societies and government regulators.
As with any other medical treatment, several key factors should be taken into account when evaluating a trial, including the indication (which can be estimated by timing or clinical severity), the therapeutic dose and the intended outcomes. The choices made by the trial designers determine whether the trial will demonstrate or conceal clinical benefit. While much attention is appropriately focused on the performance features of clinical trials (sample size, fidelity to randomization, appropriate analysis), the biological rationale for the hypothesis being tested is critically important but not always taken into account.
Methods
On September 7, 2021, we searched PubMed (which also indexes the medRxiv preprint server) for clinical trials of CCP in COVID-19, focusing on RCTs and PSM studies only. Each study was analyzed for the following variables: NCT identifier, recruitment, randomization strategy, type of control arm, baseline patient status, median neutralizing antibody (nAb) titer in both recipients (before CCP transfusion) and CCP units, type of viral neutralization test (VNT), primary endpoint, signals of efficacy, and reasons for failure.
On the same date, the ClinicalTrials.gov database was searched for CCP RCTs worldwide with a status of “completed”, “active, not yet recruiting” or “recruiting”.
Results
PubMed search retrieved 23 RCTs and 12 PSM studies about CCP, whose main variables are summarized in Tables 2 and 3. The characteristics of the VNTs used are summarized in Table 1. The variables were reconciled in 4 major topics, discussed in the following sections: the indication, the therapeutic doses, the relevance of CCP to the viral variant, and the intended outcome.
ClinicalTrials.gov search retrieved 8 CCP RCTs completed but not yet prepublished or published, 7 active but not yet recruiting RCTs, and 10 RCTs which are still recruiting (summarized in Table 4).
The indication
While it would be desirable to have a single drug that works at any disease stage, it was not reasonable to expect a silver bullet effect from neutralizing antibody-based treatments such as CCP in later stages of disease. COVID-19 is now well-defined as a disease with two stages, an initial viral phase characterized by flu-like and upper and lower respiratory symptoms, followed, in severe cases, by an inflammatory phase that is characterized by inflammation-driven damage to multiple organ systems, including the lungs that can impair gas exchange and cause life-threatening hypoxia and damage to multiple organs, including the brain and blood vessels (13). Specific intact antibodies in CCP are expected to neutralize SARS-CoV-2 in the intravascular system and, in some patients, prevent progression from early to severe and life-threatening disease (as seen in animal models (14)), but this antiviral therapy cannot be expected to reverse the inflammatory phase of the disease, nor neutralize infectious viruses invading the extravascular system. Thus, COVID-19 is similar to influenza, a disease in which antivirals are effective early in disease but have no effect in later stages when the symptomatology stems largely from the inflammatory response. The rationale for administering CCP as early as possible in the course of COVID-19 stems from the neutralization stoichiometry itself: the larger the number of actively replicating virions in the body, the higher the nAb dose needed for neutralization (15). Some uncontrolled studies have reported a lack of association between early intervention and outcomes (16, 17), but in these studies the level of neutralizing antibody (nAb) or the overall anti-Spike antibody level in the infused CCP was unknown, leaving room for alternative explanations.
At the beginning of the pandemic, some investigators and opinion leaders, riding the wave of CCP successes in anecdotal reports in the media and small case series, introduced CCP to the general public as a panacea for any patient with COVID-19, including life-threatening cases, leading to confusing messaging: after reports of failure in severely ill patients emerged, opinions became polarized and the debate became anything but scientific (18). In clinical trials, the indication (i.e., the baseline clinical setting) has been variously defined by patient status (outpatient vs. presenting to the emergency room vs. hospitalized vs. ICU-admitted), disease severity (using a 5-category COVID-19 Outpatient Ordinal Outcome Scale (19), a 6-category ordinal scale (12), a 7-category COVID-19 severity scale (20), the WHO 8- (21) or 11-category (22) ordinal scales, or pneumological scores such as SOFA), the time elapsed before recruitment (also variably defined as from molecular diagnosis, from onset of hospitalization, from diagnosis of pneumonia, or from onset of symptoms), or by serological status (presence of antibodies or the ability to neutralize SARS-CoV-2). This variability in inclusion criteria has resulted in marked heterogeneity among recruited patients.
An additional complexity in recruitment to CCP trials is time to treatment. Clinical trials involve administrative requirements and consent procedures, and recruitment to an RCT further requires randomization, which may produce delays in treatment. CCP therapy requires matching on blood type, ordering the CCP, which may or may not be available on site, and setting up the transfusion. This inherent delay from randomization to infusion means that RCTs may build in a disadvantage for the CCP study arm, where controls may have received treatment earlier in the disease course (as, for example, in the C3PO trial (23)). ABO-compatible CCP units may not be readily available at the local blood bank, and recruited patients may have to wait for a compatible unit of CCP. These almost inevitable delays from randomization mean that CCP may be provided later in the illness than is ideal, and even if the trial intends to treat early, in practice it may not be possible.
During a pandemic, moreover, delays in treatment are magnified. The accrual of severely ill patients in emergency departments and the overwhelmed or even collapsed health care systems can create long delays from arrival in the emergency room to treatment. In the absence of quick (antigenic or molecular) tests for SARS-CoV-2, the turnaround time for final confirmation of diagnosis with PCR, which must often be run in batches, can take several hours. All of these factors are likely to impact the efficacy of CCP treatment. To shorten such time, fully screened CCP collected from eligible donors (24) could be safely administered within emergency departments shortly after admission and even before the patient reaches the ward.
The therapeutic dose
Determining the effective dose of CCP is difficult in a pandemic because the antibody assays and other tests needed to assess the potency of any antibody product take time to be developed. In practice, the effective dose is the product of multiple factors, none of which is fully standardized. The first factor is the concentration of the nAbs as measured by a VNT. At the beginning of the pandemic, only a few BSL-3 (or higher)-equipped virology laboratories could run VNT using authentic live SARS-CoV-2 virus: the procedure was time-consuming (3-5 days) and the reports were operator-dependent. Nowadays, the availability of Spike-pseudotyped viruses, which can be handled in the more widely available BSL-2 laboratories, or cell-free ACE-2 competition assays, combined with automated (e.g., luminescence-based) readings, has standardized outcomes and shortened turnaround times (25): however, harmonization between different assays is still a work in progress (26). The VNT differs according to the type of replication-competent cell line, the viral isolate used for the challenge (which is critically important when the virus is mutating rapidly, as has been the case with the emergence of variants of concern), the multiplicity of infection (i.e., the ratio between the viral inoculum, which is reported in different units of measurement, and the number of replication-competent cells within each well), the detection system (optic microscopy for cytopathic effect, immunostaining, quantitative PCR, or luminometer for engineered pseudoviruses), and finally the threshold of neutralization (50% or 90%). The DAWN-plasma RCT provides a clear example of such heterogeneity, with 4 different VNTs used at different participating laboratories.
It was not until August 2020, when many trials were already underway, that the FDA Emergency Use Authorization 26382 defined high-titer CCP on the basis of correlation with a reference standard, the Broad Institute live-virus, 5-dilution VNT, as a 50% inhibitory dilution (ID50) of 1:250 or more (https://www.fda.gov/media/141481/download), and exclusive use of high-titer CCP was formally recommended by the FDA only on March 9, 2021.
Table 1 summarizes the key variables in VNT employed to date in CCP RCTs. Published trials have varied greatly in their approaches to antibody quantification whether in measured transfused CCP units or in recipients. Many trials have relied on high-throughput semi-quantitative or qualitative assays with a poor-to-moderate relationship with nAb titers. Although most trials performed a correlation analysis between VNT and high-throughput serological assays, in many cases the CCP units were tested only with the latter without validation, as was the case with 66% of the patients in the PlasmAR trial (12). This procedure risks an incorrect evaluation of the neutralizing CCP activity. Another cause for discrepancies in outcomes could be that although IgM, IgG, and IgA are all capable of mediating neutralization, VNT titers correlate better with binding levels of IgM and IgA1 than they do with IgG (27). Yet it is IgG that is routinely measured in high-throughput serological assays, and these assays include non-neutralizing IgGs, the role of which in activity against SARS-CoV-2 has not been established. Trials should preferentially use VNTs to assess serostatus of transfused units and not rely on high-throughput serology.
As for any other medicinal product, CCP exhibits a dose-response relationship, which is also evident when using high-throughput assays. In the subgroup analysis of the EAP, a gradient of mortality was seen in relation to IgG antibody levels in the transfused CCP. In the subgroup of patients who were not receiving mechanical ventilation, death within 30 days after CCP transfusion occurred in 81 of 365 patients (22.2%; 95% CI, 18.2 to 26.7) in the low titer group, 251 of 1297 patients (19.4%; 95% CI, to 21.6) in the medium-titer group, and 50 of 352 patients (14.2%; 95% CI, 10.9 to 18.2) in the high-titer group. Depending on the statistical model, the RR for 30-day mortality in high-titer CCP compared to low-titer CCP recipients ranged from 0.64 – 0.67, with an upper 95% confidence bound of 0.91 (8). Similarly, the large retrospective PSM study from HCA reported a 0.2% decreased risk of mortality for every 1 unit of S/Co serology level (28).
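As a rough arithmetic check on the gradient described above, the crude 30-day mortality proportions and the high- versus low-titer risk ratio can be recomputed from the reported counts. The following is a minimal sketch, not the study's analysis: the function names are ours, the interval is a textbook Wald approximation on the log scale, and the result is unadjusted, so it is not expected to reproduce the model-based estimates exactly.

```python
import math

# Reported 30-day deaths / patients among non-ventilated EAP recipients.
GROUPS = {
    "low": (81, 365),
    "medium": (251, 1297),
    "high": (50, 352),
}


def risk(deaths, total):
    """Crude mortality proportion."""
    return deaths / total


def crude_risk_ratio(exposed, reference, z=1.96):
    """Unadjusted risk ratio with a Wald interval on the log scale."""
    a, n1 = exposed
    c, n0 = reference
    rr = (a / n1) / (c / n0)
    se_log = math.sqrt(1 / a - 1 / n1 + 1 / c - 1 / n0)
    lower = math.exp(math.log(rr) - z * se_log)
    upper = math.exp(math.log(rr) + z * se_log)
    return rr, lower, upper


for name, (deaths, total) in GROUPS.items():
    print(f"{name:>6}: {100 * risk(deaths, total):.1f}%")

rr, lo, hi = crude_risk_ratio(GROUPS["high"], GROUPS["low"])
print(f"crude RR, high vs. low titer: {rr:.2f} (95% CI {lo:.2f} to {hi:.2f})")
```

The crude ratio comes out at about 0.64, in line with the lower end of the adjusted range quoted above.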
The nAb titer (or total IgG levels as measured by surrogate assays) only describes one factor involved in defining the real therapeutic dose in that it represents the concentration of just one (likely the main) active ingredient. But CCP contains additional antibodies that mediate antibody-dependent cellular cytotoxicity (ADCC), complement activation and phagocytosis of viral particles, functions that can each contribute to its antiviral effects (29). At this time the relative importance of nAbs vs. the other antibody activities is not understood, but, hopefully, retrospective analyses that correlate CCP efficacy with these activities will reveal additional variables that need to be considered in choosing optimal CCP units.
Despite these uncertainties, we can make estimates of likely effective doses based on the available clinical experience thus far. The therapeutic dose of nAb is a product of its concentration in the infused CP multiplied by the overall infused CP volume, adjusted to the recipient body weight to take account of dilution into the blood volume and tissues. RCTs have varied in the provision of volume per unit (200-300 ml), and most importantly in cumulative volume per patient (1-4 units) and in extent of exposure to diverse antibodies from various CCP donors, and no published trials have adjusted levels of nAbs by recipient body weight (or, when attempts have been performed, they referred to the old-fashioned 10-15 ml/kg dose inferred from treatment of hemorrhagic coagulopathies (30)). A failure of CCP to improve outcomes when 200-ml of 1:160 nAb-titer CCP is provided to a patient who weighs 120 kg represents quite a different scenario from failure of a 600-ml transfusion of 1:640 nAb-titer CCP to produce improvement in a 60-kg patient. But these central issues in dosage have not been considered in the RCTs published so far.
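The contrast between the two scenarios above can be made concrete with a simple index that multiplies the reciprocal nAb titer by the infused volume and divides by body weight. This is only an illustrative sketch: the index and the function name `nab_dose_index` are our assumptions, the units are arbitrary, and it is not a validated pharmacokinetic measure.

```python
def nab_dose_index(reciprocal_titer, volume_ml, body_weight_kg):
    """Hypothetical weight-adjusted dose index: titer x volume / weight."""
    if min(reciprocal_titer, volume_ml, body_weight_kg) <= 0:
        raise ValueError("all inputs must be positive")
    return reciprocal_titer * volume_ml / body_weight_kg


# The two scenarios described in the text.
low = nab_dose_index(160, 200, 120)   # 200 ml of 1:160 CCP, 120-kg recipient
high = nab_dose_index(640, 600, 60)   # 600 ml of 1:640 CCP, 60-kg recipient

print(f"low-dose scenario index:  {low:.0f}")
print(f"high-dose scenario index: {high:.0f}")
print(f"fold difference: {high / low:.0f}")
```

Under this simplified index the two scenarios differ by a factor of 24, although both would be recorded simply as "CCP given" in a trial that does not track dose.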
Several RCTs performed nAb titration, but with highly heterogeneous methods, which makes it difficult to compare doses across studies. Table 1 attempts to reconcile doses across those trials, showing that they actually differed more than was apparent by inspection of raw titers. The lack of utility from low-titer (1:40) CCP in moderate COVID-19 was confirmed by the PLACID trial (10). As long as a clear therapeutic dose is not identified, it seems prudent to transfuse units containing nAb titers at least 10-fold higher than the nAb titer measured before transfusion in recipient serum. Similarly, the ConCOVID RCT showed that CCP units having nAb titers similar to those of the recipients (1:160) did not confer a clinical benefit (31). CCP units with an adequate nAb titer (nowadays estimated at >1:160) are more easily found among older males who recovered from a previous symptomatic COVID-19 requiring hospitalization (32, 33): unfortunately, such donors were poorly represented in the first donation waves, which tended to obtain CCP from younger donors with mild disease and, presumably, lower nAb titers (10).
Relevance of CCP to the viral variant
Albeit not formally demonstrated, CCP manufactured by pooling ABO-matched donations from many different donors (e.g., in PlasmAr (12)) theoretically has greater polyclonality of nAbs than repeated CCP doses from a single donor (e.g., CAPSID (34)) and should grant higher efficacy against viral variants. Nevertheless, pooling typically occurs among donors attending the same blood bank, making donor exposure to different viral variants unlikely.
An analysis of potential variables associated with CCP efficacy associated near-sourcing with reduced mortality, with the efficacy of CCP in reducing mortality falling sharply when the CCP source was more than 150 miles from where it was used (35). This finding suggests that SARS-CoV-2 viruses vary enough in their antigenic composition in different geographic locations to create antibody responses that differ by locale (36). Even though CCP is often standardized for nAb titer to the Spike protein, the VNT could use a nonrelevant viral strain, or miss major functional differences in the antibody response (29). This finding has implications for RCTs that use nationally sourced (centralized) CCP, since the attempt to standardize the therapeutic units centrally could inadvertently reduce CCP efficacy if hospitals use CCP obtained from distant loci. For example, in the C3PO RCT, which was conducted in 21 US states, 95% of the donor CCP was collected in either Chicago or Denver: since only 4 of the 48 centers were in Illinois or Colorado, most CCP usage had to be from remote sources (23). By contrast, the NCT04359810 RCT in New York and Brazil used CCP locally sourced in New York, whose efficacy against P.1 was tested to ensure efficacy at the other recruiting center in Brazil (11).
Although also not formally demonstrated during clinical trials, it is reasonable to assume that CCP collected during early pandemic waves could be less effective against currently circulating variants of concern (37). RCTs whose recruitment was protracted across multiple pandemic waves (e.g., ConPlas-19) and which relied on CCP collected and banked months earlier could have inadvertently used CCP with reduced activity against the SARS-CoV-2 strains circulating in the community when the therapy was administered. Hence, both geography and time of collection of the CCP are important variables when considering the efficacy of the treatment.
The intended outcomes
Most trials (CONTAIN, COMPILE, and PassItOn being exceptions) have used composite endpoints or specialty scores (e.g., SOFA) rather than progression in the simple WHO ordinal scale or mortality, and many were stopped because of apparent futility at a time when they may have been underpowered to detect significant benefit. As represented in Figure 1, several studies have reported overall negative results (panel A) despite the presence of positive signals of efficacy just barely missing statistical significance (panels B and C). The significance level (i.e., p = 0.05) is largely a socially constructed convention for rejecting the null hypothesis, but it has often been misinterpreted as a measure of reality by many individuals not familiar with the nuances of statistics. For example, some CCP studies have concluded that a difference that did not achieve a p value < 0.05 was an absence of difference, even when mortality in the CCP arm was ∼20-40% lower than in controls. This reasoning played a central role in the polarized views of CCP efficacy and prevented subsequent studies from drilling down on the positive effects that were observed. The dogged pursuit of statistical significance, viewed as a measure of reality instead of the actual reality demonstrated by the data, during a public health emergency dealt a serious blow to studies of CCP and created significant confusion for clinicians. It is also important to understand that RCTs are powered to be less tolerant of Type I error than Type II error, which are conventionally set at 0.05 and 0.20, respectively, meaning that a Type II error is expected four times as often as a Type I error. This statistical convention can contribute to the absence of significance in studies that were set up early in the pandemic, when there was little information on expected effects for the various patient populations studied and the patients were so heterogeneous that only subgroups may have responded.
Many studies were originally designed to enroll patients at any disease stage, and it should be no surprise that subgroup analyses on the groups that were later demonstrated more likely to benefit from CCP (e.g., early treated, seronegative patients, those receiving high nAb titer) were underpowered to reach statistical significance, as shown by orange color predominance in panel C of Figure 2. Nevertheless, favorable trends are a shared feature across such trials. Lastly, there was rigid adherence to primary outcomes that were often fixed in the early days of the pandemic, when the disease stage and the quality of CCP associated with efficacy were not understood. When these outcomes were not met, trials were considered failures even though there were often signals of efficacy in the data that were not considered valuable since these had not been pre-specified, even when they made biological sense. For example, in the New York-Brazil RCT cited above, CCP did not improve the primary endpoint of clinical status on an ordinal scale, but the statistically significant halving of mortality was acknowledged in the abstract. Would it have made sense to ignore the strong effect of CCP on mortality in this trial just because mortality was not selected as a primary outcome? Although we agree that subgroup analysis carries the risk of ‘cherry picking’ data, such analyses are often important for hypothesis generation and critically important during the emergency of a pandemic where neither viral pathogenesis nor therapeutic variables are well understood. When subgroup analyses are based on firm biological principles, such as focusing on those treated early in disease or lacking their own serological response, the exercise is not cherry picking. To emphasize this point, Christopher Columbus missed the pre-specified primary endpoint of his mission - reaching India - but no one considers his discovery of the New World to be a failure!
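To illustrate why trials can miss a mortality reduction of the size mentioned above, the sketch below applies the standard normal-approximation formula for comparing two proportions under the conventional error rates (two-sided Type I error of 0.05, Type II error of 0.20). The 25% control-arm mortality is a hypothetical input chosen for illustration, not a figure taken from any trial, and the function name `n_per_arm` is ours.

```python
import math
from statistics import NormalDist


def n_per_arm(p_control, relative_reduction, alpha=0.05, power=0.80):
    """Approximate patients per arm for a two-sided two-proportion z-test."""
    p_treated = p_control * (1 - relative_reduction)
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)   # Type I error, two-sided
    z_beta = NormalDist().inv_cdf(power)            # power = 1 - Type II error
    variance = p_control * (1 - p_control) + p_treated * (1 - p_treated)
    delta = p_control - p_treated
    return math.ceil((z_alpha + z_beta) ** 2 * variance / delta ** 2)


# Hypothetical scenario: 25% mortality in the control arm (assumed value).
for reduction in (0.20, 0.30, 0.40):
    n = n_per_arm(0.25, reduction)
    print(f"{reduction:.0%} relative reduction: about {n} patients per arm")
```

Under these assumed inputs, detecting a 20% relative reduction requires on the order of a thousand patients per arm, so a trial or subgroup with a few hundred patients would often return a non-significant result even if such an effect were real.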
Turning to the clinical arena, most trials of anticoagulants in myocardial infarction found reductions in mortality of about 20-25%, which was generally not significant in these underpowered trials that declared the findings to be null, even though such a mortality reduction would clearly be of value (38).
Another misunderstood endpoint is viral clearance, defined as the conversion of nasopharyngeal swabs (NPS) from positive to negative for PCR evidence of SARS-CoV-2 in CCP-treated patients. While there was early and robust evidence for this effect from CCP (4, 10), some RCTs failed to find differences between arms just because they sampled NPS too late after CCP treatment, when the endogenous immune response had also mounted in the control arm, and differences vanished.
Analyzing failures in individual RCTs
We use the word ‘failures’ with care and considerable nuance, since negative trials can be very important in teaching us about populations that do not benefit from CCP or variables that affect its efficacy. Keeping the factors discussed above in mind, we have analyzed individual RCTs in detail. At the very beginning, many historically or internally controlled observational studies showed clinical benefit from CCP and this led the FDA to issue an EAP in March 2020 that was converted into an emergency use authorization (EUA) on August 23, 2020. The largest observational study is the US open-label EAP (NCT04338360) led by Joyner et al., which enrolled 105,717 hospitalized patients with severe or life-threatening COVID-19 from April 3 to August 23, 2020 (39). In an analysis of the effect of antibody in CCP performed independently of the results cited above (8) and using a nAb titer in an overlapping but non-identical group of EAP patients, the FDA showed that the 7-day mortality in non-intubated patients who were younger than 80 years of age and were treated within 72 hours after diagnosis was 6.3% in those receiving high-titer CCP and 11.3% in those receiving low-titer CCP (https://www.fda.gov/media/142386/download).
In a later analysis of a larger (N = 35,322) subset of EAP patients (including 52.3% in the intensive care unit (ICU) and 27.5% receiving mechanical ventilation), the 7-day mortality rate was 8.7% in patients transfused within 3 days of diagnosis but 11.9% in patients transfused ≥ 4 days after diagnosis. Similar findings, again from the US EAP, were observed in 30-day mortality (21.6% vs. 26.7%) (40). The major criticism of these results is that controls were neither randomized nor PSM: hence a difference in the treatment outcome between treated and untreated groups may be caused by a factor that predicts treatment rather than by the treatment itself. However, importantly, nAb titer analysis was done retrospectively, both patients and physicians were unaware of the nAb content in the CCP units used, the results are what would have been expected from the experience with antibody therapy, and multivariate models were used to adjust for potential confounders (1). Additionally, given the outline of an optimal use case with these data and the earlier underpowered RCT by Li et al (5), it is unfortunate that, due to (a) lack of awareness and (b) the logistical burden associated with protocol adjustments, involving repowering and new patient recruitment criteria, later treatment RCTs were either continued or initiated without modifications to include newly available evidence.
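The size of the timing association reported above can be expressed as an absolute difference and a ratio of the two mortality proportions. The sketch below is crude arithmetic on the published percentages only. The function name is ours, and the last quantity (patients treated early per death averted) is meaningful only under the strong and unverified assumption that the association is causal, which an uncontrolled comparison cannot establish.

```python
def crude_contrast(p_early, p_late):
    """Crude contrasts between two reported mortality proportions."""
    difference = p_late - p_early
    if difference <= 0:
        raise ValueError("expects lower mortality in the early group")
    ratio = p_early / p_late
    # Patients treated early per death averted, IF the association were causal.
    per_death_averted = 1 / difference
    return difference, ratio, per_death_averted


# Reported mortality: transfused within 3 days vs. 4 or more days after diagnosis.
REPORTED = {
    "7-day": (0.087, 0.119),
    "30-day": (0.216, 0.267),
}

for label, (p_early, p_late) in REPORTED.items():
    diff, ratio, per_death = crude_contrast(p_early, p_late)
    print(f"{label}: difference {100 * diff:.1f} points, "
          f"ratio {ratio:.2f}, about {per_death:.0f} per death averted")
```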
The highest level of scientific evidence in primary clinical research stems from prospective PSM and RCTs. PSM studies (Table 3) balance treatment and control groups on a large number of covariates without losing a large number of observations. Unfortunately, no PSM study to date has investigated nAb titers by VNT, and all times have been reported since hospitalization (excluding outpatients). Nevertheless, in 2 retrospective PSM studies from 2 different hospitals in New York, trends for improved outcomes in non-intubated and those treated within 7 days since hospitalization (HR 0.33) were observed (41, 42). These findings were later confirmed in a prospective PSM study from Houston (43, 44). Of interest, a retrospective PSM study from Providence did not show any benefit, but patients were treated at a median of 7 days after onset of symptoms (45). Another PSM study from Yale associated CCP with a 35% reduction in mortality (46). That study is notable in that it included patients on mechanical ventilation who would not normally be expected to benefit from CCP and the percentage of individuals receiving corticosteroids was very low since the study was conducted in the early days of the pandemic in the USA. Another PSM from the Washington DC area found a reduction in mortality with CCP use at both days 14 and 28, which reached statistical significance at the earlier date (47). Finally, a very large study from 176 community hospitals affiliated with Healthcare Corporation of America confirmed substantial mortality reduction in hospitalized patients receiving CCP within 3 days from admission (48).
Since PSM only accounts for observed (and observable) covariates, and not latent characteristics, RCT remains the gold standard for highest-level evidence (Table 2). In the PlasmAr RCT, the small number of early arrivals (less than 72 hours) showed superior primary and secondary outcomes in the CCP arm (n = 28) compared to the placebo arm (n = 11), but the minimal contribution of this group to the overall cohort (228 CCP and 105 placebo) made the advantage disappear in the final outcomes at day 30 (12). In another Argentinean RCT on 160 outpatients older than 65 years of age with mild COVID-19 who were treated with CCP within 72 hours, progression to severe COVID-19 halved at day 30 (49). An RCT from India reported that patients younger than 67 treated at a median of 4 days after hospital admission manifested superior mitigation of hypoxia and survival in the CCP arm (50). Another RCT in Spain enrolling patients at less than 7 days of hospitalization showed four deaths in the control arm and none in the CCP arm (51). Given that conventional peer-review slows down during a pandemic, pre-publishing RCT results by the preprint mechanism should be encouraged to accelerate sharing of potentially life-saving therapeutic approaches and to provide pre-publication review that could improve the quality of the final published study.
Figure 1 graphically places the outcomes of RCTs and PSM studies on a Cartesian plot having timeliness and nAbs dose as variables (if values are disclosed in the reports): this makes immediately clear that the few successes at reaching the primary endpoints have gathered into the lower right corner (high nAb dose and early intervention), while the many “failures” have been scattered all around (panel A), reflecting lower antibody levels infused or late treatment, or both, with the latter being the commoner problem. Nevertheless, when we focus on mortality irrespective of statistical significance (panel B) or focusing on statistical significance (panel C), many more RCTs showed clear benefits.
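The two-axis reading of the figure can be sketched as a simple classification rule. The thresholds below (neutralizing titer of at least 160, time to randomization of at most 9 days) are the ones quoted in the abstract. The `Trial` class, the function name and the three example entries are hypothetical placeholders for illustration and do not correspond to real trials or to the values in Tables 2 and 3.

```python
from dataclasses import dataclass

TITER_THRESHOLD = 160   # reciprocal nAb titer, as quoted in the abstract
DAYS_THRESHOLD = 9      # days to randomization, as quoted in the abstract


@dataclass
class Trial:
    name: str
    median_titer: float
    days_to_randomization: float


def quadrant(trial):
    """Place a trial in one of four dose/timing quadrants."""
    dose = "high titer" if trial.median_titer >= TITER_THRESHOLD else "low titer"
    timing = "early" if trial.days_to_randomization <= DAYS_THRESHOLD else "late"
    return f"{dose}, {timing}"


# Hypothetical entries only, not data from any published trial.
examples = [
    Trial("Trial A", 320, 3),
    Trial("Trial B", 40, 6),
    Trial("Trial C", 640, 12),
]

for t in examples:
    print(f"{t.name}: {quadrant(t)}")
```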
We will focus here on “failures” as identified by title, abstract and/or press recognition, with failure implying inability to demonstrate a favorable outcome to CCP use. Narratively, these so-called “failures” can be grouped into 4 categories according to the main reason:
- Trials that transfused insufficient therapeutic doses of CCP due to either low total IgG levels or low nAb levels (e.g., PLACID)
- Trials that transfused appropriate doses of CCP but too late, and which nevertheless reported signals of efficacy (e.g., RECOVERY, CAPSID, NCT04359810 and TSUNAMI)
- Trials that were stopped too early to observe benefit or had inherent design flaws, and/or were underpowered such that the likelihood of success was reduced (e.g., C3PO)
- Trials in which CCP was used to treat a condition not amenable to antibody intervention, such as hypoxia caused by pulmonary inflammation
Stopping trials for futility is an occurrence that deserves special attention, because it represents wasted resources during a pandemic. Six RCTs so far have been halted for futility, including RECOVERY, REMAP-CAP, CONCOR-1, C3PO, and NCT04361253, with the first one being to date the strongest evidence for futility (30), and with its massive recruitment affecting the outcomes of systematic reviews (52). Instead of stopping trials for futility based on pre-set endpoints, it makes more sense for DSMBs facing a high likelihood of lack of statistical significance to provide advice on trial modifications that are likely to amplify the signals of efficacy evident in these studies. This would seem a more responsible action than trial cessation given the paucity of therapeutic alternatives in the pandemic emergency. Indeed, a Bayesian re-analysis of RECOVERY data with a wide variety of priors (vague, optimistic, skeptical and pessimistic) calculated the posterior probability of both any benefit and a modest benefit (number needed to treat of 100). Across all patients, when analyzed with a vague prior, the likelihood of any benefit or a modest benefit was estimated to be 64% and 18%, respectively. In contrast, in the seronegative subgroup, the likelihood of any benefit or a modest benefit was estimated to be 90% and 74%, respectively (53). This finding of benefit accruing to specific subgroups, which were not selected post hoc but because they were expected to benefit based on the principles of CP treatment, recurs in nearly every trial whose overall finding is negative.
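The logic of such a Bayesian re-analysis can be sketched with a conjugate normal approximation on the log rate ratio. This is only a minimal illustration and not the method of reference (53): the trial estimate, its standard error, the four prior specifications and the rate-ratio threshold used to stand in for a "modest" benefit are all hypothetical placeholders, not RECOVERY data.

```python
import math

def normal_cdf(x, mean, sd):
    """Cumulative distribution function of a normal distribution."""
    return 0.5 * (1.0 + math.erf((x - mean) / (sd * math.sqrt(2.0))))

def posterior(prior_mean, prior_sd, estimate, se):
    """Normal-normal conjugate update on the log rate ratio scale."""
    prior_prec = 1.0 / prior_sd ** 2
    data_prec = 1.0 / se ** 2
    post_var = 1.0 / (prior_prec + data_prec)
    post_mean = post_var * (prior_prec * prior_mean + data_prec * estimate)
    return post_mean, math.sqrt(post_var)

def prob_rr_below(threshold_rr, post_mean, post_sd):
    """Posterior probability that the rate ratio is below threshold_rr."""
    return normal_cdf(math.log(threshold_rr), post_mean, post_sd)

# Hypothetical trial result (NOT taken from any real trial).
log_rr_hat = math.log(0.95)
se = 0.05

# Hypothetical priors as (mean, sd) on the log rate ratio scale.
priors = {
    "vague": (0.0, 10.0),
    "skeptical": (0.0, 0.1),
    "optimistic": (math.log(0.8), 0.2),
    "pessimistic": (math.log(1.1), 0.2),
}

MODEST_RR = 0.9  # assumed stand-in threshold for a "modest" benefit

results = {}
for name, (prior_mean, prior_sd) in priors.items():
    post_mean, post_sd = posterior(prior_mean, prior_sd, log_rr_hat, se)
    p_any = prob_rr_below(1.0, post_mean, post_sd)
    p_modest = prob_rr_below(MODEST_RR, post_mean, post_sd)
    results[name] = (p_any, p_modest)
    print(f"{name:12s} P(any benefit)={p_any:.2f}  P(modest benefit)={p_modest:.2f}")
```

In this sketch the same data yield different posterior probabilities under different priors, and the probability of a modest benefit can never exceed the probability of any benefit, which is why such re-analyses report both.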
The inadequacy of meta-analyses
With all the heterogeneity in key drivers discussed in the preceding paragraphs, it becomes clear that secondary research (ranging from umbrella reviews to meta-analyses to systematic reviews), whereby each study is considered at the same level, invariably ends up with biased and divergent conclusions. This adds confusion to the already complex field of individual trial outcomes. Amazingly, as of August 24, 2021, PubMed had indexed 25 meta-analyses on CCP efficacy, more than the RCTs reported at the same date. Until the beginning of 2021, meta-analyses (variably including observational studies) were generally in favor of CCP (54), but began to be biased towards failure after publication of the large RECOVERY trial (30), which, by enrolling as many as 11,448 patients, diluted all the other divergent RCTs. Clear examples of this phenomenon come from a widely cited meta-analysis by Janiaud et al in JAMA (52), which included press release data from RECOVERY, and from the living systematic review by the Cochrane Group (55). The former paper was surely unprecedented in the tradition of meta-analysis, not only because it included a study based only on a news release (which proved to differ in some important respects from the published paper), but because it allowed these data from a news release to dominate the entire analysis. Several groups attempted to dissect the RECOVERY trial and others by running subgroup analyses in their systematic reviews (53, 56, 57), but these reviews were unable to restore the confidence in CCP efficacy that the clinical community had lost after publication of the overall negative findings of RECOVERY and PlasmAr (58). A meta-analysis of 22,591 patients (enrolled in 10 RCTs and 15 observational studies) showed that early CCP significantly reduced mortality (RR 0.72, p<0.00001), but only in patients who were not suffering severe or critical disease (59).
On the other hand, another meta-analysis of 18 peer-reviewed clinical trials, 3 preprints, and 26 observational studies actually found that CCP use was associated with reduced risk of all-cause mortality in severe or critical COVID-19 patients (60). A recent umbrella review of 29 meta-analyses and systematic reviews found evidence of improvement in the CCP arms for some outcomes (overall mortality, viral clearance at day 3) but not for others (clinical improvement, length of hospital stay) (61).
Rather than pooling published RCTs, the Continuous Monitoring of Pooled International Trials of Convalescent Plasma for COVID-19 Hospitalized Patients (COMPILE) study pooled individual patient data from ongoing RCTs at two-week intervals. Unfortunately, with the single exception of CONTAIN, participating RCTs largely shared late usage (DAWN-plasma, PLACID, ConCOVID, ConPlas-19, NCT04421404, NCT04397757, and the Brasília Covid-19 Convalescent Plasma (BCCP)) (62).
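The way a single very large trial can dominate a pooled estimate follows directly from inverse-variance weighting, one common pooling approach. The sketch below uses a fixed-effect model for simplicity; the cited meta-analyses may have used other models (for example random effects), and the trial names, log rate ratios and standard errors are hypothetical values chosen only to illustrate the arithmetic.

```python
import math

def pooled_fixed_effect(trials):
    """Fixed-effect inverse-variance pooling.

    trials: list of (name, log_rate_ratio, standard_error).
    Returns (pooled log rate ratio, pooled standard error, weight share per trial).
    """
    weights = [1.0 / se ** 2 for _, _, se in trials]
    total = sum(weights)
    pooled = sum(w * log_rr for w, (_, log_rr, _) in zip(weights, trials)) / total
    shares = {name: w / total for w, (name, _, _) in zip(weights, trials)}
    return pooled, math.sqrt(1.0 / total), shares

# Hypothetical trials (NOT real data): three small trials suggesting benefit
# and one very large trial with a null result.
small_trials = [
    ("small_A", math.log(0.60), 0.35),
    ("small_B", math.log(0.70), 0.40),
    ("small_C", math.log(0.65), 0.30),
]
large_trial = ("large_late_use", math.log(1.00), 0.04)
trials = small_trials + [large_trial]

pooled_small, _, _ = pooled_fixed_effect(small_trials)
pooled_all, _, shares = pooled_fixed_effect(trials)

print(f"Pooled RR, small trials only: {math.exp(pooled_small):.2f}")
print(f"Pooled RR, all trials:        {math.exp(pooled_all):.2f}")
for name, share in shares.items():
    print(f"  weight of {name}: {share:.1%}")
```

Because each weight is the inverse of the squared standard error, the large trial carries almost all of the weight here, and the pooled estimate moves to near its result regardless of how the smaller trials differ in timing or antibody dose.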
Conclusions
While CCP contains a plethora of biologically active molecules (63), we now have very strong evidence that appropriately vetted CCP from eligible convalescent donors is safe for patients (64, 65), with no evidence of an increased risk of transfusion-related acute lung injury or of the antibody-mediated enhancement feared in the early days of the pandemic (66), nor is there evidence that CCP induces accelerated SARS-CoV-2 evolution (11). Polyclonal antibodies such as CCP, or CCP-derived hyperimmune globulins made from large donor pools, are likely to offer better protection against the onset of variants than monoclonal antibodies. Outcomes in immunocompromised patients treated with CCP have been successful in the long term, with minimal evidence for immune escape (67). There is evidence that vaccinated convalescents may have even higher nAb titers than unvaccinated convalescents, offering the promise of expanded success in using CCP (68).
We have also learned that CCP is less likely to benefit patients requiring oxygen (i.e., from level 4 and up on the 11-point WHO ordinal scale), and hence, ideally, the focus should be on outpatients and on identifying the subset of patients who seek hospital care while still sufficiently early in the course of disease to benefit from CCP. This parallels the experience with hyperimmune serum and anti-Spike monoclonal antibodies, which at first failed in hospitalized patients (69, 70), but later succeeded in ambulatory patients with mild to moderate COVID-19 (71) and were approved for emergency use. However, at this moment clinical use in the US is restricted by the FDA to inpatients.
CCP usage per admission peaked after issuance of the EUA, with more than 40% of inpatients estimated to have received CCP between late September and early November 2020. However, following reports of RCTs that failed to show clear benefit from CCP, usage per admission declined steadily to a nadir of less than 10% in March 2021. A strong inverse correlation (Pearson correlation coefficient of -0.5176 with P = 0.00242) was found between CCP usage per hospital admission and deaths occurring 2 weeks after admission, and this finding was robust to examination of deaths taking place 1, 2 or 3 weeks after admission. Changes in the number of hospital admissions, prevalence of variants, and age of patients could not explain these findings. The authors estimated that the retreat from CCP usage, a phenomenon they termed “plasma hesitancy”, might have resulted in 29,000 to 36,000 excess deaths in the period from mid-November 2020 to February 2021 (72). The same analysis estimated that the USA had avoided 96,000 excess deaths from August 2020 to March 2021 by its liberal deployment of CCP.
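A lagged correlation of this kind can be computed as sketched below. This is a minimal illustration under stated assumptions, not the analysis of reference (72): the weekly usage and mortality series are synthetic, and the function names are hypothetical.

```python
import math

def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("sequences must have equal length of at least 2")
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

def lagged_r(usage, mortality, lag_weeks):
    """Correlate usage in week t with mortality in week t + lag_weeks."""
    if lag_weeks == 0:
        return pearson_r(usage, mortality)
    return pearson_r(usage[:-lag_weeks], mortality[lag_weeks:])

# Synthetic weekly series (NOT real data): fraction of admissions given CCP,
# and deaths per admission.
usage = [0.40, 0.42, 0.38, 0.33, 0.28, 0.22, 0.18, 0.14, 0.11, 0.09]
mortality = [0.10, 0.10, 0.11, 0.10, 0.12, 0.12, 0.13, 0.14, 0.14, 0.15]

for lag in (1, 2, 3):
    print(f"lag {lag} week(s): r = {lagged_r(usage, mortality, lag):.3f}")
```

Repeating the calculation at several lags, as the authors did for 1, 2 and 3 weeks, is a simple robustness check; a correlation of this type does not by itself establish causation.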
Several lines of evidence, ranging from the EAP to clinical trials employing RCT or PSM controls, now indicate how CCP should be used in immunocompetent patients (73). The evidence supports the initiation of CCP treatment as early as 44-72 hours from onset of symptoms (which largely pertains to outpatients) and using CCP with a nAb titer > 1:160. Benefit within 1 week from onset of symptoms (including in hospitalized patients) is less well understood, although a benefit from higher therapeutic doses cannot be ruled out at this stage. Clinical benefit seems absent when CCP is administered after 1 week from onset of symptoms, in patients requiring ventilation, or in those who receive CCP with a low nAb titer. Nevertheless, chronically immunosuppressed patients benefit from CCP even at later stages (67, 74, 75): the best evidence for this scenario comes from a prospective PSM study showing a halving of mortality in ICU-admitted oncohematological COVID-19 patients who received CCP (76). We note that while there have been concerns that use in immunocompromised patients can promote the emergence of antibody-resistant variants, such variants have emerged from massive replication in susceptible populations and not from treated patients, who in any case are isolated in hospitals where mitigation efforts to reduce transmission are employed, and are thus very unlikely to transmit their viruses further (77). Such simple concepts have been poorly communicated to the general public and the clinical community, who should be better informed of the state of current evidence that supports CCP efficacy.
The future of CCP
CCP remains a relatively inexpensive therapy that is available throughout the world, even in resource-poor areas that cannot afford expensive antiviral drugs or monoclonal antibody therapies. Much has been learned about the variables that affect CCP efficacy even though, as recounted here, the clinical efficacy data are mixed. Table 4 lists the RCTs whose outcomes have still to be reported after completion or which are still recruiting patients. Unfortunately, little new can be expected given that most of these RCTs were designed to enroll patients having symptoms for more than 7 days. Given the heterogeneity of the product and the complex variables that contribute to efficacy, it is remarkable that many studies have reported reductions in mortality. This suggests a robust therapeutic effect that allows signals of efficacy to break through all the noise imposed by variability in the product and its clinical use. The positive evidence for CCP efficacy cannot be dismissed, while negative results can be explained. In the absence of good therapeutic options for COVID-19, CCP is likely to find a niche in the early treatment of disease. Instead of looking for unlikely superiority outcomes, noninferiority RCTs comparing monoclonal antibodies versus CCP in early arrivals should be initiated. Such an RCT is very unlikely to be sponsored by vendor companies, so public institutions should be sensitized to funding it.
Given the experience accumulated with COVID-19, it is almost certain that CP will again be deployed for the next epidemic, and we are hopeful that the lessons learned in this pandemic are heeded such that use and trials focus on very early treatment with high-titer CP.
We declare we have no conflict of interest to disclose.
Data Availability
All data are available upon request to the corresponding author.
Footnotes
Updated data from several RCTs
Abbreviations
- BSC: best supportive care
- CP: convalescent plasma
- CCP: COVID-19 convalescent plasma
- nAbs: neutralizing antibodies
- VNT: viral neutralization test