Abstract
Background There exists no prior systematic review of human challenge trials (HCTs) that focuses on participant safety. Key questions regarding HCTs include how risky such trials have been, how often adverse events (AEs) and serious adverse events (SAEs) occur, and whether risk mitigation measures have been effective.
Methods A systematic search of PubMed and PubMed Central for articles reporting on results of HCTs published between 1980 and 2021 was performed and completed by 10/7/2021.
Results Of 2,838 articles screened, 276 were reviewed in full. 15,046 challenged participants were described in 308 studies that met inclusion criteria. 286 (92.9%) of these studies reported mitigation measures used to minimize risk to the challenge population. Among 187 studies which reported on SAEs, 0.2% of participants experienced at least one challenge-related SAE. Among 94 studies that graded AEs by severity, challenge-related AEs graded “severe” were reported by between 5.6% and 15.8% of participants. AE data were provided as a range to account for unclear reporting. 80% of studies published after 2010 were registered in a trials database.
Conclusions HCTs are increasingly common and used for an expanding list of diseases. Although AEs occur, severe AEs and SAEs are rare. Reporting has improved over time, though not all papers provide a comprehensive report of relevant health impacts. From the available data, most HCTs do not lead to a high number of severe symptoms or SAEs. This study was preregistered on PROSPERO as CRD42021247218.
1 Introduction
Human challenge trials (HCTs) are a clinical research method where volunteers are exposed to a pathogen in order to derive scientifically useful information about the pathogen and/or an intervention (1). Such trials have been conducted with ethical oversight since the development of the modern institutional review system of clinical trials in the 1970s. More recently, there has been renewed discussion about the ethical and practical aspects of conducting HCTs, largely fuelled by interest in conducting HCTs for SARS-CoV-2. Some past reviews of HCTs focused on reporting methods (2–4), but these did not explicitly evaluate the safety of HCTs by assessing reported adverse events (AEs) and serious adverse events (SAEs). Furthermore, many additional HCTs have been performed since the publication of these reviews. In order to better inform discussions about future uses of HCTs, including during pandemic response, this article presents a systematic review of challenge trials since 1980 and reports on their clinical outcomes, with particular focus on risk of adverse events and risk mitigation strategies.
HCTs are often used to support development of therapies and vaccines more efficiently than conventional clinical trials (5–8), and have recently been discussed as particularly valuable in the context of novel disease pandemics like COVID-19, Zika virus, or a future Disease X (9–11). HCTs have been used to investigate malaria (12), influenza (13), common cold (14), various enteric diseases (15,16), and cholera (16). The benefits of such trials include defining and evaluating correlates of protection (17); demonstrating the efficacy of Vaxchora, the first FDA-approved cholera vaccine, in a small HCT (7,18); contributing to the development of the FDA-approved influenza therapeutic oseltamivir (19) and of the Vi-tetanus toxoid conjugate vaccine for Salmonella typhi (20); and informing dosing schedules for the RTS,S/AS01 malaria vaccine (21).
Arguments against the use of HCTs have centered on the ethics of participant compensation, the populations represented, and whether the potential risks to participants, together with the lack of personal benefit, are compatible with the principle of primum non nocere (22–25). Despite the debate, there is a long-standing consensus that infecting healthy volunteers is ethically justifiable, so long as the risk of harm is acceptably low (24). HCTs can therefore be ethical, based on a case-by-case assessment of risk as part of wider research ethics oversight mechanisms.
AEs related to challenge are one measure of health risk in HCTs. AEs refer to “any untoward medical occurrence associated with the use of a drug in humans” (26). The US FDA considers challenge agents to be (akin to) investigational new drugs (27), such that AEs in HCTs refer to any untoward medical occurrence associated with the challenge. AEs that result in death, hospitalization, disability, or permanent damage, as well as AEs that are life-threatening or constitute other important medical events, are reported as serious adverse events (SAEs) (26). It should be noted that AEs graded “severe” by studies are distinct from SAEs in most cases, usually because they are not life-threatening or do not require hospitalization.
A systematic review was performed to characterize the frequency and nature of AEs and SAEs in HCTs related to the challenge, and the risk mitigation measures employed. The review also investigated the pathogens studied, the clinical outcomes in participants, study registration in databases, the number and uses of HCTs over time, and the quality of data reporting.
2 Methods
2.1 Search Strategy
A systematic review of records from 1980 to 2021 indexed in the PubMed and PubMed Central (PMC) databases was performed to identify published articles describing HCTs. Articles published prior to 1980 were not assessed because the modern institutional review system was not in place until after the 1979 Belmont report. The initial search was preregistered on PROSPERO as CRD42021247218 (28), but identified few studies published prior to 2010. Additional searches were performed to address this and appropriately discover studies for each decade of interest, as detailed in the amended preregistration (28) and the Supplementary Methods. The database search strategy is presented in Table 1. Further manual searches of reference lists and reviews were performed to identify additional articles describing HCTs that were missed.
2.2 Screening Process
Titles and abstracts of search results were manually screened by three authors working independently to identify articles that were eligible for full text review. Case reports, reviews, articles not available in English, studies that did not meet the criteria for an HCT, and articles published prior to 1980 were excluded. Secondary reviews of two past reviews (3,4) were also performed to identify more articles that were missed by the searches. Articles that described studies that performed secondary analysis of results from previously conducted HCTs were excluded, but their reference lists were reviewed to identify the original publication of these results.
2.3 Full Text Review Process
The unit of analysis is the individual study, as described within a published article detailing results. Individual studies were identified by trial registration. If trial registration was not reported, studies were counted per the article description, or as a single study if participants were challenged with a single pathogen. If multiple articles were published discussing the same study, the earliest published article was included. In some cases, multiple articles were combined (see Supplementary Methods).
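The rule for handling multiple articles about the same registered study can be illustrated with a minimal sketch. This is not the authors' code: the `Article` record, its fields, the function name, and the registration identifiers used with it are all hypothetical, and the sketch covers only the "earliest published article" rule, not the article-description fallback or the manual combinations described in the Supplementary Methods.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class Article:
    """Hypothetical record for one published article."""
    title: str
    year: int
    registration: Optional[str]  # trial registration ID, or None if not reported


def select_primary_articles(articles: List[Article]) -> List[Article]:
    """Keep the earliest-published article for each registered study.

    Articles without a registration are passed through unchanged, because
    the review counted those per the article description instead.
    """
    earliest = {}
    unregistered = []
    for article in articles:
        if article.registration is None:
            unregistered.append(article)
            continue
        current = earliest.get(article.registration)
        if current is None or article.year < current.year:
            earliest[article.registration] = article
    return list(earliest.values()) + unregistered
```

In this sketch, two articles sharing a registration collapse to the earlier one, while an unregistered article is always retained for separate handling.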
There is an ongoing discussion on the precise definition of an HCT (29). In general, studies that had been completed and involved intentional exposure of human volunteers to a pathogen were included. Challenges with candidate vaccine viruses were also included, as were studies where previously challenged participants were challenged again with the same pathogen (rechallenges). Consistent with Kalil et al., studies involving live, attenuated vaccines which were not followed by intentional infection, as well as data from phases of studies involving immunization or vaccination with live, attenuated vaccines or other methods that could have potentially resulted in infection, but which are not generally referred to as HCTs, were excluded (2).
2.4 Data Collection Process
At least two reviewers independently examined each publication selected for full text review, and any discrepancies were either reconciled between the reviewers or resolved by the senior author. Data collection was performed manually and results were input into a spreadsheet.
2.5 Data Extraction
The following numerical data were extracted from each study: year of article publication, size of cohort, gender breakdowns; mean or median age, standard deviation, and age range; number of participants challenged, number of challenged participants infected with pathogen, number of participants in control group (those who did not undergo a challenge), number of control participants infected with pathogen, number of control participants with at least one AE, and number of challenged participants with: (a) at least one AE, (b) at least one “severe” or “very severe” (grade 3 or higher) AE, (c) at least one SAE.
In addition, the following non-numerical data were extracted from each study: clinical trial registration, pathogen assessed, definition of infection, definition of AEs, treatments administered to participants, risk mitigations taken, ethics committee and review board approvals reported, and a brief description of the study design.
For articles that reported separate study arms that were all exposed to a pathogen within a single pathogen category, data were summed across all arms to be treated as a single study. Data from rechallenges were extracted separately and treated as individual studies. No treatment effect measures were extracted.
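The pooling of study arms described above can be sketched as follows. This is an illustration only, assuming a hypothetical row layout (`study_id`, `pathogen`, `challenged`, `infected`); the column names in the published dataset may differ.

```python
from collections import defaultdict
from typing import Dict, List, Tuple


def sum_arms(arms: List[dict]) -> Dict[Tuple[str, str], dict]:
    """Sum per-arm counts so each (study, pathogen category) pair is one study.

    Each arm is a dict with the hypothetical keys 'study_id', 'pathogen',
    'challenged', and 'infected'.
    """
    totals = defaultdict(lambda: {"challenged": 0, "infected": 0})
    for arm in arms:
        key = (arm["study_id"], arm["pathogen"])
        totals[key]["challenged"] += arm["challenged"]
        totals[key]["infected"] += arm["infected"]
    # Convert to a plain dict so missing keys raise instead of being created.
    return {key: dict(value) for key, value in totals.items()}
```

Rechallenge data would be kept out of this pooling and entered under their own study identifier, since the review treated rechallenges as individual studies.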
AEs among challenged participants that were not related to challenge (such as AEs related to vaccination or drug treatment) were not extracted (see Supplementary Methods). For studies that did not define and/or report AEs, reported symptom data were extracted instead. For studies that did not define and/or report SAEs, reported symptom data that, in the reviewers' judgment, met the 2016 definition of SAEs provided by the FDA (26) were extracted as SAEs.
3 Results
3.1 Study Selection
Figure 1 shows a PRISMA flowchart of study selection, generated using a tool by Haddaway et al. (30). Searches yielded a total of 2,654 results. 183 additional results were added by citation searching the reference lists of two past reviews (3,4) and articles identified among search results that used data from prior HCTs. One article (31) provided updated data for another (32). 11 results were not retrieved (five with no full text available and six with unpublished data), and 47 duplicates were removed. No further efforts were made to identify unpublished or unidentified work. Results were assessed for eligibility, and 276 articles were included, describing 308 studies from which data were extracted. Excluded results were primarily reviews and articles discussing non-HCT clinical trials. See the Supplementary References for the complete reference list of included articles.
3.2 Results of Individual Studies
Data from 284 studies, with 14,628 challenged participants, were extracted (Table 2). Additional data were extracted from 24 rechallenge studies (Supplementary Table 3). Between 9,917 and 10,277 challenged participants (67.8%–70.3%) were diagnosed with infection. The dataset and code used for generating all results and tables are publicly available (https://github.com/1DaySooner/HCTSystematicReview).
3.3 Reported Adverse Events and Unreported Data
Among 284 studies, 94 and 97 did not report any AE or SAE data, respectively (Table 3). The precise number of participants experiencing at least one SAE could not be extracted from two studies: one lost some challenged subjects’ original records in a storage facility that flooded (34), and the other only reported that “All serious AEs were self-limited and resolved within several days, and none were deemed to be vaccine-related” (35).
Among 10,325 challenged participants in studies that reported AEs, between 4,317 (41.8%) and 5,730 (55.5%) experienced at least one AE (Table 4). Among 5,083 challenged participants in studies that graded severity of AEs, between 285 (5.6%) and 801 (15.8%) experienced at least one severe or very severe (grade 3 or higher) AE (Table 5). The range in possible AE values is greater in more recent decades as a result of more studies reporting AEs by individual or symptom, rather than reporting the total number of participants with at least one AE. 19 studies included control (non-challenged) participants (n=433); only two of these studies reported AE data for control participants (n=69). Between seven (10.1%) and 12 (17.4%) control participants experienced at least one AE.
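The percentage ranges in this paragraph follow directly from the reported counts. The short sketch below reproduces that arithmetic; the helper names are illustrative and are not taken from the authors' repository.

```python
def percent(count: int, total: int) -> float:
    """Return count/total as a percentage rounded to one decimal place."""
    if total <= 0:
        raise ValueError("total must be positive")
    return round(100 * count / total, 1)


def percent_range(low: int, high: int, total: int) -> tuple:
    """Return the (lower, upper) percentage bounds for a count range."""
    return percent(low, total), percent(high, total)


# Counts as reported in the text above.
any_ae = percent_range(4317, 5730, 10325)  # at least one AE
severe_ae = percent_range(285, 801, 5083)  # at least one grade 3+ AE
control_ae = percent_range(7, 12, 69)      # controls with at least one AE

print(any_ae, severe_ae, control_ae)
# (41.8, 55.5) (5.6, 15.8) (10.1, 17.4)
```

The lower and upper counts bracket the true number of participants with at least one event, which cannot be recovered exactly when a study reports AEs per symptom rather than per participant.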
Among 10,016 challenged participants in studies that reported SAEs, 23 (0.2%) experienced at least one SAE (Table 6). Among 146 rechallenged participants in studies that reported SAEs, one additional participant (0.7%) experienced at least one SAE (Supplementary Table 6). No fatalities were reported. SAEs are described in more detail in Table 7, and some SAEs deemed not related to challenge are discussed further in Supplementary Table 7.
3.4 Studies by Pathogen
The numbers of studies and participants challenged within each category of pathogen are presented in Table 8, and Figure 2a illustrates how studies of different pathogens have occurred over time. There were 28 pathogen categories, with the most commonly studied being Plasmodium spp. (73 studies, 1,689 participants), influenza viruses (45 studies, 3,536 participants), and rhinovirus (43 studies, 4,332 participants). Studies investigating Plasmodium spp. had the greatest number of challenged participants with SAEs, with seven SAEs (out of 23 in all non-rechallenge studies) occurring among 1,129 participants in 52 studies. Studies investigating norovirus had the greatest proportion of SAEs to number challenged, with four SAEs occurring among 163 participants in three studies.
3.5 Reporting Adverse Events and Use of Trial Registries Over Time
Overall, the number of challenge studies has been increasing each decade (Figure 2b).
Prior to the 2000s, many studies did not report AEs, but instead reported comparable symptom data. These were extracted as AEs. Of the 284 included studies, 123 explicitly mentioned or defined AEs, but not all reported them for the challenge phase specifically. The proportion of studies with definitions has increased over time, from only 19.4%, 23.9%, and 21.1% in the 1980s, 1990s, and 2000s respectively, to 68.9% and 72.7% in the 2010s and 2020s (thus far) respectively. Results that exclude studies that did not explicitly mention AEs and SAEs are presented in Supplementary Tables 9 and 10.
The National Institutes of Health (NIH) launched ClinicalTrials.gov on February 29, 2000. For NIH-funded research, post-2007, “applicable clinical trials” are required to be registered (51). However, publication year lags year of registration, so it is unclear how much of the lack of registration is noncompliance and how much is delayed publication. Still, only 5.3% of included studies published in the 2000s were registered in at least one registry; 76.4% of included studies published in the 2010s were registered in at least one registry. Every included study published so far this decade was registered.
3.6 Risk Mitigation
Text describing specific risk mitigation measures was found in 286 of the 308 studies and is included in the dataset (https://github.com/1DaySooner/HCTSystematicReview); a descriptive summary follows. The qualitative nature of these mitigation descriptions precluded meaningful quantitative analysis.
Risk mitigation measures typically include evaluating participants’ risk of disease if exposed to a challenge agent, by using medical screening and assessing participants’ medical histories. In some cases, checking for prior exposure to the pathogen was a risk mitigation strategy, but it could also be done for other reasons. Demographic criteria, pregnancy screening, assessment of cardiac risk, and assessment of weight and/or BMI were often used to evaluate risk. Many studies reported using treatment for the challenge infection where relevant (“rescue therapies”) to ensure that volunteers were cleared of infection before discharge. Many studies reported evaluating participants’ suitability for these therapies prior to challenge.
Some studies reported mitigation strategies for risks to non-participants, such as isolation throughout the duration of the study, requiring birth control, or excluding participants with employment posing risk of spread (for example, excluding food handlers in HCTs investigating Escherichia coli, norovirus, and Salmonella spp.). Validity of informed consent was sometimes assessed by testing participants’ understanding of the study protocol.
4 Discussion
The present review found a total of 24 SAEs (23 reported in traditional challenges, one in a rechallenge) and zero reported deaths or cases of permanent damage among 15,046 participants in 308 studies spanning 1980 to 2021. It is unlikely that any SAEs captured in this review (Table 7) were life-threatening, as they were primarily categorized as SAEs due to involving brief hospitalization for observation or supportive care, requiring non-invasive interventions (such as re-treatment for relapses), or falling under the broad category of “other serious (important medical events)” in the FDA definition of SAEs. The proportions of studies that define AEs and mention SAEs have increased over time, although inconsistent definitions make it challenging to compare reported data, particularly across studies investigating different pathogens. Unfortunately, the proportions of studies that do not report AE and SAE data related to challenges remained unacceptably high in the 2010s at 24.5% and 30.2%, respectively (Table 3). While a high rate of failing to report SAEs may be indicative of their rarity in the HCT setting, clearer reporting would allow for better understanding of the risks and benefits of HCTs.
Issues surrounding AE reporting in clinical trials are not exclusive to HCTs (52). However, confusion related to reporting challenge-related AEs is an issue specific to HCTs. For example, some studies identified “expected symptoms” as being distinct from AEs, only reported AEs related to interventions, or omitted discussion of AEs entirely. Additionally, clinical endpoints (such as moderate to severe diarrhea in E. coli HCTs) were not always reported as AEs by the study. There is a greater degree of consistency for SAE reporting (generally in agreement with the FDA definition (26)), but many studies, especially those published prior to 2000, did not define or report SAEs. Guidelines for HCT reporting have been suggested (2), but have not yet been adopted. Accordingly, a major conclusion of this review is that in addition to a greater effort to standardize AE reporting in general, which others have postulated (53,54), these standardization efforts are particularly valuable to HCTs.
The number of new HCTs has been increasing; however, it is unclear whether this increase is proportional to the general growth trend in the number of new (non-HCT) clinical trials. Since 2010, pathogens such as Bordetella pertussis, Schistosoma mansoni, and Streptococcus pneumoniae have been studied in HCTs for the first time. Figure 2a shows that the number of influenza and rhinovirus HCTs has declined somewhat over time, following the discontinuation of several research programs focused on common cold, while the number of Plasmodium spp. HCTs sharply increased in the 2010s. These trends demonstrate that HCTs are an increasingly ubiquitous tool, and their relative speed allows researchers to investigate new pathogens of interest more rapidly than in traditional clinical trials.
Limitations of this review are primarily related to uncertainties around the accuracy of AE reporting. This includes potential bias in AE reporting, inconsistent reporting, and difficulty in precisely estimating the rates of events based on provided data. Many studies reported either no or unclear AE and/or SAE data, and issues of censoring and misclassification are common with respect to AE reporting in general (53). To partially address issues with different standards for reporting over time, we extracted symptom data as AE and/or SAE data from studies that did not mention or define AEs/SAEs, but this means that AEs for decades in which these studies occurred are not fully comparable. The review is further limited by our inability to locate some results, including published HCTs that were not on PubMed (55) and HCTs whose results have only been published as case reports (45). These limitations further highlight the need for improvements in the field of HCTs with respect to AE reporting and availability of results. Future work building off of this review includes policy recommendations around the issues of standardization and AE reporting, investigating the registration of HCTs in databases, and further qualitative analysis of risk mitigation measures in published articles.
5 Conclusions
The recent literature contains hundreds of HCTs involving over 15,000 participants and only 24 SAEs, with no recorded deaths or cases of permanent health damage. HCTs are now routinely used to understand infectious dose, disease progression, clinical efficacy of novel interventions, and immune response for a wide variety of pathogens. As evidenced by recent HCTs for COVID-19, they may be conducted for novel as well as familiar diseases. This review can help support public discussion and expert deliberation regarding the safety of HCTs. It may also inform future discussions among HCT researchers and members of ethics review committees regarding the planning, conduct, and reporting of future HCTs.
Preregistration, Protocol, and Conflict of Interest Disclosures
The review was preregistered on PROSPERO as CRD42021247218, risk outcomes and risk mitigation measures in human challenge trials: a systematic review. The review protocol is included as supplementary materials. As mentioned above, the preregistration was amended to include additional searches and data.
Thank you to 1DaySooner for supporting this work, and to the 1DaySooner Scientific Advisory Board, which includes coauthors DM, VS, and WW, for reviewing the proposed study.
Data Availability
All data are publicly available online at https://github.com/1DaySooner/HCTSystematicReview.
Contribution Statement
DM, WW, and VS conceived of the idea. EJ, JO, MR, and WW provided initial feedback and refined the idea. DT and DM designed and preregistered the systematic review, with feedback and expert guidance from EJ, JO, and MR. DT led the initial review for inclusion, with JAP and KS. The full text reviews were done by JAP, DT, and two non-author reviewers thanked below: SK and DK. Disputes were resolved by DM. Guidance on inclusion criteria and interpretation was provided by EJ, KS, and JO. JAP and DT led writing of the manuscript, with supervision and assistance by DM, VS, WW, KS, EJ, and JO. JW managed the dataset, performed analysis, and produced data summaries and visualizations.
Funding
This work was supported by 1DaySooner. David Manheim was supported by grants from the Center for Effective Altruism’s Long Term Future Fund. Euzebius Jamrozik’s work was supported by the Wellcome Trust, including current grants 221719 and 216355.
Competing Interests
1DaySooner advocates for volunteers in HCTs and supports their broader usage. Several authors of the paper have volunteered for HCTs, though none have participated in a trial. Several authors of the paper were employed by 1DaySooner’s research team for this work, which is intended to be independent of the advocacy group. For this reason, neither the management nor the advocacy team reviewed the manuscript or provided input about the results. DM has been paid externally for work with both the advocacy and research teams at 1DaySooner, as well as other related advocacy and policy research. WW provides scientific consulting for pharmaceutical companies and other organizations which conduct clinical trials, but not for HCTs. MR is a professor and the clinical head of the Controlled Human Infection Center at Leiden University, which conducts challenge trials. JO has worked on challenge trials, and is working with an international collaborative group to drive development of a GAS pharyngitis CHIM. EJ has contributed to WHO Ethics Guidance documents on Human Challenge Trials and has received funding from the Wellcome Trust, including current grants 221719 and 216355, which supported work for this paper.
Availability of Data, Code, and Other Materials
The complete dataset of included studies is available at https://github.com/1DaySooner/HCTSystematicReview.
Acknowledgements
We would like to thank Steffen Kamenicek and Daniel Kaufman, who assisted with full text review of papers and data extraction. We would also like to thank Steffen Kamenicek and River Bellamy for additional assistance with reviewing the text, tables, and figures prior to submission.