SUMMARY
Neonatal sepsis causes significant morbidity and mortality worldwide but is difficult to diagnose clinically. Clinical prediction models (CPMs) could improve diagnostic accuracy. Neonates in low- income and middle-income countries are disproportionately affected by sepsis, yet no review has comprehensively synthesised CPMs validated in this setting. We performed a scoping review of CPMs for neonatal sepsis diagnosis validated in low-income and middle-income countries. From 4598 unique records, we included 82 studies validating 44 distinct models. Most studies were set in neonatal intensive or special care units in middle-income countries and included neonates already suspected of sepsis. Three quarters of models were only validated in one study. Our review highlights several literature gaps, particularly a paucity of studies validating models in low-income countries and the WHO African region, and models for the general neonatal population.
Furthermore, heterogeneity in study populations, definitions of sepsis and reporting of models may hinder progress in this field.
INTRODUCTION
Neonatal sepsis is a clinical syndrome caused by severe systemic infection in the first month of life.1 It is a leading cause of global neonatal mortality and disproportionately affects neonates in low- income and middle-income countries (LMICs).2
Conventionally, neonatal sepsis is divided into early-onset sepsis (EOS) usually due to vertical transmission of pathogens from the maternal genitourinary tract and occurring in the first 48-72 hours of life, and late-onset sepsis (LOS) usually due to pathogens acquired from the home or hospital environment and occurring after 48-72 hours.3 For infections caused by group B streptococcus (GBS), EOS is often considered to occur up to the seventh day of life.3 However, there is increasing recognition that this conventional classification of neonatal sepsis is misplaced in LMICs where neonates may be exposed from birth to organisms traditionally associated with LOS due to high rates of healthcare-associated infections.4
Neonatal sepsis is difficult to diagnose clinically due to non-specific signs and symptoms. Identifying a pathogenic organism from a normally sterile site (e.g. blood or cerebrospinal fluid) remains the gold standard method for diagnosis.3 Nevertheless, ‘clinical’ or ‘culture-negative’ sepsis – where a sterile culture is obtained from a neonate with signs and symptoms of sepsis – is a recognised entity.5 When sepsis is suspected, a fine balance must be struck between the risk of failing to treat a true invasive infection and the risks of unnecessary antimicrobial use, which can contribute to antimicrobial resistance,4 and adverse neonatal outcomes.6 Additionally, national guidelines advise starting antimicrobial therapy within one hour of suspecting sepsis to maximise the chance of survival, leaving little time for clinicians to make a diagnosis.7
Clinical prediction models (CPMs) are tools that combine characteristics to estimate the probability of a diagnosis or prognostic outcome.8 Models exist to diagnose neonatal sepsis in both high-income countries and LMICs based on various clinical features, risk factors and/or laboratory tests. These CPMs aim to improve diagnostic accuracy and therefore rationalise antibiotic use. The benefits of early recognition and treatment could be significant for neonates in low-resource settings where specialist care is limited.9 Furthermore, reducing the growing threat of antimicrobial resistance is even more crucial where access to sufficiently broad-spectrum antimicrobials may be unaffordable for patients.10
While several existing reviews examine CPMs to diagnose neonatal sepsis,11–14 there has been no comprehensive synthesis of models validated in LMICs. Therefore, the aim of this scoping review was to map the literature of CPMs to diagnose neonatal sepsis in LMICs.
Specific objectives were:
To provide an overview of existing CPMs to diagnose neonatal sepsis in LMICs
To determine the evidence underlying the use of CPMs to diagnose neonatal sepsis in LMICs
To compare the performance of CPMs using different approaches to risk stratification or different target populations
To identify unanswered research questions surrounding CPMs to diagnose neonatal sepsis in LMICs, which may guide future primary research or systematic reviews
METHODS
We conducted this review according to an a priori published protocol,15 developed with reference to the scoping review guidelines provided by the Joanna Briggs Institute.16 We report methods and results in accordance with the Preferred Reporting Items for Systematic reviews and Meta-Analyses extension for Scoping Reviews (see supplementary appendix).17
Search strategy
Eligibility criteria are shown in Table 1. After reviewing the extent and breadth of the literature from our initial searches, we narrowed the scope of our original protocol to focus specifically on studies that validate a CPM to diagnose neonatal sepsis in a LMIC, as defined by the World Bank in 2020.18 We searched six electronic databases from their inception: Ovid MEDLINE, Ovid Embase, Scopus, Web of Science Core Collection, Global Index Medicus, and the Cochrane Library. Searches were initially performed on 20 December 2019 and updated on 5 September 2022 and 16 June 2024.
Search terms were chosen to capture the three domains of the research question (‘neonate’, ‘sepsis’, and ‘clinical prediction model’) through collaboration with a child health specialist librarian. The search strategy was developed for Ovid MEDLINE and adapted for each database (see supplementary appendix). Additional studies were identified by citation analysis and by hand searching the reference lists of included studies.
Record screening
We imported identified records into EndNote 21 for deduplication.19 Unique records were then uploaded to the Rayyan application for screening by two independent reviewers (DM, HG, MZ, SRN or SS).20 Titles and abstracts were first examined against the eligibility criteria to determine if each record was potentially eligible for inclusion. Next, full texts of potentially eligible studies were obtained and reviewed to confirm eligibility. Authors were contacted to request full texts where these could not be found online. Conflicts were resolved by discussion amongst the review team.
Data extraction and synthesis
Data extraction was performed independently by two reviewers for the initial searches (SRN and SS) and by one reviewer for each updated search (SRN or SS). We extracted data on study, participant and model characteristics, and model performance using a pre-piloted data extraction form (see supplementary appendix). Data items were chosen based on the Transparent Reporting of a multivariable prediction model for Individual Prognosis Or Diagnosis (TRIPOD) statement.21 We summarised results by narrative synthesis. Data for quantitative outcomes were not pooled in a meta-analysis as this is beyond the scoping review methodology. Where multiple variations of a model were presented in the same study (e.g. different combinations of predictors presented during model specification), or model performance was presented at multiple classification thresholds, we only included data for the ‘optimal’ or ‘final’ model at a single classification threshold.
RESULTS
Searches and included studies
Searches identified 4598 unique records (Figure 1). From these, 82 studies published between 2003 and 2024 were included,22–103 and are summarised in Tables 2 and 3. The number of published studies validating a CPM to diagnose neonatal sepsis in LMICs has increased rapidly in recent years (Figure 2). Studies were conducted in 22 individual countries (Figure 3 and Table 4), with the greatest number of studies conducted in the World Health Organization (WHO) South-East Asian Region (n=48, 59%), particularly in India (n=37, 45%). The fewest studies were conducted in the WHO African Region (n=4, 5%). Regarding economic status, 51 studies were conducted exclusively in lower middle-income countries (62%) and 30 exclusively in upper middle-income countries (37%).
One study pooled data from both low-income and lower middle-income countries.98 Most studies were set in intensive care or special care admission units (n=64, 78%). The remainder included all live births at study sites (n=12, 15%), neonates presenting to emergency care services (n=3, 4%), all hospitalised neonates (n=1, 1%), or the setting was unclear (n=2, 2%).
In total, 24252 neonates were included across all studies. The median number of participants per study was 151 (range 36 to 3303, interquartile range [IQR] 200). Few studies restricted the study population based on gestational age or birthweight, with only 4 studies (5%) specifically investigating preterm neonates and 5 studies (6%) specifically investigating low or very low birthweight neonates. Most studies included neonates clinically suspected of sepsis or with specific maternal risk factors including chorioamnionitis (n=58, 71%).
Almost all studies included a positive blood and/or CSF culture in their outcome definition for sepsis (n=75, 91%). Of these, 18 (22% of all studies) also included clinical features or clinical suspicion of sepsis. One study used a consultant neonatologist’s clinical diagnosis of sepsis,76 one study used the International Classification of Diseases 10th Revision criteria for sepsis,77 and in three studies the outcome was unclear.
Model characteristics
The 82 included studies performed 109 evaluations validating 44 distinct models (Table 3).22–25,32,33,40,46,47,49–51,54,56,57,63,68,72,76–78,81,83,86,87,90,92,98–101,103–113 The most frequently validated model was the Hematological Scoring System by Rodwell et al. (n=32, 39% of studies; including studies that made minor modifications to the original model).112 Most models were only validated in one study (n=34, 77% of models).
A total of 135 predictors of sepsis were included across all models, of which 82 were clinical parameters (signs, symptoms or risk factors) and 53 were laboratory parameters (see supplementary appendix). The median number of predictors per model was 6 (range 2 to 110, IQR 4). 14 models (32%) included only clinical parameters, 12 models (27%) included only laboratory parameters, and 18 models (41%) included both. The commonest laboratory parameters were white cell count (n=17 models, 39%), C-reactive protein (CRP) (n=16 models, 36%) and platelet count (n=15 models, 34%). The commonest clinical parameters were neonatal fever (n=13 models, 30%) and gestational age (n=11 models, 25%).
Most models were developed using logistic regression (n=16 models, 36%) (often with stepwise selection to select predictors) or consisted of a scoring system based on univariable predictor performance or literature review and expert opinion (n=10 models, 23%).
Model performance
Model performance was principally reported using sensitivity and/or specificity (with or without a confusion matrix); only 4 studies (5%) did not report either metric. 32 studies (39%) reported area under the receiver operating characteristic curve (AUC). Less frequent methods of quantifying performance included predictive values, likelihood ratios, accuracy, change in antibiotic use, and mortality statistics. Across all 109 evaluations, median sensitivity was 81% (range 3% to 100%) and median specificity was 83% (range 11% to 100%).
Performance stratified by clinical vs. laboratory parameters
In models containing both clinical and laboratory parameters (n=24 evaluations), median sensitivity was 73% (range 3% to 100%) and median specificity was 80% (range 18% to 98%). In models containing only clinical parameters (n=23 evaluations), median sensitivity was 67% (range 3% to 100%) and median specificity was 72% (range 11% to 99%). In models containing only laboratory parameters (n=62 evaluations), median sensitivity was 83% (range 9% to 100%) and median specificity was 86% (range 33% to 100%).
Performance stratified by target population
Models validated in a population with existing clinical suspicion of sepsis (due to signs and symptoms and/or presence of maternal risk factors for sepsis) had a median sensitivity of 82% (range 3% to 100%) and median specificity of 84% (range 18% to 100%) (n=75 evaluations). In comparison, models evaluated in the general population had a median sensitivity of 77% (range 15% to 100%) and median specificity of 80% (range 11% to 100%) (n=34 evaluations).
Performance stratified by sepsis timing
In models developed to diagnose both EOS and LOS (n=59 evaluations), median sensitivity was 82% (range 3% to 100%) and median specificity was 85% (range 33% to 100%). In models to diagnose only EOS, median sensitivity was 82% (range 9% to 100%) and median specificity was 82% (range 11% to 98%) (n=29 evaluations). In models to diagnose only LOS, median sensitivity was 57% (range 3% to 100%) and median specificity was 75% (range 18% to 99%) (n=21 evaluations).
DISCUSSION
Our scoping review provides a comprehensive overview of CPMs to diagnose neonatal sepsis in LMICs. Previous reviews of CPMs to diagnose neonatal sepsis have been published, but we included more studies than were identified in existing reviews despite them additionally including studies from high-income countries.11–14 The breadth of literature highlights the need for, and academic interest in, effective risk stratification for neonatal sepsis in LMICs.
Several common themes emerged from our review. First, 99% of studies were conducted in middle- income countries, with only one study including neonates born in a low-income country,98 and no studies conducted exclusively in a low-income country. Furthermore, fewer studies were conducted in the African region than any other WHO region, despite the high burden of neonatal sepsis and slower progress in neonatal mortality in these countries.114,115 Two thirds of models required access to at least basic laboratory facilities. Access to laboratory facilities is limited in many low-resource settings or turnaround times are too long to usefully inform management.116
Second, although many CPMs have been validated in LMICs, 78% of studies were conducted in neonatal intensive care or special care units and only 29% were validated in a population of neonates that includes infants without existing suspicion of sepsis. A substantial benefit of CPMs for neonatal sepsis in LMICs lies in their ability to promote early, targeted antibiotic therapy in an undifferentiated population of neonates, reducing antibiotic overuse and the resultant antimicrobial resistance and adverse neonatal outcomes.6 Whilst diagnostic decision support in high-risk neonates is useful, models are needed that can be applied at the time of birth to facilitate the rapid antimicrobial therapy required to reduce morbidity and mortality from EOS.7
Third, description of model performance and clinical implications was highly variable between studies. Most studies only presented sensitivity and specificity at arbitrarily selected classification thresholds; just over one third of studies reported easily comparable global metrics such as AUC. Furthermore, few studies included practical discussion of how the proposed models could be integrated into routine neonatal care in LMICs. This includes how the probability or score can be calculated by healthcare workers (e.g. using a paper proforma or a computer-based system). Most models were developed using logistic regression, but authors often simplify the final model to present a scoring system where each predictor is assigned an integer score. This benefits interpretation and application but can negatively impact the resulting predictions, particularly if continuous predictors are categorised.8
One solution could be to implement models using app or web-based interfaces for clinicians such as the Neotree platform, which consists of an android app that also includes data visualisation, linkage and export capabilities.117 This would also aid implementation of more flexible classification methods, such as the artificial neural network proposed by Helguera-Repetto et al.47 Other considerations that are rarely discussed is how management decisions should be made based on model predictions, such as appropriate classification thresholds and how these can be incorporated into a wider sepsis management pathway within neonatal units. These oversights could be addressed if authors adhere to the TRIPOD checklist, which specifically includes ‘implications’ of the model as a checklist item.21
Finally, three quarters of models were only validated in one study in a LMIC, often only in the derivation cohort (internal validation). This may lead to overoptimistic performance results due to overfitting.8 Several authors caution against the current focus on developing new models, and advocate for further validation (including external validation) of promising existing models.118 Additionally, few studies included in our review assessed calibration performance of their models, which is an especially important consideration when models are intended for decision support.119
Several limitations of our review should be considered. First, we included only published studies. It may be that individual neonatal units or networks have developed their own clinical prediction models or tools that are in local use but have not been published and therefore would not have been identified by our search. Centres may use existing clinical prediction models without publishing performance data, particularly if studies show poor model performance. Second, there was significant heterogeneity in study populations, definitions of sepsis, and classification thresholds, which makes comparing model performance between studies particularly challenging. Despite the high incidence of neonatal sepsis globally, there is no internationally accepted consensus definition of neonatal sepsis.120 A systematic review of model performance in specific populations and for specific definitions of sepsis would help to address this. Given the vast number of model variations presented in included studies, and the often arbitrarily selected classification thresholds at which model performance is reported, it is impractical for a single review to be completely comprehensive and there is necessarily a degree of selective reporting in our review. Finally, we included only studies published in English or Spanish and were unable to obtain full texts for 36 potentially eligible studies despite contacting authors. The nature of scientific publishing in LMICs means that studies are not always published in journals indexed in major biomedical databases. We identified 13 of the 82 included studies outwith our primary database searches through citation analysis and reference lists only (especially for studies published in India), which raises the possibility of retrieval bias influencing our review.
CONTRIBUTORS
FF and MH conceived the idea for this review. SRN developed the review protocol and all authors reviewed and approved it before publication. SRN and SS performed the literature searches and extracted the data. DM, HG, MZ, SRN and SS screened the records. FF, MH, GC, KLD and MCB supervised the review process and provided clinical expertise for the manuscript. MCB translated records from Spanish and provided statistical advice. All authors critically edited and approved the final manuscript.
DECLARATION OF INTERESTS
FF and MH are trustees of Neotree, a UK registered charity that provides technology, software information, education and support to healthcare workers and medical practitioners throughout England and Wales, Malawi and Zimbabwe (charity number: 1186748). All other authors declare no competing interests.
DATA SHARING
Extracted data and code used to synthesise results will be uploaded at publication. A list of all records identified through database searches is available on request.
Data Availability
Extracted data and code used to synthesise results will be uploaded at publication. A list of all records identified through database searches is available on request.
ACKNOWLEDGEMENTS
The authors would like to thank Heather Chesters (Deputy Librarian at the UCL Great Ormond Street Institute of Child Health Library) for her advice and guidance when developing the search strategy for this review. This review was conducted without specific funding support.
Footnotes
Correction of typographical errors; supplementary material updated in preparation for publication.
REFERENCES
- 1.↵
- 2.↵
- 3.↵
- 4.↵
- 5.↵
- 6.↵
- 7.↵
- 8.↵
- 9.↵
- 10.↵
- 11.↵
- 12.
- 13.
- 14.↵
- 15.↵
- 16.↵
- 17.↵
- 18.↵
- 19.↵
- 20.↵
- 21.↵
- 22.↵
- 23.
- 24.
- 25.↵
- 26.
- 27.
- 28.
- 29.
- 30.
- 31.
- 32.↵
- 33.↵
- 34.
- 35.
- 36.
- 37.
- 38.
- 39.
- 40.↵
- 41.
- 42.
- 43.
- 44.
- 45.
- 46.↵
- 47.↵
- 48.
- 49.↵
- 50.
- 51.↵
- 52.
- 53.
- 54.↵
- 55.
- 56.↵
- 57.↵
- 58.
- 59.
- 60.
- 61.
- 62.
- 63.↵
- 64.
- 65.
- 66.
- 67.
- 68.↵
- 69.
- 70.
- 71.
- 72.↵
- 73.
- 74.
- 75.
- 76.↵
- 77.↵
- 78.↵
- 79.
- 80.
- 81.↵
- 82.
- 83.↵
- 84.
- 85.
- 86.↵
- 87.↵
- 88.
- 89.
- 90.↵
- 91.
- 92.↵
- 93.
- 94.
- 95.
- 96.
- 97.
- 98.↵
- 99.
- 100.
- 101.↵
- 102.
- 103.↵
- 104.
- 105.
- 106.
- 107.
- 108.
- 109.
- 110.
- 111.
- 112.↵
- 113.↵
- 114.↵
- 115.↵
- 116.↵
- 117.↵
- 118.↵
- 119.↵
- 120.↵