External validation and updating of clinical severity scores to guide referral of young children with acute respiratory infections in resource-limited primary care settings ============================================================================================================================================================================ * Arjun Chandna * Lazaro Mwandigha * Constantinos Koshiaris * Direk Limmathurotsakul * Francois Nosten * Yoel Lubell * Rafael Perera-Salazar * Claudia Turner * Paul Turner ## ABSTRACT **Background** Accurate and reliable guidelines for referral of children from resource-limited primary care settings are lacking. We identified three practicable paediatric severity scores (Liverpool quick Sequential Organ Failure Assessment [LqSOFA], quick Pediatric Logistic Organ Dysfunction-2 [qPELOD-2], and the modified Systemic Inflammatory Response Syndrome [mSIRS]) and externally validated their performance in young children presenting with acute respiratory infections to a primary care clinic located within a refugee camp on the Thailand-Myanmar border. **Methods** This secondary analysis of data from a longitudinal birth cohort study consisted of 3,010 acute respiratory infections in children aged ≤ 24 months. The primary outcome was receipt of supplemental oxygen. We externally validated the discrimination, calibration, and net-benefit of the scores, and quantified gains in performance that might be expected if they were deployed as simple clinical prediction models, and updated to include nutritional status and respiratory distress. **Results** 104/3,010 (3.5%) presentations met the primary outcome. The LqSOFA score demonstrated the best discrimination (AUC 0.84; 95% CI 0.79-0.89) and achieved a sensitivity and specificity > 0.80. Converting the scores into clinical prediction models improved performance, resulting in ∼20% fewer unnecessary referrals and ∼30-60% fewer children incorrectly managed in the community. **Conclusions** The LqSOFA score is a promising triage tool for young children presenting with acute respiratory infections in resource-limited primary care settings. Where feasible, deploying the score as a simple clinical prediction model might enable more accurate and nuanced risk stratification, increasing applicability across a wider range of contexts. KEY WORDS * Acute respiratory infection * paediatrics * risk stratification * triage * referral * primary care * low- and middle-income country ## INTRODUCTION Acute respiratory infections (ARIs) are the leading reason for unscheduled childhood medical consultations worldwide.1,2 Primary care workers function as gatekeepers to the formal health system, aiming to distinguish the minority of ARIs requiring onward referral from those suitable for community-based care.3 In rural regions of many low- and middle-income countries (LMICs) poorly functioning infrastructure, as well as geographic, climatic, socioeconomic, and cultural factors, can complicate referral mechanisms. Particularly in humanitarian and conflict settings referral can entail risks for both patients and providers.4 Consequently, there can be substantial inter- and intra-health system variation in referral thresholds. Existing tools to support community healthcare providers in their assessment of unwell children, such as the World Health Organization’s Integrated Management of Childhood Illnesses (IMCI) and Integrated Community Case Management (iCCM) guidelines, recommend certain ‘Danger Signs’ to guide referrals.5,6 However, these lack sensitivity and specificity, and suffer from considerable interobserver variability.7,8 A systematic review of paediatric triage tools concluded that none would be reliable in resource-constrained settings and that lack of follow-up data on children managed in the community rendered the validity of existing tools questionable.9 In this study we identified paediatric severity scores suitable for use in resource-limited primary care settings and externally validated their ability to guide referral of young children presenting with ARIs.10 We characterised the improvement in performance that might be expected if the scores were deployed as simple clinical prediction models and updated to include variables relevant to children presenting with ARIs in rural LMIC settings. ## METHODS ### Study population Data were collected during a prospective birth cohort study at a medical clinic for refugees and internally displaced people on the Thailand-Myanmar border.10 Between September 2007 and September 2008 pregnant women receiving antenatal care at the clinic were invited to participate. Children of consenting women were reviewed at birth and followed-up each month (routine visit) and during any intercurrent illness (illness visit) until 24 months of age. The local circumstances (inability of the population to move freely out of the camp and lack of other medical providers) contributed to low attrition rates and capture of the majority of acute illnesses for which care was sought. All ARI illness visits were included in this secondary analysis. An ARI was defined as (A) a presentation with rhinorrhoea, nasal congestion, cough, respiratory distress (chest indrawing, nasal flaring, grunting, tracheal tug, and/or head bobbing), stridor, and/or abnormal lung auscultation (crepitations and/or wheeze), and (B) a compatible contemporaneous syndromic diagnosis (rhinitis, croup, bronchiolitis, influenza-like illness, pneumonia, viral infection and/or wheeze) for children sent home directly from the clinic. ### Identification and shortlisting of scores Drawing on the results of two recent systematic reviews, we longlisted 16 severity scores that might risk stratify young children presenting from the community with acute respiratory infections (Supplementary Table 1).11,12 After considering reliability, validity, and feasibility for implementation we excluded eight scores that required specialist equipment and/or laboratory tests unlikely to be practical for the assessment of young children in busy LMIC primary care settings.13-20 Four others were excluded as ≥ 25% of the constituent variables were unavailable in the primary dataset.21-24 Two of the remaining scores (quick Sequential Organ Failure Assessment [qSOFA] and quick Pediatric Logistic Organ Dysfunction-2 [qPELOD-2]) contained blood pressure.25,26 Hypotension is a late sign in paediatric sepsis and not suitable for early recognition of impending serious illness at the community level.27 Furthermore, accurate use and maintenance of sphygmomanometers and stethoscopes may not be feasible in resource-limited settings.28 Recently, Romaine et al. replaced systolic blood pressure (SBP) with alternate signs of circulatory compromise (heart rate and capillary refill time) to develop the Liverpool-qSOFA (LqSOFA) score, and demonstrated superior performance compared to qSOFA in febrile children presenting from the community.29 Hence, we elected to evaluate the LqSOFA score in preference to qSOFA and to evaluate an adapted qPELOD-2 score (replacing SBP with capillary refill time and assessing mental status using the simpler Alert Voice Pain Unresponsive [AVPU] scale rather than the Glasgow Coma Scale [GCS]). The three scores shortlisted for evaluation were the LqSOFA, qPELOD-2, and modified Systemic Inflammatory Response Syndrome (mSIRS) scores (Table 1).26,29,30 View this table: [TABLE 1.](http://medrxiv.org/content/early/2022/12/07/2022.12.06.22283016/T1) TABLE 1. Shortlisted paediatric severity scores and comparison between original and study populations. bpm = beats or breaths per minute; ED = emergency department; ICU = intensive care unit; PICU = paediatric intensive care unit. ### Selection of variables for model updating To update and improve model performance, additional predictors relevant for children presenting with ARIs in LMIC primary care settings were considered for inclusion. Nutritional status (weight-for-age z-score [WAZ]) and presence of respiratory distress were selected a priori, after considering resource constraints, reliability, validity, biological plausibility, availability of data in the primary dataset, and sample size (Supplementary Table 2).28 ### Data collection All data were measured by study staff and entered on to structured case report forms. With the exception of anthropometric data, all clinical data were collected at the time of presentation. Core (rectal) temperature was measured for neonates and infants and adjusted to axillary temperature by subtracting 0.5°C.6 Mental status was assessed using the AVPU scale. Capillary refill time was measured centrally. For children admitted to the clinic, weight was measured at the time of presentation (seca scale; precision ± 5g for neonates or ± 50g after birth). In addition, all children had their mid-upper arm circumference (MUAC), weight, and height measured at each monthly routine visit. For the purposes of these analyses, age-adjusted z-scores (R package: *z scorer*)31 were calculated using the closest anthropometric data to the illness visit within the following window periods: height ≤ 28 days; MUAC ≤ 28 days without intervening admission; weight ≤ 14 days without intervening admission. Median time between the index illness visit and each anthropometric measurement is reported. ### Primary outcome The primary outcome was receipt of supplemental oxygen during the illness visit. Study staff were unaware which baseline variables were to be used as candidate predictors at the time of ascertaining outcome status. Clinic treatment protocols specified that peripheral oxygen saturation (SpO2) must be checked prior to initiation of supplemental oxygen, with therapy only indicated if SpO2 was < 90%. All staff were trained on the treatment protocols prior to study commencement. ### Missing data 616 presentations were missing data on one or more candidate predictors (616/3,010; 20.5%) with capillary refill time containing the highest proportion of missingness (442/3,010; 14.7%; Supplementary Table 3). Under a missing-at-random assumption (Supplementary Figure 1), we used multiple imputation with chained equations (MICE) to deal with missing data (R package: *mice*).32 Analyses were done in each of 100 imputed datasets and results pooled. Variables included in the imputation model are reported in Supplementary Table 4. ### Statistical methods We assessed discrimination and calibration of each score by quantifying the area under the receiver operating characteristic curve (AUC) and plotting model scores against observed outcome proportions. We examined predicted classifications at each of the scores’ cut-offs. Prior to model building we explored the relationship between continuous predictors and the primary outcome using locally-weighted smoothing to identify non-linear patterns. Accordingly, temperature was modelled using restricted cubic splines (R package: *rms*)33 with three knots placed at locations based on percentiles (5th and 95th) and recognised physiological thresholds (36°C).34,35 We used logistic regression to derive the models and tested for important interactions using likelihood ratio tests (LRT). Random-effects were not modelled as 22% (169/756) of children presented only once. All predictors were prespecified and no predictor selection was performed during model development. Internal validation was performed using 100 bootstrap samples with replacement and optimism-adjusted discrimination and calibration reported (R package: *rms*).33 Finally, the models were updated by including respiratory distress and WAZ as additional candidate predictors. Penalised (lasso) logistic regression was used for model updating, variable selection, and shrinkage to minimise overfitting (R package: *glmnet*).36 A sensitivity analysis confirmed that median imputation grouped by outcome status produced similar results to MICE and hence to avoid conflicts in variable selection across multiply imputed datasets we used this approach to address missing data for model updating (Supplementary Table 5). We assessed discrimination and calibration of the updated models, examined predicted classifications at clinically-relevant referral thresholds, and compared their clinical utility (net-benefit) to the best-performing points-based severity score using decision curve analysis (R package: *dcurves*).37 A sensitivity analysis was performed excluding children who were hypoxic at the time of presentation. All analyses were done in R, version 4.0.2.38 ### Sample size No formal sample size calculation for external validation of the existing severity scores was performed. All available data were used to maximise power and generalisability. Of the 3,010 eligible ARI presentations, 104 met the primary outcome, ensuring sufficient outcome events for a robust external validation.39 For derivation and updating of the clinical prediction models we followed the methods of Riley et al. and assumed a conservative R2 Nagelkerke of 0.15.40 At an outcome prevalence of 3.5% (104/3,010) we estimated that up to 13 candidate predictors (events per parameter [EPP] = 8) could be used to build the prediction models whilst minimising the risk of overfitting (R package: *pmsampsize*).41 ### Ethics and reporting Ethical approvals were provided by the Mahidol University Ethics Committee (TMEC 21-023) and Oxford Tropical Research Ethics Committee (OxTREC 511-21). The study is reported in accordance with the Transparent Reporting of a multivariable prediction model for Individual Prognosis Or Diagnosis (TRIPOD) guidelines (Supplementary Table 6).42 ## RESULTS From September 2007 to September 2008, 999 pregnant women were enrolled, with 965 children born into the cohort. Amongst 4,061 acute illness presentations, 3,064 were for ARIs. Fifty-four ARI presentations were excluded as information on oxygen therapy was not available in the study database, leaving 3,010 presentations from 756 individual children for the primary analysis (Supplementary Figure 2). Baseline characteristics of the cohort are summarised (Table 2; Supplementary Table 7). The majority of children were managed in the community (72.3%; 2,175/3,010). Median length of stay for the 835 admissions was 3 days (IQR 2 to 4 days). One hundred and four (3.5%; 104/3,010) presentations met the primary outcome, with those with signs of respiratory distress, age-adjusted tachycardia and/or tachypnoea, lower baseline SpO2, prolonged capillary refill times, altered mental status, and lower WAZ more likely to require supplemental oxygen (p < 0.001 to 0.014; Table 2). View this table: [TABLE 2.](http://medrxiv.org/content/early/2022/12/07/2022.12.06.22283016/T2) TABLE 2. Baseline characteristics of the cohort stratified by primary outcome status. aRespiratory distress defined as head bobbing, tracheal tug, grunting and/or chest indrawing; babnormal chest auscultation defined as crepitations and/or wheeze; crectal temperature converted to axillary temperature for neonates and infants. †Median interval between anthropometric measurement and index illness presentation: length = 8 days (IQR 4-12 days); MUAC = 9 days (IQR 4-13 days); weight = 4 days (IQR 0-10 days). *Missing data: gestation = 5; birthweight = 14; comorbidity = 10; symptom duration = 21; unwell family member = 10; fever = 5; runny nose = 2; noisy breathing = 6; stridor = 1; respiratory distress = 1; head bobbing = 1; tracheal tug = 1; grunting = 1; chest indrawing = 1; abnormal lung auscultation = 59; lung crepitations = 69; wheeze = 79; dehydration = 7; colour = 50; heart rate = 9; respiratory rate = 8; temperature = 3; oxygen saturation = 1,645; capillary refill time = 442; mental status = 37; WLZ = 158; WAZ = 147; MAZ = 682; LAZ = 14. ### LqSOFA and qPELOD-2 scores outperform the mSIRS score for risk stratification of ARIs Discrimination and calibration of the LqSOFA (AUC = 0.84; 95% confidence interval [CI] = 0.79 to 0.89) and qPELOD-2 (AUC = 0.79; 95% CI = 0.74 to 0.84) scores were considerably better than the mSIRS score (AUC = 0.57; 95% CI = 0.51 to 0.63; Figure 1; Supplementary Table 8; Supplementary Figure 3). At a cut-off of ≥ 1 the LqSOFA score demonstrated a sensitivity of 0.80 (95% CI = 0.72 to 0.89) and specificity of 0.86 (95% CI = 0.85 to 0.88); neither the mSIRS nor qPELOD-2 scores achieved a sensitivity and specificity > 0.70 at any cut-off (Table 3). View this table: [TABLE 3.](http://medrxiv.org/content/early/2022/12/07/2022.12.06.22283016/T3) TABLE 3. Predicted classifications of the severity scores. Classifications calculated using full-case analysis: LqSOFA = 2,525 presentations (81 met primary outcome); mSIRS = 2,992 presentations (99 met primary outcome); qPELOD-2 = 2,531 presentations (83 met primary outcome) ![FIGURE 1.](http://medrxiv.org/https://www.medrxiv.org/content/medrxiv/early/2022/12/07/2022.12.06.22283016/F1.medium.gif) [FIGURE 1.](http://medrxiv.org/content/early/2022/12/07/2022.12.06.22283016/F1) FIGURE 1. Discrimination of the LqSOFA, mSIRS, and qPELOD-2 severity scores. Receiver operating characteristic curve (ROC) for one imputed dataset shown. Variability in ROCs across multiply imputed datasets shown in Supplementary Figure 2. Pooled AUC reported. Bar plots showing risk scores against observed proportion of oxygen requirement using full case analysis: LqSOFA = 2,525 presentations (81 met primary outcome); mSIRS = 2,992 presentations (99 met primary outcome); qPELOD-2 = 2,531 presentations (83 met primary outcome). ### Improved performance of clinical severity scores when deployed as clinical prediction models Relationships between continuous predictors and the primary outcome are illustrated (Supplementary Figure 4). There was no evidence of interaction between heart rate (LRT = 2.09; p = 0.35) or respiratory rate (LRT = 0.77; p = 0.68) and age. Optimism-adjusted discrimination of the three models ranged from 0.81 to 0.90, with the LqSOFA model appearing most promising (AUC = 0.90; 95% CI = 0.86 to 0.94; Figure 2; Supplementary Figure 5). Calibration of the qPELOD-2 model was good. The LqSOFA and mSIRS models overestimated risk at higher predicted probabilities. ![FIGURE 2.](http://medrxiv.org/https://www.medrxiv.org/content/medrxiv/early/2022/12/07/2022.12.06.22283016/F2.medium.gif) [FIGURE 2.](http://medrxiv.org/content/early/2022/12/07/2022.12.06.22283016/F2) FIGURE 2. Discrimination and calibration of the LqSOFA, mSIRS, and qPELOD-2 models. Receiver operating characteristic curve (ROC) and calibration slope for one imputed dataset shown. Variability in ROCs and calibration slopes across multiply imputed datasets shown in Supplementary Figure 5. Pooled optimism-adjusted AUCs and calibration slopes reported (100 bootstrap samples). On calibration plots, red line indicates perfect calibration; black dashed line indicates calibration slope for that particular model; blue rug plots indicate distribution of predicted risks for participants who did (top) and did not (bottom) meet the primary outcome. Discrimination of all three updated models containing respiratory distress and WAZ improved (AUCs = 0.93 to 0.95). Calibration of the updated LqSOFA and qPELOD-2 models was good, whereas the updated mSIRS model underestimated risk at higher predicted probabilities (Figure 3). The full models are reported in Supplementary Table 9. ![FIGURE 3.](http://medrxiv.org/https://www.medrxiv.org/content/medrxiv/early/2022/12/07/2022.12.06.22283016/F3.medium.gif) [FIGURE 3.](http://medrxiv.org/content/early/2022/12/07/2022.12.06.22283016/F3) FIGURE 3. Discrimination and calibration of updated LqSOFA, mSIRS, and qPELOD-2 models. On calibration plots, red line indicates perfect calibration; black dashed line indicates calibration slope for that particular model; blue rug plots indicate distribution of predicted risks for participants who did (top) and did not (bottom) meet the primary outcome. ### Promising clinical utility of the LqSOFA and qPELOD-2 models to guide referrals from primary care We recognised that the relative value of correct and incorrect referrals is highly context-dependent, reflecting resource availability, practicalities of referral, and capacity for follow-up. Decision curve analyses accounting for differing circumstances suggest that the updated models could provide greater utility (net-benefit) compared to the best points-based score (the LqSOFA score), with the LqSOFA and qPELOD-2 models appearing most promising over a wide range of plausible referral thresholds (Figure 4). ![FIGURE 4.](http://medrxiv.org/https://www.medrxiv.org/content/medrxiv/early/2022/12/07/2022.12.06.22283016/F4.medium.gif) [FIGURE 4.](http://medrxiv.org/content/early/2022/12/07/2022.12.06.22283016/F4) FIGURE 4. Decision curve analysis of the updated LqSOFA, mSIRS, and qPELOD-2 models. The net benefit of the updated models (green [LqSOFA], turquoise [qPELOD-2], and blue [mSIRS] lines) and original LqSOFA score (pink line), are compared to a “refer-all” (red line) and “refer-none” (brown line) approach. A threshold probability of 5% indicates a management strategy whereby any child with a ≥ 5% probability of requiring oxygen is referred (i.e., a scenario where the value of one correct referral is equivalent to 19 incorrect referrals or a NNR of 20). The ability of each updated model to guide referrals at thresholds ranging from 1% to 40% is shown (Table 4). A referral threshold of 5% reflects a strategy whereby any child with a predicted probability of requiring oxygen ≥ 5% is referred. At this cut off, the models would suggest referral in ∼15% of all presentations, correctly identifying ∼86-87% of children requiring referral, at a cost of also recommending referral in ∼12-13% of children not requiring referral; i.e., a number needed to refer (NNR; the number of children referred to identify one child who would require oxygen) of five. In contrast, at a similar threshold the LqSOFA score using a cut-off ≥ 1 would suggest referral in a similar proportion of presentations but result in a ∼25% increase in incorrect referrals (a NNR of six) and a ∼30-60% increase in the number of children incorrectly identified for community-based management. View this table: [TABLE 4.](http://medrxiv.org/content/early/2022/12/07/2022.12.06.22283016/T4) TABLE 4. Predicted classifications at different referral thresholds using the updated LqSOFA, qPELOD-2, and mSIRS models. A referral threshold of 5% reflects a management strategy whereby any child with a predicted probability of requiring oxygen ≥ 5% is referred. ### Sensitivity analysis The WHO recommend that pulse oximetry should be universally available at first-level health facilities.6,43 Although many barriers exist to realising this laudable goal, to account for the fact that in such contexts a severity score would not be required to guide referral for children who are already hypoxic at the time of presentation, we performed a sensitivity analysis excluding presentations with SpO2 < 90% at enrolment. Discrimination remained comparable but clinical utility of the models reduced slightly, with higher NNRs at the lowest referral thresholds (Supplementary Tables 11 & 12). ## DISCUSSION We report the external validation of three pre-existing severity scores amongst young children presenting with ARIs to a medical clinic on the Thailand-Myanmar border. Unlike other studies which investigated the scores’ prognostic accuracy in hospital settings,17,25 we evaluated their performance at the community level and demonstrate that the LqSOFA and qPELOD-2 scores could support early recognition of children requiring referral or closer follow-up in settings with limited resources. In keeping with previous literature, we found that the mSIRS score was poorly discriminative, not well calibrated, and led to substantial misclassification.17 An LqSOFA score ≥ 1 yielded a sensitivity and specificity > 80%. Encouragingly, this is remarkably consistent with the performance reported in the original LqSOFA development study and may reflect similarities in the use-case (febrile children presenting from the community) and severity of the cohorts (outcome prevalence 1.1% vs. 3.5%; admission rate 12.1% vs. 27.7%), albeit despite obvious demographic differences.29 In contrast to qPELOD-2, LqSOFA contains age-adjusted tachypnoea, which may have improved performance in children with respiratory illnesses. Furthermore, the performance of LqSOFA (or qSOFA) has been shown to improve outside of the PICU, when used to predict more proximal outcomes (e.g. critical care admission rather than mortality), and if the AVPU scale (vs. GCS) is employed to assess mental status.44 These all apply to our cohort. We demonstrated improvement in performance when the severity scores were deployed as clinical prediction models and when nutritional status and respiratory distress were included as additional predictors. Whilst discrimination of all three updated models was good, the AUC is a summary measure of model performance and does not necessarily reflect clinical utility.45-47 Decision curve analyses illustrate the superiority of the LqSOFA and qPELOD-2 models compared with the mSIRS model across a range of clinically-relevant referral thresholds. With growing access to smartphones there may be contexts where the increased accuracy afforded by a clinical prediction model outweighs the simplicity and practicality of points-based scoring systems. At a 5% referral threshold, the updated LqSOFA model identified a similar proportion of presentations for referral as the LqSOFA score at a cut-off of ≥ 1 (14.1% vs. 16.1%), however use of the model would have resulted in ∼25% fewer incorrect referrals and a ∼30% decrease in the number of presentations incorrectly recommended for community-based management. In addition to greater accuracy, prediction models permit more nuanced evaluation of risk; referral thresholds can be adjusted to the needs of an individual patient and/or health system and this flexibility may be particularly impactful in the heterogeneous environments commonplace in many LMIC primary care contexts. For example, in locations where community follow-up is feasible (e.g. via a telephone call or return clinic visit) and/or referral carries great cost (to the patient or system), a higher referral threshold (lower NNR) may be acceptable, compared with settings where safety-netting is impractical and/or access to secondary care is less challenging. We followed the latest guidelines in prediction model building and used bootstrap internal validation, penalised regression, placed knots at predefined locations, and limited the number of candidate predictors to avoid overfitting the models.40,42,48,49 Nevertheless, they require validation on new data to assess generalisability and provide a fairer comparison with the pre-existing points-based scores. We have published our full models to encourage independent validation. As others have highlighted, a limitation of many studies evaluating community-based triage tools in low-resource settings is the lack of follow-up data for patients categorised as low risk;9 72.3% (2,175/3,010) of our cohort were sent away from the clinic without admission. As acute illness visits were nested within the longitudinal birth cohort, we were able to confirm that 1.4% (30/2,083) of presentations sent away from the clinic without admission received supplemental oxygen within the next 28 days, although it is unknown whether this related to the index ARI or a new illness. A sensitivity analysis conservatively classifying these 30 presentations as meeting the primary outcome (i.e. assuming the oxygen therapy related to the index ARI) resulted in a decrease in the sensitivity of all three models (Supplementary Tables 12 & 13). Prospective research with dedicated outpatient follow-up is ongoing to investigate this issue further.50 We selected supplemental oxygen therapy as the primary outcome as this reflects a clinically-meaningful endpoint for ARIs and a pragmatic referral threshold for many resource-limited primary care settings. Oxygen was a scarce resource during the study (cylinders were transported in each week from ∼60km away) and oxygen therapy was protocolised; hence outcome misclassification is less likely. For those who met the primary outcome, the time of oxygen initiation was not available in the primary dataset. Although no patient had met the outcome when baseline predictors were measured, some may have done so shortly after. Nevertheless, the sensitivity analysis excluding presentations with baseline SpO2 < 90% (the qualifying criterion for supplemental oxygen) produced similar results. Furthermore, median length of stay was three days and hence the time horizon for all those who met the primary outcome is likely to have been relatively comparable. We externally validated three severity scores that could guide assessment of young children presenting with ARIs in resource-limited primary care settings to identify those in need of referral or closer follow-up. Performance of the LqSOFA score was encouraging and comparable to that in the original derivation setting.29 Converting the LqSOFA score into a clinical prediction model and including additional variables relevant to resource-constrained LMIC settings improved accuracy and might permit application across a wider range of contexts with differing referral thresholds. ## Supporting information Appendix [[supplements/283016_file03.pdf]](pending:yes) ## Data Availability De-identified, individual participant data from this study will be available to researchers whose proposed purpose of use is approved by the data access committees at the Mahidol-Oxford Tropical Medicine Research Unit. Inquiries or requests for the data may be sent to datasharing{at}tropmedres.ac. ## CONFLICT OF INTEREST DISCLOSURES The authors have no conflicts of interest relevant to this article to disclose. ## FUNDING/SUPPORT This research was funded by the UK Wellcome Trust [219644/Z/19/Z]. RPS acknowledges part support from the NIHR Applied Research Collaboration Oxford & Thames Valley, the NIHR Oxford Medtech and In-Vitro Diagnostics Co-operative and the Oxford Martin School. CK is supported by a Wellcome Trust/Royal Society Sir Henry Dale Fellowship [211182/Z/18/Z]. For the purpose of open access, the author has applied a CC BY public copyright license to any Author Accepted Manuscript version arising from this submission. ## DATA SHARING De-identified, individual participant data from this study will be available to researchers whose proposed purpose of use is approved by the data access committees at the Mahidol-Oxford Tropical Medicine Research Unit. Inquiries or requests for the data may be sent to datasharing{at}tropmedres.ac. ## ABBREVIATIONS ARI : acute respiratory infection AUC : area under the receiver operating characteristic curve AVPU : Alert Voice Pain Unresponsive CI : confidence interval EPP : events per parameter GCS : Glasgow Coma Scale iCCM : integrated Community Case Management IMCI : Integrated Management of Childhood Illnesses IQR : interquartile range LAZ : length-for-age z-score LMIC : low- and middle-income country LqSOFA : Liverpool quick Sequential Organ Failure Assessment LRT : likelihood ratio test MAZ : MUAC-for-age z-score mSIRS : modified Systematic Inflammatory Response Syndrome MICE : multiple imputation with chained equations MUAC : mid-upper arm circumference NNR : number needed to refer OxTREC : Oxford Tropical Research Ethics Committe PICU : Paediatric Intensive Care Unit qPELOD-2 : quick Pediatric Logistic Organ Dysfunction-2 qSOFA : quick Sequential Organ Failure Assessment SBP : systolic blood pressure SpO2 : peripheral oxygen saturation TMEC : Tropical Medicine Ethics Committee TRIPOD : Transparent Reporting of a multivariable prediction model for Individual Prognosis Or Diagnosis WAZ : weight-for-age z-score WLZ : weight-for-length z-score * Received December 6, 2022. * Revision received December 6, 2022. * Accepted December 7, 2022. * © 2022, Posted by Cold Spring Harbor Laboratory This pre-print is available under a Creative Commons License (Attribution 4.0 International), CC BY 4.0, as described at [http://creativecommons.org/licenses/by/4.0/](http://creativecommons.org/licenses/by/4.0/) ## REFERENCES 1. 1.Bigio J, MacLean E, Vasquez NA, et al. Most common reasons for primary care visits in low- and middle-income countries: A systematic review. PLOS Global Public Health 2022; 2(5). 2. 2.Finley CR, Chan DS, Garrison S, et al. What are the most common conditions in primary care? Can Fam Phys 2018; 64(11): 832–40. [Abstract/FREE Full Text](http://medrxiv.org/lookup/ijlink/YTozOntzOjQ6InBhdGgiO3M6MTQ6Ii9sb29rdXAvaWpsaW5rIjtzOjU6InF1ZXJ5IjthOjQ6e3M6ODoibGlua1R5cGUiO3M6NDoiQUJTVCI7czoxMToiam91cm5hbENvZGUiO3M6MzoiY2ZwIjtzOjU6InJlc2lkIjtzOjk6IjY0LzExLzgzMiI7czo0OiJhdG9tIjtzOjUwOiIvbWVkcnhpdi9lYXJseS8yMDIyLzEyLzA3LzIwMjIuMTIuMDYuMjIyODMwMTYuYXRvbSI7fXM6ODoiZnJhZ21lbnQiO3M6MDoiIjt9) 3. 3.Buntinx F, Mant D, Van den Bruel A, Donner-Banzhof N, Dinant GJ. Dealing with low-incidence serious diseases in general practice. Br J Gen Pract 2011; 61(582): 43–6. [Abstract/FREE Full Text](http://medrxiv.org/lookup/ijlink/YTozOntzOjQ6InBhdGgiO3M6MTQ6Ii9sb29rdXAvaWpsaW5rIjtzOjU6InF1ZXJ5IjthOjQ6e3M6ODoibGlua1R5cGUiO3M6NDoiQUJTVCI7czoxMToiam91cm5hbENvZGUiO3M6NDoiYmpncCI7czo1OiJyZXNpZCI7czo5OiI2MS81ODIvNDMiO3M6NDoiYXRvbSI7czo1MDoiL21lZHJ4aXYvZWFybHkvMjAyMi8xMi8wNy8yMDIyLjEyLjA2LjIyMjgzMDE2LmF0b20iO31zOjg6ImZyYWdtZW50IjtzOjA6IiI7fQ==) 4. 4.Debarre A. Hard to Reach: Providing Healthcare in Armed Conflict: International Peace Institute, 2018. 5. 5.World Health Organization. Integrated Community Case Management. Geneva, Switzerland; 2012. 6. 6.World Health Organization. Integrated Management of Childhood Illnesses. Geneva, Switzerland; 2014. 7. 7.Keitel K, Kilowoko M, Kyungu E, Genton B, D’Acremont V. Performance of prediction rules and guidelines in detecting serious bacterial infections among Tanzanian febrile children. BMC Infect Dis 2019; 19(1): 769. [CrossRef](http://medrxiv.org/lookup/external-ref?access_num=10.1186/s12879-019-4371-y&link_type=DOI) [PubMed](http://medrxiv.org/lookup/external-ref?access_num=http://www.n&link_type=MED&atom=%2Fmedrxiv%2Fearly%2F2022%2F12%2F07%2F2022.12.06.22283016.atom) 8. 8.Izudi J, Anyigu S, Ndungutse D. Adherence to Integrated Management of Childhood Illnesses Guideline in Treating South Sudanese Children with Cough or Difficulty in Breathing. Int J Pediatr 2017; 2017: 5173416. 9. 9.Hansoti B, Jenson A, Keefe D, et al. Reliability and validity of pediatric triage tools evaluated in Low resource settings: a systematic review. BMC Pediatr 2017; 17(1): 37. [PubMed](http://medrxiv.org/lookup/external-ref?access_num=http://www.n&link_type=MED&atom=%2Fmedrxiv%2Fearly%2F2022%2F12%2F07%2F2022.12.06.22283016.atom) 10. 10.Turner C, Turner P, Carrara V, et al. High rates of pneumonia in children under two years of age in a South East Asian refugee population. PLoS One 2013; 8(1): e54026. [PubMed](http://medrxiv.org/lookup/external-ref?access_num=http://www.n&link_type=MED&atom=%2Fmedrxiv%2Fearly%2F2022%2F12%2F07%2F2022.12.06.22283016.atom) 11. 11.Chandna A, Tan R, Carter M, et al. Predictors of disease severity in children presenting from the community with febrile illnesses: a systematic review of prognostic studies. BMJ Glob Health 2021; 6(1). 12. 12.Deardorff KV, McCollum ED, Ginsburg AS. Pneumonia Risk Stratification Scores for Children in Low-Resource Settings: A Systematic Literature Review. Pediatr Infect Dis J 2018; 37(8): 743–8. [PubMed](http://medrxiv.org/lookup/external-ref?access_num=http://www.n&link_type=MED&atom=%2Fmedrxiv%2Fearly%2F2022%2F12%2F07%2F2022.12.06.22283016.atom) 13. 13.Olson D, Davis NL, Milazi R, et al. Development of a severity of illness scoring system (inpatient triage, assessment and treatment) for resource-constrained hospitals in developing countries. Trop Med Int Health 2013; 18(7): 871–8. [PubMed](http://medrxiv.org/lookup/external-ref?access_num=http://www.n&link_type=MED&atom=%2Fmedrxiv%2Fearly%2F2022%2F12%2F07%2F2022.12.06.22283016.atom) 14. 14.Hooli S, Colbourn T, Lufesi N, et al. Predicting Hospitalised Paediatric Pneumonia Mortality Risk: An External Validation of RISC and mRISC, and Local Tool Development (RISC-Malawi) from Malawi. PLoS One 2016; 11(12): e0168126. [PubMed](http://medrxiv.org/lookup/external-ref?access_num=http://www.n&link_type=MED&atom=%2Fmedrxiv%2Fearly%2F2022%2F12%2F07%2F2022.12.06.22283016.atom) 15. 15.Reed C, Madhi SA, Klugman KP, et al. Development of the Respiratory Index of Severity in Children (RISC) score among young children with respiratory infections in South Africa. PLoS One 2012; 7(1): e27793. [CrossRef](http://medrxiv.org/lookup/external-ref?access_num=10.1371/journal.pone.0027793&link_type=DOI) [PubMed](http://medrxiv.org/lookup/external-ref?access_num=22238570&link_type=MED&atom=%2Fmedrxiv%2Fearly%2F2022%2F12%2F07%2F2022.12.06.22283016.atom) 16. 16.Pollack MM, Holubkov R, Funai T, et al. The Pediatric Risk of Mortality Score: Update 2015. Pediatr Crit Care Med 2016; 17(1): 2–9. 17. 17.van Nassau SC, van Beek RH, Driessen GJ, Hazelzet JA, van Wering HM, Boeddha NP. Translating Sepsis-3 Criteria in Children: Prognostic Accuracy of Age-Adjusted Quick SOFA Score in Children Visiting the Emergency Department With Suspected Bacterial Infection. Front Pediatr 2018; 6: 266. 18. 18.Goldstein B, Giroir B, Randolph A, International Consensus Conference on Pediatric S. International pediatric sepsis consensus conference: definitions for sepsis and organ dysfunction in pediatrics. Pediatr Crit Care Med 2005; 6(1): 2–8. [CrossRef](http://medrxiv.org/lookup/external-ref?access_num=10.1097/01.PCC.0000149131.72248.E6&link_type=DOI) [PubMed](http://medrxiv.org/lookup/external-ref?access_num=15636651&link_type=MED&atom=%2Fmedrxiv%2Fearly%2F2022%2F12%2F07%2F2022.12.06.22283016.atom) 19. 19.Egdell P, Finlay L, Pedley DK. The PAWS score: validation of an early warning scoring system for the initial assessment of children in the emergency department. Emerg Med J 2008; 25(11): 745–9. [Abstract/FREE Full Text](http://medrxiv.org/lookup/ijlink/YTozOntzOjQ6InBhdGgiO3M6MTQ6Ii9sb29rdXAvaWpsaW5rIjtzOjU6InF1ZXJ5IjthOjQ6e3M6ODoibGlua1R5cGUiO3M6NDoiQUJTVCI7czoxMToiam91cm5hbENvZGUiO3M6NzoiZW1lcm1lZCI7czo1OiJyZXNpZCI7czo5OiIyNS8xMS83NDUiO3M6NDoiYXRvbSI7czo1MDoiL21lZHJ4aXYvZWFybHkvMjAyMi8xMi8wNy8yMDIyLjEyLjA2LjIyMjgzMDE2LmF0b20iO31zOjg6ImZyYWdtZW50IjtzOjA6IiI7fQ==) 20. 20.Parshuram CS, Hutchison J, Middaugh K. Development and initial validation of the Bedside Paediatric Early Warning System score. Crit Care 2009; 13(4): R135. [CrossRef](http://medrxiv.org/lookup/external-ref?access_num=10.1186/cc7998&link_type=DOI) [PubMed](http://medrxiv.org/lookup/external-ref?access_num=19678924&link_type=MED&atom=%2Fmedrxiv%2Fearly%2F2022%2F12%2F07%2F2022.12.06.22283016.atom) 21. 21.George EC, Walker AS, Kiguli S, et al. Predicting mortality in sick African children: the FEAST Paediatric Emergency Triage (PET) Score. BMC Med 2015; 13: 174. [PubMed](http://medrxiv.org/lookup/external-ref?access_num=http://www.n&link_type=MED&atom=%2Fmedrxiv%2Fearly%2F2022%2F12%2F07%2F2022.12.06.22283016.atom) 22. 22.Emukule GO, McMorrow M, Ulloa C, et al. Predicting mortality among hospitalized children with respiratory illness in Western Kenya, 2009-2012. PLoS One 2014; 9(3): e92968. [PubMed](http://medrxiv.org/lookup/external-ref?access_num=http://www.n&link_type=MED&atom=%2Fmedrxiv%2Fearly%2F2022%2F12%2F07%2F2022.12.06.22283016.atom) 23. 23.Berkley JA, Ross A, Wangi I, et al. Prognostic indicators of early and late death in children admitted to district hospital in Kenya: cohort study. BMJ 2003; 326(361). 24. 24.Helbok R, Kendjo E, Issifou S, et al. The Lambarene Organ Dysfunction Score (LODS) is a simple clinical predictor of fatal malaria in African children. J Infect Dis 2009; 200(12): 1834–41. [CrossRef](http://medrxiv.org/lookup/external-ref?access_num=10.1086/648409&link_type=DOI) [PubMed](http://medrxiv.org/lookup/external-ref?access_num=19911989&link_type=MED&atom=%2Fmedrxiv%2Fearly%2F2022%2F12%2F07%2F2022.12.06.22283016.atom) 25. 25.Schlapbach LJ, Straney L, Bellomo R, MacLaren G, Pilcher D. Prognostic accuracy of age-adapted SOFA, SIRS, PELOD-2, and qSOFA for in-hospital mortality among children with suspected infection admitted to the intensive care unit. Intensive Care Med 2018; 44(2): 179–88. [CrossRef](http://medrxiv.org/lookup/external-ref?access_num=10.1007/s00134-017-5021-8&link_type=DOI) [PubMed](http://medrxiv.org/lookup/external-ref?access_num=http://www.n&link_type=MED&atom=%2Fmedrxiv%2Fearly%2F2022%2F12%2F07%2F2022.12.06.22283016.atom) 26. 26.Leclerc F, Duhamel A, Deken V, Grandbastien B, Leteurtre S, Groupe Francophone de Reanimation et Urgences P. Can the Pediatric Logistic Organ Dysfunction-2 Score on Day 1 Be Used in Clinical Criteria for Sepsis in Children? Pediatr Crit Care Med 2017; 18(8): 758–63. [PubMed](http://medrxiv.org/lookup/external-ref?access_num=http://www.n&link_type=MED&atom=%2Fmedrxiv%2Fearly%2F2022%2F12%2F07%2F2022.12.06.22283016.atom) 27. 27.Brierley J, Carcillo JA, Choong K, et al. Clinical practice parameters for hemodynamic support of pediatric and neonatal septic shock: 2007 update from the American College of Critical Care Medicine. Crit Care Med 2009; 37(2): 666–88. [CrossRef](http://medrxiv.org/lookup/external-ref?access_num=10.1097/CCM.0b013e31819323c6&link_type=DOI) [PubMed](http://medrxiv.org/lookup/external-ref?access_num=19325359&link_type=MED&atom=%2Fmedrxiv%2Fearly%2F2022%2F12%2F07%2F2022.12.06.22283016.atom) [Web of Science](http://medrxiv.org/lookup/external-ref?access_num=000263014800038&link_type=ISI) 28. 28.Fung JST, Akech S, Kissoon N, Wiens MO, English M, Ansermino JM. Determining predictors of sepsis at triage among children under 5 years of age in resource-limited settings: A modified Delphi process. PLoS One 2019; 14(1): e0211274. [CrossRef](http://medrxiv.org/lookup/external-ref?access_num=10.1371/journal.pone.0211274&link_type=DOI) [PubMed](http://medrxiv.org/lookup/external-ref?access_num=http://www.n&link_type=MED&atom=%2Fmedrxiv%2Fearly%2F2022%2F12%2F07%2F2022.12.06.22283016.atom) 29. 29.Romaine S.T, Potter J, Khanijau A, et al. Accuracy of a Modified qSOFA Score for Predicting Critical Care Admission in Febrile Children. Pediatrics 2020; 146(4): e20200782. [CrossRef](http://medrxiv.org/lookup/external-ref?access_num=10.1542/peds.2020-0782&link_type=DOI) [PubMed](http://medrxiv.org/lookup/external-ref?access_num=32978294&link_type=MED&atom=%2Fmedrxiv%2Fearly%2F2022%2F12%2F07%2F2022.12.06.22283016.atom) 30. 30.Beane A, Silva AP, Munasinghe S, et al. Comparison of Quick Sequential Organ Failure Assessment and Modified Systemic Inflammatory Response Syndrome Criteria in a Lower Middle Income Setting. J Acute Med 2017; 7(4): 141–8. 31. 31.Myatt M, Guevarra E. zscorer: Child Anthropometry z-Score Calculator. R package version 0.3.1. 2019. 32. 32.van Buuren S, Groothuis-Oudshoorn K. mice: Multivariate Imputation by Chained Equations in R. Journal of Statistical Software 2011; 45(3): 1–67. 33. 33.Harrell FE, Jr.. rms: Regression Modeling Strategies. R package version 6.2-0. 2021. 34. 34.UK National Institute for Health and Care Excellence. Algorithm for managing suspected sepsis in children aged under 5 years outside an acute hospital setting. United Kingdom; 2017. 35. 35.World Health Organization. Pocket book of hospital care for children: guidelines for the management of common childhood illnesses. Geneva, Switzerland; 2013. 36. 36.Friedman J, Hastie T, Tibshirani R. Regularization Paths for Generalized Linear Models via Coordinate Descent. Journal of Statistical Software 2010; 33(1): 1–22. [CrossRef](http://medrxiv.org/lookup/external-ref?access_num=10.18637/jss.v033.i01&link_type=DOI) [PubMed](http://medrxiv.org/lookup/external-ref?access_num=WOS:00027520&link_type=MED&atom=%2Fmedrxiv%2Fearly%2F2022%2F12%2F07%2F2022.12.06.22283016.atom) 37. 37.Sjoberg DD. dcurves: Decision Curve Analysis for Model Evaluation. R package version 0.3.0. 2022. 38. 38.R Core Team. R: A language and environment for statistical computing. VIenna, Austria: R Foundation for Statistical Computing; 2020. 39. 39.Vergouwe Y, Steyerberg EW, Eijkemans MJ, Habbema JD. Substantial effective sample sizes were required for external validation studies of predictive logistic regression models. J Clin Epidemiol 2005; 58(5): 475–83. [CrossRef](http://medrxiv.org/lookup/external-ref?access_num=10.1016/j.jclinepi.2004.06.017&link_type=DOI) [PubMed](http://medrxiv.org/lookup/external-ref?access_num=15845334&link_type=MED&atom=%2Fmedrxiv%2Fearly%2F2022%2F12%2F07%2F2022.12.06.22283016.atom) [Web of Science](http://medrxiv.org/lookup/external-ref?access_num=000229008000007&link_type=ISI) 40. 40.Riley RD, Ensor J, Snell KIE, et al. Calculating the sample size required for developing a clinical prediction model. BMJ 2020; 368: m441. [FREE Full Text](http://medrxiv.org/lookup/ijlink/YTozOntzOjQ6InBhdGgiO3M6MTQ6Ii9sb29rdXAvaWpsaW5rIjtzOjU6InF1ZXJ5IjthOjQ6e3M6ODoibGlua1R5cGUiO3M6NDoiRlVMTCI7czoxMToiam91cm5hbENvZGUiO3M6MzoiYm1qIjtzOjU6InJlc2lkIjtzOjE2OiIzNjgvbWFyMThfMi9tNDQxIjtzOjQ6ImF0b20iO3M6NTA6Ii9tZWRyeGl2L2Vhcmx5LzIwMjIvMTIvMDcvMjAyMi4xMi4wNi4yMjI4MzAxNi5hdG9tIjt9czo4OiJmcmFnbWVudCI7czowOiIiO30=) 41. 41.Ensor J, Martin EC, Riley RD. pmsampsize: Calculates the Minimum Sample Size Required for Developing a Multivariable Prediction Model. R package version 1.1.1. R; 2021. 42. 42.Collins GS, Reitsma JB, Altman DG, Moons KG. Transparent Reporting of a multivariable prediction model for Individual Prognosis or Diagnosis (TRIPOD): the TRIPOD statement. Ann Intern Med 2015; 162(1): 55–63. [CrossRef](http://medrxiv.org/lookup/external-ref?access_num=10.7326/M14-0697&link_type=DOI) [PubMed](http://medrxiv.org/lookup/external-ref?access_num=25560714&link_type=MED&atom=%2Fmedrxiv%2Fearly%2F2022%2F12%2F07%2F2022.12.06.22283016.atom) 43. 43.World Health Organization. Interagency List of Priority Medical Devices for Essential Interventions for Reproductive, Maternal, Newborn and Child Health. Geneva, Switzerland, 2016. 44. 44.Eun S, Kim H, Kim HY, et al. Age-adjusted quick Sequential Organ Failure Assessment score for predicting mortality and disease severity in children with infection: a systematic review and meta-analysis. Sci Rep 2021; 11(1): 21699. 45. 45.Fackler JC, Rehman M, Winslow RL. Please Welcome the New Team Member: The Algorithm. Pediatr Crit Care Med 2019; 20(12): 1200–1. [PubMed](http://medrxiv.org/lookup/external-ref?access_num=http://www.n&link_type=MED&atom=%2Fmedrxiv%2Fearly%2F2022%2F12%2F07%2F2022.12.06.22283016.atom) 46. 46.Vickers AJ, Van Calster B, Steyerberg EW. Net benefit approaches to the evaluation of prediction models, molecular markers, and diagnostic tests. BMJ 2016; 352: i6. [FREE Full Text](http://medrxiv.org/lookup/ijlink/YTozOntzOjQ6InBhdGgiO3M6MTQ6Ii9sb29rdXAvaWpsaW5rIjtzOjU6InF1ZXJ5IjthOjQ6e3M6ODoibGlua1R5cGUiO3M6NDoiRlVMTCI7czoxMToiam91cm5hbENvZGUiO3M6MzoiYm1qIjtzOjU6InJlc2lkIjtzOjE0OiIzNTIvamFuMjVfMi9pNiI7czo0OiJhdG9tIjtzOjUwOiIvbWVkcnhpdi9lYXJseS8yMDIyLzEyLzA3LzIwMjIuMTIuMDYuMjIyODMwMTYuYXRvbSI7fXM6ODoiZnJhZ21lbnQiO3M6MDoiIjt9) 47. 47.de Hond AAH, Steyerberg EW, van Calster B. Interpreting area under the receiver operating characteristic curve. Lancet Digit Health 2022; (). 48. 48.Steyerberg EW. Clinical Prediction Models: Springer International Publishing; 2019. 49. 49.Harrell FE, Jr.. Regression Modeling Strategies: Springer International Publishing; 2006. 50. 50.Chandna A, Aderie EM, Ahmad R, et al. Prediction of disease severity in young children presenting with acute febrile illness in resource-limited settings: a protocol for a prospective observational study. BMJ Open 2021; 11(1): e045826. [Abstract/FREE Full Text](http://medrxiv.org/lookup/ijlink/YTozOntzOjQ6InBhdGgiO3M6MTQ6Ii9sb29rdXAvaWpsaW5rIjtzOjU6InF1ZXJ5IjthOjQ6e3M6ODoibGlua1R5cGUiO3M6NDoiQUJTVCI7czoxMToiam91cm5hbENvZGUiO3M6NzoiYm1qb3BlbiI7czo1OiJyZXNpZCI7czoxMjoiMTEvMS9lMDQ1ODI2IjtzOjQ6ImF0b20iO3M6NTA6Ii9tZWRyeGl2L2Vhcmx5LzIwMjIvMTIvMDcvMjAyMi4xMi4wNi4yMjI4MzAxNi5hdG9tIjt9czo4OiJmcmFnbWVudCI7czowOiIiO30=)