Abstract
The Tuberculosis (TB) Portals program is an international collaboration of 14 countries that connects clinical, genomic, and radiology specialists to develop an openly available repository of deidentified TB cases with multi-modal data, including clinical characteristics, pathogen genomics, and radiomics. This real-world data resource contains over 4000 TB cases, principally drug-resistant cases, with over 4000 chest X-ray (CXR) images. The scope of the curated data offers a case-focused perspective on the drivers of disease and places the CXR data in chronological context. Here, we analyze a cohort of new TB cases to understand the relationship between baseline sputum microscopy status and temporally nearby CXR images. The Timika score, a lung biomarker of disease severity, was derived for each CXR from the available radiologist observations. The Timika score and the radiologist observations were then compared for their performance in predicting baseline sputum microscopy status, a useful marker of pre-treatment disease severity and infectiousness. The modeling results indicate that both the radiologist observations and the Timika score are predictive of smear status, and that the Timika score performs similarly to the top 5 radiologist features chosen by feature selection. Moreover, inferential statistical analysis identifies the factors most strongly associated with sputum smear positivity, including the presence of radiologist observations in both lungs, presence of a cavity, presence of a nodule, and the Timika score itself. These results are consistent with prior reports showing the utility of the Timika score for predicting baseline sputum smear and disease status. We report testing of the Timika score on the largest openly available real-world dataset of TB cases, which can serve as a reference for exploring existing and new TB disease severity scores that bridge radiological, microbiological, and clinical data.
To illustrate, we visualize the Timika score derived from images in our database alongside other case characteristics, demonstrating that the score captures lung biomarker status consistent with known clinical risk factors.
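For readers unfamiliar with the score, a minimal sketch follows. It assumes the commonly cited definition of the Timika CXR score (percentage of total lung field affected, plus 40 if any cavitation is present). The abstract does not state the formula, and it does not describe how the percentage affected was estimated from the radiologist observations in this study, so the function name and its arguments are hypothetical and for illustration only.

```python
def timika_score(percent_lung_affected: float, cavity_present: bool) -> float:
    """Illustrative Timika CXR score.

    Assumption (not stated in the abstract): score equals the percentage of
    total lung field affected (0-100) plus 40 if any cavity is present.
    """
    if not 0 <= percent_lung_affected <= 100:
        raise ValueError("percent_lung_affected must be between 0 and 100")
    # Cavitation adds a fixed 40-point increment under the assumed definition.
    return percent_lung_affected + (40 if cavity_present else 0)


# Hypothetical example: 30% of the lung field affected, with a cavity.
example = timika_score(30, cavity_present=True)
```

Under this assumed definition the score ranges from 0 to 140, with higher values indicating more extensive disease.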
Competing Interest Statement
The authors have declared no competing interest.
Funding Statement
This project has been funded in part with Federal funds from the National Institute of Allergy and Infectious Diseases (NIAID), National Institutes of Health, Department of Health and Human Services under BCBB Support Services Contract HHSN316201300006W/75N93022F00001 to MEDICAL SCIENCE & COMPUTING and under contract HHSN316201200018W/75N98119F00012 to Deloitte Consulting LLP. This research was supported in part by the Office of Science Management and Operations of NIAID at the NIH. No additional external funding was received for this study. The funders had no role in study design, data collection and analysis, decision to publish, or preparation of the manuscript.
Author Declarations
I confirm all relevant ethical guidelines have been followed, and any necessary IRB and/or ethics committee approvals have been obtained.
Yes
The details of the IRB/oversight body that provided approval or exemption for the research described are given below:
This study used de-identified data stripped of all PHI/PII, which are made publicly available through the TB Portals Program, a trans-national initiative led by the NIAID (https://tbportals.niaid.nih.gov/). Before the de-identified data are publicly shared and reused, each participating clinical research institution (https://tbportals.niaid.nih.gov/where-do-our-cases-come-from) obtains approval from its own IRB and must adhere strictly to the ethics rules and requirements of CRDF Global and the International Science and Technology Center, which are the grant-issuing institutions (https://journals.asm.org/doi/10.1128/JCM.01013-17). The data were analyzed in accordance with the guidelines specified in the TB Portals Data Use Agreement (https://tbportals.niaid.nih.gov/pdf/TB-Portals-Data-Use-Agreement.pdf).
I confirm that all necessary patient/participant consent has been obtained and the appropriate institutional forms have been archived, and that any patient/participant/sample identifiers included were not known to anyone (e.g., hospital staff, patients or participants themselves) outside the research group so cannot be used to identify individuals.
Yes
I understand that all clinical trials and any other prospective interventional studies must be registered with an ICMJE-approved registry, such as ClinicalTrials.gov. I confirm that any such study reported in the manuscript has been registered and the trial registration ID is provided (note: if posting a prospective study registered retrospectively, please provide a statement in the trial ID field explaining why the study was not registered in advance).
Yes
I have followed all appropriate research reporting guidelines and uploaded the relevant EQUATOR Network research reporting checklist(s) and other pertinent material as supplementary files, if applicable.
Yes
Data Availability
The TB Portals program requires all users of the data to sign a Data Use Agreement (DUA) before access to the underlying de-identified clinical data is provided; the data can be requested at the following URL (https://tbportals.niaid.nih.gov/download-data). Therefore, in compliance with the DUA, this study provides the code to reproduce the analysis (https://github.com/niaid/tbportals.xray.sputum.2021) without the underlying raw data. To aid reproducibility, the list of public identifiers of the cases is provided in Supplementary Table 4.