Abstract
Objective: Sarcoidosis is a granulomatous disease affecting the lungs in over 90% of patients. Qualitative assessment of chest CT by radiologists is standard clinical practice, and reliable quantification of disease from CT would support ongoing efforts to identify sarcoidosis phenotypes. Standard imaging feature engineering techniques such as radiomics suffer from extreme sensitivity to image acquisition and processing, potentially impeding generalizability of research to clinical populations. In this work, we instead investigate approaches to engineering variogram-based features, with the intent of identifying a robust, generalizable pipeline for image quantification in the study of sarcoidosis.
Approach: For a cohort of more than 300 individuals with sarcoidosis, we investigated 24 feature engineering pipelines that differed in their choices of image registration to a template lung, empirical and model variogram estimation methods, and feature harmonization for CT scanner model. We subsequently investigated 48 sets of phenotypes produced through unsupervised clustering. We then assessed how sensitive the engineered features, the resulting phenotypes, and sarcoidosis disease signal strength were to the choice of pipeline.
Main results: We found that variogram features had low to mild association with scanner model and that these associations were reduced by image registration. For each feature type, features were also typically robust to all pipeline decisions except image registration. Disease signal, as measured by association with pulmonary function testing and some radiologist visual assessments, was strong (optimistic AUC ≈ 0.9, p ≪ 0.0001 in models for architectural distortion, conglomerate mass, fibrotic abnormality, and traction bronchiectasis) and fairly consistent across engineering approaches, regardless of registration and harmonization for CT scanner.
Significance: Variogram-based features appear to be a suitable approach to image quantification in support of generalizable research in pulmonary sarcoidosis.
Competing Interest Statement
LAM received grants from the National Institutes of Health (R01HL140357, R01HL142049, and R01HL136681), Ann Theodore Foundation, the FSR, Mallinckrodt Pharmaceuticals, and the University of Cincinnati (Mallinckrodt Pharmaceuticals Foundation Grant) and serves on the Scientific Advisory Board for FSR and Boehringer Ingelheim. RY was a speaker at Boehringer Ingelheim and Omnicurus CME experts, and is a consultant with IMBIO. WLL, TEF, DAL, JR, ACH, SL, MMM, BQB, KJC, HJH, and NEC have nothing to declare.
Funding Statement
This work was supported by the National Institutes of Health under Grants R01 HL114587, R01 HL142049, R01 HL152735, and T32 HL007085. Data from the GRADS study was supported under Grants U01 HL112707, U01 HL112694, U01 HL112695, U01 HL112696, U01 HL112702, U01 HL112708, U01 HL112711, and U01 HL112712.
Author Declarations
I confirm all relevant ethical guidelines have been followed, and any necessary IRB and/or ethics committee approvals have been obtained.
Yes
The details of the IRB/oversight body that provided approval or exemption for the research described are given below:
This study was reviewed and approved by the National Jewish Health and BRANY Institutional Review Boards (HS3211).
I confirm that all necessary patient/participant consent has been obtained and the appropriate institutional forms have been archived, and that any patient/participant/sample identifiers included were not known to anyone (e.g., hospital staff, patients or participants themselves) outside the research group so cannot be used to identify individuals.
Yes
I understand that all clinical trials and any other prospective interventional studies must be registered with an ICMJE-approved registry, such as ClinicalTrials.gov. I confirm that any such study reported in the manuscript has been registered and the trial registration ID is provided (note: if posting a prospective study registered retrospectively, please provide a statement in the trial ID field explaining why the study was not registered in advance).
Yes
I have followed all appropriate research reporting guidelines, such as any relevant EQUATOR Network research reporting checklist(s) and other pertinent material, if applicable.
Yes
Data Availability
GRADS data are available via the process laid out by the GRADS consortium.