Abstract
Objective For multi-center heterogeneous Real-World Data (RWD) with time-to-event outcomes and high-dimensional features, we propose the SurvMaximin algorithm to estimate Cox model feature coefficients for a target population by borrowing summary information from a set of health care centers without sharing patient-level information.
Materials and Methods For each center from which we borrow information to improve prediction performance in the target population, we fit a penalized Cox model to estimate that center's feature coefficients. Using these estimated coefficients and the covariance matrix of the target population, we then obtain the SurvMaximin estimate of the feature coefficients for the target population. The target population can be the entire cohort comprising all centers, corresponding to federated learning, or a single center, corresponding to transfer learning.
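The aggregation step described above can be illustrated with a minimal sketch. The abstract does not give the SurvMaximin objective, so the sketch assumes the generic maximin-aggregation form: the target estimate is a convex combination of the per-center coefficient vectors, with weights that minimize a quadratic form defined by the target covariance matrix. The function name `maximin_aggregate`, the optional `ridge` parameter, and the toy inputs are hypothetical and are not taken from the paper. The per-center penalized Cox fits are assumed to have been computed already.

```python
import numpy as np
from scipy.optimize import minimize


def maximin_aggregate(B, Sigma, ridge=0.0):
    """Combine per-center coefficient vectors into one target estimate.

    B     : (p, L) array, column l holds center l's estimated Cox coefficients
            (assumed to come from penalized Cox fits done at each center).
    Sigma : (p, p) covariance matrix of the features in the target population.
    ridge : optional stabilizing penalty on the weights (an assumption here).
    """
    L = B.shape[1]
    # Quadratic form in the weights: gamma' (B' Sigma B) gamma
    Gamma = B.T @ Sigma @ B + ridge * np.eye(L)

    result = minimize(
        fun=lambda g: g @ Gamma @ g,
        x0=np.full(L, 1.0 / L),           # start from equal weights
        jac=lambda g: 2.0 * Gamma @ g,    # Gamma is symmetric
        method="SLSQP",
        bounds=[(0.0, 1.0)] * L,          # weights are non-negative
        constraints=[{"type": "eq", "fun": lambda g: g.sum() - 1.0}],
    )
    weights = np.clip(result.x, 0.0, None)
    weights /= weights.sum()              # re-normalize after clipping
    return B @ weights, weights


# Toy inputs: two centers, two features, identity target covariance.
B_demo = np.array([[1.0, 0.0],
                   [0.0, 1.0]])
Sigma_demo = np.eye(2)
beta_demo, w_demo = maximin_aggregate(B_demo, Sigma_demo)
```

Only the coefficient vectors and a covariance matrix enter the computation, which matches the summary-level exchange described in the text. Whether the published algorithm uses exactly this objective or adds further regularization should be checked against the full paper.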
Results Simulation studies and a real-world international electronic health records application study, with 15 participating health care centers across three countries (France, Germany, and the U.S.), show that the proposed SurvMaximin algorithm achieves accuracy comparable to or higher than that of the estimator using only target-site information and that of other existing methods. The SurvMaximin estimator is robust to variations in sample sizes and estimated feature coefficients between centers, which translates into significantly improved estimates for target sites with fewer observations.
Conclusions The SurvMaximin method is well suited for both federated and transfer learning in the high-dimensional survival analysis setting. SurvMaximin requires only a one-time exchange of summary information from participating centers, and it accommodates estimated regression vectors that are highly heterogeneous across centers. SurvMaximin provides robust Cox feature coefficient estimates without outcome information in the target population and is privacy-preserving.
Competing Interest Statement
The authors have declared no competing interest.
Funding Statement
GMW is supported by National Institutes of Health (NIH)/National Center for Advancing Translational Sciences (NCATS) UL1TR002541, NIH/NCATS UL1TR000005, NIH/National Library of Medicine (NLM) R01LM013345, NIH/National Human Genome Research Institute (NHGRI) 3U01HG008685-05S2. YL is supported by NIH/NCATS U01TR003528, and NLM 1R01LM013337. KC is supported by VA MVP000 and CIPHER. NGB is supported by PI18/00981, funded by the Carlos III Health Institute. DAH is supported by NCATS UL1TR002240. MSK is supported by NHGRI 5T32HG002295-18. JHM is supported by NLM 010098. MM is supported by NCATS UL1TR001857. DLM is supported by NIH/NCATS CTSA Award #UL1-TR001878. SNM is supported by NCATS 5UL1TR001857-05 and NHGRI 5R01HG009174-04. GSO is supported by NIH U24CA210867 and P30ES017885. LPP is supported by NCATS Clinical and Translational Science Award (CTSA) Award #UL1TR002366. FSJV is supported by NIH/NCATS UL1TR001881. AMS is supported by NIH/National Heart, Lung, and Blood Institute (NHLBI) K23HL148394 and L40HL148910, and NIH/NCATS UL1TR001420. SV is supported by NCATS UL1TR001857. ZX is supported by National Institute of Neurological Disorders and Stroke (NINDS) R01NS098023. WY is supported by NIH T32HD040128.
Author Declarations
I confirm all relevant ethical guidelines have been followed, and any necessary IRB and/or ethics committee approvals have been obtained.
Yes
The details of the IRB/oversight body that provided approval or exemption for the research described are given below:
IRB approval was obtained at Assistance Publique - Hôpitaux de Paris, Beth Israel Deaconess Medical Center, Bordeaux University Hospital, Istituti Clinici Scientifici Maugeri Hospitals, University of Kansas Medical Center, Massachusetts General Brigham, Northwestern University, Medical Center University of Freiburg, University of Pittsburgh, and VA North Atlantic, Southwest, Midwest, Continental and Pacific. An exempt determination was made by Institutional Review Boards at Hospital Universitario 12 de Octubre, University of California Los Angeles, University of Michigan, and University of Pennsylvania.
I confirm that all necessary patient/participant consent has been obtained and the appropriate institutional forms have been archived, and that any patient/participant/sample identifiers included were not known to anyone (e.g., hospital staff, patients or participants themselves) outside the research group so cannot be used to identify individuals.
Yes
I understand that all clinical trials and any other prospective interventional studies must be registered with an ICMJE-approved registry, such as ClinicalTrials.gov. I confirm that any such study reported in the manuscript has been registered and the trial registration ID is provided (note: if posting a prospective study registered retrospectively, please provide a statement in the trial ID field explaining why the study was not registered in advance).
Yes
I have followed all appropriate research reporting guidelines and uploaded the relevant EQUATOR Network research reporting checklist(s) and other pertinent material as supplementary files, if applicable.
Yes
Data Availability
All data produced are...