ABSTRACT
OBJECTIVE The study aimed to develop and validate algorithms for identifying people with type 1 and type 2 diabetes in the All of Us Research Program (AoU) cohort, using electronic health record (EHR) and survey data.
RESEARCH DESIGN AND METHODS Two sets of algorithms were developed: one using only EHR data (EHR-only), and the other using a combination of EHR and survey data (EHR+). Their performance was evaluated by testing their association with polygenic scores for both type 1 and type 2 diabetes.
RESULTS For type 1 diabetes, the EHR-only algorithm showed a stronger association with the T1D polygenic score (p=3×10−5) than the EHR+ algorithm. For type 2 diabetes, the EHR+ algorithm outperformed both the EHR-only algorithm and the existing AoU definition, identifying additional cases (25.79% and 22.57% more, respectively) and showing a stronger association with the T2D polygenic score (DeLong p=0.03 and 1×10−4, respectively).
CONCLUSIONS We provide new validated definitions of type 1 and type 2 diabetes in AoU, and make them available for researchers. These algorithms, by ensuring consistent diabetes definitions, pave the way for high-quality diabetes research and future clinical discoveries.
Why did we undertake this study? This study was conducted to develop and validate algorithms for identifying type 1 and type 2 diabetes cases in the All of Us Research Program (AoU).
What is the specific question(s) we wanted to answer? Can accurate algorithms for type 1 and type 2 diabetes identification be developed and validated using AoU cohort electronic health record (EHR) and survey data? Do the identified diabetes cases show association with polygenic scores in diverse populations?
What did we find? We developed a new validated type 1 diabetes definition and expanded upon the existing type 2 diabetes definition.
What are the implications of our findings? The developed algorithms can be universally implemented in AoU for identifying study participants for well-defined case-control diabetes studies.
Competing Interest Statement
The authors have declared no competing interest.
Funding Statement
L.S. is supported by funds from the Ministry of Education and Science of Poland within the project ‘Excellence Initiative – Research University’, the Ministry of Health of Poland within the project ‘Center of Artificial Intelligence in Medicine at the Medical University of Bialystok’ and the American Diabetes Association grant 11-22-PDFPM-03. J.H.L. is supported by NIDDK K23 DK131345 and MGH ECOR Fund for Medical Discovery Clinical Research Award. J.C.F. is supported by NHLBI K24 HL157960. J.M.M. is supported by American Diabetes Association Innovative and Clinical Translational Award 1-19-ICTS-068, American Diabetes Association grant #11-22-ICTSPM-16 and by NHGRI U01HG011723. A.K.M. is supported by the Foundation for the National Institutes of Health with funding from AMP CMD RFP 2: GENERATION of New genetic, -omic, or biomarker data for Common Metabolic Diseases titled ‘Common metabolic disease genetic association analysis in the All of Us Research Program’ and by NHGRI U01HG011723. M.S.U. is supported by NIDDK K23DK114551, NIDDK R03DK131249, and Doris Duke Foundation Award 2022063.
Author Declarations
I confirm all relevant ethical guidelines have been followed, and any necessary IRB and/or ethics committee approvals have been obtained.
Yes
The details of the IRB/oversight body that provided approval or exemption for the research described are given below:
workbench.researchallofus.org
I confirm that all necessary patient/participant consent has been obtained and the appropriate institutional forms have been archived, and that any patient/participant/sample identifiers included were not known to anyone (e.g., hospital staff, patients or participants themselves) outside the research group so cannot be used to identify individuals.
Yes
I understand that all clinical trials and any other prospective interventional studies must be registered with an ICMJE-approved registry, such as ClinicalTrials.gov. I confirm that any such study reported in the manuscript has been registered and the trial registration ID is provided (note: if posting a prospective study registered retrospectively, please provide a statement in the trial ID field explaining why the study was not registered in advance).
Yes
I have followed all appropriate research reporting guidelines, such as any relevant EQUATOR Network research reporting checklist(s) and other pertinent material, if applicable.
Yes
Footnotes
# These authors jointly directed this work.
Twitter Summary “New study develops and validates type 1 and type 2 diabetes algorithms in the All of Us Research Program cohort, improving case identification for diabetes research. #diabetesresearch #AllOfUsResearchProgram”
Data Availability
All data produced in the present study are available upon reasonable request to the authors.