Abstract
Existing imaging genetics studies have been mostly limited in scope by their reliance on imaging-derived phenotypes defined by human experts. Here, leveraging recent breakthroughs in self-supervised deep representation learning, we propose a new approach, image-based genome-wide association study (iGWAS), for identifying genetic factors associated with phenotypes discovered from medical images using contrastive learning. Using retinal fundus photos, our model extracts a 128-dimensional vector representing features of the retina as phenotypes. After training the model on 40,000 images from the EyePACS dataset, we generated phenotypes from 130,329 images of 65,629 British White participants in the UK Biobank. We conducted GWAS on three sets of phenotypes: raw image phenotypes, derived from the original photos; retina color, the average color of the center region of the retinal fundus photos; and vessel-enriched phenotypes, derived from vasculature-segmented images. GWAS of raw image phenotypes identified 14 loci with genome-wide significance (p < 5×10⁻⁸, taking the intersection of hits from the left and right eyes), while GWAS of retina color identified 34 loci, 7 of which overlap with the raw image loci. Finally, GWAS of vessel-enriched phenotypes identified 34 loci, of which 25 overlap with the raw image and color loci and 9 are unique to the vessel-enriched GWAS. We found that the vessel-enriched GWAS not only retains most of the loci from the raw image GWAS but also discovers new loci related to vessel development. Our results establish the feasibility of this new framework of genomic study based on self-supervised phenotyping of medical images.
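The significance criterion in the abstract (p < 5×10⁻⁸, with hits required in both the left and right eyes) can be illustrated with a minimal sketch. This is one plausible reading of that criterion, not the authors' pipeline: the function name, the dictionary input layout, and the variant IDs are hypothetical, the p-values are made up, and the sketch works per variant for a single phenotype rather than per locus across all 128 dimensions.

```python
GENOME_WIDE_P = 5e-8  # genome-wide significance threshold stated in the abstract


def significant_in_both_eyes(left_pvals, right_pvals, threshold=GENOME_WIDE_P):
    """Return the variant IDs whose p-value is below the threshold in both eyes.

    left_pvals / right_pvals map a variant ID to its GWAS p-value
    (a hypothetical input layout, not the authors' file format).
    """
    left_hits = {v for v, p in left_pvals.items() if p < threshold}
    right_hits = {v for v, p in right_pvals.items() if p < threshold}
    # Keep only variants that are significant in the left AND the right eye.
    return sorted(left_hits & right_hits)


# Toy p-values, invented for illustration only.
left = {"rs_a": 1e-9, "rs_b": 3e-8, "rs_c": 1e-4}
right = {"rs_a": 2e-10, "rs_b": 6e-8, "rs_c": 1e-12}

hits = significant_in_both_eyes(left, right)
print(hits)
```

In this toy input only `rs_a` passes in both eyes: `rs_b` misses the threshold in the right eye and `rs_c` misses it in the left eye, so both are discarded by the intersection.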
Competing Interest Statement
The authors have declared no competing interest.
Funding Statement
This work was supported by grants from the National Institute on Aging (AG070112-01A1) to Z.X., W.Z., L.G., H.C., and D.Z. In addition, this work was supported by the American Diabetes Association (1-16-INI-16 to S.W., C.L., A.D., and M.W.), NIH 1R01EY03258501 (to S.W., C.L., A.D., and M.W.), and a P30 to Stanford Ophthalmology (to S.W., C.L., A.D., and M.W.). This work was also supported by grants from the National Eye Institute (EY022356, EY018571, EY002520), the Retinal Research Foundation, and NIH shared instrument grant S10OD023469 to R. Chen. R. Channa is supported by a grant from the National Eye Institute (1K23EY030911-01). This work was supported in part by an Unrestricted Grant from Research to Prevent Blindness to the UW-Madison Department of Ophthalmology and Visual Sciences. L.G. was also supported by the Translational Research Institute through NASA Cooperative Agreement NNX16AO69A, NIH grants UL1TR003167 and R01NS121154, and a Cancer Prevention and Research Institute of Texas grant (RP 170668).
Author Declarations
I confirm all relevant ethical guidelines have been followed, and any necessary IRB and/or ethics committee approvals have been obtained.
Yes
I confirm that all necessary patient/participant consent has been obtained and the appropriate institutional forms have been archived, and that any patient/participant/sample identifiers included were not known to anyone (e.g., hospital staff, patients or participants themselves) outside the research group so cannot be used to identify individuals.
Yes
I understand that all clinical trials and any other prospective interventional studies must be registered with an ICMJE-approved registry, such as ClinicalTrials.gov. I confirm that any such study reported in the manuscript has been registered and the trial registration ID is provided (note: if posting a prospective study registered retrospectively, please provide a statement in the trial ID field explaining why the study was not registered in advance).
Yes
I have followed all appropriate research reporting guidelines and uploaded the relevant EQUATOR Network research reporting checklist(s) and other pertinent material as supplementary files, if applicable.
Yes
Footnotes
New results on GWAS of endophenotypes derived from raw retinal fundus images are added.
Data Availability
The retinal fundus images and the genetic data used in this study are provided by UK Biobank (https://www.ukbiobank.ac.uk/enable-your-research/register). The summary statistics of all GWAS can be downloaded at https://drive.google.com/drive/folders/1jaQ-dCDKbY_zW0_FinPUwS7uGlNNPoMc?usp=sharing.