Abstract
Background Stroke in UK Biobank (UKB) is ascertained via linkages to coded administrative datasets and self-report. We studied the accuracy of these codes using genetic validation.
Methods We compiled stroke-specific and broad cerebrovascular disease (CVD) code lists (Read V2/V3, ICD-9/-10) for medical settings (hospital, death record, primary care) and self-report. Among 408,210 UKB participants we identified all with a relevant code, creating 12 stroke definitions based on the code type and source. We performed genome-wide association studies (GWASs) for each definition, comparing summary results against the largest published stroke GWAS (MEGASTROKE), assessing genetic correlations, and replicating 32 stroke-associated loci.
Results Stroke case numbers identified varied widely from 3,976 (primary care stroke-specific codes) to 19,449 (all codes, all sources). All 12 UKB stroke definitions were significantly correlated with the MEGASTROKE summary GWAS results (rg 0.81-1) and each other (rg 0.4-1). However, Bonferroni-corrected confidence intervals were wide, suggesting limited precision of some results. Six previously reported stroke-associated loci were replicated using ≥1 UKB stroke definitions.
Conclusions Stroke case numbers in UKB depend on the code source and type used, with a 5-fold difference in the maximum case-sample size. All stroke definitions are significantly genetically correlated with the largest stroke GWAS to date.
Competing Interest Statement
The authors have declared no competing interest.
Funding Statement
K Rannikmae is funded by Health Data Research UK Rutherford fellowship MR/S004130/1. AF is funded by BHF award RE/18/5/34216 and MR/S004130/1. This work was supported by the Wellcome Trust-University of Edinburgh Institutional Strategic Support Fund. AT is funded by HDR-UK awards HDR-9004 and HDR-9003. The funders had no role in study design, data collection and analysis, decision to publish, or preparation of the manuscript.
Author Declarations
I confirm all relevant ethical guidelines have been followed, and any necessary IRB and/or ethics committee approvals have been obtained.
Yes
The details of the IRB/oversight body that provided approval or exemption for the research described are given below:
UK Biobank project 2532
All necessary patient/participant consent has been obtained and the appropriate institutional forms have been archived.
Yes
I understand that all clinical trials and any other prospective interventional studies must be registered with an ICMJE-approved registry, such as ClinicalTrials.gov. I confirm that any such study reported in the manuscript has been registered and the trial registration ID is provided (note: if posting a prospective study registered retrospectively, please provide a statement in the trial ID field explaining why the study was not registered in advance).
Yes
I have followed all appropriate research reporting guidelines and uploaded the relevant EQUATOR Network research reporting checklist(s) and other pertinent material as supplementary files, if applicable.
Yes
Footnotes
Grant numbers: MR/S004130/1; BHF award RE/18/5/34216; HDR-UK awards HDR-9004 and HDR-9003.
Data Availability
The datasets analyzed for this study is available from the UK Biobank and the MEGASTROKE consortium website upon request (https://www.megastroke.org/).