Abstract
Background Despite membranous nephropathy (MN) being one of the most common causes of nephrotic syndrome worldwide, its biological and environmental determinants are poorly understood, in large part because it is a rare disease. Making use of the UK Biobank, a unique resource holding a clinical dataset and stored DNA, serum and urine for ∼500,000 participants, this study aims to address this gap in understanding.
Methods The primary outcome was putative MN as defined by ICD-10 codes occurring in the UK Biobank. Univariate relative risk regression modelling was used to assess the associations of the incidence of MN and related phenotypes with sociodemographic factors, environmental exposures, and previously described increased-risk SNPs.
Results 502,507 patients were included in the study of whom 100 were found to have a putative diagnosis of MN; 36 at baseline and 64 during the follow-up. Prevalence at baseline and last follow-up were 72 and 199 cases/million respectively. At baseline, as expected, the majority of those previously diagnosed with MN had proteinuria, and there was already evidence of proteinuria in patients diagnosed within the first 5 years of follow-up. The highest incidence rate for MN in patients was seen in those homozygous for the high-risk alleles (9.9/100,000 person-years).
Conclusion It is feasible to putatively identify patients with MN in the UK Biobank and cases are still accumulating. This study shows the chronicity of disease with proteinuria present years before diagnosis. Genetics plays an important role in disease pathogenesis, with the at-risk group providing a potential population for recall.
Introduction
Membranous nephropathy (MN) is among the most common causes of adult nephrotic syndrome worldwide and has a significant healthcare burden, despite being a rare disease (10-12 cases per million)1. For the majority of patients, it is an autoimmune disease associated with the anti-M-type phospholipase A2 receptor autoantibody (anti-PLA2R), a highly sensitive biomarker and seemingly pathogenic factor in its own right2,3. There appears to be a strong genetic contribution to the condition, with the possession of single-nucleotide polymorphisms (SNPs) at PLA2R1 and the Major Histocompatibility Complex, Class II, DQA1 (HLA-DQA1) conferring a dramatically increased risk of developing the disease4–6. The most recent genome-wide association study (GWAS)6 has also shown two further loci associated with disease risk, NFKB1 and IRF4. However, given its rarity, understanding not only the disease pathogenesis but also the risk of disease onset has been challenging. Why some patients with the genetic predisposition develop a pathogenic antibody whilst others do not has still not been elucidated. Circumstantial evidence suggests there may be a role for the loss of tolerance. As with other autoimmune conditions, the presence of the disease-associated risk SNPs is unlikely to be solely responsible for the evolution of the disease. An accompanying trigger, be it environmental or pathogenic, at the correct time, is needed to initiate the immune cascade leading to symptoms. The podocyte antigen peptide sequence shows similarities to a cell wall enzyme in certain commonly encountered pathogens such as Clostridia species7. One study from China suggests a potential environmental trigger by identifying an association between air pollution and MN compared to other autoimmune glomerulopathies8,9.
The UK Biobank holds a clinical dataset and stored deoxyribonucleic acid (DNA), serum and urine for over 500,000 participants, aged between 40-70 years old, recruited from across the United Kingdom from 2007-2010. This age profile corresponds to the age of onset of MN with a mean of 54 years. DNA has been genotyped on a genome-wide microarray with subsequent imputation.
Further health record data has been linked through the Hospital Episode Statistics (HES) and cancer and mortality statistics10. Participant place of birth, residential area and occupation are all recorded, providing an invaluable resource for the study of environmental and occupational pollution exposures11–14. With the deep phenotyping of each participant, life-long follow-up, genotyping and health record linkage, the UK Biobank provides a unique and powerful tool to help understand the epidemiology of rare diseases such as MN. An initial step in the use of the UK Biobank is its validation as a tool for accurately identifying patients with MN. Neither the International Classification of Diseases, tenth revision (ICD-10) nor UK Read codes (a coding system for clinical terms used in the UK) have an explicit code for MN, but both coding systems have codes that will include MN plus potentially several other diagnoses.
Here, for the first time, we have used the UK Biobank to demonstrate the feasibility of identifying MN cases within it and to investigate the genesis of this rare kidney disease.
The specific study aims were:
to provide a detailed description of the numbers of participants with confirmed or possible MN within the UK Biobank to inform the identification of participants for further phenotyping and sampling or recall for assessments
to determine the prevalence and incidence rates of MN in this population
to stratify the UK Biobank population by the genetic risk of MN
to assess the feasibility of linking environmental factors (infection and occupational exposure) to triggering of autoimmunity in this genetically susceptible cohort
Methods
This is a cohort study in which the primary outcome was putative membranous nephropathy (MN), defined as any primary or secondary HES inpatient diagnosis of ICD-10 N02.2, N03.2, N04.2, or N05.2 occurring before the UK Biobank HES data refresh in March 2019. The nominal diagnosis date was taken as the earliest such record. This did not include self-reported diagnoses at the time of UK Biobank entry, or clinical diagnoses not resulting in hospital admission, which are not fully available in the Biobank dataset.
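The case definition above can be sketched in a few lines. This is a minimal illustration, assuming diagnoses are available as (participant, ICD-10 code, date) tuples; that record layout, the function name and the exact cutoff date are assumptions made for the example and are not the UK Biobank schema or the authors' code.

```python
from datetime import date

# ICD-10 codes used in the text to define putative MN, stored without dots.
MN_ICD10 = {"N022", "N032", "N042", "N052"}

def first_putative_mn(records, cutoff=date(2019, 3, 1)):
    """Return {participant_id: earliest qualifying diagnosis date}.

    `records` is an iterable of (participant_id, icd10_code, diagnosis_date)
    tuples. The layout and the default cutoff are assumptions for illustration.
    """
    earliest = {}
    for pid, code, when in records:
        normalised = code.replace(".", "").upper()
        if normalised in MN_ICD10 and when <= cutoff:
            # Keep the earliest qualifying record as the nominal diagnosis date.
            if pid not in earliest or when < earliest[pid]:
                earliest[pid] = when
    return earliest
```

Primary and secondary diagnoses are treated identically here, as in the definition; records with other codes or dated after the cutoff are ignored.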
Primary care attendance data in the UK is provided by multiple separate entities, with which UK Biobank is working to allow for data linkage. At present, a subset of 228,957 UK Biobank participants had clinical attendances in primary care11–13. For these records putative MN was defined as the equivalent Read codes version 2 (READ2) K0A22, K0A32, K021., K011, K016., K031. or READ3 codes K0A22, K021., K011., K016., K031.
Our descriptive study examined multiple exposures including sociodemographic characteristics and medical history records recorded at the time of UK Biobank entry, as well as hospital inpatient ICD-10 diagnoses and the Office of Population Censuses and Surveys Classification of Interventions and Procedures version 4 (OPCS-4) operative procedures recorded in the UK Biobank data as of 10th December 2019. Additional exposures examined included algorithmically derived end-stage renal disease (ESRD), measures of area-level residential air pollution, and Standard Occupational Classification (SOC)-2000 occupation-linked workplace airborne exposures as derived from the SOC codes and the Airborne Chemical Job Exposure Matrix (ACE JEM)15. Both chronic kidney disease (CKD) and renal disease were derived from Biobank fields f.41202 and f.41204. CKD was defined using the ICD-10 code N18 (chronic kidney disease). Renal disease was defined using the ICD-10 codes N0* and N1*, which include renal diseases other than CKD. The source of each variable is provided in supplementary Table S1.
Given that coeliac disease and type 1 diabetes mellitus classically share the same HLA class II alleles as primary MN, namely HLA-DQ2.529-31, we investigated the incidence of these conditions in the UK Biobank in relation to identified MN cases.
Residential linked air pollution estimates of particulate matter air pollutants with aerodynamic diameters <10μm (PM10) and <2.5μm (PM2.5 or fine PM), and gaseous air pollutants (Nitrogen Dioxide (NO2) and Nitrogen Oxides (NOx)), were generated for 2010 using Land Use Regression (LUR) modelling as part of the European Study of Cohorts and Air Pollution Effects (ESCAPE, http://www.escapeproject.eu/). Additional estimates for Nitrogen Dioxide were obtained for the years 2005-2007 from EU-wide air pollution maps (resolution 100 metre x 100 metre) based on the LUR models; full details of the model and its performance can be found online16. All effect estimates for associations with annual estimates of each air pollutant were expressed per μgm-3.
Where descriptive statistics have been produced, the median and interquartile range (IQR) have been reported for continuous measures and the number and percentage per category have been reported for categorical variables. Where data is missing, the number of cases with missing data is reported. In reporting disease incidence, the number of incident cases occurring post-UK Biobank entry has been divided by the total person-years of follow-up across the disease-free UK Biobank population until March 1st 2019, accounting for individual date of UK Biobank entry and recorded date of death if applicable. Baseline patients were defined as those with a diagnosis of MN on study entry, ‘Early Incident’ patients as those diagnosed within five years after study entry, and ‘Late Incident’ patients as those diagnosed later than 5 years after study entry.
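The incidence calculation and the baseline, early incident and late incident grouping described above can be sketched as follows. This is a hedged illustration only: the function names and the tuple layout are hypothetical, and ending a participant's time at risk at diagnosis is an assumption, since the text specifies only entry date, date of death and the March 1st 2019 end of follow-up.

```python
from datetime import date

CENSOR = date(2019, 3, 1)  # end of follow-up stated in the text

def person_years(entry, diagnosis=None, death=None, censor=CENSOR):
    """Years at risk from entry to the first of diagnosis, death or censoring.

    Stopping the clock at diagnosis is an assumption of this sketch.
    """
    end = min(d for d in (diagnosis, death, censor) if d is not None)
    return max((end - entry).days, 0) / 365.25

def classify_case(entry, diagnosis):
    """Label a case as baseline, early incident (within 5 years) or late incident."""
    if diagnosis <= entry:
        return "baseline"
    years = (diagnosis - entry).days / 365.25
    return "early incident" if years <= 5 else "late incident"

def incidence_per_100k(participants, censor=CENSOR):
    """participants: iterable of (entry, diagnosis_or_None, death_or_None)."""
    cases, total_py = 0, 0.0
    for entry, diagnosis, death in participants:
        if diagnosis is not None and diagnosis <= entry:
            continue  # prevalent at baseline: not part of the disease-free population
        total_py += person_years(entry, diagnosis, death, censor)
        if diagnosis is not None and diagnosis <= censor:
            cases += 1
    return 1e5 * cases / total_py
```

Baseline cases are excluded from the denominator because the rate is defined over the disease-free population at entry.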
SNPs associated with MN were identified as the lead independent SNPs from 3 previous GWAS studies, referred to as GWAS1-3. Allele counts identified in GWAS14 were obtained from the directly genotyped Biobank dataset. Allele counts from GWAS25 and GWAS36 were obtained from the imputed Biobank dataset. Imputed genotypes were used as called regardless of any quality assessment.
Associations between dichotomous phenotypes and genotypes were assessed by determining the relative risk (RR) associated with each genotype compared to homozygous reference alleles, and the RR per allele (additive genetic model).
To determine univariate associations between exposures of interest and MN diagnosis, generalised linear models were used with non-cases as the reference category. Results were presented as relative risks (RR) with accompanying 95% Wald confidence intervals.
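For a single binary exposure, a relative risk with a Wald interval can be computed directly from a 2x2 table. The sketch below is an illustration under that assumption and is not the authors' R code; the function name and arguments are hypothetical. The interval is built on the log scale using the standard error of log(RR).

```python
import math

def relative_risk(cases_exposed, n_exposed, cases_ref, n_ref, z=1.96):
    """Relative risk with a Wald 95% confidence interval on the log scale.

    Requires at least one case in each group; inputs are simple counts.
    """
    rr = (cases_exposed / n_exposed) / (cases_ref / n_ref)
    # Standard error of log(RR) for two independent binomial proportions.
    se_log = math.sqrt(
        1 / cases_exposed - 1 / n_exposed + 1 / cases_ref - 1 / n_ref
    )
    lower = math.exp(math.log(rr) - z * se_log)
    upper = math.exp(math.log(rr) + z * se_log)
    return rr, lower, upper
```

With very few cases in a group, as for the rarer genotype combinations here, the interval becomes wide, which is consistent with the broad confidence intervals reported for the combined high-risk genotype.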
All statistical analyses were performed in R Version 3.5.117. Plink v2.00 and BCFtools v1.10.2 were used to extract genetic data from the UK Biobank18–21.
Results
MN Incidence and Prevalence
A total of 502,507 patients were included in the study. Based on hospital admissions, 100 participants were found to have a putative diagnosis of MN as indicated by the ICD-10 codes, 36 at baseline and 64 occurring during the follow-up period. At baseline, the prevalence of MN in the Biobank population was 72 cases per million, with 199 cases per million at the latest follow-up. Of the 64 incident cases, N05.2 contributed the largest number of patients at 31. With approximately 4.98 million total person-years of follow-up across the UK Biobank population, this corresponded to an incidence rate of 1.29 per 100,000 person-years, with a continued rise over time as the Biobank population ages. Table 1 and Figure 1.
incidence and prevalence
Monthly cumulative incidence of MN. A) Cases identified each calendar month B) Kaplan-Meier estimate by age with 95%CI.
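The headline prevalence and incidence figures in this subsection can be rechecked from the counts reported in the text. The short calculation below uses only those reported numbers; the person-years value is the approximate 4.98 million quoted above, so the result is a consistency check and not a re-analysis.

```python
participants = 502_507
baseline_cases = 36
incident_cases = 64
person_years = 4.98e6  # "approximately 4.98 million" in the text

# Prevalence per million participants at baseline and at latest follow-up.
baseline_prevalence = 1e6 * baseline_cases / participants
final_prevalence = 1e6 * (baseline_cases + incident_cases) / participants

# Incidence rate per 100,000 person-years.
incidence_rate = 1e5 * incident_cases / person_years

print(round(baseline_prevalence), round(final_prevalence), round(incidence_rate, 2))
# prints: 72 199 1.29
```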
Demographics of MN cases
In the MN cohort, the median age at baseline assessment was 62 years (IQR 56-65), compared to a median of 58 years (IQR 50-63) in the non-MN group. Males made up 39% of the MN cohort, compared to 54.4% of the non-MN cohort. The majority of participants in the UK Biobank with and without MN were white, with similar BMI, smoking status, alcohol consumption and levels of deprivation as measured by the Townsend Deprivation Index and by the Index of Multiple Deprivation. Table 2.
Demographics for the UK Biobank population
Clinical
There was a higher level of proteinuria at baseline in the MN group compared to the non-MN group, with a median urinary albumin:creatinine ratio (uACR) of 3.0 mg/g (IQR 0.4-22.9) and 0.4 mg/g (IQR 0.4-0.6) respectively. Renal function at baseline was also lower in the MN group with a median estimated glomerular filtration rate (eGFR) 78.1ml/min/1.73m2 (IQR 52.2-93.6), compared to the non-MN cohort who had a median eGFR 92.8ml/min/1.73m2 (IQR 82.9-100.1). Table 3.
Phenotype per group. Relationship between baseline cases, all incident cases, and incident cases <=5 years from baseline. MN = Membranous Nephropathy, HES = Hospital Episode Statistics, CKD = Chronic Kidney Disease, ESRD = End-Stage Renal Disease, ACR = urinary albumin:creatinine ratio, eGFR = estimated Glomerular Filtration Rate, IQR = Inter-Quartile Range, SD = Standard Deviation, OPCS = OPCS Classification of Interventions and Procedures. See supplementary material table 1 for biobank variable codes and sources. eGFR calculated at baseline, algorithmically derived ESRD assessed throughout follow up period.
For those diagnosed with MN prior to study recruitment, a majority (66%; n=21) had proteinuria at baseline with a uACR of more than 3mg/g, and 28% (n=9) had macroalbuminuria with a uACR of more than 30mg/g. For those diagnosed within 5 years of recruitment (early incident), there was already evidence of proteinuria at baseline, with 49% (n=17) having a uACR of more than 3mg/g, and 29% (n=10) a uACR of more than 30mg/g. In the late incident group, diagnosed more than 5 years after recruitment, the majority had no evidence of proteinuria at baseline, with 69% (n=18) having a uACR of less than 3mg/g. In this late incident group, proteinuria (uACR greater than 3mg/g) was noted at baseline in 31% (n=8) of patients, and macroalbuminuria (uACR of more than 30mg/g) in 15% (n=4). Table 3.
Renal function for patients diagnosed prior to recruitment and in the early incident group, as measured by eGFR, was similar at baseline; median 74.3ml/min/1.73m2 (IQR 46.8-86.7) in the baseline group, and median 76.8ml/min/1.73m2 (IQR 48.8-93.8) in the early incident group. In the late incident group, the median baseline eGFR was 92.9ml/min/1.73m2 (IQR 76.3-95.0). Table 3.
Genetics
In the total population, we found that 73.4% (n=357,516) were homozygous for the low-risk HLA-DQA1 allele (CC) and 2.4% (n=11,622) were homozygous for the high-risk allele (TT). For PLA2R1, 35.8% (n=174,588) were homozygous for the high-risk allele (AA) compared to 16.8% (n=81,861) homozygous for the low-risk allele (GG). Overall, 0.8% (n=4,079) were homozygous for the high-risk alleles at both SNPs. Tables S2 and S3.
We see similar results in the UK Biobank population as in the original GWAS with respect to risk of MN diagnosis. Compared to those homozygous for the low-risk HLA-DQA1 allele (C), being homozygous for the high-risk allele (T) was associated with a 9.79-fold risk of MN diagnosis (RR: 9.79, 95% CI: 5.36-17.85). For PLA2R1, compared to those homozygous for the low-risk allele (G), being homozygous for the high-risk allele (A) was associated with a 2.22-fold risk of MN (RR: 2.22, 95% CI: 1.16-4.25). Considering HLA-DQA1 and PLA2R1 in combination, using the low-risk allele combination CCGG as the reference, the relative risk of MN for those homozygous for the high-risk alleles at both SNPs was 23.44 (95% CI: 7.67-71.62). There was no increased risk of MN associated with being homozygous for either of the two novel lead SNPs identified in the most recent GWAS6. For NFKB1, the relative risk of MN among those homozygous for the high-risk allele was 1.08 (95% CI: 0.56-2.10); and for IRF4, the RR among those homozygous for the high-risk allele was 1.72 (95% CI: 0.62-4.75). Table 4 and Figure 2.
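The combined genotype groups used in the risk comparisons (TTAA and CCGG) can be derived from per-SNP risk-allele counts. A minimal sketch, assuming each participant's genotype has already been reduced to a count of 0, 1 or 2 risk alleles at the HLA-DQA1 SNP (risk allele T) and the PLA2R1 SNP (risk allele A); the function name and input format are hypothetical.

```python
def combined_genotype_group(hla_dqa1_t_count, pla2r1_a_count):
    """Map risk-allele counts at the two lead SNPs to a combined group label."""
    for count in (hla_dqa1_t_count, pla2r1_a_count):
        if count not in (0, 1, 2):
            raise ValueError("risk-allele count must be 0, 1 or 2")
    if hla_dqa1_t_count == 2 and pla2r1_a_count == 2:
        return "TTAA"  # homozygous high-risk at both SNPs
    if hla_dqa1_t_count == 0 and pla2r1_a_count == 0:
        return "CCGG"  # homozygous low-risk at both SNPs (reference group)
    return "other"
```

Grouping in this way keeps the reference category (CCGG) explicit, so any relative risk for TTAA is always computed against the same baseline.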
Genetic risk analysis in the membranous nephropathy cohort using the lead SNPs from GWAS14, GWAS25 and GWAS36 from the imputed UK Biobank dataset. MN - Membranous Nephropathy, RR - Relative Risk, CI - Confidence Interval, GWAS - Genome-wide Association Study, OR - Odds Ratio, SNP - Single Nucleotide Polymorphism
Genetic Risk of MN by PLA2R1 and HLADQ. Based on directly called SNPs determined in GWAS14
The highest incidence rate of MN was seen in those homozygous for the high-risk HLA-DQA1 allele (TT) and high-risk PLA2R1 allele (AA), with an incidence rate of 9.9 cases per 100,000 person-years. The lowest-risk group (CCGG) had an incidence rate of 0.5 cases per 100,000 person-years. Table 5.
Prevalence and incidence measures for: those in high-risk and low-risk groups; those with a PLA2R1 allele and without a PLA2R1 allele
Subgroup analysis comparing the cohort homozygous for the high-risk alleles at both HLA-DQA1 and PLA2R1 (TT and AA respectively) with those not homozygous for both shows a similar proportion of diabetes, both self-reported and through HES linkage. In the high-risk group, 2.8% of individuals self-reported having coeliac disease, and 3.2% were identified as having coeliac disease through HES linkage; this compares to 0.4% and 0.5% respectively in the low-risk group. Table S4.
At baseline, there was no difference in eGFR in patients homozygous for the high-risk HLA-DQA1 and PLA2R1 alleles (TTAA) compared to those not. For uACR, the high-risk group showed a weak association with a higher degree of proteinuria over 30 mg/g: 0.71% among those homozygous for TTAA, compared to 0.47% in those not homozygous for the high-risk alleles (not TTAA). Table S4.
Using patients with baseline proteinuria data, and with those homozygous for the low-risk HLA-DQA1 allele as reference, there was a weak negative association with mild proteinuria or macroalbuminuria for those homozygous for the high-risk allele (TT) (RR: 0.94, 95% CI: 0.86-1.02). Patients homozygous for the high-risk PLA2R1 allele (AA) were slightly more likely to have mild proteinuria (RR: 1.05, 95% CI: 1.01-1.08) and heavy proteinuria (RR: 1.11, 95% CI: 0.99-1.26), compared to the low-risk allele group (GG). Table S5.
Pollution and Work-Related Environmental Exposures
Univariate comparisons indicated no significant associations between increased environmental exposure to Nitrogen Dioxide, Nitrogen Oxides, or Particulate Matter and risk of MN diagnosis. Likewise, no significant associations were found between work-related environmental exposures and risk of MN diagnosis, though comparisons were subject to small sample sizes among those exposed with MN. Tables 6 and S6.
Associations with environmental exposures presented in μgm-3: pollution exposure: Identified MN cases versus controls with no diagnosed MN
Primary care data
Data from primary care records was available for a subset of 228,957 (45.6%) participants. Of these, there were 38 patients identified as putative MN cases using the previously defined READ codes. Using hospital admissions data, 41 patients in this subset were identified as having MN using the previously defined ICD codes. Of all identified patients with MN, 19 were identified in both sources, 19 only in primary care and 22 only through hospital admissions. For those patients identified in both sources, the majority were coded in both databases at the same time, but there were a number of patients with diagnoses in the primary care database for whom diagnosis was coded years in advance of hospital admission. The age distribution at first hospital admission was similar among patients included and not included in the primary care database. However, some of the primary care cases not identified by hospital admissions were diagnosed at a younger age. Supplementary figure S7.
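The comparison of the two case sources amounts to a set overlap. The sketch below illustrates the bookkeeping with hypothetical participant identifiers; only the counts (38 primary care cases, 41 hospital cases, 19 in both) come from the text, and the function name is an assumption.

```python
def source_overlap(primary_care_ids, hospital_ids):
    """Count cases found in both sources, in one source only, and in either."""
    gp, hes = set(primary_care_ids), set(hospital_ids)
    return {
        "both": len(gp & hes),
        "primary care only": len(gp - hes),
        "hospital only": len(hes - gp),
        "any source": len(gp | hes),
    }

# Synthetic IDs arranged to reproduce the reported counts.
example = source_overlap(range(0, 38), range(19, 60))
```

With the reported counts, 19 + 19 + 22 gives 60 putative cases from either source within the primary care subset.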
Discussion
This is the first study to examine the occurrence and determinants of MN within the UK Biobank cohort. We have shown that putative MN cases can be identified at entry to the UK Biobank study as well as throughout follow-up. The incidence rate calculated here of 1.29 diagnoses per 100,000 person-years is similar to previously reported rates in other studies1,8,9,22–24. Most previous studies reporting incidence rates for MN are based on retrospective data from biopsies – a possibly unrepresentative cohort given the inherent suspicion of an underlying kidney disease. In contrast, the UK Biobank cohort is ostensibly a population-based sample (see Fry et al for consideration of the sample representativeness14), with this study being the first to evaluate MN occurrence in such a population.
At present MN remains a biopsy-diagnosed condition requiring specialist input, usually in a tertiary referral centre in the UK. The coding for MN in primary care records is therefore generally based on correspondence from a tertiary care centre, meaning patients assigned codes for MN are highly likely to have the condition (unpublished work). However, there is no unifying code for MN in either ICD-10 or READ codes, and those that are available include adjunctive diagnoses. This raises the possibility that patients with MN could be missed because they have been coded with more generic descriptions of the condition, such as nephrotic syndrome. A number of patients identified from primary care data were diagnosed prior to the initiation of HES and therefore either do not appear in the hospital admissions database at all or appear years later. It may be that some of these patients were already in remission before the commencement of HES data, and therefore were never captured. At present, the available HES data also does not capture outpatient episodes, meaning patients diagnosed prior to the commencement of the database will not be identified if they are seen only in the outpatient clinic setting and not as inpatients. Patients in remission could appear in the HES database years after identification in the General Practitioner (GP) database, owing to a relapse necessitating admission to a specialist nephrology centre. An ongoing limitation with data linkage in the UK Biobank is the incomplete coverage of both primary care data and HES, although this is being continuously addressed and updated.
MN is a disease of middle to late age, with a median age at baseline here of 62 years25,26. We found a female predominance for MN in the UK Biobank, which is in keeping with autoimmune disease in general although not with MN specifically27. A number of previous studies have shown a male predominance in MN, although this prior work has been inconsistent, emblematic of the difficulty of epidemiological studies in examining this rare disease1.
As expected, there was a higher degree of proteinuria and lower eGFR at baseline in our study in patients with MN compared to the non-MN cohort. Interestingly, those patients who had no diagnosis of MN at baseline but who went on to be diagnosed within 5 years already had evidence of subclinical biochemical abnormalities suggestive of MN. Almost half of these early incident patients had some degree of proteinuria and over a third had an eGFR of less than 60ml/min/1.73m2. These findings are similar to other recent studies28, and are indicative of the chronicity and slowly progressive nature of the disease, with patients experiencing ongoing pathogenic autoantibody production and glomerular damage years before it becomes clinically apparent. Identification of this prodromal cohort in the UK Biobank offers a unique opportunity to confirm the early role of anti-PLA2R in pathogenesis up to 5 years prior to clinical diagnosis of MN using the sera stored at study entry. It also allows testing of the hypothesis that infection could be the environmental trigger for MN by interrogation of the IgG antibody proteome library in early serum samples first positive for anti-PLA2R, in comparison to age/gender-matched controls. It may be possible to identify a unique IgG phenotype to infectious agents coincident with the onset of anti-PLA2R autoantibodies.
Currently 80% of autoimmune MN patients present with nephrotic syndrome without any prior indication of an underlying kidney disease, but with proteinuria and autoantibody production that may have been active for a variable period before this. Although the exact cause is still unknown, knowledge of the disease pathogenesis has increased markedly since the discovery of the anti-PLA2R autoantibody. There is undoubtedly a strong genetic component to autoimmune MN, with multiple GWAS initially identifying two genes in Europeans, HLA-DQA1 and PLA2R1, and more recently NFKB1 and IRF4, that all contribute to susceptibility to MN. Possession of homozygous pathological alleles of HLA-DQA1 and PLA2R1 raises the odds ratio from 1 to 78 for having the disease4. What is not known is how likely it is for a patient to develop MN in the presence of these high-risk alleles. What is the lifetime risk of developing MN conferred by the high-risk alleles? Certainly, in our study, the relative risk of MN is significantly higher in patients homozygous for the HLA-DQA1 and PLA2R1 risk alleles compared to those without any of the risk alleles (RR: 23.4, 95% CI: 7.7-71.6). Homozygosity for the risk alleles in HLA-DQA1 and PLA2R1 was associated with a higher incidence of MN (9.9 cases per 100,000 person-years) and a higher degree of macroalbuminuria. This suggests not only a higher incidence but also a more severe phenotype, in keeping with previous work showing that individuals with the HLA-DQA1 risk alleles have higher anti-PLA2R antibody levels26. There was a low penetrance, with only 0.20% of patients homozygous for the high-risk alleles having, or developing, MN during the study follow-up, but as UK Biobank is a lifetime epidemiology study, the risk of developing MN will become evident over time.
Given the chronicity of MN disease and its diagnosis late in life for most patients, this is a group of patients in whom active follow-up would be essential in order to determine the strength of genetics on the development of the disease.
We found no evidence for an increase in MN diagnosis in relation to particulate exposure or occupational exposures to heavy metals and hydrocarbons. This may be related to the much lower levels of exposure among patients in the UK compared to China, where this association was first described in biopsy patients8,9. In our cohort, the median level of exposure over the study period was 10.0μgm-3 for particulate matter of less than 2.5μm, compared to a mean of 55.6μgm-3 (range 8.1 to 110.5) in China. Limitations to this comparison include the small numbers of exposed cases, lack of data on previous exposures, and the study population being generally more educated and from a higher socioeconomic background compared to the general population, and therefore potentially less likely to be exposed to higher pollution levels – all influencing the statistical power of environmental comparisons and our ability to detect plausible magnitude associations. A significantly larger study population would be required to investigate this further at lower particulate matter levels, and at present would be outside the remit of the UK Biobank.
Conclusion
Here we have shown for the first time that it is feasible to putatively identify patients with MN in the UK Biobank, both at study entry and over time, allowing for the use of this powerful resource to study initiation of a rare autoimmune disease in greater depth than previously possible. However, definitive diagnosis would require participant or stored sample access. This study provides further evidence for the chronicity of disease, with proteinuria present years prior to diagnosis in a number of patients. Genetics also plays an important role in the development of the disease, although with low penetrance. Patients with the high-risk alleles provide an invaluable study population for prospective investigation of the disease process; this is particularly pertinent given the potential for missed cases in the database. This at-risk group would be an important sample for recall to interrogate further. Despite the low numbers of MN patients identified, this is the most comprehensive study of MN in a defined population to date. Importantly, it also provides a baseline for further work to understand the pathogenesis of the disease as follow-up progresses.
Data Availability
Data held by UK Biobank (http://ukbiobank.org) - access to data is dependent on application directly to UK biobank
Author Contributions
PB – project conception
PB, SR, DAK and PH – project protocol development
KB, SR and MG – data analysis
MLD, CV, HS, SG, SR - Genetic Analysis
PH – draft manuscript preparation
All authors – manuscript review
Acknowledgements
This was an approved study (I.D. 1618) by UK Biobank (http://ukbiobank.org). We acknowledge funding from Kidney Research UK (KRUK) for the Stoneygate Foundation Grant JFS_IN_003_20160914.
MLD was supported by the KRESCENT post-doctoral fellowship from the Kidney Foundation of Canada
Disclosures
None
Supplementary Material Table of Contents (PDF)
Table S1 Biobank fields and variable definitions
Table S2 Allele counts at the lead SNPs from GWAS14, GWAS25, and GWAS36 in all UK Biobank participants using the imputed UK Biobank datasets
Table S3 Counts of genotype combinations at the lead SNPs from GWAS14 and GWAS25 in all UK Biobank participants using both the imputed UK Biobank datasets
Table S4 Phenotype per combined genotype group at HLADQA1 (rs2187668, test allele T) and PLA2R1 (rs4664308, test allele A) from GWAS14 in all UK Biobank participants using the directly genotyped UK Biobank dataset
Table S5 Phenotype and genotype by degree of proteinuria for all UK Biobank participants, with genotypes obtained from the imputed UK Biobank dataset at the lead SNPs from GWAS14 and GWAS25.
Table S6 Environmental exposures – work-related
Figure S7 A) Time difference from GP diagnosis to HES diagnosis where the case was identified in both sources.
B) Age distributions of the cases identified in the two sources
C) Date of diagnosis of the cases identified in the two sources
Footnotes
* Joint first author