Multi-population genome-wide association study implicates both immune and non-immune factors in the etiology of pediatric steroid sensitive nephrotic syndrome
==============================================================================================================================================================

* Alexandra Barry
* Michelle T. McNulty
* Xiaoyuan Jia
* Yask Gupta
* Hanna Debiec
* Yang Luo
* China Nagano
* Tomoko Horinouchi
* Seulgi Jung
* Manuela Colucci
* Dina F. Ahram
* Adele Mitrotti
* Aditi Sinha
* Nynke Teeninga
* Gina Jin
* Shirlee Shril
* Gianluca Caridi
* Monica Bodria
* Tze Y Lim
* Rik Westland
* Francesca Zanoni
* Maddalena Marasa
* Daniel Turudic
* Mario Giordano
* Loreto Gesualdo
* Riccardo Magistroni
* Isabella Pisani
* Enrico Fiaccadori
* Jana Reiterova
* Silvio Maringhini
* William Morello
* Giovanni Montini
* Patricia L. Weng
* Francesco Scolari
* Marijan Saraga
* Velibor Tasic
* Domenica Santoro
* Joanna A.E. van Wijk
* Danko Milošević
* Yosuke Kawai
* Krzysztof Kiryluk
* Martin R. Pollak
* Ali Gharavi
* Fangmin Lin
* Ana Cristina Simœs e Silva
* Ruth J.F. Loos
* Eimear E. Kenny
* Michiel F. Schreuder
* Aleksandra Zurowska
* Claire Dossier
* Gema Ariceta
* Magdalena Drozynska-Duklas
* Julien Hogan
* Augustina Jankauskiene
* Friedhelm Hildebrandt
* Larisa Prikhodina
* Kyuyoung Song
* Arvind Bagga
* Hae Il Cheong
* Gian Marco Ghiggeri
* Prayong Vachvanichsanong
* Kandai Nozu
* Marina Vivarelli
* Soumya Raychaudhuri
* Katsushi Tokunaga
* Simone Sanna-Cherchi
* Pierre Ronco
* Kazumoto Iijima
* Matthew G. Sampson

## ABSTRACT

Pediatric steroid-sensitive nephrotic syndrome (pSSNS) is the most common childhood glomerular disease. Previous genome-wide association studies (GWAS) identified a risk locus in the HLA Class II region and three additional signals. But the genetic architecture of pSSNS, and its genetically driven pathobiology, is largely unknown. We conducted a multi-population GWAS meta-analysis in 38,463 participants (2,440 cases) and population specific GWAS, discovering twelve significant associations (eight novel). Fine-mapping implicated specific amino acid haplotypes in HLA-DQA1 and HLA-DQB1 driving the HLA Class II risk signal. Non-HLA loci colocalized with eQTLs of monocytes and numerous T-cell subsets in independent datasets. Colocalization with kidney eQTLs was lacking, but overlap with kidney cell open chromatin suggests an uncharacterized disease mechanism in kidney cells. A polygenic risk score (PRS) associated with earlier disease onset in two independent cohorts. Altogether, these discoveries expand our knowledge of pSSNS genetic architecture across populations and provide cellspecific insights into its molecular drivers.

pSSNS is a rare disease of glomerular filtration barrier failure. Its incidence ranges from 0.96-13.5/100,000, being most frequent in South Asian and East Asian populations1. pSSNS causes massive proteinuria and increased risk of thromboembolism, sepsis, and progression to chronic kidney disease (CKD)/end-stage kidney disease (ESKD)2–7. And those progressing to ESKD have increased odds of recurrent NS in their allograft8. pSSNS is impactful across the lifespan - 31-50% of those affected have relapses in adulthood9. Much of pSSNS’s morbidity is related to side effects of the non-specific immunosuppressants that allow some to achieve remission of their proteinuria7,10–17.

There are no monogenic forms of pSSNS to illuminate its pathobiology. However, we know that immune dysregulation is a major contributor18,19. But determining causal immune factors via case-control studies of cytokines profiles, cell types, and transcriptomic signatures is challenging. The dynamic responses of the immune system at different disease stages and to various stimuli make it difficult to determine whether observed differences are causal, correlated, or due to independent biological/environmental factors. And kidney tissue in children is rarely available to determine intrarenal, molecular drivers of pSSNS.

GWAS have discovered four pSSNS risk loci20–24. The top signal in each was in the HLA Class II region. Two other loci are plausibly immune-related, with the closest genes being Calcium Homeostasis Modulator Family Member 6 (*CALHM6*)25 and TNF Superfamily Member 15 (*TNFSF15)*. The lead SNP of the fourth locus was within nephrin (*NPHS1)*, a fundamental glomerular gene implicated in Mendelian NS26. These studies are illuminating but naturally by smaller sample sizes, primarily population-specific analyses, and limited post-GWAS analysis. Here, towards discovering a fuller spectrum of disease-associated genetic variation and unraveling its pathogenesis at the interface of the immune system and kidney, we conducted a large and diverse GWAS of pSSNS.

We conducted a multi-population, fixed-effect, inverse-variance, meta-analysis across twelve GWAS datasets comprised of 2,440 cases and 36,023 controls of Admixed American, African, East Asian, European, Maghrebian and South Asian populations (**Figure 1, Figure S1, Table S1**).To account for population-driven effect heterogeneity, we also performed a meta-regression with MR-MEGA27. Eight loci (four new, and all outside HLA) were significant (MR-MEGA p < 5 × 10−8) (**Table 1, Figure 2A, Figure S2**). The lead SNPs of the novel loci were all intronic: **(1)** rs7759971 in Abelson Helper Integration Site 1*(AHI1*; p = 4.90 × 10−12); **(2)** rs55730955 in CD28 molecule (*CD28*; p = 4.27 × 10−10); **(3)** rs8062322 in C-type Lectin Domain Containing 16A **(***CLEC16A*; p = 1.61 × 10−10); **(4)** rs28862935 in betacellulin **(***BTC*; p = 1.08 × 10−9). Four other significant loci were previously reported24,23. Two more significant loci emerged after conditioning: **(5)** rs1794497 upstream of *HLA-DQB1*, (p = 6.79 × 10−52); **(6)** rs2256318 in an intron of MHC Class I Chain-related Gene A (*MICA;* p = 9.70 × 10−18) (**Figure 2B, Figure S3**). Population specific GWAS meta-analysis discovered two additional significant loci in Europeans (**Figure 2C, Table S2, Table S3, Figure S4**): The lead SNPs were in introns of **(7)** rs111796602 in an intron of Engulfment and Cell Motility 1 (*ELMO1;* p = 1.72 × 10−8) and **(8)** rs12911841 in an intron of Mortality Factor 4 Like 1 (*MORF4L1*; p = 3.88 × 10−8). The *CALHM6* association appeared specific to Europeans *(P*anc_het *=* 4.99 × 10−4) and the *TNFSF15* and *NPHS1* associations to East Asians *(P*anc_het *=* 7.76 × 10−4 and *P*anc_het *=* 2.43 10−4, respectively) (**Figure S5, Figure S6**). All other loci had similar effects across populations (**Figure 2D**). Finally, there were 13 novel suggestive loci (MR-MEGA p < 5 × 10−6) in the multi-population GWAS (**Table S4, Table S5**). On a liability scale and excluding HLA, European heritability was 0.04 [CI: -0.08,0.16] and East Asian heritability was 0.12 [CI: 0.04,0.21], with large confidence intervals likely due to small effective sample sizes.

View this table:
[Table 1.](http://medrxiv.org/content/early/2022/10/13/2022.09.13.22279644/T1)

Table 1. Genome-wide significant SNPs from multi-population meta-analysis
EA=effect allele, NEA=non-effect allele, OR [95% CI]= Odds ratio with 95% confidence interval. MR-MEGA results are not available for the conditional analysis.

![Figure 1:](http://medrxiv.org/https://www.medrxiv.org/content/medrxiv/early/2022/10/13/2022.09.13.22279644/F1.medium.gif)

[Figure 1:](http://medrxiv.org/content/early/2022/10/13/2022.09.13.22279644/F1)

Figure 1: Flowchart of study design.
pSSNS= pediatric steroid sensitive nephrotic syndrome, EUR = European, EAS = East Asian, eQTL = expression quantitative trait loci.

![Figure 2.](http://medrxiv.org/https://www.medrxiv.org/content/medrxiv/early/2022/10/13/2022.09.13.22279644/F2.medium.gif)

[Figure 2.](http://medrxiv.org/content/early/2022/10/13/2022.09.13.22279644/F2)

Figure 2. 
Manhattan plots of **A)** multi-population meta-analysis of 2,440 cases vs. 36,023 controls, **B)** multi-population conditional meta-analysis, **C)** European meta-analysis of 674 cases vs. 6,817 controls. Novel genome-wide association are indicated with in red and only novel associations are labeled in B and C. Discoveries that included the summary statistics from suggestive SNPs available from Dufek et al indicated with ‘+’, **D)** Multi-population & single population odds ratios for novel multi-population significant SNPs discovered

A number of insights emerged from evaluating disease associations, functions, and expression patterns of the lead SNPs and/or the closest genes at the novel non-HLA loci. First, PheWAS using Open Target Genetics ([http://genetics.opentargets.org](http://genetics.opentargets.org))28 found that SNPs at most loci associated with GWAS of diverse white blood cell traits, atopic disorders, and autoimmune conditions. For example, among the strongest associations with the lead SNPs at the following loci include: *CLEC16A, CD28, MICA*, and *ELMO1* with eosinophil counts; *AHI* with monocyte and neutrophil counts, asthma, and hay fever (also shared by *CD28*); and *MICA* with Type 1 Diabetes (T1DM).

Second, most of these genes, while primarily known for their role in immunity, also have known roles in kidney diseases and cells. Common *AHI1* variants are associated with atopy, lupus, and diverse immune cell traits28. Rare *AHI1* coding variants cause the monogenic ciliopathy Joubert Syndrome, which includes cystic kidney disease29. *ELMO1* participates in Rac1 pathway activation and actin cytoskeletal rearrangement30, is expressed in podocytes31, and is associated with diabetic nephropathy32. *CD28*, a T-cell glycoprotein, binds a co-stimulatory molecule B7-1 (CD80) on antigen-presenting cells. B7-1 is expressed on human podocytes in some forms of nephrotic syndrome, and blocking the B7-1/CD28 interaction with a CTLA-4 immunoglobulin can ameliorate proteinuria33. *MICA* is expressed in kidney endothelium, binds and activates cytotoxic CD8+ T cells and NK cells, and has increased glomerular expression in lupus34. *BTC* contributes to inflammation by binding to epidermal growth factor receptor35, a gene whose renal expression is upregulated after kidney injury36. *CLEC16A* takes part in the B cell receptor-dependent HLA-II pathway in human B cells37, but is also significantly expressed in the human podocytes ([https://atlas.kpmp.org](https://atlas.kpmp.org)). It is involved in autophagy, mitophagy, and endolysosomal trafficking in multiple cell types38,39. It is also in close proximity to *CIITA*, a master transcription factor of HLA class II genes40 and Dexamethasone Inducible Transcript (*DEXI)*, a glucocorticoid-induced gene41.

We next turned to discovering specific variants and genes driving these GWAS signals and discerning whether they are acting in immune cells, kidney cells, or both.

First, we conducted colocalization with eQTL data from two functionally distinct kidney compartments (glomerulus and tubulointerstitium)42, multiple tissues from GTEx43, and immune cells from DICE44 and Blueprint45. Overall, pSSNS GWAS SNPs demonstrated significant enrichment in multiple immune cell eQTLs, led by a 69x and 62x increased odds of being monocyte and CD4+ memory Treg eQTLs, respectively (**Figure 3**). On an individual gene level, considering eQTLs with a regional colocalization probability (RCP) > 0.2, nine genes – including three which were the closest gene to a GWAS SNP (*CALHM6, AHI1, TNFSF15) –* colocalized with immune cell eQTLs (Figure 3, Table S6). All three significantly colocalized with monocyte eQTLs. *AHI* had significant eQTLs across monocytes, many T-cell subsets, and naïve B cells. Finally, in CD4+ memory Treg cells, SNPs in the “Gasdermin B (*GSDMB)”* suggestive locus colocalized with two different genes – *GSDMB* and ORMDL sphingolipid biosynthesis regulator 3 (*ORMDL3)*. The *GSDMB/ORMDL3* locus is associated with multiple autoimmune disorders and eosinophilic inflammation-driven asthma46. In asthma, higher *GSDMB* expression is correlated with increased interferon signaling and MHC class I antigen presentation47. Notably, there was no colocalization with kidney eQTLs despite sufficient sample sizes to do so (**Figure S7**).

![Figure 3.](http://medrxiv.org/https://www.medrxiv.org/content/medrxiv/early/2022/10/13/2022.09.13.22279644/F3.medium.gif)

[Figure 3.](http://medrxiv.org/content/early/2022/10/13/2022.09.13.22279644/F3)

Figure 3. Colocalization of SSNS GWAS and eQTL datasets.
Each eQTL data set is labeled with colocalized loci to the left and enrichment estimates to the right. Genes with regional colocalization probability (RCP) > 0.2 in at least one tissue/cell are included. pSSNS GWAS loci that colocalized with tissue/cell-type eQTLs are indicated by black dots, with larger dots indicating higher RCP. GTEx tissues without associations are excluded from this figure (see Figure S7). Enrichment estimates, from fastENLOC, are based on genome-wide summary statistics from GWAS and include a shrinkage parameter, resulting in 0 enrichment for multiple tissues/cell-types. **logOR=2 ∼ OR=7.5, logOR=3 ∼ OR=20.1, logOR=4 ∼ OR=50.6**.

We then created a 95% credible set for all non-HLA significant loci and assessed their overlap with ATAC-seq derived open chromatin data from immune48 and kidney cells49,50 (**Table S7**). The SNPs with the highest posterior inclusion probability (PIP) for *AHI*, rs7759971 (PIP 0.52), overlapped with open chromatin of multiple immune cell types, including CD34+ cells, common lymphoid and myeloid progenitors, hematopoietic stem cells, and multipotent progenitors. The top PIP SNP for *CLEC16A, NPHS1, CD28, CALHM6*, and *TNFSF15* had no overlap with open chromatin. However, each locus had individual SNPs with lower PIPs that overlapped with both immune and kidney cell open chromatin.

We next fine-mapped the HLA signal (**Table S8**) We first imputed across the extended MHC region using a multi-population HLA imputation panel 51, resulting in 640 classical HLA alleles, 4,513 amino acids in HLA proteins and 49,321 SNPs in the extended MHC region for association. We used population-specific and multi-population SNP-level logistic regression, to identify specific SNPs and classical alleles associated with pSSNS (**Table S9, Supplement Note 1, Table S10, Figure S8**).

We next turned to discovering specific HLA amino acid positions most associated with risk of pSSNS through logistic regression analysis of all residues at each position. Amino acid position 47 in *HLA-DQA1* was the most strongly associated with pSSNS (*P*omnibus = 7.73 × 10−83) (**Table S11, Figure S9**). Arginine was the most frequent amino acid; a substitution to lysine conferred the greatest disease risk (*P* = 5.70 × 10−80; OR [95% CI] =3.62 [3.17 – 4.14]). A second association in near-perfect linkage disequilibrium was identified at *HLA-DQA1* position 52 (*P* = 1.14 × 10−82). Arginine was again the most common amino acid at this position, and a substitution to serine conferred the greatest protection from risk (*P* = 1.00 × 10−28; OR = 0.53 [0.47 – 0.59]). After conditioning, an independent association was discovered at *HLA-DQB1* position 26 (*P* = 3.22 × 10−13). A change from the most common amino acid leucine to glycine conferred the most significant protection (*P* = 4.75 × 10−12; OR = 0.64 [0.60 – 0.73]). A haplotype analysis identified the 47lysine-52histidine haplotype associated with greatest increased odds of pSSNS (**Figure 4a**). *HLA-DQA1* position 47 is located on the outside of the peptide-binding groove and acts as a regulator of binding stability, which, when altered, has been suggested to mediate the development of autoimmune disorders52. Arginine at *HLA-DQA1* position 52 has been associated with autoimmune disorders, including T1DM53.

![Figure 4.](http://medrxiv.org/https://www.medrxiv.org/content/medrxiv/early/2022/10/13/2022.09.13.22279644/F4.medium.gif)

[Figure 4.](http://medrxiv.org/content/early/2022/10/13/2022.09.13.22279644/F4)

Figure 4. *HLA-DQA1* amino acid risk associations.
**A)** Increased risk and predicted stability change of the two-amino acid residue hapltoypes at *HLA-DQA1* positions 47 and 52. Odds ratios are from a joint logistic regression with arginine47-serine52. The reference haplytype is the one conferring the strongest protection (i.e., odds ratio indicate increase in risk compared to arginine47-serine52). Increasingly negative “Predicted Stability Change” indicates increasingly decreased stability. **B)** Protein structure for the reference haplotype arginine47-serine52 (left, blue) and lysine47-histidine52 (right, red). The residues in lime color displays potential interacting amino acid with mutated amino acids. The color scheme for interaction is as follows: Cyan for Van der Waals, red for hydrogen bonds, green for hydrophobic bonds, sky blue for Carbonyl bonds and orange for polar bonds. When no bonds are displayed but the amino acids are shown they were predicted to form weak VDW bonds.

We then used DynaMut254 to model the impact on protein structural stability of two amino acid haplotypes (**Figure 4b**). This is quantified by Delta Delta G (ddG), where ddG < 0 predicts unstable structure. The haplotype consisting of lysine (47) and histidine (52) predicted the most instability (ddG = -3.64). Notably, the predicted increase in protein instability and increased odds of disease for each haplotype were concordant. This suggests that pSSNS-associated haplotypes increased the odds of disease by increasing the instability of *HLA-DQA1* and altering its ability to properly form a stable HLA-II molecule.

Finally, we generated a pSSNS PRS using summary statistics of 1,607 cases and 11,995 controls from European and East Asian populations. We first tested its association with demographic and clinical phenotypes in 233 European children with sufficient clinical data from the EU-European sub-cohort, after adjusting for genetic principal components. There was a significant negative association observed between PRS and age of disease onset (*P* = 1.49 × 10−4, **Figure 5A, Table S12**). We then assessed the PRS in an independent cohort of 165 children with proteinuric kidney disease enrolled in the Nephrotic Syndrome Study Network (NEPTUNE)55. 30%, 49%, and 21% of all participants had focal segmental glomerulosclerosis, minimal change disease, or no biopsy, respectively. Adjusting for sex, histology and genetic principal components, we found a significant association between PRS and age of onset (*P* = 0.003; **Figure 5B, Table S12, Figure S10**).

![Figure 5.](http://medrxiv.org/https://www.medrxiv.org/content/medrxiv/early/2022/10/13/2022.09.13.22279644/F5.medium.gif)

[Figure 5.](http://medrxiv.org/content/early/2022/10/13/2022.09.13.22279644/F5)

Figure 5. Association between pSSNS PRS quartiles and age of onset
in pediatric SSNS patients from **A)** EU-European and **B)** NEPTUNE pediatric cohorts.

A number of important discoveries emerged from this study. First, we tripled the number of pSSNS from four to twelve. Second, we found that while the immunological connections with the lead SNPs and closest genes in these newly discovered loci are well-established, most of them also have a *bona fide*, but overall less understood role, in kidney cells and diseases. Discovering upon which cells and organ systems each of these risk loci act will be an important future step. The availability of omics data from pediatric kidney tissue will be critical, as we hypothesize that the paucity of kidney eQTL signals we observed may be due to mapping pSSNS GWAS data to signatures primarily from adult tissues. Third, we identified monocytes and eosinophils dysfunction as a potential contributor to the pathogenesis of pSSNS. The mechanisms by which genetically driven changes in these cell types contribute to pSSNS onset is an important area of future inquiry. Fourth, we discovered specific amino acid changes in HLA-DQA1 and -DQB1 associated with pSSNS that should help will empower subsequent studies to illuminate pathomechanisms at the most significant risk locus for this disease. Finally, the association of higher PRS with younger age of onset suggests that a stronger genetic predisposition to disease lowers the threshold of an individual to develop pSSNS in the context of environmental factors and may ultimately help share clinical screening and care. In conclusion, these findings expand our knowledge of the genetic architecture of pSSNS and accelerate our understanding of its molecular underpinnings and clinical implications.

## METHODS

### GWAS data summary

#### GWAS data from NEPHROVIR/EU

Sample collection and genotype calling were done at Sorbonne Université in Paris. Pediatric steroid sensitive nephrotic syndrome was defined as proteinuria > 0.25g/mmol, serum albumin <25 g/L (<30 in France), full response within 4 weeks of 60 mg/m2/day of oral prednisone or prednisolone, and age of onset < 18 years old. 244 previously reported European patients from the NEPHROVIR study21 were combined with 159 newly recruited European patients recruited from France, Lithuania, Poland, Russia, Italy. Healthy adult controls (n=300) were recruited from Lyon, France, and combined with population-matched controls from The 1000 Genomes Project Phase 3 release (n=493)56. There were also 56 sub-Saharan African cases with 454 population-matched controls from The 1000 Genomes Project and 85 Maghrebian cases with 261 Moroccan population-matched controls. Both were reanalyzed from a previous report21. There were 160 Indian cases with 93 population-matched controls. Samples were genotyped on the Illumina Human OmniExpress or Illumina Omni 2.5 arrays.

#### GWAS data from Columbia University (US Cohorts)

Sample collection and genotype calling was done at Columbia University in New York. Cases were defined by local recruitment centers across the US, Europe, and Brazil as either minimal change disease or non-biopsied SSNS with age of disease onset <21. Five cohorts from Columbia University consisted of patients from European (ncases=371, ncontrols-=4359), East Asian (ncases=17, ncontrols=443), sub-Saharan African (ncases=65, ncontrols=7344), South Asian (ncases =39, ncontrols=534) and Admixed American (ncases =109, ncontrols=13266) populations. The genotyping of the cases was performed using multiple versions of MEGA (Multi-Ethnic Global Array) chips that includes MEGA 1.0, MEGA 1.1 and MEGAEX. The controls that were genotyped on MEGA1.0 were downloaded from NCBI dbGAP (IDAT files) from the PAGE consortium57. The differences between the chips were corrected first by mapping all the SNPs to a common cluster file in Genome Studio for individual MEGA platforms and further using Snpflip ([https://github.com/biocore-ntnu/snpflip](https://github.com/biocore-ntnu/snpflip)) software.

#### GWAS data from Kobe University

Pediatric steroid sensitive nephrotic syndrome cases were defined as urine protein to creatinine ratio ≥ 2.0, serum albumin ≤ 2.5 g/dl, and complete remission with 4-6 weeks after starting 6-mg/m12 oral prednisolone per day and age of onset < 18 years old. Three GWAS studies of SSNS in Japanese (ncases= 987, ncontrols =3206), Korean (ncases= 243, ncontrols=4041) and Thai (ncases=65, ncontrols=94) population were completed at The University of Tokyo, Japan. The Japanese and Korean GWAS data have been previously reported22,23. The Thai dataset was genotyped with the Axiom array.

### Dataset QC, imputation and GWAS

Quality control, imputation, and GWAS were conducted separately for each study location and population. GC lambda (GCθ) was used to assess inflation in all studies. The final case and control sizes and the number of variants tested can be found in Table S1 and Figure S1. Figure S11 shows matching of cases and controls in PCA plots. Manhattan plots and GC can be found in Figure S12 and genome-wide significant hits resulting from dataset GWAS are in table S13.

#### *GWAS data from NEPHROVIR/EU*: EU-European, EU-African, Maghrebian, Indian

Each file was quality controlled separately to remove related individuals (IBD > 0.1875), low call rate (genotype rate < 98%), and cases with discordant sex. SNPs were quality controlled for allele frequency (MAF < 0.01), call rate (genotype rate < 98%) in all cohorts, and Hardy Weinberg equilibrium (HWE P<1×10−5) in controls only. The EU-European datasets were generated in multiple files and were merged stepwise on the common subset of SNPs, with the previous QC procedure reapplied after each merge. PCA plots were constructed from PLINK v1.9 to identify population outliers and check for batch effects58. Pre-imputation QC was conducted using McCarthy Tools with the TOPMed reference panel to check strand alignment and allele assignment. Insertions and deletions were excluded prior to imputation. Each population was imputed separately and cases and controls were imputed together on the TOPMed Imputation Server with the TOPMed r2 reference panel59–61. The QC was repeated after imputation and SNPs with low imputation quality (rsq < 0.3) were excluded. After imputation, UCSC Liftover62 was used to convert SNP positions from each population dataset to build GRCh37 to match the build of summary statistics from other analyses. The association tests were completed using PLINK v1.9 under an additive model with principal component adjustment to account for population stratification.

#### GWAS data from Columbia University (US Cohorts)

US-European, US-African, US-South Asian, US-East Asian and Admixed American Population was assigned by KING63 kinship analysis software and based on continental population as defined by the 1000 Genomes Project for all cases and controls64,65. Within each continental population (EUR, AFR, AMR, SAS and EAS), we removed variants with genotype rate < 99%, MAF < 0.01 and HWE P<1×10−5. Each population was imputed separately with the TOPMed r2 panel59–61. After imputation, we removed first-degree relatives using KING, and variants with R2 < 0.8, MAF < 0.01 and HWE P<1×10−5. Principal components were calculated with FlashPCA66. For cohorts with large case/control imbalances (Admixed American and US-African), we used the SAIGE logistic model65 for calculating P-value and generating summary statistics. Association tests for European, South and East Asian were completed using PLINK v1.9 under an additive model with principal component adjustment to account for population stratification58.

#### *GWAS data from Kobe University*: Japanese, Korean and Thai

Quality control and analysis of the Japanese and Korean datasets are previously described in Jia et.al23. For the Thai analysis, SNPs were removed that had info score < 0.9, MAF < 0.005, call rate < 97%, or HWE P < 1×10−5. Individuals with missing rate > 3%, IBD > 0.1875 and PCA outliers were removed. Genotypes were imputed with The 1000 Genomes reference panel using SHAPEIT67 and IMPUTE268. Logistic regression was performed with Plink1.9 and p-values were adjusted for genomic control (GC).

### Population-specific and multi-population meta-analysis

For each population-specific meta-analysis and the multi-population one, we conducted an inverse-variance, fixed-effect meta-analysis using METAL with adjustment for population stratification (GC) on each input dataset and assessment for heterogeneity selected69. For within-population meta-analyses, we removed variants with heterogeneity P-value < 0.05. All significant associations were visually inspected and single SNPs that did not follow the expected LD trend were not taken forward for further consideration.

For the European meta-analysis, we included available summary statistics of suggestive SNPs from Dufek et.al24, increasing the European sample size to 1,096 cases and 12,459 controls. The resulting suggestive and genome-wide significant SNPs are annotated in tables and figures.

### Multi-population meta-regression with MR-MEGA

To account for and assess heterogeneous loci, we conducted a meta-regression using MR-MEGA27. We included three principal components, which captured the population structure across all 12 datasets. This allowed us to stratify heterogeneity into residual heterogeneity and heterogeneity that correlates with population. We visualized each dataset’s PCs from MR-MEGA with the dataset-specific log odds ratio from METAL for each variant with heterogeneity that correlated with population. We adjusted for genomic control at the study level and after meta-regression to account for population structure within and between datasets. SNPs present in less than five studies were excluded. GC lambda (GCθ) was used to assess inflation. Results tables include summaries from both METAL and MR-MEGA analyses (Table 1). All Manhattan plots were generated with the qqman R package [doi: 10.21105/joss.00731.] and LocusZoom web tool70. All significant loci are > 1Mb from each other with r2 < 0.1. Loci were labeled by nearest genes.

### Conditional analyses

To identify independent secondary signals at the candidate loci, we used GCTA COJO71,72 to conduct approximate conditional analyses based on cohort-specific meta-analysis summary statistics. Conditional analysis was conducted in each dataset, with an LD reference generated from the dataset samples, due to differences in linkage disequilibrium structure between continental populations. Each cohort was conditioned for the eight independent loci identified from the initial meta-analysis. Multi-population meta-analysis of the conditioned cohorts was repeated in METAL69 to assess multi-population genome-wide significant secondary loci after GCTA.

### Heritability estimates

SNP-based heritability was estimated on a liability scale with LD score regression (LDSC)73 using a population prevalence of 16/100,000 and excluding HLA [chr6:25,000,000-34,000,000]. We used non-GC corrected population-specific meta-analysis summary statistics from METAL and pre-computed LD scores generated from 1000 Genomes EUR or EAS samples ([https://alkesgroup.broadinstitute.org/LDSCORE/](https://alkesgroup.broadinstitute.org/LDSCORE/)).

### Colocalization of SSNS GWAS variants and eQTLs datasets

We used fast enrichment estimation aided colocalization analysis (fastENLOC)74 for colocalization analysis with glomerular (n=240) and tubulointerstitial (n=311) eQTLs from nephrotic syndrome patients55, GTEx tissues (varied sample sizes), and immune eQTLs from both Blueprint45 (n=200) and DICE 44 (n=91) databases. Posterior probabilities for SSNS GWAS variants were calculated from MR-MEGA Z-scores using TORUS75. We used an LD panel from European and East Asian 1000 Genomes samples to define haplotype blocks in the pSSNS meta-analysis56. Enrichment of pSSNS GWAS variants in each eQTL dataset was estimated using fastENLOC and subsequently informed prior probabilities for each analysis. For colocalization with our kidney eQTLs, which had available raw data, we could identify multiple eQTL signals per gene and multiple colocalized signals at each locus. For all other data, in which only summary statistics were available, we assumed at most one colocalized signal at each locus and did not account for LD.

### Open chromatin annotation of credible sets

95% credible sets were constructed for each independent locus identified from the multi-population meta-regression from Bayes’ factors reported by MR-MEGA. Posterior inclusion probability (PIP) was estimated by dividing each Bayes’ factor by the summation of Bayes’ factors across all variants within a +/- 1Mb from the lead locus76.

SNPs within 95% credible sets of our genome-wide significant loci were evaluated for positional overlap based on the boundaries of known open chromatin peaks in kidney49 and immune77 cell types. Open chromatin peaks were identified by MACS2 peak calling algorithm and optimized by gkmQC50.

### HLA imputation and analysis

To fine-map the HLA region, we conducted HLA imputation with the four-digit multi-ethnic v2 reference panel on Michigan Imputation Server51. Cohorts were imputed individually to optimize population-specific structure within the HLA region. The imputed cohorts were then merged for multi-population associations. We used HLA-TAPAS ‘assoc’ module to conduct a logistic regression of the HLA region of the multi-population and population-specific datasets. For population-specific analyses, we adjusted for genotype-based principal components from Plinkv1.958. The population-specific principal components and continental populations were included as covariates in the multi-population analysis. HLA-TAPAS was also used to conduct a stepwise conditional analysis, conditioning on the locus with the smallest association P-value. We additionally performed an omnibus test on the population-specific and multi-population cohorts to assess significance by amino acid position.

### HLA modeling

To predict the reference (with arginine at position 47 and serine at position 52) structure of *HLA-DQA1* we extracted the sequence of *HLA-DQA1* from UNIPROT database (Uniprot ID: P01909). We used NCBI BLASTp against PDB database to find the closest structure associated with the amino acid sequence of P01909. We identified the top hit as 6PX6_A (HLA-TCR complex, E = 2e-161) for the *HLA-DQA1* sequence78. We extracted the PDB coordinates for chain A from the 6PX6 and visualized in PYMOL v2.5. Since the most common amino acid haplotype in the control population was arginine (47) and serine (52), we performed mutagenesis using PYMOL to model the reference protein 3-D structure79.

In brief, we used the mutagenesis tool from PYMOL and selected the rotamer (most likely amino acid conformation) for arginine and serine which showed the minimum number of clashes with nearby atoms. Afterwards, we adjusted the conformation of nearby atoms (within 5 Angstrom) to minimum free state using “Clean” command in PyMOL which uses MMFF94 force field80. Though point mutations locally affect the conformation of the protein, they can result in torsion, bending and stretching of the entire molecule. Therefore, we exported the protein structure to SPDBV software for further refinement81.

We first fixed all the side chains of all amino acids to the best rotamer conformation using the simulated annealing method. Afterwards, we performed energy minimization using GROMOS 96 force field to extract the 3D coordinates that represent the lowest minimum energy conformation82. The refined protein structure of *HLA-DQA1* was then assessed for changes in stability of protein for both amino acid combinations for each haplotype using “MULTIPLE MUTATION” in DynMut2 server54. The instability of HLA-DQA1 was evaluated using the predicted ddG parameter which measures changes in Gibbs free energy between the folded and unfolded states and the change in folding when a mutation is present. The interaction among amino acids in reference and mutated structure were predicted using Arpeggio83 and visualized in PyMOL.

### Polygenic risk score analysis

#### Construction of the polygenic risk score (PRS)

To investigate genetic risk across the genome, we generated a polygenic risk score (PRS) from the GWAS of European (US-European) and East Asian (Japanese, Korean, US-East Asian) populations using PRS-CSx84. Population-specific weights estimated by PRS-CSx were used as input, along with a testing/training dataset, for optimization of the gamma-gamma priors a and b and the global shrinkage parameter used in the PRS-CSx model. Our main objective was to test clinical associations with PRS within case cohorts. We used case/control data and prediction accuracy to choose the best PRS model parameters. We fine-tuned and estimated PRS population weights in the EU-European and NEPTUNE separately. Since NEPTUNE is a case-only cohort, we included population--matched controls from the 1000 Genomes Project. In each dataset, 80% of cases and controls were randomly selected for training and 20% for testing. We varied the PRS-CSx hyper-parameters and chose the combination that gave the best prediction accuracy. The prediction accuracy with an area under the receiver operating characteristic curve was 0.74 (95%CI: 0.72-0.77) in EU-European dataset and 0.64 (0.61-0.68) in NEPTUNE.

#### Clinical associations with PRS

The PRS was applied to pediatric participants from NEPTUNE55 (n=165), and the EU-European data for which clinical data were available (n=239). For bivariate tests, we used the Wilcoxon test for binary traits, Kruskal-Wallis for categorical traits and univariate linear regression for continuous traits. Population in NEPTUNE was predicted using Peddy85 and the 1000 Genomes reference panel. For associations with the age of onset, we conducted linear regression adjusting for sex, four genetic principal components, and histology (NEPTUNE only).

## Supporting information

Supplement [[supplements/279644_file04.pdf]](pending:yes)

## Data Availability

All data produced in this present study are available upon reasonable request to the authors and will be subsequently uploaded to a data-sharing portal.

## Footnotes

*   * Co-senior authors

*   Author affiliations updated. Figure 2B and 2C legends revised.

*   Received September 13, 2022.
*   Revision received October 13, 2022.
*   Accepted October 13, 2022.


*   © 2022, Posted by Cold Spring Harbor Laboratory

This pre-print is available under a Creative Commons License (Attribution-NoDerivs 4.0 International), CC BY-ND 4.0, as described at [http://creativecommons.org/licenses/by-nd/4.0/](http://creativecommons.org/licenses/by-nd/4.0/)

## REFERENCES

1.  1.Noone, D. G., Iijima, K. & Parekh, R. Idiopathic nephrotic syndrome in children. The Lancet 392, 61–74 (2018).
    
    
2.  2.Gipson, D. S. et al. Gaining the PROMIS perspective from children with nephrotic syndrome: a Midwest pediatric nephrology consortium study. Health Qual. Life Outcomes 11, 30 (2013).
    
    [CrossRef](http://medrxiv.org/lookup/external-ref?access_num=10.1186/1477-7525-11-30&link_type=DOI) 
    
    [PubMed](http://medrxiv.org/lookup/external-ref?access_num=23510630&link_type=MED&atom=%2Fmedrxiv%2Fearly%2F2022%2F10%2F13%2F2022.09.13.22279644.atom) 

3.  3.Ruth, E. M., Landolt, M. A., Neuhaus, T. J. & Kemper, M. J. Health-related quality of life and psychosocial adjustment in steroid-sensitive nephrotic syndrome. J. Pediatr. 145, 778–83 (2004).
    
    [CrossRef](http://medrxiv.org/lookup/external-ref?access_num=10.1016/j.jpeds.2004.08.022&link_type=DOI) 
    
    [PubMed](http://medrxiv.org/lookup/external-ref?access_num=15580200&link_type=MED&atom=%2Fmedrxiv%2Fearly%2F2022%2F10%2F13%2F2022.09.13.22279644.atom) 
    
    [Web of Science](http://medrxiv.org/lookup/external-ref?access_num=000225702300014&link_type=ISI) 

4.  4.Kerlin, B. A. et al. Epidemiology and risk factors for thromboembolic complications of childhood nephrotic syndrome: a Midwest Pediatric Nephrology Consortium (MWPNC) study. J. Pediatr. 155, 105–110, 110.e1 (2009).
    
    [CrossRef](http://medrxiv.org/lookup/external-ref?access_num=10.1016/j.jpeds.2009.01.070&link_type=DOI) 
    
    [PubMed](http://medrxiv.org/lookup/external-ref?access_num=19394032&link_type=MED&atom=%2Fmedrxiv%2Fearly%2F2022%2F10%2F13%2F2022.09.13.22279644.atom) 
    
    [Web of Science](http://medrxiv.org/lookup/external-ref?access_num=000267672600028&link_type=ISI) 

5.  5.Hingorani, S. R., Weiss, N. S. & Watkins, S. L. Predictors of peritonitis in children with nephrotic syndrome. Pediatr. Nephrol. Berl. Ger. 17, 678–682 (2002).
    
    
6.  6.Rheault, M. N. et al. AKI in Children Hospitalized with Nephrotic Syndrome. Clin. J. Am. Soc.Nephrol. CJASN 10, 2110–2118 (2015).
    
    
7.  7.Sato, M. et al. Prognosis and acute complications at the first onset of idiopathic nephrotic syndrome in children: a nationwide survey in Japan (JP-SHINE study). Nephrol. Dial. Transplant. Off. Publ. Eur. Dial. Transpl. Assoc. - Eur. Ren. Assoc. 36, 475–481 (2021).
    
    
8.  8.Ding, W. Y. et al. Initial steroid sensitivity in children with steroid-resistant nephrotic syndrome predicts post-transplant recurrence. J. Am. Soc. Nephrol. JASN 25, 1342–1348 (2014).
    
    
9.  9.Korsgaard, T., Andersen, R. F., Joshi, S., Hagstrøm, S. & Rittig, S. Childhood onset steroid-sensitive nephrotic syndrome continues into adulthood. Pediatr. Nephrol. Berl. Ger. 34, 641–648 (2019).
    
    
10. 10.Ishikura, K. et al. Morbidity in children with frequently relapsing nephrosis: 10-year follow-up of a randomized controlled trial. Pediatr. Nephrol. Berl. Ger. 30, 459–468 (2015).
    
    
11. 11.Fakhouri, F. et al. Steroid-sensitive nephrotic syndrome: from childhood to adulthood. Am. J. Kidney Dis. Off. J. Natl. Kidney Found. 41, 550–557 (2003).
    
    
12. 12.Kyrieleis, H. A. C. et al. Long-term outcome of biopsy-proven, frequently relapsing minimal-change nephrotic syndrome in children. Clin. J. Am. Soc. Nephrol. CJASN 4, 1593–1600 (2009).
    
    
13. 13.Skrzypczyk, P. et al. Long-term outcomes in idiopathic nephrotic syndrome: from childhood to adulthood. Clin. Nephrol. 81, 166–173 (2014).
    
    
14. 14.Trompeter, R. S., Lloyd, B. W., Hicks, J., White, R. H. & Cameron, J. S. Long-term outcome for children with minimal-change nephrotic syndrome. Lancet Lond. Engl. 1, 368–370 (1985).
    
    
15. 15.Aydin, M. et al. The long-term outcome of childhood nephrotic syndrome in Germany: a cross-sectional study. Clin. Exp. Nephrol. 23, 676–688 (2019).
    
    
16. 16.Lee, J. M., Kronbichler, A., Shin, J. I. & Oh, J. Review on long-term non-renal complications of childhood nephrotic syndrome. Acta Paediatr. Oslo Nor. 1992 109, 460–470 (2020).
    
    
17. 17.Hjorten, R., Anwar, Z. & Reidy, K. J. Long-term Outcomes of Childhood Onset Nephrotic Syndrome. Front. Pediatr. 4, 53 (2016).
    
    
18. 18.Shalhoub, R. J. Pathogenesis of lipoid nephrosis: a disorder of T-cell function. Lancet Lond. Engl. 2, 556–560 (1974).
    
    
19. 19.Iijima, K., Sako, M., Kamei, K. & Nozu, K. Rituximab in steroid-sensitive nephrotic syndrome: lessons from clinical trials. Pediatr. Nephrol. Berl. Ger. 33, 1449–1455 (2018).
    
    
20. 20.Gbadegesin, R. A. et al. HLA-DQA1 and PLCG2 Are Candidate Risk Loci for Childhood-Onset Steroid-Sensitive Nephrotic Syndrome. J. Am. Soc. Nephrol. JASN 26, 1701–1710 (2015).
    
    
21. 21.Debiec, H. et al. Transethnic, Genome-Wide Analysis Reveals Immune-Related Risk Alleles and Phenotypic Correlates in Pediatric Steroid-Sensitive Nephrotic Syndrome. J. Am. Soc. Nephrol. JASN 29, 2000–2013 (2018).
    
    
22. 22.Jia, X. et al. Strong Association of the HLA-DR/DQ Locus with Childhood Steroid-Sensitive Nephrotic Syndrome in the Japanese Population. J. Am. Soc. Nephrol. JASN 29, 2189–2199 (2018).
    
    
23. 23.Jia, X. et al. Common risk variants in NPHS1 and TNFSF15 are associated with childhood steroid-sensitive nephrotic syndrome. Kidney Int. 98, 1308–1322 (2020).
    
    
24. 24.Dufek, S. et al. Genetic Identification of Two Novel Loci Associated with Steroid-Sensitive Nephrotic Syndrome. J. Am. Soc. Nephrol. JASN 30, 1375–1384 (2019).
    
    
25. 25.Julia, A. et al. A genome-wide association study identifies a novel locus at 6q22.1 associated with ulcerative colitis. Hum. Mol. Genet. 23, 6927–34 (2014).
    
    [CrossRef](http://medrxiv.org/lookup/external-ref?access_num=10.1093/hmg/ddu398&link_type=DOI) 
    
    [PubMed](http://medrxiv.org/lookup/external-ref?access_num=25082827&link_type=MED&atom=%2Fmedrxiv%2Fearly%2F2022%2F10%2F13%2F2022.09.13.22279644.atom) 
    
    [Web of Science](http://medrxiv.org/lookup/external-ref?access_num=000347923000020&link_type=ISI) 

26. 26.Ovunc, B. et al. Mutation analysis of NPHS1 in a worldwide cohort of congenital nephrotic syndrome patients. Nephron Clin. Pract. 120, c139–146 (2012).
    
    [CrossRef](http://medrxiv.org/lookup/external-ref?access_num=10.1159/000337379&link_type=DOI) 
    
    [PubMed](http://medrxiv.org/lookup/external-ref?access_num=22584503&link_type=MED&atom=%2Fmedrxiv%2Fearly%2F2022%2F10%2F13%2F2022.09.13.22279644.atom) 

27. 27.Mägi, R. et al. Trans-ethnic meta-regression of genome-wide association studies accounting for ancestry increases power for discovery and improves fine-mapping resolution. Hum. Mol. Genet. 26, 3639–3650 (2017).
    
    [CrossRef](http://medrxiv.org/lookup/external-ref?access_num=10.1093/hmg/ddx280&link_type=DOI) 

28. 28.Ghoussaini, M. et al. Open Targets Genetics: systematic identification of trait-associated genes using large-scale genetics and functional genomics. Nucleic Acids Res. 49, D1311–D1320 (2021).
    
    [PubMed](http://medrxiv.org/lookup/external-ref?access_num=http://www.n&link_type=MED&atom=%2Fmedrxiv%2Fearly%2F2022%2F10%2F13%2F2022.09.13.22279644.atom) 

29. 29.Ferland, R. J. et al. Abnormal cerebellar development and axonal decussation due to mutations in AHI1 in Joubert syndrome. Nat. Genet. 36, 1008–1013 (2004).
    
    [CrossRef](http://medrxiv.org/lookup/external-ref?access_num=10.1038/ng1419&link_type=DOI) 
    
    [PubMed](http://medrxiv.org/lookup/external-ref?access_num=15322546&link_type=MED&atom=%2Fmedrxiv%2Fearly%2F2022%2F10%2F13%2F2022.09.13.22279644.atom) 
    
    [Web of Science](http://medrxiv.org/lookup/external-ref?access_num=000223658100030&link_type=ISI) 

30. 30.Kukimoto-Niino, M. et al. Cryo-EM structure of the human ELMO1-DOCK5-Rac1 complex. Sci. Adv. 7, eabg3147 (2021).
    
    [FREE Full Text](http://medrxiv.org/lookup/ijlink/YTozOntzOjQ6InBhdGgiO3M6MTQ6Ii9sb29rdXAvaWpsaW5rIjtzOjU6InF1ZXJ5IjthOjQ6e3M6ODoibGlua1R5cGUiO3M6MzoiUERGIjtzOjExOiJqb3VybmFsQ29kZSI7czo4OiJhZHZhbmNlcyI7czo1OiJyZXNpZCI7czoxMzoiNy8zMC9lYWJnMzE0NyI7czo0OiJhdG9tIjtzOjUwOiIvbWVkcnhpdi9lYXJseS8yMDIyLzEwLzEzLzIwMjIuMDkuMTMuMjIyNzk2NDQuYXRvbSI7fXM6ODoiZnJhZ21lbnQiO3M6MDoiIjt9) 

31. 31.Sharma, K. R. et al. ELMO1 protects renal structure and ultrafiltration in kidney development and under diabetic conditions. Sci. Rep. 6, 37172 (2016).
    
    
32. 32.Shimazaki, A. et al. Genetic variations in the gene encoding ELMO1 are associated with susceptibility to diabetic nephropathy. Diabetes 54, 1171–1178 (2005).
    
    [Abstract/FREE Full Text](http://medrxiv.org/lookup/ijlink/YTozOntzOjQ6InBhdGgiO3M6MTQ6Ii9sb29rdXAvaWpsaW5rIjtzOjU6InF1ZXJ5IjthOjQ6e3M6ODoibGlua1R5cGUiO3M6NDoiQUJTVCI7czoxMToiam91cm5hbENvZGUiO3M6ODoiZGlhYmV0ZXMiO3M6NToicmVzaWQiO3M6OToiNTQvNC8xMTcxIjtzOjQ6ImF0b20iO3M6NTA6Ii9tZWRyeGl2L2Vhcmx5LzIwMjIvMTAvMTMvMjAyMi4wOS4xMy4yMjI3OTY0NC5hdG9tIjt9czo4OiJmcmFnbWVudCI7czowOiIiO30=) 

33. 33.Yu, C.-C. et al. Abatacept in B7-1–Positive Proteinuric Kidney Disease. N. Engl. J. Med. 369, 2416–2423 (2013).
    
    [CrossRef](http://medrxiv.org/lookup/external-ref?access_num=10.1056/NEJMoa1304572&link_type=DOI) 
    
    [PubMed](http://medrxiv.org/lookup/external-ref?access_num=24206430&link_type=MED&atom=%2Fmedrxiv%2Fearly%2F2022%2F10%2F13%2F2022.09.13.22279644.atom) 
    
    [Web of Science](http://medrxiv.org/lookup/external-ref?access_num=000328586000009&link_type=ISI) 

34. 34.Spada, R. et al. NKG2D ligand overexpression in lupus nephritis correlates with increased NK cell activity and differentiation in kidneys but not in the periphery. J. Leukoc. Biol. 97, 583–598 (2015).
    
    [CrossRef](http://medrxiv.org/lookup/external-ref?access_num=10.1189/jlb.4A0714-326R&link_type=DOI) 
    
    [PubMed](http://medrxiv.org/lookup/external-ref?access_num=25583577&link_type=MED&atom=%2Fmedrxiv%2Fearly%2F2022%2F10%2F13%2F2022.09.13.22279644.atom) 

35. 35.Rayego-Mateos, S. et al. Role of Epidermal Growth Factor Receptor (EGFR) and Its Ligands in Kidney Inflammation and Damage. Mediators Inflamm. 2018, 8739473 (2018).
    
    [CrossRef](http://medrxiv.org/lookup/external-ref?access_num=10.1155/2018/8739473&link_type=DOI) 

36. 36.Ju, W. et al. Tissue transcriptome-driven identification of epidermal growth factor as a chronic kidney disease biomarker. Sci Transl Med 7, 316ra193 (2015).
    
    [Abstract/FREE Full Text](http://medrxiv.org/lookup/ijlink/YTozOntzOjQ6InBhdGgiO3M6MTQ6Ii9sb29rdXAvaWpsaW5rIjtzOjU6InF1ZXJ5IjthOjQ6e3M6ODoibGlua1R5cGUiO3M6NDoiQUJTVCI7czoxMToiam91cm5hbENvZGUiO3M6MTE6InNjaXRyYW5zbWVkIjtzOjU6InJlc2lkIjtzOjE0OiI3LzMxNi8zMTZyYTE5MyI7czo0OiJhdG9tIjtzOjUwOiIvbWVkcnhpdi9lYXJseS8yMDIyLzEwLzEzLzIwMjIuMDkuMTMuMjIyNzk2NDQuYXRvbSI7fXM6ODoiZnJhZ21lbnQiO3M6MDoiIjt9) 

37. 37.Rijvers, L. et al. The Role of Autoimmunity-Related Gene CLEC16A in the B Cell Receptor-Mediated HLA Class II Pathway. J. Immunol. Baltim. Md 1950 205, 945–956 (2020).
    
    
38. 38.Tam, R. C. Y. et al. Human CLEC16A regulates autophagy through modulating mTOR activity. Exp. Cell Res. 352, 304–312 (2017).
    
    
39. 39.Pearson, G. et al. Clec16a, Nrdp1, and USP8 Form a Ubiquitin-Dependent Tripartite Complex That Regulates β-Cell Mitophagy. Diabetes 67, 265–277 (2018).
    
    [Abstract/FREE Full Text](http://medrxiv.org/lookup/ijlink/YTozOntzOjQ6InBhdGgiO3M6MTQ6Ii9sb29rdXAvaWpsaW5rIjtzOjU6InF1ZXJ5IjthOjQ6e3M6ODoibGlua1R5cGUiO3M6NDoiQUJTVCI7czoxMToiam91cm5hbENvZGUiO3M6ODoiZGlhYmV0ZXMiO3M6NToicmVzaWQiO3M6ODoiNjcvMi8yNjUiO3M6NDoiYXRvbSI7czo1MDoiL21lZHJ4aXYvZWFybHkvMjAyMi8xMC8xMy8yMDIyLjA5LjEzLjIyMjc5NjQ0LmF0b20iO31zOjg6ImZyYWdtZW50IjtzOjA6IiI7fQ==) 

40. 40.Steimle, V., Siegrist, C. A., Mottet, A., Lisowska-Grospierre, B. & Mach, B. Regulation of MHC class II expression by interferon-gamma mediated by the transactivator gene CIITA. Science 265, 106–109 (1994).
    
    [Abstract/FREE Full Text](http://medrxiv.org/lookup/ijlink/YTozOntzOjQ6InBhdGgiO3M6MTQ6Ii9sb29rdXAvaWpsaW5rIjtzOjU6InF1ZXJ5IjthOjQ6e3M6ODoibGlua1R5cGUiO3M6NDoiQUJTVCI7czoxMToiam91cm5hbENvZGUiO3M6Mzoic2NpIjtzOjU6InJlc2lkIjtzOjEyOiIyNjUvNTE2OC8xMDYiO3M6NDoiYXRvbSI7czo1MDoiL21lZHJ4aXYvZWFybHkvMjAyMi8xMC8xMy8yMDIyLjA5LjEzLjIyMjc5NjQ0LmF0b20iO31zOjg6ImZyYWdtZW50IjtzOjA6IiI7fQ==) 

41. 41.Edgar, A. J., Birks, E. J., Yacoub, M. H. & Polak, J. M. Cloning of dexamethasone-induced transcript: a novel glucocorticoid-induced gene that is upregulated in emphysema. Am. J. Respir. Cell Mol. Biol. 25, 119–124 (2001).
    
    [CrossRef](http://medrxiv.org/lookup/external-ref?access_num=10.1165/ajrcmb.25.1.4417&link_type=DOI) 
    
    [PubMed](http://medrxiv.org/lookup/external-ref?access_num=11472984&link_type=MED&atom=%2Fmedrxiv%2Fearly%2F2022%2F10%2F13%2F2022.09.13.22279644.atom) 
    
    [Web of Science](http://medrxiv.org/lookup/external-ref?access_num=000170221800019&link_type=ISI) 

42. 42.Han, S. K. et al. Mapping genomic regulation of kidney disease and traits through high-resolution and interpretable eQTLs. 2022.06.01.494352 Preprint at [https://doi.org/10.1101/2022.06.01.494352](https://doi.org/10.1101/2022.06.01.494352) (2022).
    
    
43. 43. GTEx Consortium et al. Using an atlas of gene regulation across 44 human tissues to inform complex disease- and trait-associated variation. Nat. Genet. 50, 956–967 (2018).
    
    [CrossRef](http://medrxiv.org/lookup/external-ref?access_num=10.1038/s41588-018-0154-4&link_type=DOI) 
    
    [PubMed](http://medrxiv.org/lookup/external-ref?access_num=29955180&link_type=MED&atom=%2Fmedrxiv%2Fearly%2F2022%2F10%2F13%2F2022.09.13.22279644.atom) 

44. 44.Schmiedel, B. J. et al. Impact of Genetic Polymorphisms on Human Immune Cell Gene Expression. Cell 175, 1701-1715.e16 (2018).
    
    [CrossRef](http://medrxiv.org/lookup/external-ref?access_num=10.1016/j.cell.2018.10.022&link_type=DOI) 
    
    [PubMed](http://medrxiv.org/lookup/external-ref?access_num=30449622&link_type=MED&atom=%2Fmedrxiv%2Fearly%2F2022%2F10%2F13%2F2022.09.13.22279644.atom) 

45. 45.Fernández, J. M. et al. The BLUEPRINT Data Analysis Portal. Cell Syst. 3, 491-495.e5 (2016).
    
    
46. 46.Li, X. et al. Genetic analyses identify GSDMB associated with asthma severity, exacerbations, and antiviral pathways. J. Allergy Clin. Immunol. 147, 894–909 (2021).
    
    
47. 47.1.  Alt, F. W.
    
    Das, S., Miller, M. & Broide, D. H. Chapter One - Chromosome 17q21 Genes ORMDL3 and GSDMB in Asthma and Immune Diseases. in Advances in Immunology (ed. Alt, F. W.) vol. 135 1– 52 (Academic Press, 2017).
    
    
48. 48.Ecker, S. et al. Genome-wide analysis of differential transcriptional and epigenetic variability across human immune cell types. Genome Biol. 18, 18 (2017).
    
    [CrossRef](http://medrxiv.org/lookup/external-ref?access_num=10.1186/s13059-017-1156-8&link_type=DOI) 

49. 49.Muto, Y. et al. Single cell transcriptional and chromatin accessibility profiling redefine cellular heterogeneity in the adult human kidney. Nat. Commun. 12, 2190 (2021).
    
    [CrossRef](http://medrxiv.org/lookup/external-ref?access_num=10.1038/s41467-021-22368-w&link_type=DOI) 
    
    [PubMed](http://medrxiv.org/lookup/external-ref?access_num=33850129&link_type=MED&atom=%2Fmedrxiv%2Fearly%2F2022%2F10%2F13%2F2022.09.13.22279644.atom) 

50. 50.Han, S. K. et al. Quality assessment and refinement of chromatin accessibility data using a sequence-based predictive model. 2022.02.24.481844 Preprint at [https://doi.org/10.1101/2022.02.24.481844](https://doi.org/10.1101/2022.02.24.481844) (2022).
    
    
51. 51.Luo, Y. et al. A high-resolution HLA reference panel capturing global population diversity enables multi-ethnic fine-mapping in HIV host response. [http://medrxiv.org/lookup/doi/10.1101/2020.07.16.20155606](http://medrxiv.org/lookup/doi/10.1101/2020.07.16.20155606) (2020) doi:10.1101/2020.07.16.20155606.
    
    [Abstract/FREE Full Text](http://medrxiv.org/lookup/ijlink/YTozOntzOjQ6InBhdGgiO3M6MTQ6Ii9sb29rdXAvaWpsaW5rIjtzOjU6InF1ZXJ5IjthOjQ6e3M6ODoibGlua1R5cGUiO3M6NDoiQUJTVCI7czoxMToiam91cm5hbENvZGUiO3M6NzoibWVkcnhpdiI7czo1OiJyZXNpZCI7czoyMToiMjAyMC4wNy4xNi4yMDE1NTYwNnYxIjtzOjQ6ImF0b20iO3M6NTA6Ii9tZWRyeGl2L2Vhcmx5LzIwMjIvMTAvMTMvMjAyMi4wOS4xMy4yMjI3OTY0NC5hdG9tIjt9czo4OiJmcmFnbWVudCI7czowOiIiO30=) 

52. 52.Frommer, L., Flesch, B. K., König, J. & Kahaly, G. J. Amino Acid Polymorphisms in Hla Class II Differentiate Between Thyroid and Polyglandular Autoimmunity. J. Clin. Endocrinol. Metab. 105, dgz164 (2020).
    
    
53. 53.Badenhoop, K. et al. Susceptibility and resistance alleles of human leukocyte antigen (HLA) DQA1 and HLA DQB1 are shared in endocrine autoimmune disease. J. Clin. Endocrinol. Metab. 80, 2112–2117 (1995).
    
    [CrossRef](http://medrxiv.org/lookup/external-ref?access_num=10.1210/jc.80.7.2112&link_type=DOI) 
    
    [PubMed](http://medrxiv.org/lookup/external-ref?access_num=7608264&link_type=MED&atom=%2Fmedrxiv%2Fearly%2F2022%2F10%2F13%2F2022.09.13.22279644.atom) 
    
    [Web of Science](http://medrxiv.org/lookup/external-ref?access_num=A1995RP49200023&link_type=ISI) 

54. 54.Rodrigues, C. H. M., Pires, D. E. V. & Ascher, D. B. DynaMut2: Assessing changes in stability and flexibility upon single and multiple point missense mutations. Protein Sci. Publ. Protein Soc. 30, 60–69 (2021).
    
    
55. 55.Gadegbeku, C. A. et al. Design of the Nephrotic Syndrome Study Network (NEPTUNE) to evaluate primary glomerular nephropathy by a multidisciplinary approach. Kidney Int. 83, 749–756 (2013).
    
    [CrossRef](http://medrxiv.org/lookup/external-ref?access_num=10.1038/ki.2012.428&link_type=DOI) 
    
    [PubMed](http://medrxiv.org/lookup/external-ref?access_num=23325076&link_type=MED&atom=%2Fmedrxiv%2Fearly%2F2022%2F10%2F13%2F2022.09.13.22279644.atom) 
    
    [Web of Science](http://medrxiv.org/lookup/external-ref?access_num=000316945400027&link_type=ISI) 

56. 56.1000 Genomes Project Consortium et al. A global reference for human genetic variation. Nature 526, 68–74 (2015).
    
    [CrossRef](http://medrxiv.org/lookup/external-ref?access_num=10.1038/nature15393&link_type=DOI) 
    
    [PubMed](http://medrxiv.org/lookup/external-ref?access_num=26432245&link_type=MED&atom=%2Fmedrxiv%2Fearly%2F2022%2F10%2F13%2F2022.09.13.22279644.atom) 

57. 57.Wojcik, G. L. et al. Genetic analyses of diverse populations improves discovery for complex traits. Nature 570, 514–518 (2019).
    
    [CrossRef](http://medrxiv.org/lookup/external-ref?access_num=10.1038/s41586-019-1310-4&link_type=DOI) 
    
    [PubMed](http://medrxiv.org/lookup/external-ref?access_num=http://www.n&link_type=MED&atom=%2Fmedrxiv%2Fearly%2F2022%2F10%2F13%2F2022.09.13.22279644.atom) 

58. 58.Purcell, S. et al. PLINK: A Tool Set for Whole-Genome Association and Population-Based Linkage Analyses. Am. J. Hum. Genet. 81, 559–575 (2007).
    
    [CrossRef](http://medrxiv.org/lookup/external-ref?access_num=10.1086/519795&link_type=DOI) 
    
    [PubMed](http://medrxiv.org/lookup/external-ref?access_num=17701901&link_type=MED&atom=%2Fmedrxiv%2Fearly%2F2022%2F10%2F13%2F2022.09.13.22279644.atom) 

59. 59.Taliun, D. et al. Sequencing of 53,831 diverse genomes from the NHLBI TOPMed Program. bioRxiv 563866 (2019) doi:10.1101/563866.
    
    [Abstract/FREE Full Text](http://medrxiv.org/lookup/ijlink/YTozOntzOjQ6InBhdGgiO3M6MTQ6Ii9sb29rdXAvaWpsaW5rIjtzOjU6InF1ZXJ5IjthOjQ6e3M6ODoibGlua1R5cGUiO3M6NDoiQUJTVCI7czoxMToiam91cm5hbENvZGUiO3M6NzoiYmlvcnhpdiI7czo1OiJyZXNpZCI7czo4OiI1NjM4NjZ2MSI7czo0OiJhdG9tIjtzOjUwOiIvbWVkcnhpdi9lYXJseS8yMDIyLzEwLzEzLzIwMjIuMDkuMTMuMjIyNzk2NDQuYXRvbSI7fXM6ODoiZnJhZ21lbnQiO3M6MDoiIjt9) 

60. 60.Das, S. et al. Next-generation genotype imputation service and methods. Nat. Genet. 48, 1284– 1287 (2016).
    
    [CrossRef](http://medrxiv.org/lookup/external-ref?access_num=10.1038/ng.3656&link_type=DOI) 
    
    [PubMed](http://medrxiv.org/lookup/external-ref?access_num=27571263&link_type=MED&atom=%2Fmedrxiv%2Fearly%2F2022%2F10%2F13%2F2022.09.13.22279644.atom) 

61. 61.Fuchsberger, C., Abecasis, G. R. & Hinds, D. A. minimac2: faster genotype imputation. Bioinformatics 31, 782–784 (2015).
    
    [CrossRef](http://medrxiv.org/lookup/external-ref?access_num=10.1093/bioinformatics/btu704&link_type=DOI) 
    
    [PubMed](http://medrxiv.org/lookup/external-ref?access_num=25338720&link_type=MED&atom=%2Fmedrxiv%2Fearly%2F2022%2F10%2F13%2F2022.09.13.22279644.atom) 

62. 62.Hinrichs, A. S. et al. The UCSC Genome Browser Database: update 2006. Nucleic Acids Res. 34, D590–D598 (2006).
    
    [CrossRef](http://medrxiv.org/lookup/external-ref?access_num=10.1093/nar/gkj144&link_type=DOI) 
    
    [PubMed](http://medrxiv.org/lookup/external-ref?access_num=16381938&link_type=MED&atom=%2Fmedrxiv%2Fearly%2F2022%2F10%2F13%2F2022.09.13.22279644.atom) 
    
    [Web of Science](http://medrxiv.org/lookup/external-ref?access_num=000239307700126&link_type=ISI) 

63. 63.Manichaikul, A. et al. Robust relationship inference in genome-wide association studies. Bioinformatics 26, 2867–2873 (2010).
    
    [CrossRef](http://medrxiv.org/lookup/external-ref?access_num=10.1093/bioinformatics/btq559&link_type=DOI) 
    
    [PubMed](http://medrxiv.org/lookup/external-ref?access_num=20926424&link_type=MED&atom=%2Fmedrxiv%2Fearly%2F2022%2F10%2F13%2F2022.09.13.22279644.atom) 
    
    [Web of Science](http://medrxiv.org/lookup/external-ref?access_num=000283919800010&link_type=ISI) 

64. 64.Clarke, L. et al. The international Genome sample resource (IGSR): A worldwide collection of genome variation incorporating the 1000 Genomes Project data. Nucleic Acids Res. 45, D854– D859 (2017).
    
    [CrossRef](http://medrxiv.org/lookup/external-ref?access_num=10.1093/nar/gkw829&link_type=DOI) 
    
    [PubMed](http://medrxiv.org/lookup/external-ref?access_num=27638885&link_type=MED&atom=%2Fmedrxiv%2Fearly%2F2022%2F10%2F13%2F2022.09.13.22279644.atom) 

65. 65.Zhou, W. et al. Efficiently controlling for case-control imbalance and sample relatedness in large-scale genetic association studies. Nat. Genet. 50, 1335–1341 (2018).
    
    [CrossRef](http://medrxiv.org/lookup/external-ref?access_num=10.1038/s41588-018-0184-y&link_type=DOI) 
    
    [PubMed](http://medrxiv.org/lookup/external-ref?access_num=30104761&link_type=MED&atom=%2Fmedrxiv%2Fearly%2F2022%2F10%2F13%2F2022.09.13.22279644.atom) 

66. 66.Abraham, G., Qiu, Y. & Inouye, M. FlashPCA2: principal component analysis of Biobank-scale genotype datasets. Bioinforma. Oxf. Engl. 33, 2776–2778 (2017).
    
    
67. 67.Delaneau, O., Marchini, J. & Zagury, J.-F. A linear complexity phasing method for thousands of genomes. Nat. Methods 9, 179–181 (2011).
    
    [CrossRef](http://medrxiv.org/lookup/external-ref?access_num=10.1038/nmeth.1785&link_type=DOI) 
    
    [PubMed](http://medrxiv.org/lookup/external-ref?access_num=22138821&link_type=MED&atom=%2Fmedrxiv%2Fearly%2F2022%2F10%2F13%2F2022.09.13.22279644.atom) 

68. 68.Howie, B. N., Donnelly, P. & Marchini, J. A flexible and accurate genotype imputation method for the next generation of genome-wide association studies. PLoS Genet. 5, e1000529 (2009).
    
    [CrossRef](http://medrxiv.org/lookup/external-ref?access_num=10.1371/journal.pgen.1000529&link_type=DOI) 
    
    [PubMed](http://medrxiv.org/lookup/external-ref?access_num=19543373&link_type=MED&atom=%2Fmedrxiv%2Fearly%2F2022%2F10%2F13%2F2022.09.13.22279644.atom) 

69. 69.Willer, C. J., Li, Y. & Abecasis, G. R. METAL: fast and efficient meta-analysis of genomewide association scans. Bioinformatics 26, 2190–2191 (2010).
    
    [CrossRef](http://medrxiv.org/lookup/external-ref?access_num=10.1093/bioinformatics/btq340&link_type=DOI) 
    
    [PubMed](http://medrxiv.org/lookup/external-ref?access_num=20616382&link_type=MED&atom=%2Fmedrxiv%2Fearly%2F2022%2F10%2F13%2F2022.09.13.22279644.atom) 
    
    [Web of Science](http://medrxiv.org/lookup/external-ref?access_num=000281738900017&link_type=ISI) 

70. 70.Pruim, R. J. et al. LocusZoom: regional visualization of genome-wide association scan results. Bioinforma. Oxf. Engl. 26, 2336–2337 (2010).
    
    
71. 71.Yang, J., Lee, S. H., Goddard, M. E. & Visscher, P. M. GCTA: A Tool for Genome-wide Complex Trait Analysis. Am. J. Hum. Genet. 88, 76–82 (2011).
    
    [CrossRef](http://medrxiv.org/lookup/external-ref?access_num=10.1016/j.ajhg.2010.11.011&link_type=DOI) 
    
    [PubMed](http://medrxiv.org/lookup/external-ref?access_num=21167468&link_type=MED&atom=%2Fmedrxiv%2Fearly%2F2022%2F10%2F13%2F2022.09.13.22279644.atom) 

72. 72.Yang, J. et al. Conditional and joint multiple-SNP analysis of GWAS summary statistics identifies additional variants influencing complex traits. Nat. Genet. 44, 369–375 (2012).
    
    [CrossRef](http://medrxiv.org/lookup/external-ref?access_num=10.1038/ng.2213&link_type=DOI) 
    
    [PubMed](http://medrxiv.org/lookup/external-ref?access_num=22426310&link_type=MED&atom=%2Fmedrxiv%2Fearly%2F2022%2F10%2F13%2F2022.09.13.22279644.atom) 

73. 73.Bulik-Sullivan, B. K. et al. LD Score regression distinguishes confounding from polygenicity in genome-wide association studies. Nat. Genet. 47, 291–295 (2015).
    
    [CrossRef](http://medrxiv.org/lookup/external-ref?access_num=10.1038/ng.3211&link_type=DOI) 
    
    [PubMed](http://medrxiv.org/lookup/external-ref?access_num=25642630&link_type=MED&atom=%2Fmedrxiv%2Fearly%2F2022%2F10%2F13%2F2022.09.13.22279644.atom) 

74. 74.Wen, X., Pique-Regi, R. & Luca, F. Integrating molecular QTL data into genome-wide genetic association analysis: Probabilistic assessment of enrichment and colocalization. PLOS Genet. 13, e1006646 (2017).
    
    [CrossRef](http://medrxiv.org/lookup/external-ref?access_num=10.1371/journal.pgen.1006646&link_type=DOI) 
    
    [PubMed](http://medrxiv.org/lookup/external-ref?access_num=28278150&link_type=MED&atom=%2Fmedrxiv%2Fearly%2F2022%2F10%2F13%2F2022.09.13.22279644.atom) 

75. 75.Wen, X., Lee, Y., Luca, F. & Pique-Regi, R. Efficient Integrative Multi-SNP Association Analysis via Deterministic Approximation of Posteriors. Am. J. Hum. Genet. 98, 1114–1129 (2016).
    
    [CrossRef](http://medrxiv.org/lookup/external-ref?access_num=10.1016/j.ajhg.2016.03.029&link_type=DOI) 
    
    [PubMed](http://medrxiv.org/lookup/external-ref?access_num=27236919&link_type=MED&atom=%2Fmedrxiv%2Fearly%2F2022%2F10%2F13%2F2022.09.13.22279644.atom) 

76. 76.Maller, J. B. et al. Bayesian refinement of association signals for 14 loci in 3 common diseases. Nat. Genet. 44, 1294–1301 (2012).
    
    [CrossRef](http://medrxiv.org/lookup/external-ref?access_num=10.1038/ng.2435&link_type=DOI) 
    
    [PubMed](http://medrxiv.org/lookup/external-ref?access_num=23104008&link_type=MED&atom=%2Fmedrxiv%2Fearly%2F2022%2F10%2F13%2F2022.09.13.22279644.atom) 

77. 77.Corces, M. R. et al. Lineage-specific and single-cell chromatin accessibility charts human hematopoiesis and leukemia evolution. Nat. Genet. 48, 1193–1203 (2016).
    
    [CrossRef](http://medrxiv.org/lookup/external-ref?access_num=10.1038/ng.3646&link_type=DOI) 
    
    [PubMed](http://medrxiv.org/lookup/external-ref?access_num=http://www.n&link_type=MED&atom=%2Fmedrxiv%2Fearly%2F2022%2F10%2F13%2F2022.09.13.22279644.atom) 

78. 78.Ting, Y. T. et al. A molecular basis for the T cell response in HLA-DQ2.2 mediated celiac disease. Proc. Natl. Acad. Sci. U. S. A. 117, 3063–3073 (2020).
    
    [Abstract/FREE Full Text](http://medrxiv.org/lookup/ijlink/YTozOntzOjQ6InBhdGgiO3M6MTQ6Ii9sb29rdXAvaWpsaW5rIjtzOjU6InF1ZXJ5IjthOjQ6e3M6ODoibGlua1R5cGUiO3M6NDoiQUJTVCI7czoxMToiam91cm5hbENvZGUiO3M6NDoicG5hcyI7czo1OiJyZXNpZCI7czoxMDoiMTE3LzYvMzA2MyI7czo0OiJhdG9tIjtzOjUwOiIvbWVkcnhpdi9lYXJseS8yMDIyLzEwLzEzLzIwMjIuMDkuMTMuMjIyNzk2NDQuYXRvbSI7fXM6ODoiZnJhZ21lbnQiO3M6MDoiIjt9) 

79. 79.Mooers, B. H. M. Shortcuts for faster image creation in PyMOL. Protein Sci. Publ. Protein Soc. 29, 268–276 (2020).
    
    
80. 80.Halgren, T. A. MMFF VII.. Characterization of MMFF94, MMFF94s, and other widely available force fields for conformational energies and for intermolecular-interaction energies and geometries. J. Comput. Chem. 20, 730–748 (1999).
    
    [CrossRef](http://medrxiv.org/lookup/external-ref?access_num=10.1002/(SICI)1096-987X(199905)20:7<730::AID-JCC8>3.0.CO;2-T&link_type=DOI) 
    
    [Web of Science](http://medrxiv.org/lookup/external-ref?access_num=000079900000008&link_type=ISI) 

81. 81.Johansson, M. U., Zoete, V., Michielin, O. & Guex, N. Defining and searching for structural motifs using DeepView/Swiss-PdbViewer. BMC Bioinformatics 13, 173 (2012).
    
    [CrossRef](http://medrxiv.org/lookup/external-ref?access_num=10.1186/1471-2105-13-173&link_type=DOI) 
    
    [PubMed](http://medrxiv.org/lookup/external-ref?access_num=22823337&link_type=MED&atom=%2Fmedrxiv%2Fearly%2F2022%2F10%2F13%2F2022.09.13.22279644.atom) 

82. 82.Pol-Fachin, L., Fernandes, C. L. & Verli, H. GROMOS96 43a1 performance on the characterization of glycoprotein conformational ensembles through molecular dynamics simulations. Carbohydr. Res. 344, 491–500 (2009).
    
    [CrossRef](http://medrxiv.org/lookup/external-ref?access_num=10.1016/j.carres.2008.12.025&link_type=DOI) 
    
    [PubMed](http://medrxiv.org/lookup/external-ref?access_num=19185293&link_type=MED&atom=%2Fmedrxiv%2Fearly%2F2022%2F10%2F13%2F2022.09.13.22279644.atom) 
    
    [Web of Science](http://medrxiv.org/lookup/external-ref?access_num=000264615900012&link_type=ISI) 

83. 83.Jubb, H. C. et al. Arpeggio: A Web Server for Calculating and Visualising Interatomic Interactions in Protein Structures. J. Mol. Biol. 429, 365–371 (2017).
    
    [CrossRef](http://medrxiv.org/lookup/external-ref?access_num=10.1016/j.jmb.2016.12.004&link_type=DOI) 
    
    [PubMed](http://medrxiv.org/lookup/external-ref?access_num=27964945&link_type=MED&atom=%2Fmedrxiv%2Fearly%2F2022%2F10%2F13%2F2022.09.13.22279644.atom) 

84. 84.Ruan, Y. et al. Improving polygenic prediction in ancestrally diverse populations. Nat. Genet. 54, 573–580 (2022).
    
    
85. 85.Pedersen, B. S. & Quinlan, A. R. Who’s Who? Detecting and Resolving Sample Anomalies in Human DNA Sequencing Studies with Peddy. Am. J. Hum. Genet. 100, 406–413 (2017).
    
    [CrossRef](http://medrxiv.org/lookup/external-ref?access_num=10.1016/j.ajhg.2017.01.017.&link_type=DOI)