Phenome-wide causal proteomics enhance systemic lupus erythematosus flare prediction: A study in Asian populations
==================================================================================================================

* Liying Chen
* Ou Deng
* Ting Fang
* Mei Chen
* Xvfeng Zhang
* Ruichen Cong
* Dingqi Lu
* Runrun Zhang
* Qun Jin
* Xinchang Wang

## Abstract

**Objective** Systemic lupus erythematosus (SLE) is a complex autoimmune disease characterized by unpredictable flares. This study aimed to develop a novel proteomics-based risk prediction model specifically for Asian SLE populations to enhance personalized disease management and early intervention.

**Methods** A longitudinal cohort study was conducted over 48 weeks, including 139 SLE patients monitored every 12 weeks. Patients were classified into flare (n = 53) and non-flare (n = 86) groups. Baseline plasma samples underwent data-independent acquisition (DIA) proteomics analysis, and phenome-wide Mendelian randomization (PheWAS) was performed to evaluate causal relationships between proteins and clinical predictors. Logistic regression (LR) and random forest (RF) models were used to integrate proteomic and clinical data for flare risk prediction.

**Results** Five proteins (SAA1, B4GALT5, GIT2, NAA15, and RPIA) were significantly associated with SLE Disease Activity Index-2K (SLEDAI-2K) scores and 1-year flare risk, implicating key pathways such as B-cell receptor signaling and platelet degranulation. SAA1 demonstrated causal effects on flare-related clinical markers, including hemoglobin and red blood cell counts. A combined model integrating clinical and proteomic data achieved the highest predictive accuracy (AUC = 0.769), surpassing individual models. SAA1 was highlighted as a priority biomarker for rapid flare discrimination.

**Conclusion** The integration of proteomic and clinical data significantly improves flare prediction in Asian SLE patients.
The identification of key proteins and their causal relationships with flare-related clinical markers provides valuable insights for proactive SLE management and personalized therapeutic approaches.

![Figure1](http://medrxiv.org/https://www.medrxiv.org/content/medrxiv/early/2024/11/18/2024.11.17.24317460/F1.medium.gif)

[Figure1](http://medrxiv.org/content/early/2024/11/18/2024.11.17.24317460/F1)

Keywords

* Systemic lupus erythematosus
* Flare
* Causal proteomics
* Phenome-wide mendelian randomization
* Longitudinal cohort study

## Introduction

Systemic lupus erythematosus (SLE) is a heterogeneous autoimmune disorder characterized by the production of multiple autoantibodies and involvement of various organ systems. Its clinical course is unpredictable: patients alternate between remissions and flares (sudden increases in disease activity), which pose significant management challenges[1, 2]. Although early disease control and adherence to pharmacological interventions are foundational in managing SLE, conventional predictive models relying on clinical observations and genetic markers are limited in capturing the disease’s dynamic and multifactorial nature[3, 4].

The complexity of SLE flares, often arising unpredictably from multi-faceted and poorly understood triggers, necessitates a nuanced, data-driven predictive approach. Accurate flare prediction requires longitudinal tracking of disease activity, identification of modifiable risk factors, and comprehensive evaluation of clinical, serological, and molecular markers. Recent studies in Asian SLE populations have linked risk factors such as thrombocytopenia, hypocomplementemia, elevated neutrophil-to-lymphocyte ratio (NLR), and high platelet-to-lymphocyte ratio (PLR) to increased flare risk[5, 6]. Additionally, the presence of specific autoantibodies like anti-ribosomal P and anti-phospholipid antibodies is associated with heightened flare susceptibility in Chinese patients[7].
Non-biological factors, including quality of life, psychological stress, and overexertion, also significantly contribute to flare incidence, highlighting the need to consider both biological and psychosocial variables in predictive models[8, 9, 10]. Advancements in genome-wide association studies (GWAS) and PheWAS have deepened our understanding of the genetic underpinnings and phenotypic expressions of autoimmune diseases. Mendelian randomization (MR) and PheWAS methodologies enable exploration of causal relationships between genetic variations, proteins, and disease outcomes, offering more robust insights than traditional observational studies[11, 12]. Given SLE’s diverse nature, a multi-omics approach that integrates proteomic data with genomic and clinical information is essential for identifying dynamic biomarkers capable of improving flare prediction[13]. In this context, proteomics-based MR holds significant promise for discovering novel, druggable protein targets, thereby enhancing our ability to classify and stratify patients by flare risk. This study aims to address the limitations of current predictive models for SLE flares through a comprehensive, integrative approach. We first conduct an exhaustive analysis of known clinical predictors to build a robust baseline model. Building on this foundation, we incorporate cutting-edge plasma proteomics techniques to gain deeper insights into molecular changes associated with flare events. We explore causal relationships between identified plasma proteins and clinical outcomes through phenome-wide Mendelian randomization analyses, identifying proteins with a causal impact on flare risk. Finally, we leverage advanced machine learning algorithms to integrate proteomic and clinical data, constructing a predictive model that offers more precise risk stratification for SLE flares. 
This integrative methodology aims to significantly improve early detection of organ damage, guiding timely and personalized therapeutic interventions. ## Methods ### Patient Recruitment and Follow-up This prospective multicenter study recruited 139 SLE patients between August 2020 and January 2023. Eligible participants were aged 18-65 and met the 2012 SLICC classification criteria[14]. Patients with active infections, malignancies, or other connective tissue diseases were excluded. Disease activity was assessed using the SLEDAI-2K at baseline and at 3, 6, and 12 months to identify flare events. The study was approved by the ethics committee of the Second Affiliated Hospital of Zhejiang Chinese Medical University (approval NO.2020-KL-002-IH01). Informed consent was obtained from all participants. Additional details are in Supplementary Table 1. ### Outcome Measurement The primary outcome was the occurrence of an SLE flare within 12 months, defined as an increase in SLEDAI-2K score by *≥* 3 points from baseline and previous assessments[15]. This widely accepted criterion standardizes flare identification. ### Clinical Variables: Definitions and Selection Baseline data included: (1) Complement levels (C3, C4); (2) Inflammatory markers (NLR, PLR); (3) Medication use (glucocorticoids, immunosuppressants); (4) Serological markers (anti-ribosomal P antibodies); (5) Disease activity (SLEDAI-2K score); and (6) Quality of life (LupusQoL questionnaire). LR and RF algorithms identified key clinical predictors by ranking features based on predictive importance. All variables were dichotomized (1 or 0) for simplicity and clinical relevance, facilitating risk stratification. This approach aims to develop a robust predictive model integrating clinical parameters with proteomic biomarkers. ### Proteomic Analysis DIA data were processed into spectral libraries using SpectraST and analyzed with DIA-NN (v1.7.0), ensuring accurate results. 
Detailed workflows are available at [https://www.iprox.cn/page/HMV006.html](https://www.iprox.cn/page/HMV006.html).

### Differential Protein and SLEDAI-2K Correlation Analysis

Proteins expressed in at least 25% of samples were analyzed. Student’s t-test identified plasma proteins differentiating flare and non-flare patients, considering proteins with $|\log_2(\text{fold change})| \geq 1$ and $p < 0.05$ as significant. Spearman’s correlation assessed relationships between these proteins and SLEDAI-2K scores, with significance at $p < 0.05$.

### Pathway Enrichment Analysis

We performed pathway enrichment analysis on differentially expressed proteins and those correlated with SLEDAI-2K scores. We analyzed Reactome and KEGG pathways using DAVID ([https://david.ncifcrf.gov/](https://david.ncifcrf.gov/)), considering pathways with $p < 0.05$ as significantly enriched.

### Causal Proteomics Analysis Using PheWAS

Genetic instruments for identified pQTLs were obtained from 2,958 Han Chinese participants[16]. Outcome data for SLE phenotypes were sourced from BioBank Japan ($n = 179{,}000$) and supplemented by UK Biobank and FinnGen data ($n_{\text{total}} = 628{,}000$)[17]. The GWAS dataset included predictors like chronic glomerulonephritis, blood cell counts, and medication use. To ensure robust instruments, we selected pQTLs that met three criteria: genome-wide significance ($p < 5 \times 10^{-8}$), independence ($R^2 < 0.001$ with clumping distance $> 10{,}000$ kb), and an $F$ statistic $> 10$. The primary method was inverse-variance weighted Mendelian Randomization. To address potential violations of the MR assumptions, we performed sensitivity analyses using MR-Egger regression, weighted median, and mode-based estimators. Heterogeneity was assessed with Cochran’s Q test, and leave-one-out analyses evaluated SNP influence on causal estimates. This approach allowed us to explore causal relationships between serum amyloid A1 (SAA1) pQTLs and SLE flare risk.

### Machine Learning-Based Biomarker Selection

We developed predictive models for SLE flares using LR and RF algorithms.
Three models were created: (1) Clinical models using validated risk parameters; (2) Protein models using identified biomarkers; and (3) Combined models integrating clinical and proteomic data. Datasets were split into training (85%) and testing (15%) sets. Model performance was evaluated using Receiver Operating Characteristic curves and calibration curves to assess discriminative ability and reliability. ### Statistical Analysis Statistical analyses were conducted using R (v4.3.2) and GraphPad Prism 8. Clinical data are presented as medians with interquartile ranges. Group comparisons used Student’s t-test. Correlation analyses and linear regression assessed variable relationships. PheWAS analyses employed the TwoSampleMR R package. Predictive models used LR. Model accuracy was assessed using ROC curves and Area Under the Curve metrics via the pROC R package. ## Results ### Clinical Data Characteristics and Flare Outcome As shown in Fig.1, 139 SLE patients were monitored over 48 weeks and classified into flare (*n* = 53) and non-flare (*n* = 86) groups; 38% experienced at least one flare. The flare group had a high female predominance (94%), aligning with SLE’s gender disparity. Approximately 47% of flare patients were positive for anti-dsDNA antibodies. Significantly, the flare group had a higher prevalence of rRNP antibodies than the non-flare group (*p* = 0.019), suggesting their potential role in predicting flares. ![Figure 1:](http://medrxiv.org/https://www.medrxiv.org/content/medrxiv/early/2024/11/18/2024.11.17.24317460/F2.medium.gif) [Figure 1:](http://medrxiv.org/content/early/2024/11/18/2024.11.17.24317460/F2) Figure 1: Study design and patient follow-up schema. Schematic representation of the recruitment process, inclusion and exclusion criteria, and longitudinal assessment timeline for SLE patients over the 48-week study period. 
Significant differences in urinary erythrocyte count (ERY, *p* = 0.008), platelet count (PLT, *p* = 0.023), and platelet-to-lymphocyte ratio (PLR, *p* = 0.035) were observed, highlighting their potential as flare risk markers. Prednisone use did not differ significantly between groups (*p* = 0.284). Hydroxychloroquine use was slightly lower in the flare group (by 11%), though the difference was not statistically significant (*p* = 0.617), warranting further investigation into its protective effects. No significant differences in organ involvement were observed (*p* > 0.05). Baseline characteristics are detailed in Table 1.

[Table 1:](http://medrxiv.org/content/early/2024/11/18/2024.11.17.24317460/T1) Clinical and demographic characteristics of SLE flare.

### Importance Ranking of Clinical Predictors

We identified key clinical risk factors for SLE flares using univariate LR and RF based on 39 baseline features from 139 patients. In the LR model (Fig.2A), top predictors were: (1) 24-hour urinary total protein (UTP), (2) urinary erythrocyte count (ERY), (3) platelet count (PLT), (4) platelet-to-lymphocyte ratio (PLR), and (5) anti-ribosomal P (rRNP) antibodies. The RF model (Fig.2B) yielded a similar ranking, with NLR as the fifth predictor.

![Figure 2:](http://medrxiv.org/https://www.medrxiv.org/content/medrxiv/early/2024/11/18/2024.11.17.24317460/F3.medium.gif)

[Figure 2:](http://medrxiv.org/content/early/2024/11/18/2024.11.17.24317460/F3) Comparative analysis of clinical variable importance in predicting SLE flares. (A) Ranking of clinical variables based on LR coefficients. (B) Relative importance of clinical variables determined by the RF algorithm, measured by the mean decrease in Gini impurity.
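
The two rankings described in this subsection can be illustrated with a minimal Python sketch. It is not the authors' pipeline (the study reports using R). It assumes dichotomized 0/1 predictors, as stated in Methods, and relies on the fact that the slope of a univariate logistic regression on a single binary predictor equals the log odds ratio of the 2×2 table. In place of the random forest's mean decrease in Gini, which averages impurity decreases over many trees, it computes the Gini impurity decrease of a single split. The feature names, the toy cohort, and the 0.5 cell correction are hypothetical choices made for illustration.

```python
import math

def log_odds_ratio(x, y):
    """Slope of a univariate logistic regression on one binary predictor,
    computed as the log odds ratio of the 2x2 table."""
    a = sum(1 for xi, yi in zip(x, y) if xi == 1 and yi == 1)  # marker present, flare
    b = sum(1 for xi, yi in zip(x, y) if xi == 1 and yi == 0)  # marker present, no flare
    c = sum(1 for xi, yi in zip(x, y) if xi == 0 and yi == 1)  # marker absent, flare
    d = sum(1 for xi, yi in zip(x, y) if xi == 0 and yi == 0)  # marker absent, no flare
    if 0 in (a, b, c, d):
        # Assumption: Haldane-Anscombe 0.5 correction when a cell is empty,
        # because the maximum-likelihood slope is not finite in that case.
        a, b, c, d = a + 0.5, b + 0.5, c + 0.5, d + 0.5
    return math.log((a * d) / (b * c))

def gini(labels):
    """Gini impurity of a list of 0/1 labels."""
    if not labels:
        return 0.0
    p = sum(labels) / len(labels)
    return 2.0 * p * (1.0 - p)

def gini_decrease(x, y):
    """Impurity decrease from splitting outcome y on binary feature x
    (a single split, not a forest-wide average)."""
    absent = [yi for xi, yi in zip(x, y) if xi == 0]
    present = [yi for xi, yi in zip(x, y) if xi == 1]
    n = len(y)
    return gini(y) - len(absent) / n * gini(absent) - len(present) / n * gini(present)

def rank_features(features, y, score):
    """Feature names sorted by descending absolute score."""
    return sorted(features, key=lambda name: abs(score(features[name], y)), reverse=True)

# Hypothetical toy cohort: 1 = flare within 12 months, 0 = no flare.
flare = [1, 1, 1, 1, 0, 0, 0, 0]
toy_features = {
    "NLR_high": [1, 0, 1, 0, 1, 0, 1, 0],
    "PLT_low": [1, 1, 1, 0, 0, 0, 0, 1],
}

if __name__ == "__main__":
    print("LR ranking:", rank_features(toy_features, flare, log_odds_ratio))
    print("Gini ranking:", rank_features(toy_features, flare, gini_decrease))
```

Ranking by absolute coefficient treats protective and harmful predictors symmetrically. With continuous or multi-level predictors the closed-form shortcut no longer applies and an iterative logistic fit would be needed.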
### Identification of Differentially Expressed Biomarkers Associated with Flare

Proteomic analysis identified 102 significantly differentially expressed proteins (73 upregulated, 23 downregulated) associated with flares (Fig.3A), based on proteins detected in over 25% of samples. A heatmap (Fig.3B) illustrates these findings. Pathway enrichment analysis using Reactome revealed involvement in key pathways: (1) B-cell receptor (BCR) downstream signaling, (2) response to elevated platelet cytosolic Ca$^{2+}$, and (3) platelet degranulation (Fig.3C), offering insights into molecular mechanisms underlying flares.

![Figure 3:](http://medrxiv.org/https://www.medrxiv.org/content/medrxiv/early/2024/11/18/2024.11.17.24317460/F4.medium.gif)

[Figure 3:](http://medrxiv.org/content/early/2024/11/18/2024.11.17.24317460/F4) Multi-dimensional proteomic profiling of SLE flares. (A) Volcano plot illustrating the distribution and statistical significance of 102 differentially expressed proteins between flare and non-flare groups ($|\log_2(\text{fold change})| \geq 1$, $p < 0.05$). (B) Hierarchical clustering heatmap displaying normalized plasma protein expression levels, with a color gradient (blue to red) representing relative protein abundance. (C) Reactome pathway enrichment analysis of flare-associated proteins, with statistical significance indicated by $-\log_{10}(p\text{-value})$. (D) KEGG pathway analysis of proteins correlated with SLEDAI-2K scores. (E) STRING-based protein-protein interaction network of SLEDAI-2K-associated proteins, where node size reflects the degree of interaction. (F) Correlation matrix showing the relationships between upregulated proteins positively associated with SLEDAI-2K scores.

### Proteins Significantly Associated with SLEDAI-2K

We identified plasma proteins significantly associated with SLEDAI-2K scores.
KEGG pathway analysis indicated their involvement in neurodegeneration, lipid metabolism, atherosclerosis, and neurotrophin signaling (Fig.3D). Protein-protein interaction analysis using STRING (Fig.3E) showed functional relationships among these proteins. Among 13 proteins correlating with SLEDAI-2K, five were significantly upregulated and positively associated with worse outcomes: (1) Serum Amyloid A-1 (SAA1), (2) *β*-1,4-galactosyltransferase 5 (B4GALT5), (3) Ribose 5-phosphate Isomerase A (RPIA), (4) GTPase-activating Protein 2 (GIT2), and (5) N-alpha-acetyltransferase 15 (NAA15). Correlation analysis (Fig.3F) showed significant positive associations between SAA1 and B4GALT5, RPIA, and NAA15, suggesting potential functional relationships in SLE activity and flare risk.

### Causal Effects of Flare-Associated Proteins on SLE and Risk Factors

Using PheWAS, we investigated causal mechanisms between plasma proteins and SLE flares, focusing on pQTLs for SAA1. We examined 220 SLE-related outcomes using data from BioBank Japan. Six SNPs associated with SAA1 levels were identified as instrumental variables. Inverse-variance weighted (IVW) analysis revealed significant causal effects of SAA1 on four flare-related outcomes: anti-inflammatory and antirheumatic medication use (OR = 1.071, 95% CI: 1.004 - 1.143, *p* = 0.040), hemoglobin (OR = 0.971, 95% CI: 0.947 - 0.996, *p* = 0.023), red blood cell count (OR = 0.971, 95% CI: 0.947 - 0.996, *p* = 0.021), and hematocrit (OR = 0.967, 95% CI: 0.937 - 0.997, *p* = 0.031) (Fig.4A-D, Table 2). None of the four showed evidence of heterogeneity by IVW (Cochran’s Q = 0.419, 0.263, 0.291, and 0.104, respectively) or by MR-Egger (Cochran’s Q = 0.334, 0.169, 0.195, and 0.068, respectively), and none showed evidence of horizontal pleiotropy by the MR-Egger intercept test (intercept = -0.015, *p* = 0.585; intercept = -0.001, *p* = 0.891; intercept = -0.003, *p* = 0.808; intercept = -0.005, *p* = 0.694, respectively).
The leave-one-out method suggested that the MR analysis results were reliable (Fig.5A-D). ![Figure 4:](http://medrxiv.org/https://www.medrxiv.org/content/medrxiv/early/2024/11/18/2024.11.17.24317460/F5.medium.gif) [Figure 4:](http://medrxiv.org/content/early/2024/11/18/2024.11.17.24317460/F5) Figure 4: Mendelian randomization analysis of SAA1’s causal effects on SLE flare risk factors. Forest plots from random-effects inverse-variance weighted (IVW) analyses depicting the genetic causal relationships between SAA1 and (A) anti-inflammatory and antirheumatic medication use, (B) hemoglobin levels, (C) red blood cell count, and (D) hematocrit. Odds ratios and 95% confidence intervals are shown. ![Figure 5:](http://medrxiv.org/https://www.medrxiv.org/content/medrxiv/early/2024/11/18/2024.11.17.24317460/F6.medium.gif) [Figure 5:](http://medrxiv.org/content/early/2024/11/18/2024.11.17.24317460/F6) Figure 5: Sensitivity analysis of Mendelian randomization results using the leave-one-out method. (A) Anti-inflammatory and antirheumatic medication use. (B) Hemoglobin levels. (C) Red blood cell count. (D) Hematocrit. This analysis evaluates the robustness of the causal relationships between SAA1 and four key outcomes: anti-inflammatory and antirheumatic medication use, hemoglobin levels, red blood cell count, and hematocrit. Each subplot (A-D) represents one of these outcomes, with individual points showing the Mendelian randomization estimate after excluding one SNP from the analysis, while the vertical line indicates the estimate using all SNPs. This approach helps identify potential outlier SNPs that may disproportionately influence the overall results. 
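
To make the IVW estimate and the leave-one-out check concrete, the following Python sketch computes a fixed-effect IVW estimate from per-SNP summary statistics, Cochran's Q, and the estimate obtained after dropping each SNP in turn. It is a simplified illustration rather than the TwoSampleMR implementation used in the study: Figure 4 reports random-effects IVW, whereas this sketch uses the fixed-effect form with first-order weights. The function names and the six SNP effect sizes are hypothetical placeholders, not values from Table 2.

```python
import math

def ivw_estimate(beta_exposure, beta_outcome, se_outcome):
    """Fixed-effect inverse-variance weighted estimate from per-SNP Wald ratios.

    Returns (beta_ivw, standard_error, cochran_q). Assumes every SNP has a
    non-zero effect on the exposure."""
    ratios = [by / bx for bx, by in zip(beta_exposure, beta_outcome)]
    weights = [bx * bx / (se * se) for bx, se in zip(beta_exposure, se_outcome)]
    total = sum(weights)
    beta = sum(w * r for w, r in zip(weights, ratios)) / total
    se = math.sqrt(1.0 / total)
    # Cochran's Q: weighted squared deviation of each ratio from the pooled estimate.
    q = sum(w * (r - beta) ** 2 for w, r in zip(weights, ratios))
    return beta, se, q

def leave_one_out(beta_exposure, beta_outcome, se_outcome):
    """IVW estimate recomputed with each SNP excluded once."""
    estimates = []
    for i in range(len(beta_exposure)):
        keep = [j for j in range(len(beta_exposure)) if j != i]
        beta, _, _ = ivw_estimate(
            [beta_exposure[j] for j in keep],
            [beta_outcome[j] for j in keep],
            [se_outcome[j] for j in keep],
        )
        estimates.append(beta)
    return estimates

# Hypothetical summary statistics for six instruments (illustrative only).
bx = [0.12, 0.09, 0.15, 0.11, 0.08, 0.13]        # SNP effect on protein level
by = [0.008, 0.007, 0.011, 0.006, 0.007, 0.009]  # SNP effect on outcome (log odds)
se = [0.004, 0.005, 0.004, 0.005, 0.006, 0.004]  # standard error of outcome effect

if __name__ == "__main__":
    beta, beta_se, q = ivw_estimate(bx, by, se)
    low, high = math.exp(beta - 1.96 * beta_se), math.exp(beta + 1.96 * beta_se)
    print(f"OR = {math.exp(beta):.3f} (95% CI {low:.3f} - {high:.3f}), Q = {q:.3f}")
    for i, b in enumerate(leave_one_out(bx, by, se), start=1):
        print(f"without SNP {i}: OR = {math.exp(b):.3f}")
```

A leave-one-out estimate that moves far from the all-SNP estimate would flag the excluded SNP as disproportionately influential, which is the pattern the plots in Figure 5 are designed to reveal.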
View this table: [Table 2:](http://medrxiv.org/content/early/2024/11/18/2024.11.17.24317460/T2) Table 2: Mendelian Randomization Results - Causal Effects Between pQTLs and SLE Flare Risk (IVW Approach) ### Construction and Internal Validation of Flare Risk Prediction Models We developed predictive models for SLE flare risk by dividing patients into training and test sets. Using LR and random forest algorithms, and following TRIPOD guidelines[18], we selected six key clinical variables: UTP, ERY, PLT, PLR, anti-rRNP antibodies, and NLR. Three models were developed: a clinical model, a protein-based model using five key proteins (SAA1, B4GALT5, GIT2, NAA15, RPIA), and a combined model integrating clinical and proteomic variables. ROC curve analysis showed the protein-based model had superior predictive accuracy (AUC = 0.744, 95% CI: 0.646 - 0.842) compared to the clinical model (AUC = 0.643, 95%CI: 0.541 - 0.745). The combined model achieved the highest AUC (AUC = 0.769, 95% CI : 0.678 - 0.860), indicating the benefit of integrating clinical and proteomic data. Calibration analysis demonstrated good predictive performance for all models (Fig.6A-C). The protein-based model showed particular strength in AUC and Mean Absolute Error (MAE), suggesting its suitability for independent flare prediction. The combined model may be more appropriate in complex clinical scenarios requiring multiple variables for accurate risk assessment. ![Figure 6:](http://medrxiv.org/https://www.medrxiv.org/content/medrxiv/early/2024/11/18/2024.11.17.24317460/F7.medium.gif) [Figure 6:](http://medrxiv.org/content/early/2024/11/18/2024.11.17.24317460/F7) Figure 6: Comparative performance evaluation of SLE flare risk prediction models. (A-C) Receiver Operating Characteristic (ROC) curves (left) and calibration plots (right) for (A) clinical, (B) proteomic, and (C) combined prediction models. ROC curves display sensitivity versus 1-specificity, with Area Under the Curve (AUC) values indicated. 
Calibration plots show predicted versus observed probabilities, generated using 1,000 bootstrap resamples. Hosmer-Lemeshow goodness-of-fit test P-values are provided to evaluate model calibration. ## Discussion This is the first study to develop an SLE flare prediction model based on proteomic analysis in an East Asian population. We identified novel molecular pathways and biomarkers associated with SLE flares, offering valuable insights for improved risk stratification and personalized management. Pathway enrichment analysis showed significant involvement of B cell receptor signaling, elevated platelet cytosolic Ca2+ responses, and platelet degranulation in SLE flare pathogenesis. KEGG analysis revealed enrichment of neurodegeneration, lipid metabolism, atherosclerosis, and neurotrophin signaling pathways among proteins linked to disease activity. These findings highlight the complex interplay between immune activation, vascular responses, and systemic inflammation in SLE flares, particularly affecting hematological, cardiovascular, and neurological systems in this East Asian cohort. We identified five key proteins, SAA1, B4GALT5, GIT2, NAA15, and RPIA, strongly associated with increased SLE flare risk over 48 weeks, independent of conventional risk factors. Our analysis demonstrated a causal effect of elevated SAA1 levels on flare risk factors, underscoring its critical role in disease progression. Integrating these proteomic biomarkers with clinical indicators significantly enhanced our model’s predictive accuracy. These results confirm the prognostic potential of causal proteomics in SLE flare risk stratification, paving the way for personalized therapies and earlier interventions for high-risk patients. Previous studies support Serum Amyloid A1 (SAA1) as a key biomarker for assessing SLE flare risk, reinforcing its predictive utility. 
SAA1-associated pQTLs have been causally linked to hematological parameters and depressive disorders in multi-GWAS PheWAS[19], enhancing our understanding of proteins in the psychological aspects of SLE flares[20]. As an acute-phase protein, SAA1 is markedly upregulated during inflammation, contributing to organ dysfunction[21]. Our study showed that elevated SAA1 levels causally influence hematological involvement and antirheumatic medication use, consistent with prior findings[22]. Elevated SAA1 levels correlate with SLEDAI-2K scores and nervous system involvement severity[23]. SAA1 is linked to the Th17 cell differentiation pathway, amplifying inflammatory responses[24]. Given its pivotal role in modulating inflammatory mediators and immune cells in SLE pathophysiology[25], investigating SAA1 can advance our understanding of flare mechanisms. Identified as an early immunological diagnostic biomarker with high sensitivity and specificity for SLE[26], SAA1 may serve as a valuable biomarker for identifying high-risk patients based on our East Asian PheWAS results. These findings suggest potential therapeutic targets for personalized treatments. However, since infections also trigger flares and SAA1 is associated with white blood cell count and NLR, further research is needed to distinguish SLE flares from infection-related events[27]. *β*-1,4-Galactosyltransferase V (B4GALT5) is crucial in carbohydrate metabolism, specifically in lactosylceramide synthesis [28]. Unexpectedly, our logistic regression showed a strong positive correlation between B4GALT5 expression and SLE flares, making it a significant protein associated with flare risk. Given B4GALT5’s role in antiviral immunity, overexpression leads to upregulation of inflammatory cytokines and glycosylated surface proteins involved in antigen presentation, cell adhesion, and migration[29]. 
However, we did not establish a causal relationship between B4GALT5 and SLE, possibly due to the low prevalence of B4GALT5 variants in the East Asian population studied. This underscores the need for diverse population studies to evaluate B4GALT5’s role in SLE pathogenesis and its predictive value. The unexpected correlation suggests a complex interplay between glycosylation processes, immune responses, and SLE activity. Further investigation is needed to elucidate how B4GALT5 might influence flare risk, given its functions in carbohydrate metabolism and immune regulation. These findings open avenues for understanding the glycosylation-immune axis in SLE and potential therapeutic interventions. GIT2 and NAA15 emerged as potential contributors to SLE flares. GIT2 is a scaffold protein regulating aging processes affecting multiple tissues and linked to neurodegeneration and cardiovascular disorders[30]. NAA15 is a susceptibility gene for neurodevelopmental disorders[31], implicated in congenital cardiac anomalies, plaque stability in atherosclerosis, and seizure pathophysiology[32, 33]. Although no direct evidence links GIT2 and NAA15 to SLE, their identification suggests novel pathways in SLE pathogenesis. Further research is needed to elucidate their specific roles in SLE flares, particularly considering disease duration and age, which merit attention in risk prediction models. SLE progression involves distinct risk factors at different disease stages. Early organ damage is driven by active disease (reflected in SLEDAI-2K), while later damage results from long-term medication effects, especially prolonged glucocorticoid use and withdrawal[34, 35, 36]. We identified key clinical risk factors for flares, including hematological and renal involvement and medication use, aligning with traditional findings. Risk factors may shift during disease progression, with early damage from disease activity and later damage from drug side effects, notably glucocorticoid therapy. 
Traditional correlation studies offer insights but do not establish causality. We employed proteomic Mendelian randomization to develop robust flare prediction models. By integrating correlational methods with advanced machine learning and causal analyses, we aimed to enhance prediction accuracy. Identifying proteomic biomarkers with direct causal effects provides deeper insights into SLE flare mechanisms, highlighting potential causal pathways and therapeutic targets for future interventions. This study has several limitations. The small sample size and focus on an East Asian population may limit statistical power and generalizability to broader populations. The 12-month follow-up may be insufficient to capture long-term fluctuations in SLE disease activity. Moreover, reliance on advanced proteomics techniques, while valuable, may hinder clinical accessibility and routine implementation. Although our PheWAS analysis revealed associations between genetic protein levels and SLE, establishing definitive causality remains challenging due to the complexities of SLE and potential violations of Mendelian randomization assumptions. Therefore, these results should be interpreted with caution. Future research should prioritize external validation of our predictive model in diverse cohorts, adhering to the TRIPOD guidelines. Studies with larger, heterogeneous populations and extended follow-up periods are essential to address current limitations. Developing methodologies that account for time-varying exposures and non-linear associations will enhance model robustness and accuracy. Further investigation is needed to verify causal relationships between identified proteins and SLE risk, improving clinical applicability. Additionally, comprehensive proteomic analyses across multiple organs are necessary to assess organ-specific protein effects, providing deeper insights into early flare mechanisms. 
## Conclusion This study demonstrates the potential of integrating causal proteomics with clinical risk factors to improve SLE flare prediction. Our findings provide significant advancements in understanding the molecular mechanisms underlying SLE flares and offer a foundation for more precise and personalized approaches to SLE management. The identified proteomic signatures and causal pathways represent promising avenues for future therapeutic interventions and risk stratification strategies, with the potential to enhance patient care and outcomes in SLE. ## Data Availability The mass spectrometry proteomics data generated in this study have been deposited in the Integrated Proteome Resources (iProX) database under the dataset identifier IPX0007978000. The data are publicly accessible through the iProX website (https://www.iprox.cn/page/). Additional datasets that support the findings of this study are available from the corresponding author upon reasonable request. ## Authors’ Contributions LC conceptualized the study, developed the methodology, conducted formal analysis, and led the writing of the manuscript. OD contributed to the development of the methodology, analysis, supervision, and revision of the manuscript. TF, MC, and XZ were involved in data collection, curation, and validation. RC, DL, and RZ provided assistance in manuscript preparation and revision. QJ contributed to conceptualization, supervision, and project administration. XW oversaw the project, providing leadership in conceptualization, supervision, funding acquisition, and manuscript review. All authors have reviewed and approved the final version of the manuscript. ## Funding This research was supported by the Zhejiang Traditional Chinese Medical Modernization Special Project (grant number 2020ZX008). 
## Declarations

### Ethics approval and consent to participate

The study was approved by the ethics committee of the Second Affiliated Hospital of Zhejiang Chinese Medical University (approval NO.2020-KL-002-IH01). Informed consent was obtained from all participants.

### Consent for publication

Not applicable. Our manuscript does not contain any individual person’s data.

### Competing Interests

The authors declare no competing interests.

## Abbreviations

* SLE: systemic lupus erythematosus
* PheWAS: phenome-wide mendelian randomization
* GWAS: genome-wide association studies
* MR: mendelian randomization
* DIA: data-independent acquisition
* LR: logistic regression
* RF: random forest
* UTP: 24-hour urinary total protein
* ERY: urinary erythrocyte count
* PLT: platelet count
* NLR: neutrophil-to-lymphocyte ratio
* PLR: platelet-to-lymphocyte ratio
* rRNP: anti-ribosomal P antibodies
* LupusQoL: lupus quality of life
* SLEDAI-2K: Systemic Lupus Erythematosus Disease Activity Index 2000
* BCR: B-cell receptor
* SLICC: systemic lupus international collaborating clinics
* KEGG: Kyoto Encyclopedia of Genes and Genomes
* SAA1: serum amyloid A1
* B4GALT5: *β*-1,4-galactosyltransferase 5
* RPIA: Ribose 5-phosphate isomerase A
* GIT2: GTPase-activating protein 2
* NAA15: N-alpha-acetyltransferase 15
* IVW: inverse-variance weighted
* TRIPOD: transparent reporting of a multivariable prediction model for individual prognosis or diagnosis

## Supplementary Tables

*1.
Name of multi-center hospitals*

[Table S1.](http://medrxiv.org/content/early/2024/11/18/2024.11.17.24317460/T3) Name of multi-center hospitals

## Acknowledgments

We thank members of the Genomics Platform (Human Phenome Institute, Fudan University) for library preparation experiments and providing the proteomics analysis. We also appreciate the medical background advice provided by Professor Atsushi Ogihara of Waseda University.

* Received November 17, 2024.
* Revision received November 17, 2024.
* Accepted November 18, 2024.
* © 2024, Posted by Cold Spring Harbor Laboratory

This pre-print is available under a Creative Commons License (Attribution-NoDerivs 4.0 International), CC BY-ND 4.0, as described at [http://creativecommons.org/licenses/by-nd/4.0/](http://creativecommons.org/licenses/by-nd/4.0/)