Body mass index changes and their association with SARS-CoV-2 infection: a real-world analysis ============================================================================================== * Jithin Sam Varghese * Yi Guo * Mohammed K. Ali * W. Troy Donahoo * Rosette J. Chakkalakal ## ABSTRACT **Objective** To study body mass index (BMI) changes among individuals aged 18-99 years with and without SARS-CoV-2 infection. **Subjects/Methods** Using real-world data from the One Florida+ Clinical Research Network of the National Patient-Centered Clinical Research Network, we compared changes over time in BMI in an Exposed cohort (positive SARS-CoV-2 test between March 2020 – January 2022), to a contemporary Unexposed cohort (negative SARS-CoV-2 tests), and an age/sex-matched Historical control cohort (March 2018 – January 2020). Body mass index (kg/m2) was retrieved from objective measures of height and weight in electronic health records. We used target trial approaches to estimate BMI at baseline and change per 100 days of follow-up for Unexposed and Historical cohorts relative to the Exposed cohort by categories of sex, race-ethnicity, age, and hospitalization status. **Results** The study sample consisted of 44,436 (Exposed cohort), 164,118 (Unexposed cohort), and 41,189 (Historical cohort). Cumulatively, 62% were women, 21.5% Non-Hispanic Black, 21.4% Hispanic and 5.6% Non-Hispanic Other. Patients had an average age of 51.9 years (SD: 18.9). At baseline, relative to the Exposed cohort (mean BMI: 29.3 kg/m2 [95%CI: 29.0, 29.7]), the Unexposed (–0.07 kg/m2 [95%CI; –0.12, –0.01]) and Historical controls (–0.27 kg/m2 [95%CI; – 0.34, –0.20]) had lower BMI. Relative to no change in the Exposed over 100 days (0.00 kg/m2 [95%CI; –0.03,0.03]), the BMI of those Unexposed decreased (–0.04 kg/m2 [95%CI; –0.06, – 0.01]) while the Historical cohort’s BMI increased (+0.03 kg/m2 [95%CI;0.00,0.06]). BMI changes were consistent between Exposed and Unexposed cohorts for most population groups, except at start of follow-up period among Males and those 65 years or older, and in changes over 100 days among Males and Hispanics. **Conclusions** In a diverse real-world cohort of adults, mean BMI of those with and without SARS-CoV2 infection varied in their trajectories. The mechanisms and implications of weight retention following SARS-CoV-2 infection remain unclear. ## Introduction Poor cardiometabolic health, including obesity and diabetes, are known risk factors for COVID-19 severity and mortality.1 Recent studies of post-acute sequelae of COVID-19 (PASC) suggest COVID-19 may, in turn, worsen cardiometabolic health as increased rates of type 2 diabetes, high blood pressure, and dyslipidemia have been reported for individuals infected with SARS-CoV-2.2–8 Few studies have examined changes in body mass index (BMI) which may contribute to observed associations of COVID-19 with incident cardiometabolic diseases, particularly for younger, healthier adults. Racial and ethnic minorities in the US have also been underrepresented in prior studies of PASC, making the findings of this research difficult to generalize to communities that experience disproportionately higher rates of SARS-CoV-2 infection, COVID-19-related morbidity and mortality, and cardiometabolic disease in the US.9–11 Understanding the magnitude and direction of changes in cardiometabolic risk factors during the post-acute period of SARS-CoV-2 infection status in diverse, real-world populations can provide insights into pathways through which COVID-19 may increase rates of type 2 diabetes and other cardiometabolic conditions. The purpose of this real-world evidence study was to characterize key changes in BMI among individuals who survived the acute phase (defined as the first 30 days) of SARS-CoV-2 infection in a socio-demographically diverse cohort and compare them with a contemporary cohort who always tested negative for SARS-CoV-2 as well as a historical cohort (prior to the pandemic) matched on age and sex to those who tested positive. Using analytical approaches for target trial emulation, we examined if BMI changes differed by socio-demographic (sex, race-ethnicity, age) category and hospitalization status. ## Methods ### Study Design We conducted a secondary analysis of patient-level electronic health record (EHR) data from the OneFlorida+ clinical research network (CRN) in the National Patient-Centered Clinical Research Network (PCORnet).12 PCORnet CRNs have become a valuable resource to study symptom profiles of PASC.13,14 For the current study, the OneFlorida+ Data Analytics team independently queried the clinical data warehouse, built on the PCORnet Common Data Model, to extract data on socio-demographic and clinical characteristics of eligible patients.15 We built three retrospective cohorts of adult individuals who sought care at least three times in the OneFlorida+ clinical sites over the period 2016-22.12 Individuals were included if they were 18 years or older with at least two inpatient or outpatient encounters recorded in OneFlorida+ in the two years before inclusion, and at least one inpatient or outpatient encounter in the period after inclusion (30 days from inclusion until last visit or 28 February 2022). Individuals were excluded if they had diabetes mellitus, as defined by previously validated computable phenotypes for type 1 diabetes (T1DM) and type 2 diabetes (T2DM; at least two of three criteria: ICD-9/10 diagnosis code for diabetes, antidiabetic medication, and/or HbA1c ≥6.5%), in the 24 months preceding the date of first test or COVID-19 diagnosis code (**Supplementary Table 1)**.16 The Institutional Review Board of Emory University determined the study met criteria for exemption under 45 CFR 46.104(d). This study followed the REporting of studies Conducted using Observational Routinely-collected health Data (RECORD) Statement.17 ### Exposure Cohorts The Exposed cohort were individuals exposed to SARS-CoV-2, defined by meeting at least one of two criteria between 1 March 2020 and 29 January 2022: any positive test (Nucleic Acid Amplification Tests [NAAT] or antigen) or COVID-19 related ICD-10-CM codes (E08 to E13).13 The index date of infection exposure (T) was defined as the date of the first positive test or date of record for the ICD-10-CM code. The timeline for cohort selection and follow-up is presented in **Supplementary** Figure 1. The Unexposed cohort were individuals not measured to have had exposure to SARS-CoV-2 and consisted of individuals who met all of three criteria between 1 March 2020 and 29 January 2022: at least one negative test (NAAT or antigen), no COVID-19 related International Classification of Diseases-10 CM (ICD-10 CM) codes (E08.X to E13.X), and no positive test during follow-up period. Since the Unexposed cohort tested negative throughout the follow-up period, to maximize follow-up duration, the index date (T) was defined as the date of the first negative test. The Historical cohort was constructed to be representative of typical clinic visits among patients who utilized the healthcare delivery systems in the pre-pandemic period, and ensure differences observed in the Exposed and Unexposed cohorts were not due to the pandemic period. Patients who were not subsequently part of Exposed or Unexposed cohorts were sampled into the Historical cohort after matching on sex and 5-year age intervals. The index dates of Historical cohort ranged from 2 March 2018 and 30 January 2020, and were selected such that the last date of possible follow-up (29 February 2020) did not overlap with the start of the calendar period of identification (1 March 2020) for the Exposed and Unexposed cohorts. We restricted our study sample to individuals who had at least one measurement of weight in the year preceding their observation period in our study (**Supplementary** Figure 2). The final analytic sample consisted of 44,436 Exposed, 164,118 Unexposed controls and 41,189 Historical controls. All subsequent analysis accounted for the stratified sampling of Historical controls. ### Outcomes – Cardiometabolic Health Indicators We explored average longitudinal change in body mass index (BMI) based on objective measures recorded in the EHR. Follow-up period (in days) was calculated as number of days from 30 days after index date (i.e. Time tF = t – [T + 30]), defined as the post-acute phase of the disease. To exclude weight changes associated with severe illness and weight loss prior to mortality, we included only those participants who were alive at least 30 days after SARS-CoV2 infection was detected.2,13 ### Effect Modifiers – Socio-demographic Characteristics and Hospitalization Status Socio-demographic characteristics included sex (male, female), race-ethnicity (Non-Hispanic White, Non-Hispanic Black, Hispanic), and age at index date (18-39 years, 40-64 years, 65 years and older). We determined all-cause hospitalization status (not hospitalized, hospitalized) based on the clinical encounter coded on the index date. ### Confounders We used a combination of biologically plausible covariates and empirically identified covariates to adjust for baseline differences between cohorts using inverse probability weighting (IPW) (see **Supplementary Methods**). Biologically plausible covariates known to be associated with higher cardiometabolic risk were derived from data collected before the index date. We used data collected within 1 year of index date to define smoking status (yes/no), use of medications relevant to cardiometabolic health (yes/no; antihypertensives, statins, antidepressants, antipsychotics, steroids), and laboratory parameters (HbA1c, random glucose, lipid panel). Given the lower frequency of reporting diagnosis codes, we used data collected within 2 years before the index date to identify comorbidities using grouped ICD-10-CM codes (yes/no; obesity, cardiovascular disease, cerebrovascular disease, hyperlipidemia, hypertension, pulmonary disease). Information on medication, laboratory, and diagnosis codes used for categorization are provided in **Supplementary Table 1**. We identified the health system and primary insurance type (Medicare, Medicaid, Other Government, Private, No Insurance, No Information) used on index date. Empirically identified covariates were computed across six data dimensions available in OneFlorida+: inpatient diagnoses, outpatient diagnoses, inpatient procedures, outpatient procedures, prescriptions, and laboratory tests.18 We used participant records from within 1 year before index date to compute the covariates. Diagnostic codes and prescriptions were categorized into 537 Clinical Classifications Software Refined (CCSR) categories and 89 Anatomical Therapeutic Chemical (ATC) Level 3 classes respectively. The 1,951 procedure codes (Current Procedural Terminology, ICD-10-CM) and abnormalities in 876 laboratory tests (Logical Observation Identifiers Names and Codes [LOINC]) were included without additional categorization. We identified the top 200 covariates in each data domain based on their prevalence in the sample. For each covariate, we then computed three intensity dummy variables (at least once, sporadic [more often than median participant], frequent [more often than the 75th percentile]). ### Statistical Analysis We present an overview of the analytic strategy in **Supplementary** Figure 3. #### Inverse probability weighting for cohort membership and follow-up Since this is an observational study, baseline characteristics may differ between exposed, unexposed, and historical control cohorts resulting in non-random cohort membership. To minimize instrumental variable bias when constructing propensity scores, we performed variable selection of empirically identified covariates longitudinally associated with BMI after adjusting for multiple comparison correction using the Benjamini-Hochberg procedure. Next, random forests models for probability of membership in each exposure cohort, relative to other cohorts, were fit separately with 5-fold cross-validation. We tuned the parameters based on the area under receiver operating characteristic curve (AUC) and identified 2000 trees and 10 observations per node as the best combination of hyperparameters. We included both biologically-plausible and algorithmically selected covariates to estimate the high-dimensional propensity score.19,20. We assessed covariate balance between the Unexposed and Historical Control cohorts, relative to the Exposed, after IPW using population standardized bias.21 Population standardized bias is the maximum difference between IP weighted group mean and unweighted pooled mean.21 Bias less than 0.1 after weighting was considered as indicative of covariate balance. We additionally constructed IPW for availability of follow-up data for BMI to minimize selection bias, since not all participants had data on all cardiometabolic health indicators during the follow-up period.22 The final weight for each individual in the analytic sample is a product of treatment weights and selection weights. We provide additional detail of the modeling strategy in an extended methodological note (**Supplementary Methods)**. IPW were also constructed separately for each socio-demographic, clinical, and community characteristic of interest with numerator reflecting probability of cohort membership under levels of each characteristic (sex, race-ethnicity, age) when assessing differences between levels.23 #### Changes in BMI First, we modeled BMI in the follow-up period for Exposed, Unexposed, and Historical Control cohorts using marginal structural models. We used a difference-in-differences framework to understand if changes in BMI per 100 days of follow-up differed for Unexposed and Historical cohorts relative to the Exposed cohort. We adjusted for all imbalanced characteristics except laboratory parameters in the regression model since the latter had high rates of missingness. Second, we estimated average BMI at origin date and longitudinal change in BMI per 100 days of follow-up across categories of sex (male vs female), race-ethnicity (NH White vs NH Black vs Hispanic), age (18-39 vs 40-64 vs 65 and older) and hospitalization status by exposure group. We did not adjust for multiple comparisons in these prespecified subgroup analysis. ### Sensitivity Analysis First, to minimize the influence of patients who rarely used the health system, we restricted the analytic sample to those patients with at least 100 days of follow-up and at least two encounters where BMI was measured in the follow-up period. Second, to account for differences in health outcomes due to differences in reasons for visit during follow-up period, we adjusted for number and types of clinical encounters between measurements of the cardiometabolic risk factors as a time-varying covariate. Third, to account for differences in timing of follow-up visits, we adjusted for COVID-19 transmission in county of residence on dates when cardiometabolic risk factors were measured. COVID-19 cases and test positivity per 100,000 persons in the 7 days prior to the date of measurement were obtained from Centers for Disease Control and Prevention’s COVID-19 County Level of Community Transmission data archive. New cases and test positivity would be zero on all dates for the Historical Control cohort. Finally, we used multiple imputation (5 datasets) to impute missing values of laboratory parameters in lookback period and adjusted for imbalanced covariates in the regression model. All analysis was carried out using R 4.2.0 and Python 3.11.4. ## Results The study sample consisted of 249,743 adults, of whom 44,436 (19.2%) belonged to the Exposed cohort, 164,118 (61.5%) belonged to the Unexposed cohort, and 41,189 (19.3%) belonged to the Historical cohort (**Table 1**). Most of the study sample were women (155,223 [62%]), with an average age of 51.9 years (SD: 18.9). Nearly half the sample belonged to minority race-ethnicity groups (Non-Hispanic Black: 21.5%, Hispanic: 21.4%, Non-Hispanic Other: 5.6%). Most patients used private insurance (46%) or Medicare (18%) on index date and 27% were hospitalized on index date. The average BMI on index date in the study sample was 29.5 kg/m2 (SD: 7.4) with 30% and 40% having overweight and obesity respectively. The excluded sample without weight available in the preceding year were healthier than the study sample with weight available (**Supplementary Table 2**). View this table: [Table 1.](http://medrxiv.org/content/early/2024/02/13/2024.02.12.24302697/T1) Table 1. Socio-demographic, clinical and community characteristics of exposure cohorts before weighting and population standardized bias after weighting. The Exposed cohort was younger (48.8 years [SD: 18.4]) and more likely to be female (65%) relative to the Unexposed cohort (age: 53.6 years [SD: 19.0], 61% female). The Exposed cohort was also more likely to be Non-Hispanic Black or Hispanic race-ethnicity (50.6% vs 40.6% Unexposed and 43.5% Historical), privately insured (53% vs 42% Unexposed and 42% Historical) and had higher obesity (46% vs 39% Unexposed and 38% Historical). The Unexposed cohort was more likely to be hospitalized due to any cause relative to other cohorts (31% vs 28% Exposed and 7.4% Historical). The proportion with comorbidities and medications prescribed were similar between the three cohorts. After inverse probability weighting (**Supplementary Table 3**), some covariates remained imbalanced between the three groups, namely age, race-ethnicity, smoking, health system at admission, hospitalization status, BMI, and laboratory parameters (HbA1c, serum creatinine, HDL, LDL) in the previous year. ### Trajectories of body mass index in follow-up period Of the study sample, only 194,792 (77.9%) had BMI information in the follow-up period (Exposed: 32,384, Unexposed: 127,142, Historical: 35,266). Socio-demographic and clinical characteristics were similar between those with and without BMI information in the follow-up period (**Supplementary Table 2**). The random forests model to predict loss to follow-up had high model fit in the hold-out test set (AUROC: 0.97; **Supplementary Methods Note**). The time (median [IQR]) to the first follow-up visit of BMI was shorter for Exposed (39 [12, 107]) and Unexposed (33 [11, 98]) cohorts, relative to the Historical (74 [22, 180]) cohort (**Supplementary Table 4**). The Historical cohort had more frequent (median: 4 vs 3 each for Exposed and Unexposed) and longer duration of follow-up (median days [IQR]; 464 [271, 592] vs 178 [76, 357] for Exposed and 212 [85, 374] for Unexposed). BMI at the first follow-up was higher for the Exposed (29.3 kg/m2, 95% Confidence Interval [CI]: 24.9, 34.9), relative to Unexposed (27.9 kg/m2, 95%CI: 23.9, 33.0) and Historical (27.8 kg/m2, 95%CI: 23.8, 32.8) cohorts. Adjusting for imbalanced covariates and after IPW, the average estimated BMI at origin date (**Table 2**) was higher for the Exposed (29.3 kg/m2, 95%CI: 29.0, 29.7) relative to the Unexposed (difference: –0.07 kg/m2, 95%CI: –0.12, –0.01) and Historical (difference: –0.27 kg/m2, 95%CI: –0.34, –0.20) cohorts. Change in BMI per 100 days of follow-up was null for the Exposed (0.00 kg/m2, 95%CI: –0.03, 0.03), negative for the Unexposed (–0.03 kg/m2, 95%CI: – 0.04, –0.02) and positive for the Historical (0.04 kg/m2, 95%CI: 0.03, 0.05) cohort. The change in BMI per 100 days of follow-up for the Unexposed (–0.04 kg/m2, 95%CI: –0.06, –0.01) and Historical (0.03 kg/m2, 95%CI: 0.00, 0.06) cohorts were different, relative to the Exposed cohort (**Table 2**). View this table: [Table 2.](http://medrxiv.org/content/early/2024/02/13/2024.02.12.24302697/T2) Table 2. Marginal estimates at origin date and change at 100 days of follow-up for body mass index Adjusted differences in BMI at origin date for subgroups of sex, age, race-ethnicity, and hospitalization status are presented in **Figure 1A**. Across most socio-demographic subgroups (except among adult Males, those aged 65 and older, Hispanic adults, non-hospitalized adults) BMI at origin date was higher among Exposed, relative to Unexposed and Historical cohorts (**Supplementary Table 5**). Relative to Exposed cohort (**Figure 1B**), the change in BMI per 100 days of follow-up among the Unexposed cohort was lower among Males (difference-in-differences [DiD]: –0.07 kg/m2, 95%CI: –0.12, –0.02) and non-Hospitalized adults (DiD: –0.04 kg/m2, 95%CI: –0.07, –0.01). Relative to the Exposed cohort, the change in BMI per 100 days of follow-up among the Historical cohort was higher among Females (DiD: 0.07 kg/m2, 95%CI: 0.03, 0.11), Non-Hispanic White adults (DiD: 0.07 kg/m2, 95%CI: 0.03, 0.11) and non-Hospitalized adults (DiD: 0.03 kg/m2, 95%CI: 0.00, 0.06) ![Figure 1.](http://medrxiv.org/https://www.medrxiv.org/content/medrxiv/early/2024/02/13/2024.02.12.24302697/F1.medium.gif) [Figure 1.](http://medrxiv.org/content/early/2024/02/13/2024.02.12.24302697/F1) Figure 1. Difference from Exposed cohort in body mass index at start of post-acute period and change per 100 days of follow-up for Unexposed and Historical cohorts. Results were similar when restricting to those with at least 100 days of follow-up and at least two encounters when BMI was measured (**Supplementary Table 6**), after adjusting for number and types of encounters between BMI measurements (**Supplementary Table 7**) and after adjusting for COVID-19 cases and testing (**Supplementary Table 8**). Results were also similar after accounting for missing data in laboratory parameters using multiple imputation **(Supplementary Table 9**). ## Discussion In a diverse real-world cohort of adults who sought medical care during the pandemic, those who tested positive for COVID-19 (Exposed) were heavier at the start of the post-acute phase of SARS-CoV-2 infection compared to contemporary controls who tested negative (Unexposed), and age-sex matched historical controls. The BMI of those Exposed did not change per 100 days of follow-up, while the BMI of those Unexposed decreased, and the BMI of historical controls increased. The change in BMI over time differed between Exposed and Unexposed cohorts at start of follow-up among Males and those 65 years or older, and in changes per 100 days among Males and Hispanics. The persistent higher BMI, including before infection, among those Exposed compared to those who were Unexposed suggest that those infected by SARS-CoV-2 may have been at higher risk of COVID-19 when seeking care.5 This finding is consistent with numerous reports of elevated BMI being a risk factor for severe COVID-19.24,25 The current study extends the findings of this prior research to examine BMI trajectories following SARS-CoV-2 infection at the individual level. To date, most studies have examined the pandemic’s impact on BMI at a population level. For instance, a meta-analysis of observational studies showed that the pandemic resulted in a small increase in BMI and prevalence of obesity among adults.26,27 Potential explanations for BMI increases at the individual and population level include rise in sedentary time and unhealthy dietary habits, downward social mobility from loss of income and employment, and disruption of preventive care.26 The bidirectional relationship of viral infections and cardiometabolic disease may be mediated by excess adiposity, insulin resistance, inflammation and/or delayed recovery after COVID-19 as a post-acute sequelae.2–4,28,29. This analysis of real-world data has several strengths. We used a validated computable phenotype for prevalent T2DM as part of our exclusion criteria to identify a cohort of adults free of T2DM, who were users of the healthcare system before the pandemic, and who were tested for SARS-CoV-2. We additionally included a historical control cohort, matched on age and sex, to characterize differences in the patient population attributable to the pandemic. The OneFlorida+ Data Trust is a part of the National Patient-Centered Clinical Research Network (PCORnet) and follows the common data model, facilitating reproducibility. Compared to other real-world studies, the analytic sample was younger and more diverse across categories of race-ethnicity, age, and sex.2,5 The duration of follow-up was longer than other studies reporting post-acute sequelae of COVID-19.3,5,6,13 We used methods for target trial emulation to minimize confounding, including minimizing the potential for instrumental variable bias in high dimensional confounder selection. There are several limitations for this study. First, the OneFlorida+ CRN includes data from many but not all of Florida’s healthcare providers; our analyses could not account for patients testing positive at centers not participating in the OneFlorida+ CRN. However, the criteria we used for the lookback period were meant to identify regular users of healthcare systems participating in OneFlorida+.30 Additionally, our sample selection was not biased by home testing because it became widespread only after the last date of follow-up in February 2022. Second, although we used a high dimensional propensity score for confounding adjustment, we cannot rule out all differences in clinical characteristics at the time of the index encounter. For instance, those Exposed and Unexposed may have had different reasons for seeking care. Finally, singly-robust machine learning based propensity scores are inferior to doubly-robust methods for lower bias and confidence-interval coverage.31 However, to our knowledge, there were no standard software packages to implement doubly-robust causal inference methods for sparse, longitudinal datasets such as electronic health records. This analysis of diverse, real-world data showed that among those who sought medical care during the pandemic, patients testing positive were heavier at the start of the post-acute phase of COVID-19 and retained weight during the post-acute phase of follow-up compared to those testing negative throughout the observation period. Although the subgroup analysis revealed some heterogeneity in these associations, the observed associations were present among racial-ethnic minorities and young adults who were disproportionately affected by COVID-19. The mechanisms influencing this observed increase in weight retention after COVID-19 infection warrant further investigation. ## Supporting information Supplementary Material [[supplements/302697_file03.pdf]](pending:yes) RECORD-PE Checklist [[supplements/302697_file04.docx]](pending:yes) ## Sources of Support National Institutes of Health (R01DK120814-05S1, P30DK111024) ## Funding This research was supported by the National Institute for Diabetes and Digestive and Kidney Diseases (NIDDK) of the National Institutes of Health, award number 3R01DK120814-05S1. MKA was partially supported by the Georgia Center for Diabetes Translation Research which is funded by the National Institutes of Health (P30DK111024). ## Conflicts of Interest The authors declare no competing financial interests. ## Author contributions RJC, WTD and JSV conceptualized the study with inputs from MKA and YG. JSV conducted the analysis. JSV wrote the first draft with inputs from RJC. All authors reviewed and edited subsequent drafts. ## Competing Interest The authors declare no competing financial interests. ## Data availability statement The code for the analysis is available on [https://github.com/jvargh7/pasc\_cardiometabolic\_risk](https://github.com/jvargh7/pasc_cardiometabolic_risk). Information of the OneFlorida+ CRN is provided at [https://onefloridaconsortium.org/](https://onefloridaconsortium.org/), and OneFlorida+ data are made available to researchers with an approved study protocol at [https://onefloridaconsortium.org/front-door/prep-to-research-data-query/](https://onefloridaconsortium.org/front-door/prep-to-research-data-query/). For questions regarding OneFlorida+, email: OneFloridaOperations{at}health.ufl.edu. ## Figure 1 Legend Our historical cohort corresponds with the blue lines. Our unexposed cohort corresponds with the green lines. Our exposed cohort corresponds with the red lines. ## ACKNOWLEDGEMENT/SUPPORT Acknowledgements This research was supported by the National Institute for Diabetes and Digestive and Kidney Diseases (NIDDK) of the National Institutes of Health, award number 3R01DK120814-05S1. MKA was partially supported by the Georgia Center for Diabetes Translation Research which is funded by the National Institutes of Health (P30DK111024). The authors thank the OneFlorida+ Data Trust team (Kathryn Shaw, Meggen Kaufman, Jiang Bian, Elizabeth Shenkman) for support on query development and data extraction. The authors thank Shihab Chowdhury for administrative support. ## Abbreviations BMI : Body mass index CP : Computable phenotype EHR : Electronic health record HDL : High density lipoprotein LDL : Low density lipoprotein PASC : Post-acute sequelae of COVID-19 T2DM : Type 2 diabetes * Received February 12, 2024. * Revision received February 12, 2024. * Accepted February 13, 2024. * © 2024, Posted by Cold Spring Harbor Laboratory This pre-print is available under a Creative Commons License (Attribution-NonCommercial-NoDerivs 4.0 International), CC BY-NC-ND 4.0, as described at [http://creativecommons.org/licenses/by-nc-nd/4.0/](http://creativecommons.org/licenses/by-nc-nd/4.0/) ## References 1. 1.Singh AK, Khunti K. COVID-19 and Diabetes. Annu Rev Med. 2022;73(1):129–147. doi:10.1146/annurev-med-042220-011857 [CrossRef](http://medrxiv.org/lookup/external-ref?access_num=10.1146/annurev-med-042220-011857&link_type=DOI) 2. 2.Xie Y, Al-Aly Z. Risks and burdens of incident diabetes in long COVID: a cohort study. The Lancet Diabetes & Endocrinology. 2022;10(5):311–321. doi:10.1016/S2213-8587(22)00044-4 [CrossRef](http://medrxiv.org/lookup/external-ref?access_num=10.1016/S2213-8587(22)00044-4&link_type=DOI) [PubMed](http://medrxiv.org/lookup/external-ref?access_num=35325624&link_type=MED&atom=%2Fmedrxiv%2Fearly%2F2024%2F02%2F13%2F2024.02.12.24302697.atom) 3. 3.Barrett CE, Koyama AK, Alvarez P, et al. Risk for Newly Diagnosed Diabetes >30 Days After SARS-CoV-2 Infection Among Persons Aged <18 Years — United States, March 1, 2020–June 28, 2021. MMWR Morb Mortal Wkly Rep. 2022;71(2):59–65. doi:10.15585/mmwr.mm7102e2 [CrossRef](http://medrxiv.org/lookup/external-ref?access_num=10.15585/mmwr.mm7102e2&link_type=DOI) 4. 4.Rathmann W, Kuss O, Kostev K. Incidence of newly diagnosed diabetes after Covid-19. Diabetologia. Published online March 16, 2022. doi:10.1007/s00125-022-05670-0 [CrossRef](http://medrxiv.org/lookup/external-ref?access_num=10.1007/s00125-022-05670-0&link_type=DOI) 5. 5.Holman N, Barron E, Young B, et al. Comparative Incidence of Diabetes Following Hospital Admission for COVID-19 and Pneumonia: A Cohort Study. Diabetes Care. 2023;46(5):938–943. doi:10.2337/dc22-0670 [CrossRef](http://medrxiv.org/lookup/external-ref?access_num=10.2337/dc22-0670&link_type=DOI) 6. 6.Al-Aly Z, Xie Y, Bowe B. High-dimensional characterization of post-acute sequelae of COVID-19. Nature. 2021;594(7862):259–264. doi:10.1038/s41586-021-03553-9 [CrossRef](http://medrxiv.org/lookup/external-ref?access_num=10.1038/s41586-021-03553-9&link_type=DOI) [PubMed](http://medrxiv.org/lookup/external-ref?access_num=33887749&link_type=MED&atom=%2Fmedrxiv%2Fearly%2F2024%2F02%2F13%2F2024.02.12.24302697.atom) 7. 7.Xu E, Xie Y, Al-Aly Z. Risks and burdens of incident dyslipidaemia in long COVID: a cohort study. The Lancet Diabetes & Endocrinology. 2023;11(2):120–128. doi:10.1016/S2213-8587(22)00355-2 [CrossRef](http://medrxiv.org/lookup/external-ref?access_num=10.1016/S2213-8587(22)00355-2&link_type=DOI) [PubMed](http://medrxiv.org/lookup/external-ref?access_num=36623520&link_type=MED&atom=%2Fmedrxiv%2Fearly%2F2024%2F02%2F13%2F2024.02.12.24302697.atom) 8. 8.Xie Y, Xu E, Bowe B, Al-Aly Z. Long-term cardiovascular outcomes of COVID-19. Nat Med. 2022;28(3):583–590. doi:10.1038/s41591-022-01689-3 [CrossRef](http://medrxiv.org/lookup/external-ref?access_num=10.1038/s41591-022-01689-3&link_type=DOI) [PubMed](http://medrxiv.org/lookup/external-ref?access_num=35132265&link_type=MED&atom=%2Fmedrxiv%2Fearly%2F2024%2F02%2F13%2F2024.02.12.24302697.atom) 9. 9.Mude W, Oguoma VM, Nyanhanda T, Mwanri L, Njue C. Racial disparities in COVID-19 pandemic cases, hospitalisations, and deaths: A systematic review and meta-analysis. J Glob Health. 2021;11:05015. doi:10.7189/jogh.11.05015 [CrossRef](http://medrxiv.org/lookup/external-ref?access_num=10.7189/jogh.11.05015&link_type=DOI) [PubMed](http://medrxiv.org/lookup/external-ref?access_num=http://www.n&link_type=MED&atom=%2Fmedrxiv%2Fearly%2F2024%2F02%2F13%2F2024.02.12.24302697.atom) 10. 10.Dalsania AK, Fastiggi MJ, Kahlam A, et al. The Relationship Between Social Determinants of Health and Racial Disparities in COVID-19 Mortality. J Racial and Ethnic Health Disparities. 2022;9(1):288–295. doi:10.1007/s40615-020-00952-y [CrossRef](http://medrxiv.org/lookup/external-ref?access_num=10.1007/s40615-020-00952-y&link_type=DOI) 11. 11.Beckles GL, Chou CF. Disparities in the Prevalence of Diagnosed Diabetes — United States, 1999–2002 and 2011–2014. MMWR Morb Mortal Wkly Rep. 2016;65(45):1265–1269. doi:10.15585/mmwr.mm6545a4 [CrossRef](http://medrxiv.org/lookup/external-ref?access_num=10.15585/mmwr.mm6545a4&link_type=DOI) [PubMed](http://medrxiv.org/lookup/external-ref?access_num=27855140&link_type=MED&atom=%2Fmedrxiv%2Fearly%2F2024%2F02%2F13%2F2024.02.12.24302697.atom) 12. 12.Hogan WR, Shenkman EA, Robinson T, et al. The OneFlorida Data Trust: a centralized, translational research data infrastructure of statewide scope. Journal of the American Medical Informatics Association. 2022;29(4):686–693. doi:10.1093/jamia/ocab221 [CrossRef](http://medrxiv.org/lookup/external-ref?access_num=10.1093/jamia/ocab221&link_type=DOI) [PubMed](http://medrxiv.org/lookup/external-ref?access_num=http://www.n&link_type=MED&atom=%2Fmedrxiv%2Fearly%2F2024%2F02%2F13%2F2024.02.12.24302697.atom) 13. 13.Zhang H, Zang C, Xu Z, et al. Data-driven identification of post-acute SARS-CoV-2 infection subphenotypes. Nat Med. 2023;29(1):226–235. doi:10.1038/s41591-022-02116-3 [CrossRef](http://medrxiv.org/lookup/external-ref?access_num=10.1038/s41591-022-02116-3&link_type=DOI) 14. 14.Hernandez-Romieu AC, Carton TW, Saydah S, et al. Prevalence of Select New Symptoms and Conditions Among Persons Aged Younger Than 20 Years and 20 Years or Older at 31 to 150 Days After Testing Positive or Negative for SARS-CoV-2. JAMA Netw Open. 2022;5(2):e2147053. doi:10.1001/jamanetworkopen.2021.47053 [CrossRef](http://medrxiv.org/lookup/external-ref?access_num=10.1001/jamanetworkopen.2021.47053&link_type=DOI) 15. 15.PCORnet. Common Data Model (CDM) Specification, Version 6.0. Published online January 2022. [https://pcornet.org/wp-content/uploads/2022/01/PCORnet-Common-Data-Model-v60-2020\_10_221.pdf](https://pcornet.org/wp-content/uploads/2022/01/PCORnet-Common-Data-Model-v60-2020_10_221.pdf) 16. 16.Wiese AD, Roumie CL, Buse JB, et al. Performance of a computable phenotype for identification of patients with diabetes within PCORnet: The Patient_JCentered Clinical Research Network. Pharmacoepidemiol Drug Saf. 2019;28(5):632–639. doi:10.1002/pds.4718 [CrossRef](http://medrxiv.org/lookup/external-ref?access_num=10.1002/pds.4718&link_type=DOI) 17. 17.Benchimol EI, Smeeth L, Guttmann A, et al. The REporting of studies Conducted using Observational Routinely-collected health Data (RECORD) Statement. PLoS Med. 2015;12(10):e1001885. doi:10.1371/journal.pmed.1001885 [CrossRef](http://medrxiv.org/lookup/external-ref?access_num=10.1371/journal.pmed.1001885&link_type=DOI) [PubMed](http://medrxiv.org/lookup/external-ref?access_num=26440803&link_type=MED&atom=%2Fmedrxiv%2Fearly%2F2024%2F02%2F13%2F2024.02.12.24302697.atom) 18. 18.Rassen JA, Blin P, Kloss S, et al. High_Jdimensional propensity scores for empirical covariate selection in secondary database studies: Planning, implementation, and reporting. Pharmacoepidemiology and Drug. 2023;32(2):93–106. doi:10.1002/pds.5566 [CrossRef](http://medrxiv.org/lookup/external-ref?access_num=10.1002/pds.5566&link_type=DOI) 19. 19.Tian Y, Schuemie MJ, Suchard MA. Evaluating large-scale propensity score performance through real-world and synthetic data experiments. International Journal of Epidemiology. 2018;47(6):2005–2014. doi:10.1093/ije/dyy120 [CrossRef](http://medrxiv.org/lookup/external-ref?access_num=10.1093/ije/dyy120&link_type=DOI) [PubMed](http://medrxiv.org/lookup/external-ref?access_num=29939268&link_type=MED&atom=%2Fmedrxiv%2Fearly%2F2024%2F02%2F13%2F2024.02.12.24302697.atom) 20. 20.Simon N, Friedman J, Hastie T. A Blockwise Descent Algorithm for Group-penalized Multiresponse and Multinomial Regression. Published online 2013. doi:10.48550/ARXIV.1311.6529 [CrossRef](http://medrxiv.org/lookup/external-ref?access_num=10.48550/ARXIV.1311.6529&link_type=DOI) 21. 21.McCaffrey DF, Griffin BA, Almirall D, Slaughter ME, Ramchand R, Burgette LF. A tutorial on propensity score estimation for multiple treatments using generalized boosted models. Statist Med. 2013;32(19):3388–3414. doi:10.1002/sim.5753 [CrossRef](http://medrxiv.org/lookup/external-ref?access_num=10.1002/sim.5753&link_type=DOI) [PubMed](http://medrxiv.org/lookup/external-ref?access_num=23508673&link_type=MED&atom=%2Fmedrxiv%2Fearly%2F2024%2F02%2F13%2F2024.02.12.24302697.atom) 22. 22.Weuve J, Tchetgen Tchetgen EJ, Glymour MM, et al. Accounting for Bias Due to Selective Attrition: The Example of Smoking and Cognitive Decline. Epidemiology. 2012;23(1):119–128. doi:10.1097/EDE.0b013e318230e861 [CrossRef](http://medrxiv.org/lookup/external-ref?access_num=10.1097/EDE.0b013e318230e861&link_type=DOI) [PubMed](http://medrxiv.org/lookup/external-ref?access_num=21989136&link_type=MED&atom=%2Fmedrxiv%2Fearly%2F2024%2F02%2F13%2F2024.02.12.24302697.atom) [Web of Science](http://medrxiv.org/lookup/external-ref?access_num=000298156800018&link_type=ISI) 23. 23.Robins JM, Hernán MÁ, Brumback B. Marginal Structural Models and Causal Inference in Epidemiology: Epidemiology. 2000;11(5):550–560. doi:10.1097/00001648-200009000-00011 [CrossRef](http://medrxiv.org/lookup/external-ref?access_num=10.1097/00001648-200009000-00011&link_type=DOI) [PubMed](http://medrxiv.org/lookup/external-ref?access_num=10955408&link_type=MED&atom=%2Fmedrxiv%2Fearly%2F2024%2F02%2F13%2F2024.02.12.24302697.atom) [Web of Science](http://medrxiv.org/lookup/external-ref?access_num=000088854500011&link_type=ISI) 24. 24.Gao M, Piernas C, Astbury NM, et al. Associations between body-mass index and COVID-19 severity in 6·9 million people in England: a prospective, community-based, cohort study. The Lancet Diabetes & Endocrinology. 2021;9(6):350–359. doi:10.1016/S2213-8587(21)00089-9 [CrossRef](http://medrxiv.org/lookup/external-ref?access_num=10.1016/S2213-8587(21)00089-9&link_type=DOI) [PubMed](http://medrxiv.org/lookup/external-ref?access_num=33932335&link_type=MED&atom=%2Fmedrxiv%2Fearly%2F2024%2F02%2F13%2F2024.02.12.24302697.atom) 25. 25.Kompaniyets L, Goodman AB, Belay B, et al. Body Mass Index and Risk for COVID-19– Related Hospitalization, Intensive Care Unit Admission, Invasive Mechanical Ventilation, and Death — United States, March–December 2020. MMWR Morb Mortal Wkly Rep. 2021;70(10):355–361. doi:10.15585/mmwr.mm7010e4 [CrossRef](http://medrxiv.org/lookup/external-ref?access_num=10.15585/mmwr.mm7010e4&link_type=DOI) 26. 26.Anderson LN, Yoshida_JMontezuma Y, Dewart N, et al. Obesity and weight change during the COVID\_J19 pandemic in children and adults: A systematic review and meta\_Janalysis. Obesity Reviews. 2023;24(5):e13550. doi:10.1111/obr.13550 [CrossRef](http://medrxiv.org/lookup/external-ref?access_num=10.1111/obr.13550&link_type=DOI) 27. 27.Lin AL, Vittinghoff E, Olgin JE, Pletcher MJ, Marcus GM. Body Weight Changes During Pandemic-Related Shelter-in-Place in a Longitudinal Cohort Study. JAMA Netw Open. 2021;4(3):e212536. doi:10.1001/jamanetworkopen.2021.2536 [CrossRef](http://medrxiv.org/lookup/external-ref?access_num=10.1001/jamanetworkopen.2021.2536&link_type=DOI) [PubMed](http://medrxiv.org/lookup/external-ref?access_num=33749764&link_type=MED&atom=%2Fmedrxiv%2Fearly%2F2024%2F02%2F13%2F2024.02.12.24302697.atom) 28. 28.Perakakis N, Harb H, Hale BG, et al. Mechanisms and clinical relevance of the bidirectional relationship of viral infections with metabolic diseases. The Lancet Diabetes & Endocrinology. 2023;11(9):675–693. doi:10.1016/S2213-8587(23)00154-7 [CrossRef](http://medrxiv.org/lookup/external-ref?access_num=10.1016/S2213-8587(23)00154-7&link_type=DOI) 29. 29.Harding JL, Oviedo SA, Ali MK, et al. The bidirectional association between diabetes and long-COVID-19 – A systematic review. Diabetes Res Clin Pract. 2023;195:110202. doi:10.1016/j.diabres.2022.110202 [CrossRef](http://medrxiv.org/lookup/external-ref?access_num=10.1016/j.diabres.2022.110202&link_type=DOI) 30. 30.Haneuse S, Arterburn D, Daniels MJ. Assessing Missing Data Assumptions in EHR-Based Studies: A Complex and Underappreciated Task. JAMA Netw Open. 2021;4(2):e210184. doi:10.1001/jamanetworkopen.2021.0184 [CrossRef](http://medrxiv.org/lookup/external-ref?access_num=10.1001/jamanetworkopen.2021.0184&link_type=DOI) 31. 31.Naimi AI, Mishler AE, Kennedy EH. Challenges in Obtaining Valid Causal Effect Estimates With Machine Learning Algorithms. American Journal of Epidemiology. 2023;192(9):1536–1544. doi:10.1093/aje/kwab201 [CrossRef](http://medrxiv.org/lookup/external-ref?access_num=10.1093/aje/kwab201&link_type=DOI)