PT - JOURNAL ARTICLE
AU - Zhao, Yige
AU - Zhong, Guojie
AU - Hagen, Jake
AU - Pan, Hongbing
AU - Chung, Wendy K.
AU - Shen, Yufeng
TI - A probabilistic graphical model for estimating selection coefficient of nonsynonymous variants from human population sequence data
AID - 10.1101/2023.12.11.23299809
DP - 2023 Jan 01
TA - medRxiv
PG - 2023.12.11.23299809
4099 - http://medrxiv.org/content/early/2023/12/13/2023.12.11.23299809.short
4100 - http://medrxiv.org/content/early/2023/12/13/2023.12.11.23299809.full
AB - Accurately predicting the effect of missense variants is important in discovering disease risk genes and clinical genetic diagnostics. Commonly used computational methods predict pathogenicity, which does not capture the quantitative impact on fitness in humans. We developed a method, MisFit, to estimate missense fitness effect using a graphical model. MisFit jointly models the effect at a molecular level (d) and a population level (selection coefficient, s), assuming that in the same gene, missense variants with similar d have similar s. We trained it by maximizing probability of observed allele counts in 236,017 European individuals. We show that s is informative in predicting allele frequency across ancestries and consistent with the fraction of de novo mutations in sites under strong selection. Further, s outperforms previous methods in prioritizing de novo missense variants in individuals with neurodevelopmental disorders.
In conclusion, MisFit accurately predicts s and yields new insights from genomic data.
Competing Interest Statement: The authors have declared no competing interest.
Funding Statement: This work is supported by NIH grants (R35GM149527, R01GM120609, and P50HD109879), Simons Foundation (SFARI #1019623), and Columbia Precision Medicine Pilot grants program.
Author Declarations: I confirm all relevant ethical guidelines have been followed, and any necessary IRB and/or ethics committee approvals have been obtained. Yes. The details of the IRB/oversight body that provided approval or exemption for the research described are given below: IRB of Columbia University gave ethical approval for this work. I confirm that all necessary patient/participant consent has been obtained and the appropriate institutional forms have been archived, and that any patient/participant/sample identifiers included were not known to anyone (e.g., hospital staff, patients or participants themselves) outside the research group so cannot be used to identify individuals. Yes. I understand that all clinical trials and any other prospective interventional studies must be registered with an ICMJE-approved registry, such as ClinicalTrials.gov. I confirm that any such study reported in the manuscript has been registered and the trial registration ID is provided (note: if posting a prospective study registered retrospectively, please provide a statement in the trial ID field explaining why the study was not registered in advance). Yes. I have followed all appropriate research reporting guidelines, such as any relevant EQUATOR Network research reporting checklist(s) and other pertinent material, if applicable. Yes.
All data produced in the present work are contained in the manuscript https://github.com/ShenLab/MisFit