AVAILABILITY OF DATA AND MATERIALS
The cancer mutation data from Cancer Hotspots that support the findings of this study are available through a public database at the following URL: https://www.cancerhotspots.org/ (DOI: 10.1038/nbt.3391). Germline variants and their classifications are available in the ClinVar public archive: https://www.ncbi.nlm.nih.gov/clinvar/ (DOI: 10.1093/nar/gkx1153). The Python script used to transform the Cancer Hotspots cancer mutation data is openly available in a GitHub repository: https://github.com/haqueb2/Cancer-Hotspots-Reformat. The dataset used to train the supervised learning models, the LRM and RFM pathogenicity scores assigned to training and test dataset variants, and the prediction scores generated by other in silico tools for the test dataset are all available in Supplemental Table 6, which includes all variants used in the training and test datasets. R scripts used to train the supervised learning models can be found in Supplemental Appendices 1 and 2. Datasets from Genomics England (DOI: 10.6084/m9.figshare.4530893.v7), MSSNG (DOI: 10.1016/j.cell.2022.10.009), Care4Rare (DOI: 10.1016/j.ajhg.2022.10.002), and GeneDx are not openly available due to controlled access requirements. Access to these datasets may be requested from the respective organizations.