Data Availability
All computational tools and packages used in this study are publicly available. The Docker image containing all GATK tools is available at https://hub.docker.com/r/broadinstitute/gatk/. The Docker image containing the germline variant detection tool DeepVariant is available at https://hub.docker.com/r/google/deepvariant. Tools and detailed usage instructions for SpliceAI (https://github.com/Illumina/SpliceAI) and GATK-gCNV (https://github.com/theisaacwong/talkowski/tree/master/gCNV) can be found on their respective GitHub pages. All raw sequencing data for the TCGA studies are available under controlled access through the GDC Data Portal (https://portal.gdc.cancer.gov/), subject to approval. All raw sequencing data for the ICGC study are available under controlled access through the ICGC Data Portal (https://dcc.icgc.org/), subject to approval. All raw sequencing data for the CHECKMATE clinical studies and the Genentech study can be downloaded from the European Genome-phenome Archive (Dataset ID: EGAD00001001023), subject to approval. All raw sequencing data for the cancer-free control samples can be accessed through dbGaP: Autism Sequencing Consortium (ASC) (dbGaP: phs000298.v4.p3), Framingham Cohort (dbGaP: phs000007.v32.p1), MESA Cohort (dbGaP: phs000209.v13.p3), and NHLBI GO ESP: Lung Cohorts Exome Sequencing Project (dbGaP: phs000291.v2.p1). In-house exome data for controls are available upon request.