Data Availability
All computational tools and packages used in this study are publicly available. The Docker image containing all GATK tools and the Docker image containing the germline variant detection tool DeepVariant are both publicly available. Tools and detailed usage instructions for SpliceAI and GATK-gCNV can be found on their respective GitHub pages. All raw sequencing data for the TCGA studies are under controlled access and can be obtained through the GDC data portal with approval. All raw sequencing data for the ICGC study are under controlled access and can be obtained through the ICGC data portal with approval. All raw sequencing data for the CHECKMATE clinical studies and the Genentech study can be downloaded from the European Genome-phenome Archive (Dataset ID: EGAD00001001023) with approval. All raw sequencing data for the cancer-free control samples can be accessed through dbGaP: Autism Sequencing Consortium (ASC) (dbGaP: phs000298.v4.p3), Framingham Cohort (dbGaP: phs000007.v32.p1), MESA Cohort (dbGaP: phs000209.v13.p3), and NHLBI GO-ESP: Lung Cohorts Exome Sequencing Project (dbGaP: phs000291.v2.p1). In-house exome data for controls are available upon request.