Data availability
The SARS-CoV-2 genome data used in this work are available from the GISAID EpiCoV database [18, 36] at https://www.gisaid.org. To view the contributors of each individual sequence, with details such as accession number, Virus name, Collection date, Originating Lab, Submitting Lab, and the list of Authors, visit https://doi.org/10.55876/gis8.230731by. Experimental data on viral phenotypes by Starr et al. and Greaney et al. are available from [3, Table S2] and [4, Table S3]. Data on fitness effects of Spike gene amino acid changes by Bloom & Neher are available from [108]. Filtration rules for highly homoplasic sites, problematic sites, and potential artifacts in SARS-CoV-2 sequence alignments by Turakhia et al. and De Maio et al. are available from [46, 47]. COSMIC mutational signatures for the GRCh37 genome are available from [109].