Data Availability
We analyse three datasets here: (1) Viral genomes sequenced by the Sanger Institute as part of the COG-UK Consortium. These can be accessed at https://cog-uk.s3.climb.ac.uk/phylogenetics/latest/cog_all.fasta, or as an alignment at https://cog-uk.s3.climb.ac.uk/phylogenetics/latest/cog_alignment.fasta . Metadata, including lineage calls, is at https://cog-uk.s3.climb.ac.uk/phylogenetics/latest/cog_metadata.csv . (2) Global SARS-CoV-2 genomes, available at https://gisaid.org. An alignment is also available there within the Downloads section, with a separate metadata file that includes lineage calls. (3) Raw reads from the Sequence Read Archive, accessed from https://www.ncbi.nlm.nih.gov/sra . Fully deidentified Ct-genotype mappings, as well as the other COG-UK data analysed, are available at https://github.com/theosanderson/amplicon_72.
https://www.cogconsortium.uk/tools-analysis/public-data-analysis-2/
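As an illustration of how the COG-UK metadata file listed above might be used, the following minimal Python sketch tallies genomes per lineage call. It is not part of the original analysis. The column name `lineage` and the helper names `count_lineages` and `fetch_text` are assumptions made for this example, and the URL, quoted from the statement above, may no longer resolve.

```python
import csv
import io
import urllib.request
from collections import Counter

# URL quoted from the Data Availability statement; it may no longer resolve.
METADATA_URL = "https://cog-uk.s3.climb.ac.uk/phylogenetics/latest/cog_metadata.csv"


def count_lineages(csv_text, column="lineage"):
    """Count rows per lineage call; `column` is an assumed header name."""
    reader = csv.DictReader(io.StringIO(csv_text))
    if reader.fieldnames is None or column not in reader.fieldnames:
        raise KeyError(f"column {column!r} not found in metadata header")
    # Skip rows where the lineage field is empty or missing.
    return Counter(row[column] for row in reader if row[column])


def fetch_text(url=METADATA_URL):
    """Download a text file and decode it as UTF-8 (requires network access)."""
    with urllib.request.urlopen(url) as response:
        return response.read().decode("utf-8")
```

With network access, `count_lineages(fetch_text()).most_common(10)` would list the ten most frequent lineage calls, provided the file is still hosted and uses the assumed column name.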