Abstract
Our ability to track the spread of the human monkeypox virus (hMPXV) during the ongoing monkeypox (hMPX) outbreak of 2022 relies on the availability of high-quality reference genomes. However, the way the information content of these genomes is organized in genome databases leaves room for interpretation. A current limitation of hMPXV genomic analysis is that variability in the inverted terminal repeats (ITRs) cannot be effectively resolved. This stems from shortcomings of the predominant methods used in the ongoing global hMPXV outbreak surveillance effort: short-read sequencing, reference-guided assembly, and variant calling. Here I propose ITR tail-trimming, a simple, no-cost reframing of how we organize hMPXV reference genomes and future assemblies. This approach is based on long terminal repeat (LTR) tail-trimming, a common practice in HIV sequence analysis. The main point of repeat sequence trimming is to remove problematic sequences while remaining attentive to the limitations of mapping and variant calling in the remaining repeat-associated (but ideally no longer repetitive) sequence. ITR tail-trimming would neutralize ITRs as distracting features at the read and assembly levels, allowing the global community to focus its efforts on tracking variability across hMPXV genomes.
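As a rough illustration of the idea, the sketch below trims the 3' copy of an inverted terminal repeat from a genome string. It is a minimal sketch under stated assumptions, not the procedure documented in the manuscript. The function names are hypothetical, the ITR is detected by exact matching between the 5' end and the reverse complement of the 3' end, and the choice of the 3' copy as the "tail" is an assumption. Real ITR copies may differ by a few bases, in which case an alignment-based boundary or curated coordinates would be needed.

```python
# Hypothetical helper names; exact-match ITR detection is an assumption.
_COMPLEMENT = str.maketrans("ACGTNacgtn", "TGCANtgcan")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA string (A, C, G, T, N only)."""
    return seq.translate(_COMPLEMENT)[::-1]


def itr_length(genome: str) -> int:
    """Length of the longest 5' prefix that exactly equals the reverse
    complement of the 3' suffix of the same length (0 if there is none)."""
    rc = reverse_complement(genome)
    limit = len(genome) // 2  # the two repeat copies cannot overlap
    n = 0
    while n < limit and genome[n].upper() == rc[n].upper():
        n += 1
    return n


def trim_itr_tail(genome: str) -> str:
    """Drop the 3' ITR copy so that the repeat occurs only once."""
    n = itr_length(genome)
    return genome[: len(genome) - n]


if __name__ == "__main__":
    # Toy genome: a 6-base ITR, a unique core, then the reverse-complemented ITR.
    toy = "AAGGCT" + "TTTTCCCCAT" + reverse_complement("AAGGCT")
    print(itr_length(toy))     # 6
    print(trim_itr_tail(toy))  # AAGGCTTTTTCCCCAT
```

With the 3' copy removed, reads originating from either ITR copy have a single place to map. That benefit has to be weighed against the caveat noted above about interpreting variants in the remaining repeat-associated sequence.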
Competing Interest Statement
I have received travel support in the form of poster bursaries from Oxford Nanopore Technologies, Oxford, UK. I am on the editorial board of AIDS.
Funding Statement
This work was supported by Cooperative Agreement Number NU60OE000104-02, funded by the Centers for Disease Control and Prevention through the Association of Public Health Laboratories. Its contents are solely the responsibility of the author and do not necessarily represent the official views of the Centers for Disease Control and Prevention or the Association of Public Health Laboratories.
Author Declarations
I confirm all relevant ethical guidelines have been followed, and any necessary IRB and/or ethics committee approvals have been obtained.
Yes
The details of the IRB/oversight body that provided approval or exemption for the research described are given below:
Example anonymized and dehosted sequencing data were downloaded from the US National Center for Biotechnology Information (NCBI) Sequence Read Archive (SRA). The following MPXV WGS sequencing data set was used as an example because of its high coverage over the ITRs: CA-LACPHL-M10162_081822_5x_01 of BioSample:SAMN30416950; run accession SRR21143274. It was submitted by the Los Angeles County Public Health Laboratories microbial pathogen submission group (LACPHL) under BioProject:PRJNA864832. The hMPXV Clade IIb reference genome used was RefSeq:NC_063383.1, from which the tail-trimmed assembly mentioned in the text was derived, with documentation on how to make it.
I confirm that all necessary patient/participant consent has been obtained and the appropriate institutional forms have been archived, and that any patient/participant/sample identifiers included were not known to anyone (e.g., hospital staff, patients or participants themselves) outside the research group so cannot be used to identify individuals.
Yes
I understand that all clinical trials and any other prospective interventional studies must be registered with an ICMJE-approved registry, such as ClinicalTrials.gov. I confirm that any such study reported in the manuscript has been registered and the trial registration ID is provided (note: if posting a prospective study registered retrospectively, please provide a statement in the trial ID field explaining why the study was not registered in advance).
Yes
I have followed all appropriate research reporting guidelines and uploaded the relevant EQUATOR Network research reporting checklist(s) and other pertinent material as supplementary files, if applicable.
Yes
Footnotes
Contact: itspronouncedhenner{at}gmail.com, Downey, Los Angeles County, California, USA.
Data Availability
All data produced in the present work are contained in the manuscript.