RR:C19 Evidence Scale rating by reviewer:
Potentially informative. The main claims made are not strongly justified by the methods and data, but may yield some insight. The results and conclusions of the study may resemble those from the hypothetical ideal study, but there is substantial room for doubt. Decision-makers should consider this evidence only with a thorough understanding of its weaknesses, alongside other evidence and theory. Decision-makers should not consider this actionable, unless the weaknesses are clearly understood and there is other theory and evidence to further support it.
***************************************
Review:
This is a concise manuscript that reports findings from a comparative analysis of 171 monkeypox virus genome sequences. The methods section is quite brief: the authors used conventional tools to generate a multiple sequence alignment and to reconstruct a phylogenetic tree by maximum likelihood. Unfortunately, the manuscript does not provide accession numbers for these genome sequences, making it impossible to reproduce the analysis.
The monkeypox virus genome encodes over 200 genes. In this study, the authors focus on nine protein-coding genes that were examined in a previous study (Likos et al., 2005). That study selected these genes because each had at least five amino acid substitutions separating all monkeypox virus clades known at the time from the vaccinia virus outgroup genomes.
The objective of the present study was to examine amino acid substitutions associated with more recently emerging monkeypox virus lineages, e.g., A.2 and B.1. The implicit assumption is therefore that the nine proteins selected in the previous study are particularly significant for the more recent evolution and expansion of monkeypox virus. Although this reliance on prior information to predict the selective effect of mutations can be useful at an early stage of an epidemic, when there is limited divergence, a more data-driven approach that examines the distribution of mutations among sites and branches of the tree is preferable. The authors observed a total of nine nucleotide substitutions in these nine focal genes, of which four resulted in amino acid substitutions. The authors also applied the same approach to four additional genes that have been targeted for vaccine research, among which they observed one mutation.
It is unclear what coordinate system the authors are using to report mutations, e.g., D442N in gene A50L (the lineage is incompletely specified as “2”, presumably “A.2”). I could not find this substitution in the Nextstrain list of mutations associated with this lineage.
The loss of the C3L gene has already been noted in other preprints and peer-reviewed articles as a common feature of orthopoxviruses, e.g., Senkevich et al. (2021, mBio).
My overall assessment is that the conclusions drawn in the manuscript are not adequately supported by the results. For instance, the statement that “such proteins are attractive targets for future studies in vaccine production [...]” was based on the lack of substitutions in these genes, which have previously been targeted for vaccine studies. It would be more informative to examine the full distribution of non-synonymous substitutions across genes in contemporary samples, adjusted for variation in mutation rates. In addition, the statement “Our analyses suggest that lineage/clade A.2 may be suffering the different effects of various selective pressures than lineage/clade B.1.” is ambiguous and not clearly supported by the results of the analysis. The remainder of the manuscript is rather speculative.
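To make the suggested gene-wide comparison concrete, the following is a minimal Python sketch of how non-synonymous and synonymous differences could be tallied per gene. It is not the authors' method. It assumes, for illustration only, that each gene is supplied as a pair of gap-free, codon-aligned coding sequences (reference and sample) of equal length; the gene names `geneX` and `geneY` and the function names are hypothetical. It counts differing codons rather than individual nucleotide changes, and the synonymous count is offered only as a crude baseline for the background mutation rate. A real analysis would use a codon substitution model on the phylogeny.

```python
from itertools import product

# Standard genetic code, with codons enumerated in T, C, A, G order.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def count_substitutions(ref_cds, sample_cds):
    """Return (synonymous, non_synonymous) counts of differing codons.

    Assumes both sequences are codon-aligned and of equal length.
    Codons containing gaps or ambiguous bases are skipped.
    """
    if len(ref_cds) != len(sample_cds):
        raise ValueError("sequences must have equal length")
    syn = nonsyn = 0
    usable = len(ref_cds) - len(ref_cds) % 3  # ignore a trailing partial codon
    for i in range(0, usable, 3):
        ref_codon = ref_cds[i:i + 3].upper()
        alt_codon = sample_cds[i:i + 3].upper()
        if ref_codon == alt_codon:
            continue
        if ref_codon not in CODON_TABLE or alt_codon not in CODON_TABLE:
            continue
        if CODON_TABLE[ref_codon] == CODON_TABLE[alt_codon]:
            syn += 1
        else:
            nonsyn += 1
    return syn, nonsyn


def summarize(genes):
    """Tabulate counts per gene, with non-synonymous changes per codon."""
    summary = {}
    for name, (ref_cds, sample_cds) in genes.items():
        syn, nonsyn = count_substitutions(ref_cds, sample_cds)
        n_codons = len(ref_cds) // 3
        summary[name] = {
            "syn": syn,
            "nonsyn": nonsyn,
            "nonsyn_per_codon": nonsyn / n_codons if n_codons else 0.0,
        }
    return summary


# Toy input: hypothetical genes, not real monkeypox virus sequences.
genes = {
    "geneX": ("ATGGATAAA", "ATGAATAAG"),
    "geneY": ("ATGGGTTTT", "ATGGGTTTT"),
}
summary = summarize(genes)
for name, row in summary.items():
    print(name, row)
```

Applied to every annotated gene rather than a preselected subset, a table of this kind would show whether the focal and vaccine-target genes are actually unusual relative to the rest of the genome.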