RR:C19 Evidence Scale rating by reviewer:
Potentially informative. The main claims made are not strongly justified by the methods and data, but may yield some insight. The results and conclusions of the study may resemble those from the hypothetical ideal study, but there is substantial room for doubt. Decision-makers should consider this evidence only with a thorough understanding of its weaknesses, alongside other evidence and theory. Decision-makers should not consider this actionable, unless the weaknesses are clearly understood and there is other theory and evidence to further support it.
***************************************
Review:
Here, the authors aim to provide an all-in-one pipeline for the reconstruction of SARS-CoV-2 genomes from different sequencing technologies. While such easy-to-use pipelines are needed by the worldwide community to rapidly reconstruct genomes for molecular surveillance and the detection of emerging variants, they also need to be accurate enough to support decision-making even on the basis of single nucleotide changes. Unfortunately, I think that the pipeline in its current state does not produce high-quality genome sequences. While the tools used seem reasonable to some extent for short-read data, they will fail to reconstruct accurate genomes from Nanopore data.
Thus, I highly recommend either focusing the pipeline on short reads only or including proper analysis steps and tools that also support Nanopore data. In its current state, I would not recommend the pipeline for Nanopore data at all.
Major
[1] Read quality analysis and trimming
As described, the authors use Trimmomatic for basic read QC. However, the removal of remaining 5’ and/or 3’ adapter sequences, and of primer sequences in particular for Illumina protocols, is a crucial step that can also impact mapping and variant calling if not done properly. I recommend adding additional functionality for adapter trimming and primer clipping. For example, adapter trimming can be performed via fastp (which, in terms of speed, could also be a general replacement for Trimmomatic) while providing the adapter sequences in FASTA format. For primer clipping (e.g. of primers derived from Illumina’s CleanPlex protocol), I can recommend bamclipper. This might complicate the workflow but is crucial for specific sequencing protocols such as amplicon-based ones. Regarding Nanopore reads: do the authors also trim them with Trimmomatic? If so, there is no need; normally, Nanopore data is only filtered by length. For example, many labs use the well-established ARTIC amplicon protocol and select only reads between 400 and 700 nt (V3 protocol) for further processing.
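To make these suggestions concrete, the steps could look roughly as follows. This is only a sketch: all file names (reads, adapter FASTA, primer BEDPE) are placeholders, and the length cutoffs are the ARTIC V3 values mentioned above.

```shell
# Illumina: adapter trimming with fastp, providing the protocol's
# adapter sequences in FASTA format (file names are placeholders)
fastp -i sample_R1.fastq.gz -I sample_R2.fastq.gz \
      -o trimmed_R1.fastq.gz -O trimmed_R2.fastq.gz \
      --adapter_fasta adapters.fa

# Amplicon primer clipping on the aligned, sorted reads with bamclipper
# (primers.bedpe lists the primer pair positions in BEDPE format)
bamclipper.sh -b sample.sorted.bam -p primers.bedpe -n 4

# Nanopore: no Trimmomatic; length filtering only, e.g. via the ARTIC
# toolkit (400-700 nt for the V3 amplicons)
artic guppyplex --min-length 400 --max-length 700 \
      --directory fastq_pass/ --output filtered.fastq
```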
[2] Subtraction of human sequences
I recommend not only mapping against the human reference genome but rather generating an index from human + SARS-CoV-2 combined. Otherwise, it can happen that (short) reads map sub-optimally against the human genome, which includes a considerable number of endogenous viral elements. Do the authors map Nanopore reads with Bowtie2 as well, or with Minimap2?
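A sketch of the combined-index strategy described above (file and index names are placeholders; NC_045512.2 is the Wuhan reference accession):

```shell
# build one index over human + SARS-CoV-2 so viral reads are not forced
# onto human endogenous viral elements
cat GRCh38.fa NC_045512.2.fa > combined.fa
bowtie2-build combined.fa combined_idx

# Illumina: map against the combined index, then keep only the reads
# assigned to the viral reference sequence
bowtie2 -x combined_idx -1 trimmed_R1.fastq.gz -2 trimmed_R2.fastq.gz \
  | samtools sort -o combined.bam -
samtools index combined.bam
samtools view -b combined.bam NC_045512.2 > sars2.bam

# Nanopore: minimap2 with the long-read preset instead of Bowtie2
minimap2 -ax map-ont combined.fa filtered.fastq \
  | samtools sort -o combined_ont.bam -
```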
[3] Contig assembly
I am unsure whether the authors also use SPAdes for Nanopore data. I would recommend a specialized long-read assembler such as Flye. Besides, it is questionable whether such a de novo step is needed at all if the authors construct the consensus in a reference-based manner.
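If the de novo step is kept for Nanopore data, a long-read assembler call could look like this (file names are placeholders; `--nano-raw` is for unpolished basecalls):

```shell
# de novo long-read assembly with Flye instead of SPAdes
flye --nano-raw filtered.fastq --out-dir flye_asm --threads 8
```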
[4] Genome reconstruction
The authors use mpileup and BCFtools from the SAMtools suite for variant calling and consensus reconstruction. While these are basic tools for such tasks, there are more sophisticated variant callers already used by the SARS-CoV-2 community, such as LoFreq, FreeBayes, or GATK. Parameter settings, such as allele-frequency cutoffs, are also important to consider. For Nanopore data, the procedure used will result in many false variant calls (see below).
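As an illustration of how an alternative caller and an explicit allele-frequency cutoff could fit into a reference-based consensus step (file names are placeholders and the 70 % cutoff is purely illustrative, not a recommendation):

```shell
# variant calling with LoFreq against the Wuhan reference
lofreq call -f NC_045512.2.fa -o variants.vcf sars2.bam

# apply an explicit allele-frequency cutoff before building the consensus
bcftools view -i 'AF>0.7' variants.vcf -Oz -o filtered.vcf.gz
bcftools index filtered.vcf.gz
bcftools consensus -f NC_045512.2.fa filtered.vcf.gz > consensus.fa
```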
[5] Nanopore
The pipeline lacks important steps needed for the proper analysis of Nanopore data. After mapping the reads with e.g. Minimap2, polishing steps (e.g. Racon, Medaka) are needed to reduce the errors typical of Nanopore data. Also, variant calling should not be performed with default tools such as SAMtools but rather with the machine-learning models implemented in, e.g., Medaka. If I understand Tab. 1 correctly, the results clearly show that the pipeline is not working for Nanopore data: 96 % of consensus sequences with different nucleotide calls. Sure, Genome Detective is not much better (90 %), but it might be that this tool is also not suitable for analyzing Nanopore data.
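The polishing steps suggested above could be sketched as follows. File names are placeholders, and the Medaka model shown is an example; the model must match the basecaller and flow cell actually used.

```shell
# optional Racon pre-polish: overlaps from minimap2, then one round
minimap2 -x map-ont draft.fa filtered.fastq > overlaps.paf
racon filtered.fastq overlaps.paf draft.fa > racon_polished.fa

# Medaka polishing of the draft with a neural-network model
medaka_consensus -i filtered.fastq -d racon_polished.fa \
    -o medaka_out -m r941_min_high_g360
```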
[6] Benchmark
First of all, it is unclear which genomes from their pipeline the authors used: the reference-based or the de novo reconstructed ones? The authors report that the genomes produced are generally longer than the ones produced by CLC or Genome Detective. I wonder whether these tools perform de novo assemblies of the reads or also use a reference-guided consensus strategy. Most pipelines currently available (such as https://github.com/connor-lab/ncov2019-artic-nf, https://github.com/replikation/poreCov, https://gitlab.com/RKIBioinformaticsPipelines/ncov_minipipe, …) perform reference-based reconstructions and thus rely on the length of the (Wuhan) reference genome. Thus, the length of the consensus genome is not necessarily a meaningful quality metric. I also wonder why the pipeline on average produced longer genomes than the GISAID references, which might also have been assembled reference-based. Are the reconstructions extended at the 5’ and 3’ ends of the genome?
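To illustrate why raw length is a misleading metric for reference-based consensus genomes: masked regions are padded with N, so the number of unambiguously resolved (non-N) bases is more informative than the total sequence length. A toy example:

```shell
# toy consensus: 16 bases of total length, but only 12 resolved bases
cat > toy_consensus.fa <<'EOF'
>toy
ACGTNNNNACGTACGT
EOF

# total length vs. non-N ("effective") length
grep -v '^>' toy_consensus.fa | tr -d '\n' | wc -c                   # 16
grep -v '^>' toy_consensus.fa | tr -d '\n' | tr -cd 'ACGT' | wc -c   # 12
```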
Besides, CLC seems to perform much better than the pipeline on a very important metric: the percentage of consensus sequences with different nucleotide calls.