RR:C19 Evidence Scale rating by reviewer:
Not informative. The flaws in the data and methods in this study are sufficiently serious that they do not substantially justify the claims made. It is not possible to say whether the results and conclusions would match that of the hypothetical ideal study. The study should not be considered as evidence by decision-makers.
In this work, Hoffman and colleagues propose the use of bulk RNA-seq as a routine diagnostic tool in the clinic, specifically for determining immune cell type proportions, which are currently measured using the gold-standard complete blood count (CBC), and for diagnosing viral infection, including TCR/BCR clone identification. The authors further claim that a sequencing depth of 10 million reads per sample is enough to provide this information.
The authors use four sets of publicly available data generated from patients infected with different variants of SARS-CoV-2, or from individuals defined as seronegative. First, they applied a series of deconvolution methods to the RNA-seq data and compared the output to, presumably, lab-derived CBC counts (although this is not clear from the manuscript). Neither the methods nor the overall approach is novel, but this is, to our knowledge, one of the first examples of these algorithms being compared head-to-head against a reference standard using SARS-CoV-2 data. The results are of potential benefit to researchers in highlighting differences between algorithms, but the presented data do not provide any true support for the method being used in clinical practice. Statistically, we would suggest plotting the different algorithms on a single plot stratified by cell type to allow better comparison. Furthermore, the authors should state clearly which method they use to test correlation, as the distributions are unlikely to all be normal. Displaying the correlation coefficient (R) estimates on the plots themselves would also be helpful. We agree that the results for neutrophils and lymphocytes for xCell are very odd. With regard to the authors’ suggestion that this method could be used clinically, this is unfortunately highly unlikely owing to the substantially higher cost and reduced scalability of RNA-seq compared with established cell-count systems; the error levels of the algorithms for less common subsets would also be an area of concern.
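To illustrate why the choice of correlation method matters here, the following is a minimal sketch, not the authors' analysis. The values are hypothetical and are not taken from the manuscript; the function names are ours. It compares Pearson's r with Spearman's rank correlation on a monotonic but non-linear relationship, of the kind that can arise when an enrichment-style score is compared against a CBC percentage.

```python
from math import sqrt


def ranks(values):
    """Return 1-based ranks, giving tied values their mean rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            result[order[k]] = mean_rank
        i = j + 1
    return result


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sd_x = sqrt(sum((a - mean_x) ** 2 for a in x))
    sd_y = sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (sd_x * sd_y)


def spearman(x, y):
    """Spearman correlation: Pearson correlation computed on the ranks."""
    return pearson(ranks(x), ranks(y))


# Hypothetical example: CBC neutrophil percentages and deconvolution scores.
cbc = [45.0, 52.0, 60.0, 71.0, 88.0]
estimate = [0.10, 0.12, 0.15, 0.30, 0.95]

r_pearson = pearson(cbc, estimate)
r_spearman = spearman(cbc, estimate)
print(f"Pearson r = {r_pearson:.3f}, Spearman rho = {r_spearman:.3f}")
```

In this toy case the ordering of samples is perfectly preserved, so the rank-based coefficient is higher than Pearson's r. Reporting which coefficient was used, and printing it on each panel, would let readers judge the algorithms on the same footing.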
The authors next propose a hypothesis that individuals with less severe COVID-19 disease are more likely to have bulk RNA-seq data more similar to those of seronegative individuals (i.e. the healthy state). This is well known and has been shown in multiple publications using bulk and single-cell RNA-seq and proteomic data, so we would argue it is not really a valid hypothesis to retest. Furthermore, the datasets have not been presented in a way that truly allows this hypothesis to be tested, as we do not know the clinical severity of the individuals in each variant group. We appreciate that the variants could themselves be associated with differential severity, but some individuals early in the pandemic would have been hospitalised for public health reasons rather than clinical need, which breaks the proposed link between early variants and increased disease severity. We would therefore strongly suggest presenting the data in a way that at least acknowledges clinical severity status, if available (or at least including severity states in the Supplementary Tables to demonstrate that carriers of earlier variants were more unwell).
The next stage of analysis was the interpretation of BCR and TCR sequences from the available data and the identification of potentially SARS-CoV-2-specific sequences by comparison with seronegative individuals. The authors’ analysis is very simplistic. We question how they can state that their approach could be used for clinical diagnostics when, in some cases, they identified one sequence out of 100 as being linked to SARS-CoV-2 while many of the other reference sequences also matched perfectly, given the short stretch of peptide sequence available. We also question why a further step of pBLAST was required, rather than relying on the specific output from MiXCR, which should provide clonotype information that could then be matched against SARS-CoV-2-specific databases (e.g. https://doi.org/10.1016/j.immuni.2022.03.019). The current interpretation of the data appears highly subjective and lacks clarity. This is an area of significant attention across the research field, and many advanced systems are in place for this type of analysis. However, demonstrating clinical utility requires careful protocolisation, together with training and testing datasets from both SARS-CoV-2 and other infections, to allow the generation of performance metrics including sensitivity and specificity. The present work offers no analysis of this type.
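As a reminder of what such performance metrics involve, here is a minimal sketch under stated assumptions. The labels are hypothetical and are not data from the manuscript: `truth` stands for a reference infection status and `predicted` for the call made by a receptor-sequence-based classifier. The function name is ours.

```python
def sensitivity_specificity(truth, predicted):
    """Compute sensitivity and specificity from paired boolean labels."""
    tp = sum(1 for t, p in zip(truth, predicted) if t and p)
    fn = sum(1 for t, p in zip(truth, predicted) if t and not p)
    tn = sum(1 for t, p in zip(truth, predicted) if not t and not p)
    fp = sum(1 for t, p in zip(truth, predicted) if not t and p)
    if tp + fn == 0 or tn + fp == 0:
        raise ValueError("need at least one positive and one negative case")
    sensitivity = tp / (tp + fn)  # true positives among all infected
    specificity = tn / (tn + fp)  # true negatives among all non-infected
    return sensitivity, specificity


# Hypothetical test set: 4 SARS-CoV-2 cases, 6 controls (other infections).
truth = [True, True, True, True, False, False, False, False, False, False]
predicted = [True, True, True, False, False, False, False, False, True, True]

sens, spec = sensitivity_specificity(truth, predicted)
print(f"sensitivity = {sens:.2f}, specificity = {spec:.2f}")
```

Including controls with other infections is what makes the specificity estimate meaningful; a comparison against seronegative individuals alone cannot show that a matched sequence is specific to SARS-CoV-2.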
The final component of the work demonstrates, largely using the deconvolution data, that a read depth of 10M is sufficient to give deconvolution results similar to those obtained from data that had not been downsampled. This is perhaps the most significant message of the manuscript and may be helpful to RNA-seq analysts designing future experiments, although the presented data are only really informative for deconvolution. The authors do not test whether downsampling affects the reliability of gene expression or network analysis, which is arguably a greater strength of RNA-seq data. The result certainly does not justify the potential use of RNA-seq in the clinic.
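The concern about gene-level reliability can be illustrated with a small sketch. This is not the authors' downsampling procedure, which we cannot reconstruct from the manuscript; it assumes simple random subsampling of reads without replacement, and the gene names and counts are hypothetical. It relies on the `counts` argument of `random.sample`, available from Python 3.9.

```python
import random
from collections import Counter


def downsample_counts(counts, target_depth, seed=0):
    """Subsample reads without replacement down to target_depth.

    counts maps gene name -> read count; returns a dict of the same genes.
    """
    total = sum(counts.values())
    if target_depth > total:
        raise ValueError("target depth exceeds the available reads")
    rng = random.Random(seed)
    genes = list(counts)
    # Each gene is drawn at most as many times as it has reads.
    drawn = rng.sample(genes, k=target_depth,
                       counts=[counts[g] for g in genes])
    sampled = Counter(drawn)
    return {g: sampled.get(g, 0) for g in genes}


# Hypothetical library of 6,000 reads, reduced to 10% of its depth.
full = {"GENE_A": 5000, "GENE_B": 900, "GENE_C": 95, "GENE_D": 5}
down = downsample_counts(full, target_depth=600)

for gene in full:
    print(gene, full[gene], down[gene])
```

Highly expressed genes keep roughly their relative abundance, whereas a gene with only a handful of reads may drop to zero or fluctuate widely between draws. Deconvolution, which leans on abundant marker genes, can therefore look stable at a depth where low-abundance transcripts are no longer measured reliably.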
In summary, we remain unconvinced that many of the authors’ claims are justified by the research presented in the manuscript. RNA-seq is highly unlikely to be used in routine clinical practice owing to increased costs, reduced scalability, and longer turnaround times. The elements presented by the authors in this paper, including cell count determination and immune receptor analysis, offer no compelling arguments to challenge this stance. For us, the main take-home message from the presented data is that deconvolution methods can be used, but different cell types are estimated with differing accuracy by the different algorithms, and a read depth of 10M may be the acceptable lower bound for such methods.