Date: 2022-03-07

RR:C19 Evidence Scale rating by reviewer:

Reliable. The main study claims are generally justified by its methods and data. The results and conclusions are likely to be similar to the hypothetical ideal study. There are some minor caveats or limitations, but they would/do not change the major claims of the study. The study provides sufficient strength of evidence on its own that its main claims should be considered actionable, with some room for future revision.

Review:

Summary: The authors analyzed acoustic features extracted from cough sounds and developed machine learning models for COVID-19 detection. Three subtasks were defined according to the patients’ symptoms. Specifically, the 6373-dimensional ComParE feature set was adopted, and the features were ranked by importance. Different machine learning techniques were investigated and compared. All experiments were conducted using the COUGHVID database.

The methods used to analyze feature importance and the machine learning methods for COVID-19 detection are reasonable, and the analysis and reported performance support the claims. Some minor caveats could be addressed:

In Table 5, it is hard to tell whether the higher performance is due to gender/age or to the number of samples. Because the dataset is imbalanced, the authors should take this into account when comparing performance across gender/age subgroups. The claims about performance in male versus female participants and across age groups might therefore need further justification.
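One way to separate a subgroup effect from a sample-size effect is to report an uncertainty interval alongside each subgroup score. The sketch below is only an illustration under stated assumptions: the metric is taken to be accuracy, the function name `bootstrap_accuracy_ci` is hypothetical, and the outcome lists are synthetic, not data from the study. It shows that two subgroups with identical accuracy can carry very different uncertainty when their sizes differ.

```python
import random


def bootstrap_accuracy_ci(correct, n_boot=2000, alpha=0.05, seed=0):
    """Percentile bootstrap interval for accuracy.

    `correct` is a list of per-sample outcomes (1 = correct, 0 = wrong).
    Returns (accuracy, lower bound, upper bound).
    """
    rng = random.Random(seed)
    n = len(correct)
    stats = []
    for _ in range(n_boot):
        # Resample the subgroup with replacement and recompute accuracy.
        resample = [correct[rng.randrange(n)] for _ in range(n)]
        stats.append(sum(resample) / n)
    stats.sort()
    lower = stats[int((alpha / 2) * n_boot)]
    upper = stats[min(n_boot - 1, int((1 - alpha / 2) * n_boot))]
    return sum(correct) / n, lower, upper


# Synthetic outcomes (assumption for illustration, not results from the study).
large_group = [1] * 320 + [0] * 80  # 400 samples, 80% accuracy
small_group = [1] * 32 + [0] * 8    # 40 samples, 80% accuracy

for name, outcomes in [("large", large_group), ("small", small_group)]:
    acc, lower, upper = bootstrap_accuracy_ci(outcomes)
    print(f"{name}: n={len(outcomes)} acc={acc:.2f} CI=({lower:.2f}, {upper:.2f})")
```

If the intervals of two subgroups overlap substantially, a difference in their point estimates is weak evidence of a genuine gender or age effect. Subsampling the larger subgroup to the size of the smaller one before comparison would be another possible check.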

Some machine learning techniques, such as SVM, may suffer from the ‘curse of dimensionality’. Therefore, the lower performance of SVM or nonlinear models compared with linear models may be due not to the model itself but to unsuitable features. The authors might need to address this in the system comparisons in Table 4. Further, compared to other approaches that apply deep learning to the same dataset, the reported performance is low. This raises a question: if deep learning features significantly outperform this feature set, what are the main advantages of the feature analysis? Are there any other potential ways to use the most important features to achieve better performance?
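The dimensionality concern can be made concrete with a small experiment on distance concentration: in high-dimensional spaces, the nearest and farthest neighbours of a point become almost equally distant, which weakens distance-based kernels such as the RBF kernel commonly used with SVM. The sketch below is an illustration only; it uses uniformly random points rather than real acoustic features, and the function name `nearest_to_farthest_ratio` is hypothetical. The dimension 6373 is taken from the feature set size mentioned in the summary.

```python
import math
import random


def nearest_to_farthest_ratio(dim, n_points=100, seed=0):
    """Ratio of the smallest to the largest Euclidean distance from one
    random query point to `n_points` random points in the unit hypercube."""
    rng = random.Random(seed)
    query = [rng.random() for _ in range(dim)]
    dists = [
        math.dist(query, [rng.random() for _ in range(dim)])
        for _ in range(n_points)
    ]
    return min(dists) / max(dists)


# A ratio near 1 means all points look almost equally far from the query.
ratio_low = nearest_to_farthest_ratio(dim=2)
ratio_high = nearest_to_farthest_ratio(dim=6373)
print(f"dim=2:    nearest/farthest = {ratio_low:.3f}")
print(f"dim=6373: nearest/farthest = {ratio_high:.3f}")
```

Real features are correlated and far from uniform, so the effect in practice may be milder than in this toy setting. Rerunning the nonlinear models on only the top-ranked features would be one way to test whether the gap in Table 4 stems from the feature space rather than the model.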

In Table 2, different groups of features prove effective in the three subtasks, and task 2 requires the most features. A discussion of the potential reasons would provide useful insight and significantly enhance understanding.

The presented feature analysis could provide some insights but needs further justification in comparison to state-of-the-art deep learning features and system performance.

The literature review is relatively comprehensive and covers the broad picture of disease detection using audio sounds, as well as the specific task of COVID-19 detection. The careful consideration in categorizing audio samples with and without symptoms is good.

The manuscript is well-structured and clearly presented. The writing quality is also good.



