
Review 2: "Evaluation of a Machine Learning Approach Utilizing Wearable Data for Prediction of SARS-CoV-2 Infection in Healthcare Workers"

This study develops a prediction model for positive COVID-19 diagnosis using data collected from Apple Watches on heart rate variability (HRV) among healthcare workers. Reviewers highlight unclear model justifications and methodology.

Published on Jan 26, 2022
This Pub is a Review of
Evaluation of a Machine Learning Approach Utilizing Wearable Data for Prediction of SARS-CoV-2 Infection in Healthcare Workers

Abstract

Importance: Passive and non-invasive identification of SARS-CoV-2 infection remains a challenge. Widespread use of wearable devices represents an opportunity to leverage physiological metrics and fill this knowledge gap.

Objective: To determine whether a machine learning model can detect SARS-CoV-2 infection from physiological metrics collected from wearable devices.

Design: A multicenter observational study enrolling health care workers with remote follow-up.

Setting: Seven hospitals from the Mount Sinai Health System in New York City.

Participants: Eligibility criteria included health care workers who were ≥18 years, employees of one of the participating hospitals, with at least an iPhone series 6, and willing to wear an Apple Watch Series 4 or higher. We excluded participants with underlying autoimmune/inflammatory diseases, and medications known to interfere with autonomic function. We enrolled participants between April 29th, 2020, and March 2nd, 2021, and followed them for a median of 73 days (range, 3-253 days). Participants provided patient-reported outcome measures through a custom smartphone application and wore an Apple Watch, collecting heart rate variability and heart rate data, throughout the follow-up period.

Exposure: Participants were exposed to SARS-CoV-2 infection over time due to ongoing community spread.

Main Outcome and Measure: The primary outcome was SARS-CoV-2 infection, defined as ±7 days from a self-reported positive SARS-CoV-2 nasal PCR test.

Results: We enrolled 407 participants with 49 (12%) having a positive SARS-CoV-2 test during follow-up. We examined five machine-learning approaches and found that gradient-boosting machines (GBM) had the most favorable 10-CV performance. Across all testing sets, our GBM model predicted SARS-CoV-2 infection with an average area under the receiver operating characteristic (auROC)=85% (Confidence Interval 83-88%). The model was calibrated to improve sensitivity over specificity, achieving an average sensitivity of 76% (CI ±∼4%) and specificity of 84% (CI ±∼0.4%). The most important predictors included parameters describing the circadian HRV mean (MESOR) and peak-timing (acrophase), and age.

Conclusions and Relevance: We show that a tree-based ML algorithm applied to physiological metrics passively collected from a wearable device can identify and predict SARS-CoV2 infection. Utilizing physiological metrics from wearable devices may improve screening methods and infection tracking.

RR:C19 Evidence Scale rating by reviewer:

  • Potentially informative. The main claims made are not strongly justified by the methods and data, but may yield some insight. The results and conclusions of the study may resemble those from the hypothetical ideal study, but there is substantial room for doubt. Decision-makers should consider this evidence only with a thorough understanding of its weaknesses, alongside other evidence and theory. Decision-makers should not consider this actionable, unless the weaknesses are clearly understood and there is other theory and evidence to further support it.



This preprint, by Hirten et al., presents a study in which heart rate variability (HRV) was used to predict the COVID-19 status of healthcare workers. Previous work, mentioned in the article, demonstrates that data from wearables have been successful in predicting COVID-19 status, in some cases even before patients are symptomatic. The study fits well into existing work in the area, building on previous findings that wearable data and changes in heart rate (such as the daily amplitude) can help predict COVID-19 status. However, these other studies included more information than just heart-rate-based variables and very basic demographics (age and gender). The authors state that the main contribution of this study is training the model and using testing data to determine accuracy.

The study included over 400 health care workers, with each participant’s heart rate collected using Apple Watches for a median of 73 days. The authors address the low number of positive COVID-19 cases (12%) in their analysis but do not discuss the study’s enrollment bias or the disproportionate number of women in the study. A major weakness of the article is that the sampling method is not clear to the reader. The authors state: "Data was split into independent training and testing sets, ensuring that observations were taken on chronologically similar days (e.g., Day 6 and Day 7), for the same subject, were in the same set. A sampling procedure was employed that ensured that observations with proximity in time (±4 days), for the same subject, did not appear in both training and testing sets." This can be interpreted as follows: multiple observations of the same subject can appear in the same set if the observations are chronologically close, and the same subject may appear in both the training and testing sets if the observations are chronologically far apart. Having multiple observations from the same subject creates problems because those observations would be correlated. There are methods for addressing correlation in tree-based models [1], but they were not mentioned in the paper. If the same subject is in both the training and the testing set, then the two sets are not independent, which can result in a falsely high prediction rate. Furthermore, the fact that COVID-positive patients are less likely to contract COVID-19 again for some period of time after recovery presents additional correlation issues, for which the ±4 days window would not be sufficient to claim the sets are independent. If this is indeed the analysis performed by the authors, then the method is inappropriate for the type of data used in the study; in that case, the assumptions for GBM were violated. However, if this was not the case, the language should be corrected and expanded to avoid confusion. In addition, the authors would need to address how the issue of correlated data was handled or why it can be ignored in this case.
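To make the independence concern concrete, the following is a minimal sketch of a subject-wise split, in which all observations from a given participant fall entirely in either the training or the testing set. It is not the authors' procedure and not a method proposed in the preprint; the function name `subject_wise_folds`, the toy subject IDs, and the fold count are all hypothetical and chosen only for illustration.

```python
import random


def subject_wise_folds(subject_ids, n_folds=5, seed=0):
    """Yield (train_idx, test_idx) pairs in which no subject is on both sides.

    subject_ids: one entry per observation, naming the subject it came from.
    (Hypothetical helper for illustration; not taken from the preprint.)
    """
    subjects = sorted(set(subject_ids))
    if n_folds > len(subjects):
        raise ValueError("n_folds cannot exceed the number of distinct subjects")
    rng = random.Random(seed)
    rng.shuffle(subjects)
    # Deal the shuffled subjects round-robin into folds, so the fold is
    # decided per subject rather than per observation.
    fold_of = {s: i % n_folds for i, s in enumerate(subjects)}
    for k in range(n_folds):
        test_idx = [i for i, s in enumerate(subject_ids) if fold_of[s] == k]
        train_idx = [i for i, s in enumerate(subject_ids) if fold_of[s] != k]
        yield train_idx, test_idx


# Hypothetical toy data: 6 subjects with 4 daily observations each.
subject_ids = [f"S{n}" for n in range(6) for _ in range(4)]
for train_idx, test_idx in subject_wise_folds(subject_ids, n_folds=3):
    train_subjects = {subject_ids[i] for i in train_idx}
    test_subjects = {subject_ids[i] for i in test_idx}
    # No participant contributes observations to both sets.
    assert train_subjects.isdisjoint(test_subjects)
```

A split of this kind removes the within-subject leakage described above, although it does not by itself model the correlation among a subject's repeated observations within the training set. Readers using scikit-learn may find that `GroupKFold` in `sklearn.model_selection` serves the same purpose.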

Given the unaddressed limitations of the analysis, the results from the testing data are suggestive and do not support the claim that HRV measured by a wearable device can reliably predict COVID-19 infections. For these reasons, the authors’ claim that the article’s contribution lies in its use of testing data is further weakened. The study has an interesting idea, but additional work is needed to better understand the predictive power of HRV for COVID-19 status.

1. Rabinowicz A, Rosset S. Trees-Based Models for Correlated Data. arXiv preprint arXiv:2102.08114. 2021 Feb 16.
