
Review 1: "Evaluation of a Machine Learning Approach Utilizing Wearable Data for Prediction of SARS-CoV-2 Infection in Healthcare Workers"

This study develops a prediction model for positive COVID-19 diagnosis using data collected from Apple Watches on heart rate variability (HRV) among healthcare workers. Reviewers highlight unclear model justifications and methodology.

Published on Jan 26, 2022
This Pub is a Review of
Evaluation of a Machine Learning Approach Utilizing Wearable Data for Prediction of SARS-CoV-2 Infection in Healthcare Workers

Abstract

Importance: Passive and non-invasive identification of SARS-CoV-2 infection remains a challenge. Widespread use of wearable devices represents an opportunity to leverage physiological metrics and fill this knowledge gap.

Objective: To determine whether a machine learning model can detect SARS-CoV-2 infection from physiological metrics collected from wearable devices.

Design: A multicenter observational study enrolling health care workers with remote follow-up.

Setting: Seven hospitals from the Mount Sinai Health System in New York City.

Participants: Eligibility criteria included health care workers who were ≥18 years, employees of one of the participating hospitals, with at least an iPhone series 6, and willing to wear an Apple Watch Series 4 or higher. We excluded participants with underlying autoimmune/inflammatory diseases, and medications known to interfere with autonomic function. We enrolled participants between April 29th, 2020, and March 2nd, 2021, and followed them for a median of 73 days (range, 3-253 days). Participants provided patient-reported outcome measures through a custom smartphone application and wore an Apple Watch, collecting heart rate variability and heart rate data, throughout the follow-up period.

Exposure: Participants were exposed to SARS-CoV-2 infection over time due to ongoing community spread.

Main Outcome and Measure: The primary outcome was SARS-CoV-2 infection, defined as ±7 days from a self-reported positive SARS-CoV-2 nasal PCR test.

Results: We enrolled 407 participants with 49 (12%) having a positive SARS-CoV-2 test during follow-up. We examined five machine-learning approaches and found that gradient-boosting machines (GBM) had the most favorable 10-CV performance. Across all testing sets, our GBM model predicted SARS-CoV-2 infection with an average area under the receiver operating characteristic (auROC)=85% (Confidence Interval 83-88%). The model was calibrated to improve sensitivity over specificity, achieving an average sensitivity of 76% (CI ±∼4%) and specificity of 84% (CI ±∼0.4%). The most important predictors included parameters describing the circadian HRV mean (MESOR) and peak-timing (acrophase), and age.

Conclusions and Relevance: We show that a tree-based ML algorithm applied to physiological metrics passively collected from a wearable device can identify and predict SARS-CoV2 infection. Utilizing physiological metrics from wearable devices may improve screening methods and infection tracking.

RR:C19 Evidence Scale rating by reviewer:

  • Potentially informative. The main claims made are not strongly justified by the methods and data, but may yield some insight. The results and conclusions of the study may resemble those from the hypothetical ideal study, but there is substantial room for doubt. Decision-makers should consider this evidence only with a thorough understanding of its weaknesses, alongside other evidence and theory. Decision-makers should not consider this actionable, unless the weaknesses are clearly understood and there is other theory and evidence to further support it.



The authors discuss the use of wearable device data for non-invasive prediction of SARS-CoV-2 infection. It has previously been shown that heart monitoring data collected from wearable devices are associated with infection. In this work, the authors develop a prediction model for positive COVID-19 diagnosis based on heart rate variability (HRV) and resting heart rate data collected from Apple Watches.

The authors’ prediction algorithm can effectively predict a positive COVID-19 diagnosis, with sensitivity and specificity of 76% and 84%, respectively, on a testing data set. The authors also show that HRV measurements are among the most important predictors in the algorithm, suggesting that HRV should be considered a potentially useful predictor variable in the further development of algorithms for predicting COVID-19 diagnosis.

Parts of the authors’ description of how their predictive model was constructed are unclear to me. For instance, it is not clear how the smoothed estimates of daily HRV mean, amplitude, and acrophase were calculated. It is also not sufficiently explained which parameters of the COSINOR random effects model the authors estimate, or how the estimated COSINOR model was used in developing the prediction model. Additionally, it would be helpful if the authors provided more detail on how their training and testing sets were constructed. In particular, I do not understand how independence between the training and testing sets was achieved. It is also unclear how the authors accounted for correlation between observations when constructing confidence intervals for summaries of model performance on the training and testing sets.
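
To make the two concerns about independence and correlated observations concrete, the following is a minimal sketch of one common way to address them: splitting by participant, so that no individual contributes observations to both sets, and resampling whole participants when bootstrapping a confidence interval. It is an illustration only and is not the authors’ procedure. The record fields `participant_id`, `label`, and `pred`, and all function names, are hypothetical.

```python
import random
from collections import defaultdict


def split_by_participant(records, test_fraction=0.2, seed=0):
    """Split records so that each participant appears in only one set.

    `records` is a list of dicts with a (hypothetical) 'participant_id' key.
    """
    ids = sorted({r["participant_id"] for r in records})
    rng = random.Random(seed)
    rng.shuffle(ids)
    n_test = max(1, round(len(ids) * test_fraction))
    test_ids = set(ids[:n_test])
    train = [r for r in records if r["participant_id"] not in test_ids]
    test = [r for r in records if r["participant_id"] in test_ids]
    return train, test


def accuracy(records):
    """Fraction of records whose (hypothetical) 'pred' equals 'label'."""
    return sum(r["pred"] == r["label"] for r in records) / len(records)


def cluster_bootstrap_ci(records, metric, n_boot=1000, alpha=0.05, seed=0):
    """Percentile bootstrap CI that resamples participants, not single rows.

    Resampling whole participants keeps each person's correlated daily
    observations together, so the interval reflects between-person variation.
    """
    by_id = defaultdict(list)
    for r in records:
        by_id[r["participant_id"]].append(r)
    ids = sorted(by_id)
    rng = random.Random(seed)
    stats = []
    for _ in range(n_boot):
        sampled_ids = rng.choices(ids, k=len(ids))  # sample with replacement
        sample = [r for pid in sampled_ids for r in by_id[pid]]
        stats.append(metric(sample))
    stats.sort()
    lower = stats[int((alpha / 2) * n_boot)]
    upper = stats[min(n_boot - 1, int((1 - alpha / 2) * n_boot))]
    return lower, upper
```

A row-level split or row-level bootstrap would treat repeated daily measurements from the same person as independent, which tends to produce optimistic performance estimates and intervals that are too narrow.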

I also suspect that an improved prediction model could be constructed, one that predicts COVID-19 positivity for a subject at any given time point based on the subject’s history up to that time point. My reasoning is that whether a subject’s HRV or heart rate measurements on a given day differ greatly from the population average may be less relevant for predicting COVID-19 positivity than whether those measurements change substantially relative to what is typical for that particular subject. I suggest that the authors consider this approach in their future research.
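
As a rough illustration of the subject-specific idea, the sketch below computes, for each day, how far a participant's measurement deviates from that same participant's recent history, using only data available up to that day. The function name, the 14-day window, and the 7-day minimum history are assumptions chosen for the example, not values taken from the manuscript.

```python
from statistics import mean, stdev


def personal_z_scores(values, window=14, min_history=7):
    """Z-score of each daily value against the participant's own prior days.

    `values` is one participant's daily series (e.g. daily mean HRV), in
    chronological order. Only days strictly before day t are used, so the
    feature never looks into the future.
    """
    scores = []
    for t, x in enumerate(values):
        history = values[max(0, t - window):t]
        if len(history) < min_history:
            scores.append(None)  # not enough personal history yet
            continue
        sd = stdev(history)
        scores.append((x - mean(history)) / sd if sd > 0 else 0.0)
    return scores
```

For example, a series that hovers around 50 for a week and then drops to 30 yields `None` for the first seven days and a large negative score on the eighth. Such a score could be supplied to a classifier alongside, or instead of, the raw daily values.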

In summary, the authors’ conclusion that wearable device data can be used to predict COVID-19 infection appears to be justified. However, some aspects of the authors’ methodology could have been explained more clearly, and the manuscript would benefit from the inclusion of more details.
