RR:C19 Evidence Scale rating by reviewer:
Potentially informative. The main claims made are not strongly justified by the methods and data, but may yield some insight. The results and conclusions of the study may resemble those from the hypothetical ideal study, but there is substantial room for doubt. Decision-makers should consider this evidence only with a thorough understanding of its weaknesses, alongside other evidence and theory. Decision-makers should not consider this actionable, unless the weaknesses are clearly understood and there is other theory and evidence to further support it.
***************************************
Review:
After reviewing the methods, I believe that the results are potentially informative as a basis for discussion and future research, but not as an actionable point of evidence. The claims made regarding the total potential reduction of COVID-19 transmission and deaths are based on an unreliable estimate of the effectiveness of masks, with much further uncertainty added in the process of projection. I have focused my review on the methods underlying the initial mask effect estimate, and have limited my review of the parts in which I have more limited expertise (e.g. infectious disease modeling). While I believe that the study was well-conducted and is useful to researchers, I do not believe it to be sufficiently reliable to be informative for policymakers.
The key methodological weakness in this study is the initial estimate of the causal effect of masks on the prevention of respiratory diseases, on which all projections and modeling rely. In this case, the study meta-analyzes data drawn from two existing meta-analyses. However, there is no evaluation of the strength and quality of the underlying studies, their potential biases, etc. That is critical given that the effectiveness of masks in protecting against respiratory virus infection is an extremely difficult subject to study in general, and that the pool of studies on which this meta-estimate relies is of relatively low strength and uses unreliable methods. A meta-regression of unreliable evidence (Bayesian or otherwise) unfortunately still yields an unreliable estimate. At a minimum, the uncertainty bounds should reflect the full range of plausible mask effect estimates, including the additional uncertainty produced by study design issues. This is achievable through sensitivity analysis or through careful application of Bayesian priors, noting that the final uncertainty range of the results, in this case, would almost certainly be enormous, beyond reasonable utility for decision-makers.
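To illustrate the point about uncertainty bounds, the following is a minimal sketch of one possible sensitivity analysis, not the method used in the study under review. It pools log relative risks by inverse-variance weighting and then repeats the pooling while adding an extra per-study variance term for unmodeled bias. The study values, the function name `pooled_estimate`, and the `bias_sd` parameter are all hypothetical and chosen only for illustration.

```python
import math

def pooled_estimate(log_rr, se, bias_sd=0.0):
    """Inverse-variance pooled log relative risk with a 95% interval.

    bias_sd is a hypothetical extra standard deviation added to every
    study to represent uncertainty from study design issues.
    """
    # Each study's weight shrinks as its total variance grows.
    weights = [1.0 / (s ** 2 + bias_sd ** 2) for s in se]
    total = sum(weights)
    mean = sum(w * y for w, y in zip(weights, log_rr)) / total
    pooled_se = math.sqrt(1.0 / total)
    return mean, mean - 1.96 * pooled_se, mean + 1.96 * pooled_se

# Hypothetical study-level estimates (NOT taken from the reviewed study).
log_rr = [math.log(0.6), math.log(0.7), math.log(0.9)]
se = [0.15, 0.20, 0.25]

# Sensitivity analysis: widen the assumed bias and watch the interval grow.
for bias_sd in (0.0, 0.2, 0.4):
    mean, lo, hi = pooled_estimate(log_rr, se, bias_sd)
    print(f"bias_sd={bias_sd:.1f}: RR={math.exp(mean):.2f} "
          f"(95% interval {math.exp(lo):.2f} to {math.exp(hi):.2f})")
```

This sketch treats the bias as independent across studies, which is itself an optimistic assumption: a bias shared by all studies would not average out under pooling, and the resulting interval would be wider still.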
The projection of mask usage data appears reasonable, albeit of limited generalizability given selection issues. The SEIR basis for modeling is not my area of expertise, but there remains some question about whether mechanistic models this simple are adequate in this setting. Some of the problems related to this may cancel out when scenarios are compared, so this may or may not be a severe issue for the main claims.
I found the limitations section in the discussion to be refreshingly detailed, comprehensive, and upfront. That is not easy to do, and is unfortunately a rarity in the literature. That makes this study particularly useful for researchers who may be able to tackle some of these limitations to produce more reliable estimates. If I had any critique, it would be that the extent of these limitations should be better reflected in the abstract.
The conclusions broadly lie within the general expert and literature consensus, but this study adds limited actionable information beyond that consensus.