RR:C19 Evidence Scale rating by reviewer:
Reliable. The main study claims are generally justified by its methods and data. The results and conclusions are likely to be similar to the hypothetical ideal study. There are some minor caveats or limitations, but they would/do not change the major claims of the study. The study provides sufficient strength of evidence on its own that its main claims should be considered actionable, with some room for future revision.
The authors use a large outpatient database from Southern California (USA) to compare the profile of individuals infected by the XBB.1.5 SARS-CoV-2 variant or by non-XBB variants (mostly the BQ.1.1 variant). By analysing individual health records, they can stratify individuals based on their number of vaccine doses and history of SARS-CoV-2 infection at the time of the infection. Their results show that XBB.1.5 cases tend to be less vaccinated than non-XBB cases but, conversely, tend to have more prior infections. They further analyse their results by exploring potential differences in terms of hospital admission and other adverse clinical outcomes. Finally, they discuss their results in the context of immune evasion and variant emergence.
The dataset analysed is impressive, the methods seem robust (although sometimes unclear), and the question addressed is timely and important. The writing is also very good. My two main concerns have to do with the clarity of the analyses performed, and the risk of a potential bias in the analysis.
1) Patient stratification
The main risk in such an analysis is that the outcome studied (here, infection by XBB.1.5 or not) overlaps with another stratification in the data. The authors go to some length to control for this by comparing patient age, sex, comorbidity index, etc. (see Table 1).
1.1 My first comment is that potential differences between the two populations do not seem to be thoroughly investigated. For instance, Table 1 could include a statistical test for differences between the two types of infection. Potential differences could also be discussed (for instance, XBB.1.5 infections seem to occur in a population with more women and older individuals).
1.2 Related to this point, it would be interesting to add to this table the date at which the test was performed to highlight potential temporal differences.
1.3 Furthermore, it seems important to add some spatial information. Indeed, since the data covers more than 4 million people all over California, this could allow the authors to identify some stratification in the data that may overlap with their outcome.
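To illustrate the kind of test suggested in 1.1, here is a minimal sketch of a Pearson chi-square test on a 2x2 table (for example, sex by infection type). The function name `chi_square_2x2` and the counts are hypothetical and are not taken from the manuscript; the authors may well prefer another test, or standardized differences, given the sample size.

```python
import math


def chi_square_2x2(a, b, c, d):
    """Pearson chi-square test (no continuity correction) for a 2x2 table.

    Layout (rows = infection group, columns = binary covariate):
                 covariate=yes  covariate=no
        XBB.1.5        a              b
        non-XBB        c              d

    Returns (statistic, p_value) with 1 degree of freedom.
    """
    n = a + b + c + d
    row1, row2 = a + b, c + d
    col1, col2 = a + c, b + d
    if min(row1, row2, col1, col2) == 0:
        raise ValueError("every row and column must have a non-zero total")
    stat = n * (a * d - b * c) ** 2 / (row1 * row2 * col1 * col2)
    # With 1 degree of freedom, P(X > stat) = erfc(sqrt(stat / 2)).
    p_value = math.erfc(math.sqrt(stat / 2))
    return stat, p_value


# Hypothetical counts for illustration only, NOT from the manuscript.
stat, p_value = chi_square_2x2(a=60, b=40, c=50, d=50)
print(f"chi-square = {stat:.3f}, p = {p_value:.3f}")
```

With cohorts of this size, even small differences are likely to be statistically significant, so reporting effect sizes alongside p-values would probably be more informative.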
2) Temporal and spatial issues
2.1 One aspect the authors do not seem to discuss is that the data do not seem to show an increase in the number of XBB.1.5 cases (in Figure 1C the number of cases seems rather constant). It is rather the decrease in non-XBB cases in December 2022 that drives the relative increase in XBB.1.5. The authors checked that samples without S-gene target failure (SGTF) were XBB.1.5, but it would be interesting to see whether this is the case both for December (when non-XBB lineages were the majority) and for February (when XBB.1.5 was the majority). If the non-SGTF cases in December were indeed caused by XBB.1.5, then it would be interesting to discuss why this variant did not increase in proportion until 2023. My guess is that its advantage comes from evading immunity to non-XBB lineages and that this advantage was limited until 2023. If so, this could potentially be linked with the message of the article, which is that XBB.1.5 is better at evading natural immunity than vaccine immunity. Furthermore, the fact that XBB.1.5 could not outgrow the previous lineage perhaps tells us something about the magnitude of the wave it may cause.
2.2 It was difficult for me to understand how the temporal component was controlled for in the analysis. In general, it does not seem to affect the authors' results. However, I wonder whether the timing of vaccination campaigns in the US might be an issue. If booster vaccinations came in waves, this could create spurious associations with the variant waves. Overall, although it is probably unlikely, I wonder whether a combination of spatial and temporal sampling biases could affect the results, for instance if an epidemic wave affected one region before another. I might be wrong, but this could perhaps be consistent with individuals infected by XBB.1.5 showing more prior infections than individuals infected by non-XBB lineages.
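To make the concern in 2.2 concrete, here is a small sketch comparing a crude odds ratio with a Mantel-Haenszel odds ratio pooled over strata (calendar weeks or regions). This is not the authors' method. The function names and all counts are hypothetical, chosen so that prior infection is unrelated to the variant within each stratum while both become more common over time.

```python
def odds_ratio(a, b, c, d):
    """Crude odds ratio for one 2x2 table.

    a = XBB.1.5 with prior infection,  b = XBB.1.5 without
    c = non-XBB with prior infection,  d = non-XBB without
    """
    return (a * d) / (b * c)


def mantel_haenszel_or(strata):
    """Mantel-Haenszel pooled odds ratio over a list of (a, b, c, d) strata."""
    num = 0.0
    den = 0.0
    for a, b, c, d in strata:
        n = a + b + c + d
        if n == 0:
            continue  # an empty stratum carries no information
        num += a * d / n
        den += b * c / n
    if den == 0:
        raise ValueError("pooled odds ratio is undefined (zero denominator)")
    return num / den


# Hypothetical counts for illustration only, NOT from the manuscript.
# Early stratum: XBB.1.5 is rare and 20% of all cases have a prior infection.
# Late stratum: XBB.1.5 dominates and 50% of all cases have a prior infection.
early = (2, 8, 18, 72)
late = (40, 40, 10, 10)

# Collapse the two strata into a single table, ignoring time.
pooled_table = tuple(x + y for x, y in zip(early, late))

crude = odds_ratio(*pooled_table)
adjusted = mantel_haenszel_or([early, late])
print(f"crude OR = {crude:.4f}, stratified OR = {adjusted:.4f}")
```

In this constructed example the within-stratum odds ratios are both 1, yet collapsing over time produces an apparent association. A clearer description of how calendar time and location enter the authors' models would help rule out this kind of artefact.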
3) Reproducibility
3.1 The authors decided not to give access to the raw individual data, which is understandable given that some of the details are very sensitive. However, this also hampers the reproducibility of the study. Perhaps an intermediate option could be envisaged, sharing only the basic information (e.g. age and sex) along with the test results. This could also allow the authors to share their R scripts.
3.2 Even with the raw data, I do not think the study is reproducible. The methods for the logistic analyses are nine lines long (along with a supplementary figure for acyclic graphs), and the way the adjustments are performed is unclear. Adding an online supplementary methods section seems important, especially since there seem to be differences between the two populations in Table 1 (not to mention potential spatial or temporal differences).
Since our solicitation of reviews, this preprint has been published in Nature Communications.