RR\ID Evidence Scale rating by reviewer:
***************************************
Review: This study identified specific human factors, namely individual pre-vaccination antibody titers, that were indicative of vaccination effectiveness. It also provided insights, as well as a replicable analysis framework, for predicting an individual’s response to vaccination. Please see comments below:
Overall:
This is an excellent and informative analysis with potential for considerable impact.
It seems that your emphasis on the “prediction” framework understates the interpretability of your findings. In particular, the analyses that directly target the relationship between pre-vac HAI and the post-vac response go beyond a typical prediction analysis. I believe that reframing this work in broader language, rather than solely in terms of prediction, would better align with your analyses and inferences and would better attract relevant readers.
Additional details on the prediction and modeling would be useful, particularly regarding how individual factors (e.g., pre-vac HAI) were isolated in the analyses.
Authors could include additional references to similar methodologies (e.g. selection of holdout sets), which would further ground this approach in existing literature.
Abstract:
The phrase “predictive understanding” doesn’t quite make sense, as prediction often forgoes understanding in the name of predictive accuracy. A phrase such as “ability to predict” may be clearer.
“20k data points” is a bit vague. Does this refer to all covariate information, outcomes, etc.?
Additional context could be provided for the phrase “These datasets formed a blinded prediction challenge, where the computational team only received the pre-vaccination data yet predicted the post-vaccination responses with 2.2-fold error, comparable to the 2-fold intrinsic error of the experimental assay.” While this will likely be clear to vaccination experts, it would be useful to provide context on this comparison, given the broad readership of RR\ID. The following section provides a great overview of this context, but it would be helpful to provide a short allusion in the abstract, or remove this comparison.
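To make this comparison concrete for non-specialist readers, it may help to state how the fold error is computed. As a minimal sketch only (one common convention, not necessarily the authors’ exact metric), an average fold error can be taken as the geometric mean of the prediction/observation ratios on a log2 scale, so that a value of ~2 corresponds to the resolution of a 2-fold serial-dilution assay; the titer values below are purely illustrative.

```python
import numpy as np

def mean_fold_error(predicted, observed):
    """Geometric-mean fold error between predicted and observed HAI titers.

    A value of 2.0 means predictions are off by 2-fold on average,
    roughly the resolution of a 2-fold serial-dilution HAI assay.
    """
    predicted = np.asarray(predicted, dtype=float)
    observed = np.asarray(observed, dtype=float)
    # Average absolute error on the log2 scale, converted back to a fold.
    log2_errors = np.abs(np.log2(predicted) - np.log2(observed))
    return 2.0 ** log2_errors.mean()

# Toy example with hypothetical titers (not data from the study):
pred = [40, 80, 160, 320]
obs = [80, 80, 80, 640]
print(mean_fold_error(pred, obs))  # ~1.68-fold average error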
Introduction:
The statement, “This underscores the need for a combined virus-and-people-centric approach based upon both a strain’s prevalence and the immunity it elicits in people,” compellingly establishes the need for this research.
It would be helpful to define the terms “antigenic seniority,” “imprinting,” “vaccine blunting,” and “antibody ceiling effects.”
“…can be predicted with accuracy comparable to experimental noise.” Noise (variability) and accuracy seem like very different quantities – what exactly is being compared here?
Performance in the holdout samples is impressive and supports generalizability: “Prediction accuracy holds across four new vaccine studies we conduct (in 2022 and 2023) spanning three vaccine types and two geographic locations. For this challenge, the computational team (T.E.) was blinded and only given the prevac data to stringently test the model’s predictive power.”
More details from this paragraph could be included in the abstract.
This statement is unclear and could benefit from rephrasing: “While individual studies may find different relationships between variants, combining all studies from the past decade leads to universal relations that accurately predict post-vac titers”.
Here I have noted a statement that goes beyond a typical prediction framework: “The magnitude of the fold-change post-vac is strongly associated with the # of years (ΔPeak) between the two most recent peaks in the HAI landscape; 2≤ΔPeak≤3 yields a large fold-change while 4≤ΔPeak≤6 leads to a smaller fold-change in 73% of cases.” This is beyond prediction in terms of interpretability. I view this as a strength of the paper, but framing could be improved to prepare the reader for these inferences.
In the following excerpt, it would be useful to refer to the literature on best practices for holdout sets, which supports your approach (Collins GS, Dhiman P, Andaur Navarro CL, Ma J, Hooft L, Reitsma JB, et al. Protocol for development of a reporting guideline (TRIPOD-AI) and risk of bias tool (PROBAST-AI) for diagnostic and prognostic prediction model studies based on artificial intelligence. BMJ Open. 2021;11(7):e048008): “Instead of splitting each dataset into training/testing sets, we restrict ourselves to the harder prediction challenge of training on some datasets and testing on entirely different datasets”.
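To illustrate the distinction I have in mind (and the kind of procedure the TRIPOD-AI/PROBAST-AI guidance addresses), a study-level holdout loop might look roughly like the sketch below. The names (studies, fit_model, predict) are hypothetical placeholders, not the authors’ actual code; the point is only that entire datasets, rather than random subsets of rows, are held out.

```python
import numpy as np

def leave_one_study_out(studies, fit_model, predict):
    """Study-level holdout: train on some datasets, test on entirely different ones.

    studies: dict mapping study name -> (X, y), where X holds pre-vac features
             and y the post-vac titers for that study.
    fit_model, predict: placeholder callables for whatever model is used.
    Returns the average fold error on each held-out study.
    """
    errors = {}
    for held_out in studies:
        # Train only on the other studies ...
        train = [studies[s] for s in studies if s != held_out]
        X_train = np.vstack([X for X, _ in train])
        y_train = np.concatenate([y for _, y in train])
        model = fit_model(X_train, y_train)
        # ... and evaluate on the study the model never saw.
        X_test, y_test = studies[held_out]
        y_pred = predict(model, X_test)
        errors[held_out] = 2.0 ** np.mean(np.abs(np.log2(y_pred) - np.log2(y_test)))
    return errors
```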
Results:
The following excerpt is unclear and could benefit from rephrasing: “Due to the substantial effort involved, such studies often restrict their analyses to the sizable datasets they produce”.
Here, I highlight another excerpt that goes beyond a typical prediction framework, as it targets the relationship between pre-vac HAI and post-vac HAI: “We next determine how well pre-vac HAIs predict the post-vac response, while taking into account the heterogeneity of responses, the different variants measured in each study, and differences in study design that may affect the response.”
This excerpt provides useful context and would be helpful in the abstract: “Since many influenza studies (beyond the ones analyzed in this work) only measure the vaccine strain, we sought to beat the 4.5x error found when only matching the vaccine strain’s pre-vaccination HAI (VacPre, Fig 2A red square), ideally aiming for the 2x noise limit of the HAI assay (see Methods for the quantification of assay noise)”.
Discussion:
Methods: