Early prediction of clinical response to checkpoint inhibitor therapy in human solid tumors through mathematical modeling
Abstract
Background:
Checkpoint inhibitor therapy of cancer has led to markedly improved survival of a subset of patients in multiple solid malignant tumor types, yet the factors driving these clinical responses or lack thereof are not known. We have developed a mechanistic mathematical model for better understanding these factors and their relations in order to predict treatment outcome and optimize personal treatment strategies.
Methods:
Here, we present a translational mathematical model dependent on three key parameters for describing efficacy of checkpoint inhibitors in human cancer: tumor growth rate (α), tumorimmune infiltration (Λ), and immunotherapymediated amplification of antitumor response (µ). The model was calibrated by fitting it to a compiled clinical tumor response dataset (n = 189 patients) obtained from published antiPD1 and antiPDL1 clinical trials, and then validated on an additional validation cohort (n = 64 patients) obtained from our inhouse clinical trials.
Results:
The derived parameters Λ and µ were both significantly different between responding versus nonresponding patients. Of note, our model appropriately classified response in 81.4% of patients by using only tumor volume measurements and within 2 months of treatment initiation in a retrospective analysis. The model reliably predicted clinical response to the PD1/PDL1 class of checkpoint inhibitors across multiple solid malignant tumor types. Comparison of model parameters to immunohistochemical measurement of PDL1 and CD8+ T cells confirmed robust relationships between model parameters and their underlying biology.
Conclusions:
These results have demonstrated reliable methods to inform model parameters directly from biopsy samples, which are conveniently obtainable as early as the start of treatment. Together, these suggest that the model parameters may serve as early and robust biomarkers of the efficacy of checkpoint inhibitor therapy on an individualized perpatient basis.
Funding:
We gratefully acknowledge support from the Andrew Sabin Family Fellowship, Center for Radiation Oncology Research, Sheikh Ahmed Center for Pancreatic Cancer Research, GE Healthcare, Philips Healthcare, and institutional funds from the University of Texas M.D. Anderson Cancer Center. We have also received Cancer Center Support Grants from the National Cancer Institute (P30CA016672 to the University of Texas M.D. Anderson Cancer Center and P30CA072720 the Rutgers Cancer Institute of New Jersey). This research has also been supported in part by grants from the National Science Foundation Grant DMS1930583 (ZW, VC), the National Institutes of Health (NIH) 1R01CA253865 (ZW, VC), 1U01CA196403 (ZW, VC), 1U01CA213759 (ZW, VC), 1R01CA226537 (ZW, RP, WA, VC), 1R01CA222007 (ZW, VC), U54CA210181 (ZW, VC), and the University of Texas System STARS Award (VC). BC acknowledges support through the SER Cymru II Programme, funded by the European Commission through the Horizon 2020 Marie SkłodowskaCurie Actions (MSCA) COFUND scheme and the Welsh European Funding Office (WEFO) under the European Regional Development Fund (ERDF). EK has also received support from the Project Purple, NIH (U54CA210181, U01CA200468, and U01CA196403), and the Pancreatic Cancer Action Network (1665SING). MF was supported through NIH/NCI center grant U54CA210181, R01CA222959, DoD Breast Cancer Research Breakthrough Level IV Award W81XWH1710389, and the Ernest Cockrell Jr. Presidential Distinguished Chair at Houston Methodist Research Institute. RP and WA received serial research awards from AngelWorks, the GillsonLongenbaugh Foundation, and the Marcus Foundation. This work was also supported in part by grants from the National Cancer Institute to SHC (R01CA109322, R01CA127483, R01CA208703, and U54CA210181 CITO pilot grant) and to PYP (R01CA140243, R01CA188610, and U54CA210181 CITO pilot grant). The funders had no role in study design, data collection and analysis, decision to publish, or preparation of the manuscript.
Editor's evaluation
A mathematical model was established for predicting immunotherapy efficacy in this work. With three convenient available clinical parameters, the model has exhibited considerable predictive capacity with stable performance across several tumor types. It may show great promise in selecting participants for prospective trials and guiding targeted application of immunotherapy in cancer patients.
https://doi.org/10.7554/eLife.70130.sa0Introduction
Recent advances in the understanding of immunological pathways responsible for antibody and/or cellmediated destruction of tumors have led to the development of unique cancer therapeutics in recent years, leading to markedly improved survival in the setting of previously intractable metastatic melanoma, bladder, kidney, lung, and head and neck cancers, among several other human solid tumor types (Borghaei et al., 2015; Robert et al., 2015a; Robert et al., 2015b). In particular, one of the most successful clinical applications of immune checkpoint inhibitors includes antibodies directed against the programmed death protein 1 (PD1) pathway, which inhibits cellular immune killing of cancer cells via complementary binding of tumor expressed programmed death ligand 1 (PDL1) to PD1 on immune cells (Pardoll, 2012). As a still emerging yet quite compelling immunotherapy approach in contemporary cancer medicine, targeting of immune checkpoints is being extensively investigated in ongoing and upcoming clinical trials aimed at unleashing T cell activity, augmenting immune recognition against tumor metastases, and boosting immune memory for longlasting clinical remission posttreatment (Postow et al., 2015; Le et al., 2017). While the remarkable potential for checkpoint inhibitors in treating cancer is unequivocally exciting, the combined clinical trial experience has shown that durable effective treatment outcomes occur only in a limited subset of patients (Sharma et al., 2017). Alas, some cancer types are presumed to be minimally immunogenic, resulting in little or no response to this treatment strategy (Brahmer et al., 2012). An accumulating body of evidence has established that certain immunological features, including T cell exhaustion (e.g., Tim3) and exclusion (JerbyArnon et al., 2018), senescence markers (Moreira et al., 2019) such as CD57, or immune incompetence, exhaustion, or premature senescence (e.g., loss of CD27) (Riaz et al., 2017; Vallejo, 2005), could perhaps reflect or even predict sensitivity and resistance to checkpoint inhibitorbased cancer immunotherapy. However, early attempts at identification of specific pathological biomarkers to predict immunotherapy response, including transcriptomic rubrics (Auslander et al., 2018), machine learning algorithms (Johannet et al., 2021), genomic approaches (Cormedi et al., 2021) such as tumor mutational burden (Duffy and Crown, 2019), among others (Pilard et al., 2021), have thus far shown somewhat inconsistent results, challenged further by the inherent molecular, cellular, and biophysical diversity of human tumors (Teng et al., 2015; Tumeh et al., 2014; Carbognin et al., 2015). Instead, most currently applied parameters merely document tumor responses that have already occurred as opposed to predicting it a priori; these include the standard Response Evaluation Criteria in Solid Tumor (RECIST) v1.1 (Eisenhauer et al., 2009) and even the newer proposed response assessment rubrics specific for immunotherapy (iRECIST). Considering the sheer complexity of the biological interactions between the immune system and the tumor microenvironment, one could reason that the introduction of more sophisticated mathematical analytic techniques would hopefully have potential to enhance the qualitative and quantitative understanding of such interactions and to improve malignant tumor treatment with checkpoint inhibitors, ultimately supporting the development of a predictive clinical tool that could either minimize or overcome this clear unmet need of cancer medicine.
In previous work, mechanistic mathematical models of immunomodulatory interventions for cancer control utilizing coupled ordinary differential equations (ODEs) have already allowed the prediction of tumor response after Bacillus Calmette–Guérin (BCG) administration in superficial transitional cell carcinoma of the urinary bladder and the serum prostatespecific antigen response after prostate cancer vaccine administration with considerable accuracy (BunimovichMendrazitsky et al., 2016; BunimovichMendrazitsky et al., 2007; Kronik et al., 2010). Moreover, other investigators have shown that mechanistic models of interleukin21 (IL21) therapy schedules based on tumor mass and antigenic properties can predict growth patterns of multiple tumor types in patients receiving personalized doses of IL21 (de Pillis et al., 2005). In another notable work, mathematical modeling of neoantigen fitness based on cancer population evolution and antigen recognition potential was able to predict tumor response and patient survival following treatment with checkpoint blockade therapy, as validated in both melanoma patients treated with antiCTLA4 and in lung cancer patients treated with antiPD1 therapies (Łuksza et al., 2017). Finally, modeling of chemotherapy and targeted drug therapy by our group with similar mathematical techniques was able to accurately reproduce entire dose–response curves of tumor cell kill with untargeted cytotoxic and liganddirected agents (Pascal et al., 2013a; Hosoya et al., 2016; Dogra et al., 2018; Goel et al., 2019; Dogra et al., 2020b). These examples highlight the previous success of mathematical models to qualitatively or quantitatively discern the underlying biologically and physically relevant, often nonlinear processes present in cancer, which may otherwise be missed, and help optimize treatment delivery approaches.
The innovation and scope of the present study is to demonstrate how standard clinical measures, including radiological imaging assessments and immunohistochemical (IHC) analysis of tissue samples, may inform a mechanistic mathematical model to elucidate patient and tumorspecific attributes that would likely benefit from the application of checkpoint inhibitorbased therapy. Our model parameters, which can be determined in multiple ways (e.g., from early time point imaging and/or histopathology data), present a clear advantage over the standard measures currently used in the clinic for predicting treatment outcome and patient survival. The mathematical model presented in this report is based on an extensive series of prior methodological reports (Pascal et al., 2013a; Hosoya et al., 2016; Dogra et al., 2018; Goel et al., 2019; Dogra et al., 2020b; Das et al., 2013; Pascal et al., 2013b; Wang et al., 2016; Koay et al., 2014; Frieboes et al., 2015; Wang et al., 2015; Brocato et al., 2018; Cristini et al., 2017; Dogra et al., 2019; Brocato et al., 2019; Dogra et al., 2020a; Goel et al., 2020; Anaya et al., 2021), but with rigorous emphasis on parameters related to the cancerimmune response to the treatment with checkpoint inhibitors in the setting of patients with solid tumors. In particular, we have developed a model that contains key mechanistic biological and physical processes involved in checkpoint inhibitor therapy, which has been objectively validated against imaging data of patient tumor burden and measured response, defined by both change in total tumor burden and patient survival (Butner et al., 2020). To date, this modeling work has only examined tumor response on a bulk scale, without confirming the mechanistic links between key model parameters and the underlying biology they describe. In this work, we demonstrate how model parameters may be informed by using cellscale IHC analysis of biopsied tissues, which are available as early as at the start of treatment, for reliable prediction of therapeutic response. Thus, this represents a key step towards clinical translation (i) by validating prior results in additional patient cohorts and (ii) demonstrating how IHC may be used with the model to predict patient response at times closer to the start of treatment. The predictions obtained through this model were compared against available retrospective clinical data from published trials with monoclonal antibodies blocking the PD1/PDL1 pathway for validation, thereby focusing on a quantitative relationship between tumor response and its underlying mechanisms. Taken together, these retrospective results indicate that our predictive mathematical model with the class of PD1/PDL1 checkpoint inhibitors has translational merit and that it should therefore be evaluated as an integral predictive biomarker in the setting of carefully designed prospective cancer investigational trials.
Methods
Mathematical model
The final mathematical model we used for comparison with clinical data is a single ODE, which determines changes in relative tumor mass over time after initiation of antiPD1 therapy. It is a simplified and userfriendly version originated from a complex set of partial differential equations (PDEs), which takes into account spatial relationships within the tumor microenvironment. By reducing the model to this closedform solution, we present the model in a form that combines related biological processes (see Figure 1) into only a few easytointerpret values, while also ensuring the ability to obtain unique solutions when only minimal data are available. The full mathematical derivation has been demonstrated elsewhere (Butner et al., 2020) and also in Appendix 1, but its underlying cancer biology is also schematically depicted in Figure 1. It is ultimately represented by
where ρ′ is the normalized tumor mass, t is time, α is a proliferation constant of tumor cells, Λ is the ratio of cancer cells (σ) to effector immune cells multiplied by the ability of immune cells (ψ_{k}) to kill cancer cells (namely, the antitumor immune state), µ is the effect of immunotherapy on the ability of immune cells to kill cancer cells, and ${\rho}_{\infty}$ represents the final longterm tumor burden, which may be calculated as
In words, Equation (1) describes the timecourse tumor burden under immunotherapy intervention as a function of three key parameters α, Λ, and µ, which represent the ability of malignant tumor cells to grow, the ability of immune cells to kill cancer cells within the tumor environment, and the potential efficacy of checkpoint inhibitorbased immunotherapy treatment. Equation (2) states that longterm tumor burden is a function of the relationship between tumor growth and kill rates [the greater of which sets the sign of response, i.e., progressive disease ($\alpha \S amp;gt;\mu $) or favorable clinical response ($\alpha \S amp;lt;\mu $)], as scaled by the parameter $\mu \cdot \Lambda $. These three parameters are intrinsically abstract terms, which have multiple components in themselves. Λ is dependent on the density of malignant tumor versus immune cells at the initiation of immunotherapy (t = 0) and on the ability of each immune cell to kill tumor cells. Equation (1) assumes the following five assumptions: (i) tumor mass grows exponentially with rate α in the absence of immune cell killing; (ii) cancer cells are killed by one type of net effector immune cell (termed $\psi $_{k}); (iii) the influx of immune cells and the concentration of checkpoint inhibitor within the tumor environment reaches a drug steady state on time scales far faster (i.e., from hours to days) than clinical response (i.e., from weeks to months); (iv) all cancer cells are within an average intratumoral movement path length of effector immune cells, and the effect of cytokines in stimulating immune cell function/movement is small, thus spatial diversity effects are negligible in this particular case; and (v) the site of action of the checkpoint inhibitor takes place at the interface between the tumor cell and immune cell, with blocking of one of the immune checkpoint receptors, either on the cancer or on the immune cell, which are deemed necessary and sufficient for blockade of the immune inhibitory pathway.
While one must recognize that although these assumptions certainly represent a gross biological generalization of the true tumorimmune microenvironment complexity, they enable direct comparison with clinical response data, in which measurements of all the parameters required for a full description of immune system reaction to tumors are often not readily available in conventional clinical settings. In this retrospective mechanistic analysis, we start with a simple biophysical model to validate clinical findings and the mechanistic links between highlevel model parameters and the underlying biology they describe, with the possible addition of further refining variables as an aspirational goal for future prospective translational or clinical cancer studies. The mathematical simulations and procedures conducted are fully described in Appendix 1 (Appendix 1—figures 1 and 2); finally, we also address the implications of these arbitrary choices in ‘Discussion’.
Clinical evaluation and fitting of the mathematical model
To evaluate Equation (1) with clinical data, we initially focused on clinical trials with the class of checkpoint inhibitors that specifically target the PD1/PDL1 pathway. In this analysis, tumor volumes over time measurements were obtained from six published clinical trial reports (Borghaei et al., 2015; Antonia et al., 2015; Le et al., 2015; Motzer et al., 2015; Powles et al., 2014; Topalian et al., 2012). As an initial retrospective proof of concept, five clinical trials of antiPD1 therapy plus another clinical trial of antiPDL1 treatment (Table 1) were included, with the combined dataset representing several common solid tumor pathology types (n = 189 cancer patients); this dataset represents our model calibration cohort. An online tool (Web Plot Digitizer [Rohatgi, 2010]) served to extract the individual patient tumor response data over time from the published clinical trials, and subsequently Equation (1) was fit to these individual and collective datasets to derive the parameters α, Λ, and µ. A unique fit (and thus set of parameter values) was obtained for the data from each individual patient, and more details on this procedure may also be found in Butner et al., 2020.
The fitting procedure was performed in two sequential steps. First, the fastest progressing patient in each trial was fit to a simple exponential function to estimate α for each cancer histopathology (i.e., approximation for uninhibited tumor doubling time). Although this approximation represents a limiting factor in the study, it was necessary due to lack of pretreatment tumor measurement in the calibration cohort, which would allow for more accurate estimation of the growth kinetic without clinical intervention. As presented, α may represent the fastest tumor growth potential in the cohort and thus and overestimate the true average growth kinetic; however, because it likely represents the patient with the least response to treatment in each cohort, we take it to be a reasonable approximation of tumor growth kinetic unaffected by treatment. Second, this cancerspecific α was used to determine the remaining parameters, Λ and µ, for each individual patient by fitting Equation (1) to timecourse tumor burden data (Figure 2 and Appendix 1—figure 1). The curve fitting was performed by using nonlinear least squares and the Mathematica function NonLinearModelFit (Wolfram Research I, 2017). The workflow for deriving the model parameters from the clinical data is depicted in Appendix 1—figure 1.
The results from this model calibration were checked against an additional validation patient cohort from the University of Texas M.D. Anderson Cancer Center (MDACC). Briefly, data from patients (n = 64) with nonsmall cell lung cancer (NSCLC) treated with pembrolizumab (MK3475) were obtained from the investigational study NCT02444741 (a total of 95 patients were obtained; however, 18 left the study after admission, 11 were removed due to lack of pretreatment measurements, 1 was excluded because all indexed lesions were treated with XRT, and 1 discontinued treatment but continued followup). Serial lung tumor measurements were taken on postcontract chest CT images acquired with 2.5 mm slice thickness. All lesions that were previously indexed by the radiologist were included in order to best reproduce actual clinical practice (patients had a range of 1–12 indexed lesions; median, 3 indexed lesions). Lesion volumes were estimated as 3D spheres calculated from the geometric mean of long and short axes of each lesion at each followup, and then volumes were summed to generate a total volume for all indexed lesions at each followup time point. Because imaging data before the start of pembrolizumab treatment were available for each patient within this cohort, we were able to obtain a patientspecific pretreatment growth rate α for each patient between the imaging immediately preceding start of treatment and at the time of first immunotherapy treatment by using the exponential growth rate estimation $\rho \left(t\right)\approx {\text{e}}^{\alpha \bullet t}$ between tumor burdens measured via imaging collected before treatment and at start of treatment. These perpatient α values were then used in Equation (1) to obtain values for Λ and µ, as described above.
Sensitivity analysis and model simulations
A sensitivity analysis was next performed on Equation (1) in two independent ways (Appendix 1—figure 2). First, each of the parameters Λ and µ were individually altered by ±10% while holding the other parameters constant to simulate the change in relative tumor mass ($\rho `$). Second, individual values of $\rho `$ in the calibration cohort were varied by a uniform random variable by ±0.1 (i.e., 10%) and Equation (1) was refitted to these modified data to determine any changes in the derived values for Λ and µ. To determine stability of derived Λ and µ values over time, we also truncated the $\rho `$ data at differing time points (namely, t < 30, 60, 120, 200 days), and subsequently refitted Equation (1) to these truncated data to determine Λ and µ. The correlation between the derived values of Λ and µ from the truncated datasets and the full composite dataset was then determined by using Spearman correlation analysis. A simulated longterm RECIST v1.1 score was then determined for each patient by projecting the expected tumor burden at t = 700 days (chosen as a time point beyond the last reported followup for most of the patients) by using these values of Λ and µ derived from the truncated fits in order to compare the predicted RECIST scores to the known retrospective scores. Finally, full model simulations were performed at different levels of α, Λ, and µ to determine values of $\rho `$ based on input parameter variation.
Validation of Λ and µ by tumorinfiltrating immune cells and immunostaining
The parameters Λ and µ derived from fitting the clinical response data were compared to pathological biomarkers from additional antiPD1 or PDL1 clinical trials (Borghaei et al., 2015; Robert et al., 2015b; Brahmer et al., 2012; Tumeh et al., 2014; Motzer et al., 2015; Powles et al., 2014; Topalian et al., 2012; Garon et al., 2015; Herbst et al., 2014; Kefford et al., 2014; Spira et al., 2015; Taube et al., 2014; Weber et al., 2015), as shown in Appendix 1—table 1. Specifically, we investigated the ability of Λ and µ to correlate with CD8+ tumorinfiltrating lymphocytes (TILs) and tumor PDL1 expression, respectively. To compare the values of Λ and µ derived from fitting the clinical response data over time with the trials assessing TILs and PDL1 expression, the extracted cancer patients were categorized into objective responders versus nonresponders by applying RECIST v1.1 (Eisenhauer et al., 2009) note that resist categories determined by lesion diameter (D) or our volume estimation are mathematically comparable by $Volume=\frac{4}{3}\pi {\left(\frac{D}{2}\right)}^{3}$ . Specifically, patients from all RECIST categories (Eisenhauer et al., 2009) were condensed into two groups for analysis, with patients who had ≥30% reduction in tumor burden (partial or complete response) grouped into the ‘favorable’ response group, and patients with either <30% disease reduction (stable disease or disease progression) into the ‘unfavorable’ response group. The last recorded time point from the response curve from each individual patient served to designate the RECIST v1.1 category as this provides the most accurate longterm treatment outcome.
After stratifying each patient into a RECIST v1.1 category, Λ and µ were compared between responding and nonresponding patients by using a Wilcoxon rank sum test and compared to the literature either as a continuous variable (Λ) or by using thresholds (µ). For comparison with biomarkers reported in the literature, Λ was converted to an estimated intratumoral CD8+ T cell count (for details, the interested reader is referred to Butner et al., 2020) by assuming each CD8+ T cell would kill one tumor cell on average (the assumed mean ‘fitness’ of the immune cell population [Mempel and Bauer, 2008]), and that there were 5558 cells/mm^{2} in the tumor microenvironment, as has been quantitatively measured in melanoma (Erdag et al., 2012). For comparison with PDL1 staining, µ was converted from its raw numerical value to a percentage; no other scaling of the variable was performed as the number of cancer cells bound by antiPD1/PDL1 therapy action is the dominant term in the integral specified in Equation S4 in Butner et al., 2020; this approximation and its implications are further examined in ‘Discussion.’ Assessment of objective response rates (ORRs) at standard thresholds of 1% and 5% PDL1 staining (corresponding to µ) were used in this study. Other alternative PDL1 cutoffs present in some studies (Table 1 and Appendix 1—table 1) were not included in this analysis due to the relatively small numbers of patients. The biomarker trials used for validation of these parameters are also presented in Appendix 1—table 1.
Results
Quantification of model parameters α, Λ, and µ
Measured tumor burden data over time from the total collective calibration patient pool (n = 189) after initiating checkpoint inhibitorbased therapy were extracted from the six reported clinical trials (Table 1). Of this population, 55 patients (29%) had objective responses by RECIST v1.1 criteria while 134 patients (71%) demonstrated stable/progressive disease. The derived tumor proliferation constant, α, was determined to range from 0.0034 (tumor doubling time of ~200 days) in renal cell carcinoma to 0.0622 (tumor doubling time of ~11 days) in nonmismatch repair colon cancer with an average α of 0.018 and average tumor doubling time of ~85 days (Table 1). In the validation NSCLC cohort, 25 patients (39%) had objective response, and 39 patients (61%) demonstrated stable/progressive disease, with individual patient proliferation constants (α) ranging from –0.0129 to 0.0602 (note that α < 0 indicates a subset of patients that had a shrinking tumor burden before start of therapy).
Having determined α for each cancer histopathology, the clinical response curves to checkpoint inhibitors for each tumor type were fit to Equation (1) to determine Λ and µ. The average rootmeansquare error of fitting Equation (1) to the clinical data was only 0.4%. A sample exponential fit to extracted melanoma timecourse data is shown (Appendix 1—figure 1, second panel, red curve). A sample of these curve fits for patients with melanoma from the trial by Topalian et al., 2012 is depicted in Figure 2. The derived fits from all 189 patient response curves in the calibration cohort yielded a mean ± standard error of the mean (SEM) value of Λ of 0.714 ± 0.257 in patients with partial/complete response and 0.0995 ± 0.0264 in patients with stable/progressive disease (p=0.119; Wilcoxon twotail), while the mean ± SEM value of µ was 0.054 ± 0.014 versus 0.013 ± 0.0012 in partial/complete response versus stable/progressive disease, respectively (p<0.001). In the validation NSCLC cohort, the mean ± SEM value of Λ was 0.876 ± 0.102 in patients with partial/complete response and 0.0297 ± 0.469 in patients with stable/progressive disease (p<0.001), while the mean ± SEM value of µ was 0.0529 ± 0.00982 to –0.0064 ± 0.0032 in partial/complete response versus stable/progressive disease (p<0.001). These values are depicted in Figure 3. These results suggest that, while Λ and µ give significantly separated classification ranges for partial/complete response versus stable/progressive disease, the specific value of the binary classification threshold is likely a function of the unique diseasedrug combination used, as observed in our prior studies (Butner et al., 2021). This point was explored further through a ‘leaveonecancertypeout’ validation study within the calibration cohort, wherein one cancer type was removed from the calibration cohort and used as validation against the parameter ranges in the reduced calibration set obtained from Borghaei et al., 2015; Antonia et al., 2015; Le et al., 2015; Motzer et al., 2015; Powles et al., 2014; and Topalian et al., 2012. Results revealed that mean ranges vary between individual cancer types, and that μ shows more consistent significant difference between response categories relative to Λ (these results are consistent with the results shown in Butner et al., 2020; these results are shown in Appendix 1—figure 3, Appendix 1—figure 3—source data 1, and explored further in ‘Discussion’).
Confirmation of model stability
The parameters α, Λ, and µ were held constant during the sensitivity analysis at three different values for each parameter. These values were the minimum, average, and maximum values derived from fitting Equation (1) to the data as described above. Changing Λ and µ from these values ± 10% yielded a maximum change in $\rho `$ of 9.2% at t = 200 days. Varying ${\rho}^{`}$ data points randomly up to ±0.1 resulted in Spearman correlation coefficients of 0.766 and 0.919 for Λ and µ, respectively, derived from fitting these randomized data versus the actual clinical data. We note that, due to high stability observed to these perturbations in the large calibration cohort (n = 189 patients), we did not repeat this sensitivity analysis in the validation cohort. Λ values derived from fitting truncated $\rho `$ data for the calibration cohort at various time points displayed Spearman correlation coefficients with Λ derived from fitting the full data ranging from 0.071 at t < 30 days to 0.730 at t < 200 days, while Spearman correlation coefficients for µ ranged from 0.910 (t < 30 days) to 0.921 (t < 200 days) for all truncated datasets (Table 2). In the validation cohort, Spearman correlation coefficients for Λ ranged 0.800 at t < 30 days to 0.771 at t < 200 days, and 0.800 at t < 30 days to 0.989 at t < 200 days for μ. Simulated RECIST v1.1 categorization (complete/partial response vs. stable/progressive disease) by using the values of Λ and µ derived from fitting the truncated calibration cohort data resulted in 19% misclassification by using data from t < 60 days (n = 177) and 13% misclassification at t < 200 days (n = 189) when compared to the full dataset. In the validation cohort, we observed 18.6% misclassification at t < 60 days (n = 43) and 10.9% misclassification at t <200 days (n = 64). Predicted normalized tumor mass after initiation of immunotherapy is depicted at different combinations of α, Λ, and µ (Figure 4).
Analysis of model predictions
Using the binary classification of tumor response of partial/complete response as positive and stable/progressive disease as negative, receiveroperator characteristic (ROC) analysis was performed to identify optimal predictive response thresholds for μ and Λ (Figure 3, insets) as higher values of μ and Λ are expected to correspond to a more favorable patient response. Identification of optimal predictive thresholds in the calibration cohort by maximizing the Youden’s J statistic revealed cutoff thresholds where sensitivity (the proportion of correctly identified patients with partial/complete response) was higher for both Λ and μ (0.945 and 0.891, respectively), while specificity was reduced (0.381 for Λ and 0.567 μ). Response prediction thresholds were identified as 0.722 for Λ and 0.00905 for μ. Testing these same response threshold values (identified in the calibration cohort) in the validation cohort revealed sensitivity of 0.6 for Λ and for 0.960 μ, and specificity of 0.743 for Λ and 0.769 for μ in the validation cohort. We found that, in the calibration cohort, positive predictive values (PPVs; the probability a patient will be a responder when the model predicts they will be a responder: 0.381 for Λ and 0.458 for μ) were lower than the negative predictive values (NPVs; 0.889 for Λ and 0.927 for μ), indicating fewer falsenegatives (stable/progressive disease) than falsepositives (partial/complete response). This trend was confirmed in the validation cohort (a PPV of 0.600 for Λ and 0.723 for μ; an NPV of 0.744 for Λ and 0.968 for μ). Lastly, we note that overall accuracy was lower by both Λ and μ in the calibration cohort than were observed in the validation cohort; this is presumably due to the misbalance of patients with ORR (29% = 55 patients) versus patients with stable/progressive disease (71%, = 134 patients) while sensitivity (correctly identified patients with ORR) was higher than specificity.
Comparison of Λ and µ with clinical and pathological data
Conversion of Λ to an estimated intratumoral CD8+ T cell count yielded a predicted mean ± SEM of 3970 ± 1429 cells/mm^{2} in patients with partial/complete response to checkpoint inhibition and 553 ± 147 cells/mm^{2} in patients with stable/progressive disease (Figure 5) in the calibration cohort, and 4871 ± 567 cells/mm^{2} versus 165 ± 2,604 cells/mm^{2} in patients with partial/complete response versus stable/progressive disease in the validation cohort. Intratumoral CD8+ T cell counts referenced from Tumeh et al., 2014 are derived from two scenarios, either encompassing data in the pretreatment setting only (n = 46) or data including pathological CD8+ T cell counts encompassing both pretreatment and on treatment (n = 23). In the pretreatment setting, the average intratumoral CD8+ T cell count was 2632 ± 518 cells/mm^{2} in patients with partial/complete response to checkpoint inhibitors and 322 ± 133 cells/mm^{2} in patients demonstrating stable/progressive disease, while including ontreatment plus pretreatment data yielded an average of 3770 ± 675 cells/mm^{2} in patients with partial/complete response and 501 ± 113 cells/mm^{2} in patients with stable/progressive disease.
For comparison of µ with PDL1 staining, data were extracted from a total of 12 published clinical trials (n = 975 cancer patients) with data on clinical response by using a PDL1 staining cutoff of ≥1% versus <1% and 1492 patients with a cutoff of ≥5% versus <5%. The ORR of patients with these thresholds was 15% for PDL1 staining <1% and 28% for PDL1 staining ≥1%, and 21% and 38% for <5% and ≥5%, respectively. The ORR of patients in the calibration cohort by using the same thresholds in the derived values of µ was therefore 10% and 45% for patients with µ < 1% and ≥1%, respectively, and was 25% and 67% for patients with µ < 5% and ≥5%, while in the validation cohort, the modelpredicted ORR was 9% and 71% for patients with µ < 1% and ≥1%, respectively, and was 29% and 100% for patients with µ < 5% and ≥5% (Figure 5—source data 1).
Discussion
The translational mathematical model introduced here retrospectively investigates the molecular, cellular, and biophysical mechanism(s) behind patient response to immune checkpoint inhibitor therapy, and it explores the potential clinical value of the early prediction of treatment effectiveness. The reductionist approach introduced here simplifies the rather complex biological crosstalk between the immune system with cancer cells to a set of differential equations ultimately dependent on three intrinsic parameters that represent tumor proliferation, intratumoral immune cell presence/killing efficiency (considered as baseline immune cell infiltration), and checkpoint inhibition efficacy; this latter parameter represents the active role of the checkpoint inhibitors in initiating and amplifying the antitumor response. We have demonstrated that these parameters may be informed using either imaging or IHC measures, enabling for robust implementation even when some of these measures are unavailable, and offering methods to calculate PDL1 density and intratumoral CD8+ T cell counts without invasive methods or to inform the model using only IHC measures at start of treatment.
From a mechanistic standpoint, the mathematical model is fully compatible with the current understanding of the biological effect of checkpoint blockade and clinical subtypes of immunotherapyresponsive tumors (Teng et al., 2015). First, the parameter values derived from our model for intratumoral immune cell infiltration (Λ) and immunotherapy antibody efficacy (µ) correspond with the quantitative pathological measures of intratumoral CD8+ T cell infiltration and PDL1positive staining (Figure 4; Figure 5). Although parameters Λ and μ are not direct representations of CD8+ T cell counts or PDL1 expression, they correlated with measured pathological biomarkers, thus providing strong experimental evidence that these mathematical parameters may be quantified from these biomarkers. This offers a unique method for clinical translation using clinically available measured quantities to provide personalized prediction of patient response, a potential that we have often found particularly lacking in prior mechanistic modeling research. By demonstrating links between model parameters and known biology, our model may also be used to quantify the associated mechanistic underpinnings of treatment failure, with the goal of suggesting clinical interventions that may overcome these deficiencies to improve patient outcome. Unfortunately, the absence of perpatient measurements of CD8+ T cell counts or PDL1 expression corresponding to individual patient response to treatment in the cohorts examined prevented analysis of how these parameters perform on a perpatient basis. We are currently collecting inhouse a new dataset containing both measures for single patients, and the results of this study will be published once complete. We are also investigating how these biomarkers may inform the model to prospectively identify pseudoprogression (a clinical phenomenon wherein lesions are observed to experience rapid volume increase followed by disease response) on a perpatient basis, based on our previous result that retrospectively known longterm parameter values are unique in pseudoprogression (Butner et al., 2020).
Moreover, the model demonstrates that the rate of immunotherapy response arises from nonlinear interactions between both adequate effector immune cell infiltration in tumors and immunotherapy efficacy within the tumor microenvironment (Figure 4). Mechanistic mathematical modeling across several cancer types (Table 1) suggests that the overlying master equation [i.e., Equation (1)] may well be universally applied, while the model parameter values presented here are likely tumorspecific. Indeed, this conclusion may be observed in Figure 3, where Λ and µ are distinctly different for ORR versus stable or progressive disease, but the values are different for the different cohorts, thereby capturing mechanistic differences across the different cancer types. However, further testing of the model in additional cancer types and larger patient cohorts remains necessary, in part due to the heterogeneous nature of solid tumors, both within individual patients and across different cancer types. The nontrivial interaction of these inputs predicted by this model is different in comparison to many established and current candidate biomarker studies (which focus on linear immunotherapy responses to just one of these markers, e.g., PDL1 expression [Carbognin et al., 2015]), perhaps providing a plausible explanation why single biomarkers sometimes provide inconsistent predictions across multiple cancer types (Carbognin et al., 2015). Similarly, the unique insights presented herein allow for a more direct comparison of the model parameters to common pathological markers either reported in the literature or as yet to be reported, and provide a different perspective compared to other immune checkpoint inhibitor modeling studies (Łuksza et al., 2017; Wilkie and Hahnfeldt, 2013; Serre et al., 2016).
The high sensitivity at early time points (t < 60 days) demonstrated here indicates that this model may provide valuable early identification of cancer patients most likely to benefit from checkpoint inhibition therapy, with a high NPV indicating minimal falsenegatives in the group predicted to receive the least benefit from therapy. Therefore, this attribute could serve as an early indicator of the efficacy of checkpoint inhibition by simply using either clinical or imaging measurements of index tumor lesions and by applying the mathematical model to project maximal effect for an individual patient. Further, the model demonstrates robustness and stability in the derived values of Λ and μ to random variations in $\rho `$, and by using values of Λ and μ derived from fitting limited data over time (Table 2). This result is in stark contrast to previous mathematical descriptions of immunemediated killing of cancer cells, where a relatively small change in a single input parameter of merely ±1% could dramatically vary tumor response by up to ±50% (de Pillis et al., 2005).
Towards clinical translation of cancer models, a mathematical model with two simple terms (an exponential growth term and a regression term) has previously been used to successfully correlate and predict patient survival (Stein et al., 2008). The fundamental difference between this prior model (Stein et al., 2008) and the model reported here is that the quantification of cell death is more mechanistic, that is, depending on the specific mechanisms (Figure 1) that underlie cancer cell and immunotherapy drug interactions, hence allowing one to gain more mechanistic insight into the treatment system. As demonstrated, model parameters Λ and µ can be used to stratify responding and nonresponding patients (Figure 3), which has immediate potential utility as a biomarker for patient selection in prospective clinical trials. Moreover, the model contains biological values that, at least in principle, may either be measured (e.g., tumor burden, growth rate) or changed clinically (µ: e.g., drug dosing or dosing schedule; $\Lambda $: e.g., radiotherapyinduced increase in PDL1 expression, radiotherapy abscopal effect). Thus, we expect that the model (i) may be informed by using standardofcare measurements (e.g., Figure 5) and (ii) can provide quantitative information on which parameters must be changed to maximize therapeutic benefit, providing clinicians with potentially valuable decisionmaking information.
Of course, as with any biomathematical model, this model is based on several educated assumptions about the interactions of immune effectors (CD8+ T cells) with cancer cells, which naturally does not reflect the sheer magnitude and complexity of the tumorimmune microenvironment. These include the presence of immunosuppressive cells (such as neutrophils, myeloidderived suppressor cells, Treg, and M2polarized tumorassociated macrophages), which extensively infiltrate into tumor mass and hinder the efficacy of immune checkpoint inhibitors (Jenkins et al., 2018). Moreover, effects of other immune cells, such as CD4+ T cells, are not explicitly included in the current model form; while it is likely that some of the effects from these other cells are captured within the model representation of CD8+ T cells, this assumption warrants future model development where each immune component is explicitly represented. Some of these assumptions (e.g., lack of immune cell/antibody infiltration terms) are supported with in vitro or in vivo data such as high rates of immune cell binding by antiPD1 antibodies in the peripheral circulation (Topalian et al., 2012) and local intratumoral (as opposed to systemic myelogenous) expansion of CD8+ T cells during checkpoint inhibition (Tumeh et al., 2014; Ribas et al., 2016). Other potential model parameters would logistically be quite challenging to measure in an outpatient clinicready setting routinely at this time, and their inclusion remains open for followup confirmatory retrospective studies and ultimately for future prospective clinical trials. As such, the integration of additional prognostic biomarkers and master regulators that intrinsically (e.g., genetic stability) and extrinsically control the efficacy of immunotherapy represents open research areas for future translational investigation.
Moreover, the two response parameters derived from the model, Λ and μ, represent abstract terms, which may not precisely model intratumoral CD8+ lymphocyte infiltration and tumor cell PDL1 staining and potentially can be modified by relevant determinants, such as tumor mutational burden (Goodman et al., 2017) or the human microbiome of the gut (Gopalakrishnan et al., 2018). Moreover, these parameters represent a simplifying average value of these fluctuating parameters. We take this to be a reasonable assumption because (i) these quantities are not directly measurable (this is also the case for cell proliferation events underlying the tumor growth rate parameter α), (ii) in our inhouse data, times between patient reassessment ranged 17–91 days, so even direct measurement of these would provide an average value over this time period, and (iii) these processes take place at far shorter times than patient reassessments. By presenting the model in this reduced form, we enable simplified interpretation of the results while also providing a single, easytounderstand scalar that contains significant information about the treatment response.
Finally, a few operational aspects of this work deserve further comment. Response was defined using RECIST v1.1 criteria, in lieu of the more recent adaptation of RECIST criteria to immunotherapy treatment (iRECIST), because the studies used to derive the model parameters in this mechanistic analysis did not incorporate these new criteria, and our use of published data for the calibration cohort limited or precluded us from obtaining a priori knowledge about nontarget lesions. Moreover, our work to date has only involved testing our model in a handful of cancer types, and further validation in additional disease phenotypes and larger total patient populations remains outstanding. These inherent study limitations notwithstanding, we will continue to improve our model to more accurately predict outcomes in the immunotherapyspecific setting of individualized cancer patients, potentially through the inclusion of additional biomarkers, as bioactive molecules such as IFNγ, CD206, CX3CR1, CD1D, and iNOS, along with cellmediated mechanisms known to have an effect on immune checkpoint inhibitor therapy (Park et al., 2018; Gubin et al., 2018). However, we do not anticipate that such improvements and refinements would change the basic findings and the potential value of the mechanistic mathematical model reported in this work. In going forward, one hopes that knowledge broadly applicable across multiple cancer types and/or immune microenvironment control will be serially incorporated to the dynamic model upon their future discovery and validation.
In conclusion, we present a mechanistic model of immune checkpoint inhibition that is able to describe various immunotherapy response profiles, with the inputs to the model correlating with common pathological biomarkers in current clinical use. An early and robust a priori predictor for checkpoint inhibition response and outcome might provide a glimpse of the immense potential for timely adjustments and therapeutic personalization. Future prospective investigations of this computational scienceassisted approach will focus on readily available clinical data as inputs to the model and further refining the complex interplay between the immune system and the cancer environment to extract other important variables for immunotherapy efficacy. Towards this end, we are currently pursuing inhouse collection of perpatient, paired IHC measures of PDL1 or T cell counts with tumor response, which will enable us to directly correlate IHC measures of interest with tumor response, while also allowing for examination of additional patient parameters, such as tumor stage or patient age. We will also obtain data from different cancer types known to be receptive to ICI therapy in order to further evaluate the potential of our model to perform as a more universal predictor across an even more diverse array of cancer phenotypes. Ultimately, the merit of this approach will rely on its future ability to reliably predict early individual patient response with the goal of improving personalized cancer care.
Appendix 1
Immunotherapy model derivation
The mechanistic model of cancer immunotherapy presented in this paper is essentially an extension of our prior mathematical model describing chemotherapy response that has been shown to predict cancer cell kill with various cytotoxic or targeted agents. In particular, we extend these equations to consider immune cell presence as necessary for cancer cell kill secondary to immunotherapy, and that immune cells respond to not only diffusion gradients within a tumor microenvironment but can also be influenced by the presence of cytokines through cell signaling pathways and chemotaxis. The model hypothesis of immunotherapy presented here is that these molecules or antibodies diffuse and block the complimentary interaction between immune checkpoint ligands expressed on cancer cells with their complementary proteins on immune cells. This mechanism renders effector immune cells potentially active for cancer cell killing; thus, the therapeutic site of action occurs within the tumor microenvironment (i.e., because our model only describes the tumor region and the factors and processes contained therein, we have made the assumption that all key mechanisms such as drug binding occur only within the tumor). These assumptions lead to the following system of PDEs:
Here $\rho $, ${\psi}_{k}$ , $\sigma $, and $C$ represent the local concentration of cancer cells, therapeutic immune cells, immunotherapy antibodies, and cytokines, respectively. Moreover, α is the proliferation rate of tumor cells, ${\lambda}_{p}$ is the specific death rate of cancer cells, $\lambda $ is the binding coefficient of immunotherapy antibodies that block the interaction between immune inhibitory ligands on cancer cells and their counterparts on immune cells, ${\Lambda}_{\psi}$ is the coupling of immune cell activity relative to the number of cancer cells (i.e., killing and tumor infiltration capacity of the host immune system), χ is the chemotaxis coefficient, Λ_{C} is the number of released cytokines as immunotherapy antibodies are bound, and D_{a} and D_{c} are the diffusivity of antibodies and cytokines (see also Figure 1, main text).
The first equation describes the death rate of cancer cells as proportional to the concentration of therapeutic immune cells and the time history of immunotherapy antibody uptake and binding within the tumor environment. The second equation represents mass conservation of ‘effective’ therapeutic immune cells, including the rate at which these cells become ineffective at killing cancer cells. The third and fourth equations represent the concentration and diffusion of immunotherapy antibodies and cytokines in the presence of immunotherapy antibody binding.
To compare this mechanistic model with clinical immunotherapy data, we make multiple reductionist assumptions about the influence of various parameters from Equation A14 on immunotherapy response. These assumptions suppose that immune cells are in relatively close physical proximity to cancer cells, allowing one to ignore the influence of cytokines to guide their movement, and that the primary effector immune cell response to immunotherapy is driven by the presence of immune cells already present in the tumor at the initiation of the therapy. Furthermore, we assume uptake and binding of immunotherapy antibodies within the tumor environment to occur diffusely relative to cancer and immune cell concentrations; therefore, in this model, we set $\chi =0$ and remove the influence of immunotherapy antibody concentration. These assumptions reduced the four PDEs to the following set of coupled ODEs as follows:
Here, Equation A6 represents the change in tumorinfiltrating immune cells relative to the change in cancer cells killed by the immune system. In the case where immune cell killing is weak (i.e., ${\lambda}_{p}\cong 0$, and thus $\frac{d\rho}{dt}=\alpha \cdot \rho $), the change in immune cell concentration within the tumor is negligible compared to the growth of the tumor, thus Equation A6 satisfies the relationship $\frac{d{\psi}_{k}}{dt}\cong 0$ (i.e., immune cell concentration within the tumor remains roughly constant over time). When immune cell killing is sufficiently effective, then
${\lambda}_{\text{p}}\cdot \rho \cdot {\psi}_{k}\cdot {\int}_{0}^{t}\lambda \cdot \sigma \cdot \rho \cdot \text{d}{t}^{`}\S amp;gt;\alpha \cdot \rho $ and thus $\frac{d{\psi}_{k}}{dt}\cong {\Lambda}_{\psi}\cdot \left(\frac{d\rho}{dt}\right)$ . In essence, the change in immune cells in the tumor over time is roughly equal to the immune cell coupling to cancer cells [i.e., immune cells in the tumor environment are related to the tumor volume by a coupling factor (${\Lambda}_{\psi}$)], which captures the immunogenicity of an individual tumor. Integrating both sides of Equation A6 over time leads to the following relationship:
where ${\rho}_{0}$ and ${\psi}_{0}$ represent the concentration of tumor cells and immune cells at the start of immunotherapy. We then substitute ${\psi}_{k}$ from, Equation (A7) into Equation (A5) express ${\Lambda}_{\psi \rho}={{\Lambda}_{\psi}\cdot \rho}_{0}/{\psi}_{0}$ , and replace the concentration of tumor cells $\rho $ by a proportion of the original tumor cell concentration to obtain the normalized tumor mass ${\rho}^{`}=\rho /{\rho}_{0}$ (while noting that one may also use normalized tumor volume here) to give us one equation of tumor response represented by
Finally, we assume that the binding of immunotherapy antibodies within the tumor environment (${\int}_{0}^{t}\lambda \cdot \sigma \cdot \rho `\cdot \text{d}{t}^{`}$) and rate of tumor cell death secondary to effector immune cells (${\lambda}_{\text{p}}$) reach constant steady states on time scales faster than measurable tumor cell kill, implying ${\lambda}_{\text{p}}\cdot {\int}_{0}^{t}\lambda \cdot \sigma \cdot {\rho}^{\prime}\cdot \text{d}{t}^{\prime}=\mu$, leading to
This ODE can be solved analytically for $\rho `$ by integrating both sides, yielding
Equation A10 is the same as Equation (1) in the primary text, which is used to compare with clinical response data to immunotherapy.

Appendix 1—figure 3—source data 1
 https://cdn.elifesciences.org/articles/70130/elife70130app1fig3data1v2.xlsx
Data availability
No new clinical patient data was produced in this study. Data used that was obtained from literature is available in the original publications; we have been careful to cite each of these in the manuscript. Interested researchers should reach out directly to the primary authors of these studies. Data for the inhouse clinical trial cohort are from the study reported in PMC7555111; interested researchers should contact the authors of this publication with any data requests.
References

Nivolumab versus docetaxel in advanced nonsquamous nonsmallcell lung cancerThe New England Journal of Medicine 373:1627–1639.https://doi.org/10.1056/NEJMoa1507643

Safety and activity of antiPDL1 antibody in patients with advanced cancerThe New England Journal of Medicine 366:2455–2465.https://doi.org/10.1056/NEJMoa1200694

Mathematical model of BCG immunotherapy in superficial bladder cancerBulletin of Mathematical Biology 69:1847–1870.https://doi.org/10.1007/s115380079195z

Improving bacillus CalmetteGuérin (BCG) immunotherapy for bladder cancer by adding interleukin 2 (IL2): A mathematical modelMathematical Medicine and Biology 33:159–188.https://doi.org/10.1093/imammb/dqv007

Predicting immunotherapy response through genomicsCurrent Opinion in Genetics & Development 66:1–9.https://doi.org/10.1016/j.gde.2020.11.004

An Introduction to Physical Oncology: How Mechanistic Mathematical Modeling Can Improve Cancer Therapy OutcomesAn Introduction to Physical Oncology, An Introduction to Physical Oncology: How Mechanistic Mathematical Modeling Can Improve Cancer Therapy Outcomes, CRC Press, Taylor & Francis Group, 10.4324/9781315374499.

Mathematical modeling in cancer nanomedicine: a reviewBiomedical Microdevices 21:e3802.https://doi.org/10.1007/s1054401903802

Imageguided mathematical modeling for pharmacological evaluation of nanomaterials and monoclonal antibodiesWiley Interdisciplinary Reviews. Nanomedicine and Nanobiotechnology 12:e1628.https://doi.org/10.1002/wnan.1628

A mathematical model to predict nanomedicine pharmacokinetics and tumor deliveryComputational and Structural Biotechnology Journal 18:518–531.https://doi.org/10.1016/j.csbj.2020.02.014

New response evaluation criteria in solid tumours: Revised RECIST guideline (version 1.1)European Journal of Cancer 45:228–247.https://doi.org/10.1016/j.ejca.2008.10.026

Pembrolizumab for the treatment of nonsmallcell lung cancerThe New England Journal of Medicine 372:2018–2028.https://doi.org/10.1056/NEJMoa1501824

Tumor mutational burden as an independent predictor of response to immunotherapy in diverse cancersMolecular Cancer Therapeutics 16:2598–2608.https://doi.org/10.1158/15357163.MCT170386

Mechanisms of resistance to immune checkpoint inhibitorsBritish Journal of Cancer 118:9–16.https://doi.org/10.1038/bjc.2017.434

Transport properties of pancreatic cancer describe gemcitabine delivery and responseThe Journal of Clinical Investigation 124:1525–1536.https://doi.org/10.1172/JCI73455

PD1 Blockade in tumors with mismatchrepair deficiencyThe New England Journal of Medicine 372:2509–2520.https://doi.org/10.1056/NEJMoa1500596

Intravital imaging of CD8+ T cell function in cancerClinical & Experimental Metastasis 26:311–327.https://doi.org/10.1007/s1058500891969

Senescence markers: Predictive for response to checkpoint inhibitorsTernational Journal of Cancer 144:1147–1150.https://doi.org/10.1002/ijc.31763

Nivolumab for metastatic renal cell carcinoma: results of a randomized phase II trialJournal of Clinical Oncology 33:1430–1437.https://doi.org/10.1200/JCO.2014.59.0703

The blockade of immune checkpoints in cancer immunotherapyNature Reviews. Cancer 12:252–264.https://doi.org/10.1038/nrc3239

Future prospects of immune checkpoint blockade in cancer: from response prediction to overcoming resistanceExperimental & Molecular Medicine 50:1–13.https://doi.org/10.1038/s1227601801301

Cancer immunotherapy: it’s time to better predict patients’ responseBritish Journal of Cancer 125:927–938.https://doi.org/10.1038/s4141602101413x

Immune checkpoint blockade in cancer therapyJournal of Clinical Oncology 33:1974–1982.https://doi.org/10.1200/JCO.2014.59.4358

PD1 blockade expands intratumoral memory T cellsCancer Immunology Research 4:194–203.https://doi.org/10.1158/23266066.CIR150210

Nivolumab in previously untreated melanoma without BRAF mutationThe New England Journal of Medicine 372:320–330.https://doi.org/10.1056/NEJMoa1412082

Pembrolizumab versus ipilimumab in advanced melanomaThe New England Journal of Medicine 372:2521–2532.https://doi.org/10.1056/NEJMoa1503093

WebsiteWebPlotDigitalizer: HTML5 based online tool to extractnumerical data from plot imagesAccessed April 11, 2019.

Classifying cancers based on Tcell infiltration and PDL1Cancer Research 75:2139–2145.https://doi.org/10.1158/00085472.CAN150255

Safety, activity, and immune correlates of antiPD1 antibody in cancerThe New England Journal of Medicine 366:2443–2454.https://doi.org/10.1056/NEJMoa1200690

Simulating cancer growth with multiscale agentbased modelingSeminars in Cancer Biology 30:70–78.https://doi.org/10.1016/j.semcancer.2014.04.001
Decision letter

Caigang LiuReviewing Editor; Shengjing Hospital of China Medical University, China

Mone ZaidiSenior Editor; Icahn School of Medicine at Mount Sinai, United States
Our editorial process produces two outputs: i) public reviews designed to be posted alongside the preprint for the benefit of readers; ii) feedback on the manuscript for the authors, including requests for revisions, shown below. We also include an acceptance summary that explains what the editors found interesting or important about the work.
Decision letter after peer review:
Thank you for submitting your article "Early prediction of clinical response to checkpoint inhibitor therapy in human solid tumors through mathematical modeling" for consideration by eLife. Your article has been reviewed by 3 peer reviewers, and the evaluation has been overseen by a Reviewing Editor and Mone Zaidi as the Senior Editor. The reviewers have opted to remain anonymous.
The reviewers have discussed their reviews with one another, and the Reviewing Editor has drafted this to help you prepare a revised submission.
Essential revisions:
1) Multiple parameters could reflect treatment efficacy of immunotherapy to cancers while three specific parameters were included in this model, the selection process and reasons for choosing these three parameters need to be described in detail, the definition of the three parameters seems to in discordant with clinical setting which requires further explanation.
2) As a heterogenous entity, different types of cancer possess distinct features, rationality to employ this model as one fitinall with a relatively small training cohort requires additional discussion after validated in a largescale validation cohort.
3) As a novel model evaluation of the sensitivity and accuracy needs to be addressed, and comparison of this model with current applied parameters for predicting immunotherapy treatment efficacy requires to be supplemented.
Reviewer #1 (Recommendations for the authors):
The authors demonstrate a translational mathematical model dependent on three key parameters for describing efficacy of checkpoint inhibitors in human cancer. This paper describes some very interesting work. However, it remains questions that need to be addressed as follows:
1. The authors need present why they select these three parameters for describing efficacy of checkpoint inhibitors in human cancer. How about other factors?
2. How can the authors make sure that this model is suitable for all human cancer? Is the clinical tumor response dataset (n = 189 patients) enough?
3. What is the accuracy of the model?
4. I think there is no scope of this work shown in this "Introduction part".
5. Please identify the innovation point in the paper.
Reviewer #2 (Recommendations for the authors):
1. Several claims would need to be revised or toned down throughout the manuscript that describe the mathematical model as mechanistic and translatable. For example,
a. 'senescence markers (10) such as CD27, Tim‐3, CD57, and/or T‐cell receptor (TCR) repertoires (11)'. Only CD57 is a senescence marker among the listed proteins.
b. 'Our method may be implemented directly into clinical practice, as it relies on standard‐of‐care imaging and pathology' in Line 79,
c. 'the merit of this approach will rely on its future ability to reliably predict early individual patient response with the goal of improving personalized cancer care.'
d. Line 204205: 'the model may be adaptable to other types of checkpoint inhibitors affecting the CTLA‐4 pathway'. Studies have described the effect of aCTLA4 therapies to be mediated in peripheral lymphoid organs rather than the tumour microenvironment. Therefore, the model that considers chemotaxis and intratumoural Ab activity will likely not apply to aCTLA4 therapies.
2. In addition to the assumptions described in the text, are the following assumptions implied:
a. The parameters µ and λ are considered constant over time, which implies that the ratio of cancer cells and immune infiltration remains the same over the treatment course.
b. Every cancer cell is assumed to have the same proliferation rate.
c. Each tumour cell has the same likelihood to be recognized by the immune system.
3. The cutoff used to split the µ and λ parameters and compare to responders and nonresponders was done 'by maximizing the Youden's J statistic revealed cutoff thresholds where sensitivity'; A correction method needs to be applied to account for multiple testing correction. How does the result compare to using the median values and quantiles? How do changes in cutoff affect the Roc curve?
4. The manuscript text refers to Table S1 and supplementary figures, but supplementary information wasn't submitted for review. Unfortunately, the work on model sensitivity was not provided.
5. The work builds on a recent publication by the authors.
a. The manuscript could benefit from more information on the model, even if described previously elsewhere. For example, parameters in the methods and in Figure 1 are introduced but not described.
b. Importantly, the novelty of this study compared to the previous publication should be highlighted. The comparison with clinical and histology data appears to be the main novelty, which is why demonstrating the correlation with clinical data per patient, would be crucial. It is not clear why the calibration and validation dataset are different from the previous study. How do the estimated parameters differ?
6. The sensitivity and specificity of the λ parameter are low. Does it have an additive predictive value to the µ parameter?
7. How does the predictive value of the parameters compare to previously reported biomarkers of response such as TMB, % PDL1+ cells, T cell count? A predictive model controlling for age, stage, etc. should be implemented to account for these confounders.
8. How does the predictive value of the parameters change when the validation cohort is swapped with one of the calibration cohorts? How robust are the parameters to crossvalidation? How does the predictive classification change with a different cancer type?
9. There is a lack of robust predictors of immunotherapy response due to a large number of confounder, such as heterogeneity at molecular and cellular level, immune escape, immunosenescence, etc. How does the model overcome this?
10. Figure 4 could be more clear and informative.
11. The published cohort in JITC has 72 patients but only 64 patients were included in the validation cohort, why is that so?
12. Which other pathological parameters correlate with λ and µwhen measured per patient, not per response group?
Reviewer #3 (Recommendations for the authors):
1) In the Figure 2, it would be helpful to show different cancer types as different panel.
2) An additional figure is favorable to show whether those key parameters were cancer typespecific or the antibody drugs.
https://doi.org/10.7554/eLife.70130.sa1Author response
Essential revisions:
1) Multiple parameters could reflect treatment efficacy of immunotherapy to cancers while three specific parameters were included in this model, the selection process and reasons for choosing these three parameters need to be described in detail, the definition of the three parameters seems to in discordant with clinical setting which requires further explanation.
This point is welltaken. Mathematically, these are modelderived terms (i.e., after simplifying the initial model, composed of a system of differential equations) that allow for a closedform solution, reducing the equation in a way that makes it possible to obtain unique solutions when only a few data points are available (a relatively common problem when using clinicallyderived data), while also providing easy to interpret results and direct comparison to measurable parameters; we have now clarified this point on lines 222225 of the revised manuscript. In fact, in the often chaotic, real world of daytoday clinical practice, such simple indicators (or measures) are more likely to see adaptation, and it might likely (at least in part) explain the widespread success of the RECIST criteria. We also believe such simple measures (the values of which can be readily determined from standofcare measurements in current clinical practice) presents a major potential advantage for “translating” mathematical modelingbased predictions into clinical applications.
In order to help clarify the rationale behind this design process, we have now added the following text, lines 188191:
“By reducing the model to this closedform solution, we present the model in a form that combines related biological processes (see Figure 1) into only a few easytointerpret values, while also ensuring the ability to obtain unique solutions when only minimal data are available.”
2) As a heterogenous entity, different types of cancer possess distinct features, rationality to employ this model as one fitinall with a relatively small training cohort requires additional discussion after validated in a largescale validation cohort.
Agreed: We have now added more discussions on this point, lines 474476, 548550, and 568574. We have also softened the tone in line 470, by changing “demonstrates” to “suggests” that the overlaying master equation may be universally applied.
Lastly, because our calibration cohort consisted of n=189 patients and our validation cohort contained only n=64 patients, we believe that there may have been some inadvertent confusion regarding the number of patients in each cohort. This was likely due to undisciplined verbiage in the original text. We have now reworked this sentence (lines 5558) to now read as follows:
“The model was calibrated by fitting it to a compiled clinical tumor response dataset (n = 189 patients) obtained from published antiPD1 and antiPDL1 clinical trials, and then validated on an additional validation cohort (n = 64 patients) obtained from our inhouse clinical trials.”
3) As a novel model evaluation of the sensitivity and accuracy needs to be addressed, and comparison of this model with current applied parameters for predicting immunotherapy treatment efficacy requires to be supplemented.
Thank you for bringing out this essential suggestion for discussion.
First, to assess the accuracy of the model in the revision work, we have calculated sensitivity, specificity, positive predictive value (PPV), and negative predictive value (NPV); these are now provided in lines 390400.
Moreover, we recognize that the novelty of the work presented in this work, which uses immunohistochemical (IHC) measurements to inform the model, might apparently have lost some of its central focus as originally presented. In fact, the model has already been demonstrated to perform reliably to predict patient response and even survival (e.g., PMID: 32426472), but this published work was done with a different approach (i.e., by using CT scanbased imaging data). While we have reported similar results here, the purpose of this current work is to evaluate how our new method of informing model parameters from IHC measurements compares to the previous (already established), imagingbased method. The purpose for doing so is twofold: (i) Clinical data may often be convoluted/complex and measurements may be missing; thus, having multiple ways to inform the model (i.e., to determine model parameters) increases its ability to robustly handle a wider range of clinical scenarios; and (ii) IHC measurements may be collected at far earlier time points (relative to start of treatment) than imagingbased followup assessment of tumor responses; thus, this strategy potentially enables prediction tumor responses at an earlier time point than it was previously possible. We have revised the manuscript in several places to help bring this focus into greater clarity, as well as added discussion, lines 157160, and 439442.
Finally, we are actually unaware of any currently applied, clinically used parameters for predicting immunotherapy response, although several possible methods have been published in the last few years including transcriptomic rubrics (PMID 30127394), machine learning algorithms (PMID 33208341), genomic approaches (PMID 33307238) such as tumor mutational burden (PMID 31315901), among others (PMID 34112949). Instead, most “currently applied parameters” merely document tumor responses that have already occurred as opposed to predicting it a priori; these include the standard RECIST v1.1 and even the newer proposed response assessment rubrics specific for immunotherapy (iRECIST). As such, our model parameters, which can be determined in multiple ways (e.g., from early time point imaging and/or histopathology data), presents a clear advantage over the standard measures currently used in the clinic for predicting treatment outcome and patient survival. This key advantage of the methodology introduced here is now mentioned in the lines 126127, 129131, and 160163 of the revised manuscript.
Reviewer #1 (Recommendations for the authors):
The authors demonstrate a translational mathematical model dependent on three key parameters for describing efficacy of checkpoint inhibitors in human cancer. This paper describes some very interesting work. However, it remains questions that need to be addressed as follows:
1. The authors need present why they select these three parameters for describing efficacy of checkpoint inhibitors in human cancer. How about other factors?
This point is again welltaken. The details of the full model derivation have recently been reported elsewhere (e.g., PMIDs 32426472, 33398132); however, we have now added it to the SI of this manuscript as well (Appendix 1, lines 14103); and we have now added a reference to the derivation in the revised main text (line 192). We have also added a clarification as to why this approach was implemented in the revised main text (lines 188191).
Additional details relative to this appropriate concern are also provided in this letter (please see Essential Revisions, point #1). We further note that, beyond those points, we have endeavored to carefully reduce the full model (wherein all relevant, key biological parameters must be included at mathematical derivation for scientific integrity) such that the model may be informed through clinicallymeasurable quantities. For example, how can we measure cytokine diffusion in vivo or in patients during the course of actual treatment? We cannot, at least by using the currently available technologies in a standard outpatient clinic or inpatient hospital setting. We have now found a way to solve this dilemma by further simplifying the model to reduce the parameter space of the model through biologically sound, reasonable assumptions (please see the newly added section on mathematical model derivation; Appendix 1, lines 14103 of the revised work). That is, by focusing on only parameters that can actually be measured in current clinical practice, it allows us to focus this work on demonstrating additional ways that the model may be used for clinical predictions under current clinical practice, with a focus here on IHC measurement. We take our successful results by using only these limited data (% change in tumor volumes vs. time or IHC staining) as evidence of the robustness of the model performance.
2. How can the authors make sure that this model is suitable for all human cancer? Is the clinical tumor response dataset (n = 189 patients) enough?
The Referee has appropriately expressed some caution over the sample size of the patient cohorts. While future studies will obviously be required in order to fully address “all human cancer” as he/she inquired, we have now softened the tone of model results, clarifying that the model has only been tested in the cancer types described in the revised manuscript (lines 474476, 548550, and 568574). Specifically, there were n=55 responders and n=134 nonresponders. Assuming type1 error α=0.05 and type=2 error β=0.1, the measured difference between means for μ responders vs nonresponders (0.054 vs. 0.013, respectively) with population variance S^{2}_{μ} = 0.00394 (data not shown in the manuscript) could be detected by a minimum necessary sample size of n=52 patients (comparison of 2 independent means). Equivalent analysis for Λ revealed a minimum sample size of n=51, so the calibration dataset is sufficiently large enough for the cancer types that have been reported here.
3. What is the accuracy of the model?
To assess the accuracy of the model, we have now calculated sensitivity, specificity, positive predictive value (PPV), and negative predictive value (NPV); these are now provided in lines 390400.
4. I think there is no scope of this work shown in this "Introduction part".
We have clarified this in the revised manuscript, lines 157160.
5. Please identify the innovation point in the paper.
We have clarified this in the revised manuscript, lines 157160.
Reviewer #2 (Recommendations for the authors):
1. Several claims would need to be revised or toned down throughout the manuscript that describe the mathematical model as mechanistic and translatable. For example,
a. 'senescence markers (10) such as CD27, Tim‐3, CD57, and/or T‐cell receptor (TCR) repertoires (11)'. Only CD57 is a senescence marker among the listed proteins.
Agreed. Tim3 is actually a marker of Tcell exhaustion, and loss of CD27 is associated with immune incompetence, exhaustion, or premature senescence (e.g., PMID: 15882352). We have now corrected this sentence to read (lines 121124 of the revised manuscript):
“An accumulating body of evidence has established that certain immunological features, including Tcell exhaustion (e.g., Tim3) and exclusion (9), senescence markers (10) such as CD57, or immune incompetence, exhaustion, or premature senescence (e.g., loss of CD27) (11, 12) could perhaps reflect or even predict sensitivity and resistance to checkpoint inhibitorbased cancer immunotherapy.”
b. 'Our method may be implemented directly into clinical practice, as it relies on standard‐of‐care imaging and pathology' in Line 79,
We take this to mean that, although our method could be translated to the clinic by using only alreadyestablished techniques, it must be rigorously validated in future studies, just as with any predictive clinical parameter or tool. We have added this caveat to the discussion of the revised manuscript, lines 474476, 548550, and 568574.
c. 'the merit of this approach will rely on its future ability to reliably predict early individual patient response with the goal of improving personalized cancer care.'
This goes along with the previous point (b); please see our response above.
d. Line 204205: 'the model may be adaptable to other types of checkpoint inhibitors affecting the CTLA‐4 pathway'. Studies have described the effect of aCTLA4 therapies to be mediated in peripheral lymphoid organs rather than the tumour microenvironment. Therefore, the model that considers chemotaxis and intratumoural Ab activity will likely not apply to aCTLA4 therapies.
We have recently demonstrated this to be the case (PMID 32426472). That being said, to avoid any further confusion, we have removed this statement (as well as mention of cellbased immunotherapies at this location) from the main text.
We however emphasize that this is not the intended implication of the model assumption as described in Appendix 1, lines 2526 (that is, in our model we assume that the therapeutic cite of action occurs within the tumor microenvironment). The PD1 and CTLA4 receptors that are targeted in many types of immunotherapy are expressed on T cells; which may not be located in the tumor at time of systemic drug administration (while the PDL1 receptor, located on the cancer cell plasma membrane, will always be in the tumor). This implies that, in the case of antiCTLA4 and antiPD1 therapy, a portion of the drug can potentially bind to Tcells outside the tumor (e.g., in the thymus, peripheral blood, etc.). However, our model assumes that (i) drug diffuses through the tumor to its binding site (as per Eqs. S3, S4), (ii) all drug binding occurs within the tumor, and (iii) all drug effects (e.g., tumor kill) occurs within the tumor microenvironment. That is, all “sites of therapeutic action” are located within the tumor.
We recognize that this statement may indeed be inadvertently confusing, so we have now clarified this in the model description of the revised Appendix 1 as follows (Appendix 1, lines 2628):
“that is, because our model only describes the tumor region and the factors and processes contained therein, we have made the assumption that all key mechanisms such as drug binding occur only within the tumor.”
2. In addition to the assumptions described in the text, are the following assumptions implied:
a. The parameters µ and λ are considered constant over time, which implies that the ratio of cancer cells and immune infiltration remains the same over the treatment course.
It is established that, for example, Tcells do experience “burnout” and loss of cytotoxic efficacy over their lifetimes (e.g., our reply to point 1a, above), and will influx/efflux the tumor over time, and as such it should not be expected that Tcell density remains constant in the tumor at all times. Thus, these parameters represent the average value of these fluctuating parameters. We take this to be a reasonable assumption because (i) these quantities are not directly measurable, and (ii) even in our inhouse data, times between patient reassessment ranged ~46 weeks, so even if you could measure these, you would be forced to average between these times either way, and (iii) these processes take place at far shorter times than patient reassessments. By presenting the model in this reducedform, we enable simplified interpretation of the results while also providing a single, easy to understand scalar that contains significant information about the treatment response. We also note here that RECIST and its immune checkpoint inhibitor counterparts provide readout in a similar manner. These points are now emphasized in the revised manuscript (lines 535542).
b. Every cancer cell is assumed to have the same proliferation rate.
This is indeed a correct observation. Because we do not explicitly model individual cancer cells, we must use the average proliferation rate across the entire tumor in order to capture the total proliferation and proliferation rate of the cancer cell population. Also, understandably, this is not something that would even be measurable on a percell level in the clinic or in vivo. This approach is commonly used in mathematical modeling studies based on continuum representations. We have now pointed out this limitation in the revised manuscript (lines 537538, with accompanying discussion in lines 535542).
c. Each tumour cell has the same likelihood to be recognized by the immune system.
Again, this is correct, for the same reasons provided in the previous reply. Future singlecell imaging may shed light in these limitations and refine the methodology introduced here.
3. The cutoff used to split the µ and λ parameters and compare to responders and nonresponders was done 'by maximizing the Youden's J statistic revealed cutoff thresholds where sensitivity'; A correction method needs to be applied to account for multiple testing correction. How does the result compare to using the median values and quantiles? How do changes in cutoff affect the Roc curve?
We appreciate the attention to detail. We believe the suggestion for multiple testing correction here is likely based on a potential misunderstanding. We did not identify an “ideal” threshold for each cohort by Youden’s J statistic, we only identified one single threshold for μ and Λ (that is, one for each parameter) in the calibration cohort, and then assessed the performance of the same threshold values in the validation cohorts. The results show that the thresholds were able to satisfactorily separate responders from nonresponders, which in our opinion has proven the validity of μ and Λ for this purpose already, i.e., to test how these carefully designed thresholds would perform across independent datasets. Therefore, it is a little unclear as to what value testing other cutoff thresholds would add to the manuscript. However, in order to avoid any confusion, we have clarified this in the text, lines 392393:
“Testing these same response threshold values (identified in the calibration cohort) in the validation cohort”.
4. The manuscript text refers to Table S1 and supplementary figures, but supplementary information wasn't submitted for review. Unfortunately, the work on model sensitivity was not provided.
We have now verified that we had originally uploaded both main text and SI at time of initial submission. We do not know why the SI (now Appendix 1) has not been properly provided to the Referees for external peerreview, but regret that this may have indeed been the case. We will be sure to submit both revised main text and SI when we resubmit the revised draft; we will also check with the editorial office to make sure that they have been fully transmitted to the Referees.
5. The work builds on a recent publication by the authors.
a. The manuscript could benefit from more information on the model, even if described previously elsewhere. For example, parameters in the methods and in Figure 1 are introduced but not described.
Agreed. We have added a detailed model derivation, including discussion of individual parameters and assumptions made, in Appendix 1 (Appendix 1, lines 14103); the reader is also now referred to this on line 192.
b. Importantly, the novelty of this study compared to the previous publication should be highlighted. The comparison with clinical and histology data appears to be the main novelty, which is why demonstrating the correlation with clinical data per patient, would be crucial. It is not clear why the calibration and validation dataset are different from the previous study. How do the estimated parameters differ?
Thank you for this suggestion. Unfortunately, we have been unable to obtain any perpatient data that contains both patient response to ICI therapy and also IHC measures of PDL1 expression or CD8^{+} tumorinfiltrating lymphocyte counts from the same patients that would enable this direct comparison. While we are working towards inhouse collection of this data, we believe that this study represents an important step towards this goal. The calibration cohort in this work was expanded from the literature cohort used in previous work (PMID: 32426472), because we found more published data since that time, and wanted to use all data available. Model parameters are the same (perpatient) between studies when estimated by tumor volume measurements (although means have changed due to addition of new data); these serve as the established baseline for comparison with the IHCbased estimation of model parameters presented in this work.
We however fully agree that further emphasis should be placed on the novelty of informing model parameters by IHC; therefore, we have further emphasized this point in the manuscript, lines 157160 and 439442.
6. The sensitivity and specificity of the λ parameter are low. Does it have an additive predictive value to the µ parameter?
It is correct that µ was found to have higher sensitivity and specificity in the results shown. We have now included a full description of the results for Λ as well, (i) for sake of scientific completeness, because it is an integral part of the model reducedform and we intended to show the complete set of our results, and (ii) because we have shown previously (PMID: 33398132) that Λ×μ may also be used as a dugdisease specific measure of treatment response.
7. How does the predictive value of the parameters compare to previously reported biomarkers of response such as TMB, % PDL1+ cells, T cell count? A predictive model controlling for age, stage, etc. should be implemented to account for these confounders.
We were unable to obtain TBM, % PDL1+ cells, or intratumoral T cell count for the patients studied. This was not in the literaturederived cohort, and needle biopsies were not required in the inhouse studies, so we have been as yet unable to do these analyses at this time. Age, stage, etc. are well suited for certain kinds of statistical models (e.g., Cox proportional hazards), and we agree that this could be something interesting to investigate, e.g., maybe we could find out whether the model could serve better for age groups of patients. That said, since this is not the focus of this work, we have decided to put it as part of our future work, lines 568574:
" Towards this end, we are currently pursuing inhouse collection of perpatient, paired IHC measures of PDL1 or Tcell counts with tumor response, which will enable us to directly correlate IHC measures of interest with tumor response, while also allowing for examination of additional patient parameters, such as tumor stage or patient age.”
8. How does the predictive value of the parameters change when the validation cohort is swapped with one of the calibration cohorts? How robust are the parameters to crossvalidation? How does the predictive classification change with a different cancer type?
This point is welltaken. In fact, we have shown similar results to the analysis requested here previously, in PMID: 32426472, Figure 5. In order to shed further light on parameter robustness across cancer types, as suggested, we have added a new analysis wherein we performed a ‘leaveonecancertypeout’ validation analysis with the calibration cohort (made up of 6 different cancer types). That is, the model parameter rages for each response category from the remaining 5 cancer types are compared to the range from the leftout type; this was repeated for all 6 cancer types, and the results are shown in the new figure, Appendix 1figure 3. We have also added discussion on this new analysis in the main text, lines 356361. We have also now referred the reader to this publication (PMID: 32426472) in case they are interested, line 360. We would like to thank the Referee for his/her helpful suggestion.
9. There is a lack of robust predictors of immunotherapy response due to a large number of confounder, such as heterogeneity at molecular and cellular level, immune escape, immunosenescence, etc. How does the model overcome this?
This is a fair point, and the Referee is correct that these cofounding factors are known to be relevant in ICI treatment and outcome. In the model as presented, we approximate heterogeneous parameters (such as immune cell distribution or infiltration rates, immune checkpoint ligand densities, ICI drug perfusion, immune cell activation and killing capacity, etc.) based on their spatiotemporal averages. As such, we overcome these by indirectly capturing their contributions to the overall system, allowing us to characterize the system in order to make predictions without explicit measures of these factors.
Indeed, complete characterization of these factors remains clinically prohibitive, if even possible, and we believe that the high cost (time, money, and inconvenience of clinical adoption) of such models that must capture them to function will not progress beyond the research or hypothesisgenerating stage. We also believe that these lowlevel factors contribute to overarching mechanisms that are more readily measurable, allowing their effects (including those not as yet discovered) to be captured indirectly. By developing methods to inform more parameters in our model by using clinicallyavailable measures, we are moving forward to a place where one day we may be able to mathematically backcalculate these unmeasurable factors, which we hope will be of benefit to both researchers and clinicians.
10. Figure 4 could be more clear and informative.
If the Referee could perhaps help us by further clarifying his/her comment, that would be greatly appreciated; we are open for constructive suggestions.
11. The published cohort in JITC has 72 patients but only 64 patients were included in the validation cohort, why is that so?
Some patients were excluded due to inclusion criteria specific to this study, which were different than those in JITC. Details are now provided on lines 263266 of the revised manuscript:
“a total of 95 patients were obtained; however, 18 left the study after admission, 11 were removed due to lack of pretreatment measurements, one was excluded because all indexed lesions were treated with XRT, and one discontinued treatment but continued followup”.
12. Which other pathological parameters correlate with λ and µwhen measured per patient, not per response group?
Thank you for the comment. Unfortunately, only % change in tumor burden over time was available for our calibration cohort; thus, we were unable to investigate pathological parameters in this study. Please also see our reply to Referee #2 Public Review comment 2, above.
Reviewer #3 (Recommendations for the authors):
1) In the Figure 2, it would be helpful to show different cancer types as different panel.
We appreciate this comment; however, we elected this format to simply highlight the differences in parameter values while keeping our focus on the main innovation of this revised manuscript: i.e., by using IHC to inform model parameters. We have shown results similar to those requested here in a previous study however; we have now added text to point the interested reader to these results, line 771.
2) An additional figure is favorable to show whether those key parameters were cancer typespecific or the antibody drugs.
Please see the previous comment. We have shown cancerdrug specific results in our previous work; we have added a reference to point the interested reader to this analysis, line 771.
https://doi.org/10.7554/eLife.70130.sa2Article and author information
Author details
Funding
National Science Foundation (DMS1930583)
 Zhihui Wang
 Vittorio Cristini
National Institutes of Health (1R01CA253865)
 Zhihui Wang
 Vittorio Cristini
National Institutes of Health (1U01CA196403)
 Zhihui Wang
 Eugene J Koay
 Vittorio Cristini
National Institutes of Health (1U01CA213759)
 Zhihui Wang
 Vittorio Cristini
National Institutes of Health (1R01CA226537)
 Zhihui Wang
 Renata Pasqualini
 Wadih Arap
 Vittorio Cristini
National Institutes of Health (1R01CA222007)
 Zhihui Wang
 Vittorio Cristini
National Institutes of Health (U54CA210181)
 Zhihui Wang
 Mauro Ferrari
 ShuHsia Chen
 PingYing Pan
 Eugene J Koay
 Vittorio Cristini
National Institutes of Health (P30CA016672)
 David S Hong
 James W Welsh
 Eugene J Koay
National Institutes of Health (P30CA072720)
 Steven K Libutti
 Shridar Ganesan
 Renata Pasqualini
 Wadih Arap
European Commission (SER Cymru II Programme)
 Bruna Corradetti
National Institutes of Health (U01CA200468)
 Eugene J Koay
National Institutes of Health (R01CA222959)
 Mauro Ferrari
DOD Breast Cancer Research (Breakthrough Level IV Award W81XWH1710389)
 Mauro Ferrari
AngelWorks
 Renata Pasqualini
 Wadih Arap
Gillson Longenbaugh Foundation
 Renata Pasqualini
 Wadih Arap
Marcus Foundation
 Renata Pasqualini
 Wadih Arap
National Institutes of Health (R01CA109322)
 ShuHsia Chen
National Institutes of Health (R01CA127483)
 ShuHsia Chen
National Institutes of Health (R01CA208703)
 ShuHsia Chen
National Institutes of Health (R01CA140243)
 PingYing Pan
National Institutes of Health (R01CA188610)
 PingYing Pan
The funders had no role in study design, data collection and interpretation, or the decision to submit the work for publication.
Ethics
Clinical trial registration NCT02444741.
Human subjects: Inhouse patient cohort were obtained as deidentified data from a study conducted in accordance with the U.S. Common Rule and with Institutional Review Board Approval at MD Anderson (20141020), including waiver of informed consent. This work has been published in J Immunother Cancer. 2020; 8(2): e001001. PMC7555111. doi: 10.1136/jitc2020001001.
Senior Editor
 Mone Zaidi, Icahn School of Medicine at Mount Sinai, United States
Reviewing Editor
 Caigang Liu, Shengjing Hospital of China Medical University, China
Publication history
 Received: May 6, 2021
 Accepted: October 25, 2021
 Accepted Manuscript published: November 9, 2021 (version 1)
 Version of Record published: November 29, 2021 (version 2)
Copyright
© 2021, Butner et al.
This article is distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use and redistribution provided that the original author and source are credited.
Metrics

 1,121
 Page views

 209
 Downloads

 1
 Citations
Article citation count generated by polling the highest count across the following sources: Crossref, PubMed Central, Scopus.
Download links
Downloads (link to download the article as PDF)
Open citations (links to open the citations from this article in various online reference manager services)
Cite this article (links to download the citations from this article in formats compatible with various reference manager tools)
Further reading

 Medicine
Pharmacological and genetic studies over the past decade have established the folliclestimulating hormone (FSH) as an actionable target for diseases affecting millions, namely osteoporosis, obesity, and Alzheimer's disease. Blocking FSH action prevents bone loss, fat gain and neurodegeneration in mice. We recently developed a firstinclass, humanized, epitopespecific FSHblocking antibody, MSHu6, with a K_{D} of 7.52 nM. Using a GLPcompliant platform, we now report the efficacy of MSHu6 in preventing and treating osteoporosis in mice and parameters of acute safety in monkeys. Biodistribution studies using ^{89}Zrlabelled, biotinylated or unconjugated MSHu6 in mice and monkeys showed localization to bone and bone marrow. MSHu6 displayed a β phase t_{½} of 7.5 days (180 hours) in humanized Tg32 mice. We tested 217 variations of excipients using the protein thermal shift assay to generate a final formulation that rendered MSHu6 stable in solution upon freezethaw and at different temperatures, with minimal aggregation, and without self, cross, or hydrophobic interactions or appreciable binding to relevant human antigens. MSHu6 showed the same level of 'humanness' as human IgG1 in silico and was nonimmunogenic in ELISPOT assays for IL2 and IFNg in human peripheral blood mononuclear cell cultures. We conclude that MSHu6 is efficacious, durable, and manufacturable, and is therefore poised for future human testing.

 Epidemiology and Global Health
 Medicine
There is increasing awareness throughout biomedical science that many results do not withstand the trials of repeat investigation. The growing abundance of medical literature has only increased the urgent need for tools to gauge the robustness and trustworthiness of published science. Dichotomous outcome designs are vital in randomized clinical trials, cohort studies, and observational data for ascertaining differences between experimental and control arms. It has however been shown with tools like the fragility index (FI) that many ostensibly impactful results fail to materialise when even small numbers of patients or subjects in either the control or experimental arms are recoded from event to nonevent. Critics of this metric counter that there is no objective means to determine a meaningful FI. As currently used, FI is not multidimensional and is computationally expensive. In this work a conceptually similar geometrical approach is introduced, the ellipse of insignificance (EOI). This method yields precise deterministic values for the degree of manipulation or miscoding that can be tolerated simultaneously in both control and experimental arms, allowing for the derivation of objective measures of experimental robustness. More than this, the tool is intimately connected with sensitivity and specificity of the event / nonevent tests, and is readily combined with knowledge of test parameters to reject unsound results. The method is outlined here, with illustrative clinical examples.