Optimal control analysis of combined chemotherapy-immunotherapy treatment regimens in a PKPD cancer evolution model

The author constructs a mathematical model capturing tumor-immune dynamics, incorporating the evolution of drug resistance, pharmacokinetics and pharmacodynamics of administered drugs, and immunotherapy possibilities. Numerical simulations are performed to analyze the model under a variety of treatment possibilities. A sensitivity analysis is performed to determine the parameters contributing the most to the variance in effector cell, resistant, and sensitive tumor cell populations. Then, a detailed optimal control analysis is performed, along with a numerical simulation of optimal treatment profiles for a hypothetical patient.


I. INTRODUCTION
Mathematical modeling has the potential to significantly impact the treatment of cancer through optimization of drug administration schedules. Currently, clinical treatments are administered in an ad hoc way, modifying the treatment along the way, after observing how the patient is responding to it. Since it is not possible to clinically test all possibilities for drug dosage, combinations, sequence, timing, and duration for a patient, mathematical modeling is able to help determine the optimal treatment profile for each patient, leading to truly personalized cancer treatment [Rockne et al. 2019].
In recent years, immunotherapy has become a viable and promising treatment option for cancer patients. This biological treatment focuses on using the immune system to fight cancerous cells. Though the immune system can naturally prevent/slow cancer growth, cancer cells have evolved many ways to bypass the body's immune system such as having genetic mutations which make them harder to detect by the immune system, having surface proteins which deactivate immune cells, and changing the healthy cells surrounding the tumor so that they interfere with the immune response to cancer. Immunotherapies elicit or am-plify the immune response in these cases to fight against the cancer more effectively in many ways. For example, immune checkpoint inhibitors block immune checkpoints, leading to a stronger immune response, adoptive cell therapy boosts the natural ability of T cells to fight against cancerous ones, and monoclonal antibodies help the immune cells recognize cancerous cells more accurately [Kruger et al. 2019].
Chemotherapy is another common cancer treatment plan. Like immunotherapy, there are many drugs with different mechanisms chemotherapy subsumes including doxorubicin, novantrone, and epirubicin. Each of these drugs targets and attempts to kill cells which are growing and dividing rapidly, like those characteristic of cancer. However, a marked side effect of this is that the drug also kills many healthy cells in the process, especially those fast-growing cells in the bone marrow, hair, and skin [Schirrmacher 2019].
Current medical literature is focused on attempting to combine these two treatments. Though much work has been done in this area, the question still remains: what is the best way to combine these treatments to ensure tumor remission while also attempting to minimize side effects?
This paper attempts to address this question in a few ways. In section 2, a mathematical model capturing the key components of our system is presented. In section 3, parameter value estimates will be provided and simulations with varying amounts of continuous infusion chemotherapy will be presented. In section 4, a sensitivity analysis will be performed, allowing us to understand which parameter values impact the variance in the effector cell and tumor cell populations the most. In section 5, the author analyzes the existence and characterization of the optimal control. In section 6, numerical simulations depicting the optimal control profile under different treatment side effect combinations for a hypothetical patient will be provided.

II. MODEL CREATION
The model presented below can be thought of to consist of two parts: the first three equations de-scribe tumor-immune interactions. Separate equations are given for sensitive and resistant (to the chemotherapy drug) tumor cells, and a term is provided which converts the sensitive cells to resistant cells. For greatest generality, immunotherapy is included by adding to the effector cell population. The last two equations form the second part of the model, capturing the pharmacokinetics and pharmacodynamics of chemotherapy drug administration. This component is linked to the first set of equations via the effect of the chemotherapy drug on the sensitive tumor cell population. Below is our model: The terms u(t) and v(t) are termed controls for our treatment: u(t) refers to the chemotherapy treatment, and v(t) refers to the immunotherapy treatment.
In this model, S and R represent the drugsensitive and drug-resistant tumor cells, T T represents the total number of tumor cells (S + R), E represents the effector cells, and C 1 and C 2 represent the drug concentrations in the plasma and at the tumor site, respectively. Below, we will explain the functional forms of each of the equations above: dS/dt: the first term represents the normal dynamics of these cells, taking into account the carrying capacity (captured by C max ). The second term captures the killing rate of tumor cells by effector cells. The third term is the killing of sensitive tumor cells by the drug; IC 50 is the median inhibitory concentration, and w scales between the measurements in vitro to in vivo (we can assume this is 1); γ is provided to help scale the hyperbolic curve for PKPD data fitting purposes. The fourth term is used to incorporate drug resistance-l represents the mutation rate from sensitive to resistant tumor cells. Finally, the last term captures the natural death rate of the cancer cells.
dR/dt: The first, second, and last terms serve the same purpose as in the dS/dt equation. The lS captures the transferal of sensitive to resistant cells via mutation. Note that we assume identical base growth dynamics for the sensitive and resistant cells, as well as identical interactions with effector cells.
dE/dt: The first term is the rate of influx of effector cells into the region of tumor localization and the second term captures the tumor activation of effector cells while ensuring a maximum rate at which effector cells are produced and d E is their death rate. The third term is the death of effector cells due to interaction with the tumor cells. The fourth term is added to incorporate drug toxicity. Finally, the last term, s 2 v(t), captures the administration of the immunotherapy treatment, in which v(t) is the input function ranging from 0 to 1 and s 2 is a scaling term. Since each immunotherapy treatment mechanistically works differently, the generalization made here is that the immunotherapy effectively increases the number of effector cells at the tumor site. dC 1 /dt: Here, k 1 is the elimination from the plasma, and k 2 is the elimination from the effect compartment. u is the input function (ranging from 0 to 1) for administration protocol of the chemotherapy treatment (with s 1 serving as a scaling factor) and V 1 is the volume of distribution of the compartment. dC 2 /dt: k 12 is the link process between the plasma component and the tumor, V 1 and k 2 are as above, V 2 is the effect component for C 2 (t). Together, these last two equations incorporate the pharmacokinetic components of drug administration.

III. IMPACTS OF CONTINUOUS INFUSION OF CHEMOTHERAPY
The parameter values used in model simulations are given in Table I.
The most standard clinical treatment for cancer is chemotherapy administration. As such, in this section, numerical simulations are performed for various chemotherapy treatment situations. Initial conditions of 300 effector cells and 10,000 sensitive cancer cells and an s 1 value of 100 were used for all simulations below. Three simulations were performed for varying intensities of chemotherapy treatment. The results can be seen in Figure 1.  In this figure, the top panel corresponds to a u value of 0.2, or 20% of the maximal chemotherapy administration possible. The middle panel represents 50% and the bottom panel is 100%. There are a few things to notice in these figures. First, consider the key similarity: in all three treatment scenarios, the effector cell population reached the same equilibrium of 4 * 10 6 cells at similar times. Note that, in these simulations, a low side effect of the chemotherapy drug on the immune system was assumed. The critical differences lie in the populations of sensitive and resistant cells. We see a common trend in that the higher the dose of chemotherapy, the faster the sensitive cancer cell population goes extinct, but the larger the final equilibrium of resistant cancer cells is.
Since, in this model, we are dealing with a rapidly evolving cancer cell population, regardless of the intensity of continuous infusion chemotherapy, resistance is ensured to occur, without complete remission of the cancer. Thus, to more rigorously determine which aspects of the cancer to attack, a sensitivity analysis was performed.

IV. SENSITIVITY ANALYSIS
A sensitivity analysis was performed on the above ODE model using the Sobol-Martinez method. Parameters which cannot be modified with treatments were given a range 10% higher and 10% lower than the original value used in the original numerical simulations. The Sobol-Martinez method is based on variance decomposition techniques to provide a rigorous measure of the contributions of the input to output variance. The algorithm outlined by Zhang et al. was implemented for the sensitivity analysis given in Figures  2, 3 and 4.
The images on the left are the first order Sobol indices: the contribution to the output variance by the single model input alone. The second image is the total Sobol index, one which measures the contribution to the output variance caused by a model input, including the first-order effects as well as all higher-order interactions. As one can see, in the case of the effector cells, the most sensitive parameter value for the first 50 or so time steps is s, a quantity that is increased by immunotherapy treatments such as dendritic cell therapy. However, after that time, until 400 time steps, the µ parameter (side effects of the drug) is just as sensitive, if not more sensitive. Since the side effects of immunotherapy do not directly impact the effector cells usually, this can be interpreted as the side effects of chemotherapy impacting the effector cells even more than the potential benefits of the immunotherapy. Medically, from the effector cell side, it would seem that the best way to progress would be to give a high dose of dendritic cell therapy, for example, at the beginning, then work to limit the side effects of other chemotherapy drugs that are given.
In the case of the resistant cells, one can see For the sensitive cells, it's clear that, at the beginning 50 time steps, a is the most sensitive parameter. This then quickly switches to q being the most important parameter, with the IC 50 value of the drug becoming more sensitive as time progresses. Medically, this implies that it's best to give immunotherapy throughout the treatment, while introducing increasing amounts of chemotherapy over time.
Thus, overall, it seems that the parameters a, q, s, and l are the parameters that impact the dynamics the most. Medically, it then makes sense to target these parameter values. Though it is often not possible to target all of these parameters in a treatment plan, our analysis suggests the plausibility of a combined chemotherapyimmunotherapy treatment plan. So, to determine what the best combination and timing of chemotherapy-immunotherapy administered protocols are, we perform an optimal control analysis.

V. OPTIMAL CONTROL
Here, we formulate the problem of determining the most effective treatment regimen as an optimal control one. Specifically, we desire to minimize the amount of drug given and the tumor size over a fixed therapy interval [0,T] while maximizing the number of effector cells, thus maintaining an effective tumor-immune balance as done in [Bukkuri 2019]. Drug toxicity is accounted for by the u(t) term in the integral, denoting the total drug dosage over the given treatment period as a penalty term.
We choose as our control class piecewise continuous functions defined for all t such that 0 ≤ u(t) ≤ 1 where u(t)= 1 represents maximal chemotherapy and u(t)= 0 represents no chemotherapy. Thus, we depict the class of admissible controls as Now, we define our objective functional and optimal control problem. We specifically consider two cases: one treatment (J 1 ) with just chemotherapy and another treatment (J 2 ) with a combined chemotherapy-immunotherapy (e.g. dendritic cell therapy) regimen.
For a fixed therapy horizon [0,T], maximize the objective functional over all Lebesgue-measurable functions u : [0, T ] → [0, u max ] subject to the above ODE dynamics and initial conditions. Theorem 1: Consider the objective functional J 1 (u) subject to the state system in section 2. Assume that: • there exists an admissible pair ( K,u(t))

t 1 ] and all admissible pairs ( K,u(t))
Then there exists an optimal control pair ( K * , v * ) that maximizes J 1 (u). Moreover, for sufficiently small T, the optimal control system has a unique solution.
Proof. An admissible pair ( K, v(t)) is needed for the existence of optimal control. Since the system in equations (1) -(5) (our original set of equations) has bounded coefficients and any solutions are bounded on the finite time interval, using a result from Lukes [Lukes 1982], we obtain the existence of the solution of the system described by equations (1) - (5). Now, we need N( K,U 1 ,t) to be convex in U 1 for each ( K,t). We define: for some γ 1 ≤ 0 and u 1 ∈ U 1 , for some γ 2 ≤ 0 and u 2 ∈ U 1 . We must now prove To do this, let and define Then, γ 3 = λγ 1 + (1 − λ)γ 2 ≤ 0, since γ 1 , γ 2 ≤ 0 and λ ∈ [0, 1]. Then, we see that Combining this information, we find a v 3 ∈ [0, 1] and γ 3 ≤ 0 such that The next requirement for the existence of optimal control is that U is closed and bounded. This is true, by definition. Finally, there exists a number θ s.t. || K|| ≤ θ∀t ∈ [t 0 , t 1 ] and all admissible pairs ( K,u(t)). The boundedness argument is analogous to those previously performed [Fister et al. 1998, Burden et al. 2004, Fister et al. 2005.
A similar argument can be applied for the existence of an optimal control for J 2 (u, v).
For the analysis, we shall continue with the J 2 (u, v) objective so that the proofs are more insightful. Note that J 1 is identical to the J 2 objective, but with a zero v term.
Since we've proven the existence of optimal controls to maximize the J 1 and J 2 functionals, first order necessary conditions for optimality can be determined by a version of the Pontryagin maximum principle. For the characterization of the optimal control, we first define the Hamiltonian associated with J 2 (u, v) and the system of ODEs as follows: The existence of an optimal control for the state system given in section 2 associated with the objective J 1 (u) can be determined from the Filippov-Cesari theorem. For the theorem, the following notation shall be used: where φ ≤ 0 and u ∈ U 1 . Regarding the uniqueness of the controls, similar proofs are given in [Burden et al. 2004, Fister et al. 1998]. The additional constraint that T must be small is due to the fact that the state system is moving forward in time while the adjoint system is moving backwards.
Theorem 2: Given optimal controls u * and v * and solutions of the corresponding state system, there exist adjoint variables λ i for i = 1,2,3,4,5 satisfying the following: where λ i (T ) = 0 for i=1,2,3,4,5 by the PMP transversality condition. Furthermore, from the optimality condition, u * is given by: Proof. From the Hamiltonian, the derivatives of the adjoints were calculated, and the following is seen. u * is given by: while v * is similarly given by: To determine the representation of the control, we first prove that both controls are bang-bang via proof by elimination. If both controls are singular on [0,T] and take time derivatives, then λ 3 (t) = 0 and λ 4 (t) = 0 on [0,T] due to the continuity of λ 3 (t) and λ 4 (t). This is a contradiction of the fact that λ 3 = B2 s2 and λ 4 = B1V1 s1 since we're assuming the presence of a tumor and some side effects of the drug. Thus, it's not possible for both controls to be singular.
Now, let's analyze the possibility of one of the controls being singular. For simplicity, we will work with the v * control, but the same argument applies to the u * control. If the v * control is singular, λ 3 = B2 s2 on [0,T]. Taking a time derivative of s 2 λ 3 − B 2 = 0, we get s 2 λ 3 (t) = 0, implying that λ 3 = 0 and that λ 3 (t) is constant. However, since λ 3 (t) is continuous on [0,T] and λ 3 (T ) = 0 by the transversality condition, then λ 3 (t) = 0 for any subset on [0,T]; however, again, λ 3 = B2 s2 , a nonzero quantity for any meaningful therapy. Thus, we conclude that both controls must be bang-bang.
Note that this agrees with current medical knowledge that chemotherapy is best given in bang-bang controls and hence why chemotherapy is given in cycles: a dose of one or more drugs followed by several days or weeks without treatment. This ensures that the normal cells have enough time to recover from the drug side effects. A similar argument can be given for immunotherapy, though the side effects are less direct than those for chemotherapy Intuitively, this makes sense since it's stating that no treatment will be used when the V 1 is high (i.e. essentially that the tumor/normal compartment ratio is high) and the maximum effect of the drug is low compared to the patient's treatment tolerance level. On the other hand, if the patient can tolerate great side effects, the tumor size is large, and there exists a high drug efficacy, maximum treatment should be given.

VI. SIMULATED PATIENTS
We can think of our optimality system as a twopoint boundary value problem, which we solve using a fourth-order iterative Runge-Kutta scheme, as done in [Jung et al. 2002]. In this scheme, we perform a forward sweep of the state equations with initial guesses for u and v, before performing a backward calculation using the adjoint equation and an update of the controls. This method is done iteratively until convergence is obtained.
Below are simulations of a few hypothetical patients. We specifically analyze the cases of low and high side effects for chemotherapy and immunotherapy. The weighting terms in the objective functional were fixed at α = 0.03, b 1 = 0.5, and b 2 = 0.6. s 1 was fixed at 100 and s 2 was fixed at 100,000.
The  Figure 5 depictions of all four possible combinations of side effects for chemotherapy and immunotherapy over 700 days are presented.
Note that, in accordance with clinical oncology practices, the optimal controls in all side effect scenarios include bang-bang controls for both chemotherapy and immunotherapy.
The key differences in optimal control protocols occur in the first 400 days of treatment. In the case of low side effects for both chemotherapy and immunotherapy, it is advised that chemotherapy be delivered for the first 150 days and immunotherapy be delivered for the last 380 days. This is in stark contrast to the other three scenarios, in which chemotherapy and immunotherapy treatments do not overlap for the first 400 days. For example, consider low side effects for chemotherapy and high side effects for immunotherapy. In this case, chemotherapy should be prescribed for the first 150 days and immunotherapy should be given for the next 250 days. When chemotherapy has high side effects for the patient, it is effectively removed from the optimal treatment plan (for the first 400 days). Instead immunotherapy is given for the last 380 days (if it has a low side effect) or the last 250 days (if it has a high side effect). Past day 400, optimal treatment protocols in all four cases are to give 70 days of chemotherapy followed by 70 days of a drug vacation and to also give 70 days of immunotherapy followed by 70 days of a drug vacation-note that these two treatments overlap and are often given together.

VII. CONCLUSION
In this paper, a model was created capturing dynamics among tumor cells sensitive and resistant to chemotherapy, effector cells, and chemotherapy PKPD. Evolution of resistance to chemotherapy was taken into count. Parameter values were estimated from medical literature and numerical simulations were performed displaying dynamics under control and treatment cases. Then, Sobol-Martinez sensitivity analyses were performed which found the growth rate of cancer cells, killing of cancer cells by the immune system, effector cells in the tumor compartment, and the resistance emergence in cancer cells to be the parameters which impact the dynamics of effector cells, sensitive tumor cells, and resistant tumor cells the most. Next, a characterization of the optimal treatment profile was given analytically, along with proofs for the existence and uniqueness of such a protocol. Numerical optimal control was then performed to illustrate the optimal treatment schedule for a hypothetical cancer patient. This showed a profile of bang-bang chemotherapy and immunotherapy controls, often with overlapping treatment regimens. The author hopes that the model and analysis will help inspire further medical research to be conducted in creating drugs which reduce the natural growth rate of cancer cells, as well as the clinical use of combined chemotherapyimmunotherapy treatments in accordance with the optimal protocols presented.