Determination of parameters for Cauchy ’ s problem for systems of ODEs with application to biological modelling

An inverse numerical method that estimates parameters of dynamic mathematical models given some information about unknown trajectories at some time is applied to examples taken from Biology and Ecology. The method consists of determining an overdetermined system of algebraic equations using the experimental data. The solution of the overdetermined system is then obtained using, for example the least-square method. To illustrate the effectiveness of the method an analysis of examples and a numerical example for the model that monitors the dynamics of HIV is presented. Keywords-Inverse problem; least squares methods; parameter estimation; dymamic systems; predator-prey system;


I. INTRODUCTION
Function approximation on a fixed interval by means of an initial value problem of an ordinary differential equation with unknown coefficients and unknown initial values as presented by M Shatalov, I. Fedotov and S.V. Joubert in [7], is central to this study.In their paper al method to determine both the unknown coefficients and initial values of a dynamic system by minimizing a certain goal function is presented.In earlier collaborations Shatalov and Fedotov suggested the use of such an approach in identifying dynamic systems' parameters from experimental data, see [6].
Several other parameter estimation methods are presented in literature, for instance the stochastic models; the Bayesion approach, the Monte Carlo technique, the numerical method with combined Adomain/Alienor approach, the differential evolution (DE) and the hybrid Taguchi-differential evolution algorithm.
The method used is based on integrating both sides of equations of a dynamic system, and applying regression methods to the overdetermined system of linear algebraic equations with possible constraints.The unknown parameters and initial values can then be obtained using the method of least squares.In this paper, the proposed method gives parameter estimates that have a percentage relative error that is mostly less than 0.4% for artificially generated data and parameters that are in the expected range for real data.
As an illustration of the method of identifica- Numerical results with artificially generated data and real data is then given for the model that monitors the dynamics of HIV.

A. Inverse methods
The general approach to the inverse system identification of a n -dimensional parameter a ∈ A ⊂ R n , where A can coincide with R n (no constrains between entries of a) and A can be subset of R n (there are constrains) in a system of ordinary differential equations of the following form where R m subject to experimental information concerning the values x (t j ) at the point t j ∈ [0, T ] is known: The general approach to find a solution of the formulated problem (1) consists of determining an overdetermined system of algebraic equations using the experimental data (2) with respect to unknown vector a.The solution of the system (3) is then obtained using any method of solution of the overdetermined system, for example the least-square method which minimize the difference Aa − h using Euclidean metric.It is known that in this case (see, for example [5]), the solution of (3) can be obtained by solving the following system to obtain a.

II. MATHEMATICAL MODELS
1) Problem 1: Free population: Consider a single species that grows by sexual reproduction.Assume that the individuals move in the population like Brownian motion particles (or that the population is colonial), then the frequency of contact between the individuals is proportional to the squared population density [1].Further assuming that mortality is different from zero and is independent of the population size, the scalar function f (t, x, a) is given by with the unknown vector a = (a 1 , a 2 , a 3 ) ∈ R 3 , n = 3, m = 1 and a 1 > a 3 .The parameters a 1 and a 2 represent per capita birth rate (fecundity) and the population at which half of the females are able to reproduce, respectively.The mortality rate of the population is represented by a 3 .The parameters a 1 , a 2 and a 3 are positive and x(t) = x(t) ∈ R is a scalar function.We also assume that x 0 is specified.
A special case arises if we assume that the population is of negligible mortality.That is, if a 3 = 0 the scalar function f (t, x, a) is then given as 2) Problem 2: Population with negligible mortality and x 0 unknown: Here we consider the special case of Problem 1 (that is, if a 3 = 0) but with the value x 0 considered as an incorrect value which must be corrected.The statement of such a problem naturally appears since the initial value x 0 = x(t 0 ) plays an important role, namely; it defines the Cauchy's problem for equation (1) and, secondly x 0 is included widely in computations below.Therefore if x 0 given from an experiment has low accuracy, it would be desirable to define this value with more accuracy.
3) Problem 3:Nonlinear predation at small prey population: In this problem we consider an interspecies modification of the Lotka-Voltera model where there is a nonlinear predation at small prey population.Consider the system where p = 1, 2, a 1 , a 2 , a 4 > 0, a 3 , a 5 ≥ 0 and n = 5, m = 2.The prey and predator densities are represented by x(t) and y(t), respectively.In this model, the parameters a 1 represent the rate of the consumption of prey by the predator population, a 2 the prey population density at which the predator's consumption is half the maximum value [1], a 3 the prey's growth rate, a 4 is rate at which the prey contributes to the predator's growth rate and a 5 is the predator's death rate.4) Problem 4: A model that monitors the dynamics of HIV: Consider the following twodimensional model that monitors the dynamics of HIV.The model considers two sub-populations: HIV susceptible (x), the HIV infected population (y).The total population size is given by N = x + y.The model is described by where a 1 , a 2 , a 3 and a 4 are all positive constants model parameters.The parameter a 1 and a 2 denotes the average rate of infection by HIV, a 2 the natural cessation of sexual activity, a 3 the recruitment rate of susceptible and a 4 denotes the death rates of the infected population due to HIV.This model is a modified version of the three dimensional one by Gumel (see, [3]).The vector a = (a 1 , a 2 , a 3 , a 4 ) T ∈ R 4 is unknown.Note that in this case n = 4 and m = 2.

SYSTEMS
Consider equation (1) where f is defined by The resultant equation can be rewritten as Integration of (10) with respect to t from t 0 to t j (j = 1, 2, . . ., N ) gives where x 2 dt, The integral P j can be calculated by using a quadrature rule, for example trapezoidal rule.Thus, System (3) is solved with Now, suppose that (1) is defined as in (10) and the initial condition is unknown.Let us replace in (11) ∆x j and ∆h j by those given in (12): where h j = 1 2 x 2 j under the assumption that for P j we use the old values of x 0 defined by (2) for j = 1 and for j ≥ 2 the values of P j are evaluated by an open quadrature rule.Formally speaking system (13) is nonlinear but setting we obtain the linear system of the form (3) with The value of x 0 can then be obtained from Equation (14).Equation ( 1) with f defined by ( 5) can be written in the form In contrast to (10) two supplementary terms −a 3 x 2 and −a 2 a 3 x arise.Integrating (15) we obtain where x 2 0 + a 2 x 0 and b 4 = a 2 a 3 the system can be written in the form (3) where In other words, we obtain the following system Since the Jacobian is ∂b ∂a = a 2 + x 0 = 0, we can find the vector b and hence the unknown vector a.
In other words, we obtain the following system of 2N equations with seven unknowns: To evaluate numerically the right hand side of S j can be considered as the Riemann-Stietjes integral S j = tj 0 x p (t)dy(t), (see, for example Dragomir and Fedotov [3]).Otherwise this integral can be computed using numerical differentiation.Another approach to construct the over determined system is generally appealing since the system obtained is simpler than (19).This method is possible due to the special form of the system (7); namely the system can be written as where g(x, y, a 2 ) = x 2 y x p +a2 .Eliminating g(x, y, a 2 ) we get where Using the integration techniques that we discussed earlier, we determine the vector c = (c 1 , c 2 , c 3 ) and hence c3 c2 = a 3 , which will be considered as unknown in the next step.
Multiplying the first equation of ( 7) by (x p + a 2 ), we obtain After integration of (28) the overdetermined system can be written as a 1 Z j + a 2 (∆x j − a 3 X j ) = −∆h j + a 3 P j , (29) where j = 1, . . ., N , and Z j , P j , X j , ∆h j , ∆x j were defined in (20) and ( 22).The unknown parameters a 1 and a 2 may then be determined from equation (29) and a 4 and a 5 from (27).

A. A model that monitors the dynamics of HIV
The system (8) can be written as where ∆x j = x(t j ) − x(t 0 ), ∆y j = y(t j ) − y(t 0 ), x(t)dt and Y j = tj 0 y(t)dt.
The system (30) can be written in the form (3) with In order to illustrate the effectiveness of the method the parameter identification of the system (8) is presented.Using parameter values from Gumel [4], we generate points of solutions of the system (8) by the adaptive Runge-Kutta method.A mathematical software Mathcad was used for the adaptive Runge-Kutta method and for the integration by quadratue rules.These solutions are then perturbed by a normal distribution with mean x and standard deviation δ = 0.5 and further taken as "experimental data".
Furthermore, system (8) is studied in the context of the Gauteng Province, South Africa.Data used in the numerical simulation is obtained from the Acturial Society of South Africa (ASSA) (see [2]) HIV prevalence estimates.The data is compiled from the population sensus, antenatal survey and registered deaths [2].The ASSA 2003 tables gives the population N and the number infected with HIV y.From the relationship we obtain the susceptible population x as The results given in Table 2 below, show a comparison of the parameters used (from Gumel [4]) , the estimated parameters and the percentage error given by ||α − α||/||α|| × 100.From the comparison of the parameters in Table 2, Figures 1 and 2, it can be seen that the estimated parameters are close enough to the actual values.The estimated parameters of dynamic HIV mathematical models for the Gauteng Province are all nonnegative and also in the expected ranges.

V. CONCLUSION AND SUGGESTIONS FOR FURTHER RESEARCH
A method for estimating parameters of dynamic mathematical models given some information about unknown trajectories at some time, by Shatalov, Fedotov and Joubert [7], was applied to problems from Biology and Ecology.In Problem 2 where the initial value was assumed to be unknown with accuracy, the method was applied to find this value.To illustrate the efficiency of the method, a numerical example for the model that monitors the dynamics of HIV was presented.The estimated parameters for the artificially generated data are close enough to the actual parameter values and for the Gauteng Province the estimated parameters are in the expected ranges.
The method for estimating parameters values could be improved by incorporating a suitable penalty term that minimizes the error caused by numerical quadrature and high observational noise levels in the real data.An improvement of the method will be investigated in future studies.

Fig. 2 .
Fig. 2. Comparison of the estimated solutions, Yi, with the "experimental data" and the absolute error in the decimal logarithmic scale.

Table 2 :
Problem 4:The parameter estimates and errors.Figures1 and 2below show the comparison of the estimated solutions with the "experimental data" and the absolute error in the decimal logarithmic scale.The initial value problem was solved with the new coefficients and old initial conditions.
i Fig.1.Comparison of the estimated solutions, Xi, with the "experimental data" and the absolute error in the decimal logarithmic scale.