© 2018 Aaron A. King.

It is frequently natural to formulate expected relationships among variables in terms of differential equations (DE). Simply put, such equations express the relationship between the values of variables and the rates at which those values are changing. This maps well onto the way we typically think about systems. The solution of a set of DE is the trajectory of the variables through time. Thus DE often naturally arise when we analyze dynamical data.

None but the very simplest DE have solutions that can be expressed in closed form, i.e., as simple expressions in terms of elementary functions. However, as a result of decades of intensive research, various highly reliable and accurate numerical algorithms exist for the approximation of solutions of DE. In this laboratory exercise, we will see how these numerical integration algorithms can be accessed in **R** and explore some uses of DE in statistical analysis.

We begin by introducing a simple DE model. We next learn how to obtain numerical solutions in **R**. Finally, we embed numerical integration in a nonlinear least squares context to obtain parameter estimates and perform inference.

The classical SIR compartmental model divides a population of hosts into three classes: susceptible, infected, recovered (see the diagram below). The model describes how the numbers of individuals in each of these classes changes with time. Births are modeled as flows from “nowhere” into the susceptible class; deaths are modeled as flows from the S, I, or R compartment into “nowhere”.

The host population is divided into three classes according to their infection status:

- S, susceptible hosts;
- I, infected (and infectious) hosts;
- R, hosts removed from the infectious population.
- Births result in new susceptibles and all individuals have a common
*per capita*death rate \(\mu\). The raw birth rate (births per unit time) is \(B\). - The
*per capita*S\(\to\)I rate, \(\lambda\), called the*force of infection*, depends on the number of infectious individuals according to \(\lambda(t)={\beta\,I}/{N}\). - The
*per capita*I\(\to\)R, or recovery, rate is \(\gamma\).

Let’s let \(S\), \(I\), and \(R\) refer to the numbers of hosts in the respective compartments. Then these *state variables* change according to the system of differential equations: \[\label{eq:sir}
\begin{aligned}
{\frac{\mathrm{d}^{}{S}}{\mathrm{d}{t}^{}}} &= B-\lambda\,S-\mu\,S\\
{\frac{\mathrm{d}^{}{I}}{\mathrm{d}{t}^{}}} &= \lambda\,S-\gamma\,I-\mu\,I\\
{\frac{\mathrm{d}^{}{R}}{\mathrm{d}{t}^{}}} &= \gamma\,I-\mu\,R\\
\end{aligned}
\] Here, \(B\) is the crude birth rate (births per unit time), \(\mu\) is the death rate, and \(\gamma\) is the recovery rate. We’ll assume that the force of infection, \(\lambda\), has the form \[ \lambda = \beta\,\frac{I}{N}
\] where \(N\) is the total population size (\(N=S+I+R\)), so that the risk of infection a susceptible faces is proportional to the *prevalence* (the fraction of the population that is infected). This is known as the assumption of frequency-dependent transmission.

Since these equations are nonlinear, it’s not surprising that one can’t solve them analytically. However, we can compute the *trajectories* of a continuous-time model such as this one by integrating the equations numerically. Doing this accurately involves a lot of calculation, and there are smart ways and not-so-smart ways of going about it. This very common problem has been very thoroughly studied by numerical analysts for generations. The result of all that work is that, when the equations are smooth, well-behaved functions, excellent numerical integration algorithms are readily available to compute approximate solutions to high precision. In particular, **R** has several sophisticated DE solvers which (for many problems) will give highly accurate solutions. These algorithms are flexible, automatically perform checks, and give informative errors and warnings. To use the numerical differential equation solver package, we load the **deSolve** package

`library(deSolve)`

The ODE solver we’ll use is called `ode`

. Have a look at the help page for this function.

`?ode`

The help page tells us that a call to `ode`

looks like

```
ode(y, times, func, parms,
method = c("lsoda", "lsode", "lsodes", "lsodar", "vode", "daspk",
"euler", "rk4", "ode23", "ode45", "radau",
"bdf", "bdf_d", "adams", "impAdams", "impAdams_d", "iteration"),
...)
```

In particular, `ode`

needs to know the *initial values* of the state variables (`y`

), the `times`

at which we want solutions, and the right-hand side of the ODE `func`

. The latter can optionally depend on some parameters (`parms`

). Finally, there are a large number of different algorithms (`method`

) that we can choose from; the default is the Livermore solver of ordinary differential equations with automatic method switching (LSODA). It’s an excellent, general purpose solver.

Let’s study the SIR model for a closed population, i.e., one in which we can neglect births and deaths. Putting \(B=\mu=0\) into the SIR equations, we obtain \[\label{eq:closed-sir}
\begin{aligned}
{\frac{\mathrm{d}^{}{S}}{\mathrm{d}{t}^{}}} &= -\frac{\beta\,S\,I}{N}\\
{\frac{\mathrm{d}^{}{I}}{\mathrm{d}{t}^{}}} &= \frac{\beta\,S\,I}{N}-\gamma\,I\\
{\frac{\mathrm{d}^{}{R}}{\mathrm{d}{t}^{}}} &= \gamma\,I\\
\end{aligned}
\] To encode these equations in a form suitable for use as the `func`

argument to `ode`

, we’ll need to write a function. For example:

```
closed.sir.model <- function (t, x, params) {
## first extract the state variables
S <- x[1]
I <- x[2]
R <- x[3]
## now extract the parameters
beta <- params["beta"]
gamma <- params["gamma"]
N <- S+I+R
## now code the model equations
dSdt <- -beta*S*I/N
dIdt <- beta*S*I/N-gamma*I
dRdt <- gamma*I
## combine results into a single vector
dxdt <- c(dSdt,dIdt,dRdt)
## return result as a list!
list(dxdt)
}
```

Note that the order and type of the arguments and output of this function must exactly match `ode`

’s expectations. Thus, for instance, the time variable `t`

must be the first argument even if, as is the case here, nothing in the function depends on time. [When the RHS of the ODE are independent of time, we say the ODE are *autonomous*; when they depend explicitly on time, they are said to be *nonautonomous*.] Note also, that `ode`

expects the values of the ODE RHS to be the first element of a `list`

.

Now we can call `ode`

to compute trajectories of the model, provided we have values for the parameters and the *initial conditions*, i.e., the values of the state variables \(S\), \(I\), and \(R\) at some initial time. If we’re thinking of a disease something like measles, and measuring time in years, we might use something like:

`parms <- c(beta=400,gamma=365/13)`

We now state the times at which we want solutions and specify the initial conditions:

```
times <- seq(from=0,to=60/365,by=1/365/4)
xstart <- c(S=999,I=1,R=0)
```

Next, we compute a model trajectory with the `ode`

command and store the result in a data-frame:

```
library(tidyverse)
ode(
func=closed.sir.model,
y=xstart,
times=times,
parms=parms
) %>%
as.data.frame() -> out
```

and plot the results using the commands:

```
out %>%
gather(variable,value,-time) %>%
ggplot(aes(x=time,y=value,color=variable))+
geom_line(size=2)+
theme_classic()+
labs(x='time (yr)',y='number of individuals')
```