This is our collection of resources on the theme of *Population Dynamics*. It will take you through the fascinating mathematics of creating mathematical models to describe the changes in populations of living creatures. This is an advanced set of material, taking you right through to university-level mathematical modelling.

If you have ambitions to become a famous and successful scientist or applied mathematician, this is a good place to start!

We made these resources with the help of two undergraduate Cambridge mathematicians and a postgraduate mathematical ecologist, who had often used many of these concepts throughout their studies, and we all hope that you enjoy this challenging and stimulating collection. Good luck!

# Population Dynamics

A hive of bees, a colony of ants and a parliament of owls.

These are just a few examples of animal groups, or *populations*. A population is dynamic: it is constantly changing in size and demographics. New animals are born, old animals die, and other factors such as drought, fire and lack of predators all cause a change in the population.

The *population growth* is the change in the number of individuals in a population, per unit time. For example, if a population has ten births and five deaths per year, then the population growth is five individuals per year.

In the following pages, we aim to represent populations and changes in populations using mathematics. This involves using differential equations and even probability.

Links to pages on differential equations:

- A First Model
- Exponential and Geometric Models
- The Logistic Equation
- The Logistic Map
- The Lotka-Volterra Equations
- Modified L-V Equations

Links to pages on probability:

### Beginning the Model

We are able to describe population growth by making some generalizations and using simple differential equations:

The size, $N_t$, of a population depends upon:

- The initial number of individuals, $N_0$
- The number of births, B
- The number of deaths, D
- The number of immigrants, I
- The number of emigrants, E

This gives us the equation: $$N_t=N_0+B-D+I-E$$

When a population is *closed*, there is no immigration or emigration. This often occurs on remote islands, such as the Galapagos Islands. Our equation then becomes $N_t=N_0+B-D$ , or equivalently $$N_{t+1}=N_t+B-D$$

Clearly the population will increase if $B> D$, and will decrease if $B< D$.

A population is in *equilibrium* if on average the population size remains constant over a long period of time. Mathematically, this means: $N_t=N_{t+\Delta t}$

**Question:**

We can rewrite the equation $N_{t+1}=N_t+B-D$ , as: $$N_{t+1}-N_t=\Delta N_t=B-D$$ Intuitively, why does this make sense? Think of an example of a population to explain why.

# Population Dynamics - part 1

### Per Capita Rates

It is important to relate the basic population parameters (such as births or deaths) to the size of the whole population. This allows us to make a better decision if a population is at risk.

We define the *per capita birth rate* (or natality rate) as the number of births per individual per unit time interval: $b=\frac {B}{N}$.

Similarly we define the *per capita death rate* (or mortality rate) as the number of deaths per individual per unit time interval: $d=\frac {D}{N}$

### The First Model

Recall the population equation from before: $$N_{t+1}=N_t+B-D$$ If we assume that the per capita birth and death rates do not change with the size (or density) of the population, we can rewrite our model in terms of per capita rates: $$N_{t+1}=N_t+bN_t-dN_t=N_t+(b-d)N_t$$ This model is said to be *density-independent.*

We call the term $r=b-d$, the *geometric rate of increase*. Note that $r=\frac{\Delta N_t}{N_t}$ , so *r* can be interpreted as the per capita rate of change of population size.

The equation for our model becomes: $$\begin{align*} N_{t+1}&=N_t+rN_t \\ &=(1+r)N_t \\ &=\lambda N_t \end{align*}$$ where $\lambda=1+r$ is defined as the *finite rate of increase*. Note that $\lambda=\frac{N_{t+1}}{N_t}$, so $\lambda$ can be interpreted as the ratio of the population size at one time step to its size at the previous time step.

How do you think we can solve this new equation? Go here for more information.

Do you think this model is valid in reality? What problems do you think might occur? Think about environmental resources and density-independence. An investigation of these problems can be found here.

**Question:**

If 20 sea otters from a total population of 850 are fatally affected by disease, what is the mortality as a per capita rate?

Given the population is initially 850, and increases to 1000 after one year, what is the value of $\lambda$?

Use this to find the per capita birth rate, and find the population size in ten years.

# Population Dynamics - part 2

### Discrete Modelling

We often use *discrete* mathematics to model a population when time is modelled in discrete steps. This fits well with annual censuses of wildlife populations.

Sometimes populations are themselves discrete, such as:

- Species with non-overlapping generations (eg. annual plants)
- Species with pulsed reproductions (eg. many wildlife species in seasonal environments)

### Geometric Growth

The population equation, $N_{t+1}=\lambda N_t$, from before means that over discrete intervals of time, $t_0, t_1, t_2, ...$, the rate of change in population size is proportional to the size of the population.

We first solve this equation: $$\begin{align*} N_{t+1}&=\lambda N_t \\ &=\lambda \lambda N_{t-1} \\& =...\\ &= \lambda^{t+1} N_0 \\ \Rightarrow N_t &=\lambda^t N_0 \end{align*}$$ The population size will depend on the value of $\lambda$

- If $\lambda> 1$ then exponential increase
- If $\lambda=1$ then stationary population
- If $\lambda< 1$ then exponential decrease
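The three cases above can be checked numerically. The following is a minimal Python sketch; the starting size of 100 and the three values of $\lambda$ are illustrative assumptions rather than values from the text, and the function names `geometric_growth` and `closed_form` are our own.

```python
def geometric_growth(n0, lam, steps):
    """Iterate N_{t+1} = lam * N_t and return [N_0, N_1, ..., N_steps]."""
    sizes = [n0]
    for _ in range(steps):
        sizes.append(lam * sizes[-1])
    return sizes


def closed_form(n0, lam, t):
    """Closed-form solution N_t = lam**t * N_0."""
    return lam ** t * n0


# Illustrative values only: one lambda for each of the three cases.
for lam in (1.2, 1.0, 0.8):
    iterated = geometric_growth(100.0, lam, 10)[-1]
    direct = closed_form(100.0, lam, 10)
    print(f"lambda={lam}: iterated={iterated:.2f}, closed form={direct:.2f}")
```

Stepping the recurrence and evaluating $\lambda^t N_0$ directly should agree up to rounding, which is a quick way to check the algebra in the derivation.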

**Question:** If a population of owls increases by 40% in a year, what is the value of *r* and $\lambda$ ?

Given there were initially 10 owls, what will the population size be in 75 days? Can you plot this population growth?

### Exponential Growth

Some populations may grow continuously, without pulsed births and deaths (e.g. humans). In these cases, the population size is a smooth, continuous function of time, so we use differential equations to represent this continuous model.

Using our discrete model from above: $$\begin{align*} N(t+\Delta t)&=\lambda^{\Delta t} N(t) =(1+r)^{\Delta t}N(t)\approx (1+r\Delta t) N(t)\\ \Rightarrow \Delta N(t)&\approx r \Delta t \, N(t) \\ \Rightarrow \lim_{\Delta t \to 0} \frac{\Delta N(t)}{\Delta t} &=\frac {\mathrm{d}N(t)}{\mathrm{d}t}=rN(t) \end{align*}$$

**Question:** Solve the equation $\frac {\mathrm{d}N(t)}{\mathrm{d}t}=rN(t)$ using standard integrals, showing that the solution is $N(t)=N_0e^{rt}$.
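To see how the discrete update turns into the differential equation, it can help to shrink the time step numerically. The sketch below assumes the per-step update $N \to (1+r\Delta t)N$ from the approximation above and compares the result with $N_0e^{rt}$. The parameter values and the function names `compound_growth` and `exponential_growth` are illustrative assumptions.

```python
import math


def compound_growth(n0, r, t, dt):
    """Apply N -> (1 + r*dt) * N repeatedly until time t is reached."""
    n = n0
    for _ in range(round(t / dt)):
        n *= 1 + r * dt
    return n


def exponential_growth(n0, r, t):
    """Continuous solution N(t) = N_0 * exp(r * t)."""
    return n0 * math.exp(r * t)


# Illustrative values: N_0 = 10, r = 0.5, t = 2.
exact = exponential_growth(10.0, 0.5, 2.0)
for dt in (1.0, 0.1, 0.01, 0.001):
    approx = compound_growth(10.0, 0.5, 2.0, dt)
    print(f"dt={dt}: discrete={approx:.4f}, continuous={exact:.4f}")
```

As the step size gets smaller, the discrete value should approach the continuous one, which is the limit taken in the derivation.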

Different values of *r* determine the change in population size, as shown below.

[Image: population size over time for different values of *r*]

Also note the connection between the discrete and continuous solutions: $$\begin{align*} N_t =\lambda^t N_0 &\text{ and } N(t)=N_0 e^{rt} \\ \Rightarrow \lambda^t&=e^{rt} \\ \lambda&=e^r \\ \ln(\lambda)&=r \end{align*}$$

**Question:** Using the discrete model above, how long does it take for this population to double in size? What about the continuous case?

### Limitations of the Models

Consider a population of insects which suddenly dies out right before the start of every time period, and whose children hatch right after. A discrete model would lead us to believe that there are no insects during the entire period, so instead we should use a continuous model.

On the other hand, it is often impossible to continually monitor the population size, so we approximate using the discrete case.

Choosing which of discrete or continuous to use is an important decision in modelling populations.

Can you also think of any assumptions we have made with these models, and why they could be a problem? Consider the environment the population inhabits and differences between members of the population.

Click here to see the geometric model adapted to include environmental resistance.

**Question:** If $\lambda = 1.25$, by how much does a population of blue-footed boobies increase per year?

The population $N(t)$ of blue-footed boobies is assumed to satisfy the logistic growth equation $\frac {\mathrm{d}N}{\mathrm{d}t}=\frac{1}{500} N(t) \big( 1-N(t)\big)$. Given $N_0=200$, solve for $N(t)$. Repeat for $N_0=2000$. Discuss the long-term behaviour of the population in both cases.

We then solve this equation by separating variables: $$\begin{align*} \frac {\mathrm{d}N}{\mathrm{d}t}&=rN(t) \\ \frac {\mathrm{d}N}{N(t)}&=r\,\mathrm{d}t \\ \ln\big(N(t)\big)&=rt+c \\ N(t)&=e^{rt}e^{c} \\ \therefore N(t)&=N_0e^{rt} \end{align*}$$ where $N_0=e^c$ is the population size at $t=0$.

Assumptions made in these models include:

- The birth and death rates remain constant and are the same for all individuals
- No random changes over time (eg. due to fire, drought)
- The population is closed
- No time lag in the continuous model

# Population Dynamics - part 3

### The Logistic Equation

Unlimited growth is generally impossible, because birth and death rates are affected by factors such as lack of food, water and land. We now incorporate the effect of the environment on population size into the exponential model: $\frac {\mathrm{d}N}{\mathrm{d}t} =rN(t)$

Below are two derivations of this new model, called the logistic equation. It was originally devised by the mathematician Pierre-François Verhulst. Notes on the discrete form of this equation can be found here.

### First Derivation

The *carrying capacity*, *K*, is the largest population that can be supported indefinitely, given the resources available in the environment. When the population size is far below *K*, its growth is exponential. As the population approaches *K*, it begins to be affected by the reduced ability of the environment to provide necessary resources.

In order to have this effect, we consider the term $\frac {K-N}{K}$. Note that when $N \ll K$ then $\frac {K-N}{K}\approx 1$ and when $N \rightarrow K$ then $\frac {K-N}{K} \rightarrow 0$.

Including the above term, we now get the equation: $$\frac {\mathrm{d}N}{\mathrm{d}t}=r \frac{K-N}{K} N(t)= rN\left(1-\frac{N}{K}\right)$$ Note that each individual added to the population reduces the per capita rate of increase of the population.

### Second Derivation

Consider a population of lions of size *y*, and begin by assuming the simple exponential equation $\frac {\mathrm{d}y}{\mathrm{d}t}=r y$ , where *r* is the intrinsic growth rate.

If the probability of food being found by an individual lion is proportional to *y*, then the probability of some food being found by two lions is proportional to $y^2$. Fighting will occur within the species for these limited resources, so the death rate due to fighting can be represented by $\gamma y^2$. Our equation then becomes: $$\frac {\mathrm{d}y}{\mathrm{d}t}=ry-\gamma y^2=ry\left(1-\frac{y}{Y}\right)$$ where $Y=\frac{r}{\gamma}$. Again this is the logistic equation.

**Question:** Graph the logistic equation to find the equilibrium points. Then solve the equation using standard integrals, showing that the solution is given by $N(t)=\frac {K N_0 e^{rt}}{K+N_0 (e^{rt}-1)}$

Graphically we describe this kind of population growth by a sigmoid, or S-shaped growth curve:

By looking at the blue curve, we see the size of the population begins to level off and approaches a stable value, which cannot exceed the carrying capacity of the environment (marked in red).

Mathematically, we can show this by looking at the above solution for $N(t)$ and showing $\lim_{t\to\infty} N(t)=K$

**Question**: A small lake has a carrying capacity of 100 geese. Starting with a pair of geese, how would the population change over 70 years if $r=0.1$? Draw a sigmoid graph of this change.

Recall that *r* is the intrinsic growth rate. How will this affect the time taken for the number of geese to reach the carrying capacity of the lake? Perhaps start by drawing graphs with different *r*-values.

We first rewrite this equation: $$\begin{align*} \frac{\mathrm{d}N}{\mathrm{d}t}&=rN\left(1-\frac{N}{K}\right) \\ \frac{1}{K} \frac{\mathrm{d}N}{\mathrm{d}t} &= \frac{N}{K}\, r\left(1-\frac{N}{K}\right) \end{align*}$$ To simplify this, let $x=\frac{N}{K}$: $$\begin{align*} \frac{\mathrm{d}x}{\mathrm{d}t}&=xr(1-x) \\ \frac{\mathrm{d}x}{x(1-x)}&=r\,\mathrm{d}t \\ \left( \frac{1}{x}+\frac{1}{1-x} \right) \mathrm{d}x&=r\,\mathrm{d}t \\ \ln\left(\frac{x}{1-x}\right)&=rt+c \\ \frac{x}{1-x}&=X_0e^{rt} \\ x&=X_0e^{rt}-x X_0e^{rt} \\ x&=\frac{X_0e^{rt}}{1+X_0e^{rt}} \end{align*}$$ where $X_0=e^c=\frac{N_0}{K-N_0}$ is fixed by the initial condition. Putting *N* back in: $$N(t)=\frac {K N_0 e^{rt}}{K+N_0 (e^{rt}-1)}$$ As expected, over time the population tends to the maximum allowed by the environment: $$\lim_{t\to\infty} N(t)=K$$
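The closed-form solution can be sanity-checked against a direct numerical integration of the logistic equation. The sketch below is one way to do this under stated assumptions: it uses a simple forward-Euler step (our choice, not a method from the text), and the values $N_0=10$, $r=0.5$, $K=1000$ are purely illustrative.

```python
import math


def logistic_solution(n0, r, k, t):
    """Closed form N(t) = K N0 e^{rt} / (K + N0 (e^{rt} - 1))."""
    growth = math.exp(r * t)
    return k * n0 * growth / (k + n0 * (growth - 1))


def logistic_euler(n0, r, k, t, dt=0.001):
    """Forward-Euler integration of dN/dt = r N (1 - N/K) up to time t."""
    n = n0
    for _ in range(round(t / dt)):
        n += dt * r * n * (1 - n / k)
    return n


# Illustrative values only.
for t in (0, 5, 10, 20, 40):
    exact = logistic_solution(10.0, 0.5, 1000.0, t)
    approx = logistic_euler(10.0, 0.5, 1000.0, t)
    print(f"t={t}: closed form={exact:.2f}, Euler={approx:.2f}")
```

Both columns should level off near the carrying capacity *K*, matching the limit $\lim_{t\to\infty} N(t)=K$.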

# Population Dynamics - part 4

### The Logistic Map

The logistic map is the discrete analogue of the logistic equation. Recall that the logistic equation is given by: $\frac {\mathrm{d}y}{\mathrm{d}t}=ry\left(1-\frac{y}{Y}\right)$

We then approximate to deduce the discrete case:$$ \begin{align*} \frac{y_{n+1}-y_n}{\Delta t} &\approx ry_n\left(1-\frac{y_n}{Y}\right) \\ y_{n+1} &\approx r \Delta t y_n \left(1-\frac{y_n}{Y}\right)+y_n \\ y_{n+1}&=(1+r \Delta t)y_n-\frac {r\Delta t}{Y}{(y_n)}^2 \\ y_{n+1}&=(1+r \Delta t)y_n\Bigg( 1-\bigg(\frac{r\Delta t}{1+r\Delta t}\bigg)\frac{y_n}{Y}\Bigg) \end{align*} $$

Let $\lambda=1+r\Delta t$ and $x_n=\frac {r\Delta t}{1+r \Delta t} \frac {y_n}{Y}$ . Then our equation becomes: $$x_{n+1}=\lambda x_n (1-x_n) $$ This is the logistic map. We can also think of it as a function $x_{n+1}=f(x_n)$.
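Iterating the map by hand quickly becomes tedious, so a short program helps to explore it. The following Python sketch simply applies $x_{n+1}=\lambda x_n(1-x_n)$ repeatedly; the starting value $x_0=0.2$ and the chosen values of $\lambda$ are illustrative assumptions, and `logistic_map` is our own function name.

```python
def logistic_map(lam, x0, steps):
    """Return [x_0, x_1, ..., x_steps] for x_{n+1} = lam * x_n * (1 - x_n)."""
    xs = [x0]
    for _ in range(steps):
        x = xs[-1]
        xs.append(lam * x * (1 - x))
    return xs


# Illustrative values of lambda; x_0 = 0.2 is an arbitrary starting point.
for lam in (0.5, 1.5, 2.8, 3.9):
    tail = logistic_map(lam, 0.2, 200)[-4:]
    print(lam, [round(x, 4) for x in tail])
```

Printing the last few iterates gives a feel for the long-term behaviour: settling to a single value for some $\lambda$, and failing to settle for others.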

### Finding Equilibrium Points

**Question:** A fixed point implies $x_{n+1}=x_n$. Find the fixed points by solving $$ \lambda x_n (1-x_n) = x_n $$

To determine the stability of these points, we investigate the function for values near the equilibrium points.

Start by supposing that $x_n=X$ is a fixed point. This means that $f(X)=X$.

To find a value near the equilibrium point, let $x_n=X+\epsilon_n$ where $|\epsilon_n| \ll 1$. Then using the Taylor expansion: $$ \begin{align*} x_{n+1}&=f(x_n) \\ X+\epsilon_{n+1} &= f(X+\epsilon_n) \\ &=f(X)+\epsilon_n f'(X)+... \end{align*}$$

We neglect the higher-order terms to get: $$X+\epsilon_{n+1}\approx f(X)+\epsilon_n f'(X)$$ Now from above we saw that $f(X)=X$, so we can simplify to get: $$\epsilon_{n+1} \approx f'(X) \epsilon_n$$ A fixed point, *X*, is then stable if: $\left|\frac{\epsilon_{n+1}}{\epsilon_n}\right| =\left|f'(X)\right| < 1$
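The stability criterion $|f'(X)|<1$ can be tested numerically for any map. The sketch below estimates $f'(X)$ with a central difference, which is our own choice of method, and applies it to the logistic map $f(x)=\lambda x(1-x)$. The helper names and the values of $\lambda$ are illustrative assumptions.

```python
def logistic(lam):
    """Return the logistic map f(x) = lam * x * (1 - x) for a given lambda."""
    return lambda x: lam * x * (1 - x)


def numerical_derivative(f, x, h=1e-6):
    """Central-difference estimate of f'(x)."""
    return (f(x + h) - f(x - h)) / (2 * h)


def is_stable(f, fixed_point):
    """Apply the criterion |f'(X)| < 1 at a fixed point X of f."""
    return abs(numerical_derivative(f, fixed_point)) < 1


# Illustrative check at the two fixed points 0 and 1 - 1/lambda.
for lam in (0.5, 1.5, 2.5, 3.5):
    f = logistic(lam)
    print(lam, is_stable(f, 0.0), is_stable(f, 1 - 1 / lam))
```

The second column only has a biological meaning when $1-\frac{1}{\lambda}$ is non-negative, that is, when $\lambda \geq 1$.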

**Question:** Given that $f'(x)=\lambda-2\lambda x$ , find the stability of the fixed points $x_n=0$ and $x_n=1-\frac{1}{\lambda}$

### Different Cases of Stability

Below are some graphs of the logistic map for different values of $\lambda$ .

**Case 1: $\lambda< 1$**

The only non-negative fixed point is 0, which is stable:

**Case 2: $1< \lambda < 2$**

Unstable fixed point at 0 and stable fixed point at $1-\frac{1}{\lambda}$

**Question**: Can you find the stability for the case $2< \lambda < 3$ ?

Below is a picture of some fantastic fractal behaviour which occurs for $3< \lambda< 4$.

**Question:** Can you relate these values of $\lambda$ to what would actually be occurring in a population of organisms?

So $x=0$ is stable for $-1< \lambda < 1$ and $x=1-\frac{1}{\lambda}$ is stable for $1< \lambda < 3$.

For $2< \lambda < 3$ there is oscillatory convergence to the stable fixed point.

# Population Dynamics - part 5

### The Lotka-Volterra Equations

The Lotka-Volterra equations allow us to model a biological system containing a predator and a prey species. They arose in the 1920s due to the independent work of the mathematicians Alfred Lotka and Vito Volterra.

### Formulating the Equations

Let *x* be the size of a fish population and *y* be the size of the shark population. If the prey have an unlimited food supply and no predators, then they grow exponentially according to the equation $\frac {\mathrm{d}x}{\mathrm{d}t}=\lambda x$ , for some constant $\lambda$.

If there are more sharks, more fish will be killed and if there are fewer sharks, then fewer fish will be killed. So the rate of predation upon the prey is proportional to the rate at which the predators and prey meet. We represent this by $\alpha xy$.

Note that if either *x* or *y* is zero then there can be no predation.

The equation for prey becomes: $$\frac {\mathrm{d}x}{\mathrm{d}t}=\lambda x-\alpha xy =x(\lambda -\alpha y)$$

Conversely, predators rely on prey to survive, so in the absence of any prey, the predator equation is $\frac {\mathrm{d}y}{\mathrm{d}t}=-\gamma y$. The growth of the predator population will depend on the population sizes of the predators and prey, and the ability of the predators to successfully catch prey. We represent this by $\beta xy$.

The equation for predators becomes: $$\frac {\mathrm{d}y}{\mathrm{d}t}=\beta xy-\gamma y =y(\beta x -\gamma)$$
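The two equations cannot be solved in closed form, but they are easy to integrate numerically. The following is a minimal sketch using a classical fourth-order Runge-Kutta step, which is our own choice of method. The parameter values $(\lambda, \alpha, \beta, \gamma) = (1.0, 0.5, 0.2, 0.6)$ and the starting populations are illustrative assumptions, not values from the text.

```python
def lotka_volterra(state, lam, alpha, beta, gamma):
    """Right-hand sides dx/dt = x(lam - alpha*y), dy/dt = y(beta*x - gamma)."""
    x, y = state
    return (x * (lam - alpha * y), y * (beta * x - gamma))


def rk4_step(state, dt, params):
    """Advance (x, y) by one Runge-Kutta step of size dt."""
    def shifted(slope, scale):
        return tuple(s + scale * k for s, k in zip(state, slope))

    k1 = lotka_volterra(state, *params)
    k2 = lotka_volterra(shifted(k1, dt / 2), *params)
    k3 = lotka_volterra(shifted(k2, dt / 2), *params)
    k4 = lotka_volterra(shifted(k3, dt), *params)
    return tuple(
        s + dt / 6 * (a + 2 * b + 2 * c + d)
        for s, a, b, c, d in zip(state, k1, k2, k3, k4)
    )


def simulate(x0, y0, params, dt=0.01, steps=2000):
    """Return the list of (x, y) states visited, starting from (x0, y0)."""
    trajectory = [(x0, y0)]
    for _ in range(steps):
        trajectory.append(rk4_step(trajectory[-1], dt, params))
    return trajectory


# Illustrative run: 4 units of prey and 2 units of predators.
path = simulate(4.0, 2.0, (1.0, 0.5, 0.2, 0.6))
for x, y in path[::400]:
    print(f"prey={x:.3f}, predators={y:.3f}")
```

Plotting the *x* and *y* values against time, or against each other, should show both populations rising and falling repeatedly rather than settling down.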

### Using the Equations

We are interested in population equilibrium, which occurs when neither population is changing: $$ \begin{align*} \frac {\mathrm{d}x}{\mathrm{d}t}&=\frac {\mathrm{d}y}{\mathrm{d}t}=0 \\ x(\lambda -\alpha y)&=y(\beta x -\gamma)=0 \\ \Rightarrow x=0,y=0 &\text{ or } x=\frac{\gamma}{\beta}, y=\frac{\lambda}{\alpha} \end{align*}$$ The first solution occurs when both species die out, the second when the sizes of both species reach an equilibrium with each other.

Determining the stability of these equilibrium points requires knowledge of the Jacobian matrix and linearization, which we omit here because of its complexity.

**Question**: Consider a species of fish, *x*, and a species of shark, *y*, with population equations: $$\frac {\mathrm{d}x}{\mathrm{d}t}=2x-4xy , \frac {\mathrm{d}y}{\mathrm{d}t}=xy-2y$$ **a)** Where are the equilibrium points?

**b)** Below is a phase diagram of the above equations. Try and work out what it is representing. What happens at the red point?

We can also plot the size of the populations against time on the same graph. The graph below clearly shows the dependency of the shark population on a large population of prey, and the oscillatory nature of the two population sizes.

**Question**: Why do you think the shark population peaks after the fish population has soared? Why do you think the shark population is always less than the fish population?

### Limitations to our model

In reality if two species coexist, their interactions also depend on the environment they inhabit. This is explored further here. Some of the assumptions made in the above equations are:

- The prey always find ample food
- The rate of change of population is proportional to its size

**Question:** Can you think of any other assumptions we have made in our model?


The red dot marks the point at which the populations of sharks and fish remain unchanged.

The equilibrium points are $x=y=0$ and $x=2, y=0.5$.

Other factors:

- The predators only get food by eating the prey
- The environment does not change to favour one species and genetic evolution is slow

# Population Dynamics - part 6

We now incorporate the effect of the environment on the Lotka-Volterra equations derived earlier. Consider a population of giraffes of size *x* and a population of hyenas of size *y.*

Using the logistic equation from before, we can model the effect of the carrying capacity with the equations: $$ \begin{align*} \frac {\mathrm{d}x}{\mathrm{d}t}&=r_1 \frac{K_1-x}{K_1} x= r_1 x\left(1-\frac{x}{K_1}\right) \\ \frac {\mathrm{d}y}{\mathrm{d}t}&=r_2 \frac{K_2-y}{K_2} y = r_2y\left(1-\frac{y}{K_2}\right) \end{align*}$$

An increase in the population of either species will reduce the resources available to both. In order to model this, we introduce a *competition coefficient* to represent the competitive effect of one species on the other.

Let $\alpha$ be the competitive effect of the hyenas on the giraffes, and $\beta$ be the competitive effect of the giraffes on the hyenas. We then consider the terms $\frac{K_1-x-\alpha y}{K_1}$ and $\frac{K_2-y-\beta x}{K_2}$ .

**Question:** Can you explain the logic behind these terms? Think what would happen if either the giraffe or hyena population died out.

Our population equations then become: $$\begin{align*} \frac {\mathrm{d}x}{\mathrm{d}t}&=r_1 x \Bigg(1- \frac{x+\alpha y}{K_1}\Bigg) \\ \frac {\mathrm{d}y}{\mathrm{d}t}&=r_2 y \Bigg(1- \frac{y+\beta x}{K_2}\Bigg) \end{align*}$$
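When both species survive, both brackets in these equations must vanish, which gives the linear system $x+\alpha y=K_1$ and $\beta x+y=K_2$. The sketch below solves that system and checks that both growth rates are zero there. The function names and the parameter values ($K_1=100$, $K_2=80$, $\alpha=0.5$, $\beta=0.4$) are illustrative assumptions, not values from the text.

```python
def coexistence_equilibrium(k1, k2, alpha, beta):
    """Solve x + alpha*y = K1 and beta*x + y = K2 for (x, y)."""
    det = 1 - alpha * beta
    if det == 0:
        raise ValueError("alpha * beta = 1: no unique coexistence point")
    x = (k1 - alpha * k2) / det
    y = (k2 - beta * k1) / det
    return x, y


def growth_rates(x, y, r1, r2, k1, k2, alpha, beta):
    """Evaluate dx/dt and dy/dt for the competition equations."""
    dx = r1 * x * (1 - (x + alpha * y) / k1)
    dy = r2 * y * (1 - (y + beta * x) / k2)
    return dx, dy


# Illustrative parameter values only.
x_eq, y_eq = coexistence_equilibrium(100.0, 80.0, 0.5, 0.4)
print(x_eq, y_eq)
print(growth_rates(x_eq, y_eq, 1.0, 1.0, 100.0, 80.0, 0.5, 0.4))
```

The solution is only a meaningful coexistence point when both *x* and *y* come out positive; the other equilibria have at least one population equal to zero.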

**Question**: Suppose (quite grandly) that two populations of giraffes and hyenas have population equations $\frac {\mathrm{d}x}{\mathrm{d}t}=2x\bigg(1-\frac{x+2y}{5}\bigg)$ and $\frac {\mathrm{d}y}{\mathrm{d}t}=y\bigg(1-\frac{y+x}{3}\bigg)$ .

What are the equilibrium points?

Look at the phase diagram below. What is happening to the populations of both animals? What do you think the red point means?

**Question**: If you were to create your own model, what other parameters would you consider when creating differential equations to describe the population sizes? Perhaps think about other predators and seasonal variation of the carrying capacity.

# Branching Processes and Extinction

This is the final destination in our series on population dynamics. Well done for making it this far!

#### Branching Processes

Following on from the basic branching processes introduction, we now calculate the expected number of individuals at the *n*th generation.

As in the above example, let $Z_n$ be the number of individuals in generation *n,* and *X* be a random variable describing the number of offspring an individual has, with $E[X]=\mu$ and $Var[X]=\sigma^2$

$$ \begin{align*} \therefore E[Z_n] & = G'_n(1) \\ &= G'_{n-1} \Big (G(1) \Big)G'(1) \\ &= G'_{n-1}(1)G'(1) \\ &= E[Z_{n-1}] \mu \\ &= \mu^2 E[Z_{n-2}] =... \\ &= \mu^n E[Z_0] \\ &= \mu^n \end{align*} $$

Clearly the eventual population size is highly dependent on the value of $\mu$

- if $\mu < 1$ then $E[Z_n] \rightarrow 0$
- if $\mu = 1$ then $E[Z_n] \rightarrow 1$
- if $\mu > 1$ then $E[Z_n] \rightarrow \infty $

So if each individual is expected to have more than one offspring, then the expected population size will increase. If each individual is expected to have one offspring or fewer on average, then the expected population size will remain constant or decrease.
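The result $E[Z_n]=\mu^n$ can be checked by simulation. The sketch below is a rough Monte Carlo experiment under stated assumptions: it starts from a single individual ($Z_0=1$) and uses an illustrative offspring distribution $P(X=0)=0.25$, $P(X=1)=0.25$, $P(X=2)=0.5$, which is not taken from the text. The function names are our own.

```python
import random


def offspring_mean(offspring_probs):
    """Mean family size mu, where offspring_probs[k] = P(X = k)."""
    return sum(k * p for k, p in enumerate(offspring_probs))


def next_generation(size, offspring_probs, rng):
    """Total offspring of `size` individuals, drawn independently."""
    counts = range(len(offspring_probs))
    return sum(rng.choices(counts, weights=offspring_probs, k=size))


def simulate_mean(offspring_probs, generations, trials=20000, seed=0):
    """Average Z_n over many simulated family trees, starting from Z_0 = 1."""
    rng = random.Random(seed)
    total = 0
    for _ in range(trials):
        z = 1
        for _ in range(generations):
            z = next_generation(z, offspring_probs, rng)
        total += z
    return total / trials


probs = [0.25, 0.25, 0.5]  # illustrative distribution with mu = 1.25
mu = offspring_mean(probs)
for n in (1, 3, 5):
    print(n, round(simulate_mean(probs, n), 3), round(mu ** n, 3))
```

The simulated averages should be close to $\mu^n$, with some random scatter that shrinks as the number of trials grows.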

**Question:**

Why do elephants not die out, if the above comment on mean family size holds? What are the limitations of our model in representing the reproductive lifespan of an elephant?

#### Probability of Extinction

By evaluating the mean, we see that ultimate extinction is certain only when the mean family size is $\mu \leq 1$.

To find this probability exactly, we let the probability of extinction by the *n*th generation be $\theta_n=P(Z_n=0)$. So the probability of ultimate extinction is $\theta = \lim_{n \to \infty} \theta_n=\lim_{n \to \infty} P(Z_n=0)$.

$$ \begin{align*} \theta_n & = G_n(0) \\ & = G_{n-1} \Big(G(0)\Big) \\ &= G\Big(G\big(...(0)...\big)\Big) \\ &= G\Big(G_{n-1}(0)\Big) \\ &= G(\theta_{n-1}) \end{align*} $$

So as $n \rightarrow \infty$ , we have $\theta_n\rightarrow \theta$ and $G(\theta_{n-1}) \rightarrow G(\theta)$ . And so we can find $\theta$ by solving $$\theta=G(\theta)$$

Now there may be other roots to this equation, so we show $\theta$ is the smallest by supposing $\alpha$ is any non-negative root. Since *G* is non-decreasing on $[0,1]$, $\theta_1=G(0) \leq G(\alpha)=\alpha \Rightarrow \theta_2=G(\theta_1) \leq G(\alpha)=\alpha$

And so proceeding by induction, $\theta =\lim_{n \to \infty} \theta_n \leq \alpha$, which shows $\theta$ is indeed the smallest non-negative root.
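The recursion $\theta_n=G(\theta_{n-1})$ with $\theta_0=0$ also gives a practical way to compute $\theta$. The sketch below iterates it until successive values agree. The offspring distribution $P(X=0)=0.25$, $P(X=1)=0.25$, $P(X=2)=0.5$ is an illustrative assumption; for it, $\theta=G(\theta)$ reduces to $2\theta^2-3\theta+1=0$, with roots $\tfrac12$ and $1$.

```python
def pgf(offspring_probs, s):
    """Evaluate G(s) = p_0 + p_1 s + p_2 s^2 + ... for a finite distribution."""
    return sum(p * s ** k for k, p in enumerate(offspring_probs))


def extinction_probability(offspring_probs, tol=1e-12, max_iter=100000):
    """Iterate theta_n = G(theta_{n-1}) from theta_0 = 0 until it settles."""
    theta = 0.0
    for _ in range(max_iter):
        new_theta = pgf(offspring_probs, theta)
        if abs(new_theta - theta) < tol:
            return new_theta
        theta = new_theta
    return theta


# Illustrative distribution with mean 1.25.
print(extinction_probability([0.25, 0.25, 0.5]))
```

Because the iteration starts at 0 and increases, it should settle on the smaller root $\tfrac12$ rather than on 1. When the mean family size is exactly 1 the iteration still converges, but only slowly.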

The dependence of $\theta$ on the value of the mean family size is shown in the diagrams below. The first case is $\mu \leq 1$ and the second case is $\mu > 1$.

**Example:**

In the previous elephant example, we now solve for $\theta$ in the equation $$ \begin{align*} \theta & = G(\theta) \\ &= (1-p^n)+p^n \theta\\ \theta(1-p^n) &= (1-p^n) \\ \therefore \theta &=1 \end{align*}$$

# Population Ecology using Probability

### Branching Processes

Branching processes, or tree graphs, model the growth and eventual size of a population. If we know the probabilities of the number of offspring produced at each generation, then we can determine the probability of ultimate extinction, or the eventual population size.


### Probability Generating Functions

Consider a variable *X,* where $P(X=0)=p_0, P(X=1)=p_1, ...$

This is an integer valued variable with its mass function as a sequence. We set two conditions:

- All probabilities need to be non-negative: $p_k \geq 0$
- Only one event can and must occur, so $p_0+p_1+...=\displaystyle\sum\limits_{k=0}^{\infty} p_k =1$

The *probability generating function* *G* is an ordinary function of *s*: $$G_X(s)=p_0+p_1 s+p_2 s^2+...$$

**Question:** What is the value of $G_X(s)$ when $s=0$? And when $s=1$?

**Example:** Consider a random variable *Y* with the geometric distribution with parameter *p*.

Then $P(Y=k)=p(1-p)^{k-1}=pq^{k-1}$ for $k=1,2,...$, where $q=1-p$.

So *Y* has PGF given by: $$\begin{align*} G_Y(s) & = \displaystyle \sum_{k=1}^{\infty} p q^{k-1} s^k \\ &= ps \displaystyle \sum_{k=0}^{\infty} (qs)^k \\ &= \frac {ps}{1-qs} \end{align*}$$
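The closed form can be compared with a truncated version of the series to check the algebra. In the sketch below, the values $p=0.3$ and $s=0.9$, the truncation at 2000 terms and the function names are all illustrative assumptions.

```python
def geometric_pgf_series(p, s, terms=2000):
    """Truncated sum of p * q**(k-1) * s**k for k = 1, ..., terms."""
    q = 1 - p
    return sum(p * q ** (k - 1) * s ** k for k in range(1, terms + 1))


def geometric_pgf_closed(p, s):
    """Closed form G_Y(s) = p*s / (1 - q*s), valid when |q*s| < 1."""
    q = 1 - p
    return p * s / (1 - q * s)


# Illustrative values only.
print(geometric_pgf_series(0.3, 0.9), geometric_pgf_closed(0.3, 0.9))
print(geometric_pgf_closed(0.3, 1.0))
```

The two values on the first line should agree closely, and evaluating at $s=1$ should give 1 up to rounding, since the probabilities sum to 1.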

### Expectation

We can relate the PGF to the mean, or *expectation*. Recall that: $$E(X)=\bar x = \sum_{\text{all } x} xP(X=x)$$ We can extend this definition to not just a variable, but to a function of a variable: $$E(g(X))=\bar{g}(x) = \sum_{\text{all } x} g(x) P(X=x)$$ This definition reminds us of our PGF power series, with the important result: $$ G_X(s)=p_0+p_1 s+p_2 s^2+...=E(s^X)$$

### Random Sums Formula

Consider a population of meerkats, where each individual has a random number of offspring in the next generation. Using this information, we can determine the total expected number of offspring in future generations.

First let $N, X_1, X_2, ...$ be independent random variables, with $X_1, X_2, ...$ all having the same probability generating function *G*. Think of each $X_i$ as the number of offspring of an individual meerkat in our population. This means that our PGF is given by $G(s)=p_0+p_1s+p_2s^2+...$, where $p_0=P(\text{no offspring}), p_1=P(\text{one offspring}) , ...$

We are interested in finding the PGF of the sum $T=X_1+X_2+...+X_N$: $$\begin{align*} G_T(s) & = E[s^T] \\ &= \sum_{n=0}^{\infty} E\Big [s^T|N=n\Big ] P(N=n) \\ & = \sum_{n=0}^{\infty} G(s)^n P(N=n) \\ & = E[G(s)^N] \\ &= G_N \Big( G(s) \Big) \end{align*} $$

**Example:** Elephants (in most cases) only have one offspring at a time, with probability *p*, say. We can model the number of offspring using the Bernoulli distribution with parameter *p.*

Generation *n+1* consists of the offspring of generation *n*.

Let $Z_{n+1}= \displaystyle \sum_{j=1}^{Z_n} X_j$ , where $X_j$ is the number of offspring of the *j*th individual in generation *n.*

In the first generation: $G_{Z_1} (s)=G_X(s)=(1-p)+ps$

In the second generation: $G_{Z_2} (s)=G_{Z_1} \bigg(G_X (s) \bigg)=(1-p)+p\big((1-p)+ps\big)=(1-p^2)+p^2 s$

Continuing, we see that at the *n*th generation: $G_{Z_n} (s)=(1-p^n)+p^n s$
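The pattern $G_{Z_n}(s)=(1-p^n)+p^n s$ can be confirmed by composing the PGFs mechanically. The sketch below stores a PGF as its list of coefficients $[p_0, p_1, ...]$ and builds $G_{Z_{n+1}}(s)=G_{Z_n}\big(G_X(s)\big)$ by polynomial composition. The helper names and the value $p=0.5$ are illustrative assumptions.

```python
def poly_mul(a, b):
    """Multiply two polynomials given as coefficient lists."""
    out = [0.0] * (len(a) + len(b) - 1)
    for i, ai in enumerate(a):
        for j, bj in enumerate(b):
            out[i + j] += ai * bj
    return out


def poly_compose(outer, inner):
    """Coefficients of outer(inner(s))."""
    result = [0.0]
    power = [1.0]  # inner(s) ** 0
    for coeff in outer:
        padded = result + [0.0] * (len(power) - len(result))
        result = [r + coeff * c for r, c in zip(padded, power)]
        power = poly_mul(power, inner)
    return result


def generation_pgf(offspring, n):
    """Coefficients of G_{Z_n}, starting from G_{Z_1} = G_X."""
    g = list(offspring)
    for _ in range(n - 1):
        g = poly_compose(g, offspring)
    return g


p = 0.5  # illustrative Bernoulli parameter
for n in (1, 2, 3):
    print(n, generation_pgf([1 - p, p], n), [1 - p ** n, p ** n])
```

For each *n*, the computed coefficients should match $[1-p^n,\ p^n]$. The same functions work for any offspring distribution with finitely many outcomes, not just the Bernoulli case.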

Now click here to find out about branching processes and how we can use probability to determine the likelihood of a population becoming extinct.

Note that $G_X(0)=p_0$ and $G_X(1)=1$.