Some of the situations discussed in this solution also appear in the problem Binomial or Not?

We focus on the case $n=2$, though these examples can easily be extended.

The probability distribution for $X\sim \mathrm{B}(2,p)$ is:
| $x$ | $0$ | $1$ | $2$ |
|---|---|---|---|
| $\mathrm{P}(X=x)$ | $(1-p)^2$ | $2p(1-p)$ | $p^2$ |

*(a) Is it possible that only (ii) holds, but not (i)? That is, can you design a scenario where the probability of success is not equal for all the trials, even though they are independent?*

We are taking balls from bags of green and red balls. Drawing a green ball is considered a success. On each trial, we draw a ball at random from a *different* bag; each bag has a different proportion of green balls. The trials are therefore independent, but the probability of success is not the same for every trial. We let $X$ be the number of green balls drawn from the $n$ bags, where $n$ is fixed.

As an extreme case, let us suppose that we have just two bags ($n=2$), the first bag being all red and the second being all green. Then we will always draw exactly one green ball, so $\mathrm{P}(X=0)=\mathrm{P}(X=2)=0$. This does not match the binomial distribution in the table above no matter what $p$ is.
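This extreme case is easy to check by direct simulation. Here is a minimal Python sketch (the function name and set-up are our own, chosen for illustration):

```python
import random
from collections import Counter

def draw_from_bags(green_proportions):
    """Draw one ball at random from each bag; bag i yields green
    with probability green_proportions[i]. Returns X, the number
    of green balls drawn."""
    return sum(random.random() < p for p in green_proportions)

# Extreme case: first bag all red (proportion 0), second all green (proportion 1).
counts = Counter(draw_from_bags([0.0, 1.0]) for _ in range(10_000))
print(counts)  # Counter({1: 10000}): we always draw exactly one green ball
```

Since the first bag can never produce a green ball and the second always does, every run gives $X=1$, matching $\mathrm{P}(X=0)=\mathrm{P}(X=2)=0$.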

A non-example, however, is sampling from the *same* bag without replacement. We discuss this in (b) below.

There are less extreme examples that show the same non-binomial behaviour.
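To make one such less extreme case concrete, here is a short Python computation; the green proportions $\tfrac{1}{5}$ and $\tfrac{4}{5}$ are our own illustrative choice, not from the text:

```python
from fractions import Fraction

# Illustrative: two bags with green proportions 1/5 and 4/5.
p1, p2 = Fraction(1, 5), Fraction(4, 5)

# Independent trials with unequal success probabilities:
P0 = (1 - p1) * (1 - p2)              # no green balls
P1 = p1 * (1 - p2) + (1 - p1) * p2    # exactly one green ball
P2 = p1 * p2                          # two green balls

# A binomial B(2, p) with the same mean would use p = (p1 + p2) / 2 = 1/2:
p = (p1 + p2) / 2
B0, B1, B2 = (1 - p)**2, 2 * p * (1 - p), p**2

print(P0, P1, P2)   # 4/25 17/25 4/25
print(B0, B1, B2)   # 1/4 1/2 1/4
```

No choice of $p$ makes a binomial match this distribution: $\mathrm{P}(X=0)=\mathrm{P}(X=2)$ would force $p=\tfrac{1}{2}$, but then $\mathrm{P}(X=1)$ would be $\tfrac{1}{2}$, not $\tfrac{17}{25}$.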

*(b) Is it possible that only (i) holds, but not (ii)? That is, can you design a scenario where the trials are not independent, even though each trial has equal probability of success?*

This is quite subtle. Let us imagine that we are counting the number of heads ($X$) appearing on flips of coins, where a head is considered success and a tail is considered failure. We stick two coins the same way up onto a ruler and toss the ruler. Then the probability of obtaining a head on either coin is equal (and neither 0 nor 1), as they either both land heads or both land tails. But the results for the two coins are not independent: once we know how the first one landed, we are certain about the result from the second one. So $\mathrm{P}(X=1)=0$, even though $\mathrm{P}(X=0)$ and $\mathrm{P}(X=2)$ are both non-zero. Hence $X$ does not have a binomial distribution.
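A quick simulation sketch (the function name is ours) confirms the behaviour of the stuck coins:

```python
import random
from collections import Counter

def stuck_coins():
    """Two coins stuck the same way up on a ruler: a single toss
    decides both, so they land HH or TT together. Returns X,
    the number of heads."""
    return 2 if random.random() < 0.5 else 0

dist = Counter(stuck_coins() for _ in range(100_000))
# Each coin individually lands heads about half the time,
# yet X = 1 never occurs, so X cannot be binomial.
print(dist[0], dist[1], dist[2])
```

The marginal probability of heads on each coin is $\tfrac{1}{2}$, but the joint behaviour rules out $X=1$ entirely.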

A more familiar context - though not a perfect example - is asking for the number of sunny days in a certain town during the month of May. The probability of any particular day in May being sunny is approximately the same. However, if we know that 9th May, say, was sunny, then it is more likely that 10th May will also be sunny. Therefore the probability of any given day being sunny is the same as the probability of any other given day being sunny, but the sunniness of different days is not independent.

Note that the probabilities here are only equal (or in the second case, approximately equal) when we are asking for the probabilities before the experiment has started. Once the experiment has started, we have more information, and so the probabilities of future trials will change.

A more subtle example of the same phenomenon occurs with drawing balls from a bag without replacement. Let us consider the case of a bag with 2 green and 2 red balls initially. We draw two balls, and count a green ball as a success. $X$ is the total number of green balls drawn. We can calculate the probabilities by multiplying along the branches of a tree diagram.

So the probabilities are:

$$\begin{align*}
\mathrm{P}(\text{GG}) &= \tfrac{1}{2}\times \tfrac{1}{3} = \tfrac{1}{6} \\
\mathrm{P}(\text{GR}) &= \tfrac{1}{2}\times \tfrac{2}{3} = \tfrac{2}{6} \\
\mathrm{P}(\text{RG}) &= \tfrac{1}{2}\times \tfrac{2}{3} = \tfrac{2}{6} \\
\mathrm{P}(\text{RR}) &= \tfrac{1}{2}\times \tfrac{1}{3} = \tfrac{1}{6} \\
\mathrm{P}(\text{first ball green}) &= \tfrac{1}{6} + \tfrac{2}{6} = \tfrac{3}{6} \\
\mathrm{P}(\text{second ball green}) &= \tfrac{1}{6} + \tfrac{2}{6} = \tfrac{3}{6}
\end{align*}$$

Therefore the first ball and second ball each have a probability of $\frac{1}{2}$ of being green. However, the probability of the second ball being green given that the first ball is green is only $\frac{1}{3}$, so the trials are not independent. The probability distribution of $X$ is also not binomial. (Why the trials have equal probabilities of green is an interesting question and worth pondering.)
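These hand calculations are easy to verify exhaustively. The following Python sketch (names ours) enumerates all equally likely ordered draws of two balls from the bag:

```python
from fractions import Fraction
from itertools import permutations

# Bag with 2 green (G) and 2 red (R) balls; draw two without replacement.
balls = ["G", "G", "R", "R"]
orderings = list(permutations(balls, 2))  # all 12 equally likely ordered draws

def prob(event):
    """Exact probability of an event over the equally likely orderings."""
    hits = sum(1 for o in orderings if event(o))
    return Fraction(hits, len(orderings))

p_first_green  = prob(lambda o: o[0] == "G")
p_second_green = prob(lambda o: o[1] == "G")
p_both_green   = prob(lambda o: o == ("G", "G"))

print(p_first_green, p_second_green)   # 1/2 1/2
print(p_both_green)                    # 1/6
print(p_first_green * p_second_green)  # 1/4, not 1/6: the draws are dependent
```

Each marginal probability is $\tfrac{1}{2}$, yet $\mathrm{P}(\text{GG})=\tfrac{1}{6}\neq \tfrac{1}{2}\times\tfrac{1}{2}$, which is exactly the failure of independence described above.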

#### A final note

Another way of describing the two conditions (i) and (ii) is with the single condition: "the probability of any trial being successful is the same, regardless of what happens on any other trial". The phrase "regardless of what happens on any other trial" is equivalent to saying that the trials are independent.
