Binomial or not?
Are these scenarios described by the binomial distribution?
Problem
Which of these situations could be modelled using the binomial distribution?
For those which can be modelled by the binomial distribution $\mathrm{B}(n,p)$, what are "success" and "failure", and what are $n$ and $p$?
For those which are not, why not? Is there something small you could change that would make the binomial distribution appropriate? Is the binomial distribution at least a good approximation in this situation?
This resource is part of the collection Statistics - Maths of Real Life
- A bag contains red, green and blue balls. 10 balls are taken at random, one at a time. Each ball's colour is recorded, and then returned to the bag.
- A coin is flipped until a tail is obtained. The total number of flips needed is recorded.
- The number of rainy days in April in the village of Springfield is recorded.
- The children in a class each do the same mathematics test. The number who score above 80% is recorded.
- Five fair coins are stuck to a piece of clear plastic. The plastic is flipped in the air, and the number of heads showing when the plastic lands is recorded.
- A person plays a lottery every week. They record the number of times they win a prize during one year.
- A cancer drug is being tested. 1000 patients are given the drug, and the number of patients who die within five years is recorded.
- A basketball player is practising taking shots. The number of successful shots out of 10 attempts is recorded.
- A bag contains red and blue balls. 10 balls are taken at random, one at a time. Each ball's colour is recorded, and then returned to the bag.
- A box of pens contains working pens and broken pens. 10 pens are taken together from the box at random, and the number of working pens is recorded.
- A bag contains red, green and blue balls. 10 balls are taken from the bag at random one at a time, and replaced immediately. The number of green balls taken is recorded.
- A farmer is planting a crop. On average, a certain percentage of the seeds grow to maturity. The number of seeds that grow to maturity in this field in this year is recorded.
Getting Started
You might find it helpful to write down the conditions for the binomial distribution before you start deciding whether each situation can be described by the binomial distribution or not.
For each situation, can you match the binomial distribution conditions up to the situation?
If you can, it's binomial; if not, it almost certainly isn't. Which condition or conditions fail?
One thing to watch out for: "success" doesn't necessarily mean something "good"; "success" is used only to describe the thing we are counting.
Student Solutions
(1) A bag contains red, green and blue balls. 10 balls are taken at random, one at a time. Each ball's colour is recorded, and then returned to the bag.
There is no random variable involved, so the binomial distribution is not relevant. If we were to count the number of, say, red balls taken, then the total number of red balls would follow a binomial distribution.
(2) A coin is flipped until a tail is obtained. The total number of flips needed is recorded.
There is not a fixed number of trials, so this is not described by a binomial distribution. There is no easy way to get a binomial distribution out of this without changing the setting dramatically.
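To see why there is no fixed number of trials, here is a minimal sketch in Python. It assumes a fair coin and independent flips (the situation does not say the coin is fair), and the function name `prob_flips_needed` is just an illustrative choice. It works out the probability that exactly $k$ flips are needed.

```python
from fractions import Fraction

def prob_flips_needed(k, p_tail=Fraction(1, 2)):
    """Probability that the first tail appears on flip k.

    Assumes independent flips; p_tail = 1/2 is an assumption (a fair coin).
    """
    # The first k - 1 flips must all be heads, and flip k must be a tail.
    return (1 - p_tail) ** (k - 1) * p_tail

# Every positive whole number of flips has a non-zero probability,
# so there is no largest possible value to play the role of n.
for k in range(1, 6):
    print(k, prob_flips_needed(k))
```

Under these assumptions the probabilities halve each time and never reach zero, whereas a random variable following $\mathrm{B}(n,p)$ can never be larger than $n$.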
(3) The number of rainy days in April in the village of Springfield is recorded.
We take each day to be a trial, with "success" meaning "rainy". Because of the way the weather works, we do not have independence: if one day is rainy, there is a higher probability that the next day will also be rainy than if it is sunny. So the number of rainy days is not described by a binomial distribution, as the trials are not independent.
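One way to explore the effect of this dependence is the sketch below. All of the probabilities in it are made up for illustration: it assumes a rainy day follows a rainy day with probability 0.7 and follows a dry day with probability 0.3, which gives each day an overall probability of 0.5 of being rainy. It computes the exact distribution of the number of rainy days in 30 days and compares its variance with that of $\mathrm{B}(30, 0.5)$.

```python
def rainy_days_distribution(n_days, p_after_rain, p_after_dry, p_first):
    """Exact distribution of the number of rainy days when each day's
    chance of rain depends on whether the previous day was rainy."""
    # state[(count, rained_today)] = probability of being in that state
    state = {(1, True): p_first, (0, False): 1 - p_first}
    for _ in range(n_days - 1):
        nxt = {}
        for (count, rained), prob in state.items():
            p_rain = p_after_rain if rained else p_after_dry
            nxt[(count + 1, True)] = nxt.get((count + 1, True), 0.0) + prob * p_rain
            nxt[(count, False)] = nxt.get((count, False), 0.0) + prob * (1 - p_rain)
        state = nxt
    pmf = [0.0] * (n_days + 1)
    for (count, _), prob in state.items():
        pmf[count] += prob
    return pmf

def mean_and_variance(pmf):
    mean = sum(k * p for k, p in enumerate(pmf))
    var = sum((k - mean) ** 2 * p for k, p in enumerate(pmf))
    return mean, var

# Hypothetical weather model, not data about any real village.
pmf = rainy_days_distribution(30, p_after_rain=0.7, p_after_dry=0.3, p_first=0.5)
mean, var = mean_and_variance(pmf)
binomial_var = 30 * 0.5 * 0.5  # variance of B(30, 0.5)
print(mean, var, binomial_var)
```

With these made-up numbers the mean matches the binomial mean of 15, but the variance is larger: runs of wet and dry days make very wet or very dry months more likely than the binomial model would predict.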
(4) The children in a class each do the same mathematics test. The number who score above 80% is recorded.
We take each child's test to be a trial, with "success" meaning "scoring above 80%". The probability of child 1 scoring above 80% is not the same as the probability of child 2 doing so, so the probability of success is not the same for each trial. Therefore the number who score above 80% is not described by a binomial distribution.
(5) Five fair coins are stuck to a piece of clear plastic. The plastic is flipped in the air, and the number of heads showing when the plastic lands is recorded.
A trial is the state of a coin after the plastic is flipped, with "success" meaning it lands on heads. Let's say that the probability of the plastic landing one way up is $p$ and the other way up is $q=1-p$. So for each coin, the probability of landing heads is either $p$ or $q$, depending on which way round it is stuck to the plastic. But these are not independent: either the plastic lands one way up or the other, so either $k$ coins show heads or $5-k$ do, where $k$ is the number of heads showing when the plastic is a particular way up.
In the simplest case, all of the coins are stuck on the plastic the same way up, and now the probability of any given coin landing heads is $p$, so the probability of success for each trial is the same, but they are not independent: there will either be 0 or 5 heads. So this is not a binomial distribution situation.
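The argument above can be checked by listing the possible outcomes. The sketch below assumes the plastic is equally likely to land either way up ($p = \frac{1}{2}$, which the situation does not state), and the function name `stuck_coins_distribution` is hypothetical.

```python
from fractions import Fraction
from math import comb

def stuck_coins_distribution(heads_when_side_a_up, p_side_a=Fraction(1, 2)):
    """Distribution of the number of heads showing.

    heads_when_side_a_up: one True/False per coin, True if that coin shows
    heads when side A of the plastic lands facing up.
    p_side_a = 1/2 is an assumption.
    """
    n = len(heads_when_side_a_up)
    k = sum(heads_when_side_a_up)
    dist = {}
    dist[k] = dist.get(k, 0) + p_side_a            # side A lands up: k heads
    dist[n - k] = dist.get(n - k, 0) + (1 - p_side_a)  # side B lands up: n - k heads
    return dist

# All five coins stuck the same way up: only 0 or 5 heads can occur.
print(stuck_coins_distribution([True] * 5))

# For comparison, B(5, 1/2) gives every value from 0 to 5 a non-zero probability.
binomial = [Fraction(comb(5, j), 2 ** 5) for j in range(6)]
print(binomial)
```

However the coins are stuck on, only two values of the count are possible, while $\mathrm{B}(5, \frac{1}{2})$ spreads its probability over all six values.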
(6) A person plays a lottery every week. They record the number of times they win a prize during one year.
This will depend on the rules of the lottery, the number of "tickets" bought and so on. Let's say we're using the UK National Lottery, where players buy a "ticket" by choosing 6 different numbers from 1 to 59 (and paying for it!). Six numbered balls are then drawn at random for the draw. The ticket wins a prize if it matches at least three of the balls drawn. Therefore the probability of any given ticket winning a prize is always the same, and every draw is independent of all others.
So if the person buys one ticket each week, the number of wins during the year will be described by a binomial distribution (as there are a fixed number of draws).
If they buy more than one ticket each week, though, the total number of prizes won is no longer binomial, as the events "winning a prize with this week's first ticket" and "winning a prize with this week's second ticket" are not independent. For example, if the numbers on the two tickets have no overlap and the first ticket matches four or more of the balls drawn, then it is impossible to win on the second ticket, which can match at most two. However, if they enter with the same set of tickets each week, and they record the number of draws in which they win a prize, then this number will be binomially distributed, as the probability of winning a prize in a draw is constant and the draws are independent.
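For the one-ticket-per-week case, the value of $p$ can be worked out by counting. The sketch below uses the rules as described above (6 numbers chosen from 59, a prize for matching at least three) and assumes exactly 52 draws are entered in the year; the function name `p_match_at_least` is just for illustration.

```python
from fractions import Fraction
from math import comb

def p_match_at_least(threshold, picks=6, pool=59):
    """Probability that a ticket matches at least `threshold` of the balls drawn."""
    total = comb(pool, picks)
    # Choose k of the ticket's numbers and picks - k of the other numbers.
    favourable = sum(
        comb(picks, k) * comb(pool - picks, picks - k)
        for k in range(threshold, picks + 1)
    )
    return Fraction(favourable, total)

p = p_match_at_least(3)
n = 52  # assumption: one ticket in each of 52 weekly draws

expected_wins = n * p         # mean of B(n, p)
p_no_wins = (1 - p) ** n      # probability of winning nothing all year
print(float(p), float(expected_wins), float(p_no_wins))
```

Under these assumptions $p$ is a little over 1%, so the number of wins in the year would be modelled by $\mathrm{B}(52, p)$ with that value of $p$.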
(7) A cancer drug is being tested. 1000 patients are given the drug, and the number of patients who die within five years is recorded.
Here, a trial is a patient being tested, and "success" means "dying within five years". ("Success" does not have to mean something good!) But we have no idea whether each person's probability of success is the same (it could depend on so many other factors), or whether the trials are truly independent (though we would hope that they are), so it seems that it could not be described by a binomial distribution. However, another way of thinking about this is that probabilities are a description of what we believe about the world. It may be, say, that over a very large number of trials, we discover that only 40% of patients die within five years, so in the absence of any better information, we would say that each patient has a probability of 40% of "success". And if we assume that we have both independence and a random sample of patients, then the number of patients who die within five years will, indeed, be described by a binomial distribution.
(8) A basketball player is practising taking shots. The number of successful shots out of 10 attempts is recorded.
The key question here is whether the 10 attempts are independent and have equal probability of success. Perhaps, for example, the player will be frustrated if they miss a shot and so play the next one worse or better. Or perhaps they will be a bit more tired by the time they get to the 10th shot, so their probability of success is less. If the shots are all independent with equal probability of success on each, then the number of successful shots will follow a binomial distribution, otherwise it will not.
(9) A bag contains red and blue balls. 10 balls are taken at random, one at a time. Each ball's colour is recorded, and then returned to the bag.
As with (1), there is no random variable here, even though there are only two colours of balls.
(10) A box of pens contains working pens and broken pens. 10 pens are taken together from the box at random, and the number of working pens is recorded.
The probability of the 7th pen, say, working is the same as the probability of the first pen working: it is the number of working pens divided by the total number of pens. However, the trials are not independent: if we know that the first pen is working, then the probability of the second pen working is reduced. As an extreme example, if there is only one working pen in the box, then there can only be 0 or 1 working pens in the group of 10 pens.
So the binomial distribution is not suitable in this situation.
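It is still worth asking whether the binomial distribution is a good approximation here. The sketch below compares the exact probabilities (found by counting the ways of choosing the pens) with the binomial probabilities for two made-up boxes, one with 20 pens and one with 2000 pens, both with 60% of the pens working. The box sizes and the helper name `max_gap` are assumptions for illustration only.

```python
from fractions import Fraction
from math import comb

def exact_pmf(k, total, working, drawn):
    """P(exactly k working pens) when `drawn` pens are taken together."""
    ways = comb(working, k) * comb(total - working, drawn - k)
    return Fraction(ways, comb(total, drawn))

def binomial_pmf(k, n, p):
    return comb(n, k) * p**k * (1 - p) ** (n - k)

def max_gap(total, working, drawn=10):
    """Largest difference between the exact and binomial probabilities."""
    p = Fraction(working, total)
    return max(
        abs(exact_pmf(k, total, working, drawn) - binomial_pmf(k, drawn, p))
        for k in range(drawn + 1)
    )

# Hypothetical boxes: same proportion of working pens, different sizes.
print(float(max_gap(20, 12)))      # small box
print(float(max_gap(2000, 1200)))  # large box
```

With the small box, for instance, it is impossible to pick 0 working pens (there are only 8 broken ones), although the binomial model gives this a non-zero probability. With the large box, taking 10 pens barely changes the proportions, so the two sets of probabilities are much closer.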
(11) A bag contains red, green and blue balls. 10 balls are taken from the bag at random one at a time, and replaced immediately. The number of green balls taken is recorded.
This is a binomial distribution situation: "success" is taking a green ball, while "failure" is taking a red or blue ball. The replacement means that the trials are independent, and they each have the same probability of success. So the number of green balls is described by a binomial distribution.
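As a concrete illustration, the sketch below assumes a bag of 4 red, 3 green and 3 blue balls (these numbers are made up; the situation does not give them), so that $n = 10$ and $p = \frac{3}{10}$, and computes the probability of each possible number of green balls.

```python
from fractions import Fraction
from math import comb

def binomial_pmf(k, n, p):
    """P(X = k) for X ~ B(n, p)."""
    return comb(n, k) * p**k * (1 - p) ** (n - k)

# Hypothetical contents of the bag.
red, green, blue = 4, 3, 3

n = 10                                   # number of balls taken (trials)
p = Fraction(green, red + green + blue)  # probability of "success" (green)

pmf = [binomial_pmf(k, n, p) for k in range(n + 1)]
mean = sum(k * q for k, q in enumerate(pmf))

for k, q in enumerate(pmf):
    print(k, float(q))
print("mean:", mean)  # equals n * p
```

The probabilities add up to 1 and the mean equals $np$, as expected for a binomial distribution.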
(12) A farmer is planting a crop. On average, a certain percentage of the seeds grow to maturity. The number of seeds that grow to maturity in this field in this year is recorded.
Are the seeds independent? It is unlikely to be the case: it may be that there is some sort of drought or a flood, or some sort of disease affecting the crop one year. In such a case, the whole crop will be affected, not just individual seeds. So if one seed fails to grow to maturity, the probability of other seeds failing to grow is higher. So the number of seeds that grow to maturity cannot be described by a binomial distribution.
Teachers' Resources
Why do this problem?
Beyond learning to apply the binomial distribution formula, it is important for students to recognise when the binomial distribution is an appropriate model and when it is not. The twelve situations offered cover a variety of misconceptions and misunderstandings with regard to the binomial distribution, as well as including some "real life" situations which could reasonably be modelled using the binomial distribution, even if the model is not exact. This aims to help students delineate the boundaries between examples and non-examples of binomial distribution situations, and thereby to improve their mental map of this topic. It also forms a nice complement to Binomial Conditions, which asks why each of the conditions needed for a situation to be described by a binomial distribution is necessary: suitable examples for that problem can be found among these twelve situations.
Possible approach
This printable worksheet may be useful, and it could be cut up to make a card sort: Binomial or not.pdf
This could be run as a whole-class activity. Show the situations one at a time to the students, and ask them to consider whether it is or is not described by the binomial distribution. They could then vote on their ideas, and one person from each side could then justify the reasoning for their choice. The activity has some prompting questions at the start of the problem page, and these could then be used to probe further.
An alternative approach is to cut out the sheets and to use this as a sorting activity. This has the advantage that a few of the situations are fairly similar, but the small differences are very significant. This could help students improve their ability to discriminate between binomial and non-binomial situations. After they have sorted the cards into binomial and non-binomial groups, they could then be asked the prompting questions from the problem page. Students could also be invited to create their own similar situations and challenge each other to sort them in the same way.
Key questions
In what situations is the binomial distribution a suitable model?
What can cause the binomial distribution to fail to be a suitable model?
Possible extension
In what situations is the binomial distribution a good approximation?
Once students are confident at deciding when a situation can be modelled with a binomial distribution, they could work on Binomial Conditions.
Possible support
Write down the conditions for the binomial distribution. Can you match the conditions up to the situation? If so, it's binomial; if not, it almost certainly isn't. Which condition or conditions fail?