# Binomial or Not?

##### Age 16 to 18
(1) A bag contains red, green and blue balls.  10 balls are taken at random, one at a time.  Each ball's colour is recorded, and then returned to the bag.

There is no random variable involved, so the binomial distribution is not relevant.  If we were to count the number of, say, red balls taken, then that count would follow a binomial distribution, because each ball is returned to the bag before the next one is taken.

(2) A coin is flipped until a tail is obtained.  The total number of flips needed is recorded.

There is not a fixed number of trials, so this is not described by a binomial distribution.  There is no easy way to get a binomial distribution out of this without changing the setting dramatically.

(3) The number of rainy days in April in the village of Springfield is recorded.

We take each day to be a trial, with "success" meaning "rainy".  Because of the way the weather works, we do not have independence: if one day is rainy, there is a higher probability that the next day will also be rainy than if it is sunny.  So the number of rainy days is not described by a binomial distribution, as the trials are not independent.
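To see what this dependence does to the count, here is a minimal sketch under stated assumptions.  It models April as 30 days in which the chance of rain depends only on the previous day's weather; the function name and the probabilities (0.7 after a rainy day, 0.3 after a dry day, 0.5 on the first day) are hypothetical values, not data.  They are chosen so that each day, taken on its own, has probability 0.5 of rain, so only independence fails.

```python
from fractions import Fraction


def rainy_day_count_pmf(days, p_rain_after_rain, p_rain_after_dry, p_rain_day_one):
    """Exact distribution of the number of rainy days when each day's
    weather depends only on the previous day's weather."""
    # state maps (rainy days so far, is the latest day rainy?) to a probability
    state = {(1, True): p_rain_day_one, (0, False): 1 - p_rain_day_one}
    for _ in range(days - 1):
        next_state = {}
        for (count, rainy), prob in state.items():
            p_rain = p_rain_after_rain if rainy else p_rain_after_dry
            for rainy_next, p_next in ((True, p_rain), (False, 1 - p_rain)):
                key = (count + int(rainy_next), rainy_next)
                next_state[key] = next_state.get(key, 0) + prob * p_next
        state = next_state
    pmf = [Fraction(0)] * (days + 1)
    for (count, _), prob in state.items():
        pmf[count] += prob
    return pmf


def mean_and_variance(pmf):
    mean = sum(r * prob for r, prob in enumerate(pmf))
    variance = sum((r - mean) ** 2 * prob for r, prob in enumerate(pmf))
    return mean, variance


# Hypothetical probabilities (assumptions for illustration only).
pmf = rainy_day_count_pmf(30, Fraction(7, 10), Fraction(3, 10), Fraction(1, 2))
mean, variance = mean_and_variance(pmf)

# Variance a binomial model with n = 30 and p = 1/2 would predict.
binomial_variance = 30 * Fraction(1, 2) * Fraction(1, 2)

print(float(mean), float(variance), float(binomial_variance))
```

In this sketch the mean matches the binomial value $np$, but the variance is larger, because rainy days tend to come in runs.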

(4) The children in a class each do the same mathematics test.  The number who score above 80% is recorded.

We take each child's test to be a trial, with "success" meaning "scoring above 80%".  The probability of child 1 scoring above 80% is not the same as the probability of child 2 doing so, so the probability of success is not the same for each trial.  Therefore the number who score above 80% is not described by a binomial distribution.

(5) Five fair coins are stuck to a piece of clear plastic.  The plastic is flipped in the air, and the number of heads showing when the plastic lands is recorded.

A trial is the state of a coin after the plastic is flipped, with "success" meaning it lands on heads.  Let's say that the probability of the plastic landing one way up is $p$ and the other way up is $q=1-p$.  So for each coin, the probability of landing heads is either $p$ or $q$, depending on which way round it is stuck to the plastic.  But these are not independent: either the plastic lands one way up or the other, so either $k$ coins show heads or $5-k$ do, where $k$ is the number of heads showing when the plastic is a particular way up.

In the simplest case, all of the coins are stuck on the plastic the same way up, and now the probability of any given coin landing heads is $p$, so the probability of success for each trial is the same, but they are not independent: there will either be 0 or 5 heads.  So this is not a binomial distribution situation.
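The argument above can be checked with a short sketch.  The function name, the labelling of one face of the plastic as "side A", and the particular arrangement of coins are all assumptions made for illustration.

```python
from fractions import Fraction
from math import comb


def stuck_coins_pmf(heads_on_side_a, p_side_a):
    """Distribution of the number of heads showing.

    heads_on_side_a: one boolean per coin, True if that coin shows heads
    when the plastic lands with (hypothetical) side A facing up.
    p_side_a: probability that the plastic lands with side A up.
    """
    n = len(heads_on_side_a)
    k = sum(heads_on_side_a)
    pmf = [Fraction(0)] * (n + 1)
    pmf[k] += p_side_a          # side A up: k coins show heads
    pmf[n - k] += 1 - p_side_a  # other side up: the remaining n - k show heads
    return pmf


def binomial_pmf(n, p):
    return [comb(n, r) * p**r * (1 - p) ** (n - r) for r in range(n + 1)]


half = Fraction(1, 2)
# Assumed arrangement: three coins heads-up on side A, two tails-up.
stuck = stuck_coins_pmf([True, True, False, True, False], half)
independent = binomial_pmf(5, half)

for r in range(6):
    print(r, stuck[r], independent[r])
```

With a fair flip, every coin has probability $\frac12$ of showing heads, yet only two totals are possible, whereas five independent coins could show anything from 0 to 5 heads.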

(6) A person plays a lottery every week.  They record the number of times they win a prize during one year.

This will depend on the rules of the lottery, the number of "tickets" bought and so on.  Let's say we're using the UK National Lottery, where players buy a "ticket" by choosing 6 different numbers from 1 to 59 (and paying for it!).  Six numbered balls are then drawn at random in each draw.  The ticket wins a prize if it matches at least three of the balls drawn.  Therefore the probability of any given ticket winning a prize is always the same, and every draw is independent of all others.

So if the person buys one ticket each week, the number of wins during the year will be described by a binomial distribution (as there is a fixed number of draws).

If they buy more than one ticket each week, though, the total number of prizes won is no longer binomial, as the events "winning a prize with this week's first ticket" and "winning a prize with this week's second ticket" are not independent.  For example, if the numbers on the two tickets have no overlap, then both tickets can win only if each matches exactly three of the six balls drawn, so winning on one ticket makes winning on the other much less likely.  However, if they enter with the same set of tickets each week, and they record the number of draws in which they win at least one prize, then this number will be binomially distributed, as the probability of winning a prize in a draw is constant and the draws are independent.
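The probabilities in this example can be found by counting draws.  The sketch below uses the simplified rules described above (6 numbers from 59, a prize for at least three matches).  It assumes 52 draws in the year, and the two tickets with no numbers in common are hypothetical.  With those rules, two such tickets can both win only when each matches exactly three of the drawn balls.

```python
from fractions import Fraction
from math import comb

BALLS, PICKED, NEEDED = 59, 6, 3  # simplified rules as described in the text


def draws_with_matches(m):
    """Number of possible draws that match exactly m numbers on one ticket."""
    return comb(PICKED, m) * comb(BALLS - PICKED, PICKED - m)


all_draws = comb(BALLS, PICKED)
p_prize = Fraction(
    sum(draws_with_matches(m) for m in range(NEEDED, PICKED + 1)), all_draws
)

WEEKS = 52  # assumption: one ticket in one draw per week for a year


def wins_in_year_pmf(r):
    """Binomial probability of exactly r winning weeks with one ticket a week."""
    return comb(WEEKS, r) * p_prize**r * (1 - p_prize) ** (WEEKS - r)


# Two hypothetical tickets sharing no numbers: both win only if the draw
# contains exactly three numbers from each ticket.
both_win = Fraction(comb(PICKED, 3) * comb(PICKED, 3), all_draws)
p_second_given_first = both_win / p_prize

print(float(p_prize))               # chance one ticket wins a prize
print(float(wins_in_year_pmf(0)))   # chance of no prize all year
print(float(p_second_given_first))  # chance ticket 2 wins given ticket 1 won
```

The last figure is far smaller than the unconditional chance of a prize, which is what the failure of independence means here.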

(7) A cancer drug is being tested.  1000 patients are given the drug, and the number of patients who die within five years is recorded.

Here, a trial is a patient being given the drug, and "success" means "dying within five years".  ("Success" does not have to mean something good!)  But we have no idea whether each person's probability of success is the same (it could depend on many other factors), or whether the trials are truly independent (though we would hope that they are), so it seems that the number of deaths could not be described by a binomial distribution.  However, another way of thinking about this is that probabilities are a description of what we believe about the world.  It may be, say, that over a very large number of trials, we discover that only 40% of patients die within five years, so in the absence of any better information, we would say that each patient has a probability of 40% of "success".  And if we assume that we have both independence and a random sample of patients, then the number of patients who die within five years will, indeed, be described by a binomial distribution.
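Under those assumptions, and taking the illustrative figure of 40% at face value, the model can be written down explicitly.  Here $X$ is our own notation for the number of patients who die within five years:

$$X \sim \mathrm{B}(1000,\ 0.4), \qquad P(X = r) = \binom{1000}{r}\, 0.4^{r}\, 0.6^{1000-r}, \qquad r = 0, 1, \dots, 1000,$$

$$\mathrm{E}(X) = 1000 \times 0.4 = 400, \qquad \mathrm{Var}(X) = 1000 \times 0.4 \times 0.6 = 240.$$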

(8) A basketball player is practising taking shots.  The number of successful shots out of 10 attempts is recorded.

The key question here is whether the 10 attempts are independent and have equal probability of success.  Perhaps, for example, the player will be frustrated if they miss a shot and so play the next one worse or better.  Or perhaps they will be a bit more tired by the time they get to the 10th shot, so their probability of success is less.  If the shots are all independent with equal probability of success on each, then the number of successful shots will follow a binomial distribution, otherwise it will not.

(9) A bag contains red and blue balls.  10 balls are taken at random, one at a time.  Each ball's colour is recorded, and then returned to the bag.

As with (1), there is no random variable here, even though there are only two colours of balls.

(10) A box of pens contains working pens and broken pens.  10 pens are taken together from the box at random, and the number of working pens is recorded.

The probability of the 7th pen, say, working is the same as the probability of the first pen working: it is the number of working pens divided by the total number of pens.  However, the trials are not independent, because the pens are taken without replacement: if we know that the first pen is working, then the probability of the second pen working is reduced.  As an extreme example, if there is only one working pen in the box, then there can only be 0 or 1 working pens in the group of 10 pens.

So the binomial distribution is not suitable in this situation.
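A small counting sketch shows how far the pen count can be from binomial.  The box sizes (20 pens with 12 working, and 20 pens with 1 working) and the function name are hypothetical, chosen only for illustration.

```python
from fractions import Fraction
from math import comb


def working_pens_pmf(total, working, drawn):
    """P(exactly r working pens) when `drawn` pens are taken together
    from a box of `total` pens, of which `working` work."""
    return {
        r: Fraction(
            comb(working, r) * comb(total - working, drawn - r),
            comb(total, drawn),
        )
        for r in range(drawn + 1)
    }


total, working, drawn = 20, 12, 10  # hypothetical box

p_first = Fraction(working, total)
# Once one working pen has been removed, fewer working pens remain.
p_second_given_first = Fraction(working - 1, total - 1)

pmf = working_pens_pmf(total, working, drawn)
mean = sum(r * prob for r, prob in pmf.items())
variance = sum((r - mean) ** 2 * prob for r, prob in pmf.items())

# Variance a binomial model with the same n and p would predict.
binomial_variance = drawn * p_first * (1 - p_first)

# The extreme case from the text: only one working pen in the box.
one_working = working_pens_pmf(20, 1, 10)

print(p_first, p_second_given_first)
print(mean, variance, binomial_variance)
print({r: prob for r, prob in one_working.items() if prob > 0})
```

In this sketch the mean agrees with $np$, but the variance is smaller than the binomial value $np(1-p)$: taking pens without replacement makes extreme counts less likely.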

(11) A bag contains red, green and blue balls.  10 balls are taken from the bag at random one at a time, and replaced immediately.  The number of green balls taken is recorded.

This is a binomial distribution situation: "success" is taking a green ball, while "failure" is taking a red or blue ball.  The replacement means that the trials are independent, and they each have the same probability of success.  So the number of green balls is described by a binomial distribution.
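As a check on this reasoning, the sketch below builds the distribution of the number of green balls one draw at a time, using only independence and a constant probability of success, and compares it with the binomial formula.  The bag contents (3 red, 4 green, 3 blue) and the function names are assumptions for illustration.

```python
from fractions import Fraction
from math import comb


def binomial_pmf(n, p):
    """Binomial probabilities P(X = r) for r = 0, ..., n."""
    return [comb(n, r) * p**r * (1 - p) ** (n - r) for r in range(n + 1)]


def successes_pmf_trial_by_trial(n, p):
    """Distribution of the number of successes in n independent trials,
    each with the same success probability p, built one trial at a time."""
    pmf = [Fraction(1)]  # before any draw there are certainly 0 successes
    for _ in range(n):
        updated = [Fraction(0)] * (len(pmf) + 1)
        for successes, prob in enumerate(pmf):
            updated[successes] += prob * (1 - p)  # this ball is red or blue
            updated[successes + 1] += prob * p    # this ball is green
        pmf = updated
    return pmf


red, green, blue = 3, 4, 3  # hypothetical bag
p_green = Fraction(green, red + green + blue)

formula = binomial_pmf(10, p_green)
step_by_step = successes_pmf_trial_by_trial(10, p_green)

print(formula == step_by_step)
print(float(formula[4]))  # probability of exactly 4 green balls
```

The two lists agree exactly because the step-by-step construction relies on precisely the two conditions that replacement guarantees.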

(12) A farmer is planting a crop.  On average, a certain percentage of the seeds grow to maturity.  The number of seeds that grow to maturity in this field in this year is recorded.

Are the seeds independent?  It is unlikely to be the case: it may be that there is some sort of drought or a flood, or some sort of disease affecting the crop one year.  In such a case, the whole crop will be affected, not just individual seeds.  So if one seed fails to grow to maturity, the probability of other seeds failing to grow is higher.  So the number of seeds that grow to maturity cannot be described by a binomial distribution.
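One way to picture this is a model in which the whole field shares a "good year" or a "bad year".  The sketch below is only an illustration.  The number of seeds (100), the maturity rates (0.9 in a good year, 0.5 in a bad year) and the chance of a bad year (0.2) are all hypothetical, and within a given year the seeds are assumed to behave independently.

```python
from fractions import Fraction
from math import comb


def binomial_pmf(n, p):
    return [comb(n, r) * p**r * (1 - p) ** (n - r) for r in range(n + 1)]


def mean_and_variance(pmf):
    mean = sum(r * prob for r, prob in enumerate(pmf))
    variance = sum((r - mean) ** 2 * prob for r, prob in enumerate(pmf))
    return mean, variance


# Hypothetical values (assumptions for illustration only).
n_seeds = 100
p_good, p_bad = Fraction(9, 10), Fraction(1, 2)
prob_bad_year = Fraction(1, 5)

# Distribution of the number of mature seeds when the year is unknown.
good_year = binomial_pmf(n_seeds, p_good)
bad_year = binomial_pmf(n_seeds, p_bad)
shared_year = [
    (1 - prob_bad_year) * g + prob_bad_year * b
    for g, b in zip(good_year, bad_year)
]

# Binomial model using the long-run average maturity rate.
p_average = (1 - prob_bad_year) * p_good + prob_bad_year * p_bad
naive = binomial_pmf(n_seeds, p_average)

print([float(x) for x in mean_and_variance(shared_year)])
print([float(x) for x in mean_and_variance(naive)])
```

Both models have the same mean, but the shared-year model has a much larger variance, so a binomial distribution based on the average percentage would badly understate how much the harvest varies.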