# Powerful Hypothesis Testing

Age 16 to 18

Here are some comments on the questions in the problem (but not full solutions):

*What is the probability of $H_0$ being rejected?*

*Do your answers change if the true proportion of greens in the bag changes?*

*What would happen if you changed the hypothesised proportion $\pi$?*

*What would happen if you changed the significance level of the test from 5% to 10% or 1%?*

The probability of rejecting $H_0$ depends on the proportion in $H_0$, the true proportion, the number of trials and the significance level. We can get evidence from the simulation, or we can work theoretically. In general, we would expect that the greater the difference between $\pi$ in $H_0$ and the true proportion, the greater the probability of $H_0$ being rejected (the null hypothesis is "more wrong"). When $H_0$ is false, the greater the number of trials, the greater the probability of rejection (the sample proportion will be more likely to be close to the true proportion). Finally, as the significance level is raised (say from 5% to 10%), the probability of $H_0$ being rejected will also increase (as we are enlarging the critical region, and so reducing the range of acceptance).

The probability of rejecting $H_0$ in this problem can be calculated as follows. Let the hypothesised proportion be $\pi_0$ and the true proportion be $\pi_1$. Let $X$ be the number of greens observed after $n$ trials. Under the null hypothesis with significance level $\alpha$ (so typically $\alpha=0.05$), $X\sim \mathrm{B}(n,\pi_0)$, and the null hypothesis will be rejected if $X$ lies in the critical region, which is $X<x_1$ or $X>x_2$, where $x_1$ is the largest integer for which $\mathrm{P}(X<x_1|H_0)\le \alpha/2$ and $x_2$ is the smallest integer for which $\mathrm{P}(X>x_2|H_0)\le \alpha/2$. We can then calculate these probabilities given that $H_1$ is true, so that $X\sim \mathrm{B}(n,\pi_1)$, and deduce that the probability of $H_0$ being rejected is $\mathrm{P}(X<x_1 \text{ or } X>x_2|H_1)$. These calculations can be easily performed by computer.
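As one possible way of doing this by computer, here is a minimal Python sketch. It assumes a two-tailed test in which $H_0$ is rejected when $X<x_1$ or $X>x_2$, with each tail probability at most $\alpha/2$ under $H_0$. The function names (`binom_pmf`, `critical_values`, `rejection_probability`) and the example values $n=50$, $\pi_0=0.5$ are our own choices for illustration, not part of the problem.

```python
from math import comb


def binom_pmf(k, n, p):
    """P(X = k) for X ~ B(n, p)."""
    return comb(n, k) * p**k * (1 - p) ** (n - k)


def critical_values(n, pi0, alpha=0.05):
    """Return (x1, x2) such that H0 is rejected when X < x1 or X > x2."""
    # x1: largest integer with P(X < x1 | H0) <= alpha/2
    x1, lower_tail = 0, 0.0
    while x1 <= n and lower_tail + binom_pmf(x1, n, pi0) <= alpha / 2:
        lower_tail += binom_pmf(x1, n, pi0)
        x1 += 1
    # x2: smallest integer with P(X > x2 | H0) <= alpha/2
    x2, upper_tail = n, 0.0
    while x2 >= 0 and upper_tail + binom_pmf(x2, n, pi0) <= alpha / 2:
        upper_tail += binom_pmf(x2, n, pi0)
        x2 -= 1
    return x1, x2


def rejection_probability(n, pi0, pi1, alpha=0.05):
    """P(X < x1 or X > x2) when the true proportion is pi1."""
    x1, x2 = critical_values(n, pi0, alpha)
    return sum(binom_pmf(k, n, pi1) for k in range(n + 1) if k < x1 or k > x2)


if __name__ == "__main__":
    # Illustrative values only: 50 trials, hypothesised proportion 0.5.
    n, pi0 = 50, 0.5
    print("critical values:", critical_values(n, pi0))
    for pi1 in (0.5, 0.6, 0.7, 0.8):
        print(pi1, round(rejection_probability(n, pi0, pi1), 4))
```

Running this for several values of `pi1`, `n` and `alpha` is a quick way to check the general expectations described above. Note that when `pi1` equals `pi0` the result is the actual probability of wrongly rejecting $H_0$, which is at most $\alpha$ but usually not exactly $\alpha$, because $X$ is discrete.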

Note that it is **only** possible to perform this calculation if we **know** the actual proportion. But if we know the actual proportion, why are we doing a hypothesis test?! This makes the power of a test a somewhat difficult idea. We could, though, be more specific, and say that we are testing $H_0\colon \pi=0.5$ against $H_1\colon \pi=0.6$, and ask which of these hypotheses is more likely to be true. This is a different way of performing hypothesis testing, which is dealt with in the article [yet to be written].

*If $H_0$ is rejected, how likely is it that the alternative hypothesis $H_1$ is true?*

A tree diagram will help here: we have two possibilities, either $H_0$ is true or $H_1$ is true. For each of these, $H_0$ will be either accepted or rejected. So, looking at the tree diagram [which would be nice to draw], we have

$$\mathrm{P}(\text{$H_1$ true} | \text{$H_0$ rejected}) = \frac{\mathrm{P}(\text{$H_1$ true} \cap \text{$H_0$ rejected})}{\mathrm{P}(\text{$H_1$ true} \cap \text{$H_0$ rejected})+\mathrm{P}(\text{$H_0$ true} \cap \text{$H_0$ rejected})} = \frac{\mathrm{P}(\text{$H_0$ rejected} | \text{$H_1$ true})\mathrm{P}(\text{$H_1$ true})}{\mathrm{P}(\text{$H_0$ rejected} | \text{$H_1$ true})\mathrm{P}(\text{$H_1$ true})+\mathrm{P}(\text{$H_0$ rejected} | \text{$H_0$ true})\mathrm{P}(\text{$H_0$ true})}.$$

But we don't know most of the probabilities in this calculation! We only know that $\mathrm{P}(\text{$H_0$ rejected} | \text{$H_0$ true})$ is the significance level of the test, which we have chosen. So without some idea of how likely it is that $H_1$ is true, and some idea of the probability of rejecting $H_0$ if $H_1$ is true, we **cannot say how likely it is** that $H_1$ is true **even if we reject** $H_0$**!** Likewise, we cannot say how likely it is that $H_0$ is true if we accept it.
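To see how strongly the answer depends on these unknown quantities, we can evaluate the formula for some made-up values. The sketch below assumes that exactly one of $H_0$ and $H_1$ is true, and takes as inputs a prior probability that $H_1$ is true and the probability of rejecting $H_0$ when $H_1$ is true. Both of these are hypothetical numbers chosen purely for illustration, as is the function name.

```python
def prob_h1_given_rejection(prior_h1, power, alpha=0.05):
    """P(H1 true | H0 rejected), assuming exactly one of H0, H1 is true.

    prior_h1: assumed P(H1 true) before the experiment (hypothetical input)
    power:    assumed P(H0 rejected | H1 true)         (hypothetical input)
    alpha:    P(H0 rejected | H0 true), the significance level
    """
    prior_h0 = 1 - prior_h1
    reject_and_h1 = power * prior_h1
    reject_and_h0 = alpha * prior_h0
    return reject_and_h1 / (reject_and_h1 + reject_and_h0)


if __name__ == "__main__":
    # Same test (power 0.8, alpha 0.05), different prior beliefs about H1.
    for prior in (0.5, 0.1, 0.01):
        print(prior, round(prob_h1_given_rejection(prior, 0.8), 3))
```

With the same test, changing only the assumed prior probability of $H_1$ changes the answer considerably, which is why rejecting $H_0$ does not by itself tell us how likely $H_1$ is.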

*If Robin wants to be 90% certain of rejecting the null hypothesis if it is wrong, how many trials are needed?*

This again depends on the actual proportion of green balls. If, though, Robin assumes what the actual proportion might be, we can then use the above calculations, trying different values of $n$ until we find one that is large enough so that $\mathrm{P}(X<x_1 \text{ or } X>x_2|H_1)>0.9$.
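A trial-and-error search of this kind might be sketched in Python as follows. The sketch assumes a two-tailed test that rejects $H_0$ when $X<x_1$ or $X>x_2$, with each tail probability at most $\alpha/2$ under $H_0$. The function names, the search limit `max_n` and the example proportions are our own assumptions for illustration.

```python
from math import comb


def binom_pmf(k, n, p):
    """P(X = k) for X ~ B(n, p)."""
    return comb(n, k) * p**k * (1 - p) ** (n - k)


def rejection_probability(n, pi0, pi1, alpha=0.05):
    """P(H0 rejected) for a two-tailed test when the true proportion is pi1."""
    # x1: largest integer with P(X < x1 | H0) <= alpha/2
    x1, lower_tail = 0, 0.0
    while x1 <= n and lower_tail + binom_pmf(x1, n, pi0) <= alpha / 2:
        lower_tail += binom_pmf(x1, n, pi0)
        x1 += 1
    # x2: smallest integer with P(X > x2 | H0) <= alpha/2
    x2, upper_tail = n, 0.0
    while x2 >= 0 and upper_tail + binom_pmf(x2, n, pi0) <= alpha / 2:
        upper_tail += binom_pmf(x2, n, pi0)
        x2 -= 1
    return sum(binom_pmf(k, n, pi1) for k in range(n + 1) if k < x1 or k > x2)


def smallest_n(pi0, pi1, target=0.9, alpha=0.05, max_n=1000):
    """First n (scanning upwards from 1) whose rejection probability exceeds target.

    Returns None if no such n is found up to max_n. The limit of 1000 keeps
    comb(n, k) within the range of a Python float in this simple sketch.
    """
    for n in range(1, max_n + 1):
        if rejection_probability(n, pi0, pi1, alpha) > target:
            return n
    return None


if __name__ == "__main__":
    # Illustrative assumption: H0 says 0.5, Robin guesses the truth is 0.6.
    print(smallest_n(0.5, 0.6))
```

Because $X$ only takes whole-number values, the rejection probability does not increase smoothly with $n$: it can dip slightly when $n$ increases by one. The value returned is the first $n$ that exceeds the target, so it is worth checking a few larger values of $n$ as well.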

*Remembering that each trial costs a certain amount, what is the best number of trials to perform? (And what does "best" mean?)*

This is a hard question! It depends on what is most important to Robin. It is a balance between getting the "correct" answer, avoiding the "wrong" answer, the cost of the trials, and the actual proportion assumed under the alternative hypothesis.
