Who is cheating?

A new test is in development to try to identify athletes who use a certain banned substance to enhance their performance.

The development team has accumulated data which indicate that, although the test is good at detecting an athlete who has used this substance, the number of false positives is a concern.

What is the probability that an athlete who is not taking the substance tests positive?
What is the probability that an athlete who is taking the substance tests negative?

Why do this problem?

This problem models the interpretation of statistics when testing for, for example, cancer, HIV, pregnancy or DNA at a crime scene, and in many other similar situations, including drug testing.

Students often find theoretical approaches difficult in the early stages of learning about probability.  The approach used in this problem will help to structure their understanding of the questions that can be answered from tree diagrams and 2-way tables, and will lead them to the multiplication rule.  In addition, they will become more aware of the need for care in deciding which data subset provides the figure for the denominator of a probability.

More advanced students can use it to establish the difference between P(A|B) and P(B|A), and to be introduced to Bayes' Theorem.

Possible approach

Put students into groups of 3 or 4, and have each group collect the equipment they need.  We suggest you don't hand out any worksheets until students have collected their data (pairs of cubes) and had an initial discussion of the results, both in their groups and in the class as a whole.

Take the students through the scenario for one athlete - more as necessary.  Then get them to collect data - pairs of multi-link cubes - for 36 athletes.

When everyone has collected their data, get all groups to record their data on the worksheet, and then put their pairs of cubes on a large tree diagram or 2-way table.

This Sample Space worksheet will help students to identify correctly all the possible outcomes, and to see how many give the required result.

Key questions

Are you surprised by the results?

What do you think about the test - how effective is it?

What can you say about the proportion of false positives (that is, athletes who test positive even though they have not taken the banned substance)?

Given that an athlete is not taking the banned substance, what is the probability that they test positive?

What can you say about the proportion of false negatives (that is, athletes who test negative even though they have taken the banned substance)?

Given that an athlete is taking the banned substance, what is the probability that the test is negative?  (What figure should go in the denominator here?)

Possible further approach

Display a large tree diagram, with the expected results for 36 trials, and a second blank tree diagram.

On the tree diagram with the expected results, ask students what proportion of the 36 trials would result in each outcome, and show these figures at the right-hand ends of the branches.

Then ask students what proportion of the 36 athletes we would expect to be taking the banned substance.  Display this on the first set of branches of the second tree diagram.
For the second sets of branches, focus on the proportion of the 24 or 12 athletes who would test positive and negative.  Again display the proportions on the branches of the tree diagram.
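For checking the figures on the first tree diagram, the expected counts can be worked out as in the short sketch below.  It uses only the numbers quoted in these notes (36 athletes, 12 taking the substance, one eighth of the other 24 testing positive); the variable names are just illustrative, and the branches for the 12 athletes who are taking the substance are left out because their split is not stated here.

```python
from fractions import Fraction

total = 36
p_taking = Fraction(12, 36)                    # 12 of the 36 athletes
p_positive_given_not_taking = Fraction(1, 8)   # one eighth of the 24

# First set of branches: taking / not taking
expected_taking = total * p_taking
expected_not_taking = total - expected_taking

# Second set of branches, for the athletes who are not taking the substance
expected_false_positive = expected_not_taking * p_positive_given_not_taking
expected_true_negative = expected_not_taking - expected_false_positive

print(expected_taking, expected_not_taking)             # 12 24
print(expected_false_positive, expected_true_negative)  # 3 21
```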

This worksheet provides students with a structure to work through this analysis.

The key to establishing the multiplication rule is to focus students' attention on the process that is going on in tree diagrams.  For instance:

'What proportion of the 36 athletes are not taking the banned substance?'
'24 out of 36, or 2 out of 3'
'Of that 24, what proportion test positive?'
'One eighth'
'What's an eighth of 24?'
'3'
'So we calculated 2 out of 3 to start with, then 1 out of 8.  Can we simplify that?'
'It's the same as taking 2/3, then 1/8'
'And that's the same as 2/3 x 1/8'
'So the answer is 2/24 or 1/12 - and we know that we expect 3 out of 36 athletes to be not taking the banned substance but to test positive, and 3 out of 36 is the same as 1 out of 12'

This may need to be gone through several times, using different language, for each set of branches.  But once students understand what is going on, they will then also understand the rule that we multiply along branches of a tree diagram.
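The arithmetic in the dialogue above can be checked with exact fractions.  This is only a small sketch with illustrative variable names: it multiplies along the two branches and then scales up to the 36 athletes.

```python
from fractions import Fraction

total_athletes = 36
p_not_taking = Fraction(24, 36)                # first branch: 24 out of 36
p_positive_given_not_taking = Fraction(1, 8)   # second branch: one eighth of the 24

# Multiply along the branches
p_not_taking_and_positive = p_not_taking * p_positive_given_not_taking
expected_count = p_not_taking_and_positive * total_athletes

print(p_not_taking)               # 2/3
print(p_not_taking_and_positive)  # 1/12
print(expected_count)             # 3
```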

It would also be good to reinforce that each complete path along the branches of a tree diagram gives a single outcome, and that together the paths give all the possible outcomes, which are mutually exclusive.

Once students have grasped how the multiplication rule works with the tree diagram, they could use it to show that the probability of obtaining three heads when tossing a coin three times, and the probability of a total of 4 when throwing two dice, can also be established this way.
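One way to check the coin and dice results is to list the whole sample space and compare it with the product along the branches.  The sketch below assumes a fair coin tossed three times and two fair six-sided dice; the helper `probability` is a hypothetical name introduced here for illustration.

```python
from fractions import Fraction
from itertools import product

def probability(outcomes, event):
    """Fraction of equally likely outcomes for which event(outcome) is true."""
    outcomes = list(outcomes)
    favourable = sum(1 for outcome in outcomes if event(outcome))
    return Fraction(favourable, len(outcomes))

# Assumed: a fair coin tossed three times (8 equally likely outcomes)
p_three_heads = probability(product("HT", repeat=3),
                            lambda tosses: tosses == ("H", "H", "H"))

# Assumed: two fair six-sided dice (36 equally likely outcomes)
p_total_four = probability(product(range(1, 7), repeat=2),
                           lambda dice: sum(dice) == 4)

# Multiplying along the branches gives the same answers:
# one path for HHH, and three paths (1,3), (2,2), (3,1) for a total of 4.
by_rule_three_heads = Fraction(1, 2) * Fraction(1, 2) * Fraction(1, 2)
by_rule_total_four = 3 * Fraction(1, 6) * Fraction(1, 6)

print(p_three_heads, by_rule_three_heads)  # 1/8 1/8
print(p_total_four, by_rule_total_four)    # 1/12 1/12
```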

Possible extension

Extension questions are given in the problem and at the end of the worksheet.  These give students an opportunity to find probabilities of the form P(A|B) and P(B|A), and to clarify the difference between them.

The extension questions in the problem concern the same set of 3 people who are not taking the banned substance, but nevertheless test positive.
In each question, the numerator is therefore 3.
The denominator in each case is different, however, since we are considering these 3 athletes as a proportion of a different set of athletes in each case.
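The effect of changing the denominator can be illustrated as follows.  The counts 3, 24 and 36 come from these notes; which sets the extension questions actually use is an assumption here, and the number of athletes who are taking the substance and test positive is not stated in these notes, so the value 9 below is a purely hypothetical figure used only to show the calculation.

```python
from fractions import Fraction

false_positives = 3   # not taking the substance, but test positive
not_taking = 24       # all athletes not taking the substance
all_athletes = 36

def p_not_taking_given_positive(true_positives):
    """P(not taking | positive): the 3 as a share of everyone who tests positive."""
    return Fraction(false_positives, false_positives + true_positives)

# Same numerator, different denominators
p_positive_given_not_taking = Fraction(false_positives, not_taking)
p_not_taking_and_positive = Fraction(false_positives, all_athletes)

# HYPOTHETICAL value, not taken from the problem: 9 true positives
hypothetical_true_positives = 9
example = p_not_taking_given_positive(hypothetical_true_positives)

print(p_positive_given_not_taking)  # 1/8
print(p_not_taking_and_positive)    # 1/12
print(example)                      # 1/4 (with the hypothetical figure only)
```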

Critique the model.  What assumptions does it make?  How could you improve it, to make it more realistic?

There are more resources, plus video clips from an expert, in the Maths and Our Health pack - The test is positive: But what are the odds it's wrong?

Possible support

All students should be able to carry out the experiment, once they have understood the scenario and done one or more initial trials all together.  The Sample Space worksheet will help those who find it difficult to calculate the probabilities from the experiment.