Why do this problem?
This problem offers an opportunity to use tree diagrams or two-way tables to analyse a useful survey technique. Some moderately sophisticated reasoning is required to find the proportions involved. The results may also surprise and intrigue students.
Dishonest answers to opinion polls may also have been a factor in some recent pre-election surveys whose predictions differed significantly from the final results: some people may have been embarrassed to admit that they were voting for a particular person, party or position, and so claimed they would be voting for a different one. This could provide an interesting discussion point about the real-life relevance of the techniques discussed in the problem.
You could use the context in the problem or change the context to something else appropriately embarrassing. You could start by explaining: "We're going to be looking at one of the challenges of performing surveys on difficult topics. And to get some understanding of the problem, we'll do an example survey." Then perform the survey for real with your students: ask them the question, ask them to secretly write Y or N on a slip of paper and fold it, and then collect the slips in a box.
Then ask the students to secretly answer the question "Did you tell the truth? Write T for Truth and L for Lie." Again, they write their answer on a slip of paper and fold it, and you then collect the slips.
You can then count and announce the number of Yes and No responses, and also the number of Truth and Lie responses. (An alternative version, closer to many types of real survey, would be to ask them to also write their name on the original response. After collecting the T/L responses, you could then publicly shred the original responses without looking at them, to avoid actual embarrassment.)
There is then a chance to discuss how we could work out the true proportion of people who brush their teeth every day (or whatever your question was) when so many people will lie about it; the problem is likely to be far worse when the question is asked by an interviewer in person.
It is worth saying at the start that you plan to share one specific technique later in the lesson, but that it is far from being the only one, and that statisticians often need to be quite creative to gather useful data. Students are therefore not trying to "guess what's in the teacher's head", but genuinely trying to come up with approaches to tackle this problem.
The students may well come up with some interesting ideas which could be followed up, either in this lesson or in a later one. (These ideas can be tested with questions like: "Would you follow the instructions exactly as given if it were a really sensitive question, such as 'do you ...'?" Collecting anonymous responses to this question, as before, might give the class an idea of the likely effectiveness of the suggested techniques.)
You can then introduce the dice technique, and give the example presented in the problem for students to work on and discuss.
- Does this approach give us a good estimate of the true proportion of people surveyed who brush their teeth every day?
- How likely are interviewees to follow the instructions as given?
- Another possible approach is to ask interviewees "If your birthday is in January or February, AND you brush your teeth every day, then tick YES, otherwise tick no." How effective would this be?
- To get a better understanding of the margin of error inherent in this method, students could write a simulation. (See A Well-stirred Sample for more on margins of error.) How much worse is this method than a case where people are likely to answer honestly?
- What is the optimal implementation of this method? More precisely, let the probability that the interviewee is asked to tell the truth be $\tau$ (for "truth"). If the true proportion of people who pick their nose once a week is $\pi$, what would be the optimal value of $\tau$ to choose? What would be the optimal value of $\tau$ if you don't know the value of $\pi$? And what might we mean by the word "optimal" here? Remember also that if $\tau$ is too close to 0 or 1, then people are more likely to not follow the instructions, so there is a balance to be struck...
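The simulation suggested in the questions above could be sketched along the following lines. This is a minimal sketch under an assumption that is not taken from the problem itself: each interviewee independently tells the truth with probability $\tau$ and otherwise gives the opposite answer, so that $P(\text{YES}) = \tau\pi + (1-\tau)(1-\pi)$. The function names, the sample size and the value $\tau = 5/6$ are hypothetical choices made for illustration only.

```python
import random


def simulate_survey(n, pi, tau, rng):
    """Return the number of YES answers given by n simulated interviewees.

    ASSUMPTION (not from the problem): with probability tau an interviewee
    answers truthfully, and otherwise gives the opposite answer.
    """
    yes = 0
    for _ in range(n):
        truth = rng.random() < pi     # the interviewee's true answer is YES
        honest = rng.random() < tau   # the randomiser says "tell the truth"
        answer = truth if honest else not truth
        yes += answer                 # True counts as 1, False as 0
    return yes


def estimate_pi(yes, n, tau):
    """Invert P(YES) = tau*pi + (1 - tau)*(1 - pi) to estimate pi."""
    if tau == 0.5:
        raise ValueError("tau = 1/2 makes the answers carry no information")
    return (yes / n - (1 - tau)) / (2 * tau - 1)


def spread_of_estimates(n, pi, tau, trials, seed=0):
    """Repeat the survey many times; return (mean, standard deviation) of the estimates."""
    rng = random.Random(seed)
    estimates = [
        estimate_pi(simulate_survey(n, pi, tau, rng), n, tau)
        for _ in range(trials)
    ]
    mean = sum(estimates) / trials
    variance = sum((e - mean) ** 2 for e in estimates) / (trials - 1)
    return mean, variance ** 0.5


if __name__ == "__main__":
    # Hypothetical figures: 200 interviewees, true proportion 0.3.
    for tau in (1.0, 5 / 6):
        mean, sd = spread_of_estimates(n=200, pi=0.3, tau=tau, trials=2000)
        print(f"tau = {tau:.3f}: mean estimate {mean:.3f}, spread {sd:.3f}")
```

Setting `tau = 1.0` models the case where everyone answers honestly, so comparing the two spreads gives one way to see how much precision the method costs. Under the same assumption, the variance of the estimate for a sample of size $n$ is $\frac{\pi(1-\pi)}{n} + \frac{\tau(1-\tau)}{n(2\tau-1)^2}$, which students could compare with the simulated spread when discussing what "optimal" might mean.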
Encourage students to draw a two-way table, a Venn diagram or a tree diagram, in order to obtain all the necessary numbers.
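As an illustration of what such a table might look like, here is a sketch with invented numbers rather than the worked solution to the problem. It assumes that each interviewee tells the truth with probability $\tau$ and otherwise gives the opposite answer, and it takes the hypothetical values of 600 interviewees, $\tau = 5/6$ and a true proportion $\pi = 0.3$. The expected counts would then be:

$$
\begin{array}{l|cc|c}
 & \text{says YES} & \text{says NO} & \text{total} \\ \hline
\text{true answer YES} & 150 & 30 & 180 \\
\text{true answer NO} & 70 & 350 & 420 \\ \hline
\text{total} & 220 & 380 & 600
\end{array}
$$

Only the bottom row is visible to the surveyor. Under the same assumption, a tree diagram gives

$$
P(\text{YES}) = \tau\pi + (1-\tau)(1-\pi)
\quad\Longrightarrow\quad
\pi = \frac{P(\text{YES}) - (1-\tau)}{2\tau - 1} \qquad (\tau \neq \tfrac{1}{2}),
$$

and substituting the observed proportion recovers the true one:

$$
\pi = \frac{\frac{220}{600} - \frac{1}{6}}{\frac{2}{3}} = \frac{\frac{1}{5}}{\frac{2}{3}} = 0.3 .
$$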