### Why do this problem:

This problem was created as a preliminary for students to try before
Data Matching, where it was felt that an introduction to
probability distributions would be useful.

### Possible approach:

The aim of this activity is for students to acquire a feel for
what a probability distribution is: that it is abstract, and that
although sample data usually follows the theoretical distribution
more closely as the sample size grows, it is not compelled to do
so. A large sample that departs noticeably from the distribution is
simply a rare event.

Additionally, at Stage 4 students mostly calculate only
individual probabilities and it is helpful to also draw attention
to the profile of probability across the complete range
of values for our variable of interest.

The 'copy to clipboard' facility collects data. Ask students to
collect ten sets of data (ten sets of 100 throws) and paste each
into Excel.

Keep one data set in which the ten samples stay separate and another
in which they are combined into a single sample, so that the two can
be compared.

Draw the frequency (or relative frequency) distribution for the
first 100 throws, then the first 200, then the first 300, and so
on.

Encourage students to comment on the graph at each stage, and help
them explain what they observe until their grasp of this example of
randomness is secure.
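For teachers who want to preview the accumulating graphs before the lesson, the procedure above can be sketched in Python. This is only an illustration under stated assumptions: it simulates a fair six-sided die with Python's `random` module rather than using the 'distribution maker' and Excel, and the names `throw_sample` and `relative_frequencies` are invented for this sketch.

```python
import random
from collections import Counter

FACES = range(1, 7)  # assumption: the variable is the score on one fair die


def throw_sample(rng, size=100):
    """Simulate one sample of `size` throws (stand-in for 'copy to clipboard')."""
    return [rng.randint(1, 6) for _ in range(size)]


def relative_frequencies(throws):
    """Relative frequency of each face in a list of throws."""
    counts = Counter(throws)
    return {face: counts[face] / len(throws) for face in FACES}


rng = random.Random(2024)  # fixed seed so the run can be repeated
samples = [throw_sample(rng) for _ in range(10)]  # ten separate samples of 100

# Accumulate the samples: first 100 throws, first 200, ..., first 1000.
combined = []
cumulative = {}
for sample in samples:
    combined.extend(sample)
    cumulative[len(combined)] = relative_frequencies(combined)

for n, freqs in cumulative.items():
    row = "  ".join(f"{face}: {freqs[face]:.3f}" for face in FACES)
    print(f"first {n:4d} throws  {row}")
```

Each printed row corresponds to one of the graphs students draw, and each value can be compared with the theoretical 1/6 (about 0.167). Changing the seed gives a different run, which is a useful reminder that the data is not compelled to settle in the same way every time.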

### Key questions:

- What does the probability distribution look like for the
outcomes when you roll one die, and can you explain why?

- Why isn't the graph on the 'distribution maker' a horizontal
line?

After the samples have been collected and accumulated:

- How many of the samples of 100 are less even than the combined
sample of 1000?

- Is a larger (or combined) sample always closer to the actual
probability distribution?
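The last two questions can also be explored numerically. The sketch below is one possible approach, not the activity's own method: the notes do not define "less even", so it assumes a fair six-sided die and measures unevenness as the largest gap between any face's relative frequency and 1/6. The function name `unevenness` and the simulated data are assumptions made for illustration.

```python
import random
from collections import Counter


def unevenness(throws):
    """Largest gap between any face's relative frequency and 1/6.

    This measure is an assumption; other choices (for example the sum
    of squared gaps) would work for the same discussion.
    """
    counts = Counter(throws)
    return max(abs(counts[face] / len(throws) - 1 / 6) for face in range(1, 7))


rng = random.Random(7)  # fixed seed so the run can be repeated
samples = [[rng.randint(1, 6) for _ in range(100)] for _ in range(10)]
combined = [throw for sample in samples for throw in sample]

combined_score = unevenness(combined)
less_even = sum(unevenness(sample) > combined_score for sample in samples)

print(f"unevenness of combined sample of {len(combined)}: {combined_score:.3f}")
print(f"samples of 100 that are less even than the combined sample: {less_even} of 10")
```

Rerunning with different seeds is worthwhile: most samples of 100 will usually be less even than the combined 1000, but nothing forces every one of them to be, which is the point behind the second question.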

### Possible extension:

Abler students may cover the above issues as a class discussion and
move quickly on to Data Matching.

### Possible support:

For less able students, the data acquisition and accumulation is
best done as a teacher-led whole-group activity, so that technical
challenges do not obstruct their view of the main question: what
is the relationship between the abstract probability distribution
and the sample data?