A and B have been collecting data. A has been finding out the heights of students in a class, and B has been timing how long it takes for sunflower seeds to grow.

**Can you think of some different ways they could present their data?**

Once you've had a chance to think, read on to see some ideas.

Heights and growing times are both continuous measurements, so a histogram, a frequency polygon or a box-and-whisker diagram would be suitable for representing this data.
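Before drawing a histogram or a box-and-whisker diagram, it can help to work out the numbers each diagram needs. The Python sketch below shows one way to do this. The heights are made up for illustration, the class width of 5 cm is an arbitrary choice, and the helper names `histogram_counts` and `five_number_summary` are our own rather than part of any library.

```python
from collections import Counter
from statistics import median

# Made-up heights in cm (illustrative only, not real class data).
heights = [148, 152, 155, 157, 158, 160, 161, 163, 165, 166, 168, 171, 174]


def histogram_counts(values, start, width):
    """Count how many values fall in each class interval [lower, lower + width).

    Each interval is identified by its lower boundary.
    """
    counts = Counter(start + width * ((v - start) // width) for v in values)
    return dict(sorted(counts.items()))


def five_number_summary(values):
    """Return (minimum, lower quartile, median, upper quartile, maximum).

    Assumption: the quartiles are the medians of the lower and upper halves,
    leaving out the middle value when the number of values is odd. Other
    conventions exist and can give slightly different quartiles.
    """
    data = sorted(values)
    half = len(data) // 2
    lower_half, upper_half = data[:half], data[-half:]
    return (data[0], median(lower_half), median(data), median(upper_half), data[-1])


width = 5
frequencies = histogram_counts(heights, start=145, width=width)
summary = five_number_summary(heights)

# A rough text version of the histogram: one '#' per student.
for lower, count in frequencies.items():
    print(f"{lower} to {lower + width} cm: {'#' * count}")

print("min, Q1, median, Q3, max =", summary)
```

The frequencies give the bar heights for a histogram with equal class widths (or the points for a frequency polygon), and the five-number summary gives the positions of the whiskers, box and median line.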

The above were examples of one-variable problems where the data is *quantitative*.

**Can you think of some other examples of one-variable quantitative data?
Can you think of other ways of representing one-variable quantitative data?**

Read on to see some ideas.

If the data is discrete, such as shoe sizes or number of siblings, we could use a bar chart. We could also use a pictogram or something similar.

If the data is geographically related, for example the temperature at noon today across the country, we could represent it on a map using colours to show the temperatures (a choropleth map) or using lines to show the limits of each temperature (a contour diagram). As another example, to show the population size of different cities, we might use discs on a map with the size of the disc indicating the population size.

If instead of collecting quantitative data, A and B had been collecting *qualitative* (or *categorical*) data, why would some of the above approaches be unsuitable for representing the data? Which of them would still be suitable?

A and B have collected some more data, but this time with **two** quantitative variables. A collected the age and height of everyone travelling on the 08:36 train this morning, while B collected the length and mass of every goldfish in the pet shop. (This type of data is sometimes called *bivariate data*: each person (or fish) in the sample provides two pieces of data.)

**How could they represent their data this time?**

They could use a scatter graph. If they used two separate diagrams, say two box-and-whisker diagrams, one for the age and one for the height of the passengers (or one for the length and one for the mass of the fish), they would lose the connection between the two variables.
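To see what "losing the connection" means, here is a small Python sketch. The goldfish measurements are made up for illustration, and `pearson` is our own helper for the usual product-moment correlation coefficient. We pair the same masses with the lengths in reverse order: each variable on its own is unchanged, so two separate box-and-whisker diagrams would look identical, but the relationship between the variables is completely different.

```python
from math import sqrt

# Made-up (length in cm, mass in g) pairs for eight goldfish; illustrative only.
fish = [(4.0, 2.1), (5.5, 4.8), (6.0, 6.5), (7.2, 10.4),
        (8.1, 15.0), (9.0, 21.3), (10.4, 30.9), (12.0, 48.2)]


def pearson(pairs):
    """Product-moment correlation coefficient of a list of (x, y) pairs."""
    n = len(pairs)
    mean_x = sum(x for x, _ in pairs) / n
    mean_y = sum(y for _, y in pairs) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in pairs)
    sxx = sum((x - mean_x) ** 2 for x, _ in pairs)
    syy = sum((y - mean_y) ** 2 for _, y in pairs)
    return sxy / sqrt(sxx * syy)


lengths = [length for length, _ in fish]
masses = [mass for _, mass in fish]

# Same lengths and same masses, but each mass is attached to the "wrong" fish:
# the longest fish now gets the smallest mass, and so on.
scrambled = list(zip(lengths, reversed(masses)))

r_original = pearson(fish)
r_scrambled = pearson(scrambled)

print("Masses unchanged as a set:", sorted(m for _, m in scrambled) == sorted(masses))
print(f"Correlation with the true pairing:  {r_original:.2f}")
print(f"Correlation with the wrong pairing: {r_scrambled:.2f}")
```

A scatter graph of `fish` and one of `scrambled` would look very different, even though diagrams of the lengths alone and the masses alone could not tell the two datasets apart.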

In the real world, data scientists often have to make choices about how to present other types of data. Here are some different cases for you to consider:

- If you want to graphically represent data which has **one** quantitative variable and **one** qualitative variable, such as people's gender and height, or cars' fuel efficiency and the type of car, how could you do so? Can you think of more than one way?

- What could you do with data which has **two** qualitative variables, such as people's gender and their favourite character from a certain popular show?
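For the second question, one possible starting point (certainly not the only one) is to count how many people fall into each combination of categories, giving a two-way table. The Python sketch below does this with made-up survey responses; the category labels and the helper name `two_way_table` are purely illustrative.

```python
from collections import Counter

# Made-up survey responses (gender, favourite character); illustrative only.
responses = [
    ("female", "Character A"), ("male", "Character B"), ("female", "Character A"),
    ("male", "Character A"), ("female", "Character C"), ("male", "Character B"),
    ("female", "Character B"), ("male", "Character C"), ("female", "Character A"),
]

# Count how often each (gender, character) combination occurs.
counts = Counter(responses)
row_labels = sorted({gender for gender, _ in responses})
col_labels = sorted({character for _, character in responses})


def two_way_table(counts, row_labels, col_labels):
    """Build a list of rows; a missing combination is counted as 0."""
    return [[counts[(r, c)] for c in col_labels] for r in row_labels]


table = two_way_table(counts, row_labels, col_labels)

# Print the table with a header row.
print(" " * 8 + "".join(f"{c:>14}" for c in col_labels))
for label, row in zip(row_labels, table):
    print(f"{label:<8}" + "".join(f"{n:>14}" for n in row))
```

Once you have the counts, you can think about how to turn the table into a picture: which diagrams could show all of the cells at once?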

How could you graphically represent data which has **two** quantitative variables and **one** qualitative variable?

Can you suggest a type of data where this would be useful? (You might like to download and explore a dataset from https://data.gov.uk/ or elsewhere to get inspiration if you need it!)

What about **three** quantitative variables? Or **one** quantitative and **two** qualitative variables?

Could you extend your ideas to **four** variables (of any type)?

Hans Rosling (1948-2017) was a Professor of International Health in Sweden who became an expert in presenting data. Have a look at this short video (under five minutes) that he made for the BBC showing the development of health and wealth in the world over the last 200 years.

How effective is his presentation in communicating the data? How many variables is he representing in his graphs? How many of them are quantitative and how many are qualitative?

You may like to use some of the interesting datasets available at the JSE Data Archive or on other websites, and plot aspects of them using your ideas. You could use CODAP or a spreadsheet to do the plotting; CODAP is a free online system for visualising data.

How effective are your approaches at communicating underlying patterns in the data? What do you observe in the data? Are any of the things you observe likely to be meaningful in the context of the data, or are they more likely to just be random variation?

*This resource is part of the collection Statistics - Maths of Real Life*