Welcome to NRICH.

 
Comparing random variables project


By Colin Prue on Tuesday, November 05, 2002 - 11:38 am:

Hello all.

Does anybody have any good ideas for a statistics project in which I have to determine whether there is linear correlation in a random-on-random situation?

I'm in desperate need of inspiration here! Anything interesting that I could get data for would be deeply appreciated.

thanks
colin


By Geoff Milward on Tuesday, November 05, 2002 - 12:19 pm:


I guess it depends on the data you have access to; once you have the data, investigating a correlation is just crunching the numbers. It might be fun to plot average GCSE "scores" against A level scores for various schools in your area. Is there a correlation? How about Key Stage 3 vs GCSE?
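For anyone wondering what "crunching the numbers" might look like, here is a minimal sketch of the product-moment (Pearson) correlation coefficient in plain Python. The function name `pearson_r` and the two lists of school averages are made up purely for illustration; they are not real results.

```python
from math import sqrt

def pearson_r(xs, ys):
    """Pearson product-moment correlation coefficient for paired samples."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two equal-length samples with at least 2 points")
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sums of products and squares of deviations from the means
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation is undefined when one variable is constant")
    return sxy / sqrt(sxx * syy)

# Hypothetical figures for six schools, invented for this example only
gcse_avg = [4.8, 5.2, 5.9, 6.1, 6.7, 7.0]
a_level_avg = [28.0, 31.5, 30.0, 35.5, 38.0, 41.0]

r = pearson_r(gcse_avg, a_level_avg)
print(f"r = {r:.3f}")
```

A value of r near +1 or -1 suggests a strong linear relationship and a value near 0 suggests little or none. A project would normally also test whether r is significantly different from zero for the sample size used.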

How about sport data? Golf scores on the first and final days: do they correlate? Or Grand Prix: is there a correlation between position on the grid and position in the race? (Might be close to 1, I suspect.) How has that correlation changed over the years? Does a stronger correlation mean a more boring race, and has GP racing got more boring? All this data can be gleaned from papers or the web.

Data can produce some interesting correlations. There is a correlation in Devon and Cornwall between 18th-century priests' incomes and the amount of smuggling! That is more likely explained by the locals being richer, and giving more to the church, when smuggling was rife than by the clergy being out there themselves, but you never know.
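Since grid and finishing positions are rankings rather than measurements, one option (my suggestion, not the only way to do it) is Spearman's rank correlation. The sketch below assumes there are no tied positions and re-ranks the drivers who finished, so that retirements do not leave gaps. The helper names and the sample positions are hypothetical.

```python
def to_ranks(values):
    """Convert distinct values to ranks 1..n (smallest value gets rank 1)."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0] * len(values)
    for rank, i in enumerate(order, start=1):
        ranks[i] = rank
    return ranks

def spearman_rho(a, b):
    """Spearman's rank correlation, using the no-ties formula."""
    if len(a) != len(b) or len(a) < 2:
        raise ValueError("need two equal-length samples with at least 2 points")
    ranks_a, ranks_b = to_ranks(a), to_ranks(b)
    n = len(a)
    d_squared = sum((ra - rb) ** 2 for ra, rb in zip(ranks_a, ranks_b))
    return 1 - 6 * d_squared / (n * (n * n - 1))

# Invented example: grid and finishing positions of the drivers who finished
grid = [1, 2, 4, 5, 8, 11]
finish = [2, 1, 3, 6, 4, 5]
print(f"rho = {spearman_rho(grid, finish):.3f}")
```

Computing this for each race in a season, and then for several seasons, would give a series of values to compare over the years.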

Geoff


By Stephen Burgess on Tuesday, November 05, 2002 - 12:30 pm:

The problem with Formula 1 is that a minority of drivers finish a race. Formula 3000 would be a better bet (and a more fun and genuine race) to investigate.


By Vicky Neale on Tuesday, November 05, 2002 - 01:23 pm:

I seem to recall people doing things like performance in a maths test against reaction time (are mathematicians half asleep?!), and time taken to run a certain distance against reaction time. I think I did one piece comparing test results from the first test we did at the beginning of the lower 6th with the P1 result; college had all this data so it was very easy to obtain.

Vicky


By Andre Rzym on Tuesday, November 05, 2002 - 01:44 pm:

If you are feeling adventurous, you could give the Met Office a call and ask for their historic temperature data. They have clean daily high and low temperatures for the last 20-30 years for 8 (or so) reference sites (London Heathrow, Edinburgh etc.). You will have to stress that it is for school work and that the data will not be distributed (they normally charge for it).

You could then take, say, 5 years of data for a pair of sites and plot one against the other. Depending on where the sites are, you will see that one is systematically higher than the other and (depending, perhaps, on nearness to the sea) that one has a wider range of temperatures.
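As a rough sketch of how the two effects could be put into numbers: the difference in means shows which site is systematically warmer, the standard deviations show which has the wider range, and a least-squares line summarises the plot of one site against the other. The function name `compare_sites` and the readings are made up for illustration; real data would have to come from the Met Office.

```python
from statistics import mean, stdev

def compare_sites(temps_a, temps_b):
    """Summarise paired daily temperatures from two sites (same days, same order)."""
    if len(temps_a) != len(temps_b) or len(temps_a) < 2:
        raise ValueError("need two equal-length samples with at least 2 points")
    mean_a, mean_b = mean(temps_a), mean(temps_b)
    sxx = sum((a - mean_a) ** 2 for a in temps_a)
    if sxx == 0:
        raise ValueError("site A temperatures are all identical")
    sxy = sum((a - mean_a) * (b - mean_b) for a, b in zip(temps_a, temps_b))
    slope = sxy / sxx                     # least-squares line of B on A
    intercept = mean_b - slope * mean_a
    return {
        "mean_difference": mean_b - mean_a,  # positive means B is warmer on average
        "spread_a": stdev(temps_a),
        "spread_b": stdev(temps_b),
        "slope": slope,
        "intercept": intercept,
    }

# Invented daily highs (degrees C) for two hypothetical sites
site_a = [8.0, 11.5, 14.0, 17.5, 21.0, 19.0, 13.5]
site_b = [6.5, 9.0, 11.0, 13.5, 16.0, 15.0, 10.5]
for name, value in compare_sites(site_a, site_b).items():
    print(f"{name}: {value:.2f}")
```

In a random-on-random situation there are two regression lines (B on A and A on B), so the slope above is only one of the two; swapping the arguments gives the other.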

This sort of analysis has very practical uses. There are many institutions that make or lose money based on temperature:
*) Car insurers lose money in the cold (people have accidents)
*) Water companies lose money in the cold (mains water pipes freeze)
*) Endowment policy providers make money in the cold (people die)
*) Power companies make money in the cold (heating is turned up)
etc.

So banks do deals with these institutions, agreeing to make/receive payments depending on temperatures. In this way the losses/gains of the companies above are reduced. This leaves the bank with exposure to temperatures. They may gain if it is cold in South England and lose if it is cold in North England. If this were their risk, they would doubtless perform the sort of analysis you are doing.

Andre


By Kieren Holt on Thursday, November 07, 2002 - 05:19 pm:

Are tall people cleverer than short people? Does your birth size determine your adult size? Do people get paid more as they get older, and by how much? Does how many pets you have determine how long it takes to walk to school? Do taller people pull more? Do shorter people?
Are these random on random? I don't know what that is...
These are a couple I've always wanted to know...
Or just get an atlas, look up all the boring stats on GDP, health index etc., and use two of them for a correlation.


By William Hall on Friday, November 08, 2002 - 08:24 pm:

It might be worth making a small point about the two variables you choose to test for correlation: just because two variables show some correlation does not mean that they are directly linked. There may be some intermediary link(s) which is the real reason for the correlation, or it may be plain coincidence! My point is that there should be some plausible reason (even if it is vague) why the two could be linked, and that some 'causality' is involved, i.e. the occurrence of one event can influence the outcome of the other. This is different from pure correlation. So you at least need to choose two sensible variables rather than two variables chosen completely at random!
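To illustrate the intermediary-link idea, here is a small simulation under assumptions I have chosen myself: a hidden factor feeds into two variables that have no direct effect on each other, and they still come out correlated. All the names and the noise levels are hypothetical.

```python
import random
from math import sqrt

def pearson_r(xs, ys):
    """Pearson correlation coefficient for paired samples."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / sqrt(sxx * syy)

def simulate(n=2000, seed=0):
    """Return (r for two variables sharing a hidden factor, r for unrelated ones)."""
    rng = random.Random(seed)  # fixed seed so the run is repeatable
    hidden = [rng.gauss(0, 1) for _ in range(n)]
    # x and y each depend on the hidden factor plus their own noise,
    # but neither one influences the other directly
    x = [h + rng.gauss(0, 1) for h in hidden]
    y = [h + rng.gauss(0, 1) for h in hidden]
    unrelated = [rng.gauss(0, 1) for _ in range(n)]
    return pearson_r(x, y), pearson_r(x, unrelated)

r_linked, r_unrelated = simulate()
print(f"shared hidden factor: r = {r_linked:.2f}")
print(f"no connection at all: r = {r_unrelated:.2f}")
```

With equal variances for the hidden factor and the noise, the theoretical correlation between x and y is 0.5, even though changing x would do nothing to y.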

Bill