| By Anna Lapuk Alap on Tuesday, March 19, 2002 - 01:16 am: |
Hello!
Please, could you explain to me in simple words what the term
"weighted" means when applied to statistical terms? For example,
"weighted frequency"? What is its relationship to plain
frequency (I'd appreciate it if you'd give me a formula)?
Thanks so much.
| By Dan Goodman on Tuesday, March 19, 2002 - 02:48 am: |
Probably the best way to explain the
term "weighted" is with an example.
Suppose you wanted to find the mean (average) age of people in
Britain (don't ask me why). Also, suppose that 51% of the
population are female and 49% male (I think this is about right).
If you know that the mean age of women is 45yrs and the mean age
of men is 40yrs, then you can find the mean age of the whole
population by taking the weighted mean of these two. In other
words, the mean age will be:
(51% x 45yrs + 49% x 40yrs) / 100% = 42.55yrs
In general, a weighted mean is given by the formula
$$m = \frac{w_1 x_1 + w_2 x_2 + \dots + w_n x_n}{w_1 + w_2 + \dots + w_n}$$
Here the w's are the "weights" given to the x's. In the example
above, x1 is the mean age of women and x2 is the mean age of men,
while w1 is the percentage of women and w2 is the percentage of
men.
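To make the formula concrete, here is a minimal Python sketch of the weighted mean, applied to the ages example above. The function name `weighted_mean` is just an illustrative choice, not something from a particular library.

```python
def weighted_mean(values, weights):
    """Return (w1*x1 + ... + wn*xn) / (w1 + ... + wn)."""
    if len(values) != len(weights):
        raise ValueError("values and weights must have the same length")
    total_weight = sum(weights)
    if total_weight == 0:
        raise ValueError("weights must not sum to zero")
    return sum(w * x for w, x in zip(weights, values)) / total_weight


# x1 = mean age of women, x2 = mean age of men (figures from the example)
# w1 = percentage of women, w2 = percentage of men
mean_age = weighted_mean([45, 40], [51, 49])
print(mean_age)

# With equal weights the weighted mean is just the ordinary mean.
plain_mean = weighted_mean([1, 2, 3], [1, 1, 1])
print(plain_mean)
```

Note that only the ratios of the weights matter: using 0.51 and 0.49 instead of 51 and 49 gives the same answer, because the weights are divided by their sum.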
| By Anna Lapuk on Tuesday, March 19, 2002 - 06:15 pm: |
Thank you very much! This is very illustrative! So, these
"weights" are used to take into account an additional
characteristic which divides the dataset into subsets,
right?
Perhaps you could also clarify the meaning and applicability of
variance and standard deviation for me? I do understand that the
standard deviation is a measure of how spread out the
data set is and is also the square root of the variance. But what
the variance itself reflects is a bit vague to me. Its definition
is "the measure of spread of a distribution". What does this mean
in simple words? Could you give me an example so I can feel the
difference between these two?
| By Dan Goodman on Wednesday, March 20, 2002 - 12:09 am: |
(I hope I got that right)
The question is: why are variance and standard deviation the
best measures of the spread of a data set? After all, the
difference between the biggest and smallest elements of a data
set is also a measure of the spread.
The problem is, unless you know about the normal (or Gaussian)
distribution I don't think I can explain why variance and
standard deviation are more important than these other measures
of spread. Basically, it turns out that knowing the mean and
variance of a large data set tells you all you need to know for
most purposes.
A good example is "confidence intervals". Suppose you have a data
set of ages of people: you've collected the ages of 1000 randomly
selected people and worked out the mean m and the standard
deviation s of this data set. What you really want to know is the
mean age of everyone in the country, but there are 60,000,000
people in the UK so you don't want to go and find out everyone's
age. What you can say is that with 95% confidence the mean age of
people in the UK is between m-a and m+a for some number a. The
point is that, for a sample of a given size, the number a depends
only on the variance of the data set (it would be a bit
complicated to explain how to calculate a, but it can be done).
I'm sorry this explanation is a bit useless, I can't really think
of an easy way of explaining it. Someone else on this site might
post something better.
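For anyone who wants to see roughly how a can be computed, here is a small Python sketch. It assumes the usual large-sample normal approximation, a = 1.96 x s / sqrt(n), which is not spelled out in the post above, and the ages are made-up random numbers rather than real survey data.

```python
import math
import random
import statistics


def confidence_interval_95(sample):
    """Approximate 95% confidence interval for the population mean.

    Assumes the large-sample normal approximation: a = 1.96 * s / sqrt(n).
    """
    n = len(sample)
    m = statistics.mean(sample)
    s = statistics.stdev(sample)  # sample standard deviation
    a = 1.96 * s / math.sqrt(n)
    return m - a, m + a


# Hypothetical data: 1000 made-up ages, only to exercise the function.
random.seed(0)
ages = [random.uniform(0, 90) for _ in range(1000)]

low, high = confidence_interval_95(ages)
print(f"mean age is between {low:.2f} and {high:.2f} with about 95% confidence")
```

Under this approximation a shrinks like 1/sqrt(n), so to halve the width of the interval you need about four times as many people in the sample.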
| By Anna Lapuk on Wednesday, March 20, 2002 - 01:22 am: |
OK, this seems to make sense to me somehow. So the
distribution function is related to the standard deviation in
this manner:
f(x)~ 1/s x exp(1/s^2). So the larger s is, the smaller f(0) is
(if x=0, f(x)~1/s), and hence the wider the distribution graph and
the bigger the fraction of x with a larger deviation from the mean.
In other words, the bigger the standard deviation, the more
"spread out" the dataset. This is my understanding of the standard
deviation and its relationship with the Gaussian distribution of
a dataset. But when I come to the variance and its meaning, I
don't feel a difference between it and the standard deviation. Or
is there no difference as far as characterizing the dataset goes?
Does it matter which one is used, the variance or the standard
deviation? Is there any shade of difference in their meaning or use?
| By Dan Goodman on Wednesday, March 20, 2002 - 02:06 am: |
$$f(x) = \frac{1}{\sqrt{2\pi}\, s} \exp\left(-\frac{(x-m)^2}{2s^2}\right)$$
$$f(x) = \frac{1}{\sqrt{2\pi v}} \exp\left(-\frac{(x-m)^2}{2v}\right)$$
$$f(0) = \frac{1}{\sqrt{2\pi v}}$$
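As a quick numerical check, here is a short Python sketch of the normal density written in terms of the variance v = s^2. The function name `normal_pdf` is only illustrative, and the mean is taken to be m = 0 so that the peak of the curve sits at x = 0.

```python
import math


def normal_pdf(x, m, v):
    """Normal density with mean m and variance v (so s = sqrt(v))."""
    return math.exp(-((x - m) ** 2) / (2 * v)) / math.sqrt(2 * math.pi * v)


# Height of the peak for a few standard deviations, with m = 0.
for s in (1.0, 2.0, 4.0):
    v = s ** 2  # variance is the square of the standard deviation
    peak = normal_pdf(0.0, 0.0, v)
    print(f"s = {s}, v = {v}, f(0) = {peak:.4f}")
```

Doubling s halves the peak height, so the curve gets lower and wider. The variance and the standard deviation describe exactly the same spread; the standard deviation is in the same units as the data (years, for ages), while the variance is in squared units.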