Miller - variance of the sample mean vs. sample variance

FlorenceCC

Member
Hello,

I was reading about the Central Limit Theorem today, in the study notes for Miller chapter 4 (p79 specifically), and I realized that I am unclear about the following:

(I) We state that the variance of each random variable is σ^2/n. As shown in the preceding chapter, this is the variance of the sample mean. That makes sense: each time we draw a sample of n variables, the sample has a sample mean, and that sample mean has a variance. However, we have also shown that the sample variance is an unbiased estimator of the population's variance. So why isn't the sample variance the variance of our sample, as opposed to the variance of the sample mean?

(II) In addition, I think I am slightly confused about our underlying population: are we interested in (a) the distribution of the sample means of our various samples, or (b) the distribution of each random variable (let's call them Xi)? I know this is all related, but I think it is linked to my confusion in point (I): for me, σ^2/n is the variance of the sample mean, as opposed to the sample variance, which would be an unbiased estimator of the variance of a selection of n random variables Xi in an overall population of N.

I hope I'm expressing this clearly enough! Thanks in advance for your feedback!

Florence
 

David Harper CFA FRM

Subscriber
Hi @FlorenceCC

The classic example is a six-sided die. As P(X = 1) = P(X = 2) = ... = P(X = 6) = 1/6, this is a discrete uniform probability distribution. It can also be viewed as the distribution that characterizes the population, in the sense that we know the expected values. The population variance is given by E(X^2) - [E(X)]^2 = (1/6 * 1^2 + 1/6 * 2^2 + ... + 1/6 * 6^2) - 3.5^2 = 2.917, which can also be found via the variance formula for a discrete uniform distribution (https://en.wikipedia.org/wiki/Discrete_uniform_distribution) such that [(b - a + 1)^2 - 1]/12 = [(6 - 1 + 1)^2 - 1]/12 = 35/12 = 2.917. This describes the population, before any "rolls of the dice" are observed.
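
The two routes to 2.917 can be checked with a few lines of Python. This is only a sketch using exact fractions; the variable names (faces, var_moments, var_formula) are mine for illustration, not from the notes.

```python
from fractions import Fraction

# A fair six-sided die: each face 1..6 has probability 1/6.
faces = range(1, 7)
p = Fraction(1, 6)

# Route 1: Var(X) = E(X^2) - [E(X)]^2
mean = sum(p * x for x in faces)
second_moment = sum(p * x**2 for x in faces)
var_moments = second_moment - mean**2

# Route 2: discrete uniform formula [(b - a + 1)^2 - 1] / 12
a, b = 1, 6
var_formula = Fraction((b - a + 1) ** 2 - 1, 12)

print(mean, var_moments, var_formula)  # exact fractions
print(round(float(var_moments), 3))    # decimal form of the variance
```

Both routes give the same exact fraction, 35/12, which is the 2.917 quoted above.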

If we roll ten of these dice, we will get a sample (e.g., 2, 4, 6, ... 3, 3). If we compute the variance of this set of ten values, we get the sample variance. If we compute the average of these ten values, we get the sample mean, and we expect it to equal about 3.5 (but at the same time, we don't expect it to equal exactly 3.5, as we expect sampling variation). Similarly, maybe our sample variance is 3.15. This sample variance is an estimate (produced by the estimator, which is the formula we used; we had a choice between dividing by n or n-1, so we had a choice between at least two estimators from which to produce our estimate); the sample variance estimates the population variance. Generically, a sample statistic (an estimate produced by an estimator) estimates the population parameter. So, you are correct to ask "isn't the sample variance the variance of our sample?": it is. In my example, the sample variance of 3.15 estimates the population variance of 2.917, which in this rare, academic, unrealistic situation we just happen to know (typically we do not know the population parameter). So far this is not CLT.
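
To see the choice between the two estimators in action, here is a rough simulation sketch. It assumes Python's standard statistics module, where variance divides by n - 1 and pvariance divides by n; the function name, trial count and seed are arbitrary choices of mine.

```python
import random
import statistics

def average_sample_variances(n=10, trials=10_000, seed=42):
    """Roll n dice `trials` times; average both variance estimates."""
    rng = random.Random(seed)
    total_unbiased = 0.0
    total_biased = 0.0
    for _ in range(trials):
        rolls = [rng.randint(1, 6) for _ in range(n)]
        total_unbiased += statistics.variance(rolls)   # divides by n - 1
        total_biased += statistics.pvariance(rolls)    # divides by n
    return total_unbiased / trials, total_biased / trials

unbiased_avg, biased_avg = average_sample_variances()
print(round(unbiased_avg, 3), round(biased_avg, 3))
```

Any single sample variance (like the 3.15 above) can miss 2.917, but averaged over many samples the n - 1 version should hover near 2.917, while the divide-by-n version should sit lower, near (n - 1)/n * 2.917.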

The CLT comes into magical play when our variable is the average (or sum) of these ten dice (and it works because the dice are i.i.d.!). The average of our ten dice is itself a random variable, aka the sample mean. Each time we re-roll ten dice, we get a different sample variance and a different sample mean. While our expected sample mean is 3.5, it's going to fluctuate above or below 3.5, acting itself like a random variable. The magical (my word!) lesson of the CLT concerns not the distribution of the sample, but the distribution of the sample mean as a random variable. It has variance equal to 2.917/n, or in my example 2.917/10, so it has standard deviation equal to sqrt(2.917/10). Put another way, if we roll 10 dice, we expect the sample mean to be 3.5 but the standard deviation of the sample mean is sqrt(2.917/10) = 0.54. I'd like to conclude that we will observe a sample mean less than 3.0 with a probability of only about 16%, because (3.0 - 3.5)/0.54 = -0.93, or roughly -1.0 standard deviations, but it's a small sample and the population is non-normal, so I'm not really justified in using Z/t yet.

So if we roll 40 dice (sample size is large!), now the CLT magic kicks in as the really rough approximation of normal becomes properly approximate, so to speak! In spite of an underlying population distribution that is non-normal (i.e., uniform), this sample mean has an approximately normal distribution with standard deviation = sqrt(2.917/40) = 0.27, i.e., one-half the previous. So we are referring to the variance/σ of the sample mean as a random variable. I roughly expect the sample mean to fall between 3.0 and 4.0, being roughly +/- 2σ, about 95% of the time. I hope that's helpful!
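
A quick simulation sketch of the same idea: re-roll n dice many times, collect the sample means, and compare their standard deviation with sqrt(2.917/n). The helper name, the 20,000 trials and the seed are my own assumptions; statistics.NormalDist is used only for the normal approximation.

```python
import math
import random
import statistics

POP_VAR = 35 / 12  # population variance of one fair die (2.917)

def simulate_sample_means(n, trials=20_000, seed=0):
    """Return `trials` sample means, each the average of n dice."""
    rng = random.Random(seed)
    return [sum(rng.randint(1, 6) for _ in range(n)) / n
            for _ in range(trials)]

results = {}
for n in (10, 40):
    means = simulate_sample_means(n)
    theoretical_sd = math.sqrt(POP_VAR / n)
    results[n] = {
        "theoretical_sd": theoretical_sd,
        "empirical_sd": statistics.pstdev(means),
        # share of simulated sample means landing in [3.0, 4.0]
        "share_3_to_4": sum(3.0 <= m <= 4.0 for m in means) / len(means),
        # normal approximation of P(sample mean < 3.0)
        "normal_p_below_3": statistics.NormalDist(3.5, theoretical_sd).cdf(3.0),
    }
    print(n, {k: round(v, 3) for k, v in results[n].items()})
```

The empirical standard deviation of the sample means should track sqrt(2.917/n), and it should roughly halve when n goes from 10 to 40. The individual dice stay uniform; only the average tightens around 3.5.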
 

FlorenceCC

Member
Hi David,

Thank you very much for the detailed feedback, it is much clearer for me now. So when we say "the variance of each random variable is equal to the population's variance divided by n", the random variable is the sample mean itself, i.e. the average of the dice (sample size n) and not the dice themselves, am I correct?

Thanks!

Florence
 

David Harper CFA FRM

Subscriber
Hi @FlorenceCC yes, exactly! And I admit the text is imprecise (unclear) at that very sentence and should instead read something like "the variance of each sample mean (viewed itself as the outcome of a random variable) is equal to the population's variance divided by n." Thank you!
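
For reference, the corrected sentence follows from a short derivation. This is a sketch, assuming the n draws X_1, ..., X_n are i.i.d. with variance σ^2; the symbols \bar{X} (sample mean) and S^2 (sample variance with the n - 1 divisor) are notation I am introducing here, not the text's.

```latex
\begin{align*}
\bar{X} &= \frac{1}{n}\sum_{i=1}^{n} X_i \\
\operatorname{Var}(\bar{X})
  &= \frac{1}{n^2}\sum_{i=1}^{n}\operatorname{Var}(X_i)
   = \frac{n\sigma^2}{n^2}
   = \frac{\sigma^2}{n} \\
S^2 &= \frac{1}{n-1}\sum_{i=1}^{n}\left(X_i - \bar{X}\right)^2,
  \qquad E\!\left[S^2\right] = \sigma^2
\end{align*}
```

The second line uses independence (the variance of a sum of independent variables is the sum of their variances). So S^2 estimates the variance σ^2 of a single die, while σ^2/n is the variance of the average of n dice.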
 