The variance of a random variable or distribution is the expectation, or mean, of the squared deviation of that variable from its expected value or mean. Thus the variance is a measure of the amount of variation of the values of that variable, taking account of all possible values and their probabilities or weightings.
The variance is the mean of the squared distances from the mean (it sounds a little confusing to say mean twice, but that is what it is). It is expressed as: Var(x) = E( (x - mx) * (x - mx) ), where mx = E(x) is the mean of x.
E( . ) is called the "Expected Value" or expectation. It is usually interchangeable with the concept of the mean, but it tends to be the more mathematical term, used for future events and probabilities rather than past observations.
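To make the formula concrete, here is a minimal sketch in plain Python (the sample values are made up just for illustration):

```python
# Variance as the mean of the squared distances from the mean.
xs = [2.0, 3.0, 3.0, 4.0]   # some made-up sample values of x

mx = sum(xs) / len(xs)                              # mx = E(x), the mean of x
var_x = sum((x - mx) ** 2 for x in xs) / len(xs)    # E( (x - mx)^2 )

print(mx)      # 3.0
print(var_x)   # 0.5
```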
The concept of variance is something like the "Energy" of the "Error" (or "noise"). If there were no errors or noise, the value of the variable x would always be the same: the mean.
Variance is also a measure of uncertainty.
The covariance is Cov(x, y) = E( (x - mx) * (y - my) ), that is, the expected value (mean) of the product of the two errors. It is a measurement of the correlation of the "errors" / "differences from the mean" (not a measurement of the correlation of the variables themselves!). That is, for any error in x (see the sketch after these three cases):
is the error in y usually the same sign??? in this case, the covariance is positive
is the error in y usually the opposite sign??? in this case, the covariance is negative
is the error in y sometimes positive, sometimes negative??? in this case, the covariance is zero or nearly zero... that is, the errors are uncorrelated: one error tells you nothing about the other. (Strictly speaking, zero covariance means uncorrelated, which is weaker than independent, but the intuition is the same.)
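A small sketch of the three sign cases, again in plain Python with made-up samples:

```python
# Covariance as the mean of the product of the two errors.
def cov(xs, ys):
    mx = sum(xs) / len(xs)
    my = sum(ys) / len(ys)
    # E( (x - mx) * (y - my) )
    return sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / len(xs)

xs = [1, 2, 3, 4, 5]
print(cov(xs, [2, 4, 6, 8, 10]))   # positive: the errors share the same sign
print(cov(xs, [10, 8, 6, 4, 2]))   # negative: the errors have opposite signs
print(cov(xs, [5, 1, 5, 1, 5]))    # zero: no consistent sign pattern
```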
So, P holds covariances and is called the Covariance Matrix or "uncertainty matrix". The entries on the diagonal are the covariance of each variable with itself, and that is exactly what we call a variance: variances appear only on the diagonal.
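As a sketch of that structure (assuming NumPy; the sample values are made up):

```python
import numpy as np

x = np.array([2.0, 3.0, 3.0, 4.0])
y = np.array([5.0, 6.0, 7.0, 6.0])

P = np.cov(np.stack([x, y]), bias=True)  # 2x2 covariance matrix
print(P)
# P[0, 0] = var(x), P[1, 1] = var(y)  -> the variances, on the diagonal
# P[0, 1] = P[1, 0] = cov(x, y)       -> the covariances, off the diagonal
```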
Now, let's make a picture of it: we draw a 2D graph using (x, y) pairs...
Cov = 0. Suppose that x has a mean of 3 (mx = 3) and a variance of 1 (sigma_x^2 = 1, sigma_x = 1), and y has a mean of 6 (my = 6) and a variance of 4 (sigma_y^2 = 4, sigma_y = 2). Now, suppose that their covariance is 0 (cov(x, y) = 0). Then the errors are uncorrelated. So when we have an x of 4 (x - mx = 1), the y can be 4, or 8, or maybe 6. Likewise, when y is 4, the x can be 2 or 3 or 4. So the most probable points lie inside an ellipse around (mx, my) = (3, 6). This ellipse is oriented vertically, because it is longer along the y axis (sigma_y = 2) than along the x axis (sigma_x = 1).
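A quick numerical sketch of this case (assuming NumPy; the seed is arbitrary):

```python
import numpy as np

rng = np.random.default_rng(0)
x = rng.normal(3.0, 1.0, size=100_000)   # mx = 3, sigma_x = 1
y = rng.normal(6.0, 2.0, size=100_000)   # my = 6, sigma_y = 2, drawn independently

print(np.cov(x, y))
# Approximately [[1, 0], [0, 4]]: the off-diagonal covariance is ~0,
# and a scatter plot of (x, y) would form a vertical ellipse around (3, 6).
```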
Cov = 2. Maximum covariance. Suppose the same means and variances as before, but now suppose that y = 2 * x. Notice that E(y) = E(2 * x) = 2 * E(x), so 6 = 2 * 3, and Var(y) = E( (y - 6) * (y - 6) ) = E( (2 * x - 6) * (2 * x - 6) ) = E( 2 * (x - 3) * 2 * (x - 3) ) = 4 * Var(x). Correct: Var(y) = 4, Var(x) = 1. What is their covariance? Cov(x, y) = E( (x - 3) * (y - 6) ) = E( (x - 3) * (2 * x - 6) ) = 2 * Var(x) = 2. Then the differences are NOT independent! So when we have an x of 4 (x - mx = 1), the y MUST be 8. Likewise, when y is 4, the x MUST be 2! So the most probable points lie on a straight line, the inclined line y = 2 * x. (And this is why it is the maximum covariance: cov(x, y) = sigma_x * sigma_y = 1 * 2 = 2, i.e. a correlation of 1.)
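The same numerical sketch for this case (again assuming NumPy):

```python
import numpy as np

rng = np.random.default_rng(0)
x = rng.normal(3.0, 1.0, size=100_000)   # mx = 3, sigma_x = 1
y = 2.0 * x                              # so E(y) = 6 and Var(y) = 4

print(np.cov(x, y))
# Approximately [[1, 2], [2, 4]]: cov(x, y) = 2 * Var(x) = 2,
# and every point (x, y) lies exactly on the line y = 2 * x.
```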
Covariance always involves 2 random variables. Variance always involves 1 random variable.
So you write the variance of X as var(X), and the covariance between X and Y as cov(X, Y).
Note that cov(X, X) = var(X).
This means the covariance between X and X itself is just its variance.
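A one-line numerical check of that identity (assuming NumPy; the values are made up):

```python
import numpy as np

X = np.array([2.0, 3.0, 3.0, 4.0])
print(np.cov(X, X, bias=True)[0, 1])   # cov(X, X)
print(np.var(X))                        # var(X) -- the same number, 0.5
```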