P1.T2.Ch.6 BT Notes (Hypothesis Testing)

dtammerz

Active Member
Hello

With regards to P1.T2 (QA) Ch.6 (Hypothesis Testing) Notes: in the "Identify the steps to test a hypothesis about the difference between two population means" (p.28-29)

Where do the 0.78 and the 1.02 come from? I feel like I'm missing something.

 

David Harper CFA FRM

Subscriber
Hi @dtammerz Those are solutions to the maximum difference (hopefully the "⇒" is not throwing you off; "⇒" signifies "implies")

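The denominator (i.e., the square root) is the standard deviation (aka, standard error) of the difference between correlated means and is equal to sqrt[(4 + 1 - 2*0.3*sqrt(4)*sqrt(1))/n] = 0.398 (shown in the exhibit). Note the division by the sample size n: the variance term alone is 3.8, and the 0.398 is consistent with n = 24.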

So we really just have T = |D|/σ(diff); i.e., the test statistic T is the raw difference standardized by dividing by σ(diff). This is just like the way we standardize the raw difference between the observed sample mean and the null hypothesized mean, X - μ, by dividing it by the SE, (X - μ)/SE, to retrieve the test statistic for a (univariate) sample mean.

Given T = |D|/σ(diff), the max distance |D| = T*σ(diff); in this case, |D| = T*0.398. If we seek two-sided 95.0% confidence, then |D| = 1.96*0.398 = 0.78, and if we seek two-sided 99.0% confidence, then |D| = 2.58*0.398 ≈ 1.03 (the 1.02 in the notes is what a critical value of 2.57 gives). I hope that's helpful!
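
As a rough check of the arithmetic above, here is a minimal Python sketch. The sample size n = 24 is an assumption: it is not stated in this thread, but it is the value that reproduces σ(diff) = 0.398 once the variance term is divided by n. The variable names are illustrative only.

```python
import math

# Inputs quoted in the thread: variances of 4 and 1, correlation of 0.3.
var_x, var_y, rho = 4.0, 1.0, 0.3
n = 24  # ASSUMPTION: not stated in the thread; chosen because it reproduces 0.398

# Standard error of the difference between two correlated sample means.
cov_xy = rho * math.sqrt(var_x) * math.sqrt(var_y)
sigma_diff = math.sqrt((var_x + var_y - 2.0 * cov_xy) / n)

# Largest |D| that still fails to reject the null: |D| = critical value * sigma_diff.
max_d_95 = 1.96 * sigma_diff
max_d_99 = 2.58 * sigma_diff

print(round(sigma_diff, 3), round(max_d_95, 2), round(max_d_99, 3))
```

The 95% bound comes out at 0.78. The 99% bound is sensitive to how the critical value is rounded, which is why it is printed to three decimals here.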
 

dtammerz

Active Member
Thank you David, that makes sense now. One more question: what exactly does the "|D|" notation represent?
 

David Harper CFA FRM

Subscriber
Sure @dtammerz The vertical bars represent absolute value (https://en.wikipedia.org/wiki/Absolute_value); as in, for example, the absolute value of |-2.33| = 2.33. As mentioned, the numerator given by "D" is the raw distance between the sample means; but it doesn't matter if the difference is positive or negative; the example shows a difference of µ(X) - µ(Y) = 0.75, but it wouldn't (and shouldn't) matter if we calculated a raw difference of µ(Y) - µ(X) = -0.75. Thanks,
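
To illustrate why the sign of the raw difference does not matter, here is a small Python sketch. It reuses the 0.75 difference and the 0.398 standard error quoted in this thread; the function name is hypothetical.

```python
sigma_diff = 0.398  # standard error of the difference, as quoted earlier in the thread

def standardized_abs_diff(diff, se):
    """Standardize a raw difference; abs() discards its sign."""
    return abs(diff) / se

t_xy = standardized_abs_diff(0.75, sigma_diff)    # mu(X) - mu(Y)
t_yx = standardized_abs_diff(-0.75, sigma_diff)   # mu(Y) - mu(X)

print(round(t_xy, 2), round(t_yx, 2))  # identical either way
print(t_xy > 1.96)                     # compare against the two-sided 95% critical value
```

Both orderings give the same test statistic, and because 0.75 is less than the 0.78 maximum distance, the comparison against 1.96 does not reject at two-sided 95%.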
 

dtammerz

Active Member
Thank you!!
 

NStha8467

New Member
Hi. A couple of queries if I may:

1) The BT FRM 1 video has nice detail on chi-squared testing (in the context of VaR backtests), but I only see testing of the sample mean (i.e., t testing) in the GARP materials. For the Part 1 exam, just how much facility does one need with chi-squared and F testing? Or is that just background, while the t test is the one to focus on for examinable calculations?

2) In the Miller Ch. 5 EOC worked problems, I see the confidence interval constructed both times using the two-sided lookup/critical t, even if H(0) is one-sided/directional. Why do we not use the smaller one-sided critical t to decide how many standard errors to add to or subtract from the sample mean to frame that interval?

Thank you
 

David Harper CFA FRM

Subscriber
Hi @NStha8467

1) The chi-squared distribution is included for completeness: before GARP "simplified" their econometrics chapters, the four sampling distributions (normal, student's t, chi-squared, and F-distribution) were foundational. The normal/student's t because they test a sample mean, and the chi-squared because it tests the sample variance. True, the formula itself likely has low testability, but it's also not too difficult:

the test statistic being χ^2(df = n - 1) = (S^2/σ^2)*df, where S^2 is the sample variance and σ^2 is the hypothesized population variance.

.... so, to me, it's very natural to include the chi-squared test for sample variance along with the normal/student's t for sample mean. That said, for both the chi-squared and F tests, the quantitative testability is currently low, such that you are probably fine with conceptual recognition.

2) I can't find the example(s) to which you refer. Although the test statistic does not vary with the 1- versus 2-sided hypothesis, the critical value (which informs the one- or two-sided critical region) should be different for the 1- versus 2-sided confidence interval. I don't recall hearing about this specific problem with the Miller questions, which tend to be fairly reliable. Nevertheless, for a sample mean test, the one-sided test definitely implies a smaller critical value; I agree with your general point here. I hope that's helpful,
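
To make the chi-squared statistic from point 1) concrete, here is a minimal Python sketch. Every input is hypothetical (none of these numbers come from the notes or from GARP), and the critical value is copied from a standard chi-squared table rather than computed.

```python
# ASSUMPTION: every number below is hypothetical, for illustration only.
n = 10             # sample size
sample_var = 6.0   # S^2, the sample variance
null_var = 4.0     # sigma^2, the population variance under the null

df = n - 1
chi2_stat = df * sample_var / null_var   # (S^2 / sigma^2) * df

# Upper 5% critical value for df = 9, taken from a standard chi-squared table.
chi2_crit_95 = 16.919

reject_null = chi2_stat > chi2_crit_95
print(chi2_stat, reject_null)
```

In this made-up case the statistic falls below the table value, so the one-sided test would not reject the hypothesized variance.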
 

NStha8467

New Member
Thank you.
The worked questions are Miller's end-of-chapter questions 1 and 2, as covered in the BT video for hypothesis testing, about 28 minutes in for the first one.

I think I follow the confidence interval approach as sample mean +/- critical t × standard error. The interval varies from sample to sample, but "confidence" (one minus alpha) percent of the time it should contain the mean. So we get decision rules for rejection of the null that way.
I also see that we can pivot this to the test statistic instead and then have the significance approach yield decision rules.

I like very much the warning to be careful to decode the exam question when working out your null and whether you need a one or two sided test.

What confused me is where the guided examples and spreadsheets illustrate the confidence interval approach using the critical t for a two-sided test (i.e., 2.262), yet we seem to focus on the one-sided test at 5% significance for the decision rule, given the phrasing of the question.

Looking at question 1: are all of these just illustrative en route to the ultimate answer? As the video concludes, and per Miller's own narrow answer, we are only 1 - 30.1% = 69.9% confident of a mean performance above 40; the t test statistic was not significantly above the null's mean at 5% one-sided.

There was a point made about how (as a rule of thumb) we are looking for a test statistic and critical value of 2 or more, and the sample size in this case is small. So I was wondering if that explained the focus on the two-sided critical t for the confidence interval part. Maybe I'm conflating two things here though.
 

David Harper CFA FRM

Subscriber
Hi @NStha8467 The spreadsheet template aspires to be universally useful (although I see that I can make an improvement with respect to the confidence interval: it only shows a 2-sided CI, which is not useful for this question!). As you can see, the XLS is actually trying to be comprehensive by showing both 1- and 2-sided values. In the case of EOC #1, the observed sample mean is 45.0. For a null hypothesis of H(0) = 40.0, the test statistic (illustrated) is (45 - 40)/9.260 = 0.54, and this test statistic applies whether the test is 1- or 2-sided. It's true that the confidence interval of {24.1, 65.9} is 2-sided, and indeed it uses the 2-sided (2-tailed) critical value of 2.262. So the spreadsheet is trying to give you maximum information ... The critical (lookup) values are functions of the student's t distribution and the degrees of freedom. They are in the first group because they are "prior" to any application/question.
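
For anyone who wants to reproduce these EOC #1 figures outside the spreadsheet, here is a rough Python sketch. The sample mean, null, standard error, and 2.262 are quoted above; the one-sided critical value of 1.833 is an assumption taken from a standard t table (df = 9, 5% one-tailed) and does not appear in this thread.

```python
sample_mean, null_mean, std_error = 45.0, 40.0, 9.260  # figures quoted above

t_two_sided = 2.262  # critical t, df = 9, 5% two-tailed (quoted above)
t_one_sided = 1.833  # ASSUMPTION: standard t-table value, df = 9, 5% one-tailed

# The test statistic is the same whether the test is 1- or 2-sided.
t_stat = (sample_mean - null_mean) / std_error

# Two-sided 95% confidence interval around the sample mean.
ci_two_sided = (sample_mean - t_two_sided * std_error,
                sample_mean + t_two_sided * std_error)

# One-sided 95% lower bound: fewer standard errors are subtracted.
lower_bound_one_sided = sample_mean - t_one_sided * std_error

print(round(t_stat, 2))
print(tuple(round(x, 1) for x in ci_two_sided))
print(round(lower_bound_one_sided, 1))
```

Only the critical value changes between the two framings; under either one, the null of 40.0 sits comfortably inside the non-rejection region.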

... but the question is, "Given the following data sample, how confident can we be that the mean is greater than 40?" and that's a 1-sided question, as I explained in the video. We've observed a sample average of 45.0 and now we're asking about a null of 40.0. That's the key math: we observe 45.0 and we select a null of 40.0, which happens to be only 0.54 standard errors away (i.e., pretty near). If there is only one rejection region (imagine the line that starts the shaded region to the right, at 45.0, with the null to the left and everything to the left being acceptance region), the p-value is 30.1%, which is the area to the right in the one-sided rejection region. If we instead visualized a two-sided test, we would double the rejection region because there is also an equally large one on the left.

But the answer to the question is about interpretation of this math. It's not my favorite question; it's a bit confusing. For the one-sided null (i.e., greater than 40), we'd reject a true null (aka, Type I error) with a p-value probability of 30.1%, so Miller says "For a one-sided t-test with 9 degrees of freedom, the associated probability is 70%. There is a 30% chance that the true mean is found below 40, and a 70% chance that it is greater than 40." I don't think I explained very well in the video that the actual answer is 69.9%, but it sounds like you get that. Phew. Hope that helps, thanks,
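
The 30.1% and 69.9% can be checked without a lookup table. The sketch below is one way to do it, assuming a plain Simpson's-rule integration of the student's t density using only the Python standard library; the helper names are hypothetical and this is not how the spreadsheet computes it.

```python
import math

def t_pdf(x, df):
    """Density of the student's t distribution with df degrees of freedom."""
    coef = math.gamma((df + 1) / 2) / (math.sqrt(df * math.pi) * math.gamma(df / 2))
    return coef * (1 + x * x / df) ** (-(df + 1) / 2)

def upper_tail(t, df, steps=1000):
    """P(T > t) for t >= 0, via Simpson's rule on [0, t]; steps must be even."""
    h = t / steps
    total = t_pdf(0.0, df) + t_pdf(t, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    # Half the mass lies above zero; subtract the area between 0 and t.
    return 0.5 - total * h / 3

t_stat = (45.0 - 40.0) / 9.260            # test statistic from EOC #1
p_one_sided = upper_tail(t_stat, df=9)    # area in the single right-hand tail
p_two_sided = 2 * p_one_sided             # both tails
confidence = 1 - p_one_sided              # one-sided confidence

print(round(p_one_sided, 3), round(p_two_sided, 3), round(confidence, 3))
```

The one-sided tail area comes out at roughly 30.1%, so the one-sided confidence is roughly 69.9%, in line with the figures above.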
 