Standard Error in P1.T2.209.1 vs in P1.T2.71.2

MissJaguar

Member
Subscriber
God help us!

I am nowhere with my preparation :( I hope, guys, that the situation is not as dark on your end :)

Anyway, I am comparing the logic behind questions P1.T2.209.1 and P1.T2.71.2.

Sample size in both questions is large, by definition (e.g., n > 30). Yet the standard error in P1.T2.209.1 is computed with the sample data, SQRT(0.15*0.85/60), while in P1.T2.71.2 it is computed without adjustment for the sample/population size of 5,000: SQRT(0.02*0.98*5000). Why?

Moreover, in the first case we play with the Bernoulli and in the second with the binomial. How do we decide which to use?

Thanks, if you read. Thanks power infinity if you answer :) :) :)

Nice sunny day,
 
Last edited:

David Harper CFA FRM

Subscriber
Hi @MissJaguar In the PQ PDF, do you see that at the end of each answer there is a link to the source question in the forum? At the end of the 209.3 answer it reads "Discuss in forum here: http://forum.bionicturtle.com/threads/p1-t2-209-t-statisticand-confidence-interval.5318/"

Your question is very good, but it has been asked before and addressed. There really isn't a difference here, except that one concerns the variance (standard error = square root of variance) of the number of defaults, versus the variance of the default rate (default rate = defaults/n, so n drops out!). Please see https://forum.bionicturtle.com/thre...stic-and-confidence-interval.5318/#post-22826 which includes:
  • as this is a binomial, the variance of the number of defaults is (indeed) = n*p*(1-p)
  • but 209.1 concerns the default rate; e.g., 9 defaults/60 = 15% default rate
  • and because the default rate = defaults/n, its variance = [n*p*(1-p)]/n^2 = p*(1-p)/n, where p*(1-p) is the same as the variance of a single Bernoulli (hence the importance of random sample = i.i.d.)
  • this p*(1-p) is the population variance, in % terms; e.g., if p = 10% is actually true, then the variance = 10%*90% = 9.0%
  • and the sampled default rate is a sample average, so per the CLT its variance = population variance/n = p*(1-p)/n, such that the standard error = SQRT[p*(1-p)/n]
... note the difference: the 9% variance is the variance of a single company, which is expected to default 10% of the time, with SQRT(9%) = 30% standard deviation. But when we retrieve the sample average default rate from a set of n companies, the dispersion of that average is going to tighten pretty quickly; e.g., if we sampled half of the companies, the average should be very near to 10%. I hope that explains,
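
If a quick numerical check helps, here is a minimal simulation sketch (assuming Python with NumPy, and treating the 15% sampled rate as the true default probability purely for illustration) that compares the empirical dispersion of the sampled default rate against SQRT[p*(1-p)/n]:

```python
import numpy as np

rng = np.random.default_rng(seed=42)

p, n = 0.15, 60     # default probability and sample size as in 209.1
trials = 100_000    # number of simulated samples of n i.i.d. Bernoulli defaults

# Each row is one sample of 60 companies; the row mean is that sample's default rate
defaults = rng.binomial(n=1, p=p, size=(trials, n))
sample_rates = defaults.mean(axis=1)

analytic_se = np.sqrt(p * (1 - p) / n)   # SQRT[p*(1-p)/n]
simulated_se = sample_rates.std()        # empirical dispersion of the sampled default rate

print(f"analytic SE  = {analytic_se:.4f}")
print(f"simulated SE = {simulated_se:.4f}")
```

The simulated dispersion should land very close to SQRT(15%*85%/60) ≈ 4.61%.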

I hope that helps, and I hope the situation is not dark :) (I doubt it really is very dark; you used four happy faces, which seems like positive energy to me!)
 
Last edited:

MissJaguar

Member
Subscriber
Hahaha...I am an optimist indeed! Thanks, David, for your message + my apologies - I printed the materials in order to take notes. Will use the links in the future. Thanks.
 

David Harper CFA FRM

Subscriber
I knew it! :) I am sincere: I have a theory that people who use emoticons tend to be optimistic :cool:

I've been thinking about the comparison between these two questions; it is a really interesting comparison, and it caused me to temporarily doubt the method in 209.1 (but I *think* it's okay; 71.2 is the easier case). There is a valid confusion here, IMO, because:
  • In 209.1, the standard error divides by (n), where n = 60: SE = SQRT(15%*85%/60), yet
  • In 71.2, the standard error multiplies by (n), where n = 5,000: Z = (120 - 100)/SQRT(2%*98%*5000).
How can this be? The difference is between the number of defaults and the default rate, and both answers rely on the CLT, but for different reasons. In 71.2, the CLT is cited to justify using the normal distribution to approximate the sum of i.i.d. defaults (Bernoullis); i.e., the CLT supports using the normal to approximate a binomial for large values of n, known as the de Moivre–Laplace theorem (http://en.wikipedia.org/wiki/Binomial_distribution), which Jorion also uses in Backtesting VaR (Chapter 6). In 209.1, as the sampled default rate is the sample average of a set of Bernoullis, it uses the CLT fact that the sample average has variance = (population variance)/n.

More intuitively, consider (n) bonds with true default probability = p and i.i.d. defaults, and then increase the number of bonds. As the number of bonds in the sample increases, I think we can expect (see the short sketch after this list):
  • The standard deviation of the number of defaults to increase, as it does non-linearly in SQRT(2%*98%*5000); e.g., given PD = p, if we go from 10 bonds to 100 to 1,000 bonds, we would not expect the S.E. of the total number of defaults to remain constant; rather, we might expect it to increase
  • The standard error of the rate of default to decrease, tending toward zero, as it does in SQRT(15%*85%/60). Zero because that signifies the sampled default rate is tending toward the population default rate; e.g., as we go from 10 bonds to 100 to 1,000, we would expect the sampled default rate to converge on the true default rate
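Here is that intuition in a minimal simulation sketch (assuming Python with NumPy; the 2% default probability and the bond counts are just illustrative):

```python
import numpy as np

rng = np.random.default_rng(seed=0)
p = 0.02          # assumed true i.i.d. default probability (illustrative)
trials = 200_000  # simulated portfolios per bond count

for n in (10, 100, 1_000):
    counts = rng.binomial(n=n, p=p, size=trials)  # number of defaults in each simulated portfolio
    rates = counts / n                            # default rate in each simulated portfolio
    print(f"n = {n:>5}: sd(number of defaults) = {counts.std():6.3f}, "
          f"sd(default rate) = {rates.std():.4%}")
```

The first column should climb (roughly 0.44, then 1.4, then 4.4 defaults) while the second shrinks toward zero (roughly 4.4%, then 1.4%, then 0.44%), which is the behavior described in the bullets above.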
I hope that is helpful, thanks!
 
Last edited:

David Harper CFA FRM

Subscriber
Sorry to append, my last post is so wordy :eek: I just wanted to satisfy myself with a simple example, because I find myself getting stuck on this (I mean the original question: why multiply by n in one case and divide by n in the other? It is not immediately intuitive to me!)

Consider an i.i.d. default probability of 2.0% under three sample sizes. The CLT tells us that the number of defaults, as a sum of i.i.d. Bernoullis, tends toward a normal distribution, but that fact is not used here:
  • n = 100 and s.e. = sqrt(2% * 98% * 100) = 1.4
  • n = 1,000 and s.e. = sqrt(2% * 98% * 1,000) = 4.427
  • n = 10,000 and s.e. = sqrt(2% * 98% * 10,000) = 14.0
Now the corresponding s.e. of the default rate, where the CLT tells us that the default rate of a sample, because it happens also to be an average of a series of i.i.d. Bernoulli variables, has standard error = sqrt(variance/n), and that fact is used here:
  • n = 100 and s.e. = sqrt(2% * 98% / 100 ) = 1.4%; and note that 1.4 / 100 = 1.4%
  • n = 1,000 and s.e. = sqrt(2% * 98% / 1,000) = 0.443%; and note that 4.427 / 1,000 = 0.443%
  • n = 10,000 and s.e. = sqrt(2% * 98% / 10,000) = 0.140%; and note that 14.0 / 10,000 = 0.140%
In case that helps anybody else!
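
And for anyone who prefers to reproduce that arithmetic in code, a minimal sketch (assuming Python):

```python
import math

p = 0.02  # i.i.d. default probability from the example above

for n in (100, 1_000, 10_000):
    se_count = math.sqrt(p * (1 - p) * n)  # s.e. of the number of defaults
    se_rate = math.sqrt(p * (1 - p) / n)   # s.e. of the default rate
    # the "note that" relation above: dividing the count's s.e. by n gives the rate's s.e.
    assert abs(se_rate - se_count / n) < 1e-12
    print(f"n = {n:>6}: se(count) = {se_count:6.3f}, se(rate) = {se_rate:.3%} = {se_count:.3f}/{n:,}")
```

The printed rows should match the figures above: 1.4 and 1.4%, 4.427 and 0.443%, 14.0 and 0.140%.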
 
Last edited:

MissJaguar

Member
Subscriber
David,

You are awesome!!!

Many thanks for the detailed explanation. I will redo the question once again prior to the exam and hope no further questioning of the logic behind the solution(s) arises :)

:) :) :)

P.S. Your theory on human behavior and psychology is surely valid. When the financial industry crashes (let's hope and pray not :)), you will at least know which career step to take next :) :) :)
 

David Harper CFA FRM

Subscriber
Oh @MissJaguar what a shameless display of emoticon goodness, you are clearly competitive ;) ! Thank you for recognizing my true desire to pursue a career in human behavior and psychology :cool:
 