quantitative methods - iii
TRANSCRIPT
© EduPristine CFA - Level – I © EduPristine – www.edupristine.com
Quantitative Methods - III
Mapping to Curriculum
Reading 10: Sampling and Estimation
Reading 11: Hypothesis Testing
Reading 12: Technical Analysis
Reading 10: Sampling and Estimation
Coverage of the Reading 10
Central Limit Theorem
Sampling Distribution
Standard error of sample mean
Student’s t-distribution
Confidence Interval
Sampling
A probability sample is a sample selected such that each item or person in the population being studied has a known likelihood of being included in the sample.
The sampling distribution of the sample mean is a probability distribution consisting of all possible sample means of a given sample size selected from a population.
Need for Sampling:
The physical impossibility of checking all items in the population.
The cost of studying all the items in a population.
The sample results are usually adequate.
Contacting the whole population would often be time-consuming.
The destructive nature of certain tests.
Time-Series and Cross-Sectional Data
Time Series
A sequence of data collected at discrete and equally spaced intervals of time.
For example, the quarterly revenue figures of a public company
While choosing a time interval over which the data is collected, the analyst may take into account changes in external factors such as fixed vs. floating interest rate scenarios or tight vs. loose monetary policies.
Cross Sectional Data
Data on some characteristic of individuals, groups, companies or geographical locations.
For example, the 2012 EPS of all stocks in the S&P 500.
While selecting data, the analyst must consider whether it comes from the same underlying population. For example, while looking at fixed capital, aviation companies have large fixed assets but a small size textile industry may not and their comparison may not be meaningful.
Central Limit Theorem
For a population with mean μ and variance σ², the sampling distribution of the means of all possible samples of size n drawn from the population will be approximately normally distributed when n is large.
The mean of the sampling distribution is equal to μ and its variance is equal to σ²/n.
How is variance related to standard error?
As the sample size gets large (typically n > 30), the sampling distribution of the sample mean becomes almost normal regardless of the shape of the population distribution.
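The Central Limit Theorem statement above can be checked with a small simulation. The sketch below is only an illustration: the exponential population (μ = 1, σ² = 1), the sample size of 30, the number of samples and the seed are all assumptions chosen for the demo, not values from the slides.

```python
import random
import statistics

random.seed(42)  # fixed seed so the illustration is repeatable

SAMPLE_SIZE = 30       # n, the size of each sample (assumed)
NUM_SAMPLES = 10_000   # how many samples are drawn (assumed)

# Assumed population: exponential with rate 1, so mu = 1 and sigma^2 = 1.
# It is strongly skewed, i.e. clearly not normal.
sample_means = [
    statistics.fmean(random.expovariate(1.0) for _ in range(SAMPLE_SIZE))
    for _ in range(NUM_SAMPLES)
]

mean_of_means = statistics.fmean(sample_means)
variance_of_means = statistics.pvariance(sample_means)

# Theory: mean of sample means = mu, variance of sample means = sigma^2 / n.
print(f"mean of sample means     = {mean_of_means:.4f} (theory: 1.0000)")
print(f"variance of sample means = {variance_of_means:.4f} (theory: {1 / SAMPLE_SIZE:.4f})")
```

The simulated mean and variance of the sample means should land close to μ and σ²/n, even though the population itself is far from normal.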
Sampling Error
The sampling error is the difference between a sample statistic and its corresponding population parameter. It is found by subtracting the value of the parameter from the value of the statistic.
For example, suppose a poll is conducted in which the population is all the students in a school and the sample is one class. If the sample has a mean GPA of 3.4 and the population’s mean GPA is 3.2, then the sampling error is 3.4 − 3.2 = 0.2.
Methods of Probability Sampling
Simple Random Sampling: A sample selected so that each item or person in the population has the same chance of being included. This requires that the entire population be known and serially numbered.
Systematic Random Sampling: The items or individuals of the population are arranged in some order. A random starting point is selected and then every kth member of the population is selected for the sample. Used when the entire population cannot be identified.
Stratified Random Sampling: A population is first divided into subgroups called strata, and a sample is selected from each stratum. This ensures that all subgroups are represented in the sample. Its estimates have a smaller variance than estimates obtained from simple random sampling. Example: Bond indexing
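The three probability sampling methods can be sketched in a few lines. This is a minimal illustration, not a production sampler: the population of 20 bonds, the "rating" field used as the stratum, and the function names are all hypothetical.

```python
import random

random.seed(7)  # fixed seed so the illustration is repeatable

# Hypothetical population: 20 bonds, each tagged with a rating bucket (the stratum).
population = [{"id": i, "rating": "AAA" if i < 10 else "BBB"} for i in range(20)]

def simple_random_sample(items, n):
    """Every item has the same chance of being selected."""
    return random.sample(items, n)

def systematic_sample(items, k):
    """Pick a random start among the first k items, then take every k-th item."""
    start = random.randrange(k)
    return items[start::k]

def stratified_sample(items, key, n_per_stratum):
    """Split the items into strata by `key`, then sample randomly within each stratum."""
    strata = {}
    for item in items:
        strata.setdefault(item[key], []).append(item)
    sample = []
    for members in strata.values():
        sample.extend(random.sample(members, n_per_stratum))
    return sample

srs = simple_random_sample(population, 4)
sys_sample = systematic_sample(population, 5)
strat = stratified_sample(population, "rating", 2)
```

Only the stratified sample guarantees that both rating buckets appear; the simple random sample may, by chance, contain bonds from a single bucket.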
Developing Sampling Distributions
Suppose there is a population of the 4 oldest scientists in a university: Jack, Andrew, Michelle and Tom
The random variable X is the age of each individual
• Values of X: 78, 76, 72, 74
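Summary Measures for the Population Distribution
Average age: \( \mu = \frac{\sum_{i=1}^{N} X_i}{N} = \frac{78 + 76 + 72 + 74}{4} = 75 \)
Standard deviation: \( \sigma = \sqrt{\frac{\sum_{i=1}^{N} (X_i - \mu)^2}{N}} = 2.236 \)
[Figure: Ages of Population (bar chart of the ages of Andrew, Jack, Michelle and Tom)]
[Figure: Prob. of selection (bar chart of the selection probability for Andrew, Jack, Michelle and Tom)]
Optional Topic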
All Possible Samples of Size n = 2
16 Samples of size n=2 each
1st Obs   2nd Observation
          78      76      74      72
78        78,78   76,78   74,78   72,78
76        78,76   76,76   74,76   72,76
74        78,74   76,74   74,74   72,74
72        78,72   76,72   74,72   72,72
16 Sample Means
1st Obs   2nd Observation
          78      76      74      72
78        78      77      76      75
76        77      76      75      74
74        76      75      74      73
72        75      74      73      72
[Figure: Sampling Distribution of Sample Means (bar chart of probability against sample mean, 72 to 78)]
Summary Measures for the Sampling Distribution
The mean of the sample means (N = 16 samples):
\( \mu_{\bar{x}} = \frac{\sum_{i=1}^{N} \bar{X}_i}{N} = \frac{72 + 73 + 73 + \dots + 78}{16} = 75 \)
The standard deviation of the sample means:
\( \sigma_{\bar{x}} = \sqrt{\frac{\sum_{i=1}^{N} (\bar{X}_i - \mu_{\bar{x}})^2}{N}} = \sqrt{\frac{(72-75)^2 + (73-75)^2 + \dots + (78-75)^2}{16}} = 1.58 \)
Two important points worth noting in population and sampling distributions:
• The population mean and the mean of the sample means are the same: both equal 75.
• Variance of the population = 2.236² = 5 and variance of the sample means = 1.58² = 2.5, which is lower than the population variance.
Also, the variance of the sampling distribution of the mean, (Standard Error)², equals Population Variance / Sample Size = 5/2 = 2.5.
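The summary measures above can be reproduced by enumerating every possible sample. The sketch below uses the four ages from the slides and assumes, as the 16-cell tables imply, that samples of size 2 are drawn with replacement and that order matters.

```python
from itertools import product
from statistics import fmean, pvariance

ages = [78, 76, 74, 72]  # the population from the slides

# All 16 ordered samples of size 2, drawn with replacement.
samples = list(product(ages, repeat=2))
sample_means = [fmean(s) for s in samples]

population_mean = fmean(ages)
population_variance = pvariance(ages)
mean_of_sample_means = fmean(sample_means)
variance_of_sample_means = pvariance(sample_means)

print(len(samples), population_mean, population_variance)
print(mean_of_sample_means, variance_of_sample_means)
```

The variance of the sample means comes out as the population variance divided by the sample size (5 / 2), which is the square of the standard error.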
Standard Error of sample mean
It is the standard deviation of the distribution of the sample means
When the standard deviation of the population is known, the standard error of the sample mean is calculated as:
Standard error of sample mean = Standard deviation of population / Square root of the sample size (n)
Example: The mean hourly wage for Mumbai farm workers is $13.50 with a population standard deviation of $2.90. Calculate & interpret the standard error of the sample mean for a sample size of 30
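Answer: Because the population standard deviation is known, the standard error of the sample mean is $2.90 / √30 = $0.53. Interpretation: if many samples of size 30 were drawn, the sample means would be distributed around $13.50 with a standard deviation of about $0.53.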
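As a quick check of the farm-worker example, the standard error formula can be written as a one-line helper. The function name is hypothetical; the inputs are the values from the example.

```python
import math

def standard_error(population_sd: float, n: int) -> float:
    """Standard error of the sample mean when the population sd is known."""
    return population_sd / math.sqrt(n)

se = standard_error(2.90, 30)
print(f"standard error = ${se:.2f}")
```

Because n sits under a square root, quadrupling the sample size only halves the standard error.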
Desirable properties of an estimator
Unbiasedness: the expected value of an estimator is equal to the parameter you are trying to estimate
Efficiency: the variance of the estimator's sampling distribution is smaller than that of all other unbiased estimators of the parameter you are trying to estimate
Consistency: the accuracy of the parameter estimate increases as the sample size increases
Point Estimate & Confidence Interval
Point estimates: These are the single (sample) values used to estimate population parameters
Confidence interval: It is a range of values in which the population parameter is expected to lie
For a large sample (n ≥ 30), the confidence interval takes the following form:
CI = m ± Z*sx
True for a population distribution
Where, m is the mean of the population
sx is the standard deviation of the population
For a sample mean,
Point estimate ± (reliability factor * standard error)
CI = m ± Z*(sx/√n), where m is now the sample mean (the point estimate) and sx/√n is the standard error
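The "point estimate ± reliability factor × standard error" recipe can be sketched with the standard library's NormalDist. The function name and the inputs (sample mean 150, population standard deviation 10, n = 100) are assumptions chosen for illustration.

```python
import math
from statistics import NormalDist

def z_confidence_interval(sample_mean, population_sd, n, confidence=0.95):
    """Two-sided confidence interval for the mean when the population sd is known."""
    alpha = 1 - confidence
    z = NormalDist().inv_cdf(1 - alpha / 2)        # reliability factor, about 1.96 at 95%
    margin = z * population_sd / math.sqrt(n)      # reliability factor * standard error
    return sample_mean - margin, sample_mean + margin

low, high = z_confidence_interval(sample_mean=150, population_sd=10, n=100)
print(f"95% CI: {low:.2f} to {high:.2f}")
```

Raising the confidence level widens the interval, while a larger n narrows it.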
Student’s t – distribution (in cases where n < 30)
Student’s t-distribution, or simply the t-distribution, is a bell-shaped probability distribution that is symmetrical about its mean
It is the appropriate distribution to use when constructing confidence intervals based on small samples (n < 30) from a population with unknown variance and a normal, or approximately normal, distribution
It is defined by a single parameter, the degrees of freedom (df):
Degrees of freedom = n-1
It has more probability in the tails (“fatter tails”) than the normal distribution, which means higher kurtosis.
As the degrees of freedom get larger, the shape of the t-distribution more closely approaches a standard normal distribution
Confidence Interval: CI = m ± t * (s/√n), where m is the sample mean, s is the sample standard deviation and s/√n is the standard error
Calculate & interpret a confidence interval for a sample distribution given population mean, and assuming a normal distribution
Population having normal distribution with a known variance: Confidence interval for population mean is
\( \bar{x} \pm z_{\alpha/2} \cdot \frac{\sigma}{\sqrt{n}} \), where σ is the standard deviation of the population and n is the sample size
Population is normal with unknown variance: we can use the t-distribution to construct a confidence interval as
\( \bar{x} \pm t_{\alpha/2} \cdot \frac{s}{\sqrt{n}} \), where s is the sample standard deviation and n is the sample size
Population with unknown variance given a large sample from any type of distribution
If the distribution is non-normal but the population variance is known, the z-statistic can be used as long as the sample size is large (n ≥ 30)
If the distribution is non-normal but the population variance is unknown, the t-statistic can be used as long as the sample size is large (n ≥ 30)
This means that when sampling from a non-normal distribution, we cannot create a confidence interval if the sample size is less than 30
Selection of Sample Size
Factors affecting the width of a confidence interval and the Reliability Factor:
The choice of statistic (t or z values)
Choice of degree of confidence (90%, 95%, 99% levels of confidence)
Choice of the sample size
• A larger sample size decreases the width of a confidence interval, all else equal
Considerations to be made while deciding to increase the sample size:
Risk of sampling from more than one population
Additional expenses that outweigh the value of additional precision.
Standard Error of the Sample Mean = Sample Standard Deviation / √(Sample Size)
Sampling Related Issues
Data mining bias
It is the practice of determining a model by extensive searching through a dataset for statistically significant patterns.
It can be tested by using out-of-sample data
Two signs that may indicate the presence of data mining bias:
• Low significance levels
• No plausible economic rationale behind the variable.
Sample Selection Bias
Arises when data availability leads to certain entities being excluded from the analysis.
This is a major issue in the hedge fund industry. Since performance disclosure is not mandatory, hedge fund returns are difficult to obtain.
This is also a problem in the mutual fund industry, as only funds that currently exist are available in the database. Funds that no longer exist, perhaps due to poor performance, are not available in the database. This leads to survivorship bias.
Sampling Related Issues
Look-ahead bias
Arises while using information that was not available on the test date.
For example, if using P/BV ratios, the BV may not be available till sometime in the following quarter.
Time-Period Bias
Arises when the analysis is based on a time period that may make the results time-period specific.
For example, a time period too short may give results that may not hold in the long run.
A time period too long has a potential for structural changes, in which one segment cannot be compared to the other segment. It could result in two different return distributions.
Question
1. As compared to normal distribution, the t-distribution has:
A. Similar tails
B. Fatter tails
C. Narrower tails
2. Which of the following is most likely to be a property of an estimator?
A. Correctness
B. Reliability
C. Consistency
3. The mean age of all CFA candidates is 30 years. The mean age of a random sample of 100 candidates is found to be 27.5 years. The difference, 30 − 27.5 = 2.5, is called the:
A. Random error
B. Sampling error
C. Population error
Questions (Cont…)
4. Assume that a population has a mean of 14 with a standard deviation of 3. If a random sample of 64 observations is drawn from this population, the standard error of the sample mean is closest to:
A. 0.575 B. 0.375 C. 0.575
5. The population mean is 30 & the mean of a sample of size 144 is 28.5. The variance of the sample is 25. The standard error of the sample mean is closest to:
A. 0.450 B. 0.317 C. 0.417
6. A random sample of 100 mobile store customers spent an average of $150 at the store. Assuming the distribution is normal & the population standard deviation is $10, the 95% confidence interval for the population mean is closest to:
A. $148.04 to $151.96
B. $144.08 to $159.96
C. $149.04 to $152.96
Questions (Cont…)
7. The Central Limit Theorem is best described as stating that the sampling distribution of the sample mean will be approximately normal for large-size samples:
A. if the population distribution is normal.
B. if the population distribution is symmetric.
C. for populations described by any probability distribution.
Solution
1. B. The t-distribution has fatter tails compared to normal distribution
2. C. Consistency, efficiency and unbiasedness are desirable properties of an estimator
3. B. It is the correct definition of the sampling error
4. B. Standard error = 3/√64 = 3/8 = 0.375
5. C. Standard error = √25/√144 = 5/12 = 0.417
6. A. Confidence interval is 150 ± 1.96(10/√100) = 150 ± 1.96 = 148.04 to 151.96
7. C.
Reading 11: Hypothesis Testing
Coverage of the Reading 11
Hypothesis Test
Type-1,2 error
P-Value
T-test
F-test, Chi-square test
Hypothesis Testing
A statistical hypothesis test is a method of making statistical decisions from and about experimental data.
Null-hypothesis testing answers the question:
“How well do the findings fit the possibility that chance factors alone might be responsible?”
Example: Does your score of 6/10 imply that I am a good teacher?
Key steps in Hypothesis Testing
Null Hypothesis (H0): The hypothesis that the researcher wants to reject
Alternate Hypothesis(Ha): The hypothesis which is concluded if there is sufficient evidence to reject null hypothesis
Test Statistic
Rejection/Critical Region
Conclusion
Launching a niche course for MBA students?
Christos, a brand manager for a leading financial training center, wants to introduce a new niche finance course for MBA students. He met some industry stalwarts and found that, with the skills acquired by attending such a course, the students would be able to land a good job.
He meets a random sample of 100 students and discovers the following characteristics of the market
Mean household income = $20,000
Interest level in students = high
Current knowledge of students for the niche concepts = low
Christos strongly believes the course would be adequately profitable if the students have the buying power for it. They would be able to afford the course only if the mean household income is greater than $19,000.
Would you advise Christos to introduce the course?
What should be the hypothesis?
• Hint: What is the point at which the decision changes (19,000 or 20,000)?
• What about the alternate hypothesis?
What other information do you need to ensure that the right decision is arrived at?
• Hint: confidence intervals/ significance levels?
• Hint: Is there any other factor apart from mean, which is important? How do I move from population parameters to standard errors?
Criterion for Decision Making
What is the risk still remaining, when you take this decision?
• Hint: Type-I/II errors?
• Hint: P-value
To reach a final decision, Christos has to make a general inference (about the population) from the sample data.
Criterion: Mean income across all households in the market area under consideration.
• If the mean population household income is greater than $19,000, then Christos should introduce the course.
Christos’s decision making is equivalent to either accepting or rejecting the hypothesis:
• The population mean household income in the new market area is greater than $19,000
The term one-tailed signifies that all z-values that would cause Christos to reject H0 are in just one tail of the sampling distribution
• μ -> Population Mean
• H0: μ ≤ $19,000
• Ha: μ > $19,000
Identifying the Critical Sample Mean Value – Sampling Distribution
Sample mean values greater than $19,000 (that is, x̄-values on the right-hand side of the sampling distribution centered on µ = $19,000) suggest that H0 may be false.
More importantly, the farther to the right x̄ is, the stronger is the evidence against H0
[Figure: sampling distribution of the sample mean centered on $19,000, with the Critical Value (Xc) marked in the right tail]
Reject H0 if the sample mean exceeds Xc
Computing the Criterion Value
Standard deviation for the sample of 100 households is $4,000. The standard error of the mean (sx) is given by:
\( s_{\bar{x}} = \frac{s}{\sqrt{n}} = \frac{\$4{,}000}{\sqrt{100}} = \$400 \)
The critical mean household income xc is found through the following two steps:
• Determine the critical z-value, zc. For α = 0.05:
– zc = 1.645.
• Substitute the values of zc, sx, and μ (under the assumption that H0 is "just" true)
• Critical Value xc
• xc = μ + zc*sx = $19,000 + 1.645 × $400 = $19,658.
• In this case, the observed sample statistic ($20,000) is greater than the critical value ($19,658), so the null hypothesis is rejected.
Decision Rule: If the sample mean household income is greater than $19,658, reject the null hypothesis and introduce the new course
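The two steps for finding the critical sample mean can be sketched as follows. The variable names are hypothetical; the inputs (hypothesized mean $19,000, sample standard deviation $4,000, n = 100, α = 0.05, observed mean $20,000) come from the example.

```python
import math
from statistics import NormalDist

mu_0 = 19_000         # hypothesized mean household income under H0
s = 4_000             # sample standard deviation
n = 100               # sample size
alpha = 0.05          # significance level (one-tailed)
sample_mean = 20_000  # observed sample mean

standard_error = s / math.sqrt(n)       # 400
z_c = NormalDist().inv_cdf(1 - alpha)   # about 1.645
x_c = mu_0 + z_c * standard_error       # about 19,658

reject_h0 = sample_mean > x_c
print(f"critical value = ${x_c:,.0f}, reject H0: {reject_h0}")
```

Working on the x̄ scale, as here, is equivalent to comparing the z-statistic with zc; both lead to the same decision.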
Test Statistic
The value of the test statistic is simply the z-value corresponding to x̄ = $20,000.
\( Z = \frac{\bar{x} - \mu}{s_{\bar{x}}} = \frac{\$20{,}000 - \$19{,}000}{\$400} = 2.5 \)
Here, sx is the standard error
[Figure: sampling distribution centered on μ = $19,000 (Z = 0); the critical value Xc = $19,658 (Zc = 1.645, α = 0.05) separates the "Do not Reject H0" region from the "Reject H0" region, and the observed x̄ = $20,000 (Z = 2.5) lies in the rejection region]
There is a significant difference between the hypothesized population parameter and the observed sample statistic =>
Mean income > 19,000 =>
Launch the course
Errors in Estimation
Please note: You are inferring for a population, based only on a sample
• This is not proof that your decision is correct; it is just a hypothesis
There is still a chance that your inference is wrong. How do I quantify the prob. of error in inference?
Type I and Type II Errors:
Type I error occurs if the null hypothesis is rejected when it is true
Type II error occurs if the null hypothesis is not rejected when it is false
Significance Level:
α -> Significance level: The upper-bound probability of a Type I error
1 - α -> Confidence level: The complement of the significance level
The power of a test is the probability of correctly rejecting the null.
Inference \ Actual | H0 is True | H0 is False
H0 is True | Correct Decision, Confidence Level = 1-α | Type-II Error, P(Type-II Error) = β
H0 is False | Type-I Error, Significance Level = α | Correct Decision, Power = 1-β
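The meaning of the significance level as the probability of a Type I error can be illustrated by simulation. This is a rough sketch under stated assumptions: H0 is made true by construction (μ = $19,000), and each sample mean is drawn directly from a normal distribution with the $400 standard error of the Christos example rather than by simulating 100 households; the trial count and seed are arbitrary.

```python
import random

random.seed(0)  # fixed seed so the illustration is repeatable

MU_0 = 19_000          # true population mean, so H0 holds in every trial
STANDARD_ERROR = 400   # s / sqrt(n) from the example
X_CRITICAL = 19_658    # one-tailed critical value at alpha = 0.05
TRIALS = 100_000       # number of simulated samples (assumed)

# Shortcut (assumption): draw each sample mean straight from N(MU_0, STANDARD_ERROR).
rejections = sum(
    random.gauss(MU_0, STANDARD_ERROR) > X_CRITICAL for _ in range(TRIALS)
)
type_i_rate = rejections / TRIALS

print(f"share of trials that wrongly reject H0: {type_i_rate:.3f}")
```

The share of wrongful rejections should settle near α = 0.05, which is exactly what the significance level promises.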
P - Value – Actual Significance Level
The p-value is the smallest level of significance at which the null hypothesis can be rejected.
P-value
The probability of obtaining an observed value of x̄ (from the sample) as high as $20,000 or more when the actual population mean (μ) is only $19,000 = 0.00621
Calculated probability of rejecting the null hypothesis (H0) when that hypothesis (H0) is true (Type I error)
The actual significance level of 0.00621 in this case means that the odds are about 62 out of 10,000 that a sample mean income of $20,000 or more would have occurred entirely due to chance (when the population mean income is $19,000)
[Figure: sampling distribution centered on μ = $19,000 (Z = 0), showing the p-value = 0.00621 area in the right tail, inside the α = 0.05 "Reject H0" region]
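The p-value quoted above is the right-tail area beyond the observed z-statistic. A minimal sketch using the standard library's NormalDist, with the example's numbers as inputs:

```python
from statistics import NormalDist

sample_mean = 20_000
mu_0 = 19_000
standard_error = 400

z = (sample_mean - mu_0) / standard_error   # 2.5
p_value = 1 - NormalDist().cdf(z)           # right-tail probability

alpha = 0.05
print(f"z = {z}, p-value = {p_value:.5f}, reject H0: {p_value < alpha}")
```

Since the p-value is far below α = 0.05, H0 would also be rejected at the stricter 1% level.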
Some variations in the Z-Test - I
What if Christos surveyed the market and found that the student behavior is estimated to be:
They would find the training too expensive if their household income is < US$ 19,000 and hence would not have the buying power for the course?
They would perceive the training to be of inferior quality, if their household income is > US$19,000 and hence not buy the training?
How would the decision criteria change? What should be the testing strategy?
Hint: From the question wording infer: Two tailed testing
Appropriately modify the significance value and other parameters
Use the Z-test
Appropriate change in the decision making and testing process:
Students will not attend the course if:
• The household income >$19,000 and the students perceive the course to be inferior
• The household income is <$19,000
This becomes a two-tailed test wherein the students will join the course only when the household income lies within particular boundaries, i.e. the household income should be neither very high nor very low
Two-Tailed Test
Now the test is modified to a two-tailed test, which signifies that the z-values that would cause Christos to reject H0 are in both tails of the sampling distribution
• μ -> Population Mean
• H0: μ = $19,000
• Ha: μ ≠ $19,000
Since we are checking for a significant difference at both ends, it is a two-tailed test
The lower boundary = \( \mu - Z_{\alpha/2} \cdot s_{\bar{x}} = \$19{,}000 - 1.96 \times \$400 = \$18{,}216 \)
The upper boundary = \( \mu + Z_{\alpha/2} \cdot s_{\bar{x}} = \$19{,}000 + 1.96 \times \$400 = \$19{,}784 \)
Conclusion: If the sample mean household income lies between $18,216 and $19,784, H0 is not rejected and the students will attend the course, at 95% confidence
[Figure: sampling distribution centered on μ = $19,000 (Z = 0), with a "Reject H0" region of α = 0.025 in each tail and the "Do not Reject H0" region in between]
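The two-tailed boundaries can be reproduced by splitting α equally between the tails. The sketch below uses the example's numbers; the helper name reject_h0 is hypothetical.

```python
from statistics import NormalDist

mu_0 = 19_000
standard_error = 400
alpha = 0.05

z = NormalDist().inv_cdf(1 - alpha / 2)   # about 1.96, with alpha/2 in each tail
lower = mu_0 - z * standard_error
upper = mu_0 + z * standard_error

def reject_h0(sample_mean: float) -> bool:
    """Reject H0 when the sample mean falls in either tail."""
    return sample_mean < lower or sample_mean > upper

print(f"do not reject H0 between ${lower:,.0f} and ${upper:,.0f}")
```

Note that the two-tailed critical z (about 1.96) is larger than the one-tailed value (about 1.645) at the same α, so the upper boundary moves further out.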
t-test
The t-distribution is a probability distribution defined by a single parameter known as degrees of freedom (df).
Like the standard normal distribution, a t-distribution has a mean of zero.
However, unlike a standard normal distribution that has a variance of one, a t-distribution has a variance greater than one.
The t-distribution also has fatter tails than a normal distribution.
The t-distribution approaches a normal distribution as the degrees of freedom increases.
A sample size greater or equal to 30 is treated as a large sample and a sample less than 30 is treated as a small sample.
The test statistic for a sample size n (and degrees of freedom n-1) is given by:
\( t_{n-1} = \frac{\bar{X} - \mu_0}{s / \sqrt{n}} \)
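The t-statistic formula translates directly into code. The observations and the hypothesized mean below are hypothetical values picked only to make the arithmetic easy to follow; the critical value would still be read from a t-table with n − 1 degrees of freedom.

```python
import math
from statistics import fmean, stdev

def t_statistic(sample, mu_0):
    """t = (sample mean - hypothesized mean) / (sample sd / sqrt(n))."""
    n = len(sample)
    return (fmean(sample) - mu_0) / (stdev(sample) / math.sqrt(n))

observations = [10, 12, 14, 16, 18]   # hypothetical small sample
t = t_statistic(observations, mu_0=12)
degrees_of_freedom = len(observations) - 1

print(f"t = {t:.4f} with {degrees_of_freedom} degrees of freedom")
```

Here stdev is the sample standard deviation (divisor n − 1), which is what the formula's s denotes.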
Question
1. A researcher has a sample of 400 observations from a population whose standard deviation is known to be 136. The mean of the sample is calculated to be 17.2. The null hypothesis is stated as Ho: mean = 4. The p-value under the alternative hypothesis H1: mean > 4 equals
A. 3.92% B. 2.6% C. 5.2%
2. Buchanan thinks that KKR is unable to perform because of Ganguly. He looks at the statistics and conducts a leadership survey, which reveals that Ganguly scores low on leadership qualities. Buchanan hypothesizes H0: Ganguly is not a leader, HA: Ganguly is a leader. Buchanan removes Ganguly as KKR captain, but KKR keeps losing. Subsequent analysis shows that ShahRukh Khan was causing the problem. By removing Ganguly, Buchanan:
A. Made a Type II error.
B. Is correct.
C. Made a Type I error.
Question
3. If the standard deviation of a population is 100 and a sample of size 64 is taken from that population, the standard error of the sample mean is closest to:
A. 0.08. B. 1.56. C. 12.50.
4. Which of the following statements about the hypothesis testing is most accurate?
A. A Type II error is rejecting the null when it is actually true
B. The significance level equals one minus the probability of a Type I error
C. A two-tailed test with a significance level of 5% has z-critical values of ± 1.96
Solution
1. B. 2.6%. The z-statistic under the null is calculated to be (17.2 - 4)/(136/(400^.5)) = 1.94. The right-tailed probability of observing a z-statistic at least as big as 1.94 equals 1.0 - 0.9738 = 0.026 = 2.6%. This is the p-value of the right-tailed test in this sample.
2. A. Made a Type II error. A Type II error occurs when you fail to reject a hypothesis that is actually false; its probability is β, and the power of the test is 1 − β. A Type I error is the rejection of a hypothesis that is actually true; its probability is the significance level of the test. P(Type II) = P(Accepting H0 | H0 false).
3. C. 12.5. \( \sigma_{\bar{X}} = \frac{\sigma}{\sqrt{n}} = \frac{100}{\sqrt{64}} = \frac{100}{8} = 12.5 \)
4. C. Rejecting the null when it is true is a Type I error. A Type II error is failing to reject the null hypothesis when it is false. The significance level equals the probability of a Type I error, not one minus that probability.
Hypothesis Tests for Variances
Hypothesis Test of Variances
Test for Single Population Variance
Chi-Square Test Statistic: \( \chi^2_{(n-1)} = \frac{(n-1)s^2}{\sigma_0^2} \)
Example Hypothesis: H0: σ² = σ0², HA: σ² ≠ σ0²
Test for Two Population Variances
F-test Statistic: \( F_{df_1, df_2} = \frac{s_1^2}{s_2^2} \)
Example Hypothesis: H0: σ1² – σ2² = 0, HA: σ1² – σ2² ≠ 0
In testing for variances, there are two different tests, because a comparison of two sample variances does not follow a chi-square distribution; the ratio of the two sample variances follows an F-distribution
Chi-square test
It is used for hypothesis tests concerning the variance of a normally distributed population
Hypothesis for two-tailed test of single-population variance:
H0: σ² = σ0² versus Ha: σ² ≠ σ0²
Hypotheses for one-tailed tests are structured as:
H0: σ² ≤ σ0² versus Ha: σ² > σ0², or
H0: σ² ≥ σ0² versus Ha: σ² < σ0²
Steps:
1) Collect the sample & calculate the sample statistics
2) Make a decision regarding the hypothesis
3) Make a decision based on the results of the test
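The chi-square statistic for a single population variance, (n − 1)s² / σ0², is simple to compute once the sample variance is known. The inputs below (25 observations, a sample standard deviation of 4 and a hypothesized standard deviation of 5) are hypothetical, and the critical value would still have to be looked up in a chi-square table for n − 1 degrees of freedom.

```python
def chi_square_statistic(sample_variance: float, n: int, hypothesized_variance: float) -> float:
    """Test statistic for H0: population variance equals the hypothesized variance."""
    return (n - 1) * sample_variance / hypothesized_variance

n = 25                          # hypothetical sample size
sample_variance = 4 ** 2        # hypothetical sample sd of 4
hypothesized_variance = 5 ** 2  # hypothetical sd of 5 under H0

chi_sq = chi_square_statistic(sample_variance, n, hypothesized_variance)
degrees_of_freedom = n - 1

print(f"chi-square = {chi_sq:.2f} with {degrees_of_freedom} degrees of freedom")
```

The statistic is bounded below by zero, which matches the shape of the chi-square distribution.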
Appendix: The Chi-square Distribution
The chi-square distribution is a family of distributions, depending on degrees of freedom:
d.f. = n - 1
[Figure: chi-square density curves for d.f. = 1, d.f. = 5 and d.f. = 15, each plotted over χ² values from 0 to 28]
Example : F-test
Q : William Waugh is examining the earnings for two different industries. He suspects that the earnings for chemical industry are more divergent than those of petroleum industry. To confirm, he took a sample of 35 chemical manufacturers & a sample of 45 petroleum companies. He measured the sample standard deviation of earnings across the chemical industry to be $3.5 & that of petroleum industry to be $3.00. Determine if the earnings of the chemical industry have greater standard deviation than those of the petroleum industry.
A: 1) State the hypothesis:
H0: σ1² ≤ σ2² versus Ha: σ1² > σ2²
where σ1² = variance of earnings for the chemical industry
σ2² = variance of earnings for the petroleum industry
Note: S1² > S2²
2) Select the appropriate test statistic:
\( F = \frac{S_1^2}{S_2^2} \)
3) Specify the level of significance: Take it as 5% here
4) State the decision rule regarding the hypothesis:
Reject H0 if F > 1.74
5) Collect the sample & calculate the sample statistics:
Using the information provided, the F-statistic can be computed as:
\( F = \frac{S_1^2}{S_2^2} = \frac{\$3.50^2}{\$3.00^2} = 1.3611 \)
Since 1.3611 is less than 1.74, H0 is not rejected.
Example : F-Test
Question: You are a financial analyst for a brokerage firm. You want to compare dividend yields between stocks listed on the BSE & NSE. You collect the following data:
          BSE     NSE
Number    30      50
Mean      3.27    2.53
Std dev   1.5     1.4
Is there a difference in the variances between the BSE & NSE at the α = 0.05 level?
Example : F-Test (Cont…)
Form the hypothesis test:
H0: σ1² – σ2² = 0 (there is no difference between variances)
HA: σ1² – σ2² ≠ 0 (there is a difference between variances)
Find the F critical value for α = 0.05
Numerator:
• df1 = n1 – 1 = 30 – 1 = 29
Denominator:
• df2 = n2 – 1 = 50 – 1 = 49
• F(α/2 = 0.025; 29, 49) = 1.881
Example : F-Test (Cont…)
The test statistic is:
\( F = \frac{s_1^2}{s_2^2} = \frac{1.50^2}{1.40^2} = 1.148 \)
F = 1.148 is not greater than the critical F value of 1.881, so we do not reject H0
Conclusion: There is no evidence of a difference in variances at α = 0.05
[Figure: F-distribution with the critical value F(α/2) = 1.881 (α/2 = 0.025) separating the "Do not reject H0" region from the "Reject H0" region]
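Both F-test examples reduce to a ratio of two sample variances compared with a table value. The sketch below recomputes the two statistics; the function name is hypothetical, the critical values (1.74 and 1.881) are taken from the examples rather than computed, and putting the larger variance in the numerator is the usual convention assumed here.

```python
def f_statistic(sd_1: float, sd_2: float) -> float:
    """Ratio of sample variances with the larger variance in the numerator."""
    var_1, var_2 = sd_1 ** 2, sd_2 ** 2
    return max(var_1, var_2) / min(var_1, var_2)

# Chemical vs. petroleum earnings (one-tailed, critical value 1.74 from the example)
f_earnings = f_statistic(3.50, 3.00)
reject_earnings = f_earnings > 1.74

# BSE vs. NSE dividend yields (two-tailed, critical value 1.881 from the example)
f_dividends = f_statistic(1.5, 1.4)
reject_dividends = f_dividends > 1.881

print(f"earnings:  F = {f_earnings:.4f}, reject H0: {reject_earnings}")
print(f"dividends: F = {f_dividends:.3f}, reject H0: {reject_dividends}")
```

In both cases the statistic is below its critical value, so neither null hypothesis is rejected.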
Parametric & Non –parametric tests
Parametric tests: They rely on assumptions regarding the distribution of the population & are specific to population parameters
Example: z-test
Nonparametric tests: They either do not consider a particular parameter or have few assumptions about the population that is sampled
These are used when there is concern about quantities other than the parameters of a distribution or when the assumptions of parametric tests can’t be supported
Example: ranked observations
Spearman rank correlation test: It can be used when the data are not normally distributed
Example: The performance ranks of 20 mutual funds for two years which are not normally distributed
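As an illustration of a nonparametric statistic, the Spearman rank correlation can be computed from rank differences with the standard formula r = 1 − 6Σd² / (n(n² − 1)), which applies when there are no tied ranks. The formula is not stated on the slides, and the five fund ranks below are hypothetical.

```python
def spearman_rank_correlation(ranks_x, ranks_y):
    """Spearman rank correlation from two lists of ranks (assumes no ties)."""
    n = len(ranks_x)
    sum_d_squared = sum((x - y) ** 2 for x, y in zip(ranks_x, ranks_y))
    return 1 - 6 * sum_d_squared / (n * (n ** 2 - 1))

# Hypothetical performance ranks of five funds in two consecutive years
ranks_year_1 = [1, 2, 3, 4, 5]
ranks_year_2 = [2, 1, 4, 3, 5]

r_s = spearman_rank_correlation(ranks_year_1, ranks_year_2)
print(f"Spearman rank correlation = {r_s:.2f}")
```

Because only ranks enter the calculation, no assumption about the distribution of the underlying returns is needed.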
Questions
1. An analyst is testing a hypothesis about stock returns. He would like to minimise the chances of rejecting the null hypothesis when it is true. Which of the following is most likely to be the level of significance?
A. 0.05 B. 0.95 C. 0.01
2. An analyst would like to compare the returns of two sample portfolios derived from the S&P 500 index. If he performs a two sample test to test the hypothesis with a 5% level of significance, which of the following is most likely?
A. The probability of Type I error is 95%
B. The probability that the null hypothesis would not be rejected when it is true is 5%
C. The probability of Type I error is 5%
3. What is the power of the test if the significance level of the test is 0.05 & the probability of the Type II error is 0.25?
A. 0.250
B. 0.750
C. 0.850
Questions
4. Which of the following statements of the central limit theorem is least likely true?
A. For large n if the population distribution is uniform, the sample distribution is always normal
B. The standard deviation of the sample is always less than the population standard deviation
C. The interval within which the sample mean is expected to fall is µ ± zσ.
5. The probability of an investment earning an average return of 15% is 33% out of a given portfolio of investment options. The probability distribution of such investments options would follow which of the following distributions?
A. Binomial distribution
B. Poisson distribution
C. Normal distribution
6. Which of the following is false about the t statistic and the z values?
A. For a given confidence interval, as the degrees of freedom increases the t- values approach the normal z values
B. The student’s t test is used when the population is normal but its standard deviation is unknown.
C. The z value is used for hypothesis testing when the sample variance is known.
Questions
7. The F Statistic is:
A. Always +ve and is +ve skewed
B. Always -ve and is -ve skewed
C. Can be +ve or Negative and is symmetric
8. Which of the following statements about the F-distribution & chi-square distribution is least accurate? Both distributions:
A. Are asymmetrical
B. Are bound by zero on the left
C. Have means that are less than their standard deviations
Solutions
1. C. The analyst wants to minimize the chance of rejecting the null hypothesis when it is true, so he will use the lowest available level of significance, 0.01
2. C. The probability of Type I error is 5%
3. B. Power of the test = 1 – P(Type II error) = 1 – 0.25 = 0.750
4. A. For large n if the population distribution is uniform, the sample distribution is always normal
5. A. In this case, the investment options will follow Binomial Distribution
6. C. The z value is used for hypothesis testing when the sample variance is known.
7. A. F Statistic is ratio of 2 variances and hence always +ve. F Distribution is also +vely skewed.
8. C. There is no consistent relationship between the mean & the standard deviation of the chi-square distribution or F-distribution
Reading 12: Technical Analysis
Coverage of the Reading 12
Technical Analysis vs. Fundamental Analysis
Advantages & Challenges of Technical Analysis
Line Charts, Bar Charts & Candlestick charts
Point and Figure Charts
Trend, support, resistance lines & change in polarity
Technical Analysis vs. Fundamental Analysis
Technical vs. Fundamental Analysis : The main difference between technical analysis and fundamental analysis is the use of financial statements to value equities.
Technical analysis is the practice of valuing stocks on past volume and pricing information. Technical analysis combines both the use of past information (how stocks have reacted previously) and "feeling" (how the market is moving the name) to value a security.
Fundamental analysis, however, takes a more formal approach. Fundamental analysts review the financial statements of a company and generate metrics, such as price-to-book value and enterprise value-to-EBITDA to value a security.
Assumptions of Technical Analysis :
Prices are determined by investor supply and demand for assets.
Supply and demand are driven by both rational and irrational behaviour.
While the causes of changes in supply and demand are difficult to determine, the actual shifts in supply and demand can be observed in market prices.
Prices move in trends and exhibit patterns that can be identified and tend to repeat themselves over time.
Advantages & Challenges of Technical Analysis
Advantages of Technical Analysis:
Technical analysis is easy to understand and can be performed relatively quickly, especially with the aid of one of the many types of charting software.
Technical analysis does not rely on the use of financial statements for valuation purposes.
Rather than strict fundamental valuation, technical analysis takes into account the "feeling" of the market, which is subjective.
Challenges to Technical Analysis:
The past is not always an indication of future results, calling into question the validity of technical analysis.
Technical analysis conflicts with the efficient markets hypothesis (EMH): EMH believers assume that prices adjust too quickly for trading on past price patterns to be profitable.
Line Charts
A line chart is the simplest type of stock chart used in technical analysis.
The line chart is also called a close-only chart, as it plots the closing price of the underlying security, with a line connecting the dots formed by the close price.
The uncluttered simplicity of the line chart is its greatest strength, as it provides a clean, easily recognizable visual display of the price movement. This makes it an ideal tool for identifying the dominant support and resistance levels, trend lines, and certain chart patterns.
However, the line chart does not show the highs and lows and hence does not indicate the price range for the session.
Bar Charts & Candlestick charts
OHLC Bar Charts
Bar charts consist of bars, which are vertical lines with the bottom representing the low price (L) of the time-frame and the top representing the high price (H). The bars also have a horizontal dash on the right side of the bar to indicate the close price (C) for the time frame and some have a horizontal dash on the left side to indicate the open price (O)
Japanese candlestick charts form the basis of the oldest form of technical analysis. Candlestick charts provide the same information as OHLC bar charts.
Candlesticks indicate a bullish up bar, when the closing price is higher than the opening price, using a light color such as white or green, and a bearish down bar, when the closing price is lower than the opening price, using a darker color such as black or red for the real body of the candlestick
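The bullish/bearish rule for candlesticks can be written as a small sketch. The `Bar` class and function names are hypothetical, and the "neutral" label for a session whose close equals its open is an assumption, since the slide does not cover that case.

```python
from dataclasses import dataclass

@dataclass
class Bar:
    """One OHLC bar for a single time frame."""
    open: float
    high: float
    low: float
    close: float

def candle_type(bar):
    """Bullish when the close is above the open, bearish when below."""
    if bar.close > bar.open:
        return "bullish"   # drawn with a light real body (white or green)
    if bar.close < bar.open:
        return "bearish"   # drawn with a dark real body (black or red)
    return "neutral"       # assumption: close equals open

def real_body(bar):
    """Size of the real body: distance between open and close."""
    return abs(bar.close - bar.open)

def session_range(bar):
    """High minus low, the range a close-only line chart cannot show."""
    return bar.high - bar.low

bar = Bar(open=100.0, high=106.0, low=98.0, close=104.0)
print(candle_type(bar), real_body(bar), session_range(bar))
```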
Point and Figure Charts
Point and Figure (P&F) charts differ from other stock charts in that they do not plot price movement from left to right within fixed time intervals. They also do not plot the volume traded.
Instead, they plot price movements in a single direction in one vertical column and move to the next column when the price changes direction.
They represent increases in price by plotting X's in the column and decreases in price by plotting O's. Each X and O represents a box of a set size or price amount.
This box size determines how far the price must move before another X or O is added to the chart, depending on the direction of the price movement.
Thus, if the box size is set at 15, the price must move 15 points beyond the previous box before the next X or O is plotted. Any movement of less than 15 points is ignored.
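The box rule can be illustrated with a simplified sketch. It assumes a new column starts as soon as the price moves one full box in the opposite direction; charting packages often require a larger reversal amount, which the slide does not specify. The function name and price series are hypothetical.

```python
def point_and_figure(prices, box_size):
    """Return the chart as a list of columns, each a string of X's or O's."""
    if not prices:
        return []
    columns = []
    last_box = prices[0]  # price level of the most recently plotted box
    for price in prices[1:]:
        boxes = int(abs(price - last_box) // box_size)
        if boxes == 0:
            continue  # move smaller than one box: ignored
        symbol = "X" if price > last_box else "O"
        if columns and columns[-1][0] == symbol:
            columns[-1] += symbol * boxes   # same direction: extend the column
        else:
            columns.append(symbol * boxes)  # direction changed: new column
        step = box_size if symbol == "X" else -box_size
        last_box += boxes * step
    return columns

# Hypothetical prices with a box size of 15.
chart = point_and_figure([100, 110, 116, 131, 120, 100, 95, 118], 15)
print(chart)
```

Time plays no role here: the moves to 110, 120 and 95 add nothing to the chart because each is less than one box from the last plotted level.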
Trend, support, resistance lines & change in polarity
In an uptrend, prices are reaching higher highs and higher lows. An uptrend line is drawn below the prices on a chart by connecting the increasing lows with a straight line.
In a downtrend, prices are reaching lower lows and lower highs. A downtrend line is drawn above the prices on a chart by connecting the decreasing highs with a straight line.
Support and resistance are price levels or ranges at which buying or selling pressure is expected to limit price movement. Commonly identified support and resistance levels include trend lines and previous high and low prices.
The change in polarity principle is the idea that breached resistance levels become support levels and breached support levels become resistance levels.
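The uptrend and downtrend definitions can be sketched as a simple check on successive swing highs and swing lows. How the swing points are identified is left out, and the function names are hypothetical.

```python
def is_rising(values):
    """True when there are at least two values and each is higher than the last."""
    return len(values) >= 2 and all(a < b for a, b in zip(values, values[1:]))

def is_falling(values):
    """True when there are at least two values and each is lower than the last."""
    return len(values) >= 2 and all(a > b for a, b in zip(values, values[1:]))

def classify_trend(swing_highs, swing_lows):
    """Uptrend: higher highs and higher lows. Downtrend: lower highs and lower lows."""
    if is_rising(swing_highs) and is_rising(swing_lows):
        return "uptrend"
    if is_falling(swing_highs) and is_falling(swing_lows):
        return "downtrend"
    return "no clear trend"

print(classify_trend([10, 12, 14], [8, 9, 11]))
print(classify_trend([14, 12, 10], [11, 9, 8]))
```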
Chart Patterns
Continuation patterns: indicate a higher probability for the continuation of the existing trend. These are usually momentary consolidations or retracements within the trend. Common continuation patterns include flags and pennants, and the various triangle patterns.
Reversal patterns : indicate a high probability that the existing trend has come to an end and will reverse direction. The common reversal patterns include double and triple tops, double and triple bottoms, head and shoulders, rising and falling wedges.
Figure: Double Top Pattern
Technical Analysis Indicators
Price-based indicators include moving averages, Bollinger bands, and momentum oscillators such as the Relative Strength Index, moving average convergence/divergence lines, rate-of-change oscillators, and stochastic oscillators.
These indicators are commonly used to identify changes in price trends, as well as “overbought” markets that are likely to decrease in the near term and “oversold” markets that are likely to increase in the near term.
Sentiment indicators include opinion polls, the put/call ratio, the volatility index, margin debt, and the short interest ratio. Margin debt, the Arms index, the mutual fund cash position, new equity issuance, and secondary offerings are flow-of-funds indicators.
Technical analysts often interpret these indicators from a “contrarian” perspective, becoming bearish when investor sentiment is too positive and bullish when investor sentiment is too negative
Cycles in Technical Analysis
Some technical analysts believe market prices move in cycles. Examples include the Kondratieff wave, which is a 54-year cycle; decennial patterns, or 10-year cycles; and a 4-year cycle related to U.S. presidential elections.
Elliott wave theory suggests that prices exhibit a pattern of five waves in the direction of a trend and three waves counter to the trend.
Technical analysts who employ Elliott wave theory frequently use ratios of the numbers in the Fibonacci sequence to estimate price targets and identify potential support and resistance levels
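As an illustration of Fibonacci ratios, the sketch below divides a Fibonacci number by its successor and by the number two places later, then applies ratios to a hypothetical price move. Treating these ratios as retracement levels measured back from the high is a common convention and an assumption here; the slide does not say which ratios analysts use.

```python
def fibonacci(n):
    """First n Fibonacci numbers, starting 1, 1, 2, 3, ..."""
    seq = [1, 1]
    while len(seq) < n:
        seq.append(seq[-1] + seq[-2])
    return seq[:n]

fib = fibonacci(12)                 # the last three terms are 55, 89, 144
ratio_adjacent = fib[-2] / fib[-1]  # 89 / 144
ratio_skip_one = fib[-3] / fib[-1]  # 55 / 144

def retracement_levels(low, high, ratios):
    """Price levels reached when a rise from low to high is given back by each ratio."""
    return [high - r * (high - low) for r in ratios]

print(round(ratio_adjacent, 3), round(ratio_skip_one, 3))
# Hypothetical move from 100 to 150.
print(retracement_levels(100.0, 150.0, [ratio_skip_one, ratio_adjacent]))
```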
Terms & Definitions
Terms Definitions
What do price and volume reflect? the collective behavior of buyers and sellers
What is the key assumption of TA? market prices reflect both rational and irrational investor behavior; implies that the efficient markets hypothesis does not hold
What do TAs believe about investor behavior? it is reflected in trends and patterns that tend to repeat and can be identified and used for forecasting prices
What are two advantages of TA? 1) actual price and volume data is observable whereas much of fundamental data is subject to assumptions or restatements 2) it can be applied to prices of assets that do not produce future cash flows
If prices have changed exponentially over long periods of time what might an analyst do to his charts? draw a chart on a logarithmic scale instead of a linear scale
What are the three main types of charts? 1) line charts 2) bar charts 3) candlestick charts
What does relative strength mean? a trend that indicates the asset is outperforming the benchmark
Terms & Definitions
Terms Definitions
What does relative weakness mean? a trend that indicates the asset is underperforming the benchmark
What is an uptrend? if prices are consistently reaching higher highs and retracing to higher lows; demand is increasing relative to supply
What is a downtrend? if prices are consistently declining to lower lows and retracing to lower highs; supply is increasing relative to demand
What is a breakout? when price crosses the trendline from a downtrend by what the analyst considers a significant amount
What is a breakdown? when price crosses the trendline from an uptrend by what the analyst considers a significant amount
What is a support level? buying which is expected to emerge that prevents further price decreases
What is a resistance level? selling which is expected to emerge that prevents further price increases
What is a change in polarity? belief that breached resistance levels become support levels and that breached support levels become resistance levels
Terms & Definitions
Terms Definitions
What is a head-and-shoulders pattern? a reversal pattern that suggests the demand that has been driving the uptrend is fading, especially if each of the highs in the pattern occurs on declining volume
What are three reversal patterns for downtrends? 1) inverse head-and-shoulders 2) double bottom 3) triple bottom
What is a continuation pattern? suggests a pause in a trend rather than a reversal
What are triangles? form when prices reach lower highs and higher lows over a period of time; trendlines on the highs and on the lows thus converge when they are projected forward; they can be symmetrical, ascending or descending; suggest buying and selling pressure have become roughly equal temporarily but they do not imply a change in direction of a trend
What are rectangles? when trading temporarily forms a range between a support level and a resistance level; suggests the prevailing trend will resume and can be used to set a price target; they do not imply a change in direction of a trend
What is a moving average? mean of the last 'x' closing prices; often viewed as support or resistance levels
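The moving-average definition (mean of the last 'x' closing prices) can be sketched as follows. The function name and the sample closes are hypothetical.

```python
def simple_moving_average(closes, window):
    """Mean of the last `window` closes, computed at each bar with enough history."""
    if window <= 0:
        raise ValueError("window must be positive")
    return [
        sum(closes[i - window + 1 : i + 1]) / window
        for i in range(window - 1, len(closes))
    ]

# Hypothetical closing prices with a 3-period window.
closes = [10.0, 11.0, 12.0, 13.0, 14.0]
print(simple_moving_average(closes, 3))
```

The output has `window - 1` fewer values than the input, because the first few bars do not have enough history to average.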
Terms & Definitions
Terms Definitions
In an uptrend where is price in relation to the moving average? price is higher than the moving average
In a downtrend where is price in relation to the moving average? price is lower than the moving average
What is a golden cross? when a short-term average crosses the long-term average from below; 'buy' signal; emerging uptrend
What is a dead cross? when a short-term average crosses the long-term average from above; 'sell' signal; emerging downtrend
What are Bollinger bands? constructed based on the standard deviation of closing prices over the last 'n' periods; move away from each other when volatility increases and move closer together when prices are less volatile
What do contrarians believe? markets get overbought or oversold because most investors tend to buy and sell at the wrong times, and thus it can be profitable to trade in the opposite direction
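The golden cross, dead cross and Bollinger band definitions can be sketched together. Two choices below are assumptions not stated in the table: the bands are placed two standard deviations from the moving average, and the population standard deviation is used. All function names are hypothetical.

```python
import statistics

def sma(closes, window):
    """Mean of the last `window` closes, or None when there is not enough data."""
    if len(closes) < window:
        return None
    return sum(closes[-window:]) / window

def crossover_signal(closes, short_window, long_window):
    """Compare the two averages at the previous bar and at the latest bar."""
    prev_short = sma(closes[:-1], short_window)
    prev_long = sma(closes[:-1], long_window)
    if prev_short is None or prev_long is None:
        return None
    curr_short = sma(closes, short_window)
    curr_long = sma(closes, long_window)
    if prev_short <= prev_long and curr_short > curr_long:
        return "golden cross"  # short-term average crossed from below: buy signal
    if prev_short >= prev_long and curr_short < curr_long:
        return "dead cross"    # short-term average crossed from above: sell signal
    return None

def bollinger_bands(closes, n, num_sd=2.0):
    """Lower band, middle line and upper band from the last n closes."""
    if len(closes) < n:
        raise ValueError("not enough closing prices")
    window = closes[-n:]
    middle = sum(window) / n
    sd = statistics.pstdev(window)  # assumption: population standard deviation
    return middle - num_sd * sd, middle, middle + num_sd * sd

print(crossover_signal([10, 9, 8, 7, 6, 12], 2, 4))
print(bollinger_bands([8, 12, 8, 12], 4))
```

Because the band width is a multiple of the standard deviation, the bands widen when recent closes are more dispersed and narrow when they are less dispersed.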
Terms & Definitions
Terms Definitions
What is an oscillator? group of technical tools TAs use to identify overbought/oversold markets; based on market prices but scaled so that they "oscillate" around a given value such as zero or between two values such as zero and 100; extremely high values indicate overbought condition whereas extremely low values indicate oversold condition; can be used to identify convergence or divergence
What does convergence indicate? price trend is likely to continue
What does divergence indicate? potential change in price trend
What are four examples of oscillators? 1) ROC (rate of change) 2) RSI (relative strength index) 3) MACD (moving average convergence/divergence) 4) stochastic oscillator
What is the ROC oscillator? 100 x (latest closing price - closing price from n periods earlier); buy when ROC changes from negative to positive during an uptrend and sell when ROC changes from positive to negative during a downtrend
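A minimal sketch of the ROC oscillator, reading the definition as 100 x (latest close - close n periods earlier). Other references scale ROC differently, so this form is an assumption. The trend filter (buy only in an uptrend, sell only in a downtrend) is left out for brevity, and the function names are hypothetical.

```python
def roc(closes, n):
    """100 x (close - close n periods earlier) for each bar with enough history."""
    return [100 * (closes[i] - closes[i - n]) for i in range(n, len(closes))]

def roc_signals(roc_values):
    """Flag sign changes: negative to positive is 'buy', positive to negative is 'sell'."""
    signals = []
    for prev, curr in zip(roc_values, roc_values[1:]):
        if prev < 0 < curr:
            signals.append("buy")
        elif prev > 0 > curr:
            signals.append("sell")
        else:
            signals.append(None)
    return signals

# Hypothetical closing prices with n = 1.
values = roc([10, 11, 9, 12, 8], 1)
print(values)
print(roc_signals(values))
```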
Questions
1. Which of the following is most likely to be considered a momentum indicator?
A. Put-call ratio
B. Breadth of market
C. Mutual fund cash position
2. A low price range in which buying activity is sufficient to stop a price decline is best described as:
A. Support
B. Resistance
C. Change in polarity
3. A technical analyst has detected a price chart pattern with three segments. The left segment shows a decline followed by a reversal to the starting price level. The middle segment shows a more pronounced decline than in the first segment and again a reversal to near the starting price level. The third segment is roughly a mirror image of the first segment. This chart pattern is most accurately described as:
A. A triple bottom
B. A head and shoulders
C. An inverse head and shoulders
Solution
1. B. Breadth of market is a momentum indicator. Put-call ratio and mutual fund cash position are contrary-opinion rules.
2. A. Support is defined as a low price range in which buying activity is sufficient to stop the decline in price.
3. C. An inverse head and shoulders pattern consists of a left segment that shows a decline followed by a reversal to the starting price level, a middle segment that shows a more pronounced decline than in the first segment and again a reversal to near the starting price level, and a third segment that is roughly a mirror image of the first segment.
Five Minute Recap
Central Limit Theorem
The means of all possible samples of size n drawn from a population will be approximately normally distributed when n is large.
The mean of the sampling distribution is equal to μP and the standard deviation is equal to σP/√n. This is known as the standard error of the sample mean.
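The standard error formula σP/√n can be checked with a small simulation. The population (the integers 1 to 100), the sample size, the number of trials and the seed are all hypothetical choices made for illustration. Sampling is with replacement.

```python
import random
import statistics

def standard_error(population_sd, n):
    """Standard error of the sample mean: population SD divided by sqrt(n)."""
    return population_sd / n ** 0.5

def simulate_sample_means(population, n, trials, seed=0):
    """Draw `trials` samples of size n with replacement and return their means."""
    rng = random.Random(seed)
    return [statistics.fmean(rng.choices(population, k=n)) for _ in range(trials)]

population = list(range(1, 101))       # a uniform, clearly non-normal population
sigma = statistics.pstdev(population)  # population standard deviation
means = simulate_sample_means(population, n=36, trials=2000)

# The average of the sample means should be close to the population mean,
# and their spread should be close to sigma / sqrt(n).
print(statistics.fmean(means), statistics.fmean(population))
print(statistics.stdev(means), standard_error(sigma, 36))
```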
Methods of Sampling
Simple Random Sampling, Systematic Random Sampling, Stratified Random Sampling
Desirable Properties of an Estimator
Unbiasedness, Efficiency, Consistency
Sampling Biases: Data-Mining Bias, Sample Selection Bias, Look-Ahead Bias, Time-Period Bias
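Hypothesis Test Outcomes (Inference vs. Actual)
Inference H0 is True, Actual H0 is True: Correct Decision, Confidence Level = 1 - α
Inference H0 is True, Actual H0 is False: Type-II Error, P(Type-II Error) = β
Inference H0 is False, Actual H0 is True: Type-I Error, Significance Level = α
Inference H0 is False, Actual H0 is False: Correct Decision, Power = 1 - β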
Chi-Square Test: Used for testing hypotheses concerning the variance of a single population
F-Test: Used to test hypotheses about the difference between the variances of two populations
Figure: t-Distribution