Quantitative Methods - III

© EduPristine CFA - Level – I © EduPristine – www.edupristine.com


Mapping to Curriculum

Reading 10: Sampling and Estimation

Reading 11: Hypothesis Testing

Reading 12: Technical Analysis


Reading 10: Sampling and Estimation



Coverage of Reading 10

Central Limit Theorem

Sampling Distribution

Standard error of sample mean

Student’s t-distribution

Confidence Interval



Sampling

A probability sample is a sample selected such that each item or person in the population being studied has a known likelihood of being included in the sample.

The sampling distribution of the sample mean is a probability distribution consisting of all possible sample means of a given sample size selected from a population.

Need for Sampling:

The physical impossibility of checking all items in the population.

The cost of studying all the items in a population.

The sample results are usually adequate.

Contacting the whole population would often be time-consuming.

The destructive nature of certain tests.



Time-Series and Cross-Sectional Data

Time Series

A sequence of data collected at discrete and equally spaced intervals of time.

For example, the quarterly revenue figures of a public company

When choosing the time interval over which the data is collected, the analyst should take into account changes in external factors such as fixed vs. floating interest rate scenarios or tight vs. loose monetary policies.

Cross Sectional Data

Data on some characteristic of individuals, groups, companies or geographical locations.

For example, the 2012 EPS of all stocks in the S&P 500.

While selecting data, the analyst must consider whether it comes from the same underlying population. For example, aviation companies have large fixed assets while a small textile company may not, so a comparison of their fixed capital may not be meaningful.



Central Limit Theorem

For a population with mean μ and variance σ², the sampling distribution of the means of all possible samples of size n drawn from the population will be approximately normally distributed when n is large.

The mean of the sampling distribution is equal to μ and its variance is equal to σ²/n.

How is variance related to standard error?

As the sample size gets large (typically n > 30), the sampling distribution of the sample mean becomes almost normal regardless of the shape of the population distribution.
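The behaviour described above can be checked with a small simulation. The sketch below is illustrative only: the exponential population (mean 1, variance 1), the sample size of 36 and the 5,000 repetitions are assumptions chosen for the demonstration, not values from the slides.

```python
import random
import statistics


def simulate_sample_means(n, num_samples, seed=0):
    """Draw num_samples samples of size n from a skewed (exponential)
    population and return the mean of each sample."""
    rng = random.Random(seed)
    return [
        statistics.fmean(rng.expovariate(1.0) for _ in range(n))
        for _ in range(num_samples)
    ]


# Hypothetical settings: population mean = 1, population variance = 1.
n = 36
means = simulate_sample_means(n=n, num_samples=5000)

mean_of_means = statistics.fmean(means)      # should be close to mu = 1
var_of_means = statistics.pvariance(means)   # should be close to sigma^2 / n

print(f"mean of sample means:     {mean_of_means:.4f}")
print(f"variance of sample means: {var_of_means:.4f} (sigma^2/n = {1 / n:.4f})")
```

Even though the population is strongly skewed, the sample means cluster around μ with a variance near σ²/n.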


Sampling Error

The sampling error is the difference between a sample statistic and its corresponding population parameter. It is found by subtracting the value of the parameter from the value of the statistic.

For example, suppose a poll was conducted in which the population included all students in a school and the sample was one class. If the sample had a mean GPA of 3.4 and the population’s mean GPA was 3.2, then the sampling error was 3.4 - 3.2 = 0.2.


Methods of Probability Sampling

Simple Random Sampling: A sample formulated so that each item or person in the population has the same chance of being included. This requires that the entire population must be known and serial numbered.

Systematic Random Sampling: The items or individuals of the population are arranged in some order. A random starting point is selected and then every kth member of the population is selected for the sample. Used in case the entire population cannot be identified.

Stratified Random Sampling: A population is first divided into subgroups called strata, and a sample is selected from each stratum. This ensures that all subgroups are represented in the sample. The resulting estimates have a smaller variance than those obtained from simple random sampling. Example: Bond Indexing
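The three methods above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions: the population of 100 numbered items, the interval k = 10, the two strata named "short" and "long" (loosely echoing maturity buckets in bond indexing) and the 10% sampling fraction are all hypothetical.

```python
import random


def simple_random_sample(population, n, rng):
    """Every item has the same chance of being included."""
    return rng.sample(population, n)


def systematic_sample(population, k, rng):
    """Pick a random starting point, then take every k-th item."""
    start = rng.randrange(k)
    return population[start::k]


def stratified_sample(strata, fraction, rng):
    """Draw the same fraction from every stratum so each is represented."""
    sample = []
    for members in strata.values():
        size = max(1, round(len(members) * fraction))
        sample.extend(rng.sample(members, size))
    return sample


rng = random.Random(42)
population = list(range(1, 101))          # hypothetical serial-numbered population

srs = simple_random_sample(population, 10, rng)
systematic = systematic_sample(population, 10, rng)

# Hypothetical strata: 60 "short" items and 40 "long" items.
strata = {"short": list(range(1, 61)), "long": list(range(61, 101))}
stratified = stratified_sample(strata, 0.1, rng)

print(sorted(srs))
print(systematic)
print(sorted(stratified))
```

The stratified sample always contains 6 items from the first stratum and 4 from the second, which is what guarantees representation of every subgroup.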



Developing Sampling Distributions

Suppose there’s a population of 4 oldest scientists in a university: Jack, Andrew, Michelle and Tom

Random variable, X is the ages of the individuals

• Values of X: 78, 76, 72, 74

Summary Measure for Population Distribution

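$$\mu = \frac{\sum_{i=1}^{N} X_i}{N} = \frac{78 + 76 + 74 + 72}{4} = 75 \quad \text{(average age)}$$

$$\sigma = \sqrt{\frac{\sum_{i=1}^{N} (X_i - \mu)^2}{N}} = 2.236$$

[Figure: Ages of Population, a bar chart of age (vertical axis, 69 to 79) for Andrew, Jack, Michelle and Tom]

[Figure: Prob. of selection, a bar chart of selection probability (vertical axis, 0 to 0.3) for Andrew, Jack, Michelle and Tom]

Optional Topic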


All Possible Samples of Size n = 2 (16 samples of size n = 2 each)

1st Obs    2nd Observation
           78       76       74       72
78         78,78    76,78    74,78    72,78
76         78,76    76,76    74,76    72,76
74         78,74    76,74    74,74    72,74
72         78,72    76,72    74,72    72,72

16 Sample Means

1st Obs    2nd Observation
           78    76    74    72
78         78    77    76    75
76         77    76    75    74
74         76    75    74    73
72         75    74    73    72

[Figure: Sampling Distribution of Sample Means, probability (vertical axis, 0 to 0.3) against sample mean (horizontal axis, 72 to 78)]


Summary Measures for the Sampling Distribution

The mean of the sample means:

$$\mu_{\bar{X}} = \frac{\sum_{i=1}^{N} \bar{X}_i}{N} = \frac{72 + 73 + 73 + \cdots + 78}{16} = 75$$

The standard deviation of the sample means:

$$\sigma_{\bar{X}} = \sqrt{\frac{\sum_{i=1}^{N} (\bar{X}_i - \mu_{\bar{X}})^2}{N}} = \sqrt{\frac{(72-75)^2 + (73-75)^2 + \cdots + (78-75)^2}{16}} = 1.58$$

Two important points worth noting in population and sampling distributions:

• The population mean and the mean of the sample means are the same, both equal to 75.

• Variance of the population = 2.236² = 5 and variance of the sample means = 1.58² = 2.5, which is lower than the population variance.

Also, the variance of the sampling distribution of the mean (Standard Error)² = Population Variance / Sample Size = 5/2 = 2.5
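The summary measures for the four-scientist example can be reproduced by enumerating all 16 samples of size 2 drawn with replacement. The sketch below uses only the ages given in the example; the variable names are illustrative.

```python
from itertools import product
from statistics import fmean, pstdev, pvariance

ages = [78, 76, 72, 74]  # population values from the example

# All 16 ordered samples of size n = 2, drawn with replacement.
samples = list(product(ages, repeat=2))
sample_means = [fmean(s) for s in samples]

population_mean = fmean(ages)
population_variance = pvariance(ages)

mean_of_sample_means = fmean(sample_means)
variance_of_sample_means = pvariance(sample_means)
standard_error = pstdev(sample_means)

print(f"population mean      = {population_mean}")
print(f"population variance  = {population_variance}")
print(f"mean of sample means = {mean_of_sample_means}")
print(f"variance of means    = {variance_of_sample_means}")
print(f"standard error       = {standard_error:.2f}")
```

The variance of the sample means equals the population variance divided by the sample size (5 / 2 = 2.5), as stated above.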


Standard Error of sample mean

It is the standard deviation of the distribution of the sample means

When the standard deviation of the population is known, the standard error of the sample mean is calculated as:

Standard error of sample mean = Standard deviation of population / Square root of the sample size (n)

Example: The mean hourly wage for Mumbai farm workers is $13.50 with a population standard deviation of $2.90. Calculate and interpret the standard error of the sample mean for a sample size of 30.

Answer: Because the population standard deviation is known, the standard error of the sample mean is $2.90/√30 = $0.53. This means that if many samples of size 30 were drawn, their sample means would have a standard deviation of about $0.53 around the population mean.
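The calculation in this example can be sketched as a small helper. The function name is an assumption made for illustration; the inputs are the values from the example.

```python
import math


def standard_error(population_sd, n):
    """Standard error of the sample mean when the population
    standard deviation is known: sigma / sqrt(n)."""
    if n <= 0:
        raise ValueError("sample size must be positive")
    return population_sd / math.sqrt(n)


se = standard_error(2.90, 30)
print(f"standard error = ${se:.2f}")
```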



Desirable properties of an estimator

Unbiasedness: The expected value of the estimator is equal to the parameter you are trying to estimate

Efficiency: The variance of the estimator's sampling distribution is smaller than that of all other unbiased estimators of the parameter you are trying to estimate

Consistency: The accuracy of the parameter estimate increases as the sample size increases


Point Estimate & Confidence Interval

Point estimates: These are the single (sample) values used to estimate population parameters

Confidence interval: It is a range of values in which the population parameter is expected to lie

For a large sample (n ≥ 30), a confidence interval takes the following form:

CI = m ± Z*sx

This is true for a population distribution, where m is the mean of the population and sx is the standard deviation of the population.

For a sample mean,

CI = Point estimate ± (reliability factor * standard error)

CI = m ± Z*(sx/√n), where m is now the sample mean (the point estimate) and sx/√n is the standard error
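The "point estimate ± reliability factor × standard error" form can be sketched with the standard library's `statistics.NormalDist`, which supplies the z reliability factor. The inputs below (sample mean 50, population standard deviation 12, n = 36) are hypothetical values chosen only to illustrate the calculation.

```python
import math
from statistics import NormalDist


def z_confidence_interval(sample_mean, population_sd, n, confidence=0.95):
    """Two-sided z-based confidence interval for the population mean."""
    alpha = 1 - confidence
    z = NormalDist().inv_cdf(1 - alpha / 2)          # reliability factor
    margin = z * population_sd / math.sqrt(n)        # z * standard error
    return sample_mean - margin, sample_mean + margin


# Hypothetical inputs, not taken from the slides.
low, high = z_confidence_interval(sample_mean=50.0, population_sd=12.0, n=36)
print(f"95% CI: {low:.2f} to {high:.2f}")
```

Raising the confidence level widens the interval, while raising n narrows it.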



Student’s t – distribution (in cases where n < 30)

Student’s t-distribution, or simply the t-distribution, is a bell-shaped probability distribution that is symmetrical about its mean

It is the appropriate distribution to use when constructing confidence intervals based on small samples (n < 30) from a population with unknown variance and a normal, or approximately normal, distribution

It is defined by a single parameter, the degrees of freedom (df):

Degrees of freedom = n-1

It has more probability in the tails (“fatter tails”) than the normal distribution, which means higher kurtosis.

As the degrees of freedom get larger, the shape of the t-distribution more closely approaches a standard normal distribution

Confidence Interval: CI = m ± t * (s/√n), where s is the sample standard deviation


Calculate & interpret a confidence interval for a population mean, assuming a normal distribution

Population having normal distribution with a known variance: Confidence interval for population mean is

x(mean) ± z α/2 * (standard deviation of population / square root of the sample size (n))

Population is normal with unknown variance: we can use the t-distribution to construct a confidence interval as

x(mean) ± t α/2 * (sample standard deviation / square root of the sample size (n))

Large sample from any type of distribution:

If the distribution is non-normal but the population variance is known, the z-statistic can be used as long as the sample size is large (n ≥ 30)

If the distribution is non-normal but the population variance is unknown, the t-statistic can be used as long as the sample size is large (n ≥ 30)

This means that when sampling from a non-normal distribution, we cannot create a confidence interval if the sample size is less than 30
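The rules above amount to a small decision table. The sketch below encodes them as stated on this slide; the function name and the string return values are illustrative assumptions.

```python
def choose_reliability_factor(population_is_normal, variance_known, n):
    """Return "z", "t", or None (no interval available) following the
    rules on this slide; n >= 30 is treated as a large sample."""
    large_sample = n >= 30
    if not population_is_normal and not large_sample:
        return None          # non-normal population and small sample
    if variance_known:
        return "z"           # known population variance
    return "t"               # unknown population variance


print(choose_reliability_factor(True, True, 20))     # normal, known variance
print(choose_reliability_factor(True, False, 20))    # normal, unknown variance
print(choose_reliability_factor(False, False, 50))   # non-normal, large sample
print(choose_reliability_factor(False, True, 10))    # non-normal, small sample
```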


Selection of Sample Size

Factors affecting the width of a confidence interval and the Reliability Factor:

The choice of statistic (t or z values)

Choice of degree of confidence (90%, 95%, 99% levels of confidence)

Choice of the sample size

• A larger sample size decreases the width of a confidence interval, all else equal, because it reduces the standard error:

Standard Error of the Sample Mean = Sample Standard Deviation / √(Sample Size)

Considerations to be made while deciding to increase the sample size:

Risk of sampling from more than one population

Additional expenses that outweigh the value of additional precision.


Sampling Related Issues

Data mining bias

It is the practice of determining a model by extensive searching through a dataset for statistically significant patterns.

It can be tested by using out-of-sample data

Two signs that may indicate the presence of data mining bias:

• Low significance levels

• No plausible economic rationale behind the variable.

Sample Selection Bias

Arises when data availability leads to certain entities being excluded from the analysis.

This is a major issue in the hedge fund industry. Since performance disclosure is not mandatory, hedge fund returns are difficult to obtain.

This is also a problem in the mutual fund industry, as only funds that currently exist are available in the database. Funds that no longer exist, perhaps due to poor performance, are not available in the database. This leads to survivorship bias.


Sampling Related Issues

Look-ahead bias

Arises while using information that was not available on the test date.

For example, if using P/BV ratios, the BV may not be available till sometime in the following quarter.

Time-Period Bias

Arises when the analysis is based on a time period that may make the results time-period specific.

For example, a time period too short may give results that may not hold in the long run.

A time period that is too long carries the potential for structural changes, so that one segment cannot be compared to another. It could result in two different return distributions.


Question

1. As compared to normal distribution, the t-distribution has:

A. Similar tails

B. Fatter tails

C. Narrower tails

2. Which of the following is most likely to be a property of an estimator?

A. Correctness

B. Reliability

C. Consistency

3. The mean age of all CFA candidates is 30 years. The mean age of a random sample of 100 candidates is found to be 27.5 years. The difference, 30 - 27.5 = 2.5, is called the:

A. Random error

B. Sampling error

C. Population error



Questions (Cont…)

4. Assume that a population has a mean of 14 with a standard deviation of 3. If a random sample of 64 observations is drawn from this population, the standard error of the sample mean is closest to:

A. 0.575 B. 0.375 C. 0.575

5. The population mean is 30 & the mean of a sample of size 144 is 28.5. The variance of the sample is 25. The standard error of the sample mean is closest to:

A. 0.450 B. 0.317 C. 0.417

6. A random sample of 100 mobile store customers spent an average of $150 at the store. Assuming the distribution is normal & the population standard deviation is $10, the 95% confidence interval for the population mean is closest to:

A. $148.04 to $151.96

B. $144.08 to $159.96

C. $149.04 to $152.96



Questions (Cont…)

7. The Central Limit Theorem is best described as stating that the sampling distribution of the sample mean will be approximately normal for large-size samples:

A. if the population distribution is normal.

B. if the population distribution is symmetric.

C. for populations described by any probability distribution.



Solution

1. B. The t-distribution has fatter tails than the normal distribution

2. C. Consistency, efficiency and unbiasedness are desirable properties of an estimator

3. B. It is the correct definition of the sampling error

4. B. Standard error = 3/√64 = 3/8 = 0.375

5. C. Standard error = √25/√144 = 5/12 = 0.417

6. A. Confidence interval is 150 ± 1.96(10/√100) = 150 ± 1.96 = 148.04 to 151.96

7. C.


Reading 11: Hypothesis Testing




Hypothesis Test

Type I and Type II errors

P-Value

T-test

F-test, Chi-square test



Hypothesis Testing

A statistical hypothesis test is a method of making statistical decisions from and about experimental data.

Null-hypothesis testing answers the question:

“How well do the findings fit the possibility that chance factors alone might be responsible?”

Example: Does your score of 6/10 imply that I am a good teacher?


Key steps in Hypothesis Testing

Null Hypothesis (H0): The hypothesis that the researcher wants to reject

Alternate Hypothesis(Ha): The hypothesis which is concluded if there is sufficient evidence to reject null hypothesis

Test Statistic

Rejection/Critical Region

Conclusion



Launching a niche course for MBA students?

Christos, a brand manager for a leading financial training center, wants to introduce a new niche finance course for MBA students. He met some industry stalwarts and found that, with the skills acquired by attending such a course, the students would be able to land a good job.

He meets a random sample of 100 students and discovers the following characteristics of the market

Mean household income = $20,000

Interest level in students = high

Current knowledge of students for the niche concepts = low

Christos strongly believes the course would be adequately profitable if the students have the buying power for it. They would be able to afford the course only if the mean household income is greater than $19,000.

Would you advise Christos to introduce the course?

What should be the hypothesis?

• Hint: What is the point at which the decision changes (19,000 or 20,000)?

• What about the alternate hypothesis?

What other information do you need to ensure that the right decision is arrived at?

• Hint: confidence intervals/ significance levels?

• Hint: Is there any other factor apart from mean, which is important? How do I move from population parameters to standard errors?



Criterion for Decision Making

What is the risk still remaining, when you take this decision?

• Hint: Type-I/II errors?

• Hint: P-value

To reach a final decision, Christos has to make a general inference (about the population) from the sample data.

Criterion: Mean income across all households in the market area under consideration.

• If the mean population household income is greater than $19,000, then Christos should introduce the course into the new market.

Christos’s decision making is equivalent to either accepting or rejecting the hypothesis:

• The population mean household income in the new market area is greater than $19,000

The term one-tailed signifies that all z-values that would cause Christos to reject H0 are in just one tail of the sampling distribution

• μ -> Population Mean

• H0: μ ≤ $19,000

• Ha: μ > $19,000


Identifying the Critical Sample Mean Value – Sampling Distribution

Sample mean values greater than $19,000, that is, x̄ values on the right-hand side of the sampling distribution centered on µ = $19,000, suggest that H0 may be false.

More importantly, the farther to the right x̄ is, the stronger the evidence against H0.

[Figure: sampling distribution centered on $19,000 with the Critical Value (Xc) marked in the right tail; reject H0 if the sample mean exceeds Xc]


Computing the Criterion Value

The standard deviation for the sample of 100 households is $4,000. The standard error of the mean (sx) is given by:

sx = s/√n = $4,000/√100 = $400

The critical mean household income xc is computed through the following two steps:

• Determine the critical z-value, zc. For α = 0.05:

– zc = 1.645.

• Substitute the values of zc, sx, and μ (under the assumption that H0 is "just" true)

• Critical Value xc

• xc = μ + zc*sx = $19,000 + 1.645 × $400 = $19,658.

• In this case, the observed sample statistic (20,000) is greater than the critical value (19,658), so the null hypothesis is rejected.

Decision Rule: If the sample mean household income is greater than $19,658, reject the null hypothesis and introduce the new course
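The two steps for the critical value can be sketched as follows, using the figures from the Christos example (μ = $19,000 under H0, s = $4,000, n = 100, α = 0.05). The variable names are illustrative, and `statistics.NormalDist` is used here to look up the critical z-value instead of a printed table.

```python
import math
from statistics import NormalDist

mu_0 = 19_000      # hypothesized population mean under H0
s = 4_000          # sample standard deviation
n = 100            # sample size
x_bar = 20_000     # observed sample mean
alpha = 0.05       # significance level (one-tailed)

standard_error = s / math.sqrt(n)
z_c = NormalDist().inv_cdf(1 - alpha)    # critical z for a right-tailed test
x_c = mu_0 + z_c * standard_error        # critical sample mean

reject_h0 = x_bar > x_c
print(f"standard error = {standard_error:.0f}")
print(f"z_c = {z_c:.3f}, x_c = {x_c:,.0f}")
print("reject H0" if reject_h0 else "do not reject H0")
```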


Test Statistic

The value of the test statistic is simply the z-value corresponding to x̄ = $20,000.

Z = (x̄ - μ)/sx = ($20,000 - $19,000)/$400 = 2.5

Here, sx is the standard error

[Figure: sampling distribution centered on μ = $19,000 (Z = 0); the critical value Xc = $19,658 (Zc = 1.645, α = 0.05) separates the "Do not Reject H0" region from the "Reject H0" region, and the observed x̄ = $20,000 (Z = 2.5) falls in the rejection region]

There is a significant difference between the hypothesized population parameter and the observed sample statistic => Mean income > 19,000 => Launch the course


Errors in Estimation

Please note: You are inferring for a population, based only on a sample

• This is no proof that your decision is correct; it is just a hypothesis

There is still a chance that your inference is wrong. How do I quantify the prob. of error in inference?

Type I and Type II Errors:

Type I error occurs if the null hypothesis is rejected when it is true

Type II error occurs if the null hypothesis is not rejected when it is false

Significance Level:

α -> Significance level: The upper-bound probability of a Type I error

1 - α -> Confidence level: The complement of the significance level

The power of a test is the probability of correctly rejecting the null hypothesis when it is false.

Inference       | Actual: H0 is True                          | Actual: H0 is False
H0 is True      | Correct Decision, Confidence Level = 1-α    | Type-II Error, P(Type-II Error) = β
H0 is False     | Type-I Error, Significance Level = α        | Power = 1-β


P - Value – Actual Significance Level

The p-value is the smallest level of significance at which the null hypothesis can be rejected.

P-value

The probability of obtaining an observed value of x̄ (from the sample) as high as $20,000 or more when the actual population mean (μ) is only $19,000 = 0.00621

It is the calculated probability of rejecting the null hypothesis (H0) when that hypothesis (H0) is true (Type I error)

The actual significance level of 0.00621 in this case means that the odds are about 62 out of 10,000 that a sample mean income of $20,000 or more would have occurred entirely due to chance (when the population mean income is $19,000)

[Figure: sampling distribution centered on μ = $19,000 (Z = 0) with the "Do not Reject H0" and "Reject H0" regions for α = 0.05; the p-value of 0.00621 is the area in the right tail beyond the observed sample mean]
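The p-value quoted above can be reproduced from the z-statistic. This is a sketch using the example's figures (x̄ = $20,000, μ = $19,000, standard error = $400); `statistics.NormalDist` stands in for a z-table.

```python
from statistics import NormalDist

x_bar = 20_000          # observed sample mean
mu_0 = 19_000           # population mean under H0
standard_error = 400    # s / sqrt(n) from the example
alpha = 0.05

z = (x_bar - mu_0) / standard_error
p_value = 1 - NormalDist().cdf(z)    # right-tail probability

print(f"z = {z}")
print(f"p-value = {p_value:.5f}")
print("reject H0" if p_value < alpha else "do not reject H0")
```

Because the p-value is smaller than α = 0.05, the null hypothesis is rejected, which matches the critical-value approach.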


Some variations in the Z-Test - I

What if Christos surveyed the market and found that the student behavior is estimated to be:

They would find the training too expensive if their household income is < US$ 19,000 and hence would not have the buying power for the course?

They would perceive the training to be of inferior quality, if their household income is > US$19,000 and hence not buy the training?

How would the decision criteria change? What should be the testing strategy?

Hint: From the question wording infer: Two tailed testing

Appropriately modify the significance value and other parameters

Use the Z-test

Appropriate change in the decision making and testing process:

Students will not attend the course if:

• The household income is > $19,000 and the students perceive the course to be inferior

• The household income is < $19,000

This becomes a two-tailed test wherein the students will join the course only when the household income lies within particular bounds, i.e. the household income should be neither very high nor very low


Two-Tailed Test

Now the test is modified to a two-tailed test, which signifies that the z-values that would cause Christos to reject H0 are in both tails of the sampling distribution

• μ -> Population Mean

• H0: μ = $19,000

• Ha: μ ≠ $19,000

Since we are checking for a significant difference at both ends, it is a two-tailed test

The lower boundary = μ - Z(α/2) × sx = $19,000 - 1.96 × $400 = $18,216

The upper boundary = μ + Z(α/2) × sx = $19,000 + 1.96 × $400 = $19,784

Conclusion: If the mean household income lies between $18,216 and $19,784, then the students will attend the course, at 95% confidence

[Figure: sampling distribution centered on μ = $19,000 (Z = 0) with a "Do not Reject H0" region in the middle and a "Reject H0" region of α/2 = 0.025 in each tail]
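The two boundaries can be computed directly. The sketch below uses the example's values (μ = $19,000, standard error = $400, α = 0.05 split equally between the two tails); the variable names are illustrative.

```python
from statistics import NormalDist

mu_0 = 19_000
standard_error = 400
alpha = 0.05

# Two-tailed test: alpha / 2 in each tail.
z_half_alpha = NormalDist().inv_cdf(1 - alpha / 2)

lower = mu_0 - z_half_alpha * standard_error
upper = mu_0 + z_half_alpha * standard_error

print(f"z(alpha/2) = {z_half_alpha:.2f}")
print(f"do not reject H0 between {lower:,.0f} and {upper:,.0f}")
```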


t-test

The t-distribution is a probability distribution defined by a single parameter known as degrees of freedom (df).

Like the standard normal distribution, a t-distribution has a mean of zero.

However, unlike a standard normal distribution that has a variance of one, a t-distribution has a variance greater than one.

The t-distribution also has fatter tails than a normal distribution.

The t-distribution approaches a normal distribution as the degrees of freedom increases.

A sample size greater or equal to 30 is treated as a large sample and a sample less than 30 is treated as a small sample.

The test statistic for a sample of size n (with n-1 degrees of freedom) is given by:

$$t_{n-1} = \frac{\bar{X} - \mu_0}{s / \sqrt{n}}$$
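The t-statistic formula can be sketched as a small function. The inputs below (sample mean 10.5, hypothesized mean 10.0, sample standard deviation 2.0, n = 16) are hypothetical and serve only to show the arithmetic; the resulting statistic would still need to be compared with a critical value from a t-table for n - 1 degrees of freedom.

```python
import math


def t_statistic(sample_mean, hypothesized_mean, sample_sd, n):
    """Return the t-statistic and its degrees of freedom (n - 1)."""
    if n < 2:
        raise ValueError("need at least two observations")
    standard_error = sample_sd / math.sqrt(n)
    t = (sample_mean - hypothesized_mean) / standard_error
    return t, n - 1


# Hypothetical inputs, not taken from the slides.
t, df = t_statistic(sample_mean=10.5, hypothesized_mean=10.0, sample_sd=2.0, n=16)
print(f"t = {t:.2f} with {df} degrees of freedom")
```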


Question

1. A researcher has a sample of 400 observations from a population whose standard deviation is known to be 136. The mean of the sample is calculated to be 17.2. The null hypothesis is stated as Ho: mean = 4. The p-value under the alternative hypothesis H1: mean > 4 equals

A. 3.92% B. 2.6% C. 5.2%

2. Buchanan thinks that KKR is unable to perform because of Ganguly. He looks at the statistics and conducts a leadership survey, which reveals that Ganguly scores low on leadership qualities. Buchanan hypothesizes H0: Ganguly is not a leader, HA: Ganguly is a leader. Buchanan removes Ganguly as KKR captain, but KKR keeps losing. Subsequent analysis shows that ShahRukh Khan was causing the problem. By removing Ganguly, Buchanan:

A. Made a Type II error.

B. Is correct.

C. Made a Type I error.



Question

3. If the standard deviation of a population is 100 and a sample size taken from that population is 64, the standard error of the sample means is closest to:

A. 0.08 B. 1.56 C. 12.50

4. Which of the following statements about the hypothesis testing is most accurate?

A. A Type II error is rejecting the null when it is actually true

B. The significance level equals one minus the probability of a Type I error

C. A two-tailed test with a significance level of 5% has z-critical values of ±1.96


Solution

1. B. 2.6%. The z-statistic under the null is calculated to be (17.2 - 4)/(136/(400^.5)) = 1.94. The right-tailed probability of observing a z-statistic at least as big as 1.94 equals 1.0 - 0.9738 = 0.026 = 2.6%. This is the p-value of the right-tailed test in this sample.

2. A. Made a Type II error. A Type II error occurs when you fail to reject a hypothesis when it is actually false; its probability is β, and the power of the test is 1 - β. A Type I error is the rejection of a hypothesis when it is actually true; its probability is the significance level of the test. P(Type II) = P(Accepting H0 | H0 false). Here Buchanan failed to reject H0 (Ganguly is not a leader) although it was false.

3. C. 12.5. Standard error = σ/√n = 100/√64 = 100/8 = 12.5

4. C. Rejecting the null when it is true is a Type I error. A Type II error is failing to reject the null hypothesis when it is false. The significance level equals the probability of a Type I error, not one minus that probability.


Hypothesis Tests for Variances

Test for Single Population Variance

Chi-Square Test Statistic:

$$\chi^2_{n-1} = \frac{(n-1)s^2}{\sigma_0^2}$$

Example Hypothesis:

H0: σ² = σ0²

HA: σ² ≠ σ0²

Test for Two Population Variances

F-test Statistic:

$$F = \frac{s_1^2}{s_2^2}, \quad \text{with } n_1 - 1 \text{ and } n_2 - 1 \text{ degrees of freedom}$$

Example Hypothesis:

H0: σ1² – σ2² = 0

HA: σ1² – σ2² ≠ 0

In testing for variances, there are two different tests, because the ratio of two independent chi-square variables (each divided by its degrees of freedom) follows an F-distribution rather than a chi-square distribution


Chi-square test

It is used for hypothesis tests concerning the variance of a normally distributed population

Hypothesis for two-tailed test of single-population variance:

H0: σ² = σ0² versus Ha: σ² ≠ σ0²

Hypotheses for one-tailed tests are structured as:

H0: σ² ≤ σ0² versus Ha: σ² > σ0², or, H0: σ² ≥ σ0² versus Ha: σ² < σ0²

Steps:

1) Collect the sample & calculate the sample statistics

2) Make a decision regarding the hypothesis

3) Make a decision based on the results of the test
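The chi-square test statistic for a single population variance, (n - 1)s²/σ0², can be sketched as below. The inputs (sample variance 16, hypothesized variance 10, n = 21) are hypothetical; the critical value would come from a chi-square table with n - 1 degrees of freedom and is not computed here.

```python
def chi_square_statistic(sample_variance, hypothesized_variance, n):
    """Return the chi-square statistic (n - 1) * s^2 / sigma_0^2 and
    its degrees of freedom (n - 1)."""
    if n < 2:
        raise ValueError("need at least two observations")
    if hypothesized_variance <= 0:
        raise ValueError("hypothesized variance must be positive")
    df = n - 1
    return df * sample_variance / hypothesized_variance, df


# Hypothetical inputs, not taken from the slides.
chi2, df = chi_square_statistic(sample_variance=16.0, hypothesized_variance=10.0, n=21)
print(f"chi-square = {chi2:.1f} with {df} degrees of freedom")
```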


Appendix: The Chi-square Distribution

The chi-square distribution is a family of distributions, depending on degrees of freedom:

d.f. = n - 1

[Figure: chi-square (χ²) density curves for d.f. = 1, d.f. = 5 and d.f. = 15, each plotted over the range 0 to 28]


Example : F-test

Q : William Waugh is examining the earnings for two different industries. He suspects that the earnings for chemical industry are more divergent than those of petroleum industry. To confirm, he took a sample of 35 chemical manufacturers & a sample of 45 petroleum companies. He measured the sample standard deviation of earnings across the chemical industry to be $3.5 & that of petroleum industry to be $3.00. Determine if the earnings of the chemical industry have greater standard deviation than those of the petroleum industry.

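A: 1) State the hypothesis:

H0: σ1² ≤ σ2² versus Ha: σ1² > σ2²

where σ1² = variance of earnings for the chemical industry

σ2² = variance of earnings for the petroleum industry

Note: the larger sample variance (here s1², the chemical industry) is placed in the numerator

2) Select the appropriate test statistic:

F = s1²/s2²

3) Specify the level of significance: Take it as 5% here

4) State the decision rule regarding the hypothesis:

Reject H0 if F > 1.74

5) Collect the sample & calculate the sample statistics:

Using the information provided, the F-statistic can be computed as:

F = s1²/s2² = $3.50²/$3.00² = 12.25/9.00 = 1.36

Since F = 1.36 is less than the critical value of 1.74, H0 is not rejected: the sample does not show that chemical industry earnings are more divergent than petroleum industry earnings at the 5% level.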


Example : F-Test

Question: You are a financial analyst for a brokerage firm. You want to compare dividend yields between stocks listed on the BSE & NSE. You collect the following data:

           BSE     NSE
Number     30      50
Mean       3.27    2.53
Std dev    1.5     1.4

Is there a difference in the variances between the BSE & NSE at the α = 0.05 level?


Example : F-Test (Cont…)

Form the hypothesis test:

H0: σ1² – σ2² = 0 (there is no difference between variances)

HA: σ1² – σ2² ≠ 0 (there is a difference between variances)

Find the F critical value for α = 0.05

Numerator:

• df1 = n1 – 1 = 30 – 1 = 29

Denominator:

• df2 = n2 – 1 = 50 – 1 = 49

• F(.05/2, 29, 49) = 1.881


Example : F-Test (Cont…)

The test statistic is:

F = s1²/s2² = 1.50²/1.40² = 1.148

F = 1.148 is not greater than the critical F value of 1.881, so we do not reject H0

Conclusion: There is no evidence of a difference in variances at α = 0.05

[Figure: F-distribution for H0: σ1² – σ2² = 0 versus HA: σ1² – σ2² ≠ 0, with the "Do not reject H0" region below F(α/2) = 1.881 and the "Reject H0" region (α/2 = .025) above it]
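The F-statistic in this example can be checked with a short sketch. The function name is an assumption; the standard deviations (1.5 and 1.4) and the critical value of 1.881 are taken from the example, and the critical value is treated as given because the standard library has no F-distribution table.

```python
def f_statistic(sd_1, sd_2):
    """Ratio of two sample variances, with the larger one in the numerator."""
    var_1, var_2 = sd_1 ** 2, sd_2 ** 2
    return max(var_1, var_2) / min(var_1, var_2)


f_value = f_statistic(1.5, 1.4)     # BSE vs. NSE standard deviations
f_critical = 1.881                  # F(.05/2, 29, 49) as given in the example

reject_h0 = f_value > f_critical
print(f"F = {f_value:.3f}")
print("reject H0" if reject_h0 else "do not reject H0")
```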


Parametric & Non-parametric tests

Parametric tests: They rely on assumptions regarding the distribution of the population & are specific to population parameters

Example: z-test

Nonparametric tests: They either do not consider a particular parameter or have few assumptions about the population that is sampled

These are used when there is concern about quantities other than the parameters of a distribution or when the assumptions of parametric tests can’t be supported

Example: ranked observations

Spearman rank correlation test: It can be used when the data are not normally distributed

Example: The performance ranks of 20 mutual funds for two years which are not normally distributed



Questions

1. An analyst is testing a hypothesis about stock returns. He would like to minimise the chances of rejecting the null hypothesis when it is true. Which of the following is most likely to be the level of significance?

A. 0.05 B. 0.95 C. 0.01

2. An analyst would like to compare the returns of two sample portfolios derived from the S&P 500 index. If he performs a two sample test to test the hypothesis with a 5% level of significance, which of the following is most likely?

A. The probability of Type I error is 95%

B. The probability that the null hypothesis would not be rejected when it is true is 5%

C. The probability of Type I error is 5%

3. What is the power of the test if the significance level of the test is 0.05 & the probability of the Type II error is 0.25?

A. 0.250

B. 0.750

C. 0.850


Questions

4. Which of the following statements of the central limit theorem is least likely true?

A. For large n if the population distribution is uniform, the sample distribution is always normal

B. The standard deviation of the sample is always less than the population standard deviation

C. The interval within which the sample mean is expected to fall is µ ± zσ.

5. The probability of an investment earning an average return of 15% is 33% out of a given portfolio of investment options. The probability distribution of such investments options would follow which of the following distributions?

A. Binomial distribution

B. Poisson distribution

C. Normal distribution

6. Which of the following is false about the t statistic and the z values?

A. For a given confidence interval, as the degrees of freedom increases the t- values approach the normal z values

B. The student’s t test is used when the population is normal but its standard deviation is unknown.

C. The z value is used for hypothesis testing when the sample variance is known.



Questions

7. The F Statistic is:

A. Always +ve and is +ve skewed

B. Always -ve and is -ve skewed

C. Can be +ve or Negative and is symmetric

8. Which of the following statements about the F-distribution & chi-square distribution is least accurate? Both distributions:

A. Are asymmetrical

B. Are bound by zero on the left

C. Have means that are less than their standard deviations



Solutions

1. C. Because the analyst wants to minimize the chances of rejecting the null hypothesis when it is true, he will use the lowest of the given levels of significance, 0.01

2. C. The probability of Type I error is 5%

3. B. Power of the test = 1 – P(Type II error) = 1 – 0.25 = 0.750

4. A. For large n if the population distribution is uniform, the sample distribution is always normal

5. A. In this case, the investment options will follow Binomial Distribution

6. C. The z value is used for hypothesis testing when the sample variance is known.

7. A. F Statistic is ratio of 2 variances and hence always +ve. F Distribution is also +vely skewed.

8. C. There is no consistent relationship between the mean & the standard deviation of the chi-square distribution or F-distribution



Reading 12: Technical Analysis




Technical Analysis vs. Fundamental Analysis

Advantages & Challenges of Technical Analysis

Line Charts, Bar Charts & Candlestick charts

Point and Figure Charts

Trend, support, resistance lines & change in polarity



Technical Analysis vs. Fundamental Analysis

Technical vs. Fundamental Analysis : The main difference between technical analysis and fundamental analysis is the use of financial statements to value equities.

Technical analysis is the practice of valuing stocks on past volume and pricing information. Technical analysis combines both the use of past information (how stocks have reacted previously) and "feeling" (how the market is moving the name) to value a security.

Fundamental analysis, however, takes a more formal approach. Fundamental analysts review the financial statements of a company and generate metrics, such as price-to-book value and enterprise value-to-EBITDA to value a security.

Assumptions of Technical Analysis :

Prices are determined by investor supply and demand for assets.

Supply and demand are driven by both rational and irrational behaviour.

While the causes of changes in supply and demand are difficult to determine, the actual shifts in supply and demand can be observed in market prices.

Prices move in trends and exhibit patterns that can be identified and tend to repeat themselves over time.



Advantages & Challenges of Technical Analysis

Advantages of Technical Analysis:

Technical analysis is easy to understand and can be performed relatively quickly, especially with the aid of one of the many types of charting software.

Technical analysis does not rely on the use of financial statements for valuation purposes.

Rather than strict fundamental valuation, technical analysis takes into account the "feeling" of the market, which is subjective.

Challenges to Technical Analysis:

The past is not always an indication of future results, calling into question the validity of technical analysis.

Technical analysis conflicts with the efficient markets hypothesis (EMH): EMH believers assume that prices adjust too quickly for trading on past price patterns to be profitable.

56


Line Charts

A line chart is the simplest type of stock chart used in technical analysis.

The line chart is also called a close-only chart because it plots the closing price of the underlying security, with a line connecting the closing prices.

The uncluttered simplicity of the line chart is its greatest strength: it provides a clean, easily recognizable visual display of price movement. This makes it an ideal tool for identifying the dominant support and resistance levels, trend lines, and certain chart patterns.

However, the line chart does not show the highs and lows and hence does not indicate the price range for the session.

57


Bar Charts & Candlestick charts

OHLC Bar Charts

Bar charts consist of bars, which are vertical lines with the bottom representing the low price (L) of the time frame and the top representing the high price (H). Each bar also has a horizontal dash on the right side to indicate the close price (C) for the time frame, and some have a horizontal dash on the left side to indicate the open price (O).

Japanese candlestick charts form the basis of the oldest form of technical analysis. Candlestick charts provide the same information as OHLC bar charts.

Candlesticks indicate a bullish up bar, when the closing price is higher than the opening price, by using a light color such as white or green for the real body of the candlestick. A bearish down bar, when the closing price is lower than the opening price, uses a darker color such as black or red.
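The candlestick coloring rule above can be sketched in a few lines of Python. This is only an illustration: the `Bar` class, its field names, and the "neutral" label for a bar whose open equals its close are assumptions and do not come from the slides.

```python
from dataclasses import dataclass


@dataclass
class Bar:
    """One OHLC bar; the class and field names are illustrative."""
    open: float
    high: float
    low: float
    close: float


def candle_color(bar: Bar) -> str:
    """Classify the real body of a candlestick from its open and close."""
    if bar.close > bar.open:
        return "bullish"   # usually drawn in a light color (white or green)
    if bar.close < bar.open:
        return "bearish"   # usually drawn in a dark color (black or red)
    return "neutral"       # assumption: open equals close, so no real body


def price_range(bar: Bar) -> float:
    """High minus low: the session range that a close-only line chart omits."""
    return bar.high - bar.low
```

For example, `candle_color(Bar(open=10, high=12, low=9, close=11))` classifies the bar as bullish, and `price_range` on the same bar gives the high-low range of 3.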

58


Point and Figure Charts

Point and Figure (P&F) charts differ from other stock charts in that they do not plot price movement from left to right within fixed time intervals. They also do not plot the volume traded.

Instead, a P&F chart plots unidirectional price movements in one vertical column and moves to the next column when the price changes direction.

It represents increases in price by plotting X's in the column and decreases in price by plotting O's. Each X and O represents a box of a set size or price amount.

This box size determines how far the price must move before another X or O is added to the chart, depending on the direction of the price movement.

Thus, if the box size is set at 15, the price must move 15 points beyond the previous box before the next X or O is plotted. Any movement of less than 15 points is ignored.
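A rough Python sketch of the box-size rule may help make it concrete. It is a simplification under stated assumptions: a new column starts as soon as the price moves one full box in the opposite direction (the slides do not specify a reversal size, and practitioners often require more than one box), and the function name and return format are invented for this illustration.

```python
def point_and_figure(prices, box_size):
    """Build simplified P&F columns from a list of prices.

    Returns a list of (symbol, box_count) tuples, one per column.
    Assumption: a one-box move against the current column starts a new one.
    """
    if box_size <= 0:
        raise ValueError("box_size must be positive")
    columns = []       # each entry is [symbol, box_count]
    ref = prices[0]    # price level of the most recently plotted box
    for price in prices[1:]:
        boxes = int(abs(price - ref) // box_size)
        if boxes == 0:
            continue   # moves smaller than one box are ignored
        symbol = "X" if price > ref else "O"
        if columns and columns[-1][0] == symbol:
            columns[-1][1] += boxes          # same direction: extend column
        else:
            columns.append([symbol, boxes])  # direction changed: new column
        step = boxes * box_size
        ref = ref + step if symbol == "X" else ref - step
    return [(symbol, count) for symbol, count in columns]
```

With a box size of 15, the price path 100, 110, 116, 131, 140, 100 produces one column of two X's followed by one column of two O's; the moves to 110 and 140 are smaller than one box from the last plotted level and are ignored.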

59


Trend, support, resistance lines & change in polarity

In an uptrend, prices are reaching higher highs and higher lows. An uptrend line is drawn below the prices on a chart by connecting the increasing lows with a straight line.

In a downtrend, prices are reaching lower lows and lower highs. A downtrend line is drawn above the prices on a chart by connecting the decreasing highs with a straight line.
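The uptrend and downtrend definitions above can be expressed as a small check on swing points. This is a minimal sketch: the function name is hypothetical, and how swing highs and lows are picked from raw prices is deliberately left out because that choice rests with the analyst.

```python
def classify_trend(swing_highs, swing_lows):
    """Label a trend from swing highs and swing lows given in time order."""

    def rising(values):
        # Strictly increasing, with at least two points to compare.
        return len(values) >= 2 and all(b > a for a, b in zip(values, values[1:]))

    def falling(values):
        # Strictly decreasing, with at least two points to compare.
        return len(values) >= 2 and all(b < a for a, b in zip(values, values[1:]))

    if rising(swing_highs) and rising(swing_lows):
        return "uptrend"       # higher highs and higher lows
    if falling(swing_highs) and falling(swing_lows):
        return "downtrend"     # lower highs and lower lows
    return "no clear trend"
```

An uptrend line would then be drawn through the rising swing lows, and a downtrend line through the falling swing highs.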

Support and resistance are price levels or ranges at which buying or selling pressure is expected to limit price movement. Commonly identified support and resistance levels include trend lines and previous high and low prices.

The change in polarity principle is the idea that breached resistance levels become support levels and breached support levels become resistance levels.

60


Chart Patterns

Continuation patterns indicate a higher probability that the existing trend will continue. They are usually momentary consolidations or retracements within the trend. Common continuation patterns include flags and pennants, and the various triangle patterns.

Reversal patterns indicate a high probability that the existing trend has come to an end and will reverse direction. Common reversal patterns include double and triple tops, double and triple bottoms, head and shoulders, and rising and falling wedges.

61

[Figure: Double Top Pattern]


Technical Analysis Indicators

Price-based indicators include moving averages, Bollinger bands, and momentum oscillators such as the Relative Strength Index, moving average convergence/divergence lines, rate-of-change oscillators, and stochastic oscillators.

These indicators are commonly used to identify changes in price trends, as well as “overbought” markets that are likely to decrease in the near term and “oversold” markets that are likely to increase in the near term.

Sentiment indicators include opinion polls, the put/call ratio, the volatility index, margin debt, and the short interest ratio. Margin debt, the Arms index, the mutual fund cash position, new equity issuance, and secondary offerings are flow-of-funds indicators.

Technical analysts often interpret these indicators from a “contrarian” perspective, becoming bearish when investor sentiment is too positive and bullish when investor sentiment is too negative.

62


Cycles in Technical Analysis

Some technical analysts believe market prices move in cycles. Examples include the Kondratieff wave (a 54-year cycle), decennial patterns (10-year cycles), and a 4-year cycle related to U.S. presidential elections.

Elliott wave theory suggests that prices exhibit a pattern of five waves in the direction of a trend and three waves counter to the trend.

Technical analysts who employ Elliott wave theory frequently use ratios of the numbers in the Fibonacci sequence to estimate price targets and identify potential support and resistance levels.
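To illustrate what "ratios of the numbers in the Fibonacci sequence" look like, the short Python sketch below computes the ratios of consecutive terms. The function names are illustrative, and starting the sequence at 1, 1 is an assumption (some texts start at 0, 1). How an analyst turns these ratios into price targets is not shown here.

```python
def fibonacci(count):
    """Return the first `count` Fibonacci numbers, starting 1, 1, 2, 3, ..."""
    seq = [1, 1]
    while len(seq) < count:
        seq.append(seq[-1] + seq[-2])
    return seq[:count]


def consecutive_ratios(count):
    """Return (larger/smaller, smaller/larger) for each consecutive pair."""
    seq = fibonacci(count)
    return [(b / a, a / b) for a, b in zip(seq, seq[1:])]
```

As the sequence grows, the ratio of each term to the previous one settles near 1.618 and the inverse ratio near 0.618.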

63


Terms & Definitions

64

Terms Definitions

What do price and volume reflect? the collective behavior of buyers and sellers

What is the key assumption of TA? market prices reflect both rational and irrational investor behavior; implies that the efficient markets hypothesis does not hold

What do TAs believe about investor behavior? it is reflected in trends and patterns that tend to repeat and can be identified and used for forecasting prices

What are two advantages of TA? 1) actual price and volume data is observable whereas much of fundamental data is subject to assumptions or restatements; 2) it can be applied to prices of assets that do not produce future cash flows

If prices have changed exponentially over long periods of time, what might an analyst do to his charts? draw a chart on a logarithmic scale instead of a linear scale

What are the three main types of charts? 1) line charts; 2) bar charts; 3) candlestick charts

What does relative strength mean? a trend that indicates the asset is outperforming the benchmark


Terms & Definitions

65

Terms Definitions

What does relative weakness mean? a trend that indicates the asset is underperforming the benchmark

What is an uptrend? if prices are consistently reaching higher highs and retracing to higher lows; demand is increasing relative to supply

What is a downtrend? if prices are consistently declining to lower lows and retracing to lower highs; supply is increasing relative to demand

What is a breakout? when price crosses the trendline from a downtrend by what the analyst considers a significant amount

What is a breakdown? when price crosses the trendline from an uptrend by what the analyst considers a significant amount

What is a support level? buying which is expected to emerge that prevents further price decreases

What is a resistance level? selling which is expected to emerge that prevents further price increases

What is a change in polarity? belief that breached resistance levels become support levels and that breached support levels become resistance levels


Terms & Definitions

66

Terms Definitions

What is a head-and-shoulders pattern? a reversal pattern that suggests the demand that has been driving the uptrend is fading, especially if each of the highs in the pattern occurs on declining volume

What are three reversal patterns for downtrends? 1) inverse head-and-shoulders; 2) double bottom; 3) triple bottom

What is a continuation pattern? suggests a pause in a trend rather than a reversal

What are triangles? Form when prices reach lower highs and higher lows over a period of time. Trendlines on the highs and on the lows thus converge when they are projected forward. They can be symmetrical, ascending or descending; suggests buying and selling pressure have become roughly equal temporarily but they do not imply a change in direction of a trend

What are rectangles? when trading temporarily forms a range between a support level and a resistance level; suggests the prevailing trend will resume and can be used to set a price target; they do not imply a change in direction of a trend

What is a moving average? mean of the last 'x' closing prices; often viewed as support or resistance levels
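The moving average entry above (the mean of the last 'x' closing prices) can be sketched as follows. The function name and the choice to return values only once a full window is available are assumptions made for this illustration.

```python
def simple_moving_average(closes, window):
    """Mean of the last `window` closing prices, for each full window.

    The first returned value corresponds to closes[window - 1].
    """
    if window <= 0:
        raise ValueError("window must be positive")
    return [
        sum(closes[i - window + 1 : i + 1]) / window
        for i in range(window - 1, len(closes))
    ]
```

For closes of 1, 2, 3, 4, 5 and a window of 3, the averages are 2.0, 3.0 and 4.0.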


Terms & Definitions

67

Terms Definitions

In an uptrend where is price in relation to the moving average? price is higher than the moving average

In a downtrend where is price in relation to the moving average? price is lower than the moving average

What is a golden cross? when a short-term average crosses the long-term average from below; 'buy' signal; emerging uptrend

What is a dead cross? when a short-term average crosses the long-term average from above; 'sell' signal; emerging downtrend

What are Bollinger bands? constructed based on the standard deviation of closing prices over the last 'n' periods; move away from each other when volatility increases and move closer together when prices are less volatile

What do contrarians believe? markets get overbought or oversold because most investors tend to buy and sell at the wrong times, and thus it can be profitable to trade in the opposite direction
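The golden cross, dead cross, and Bollinger band entries above can be illustrated with a short Python sketch. Several details are assumptions not stated in the slides: the function names, placing the bands two standard deviations either side of the moving average (`num_std=2.0`), and using the population standard deviation.

```python
from statistics import mean, pstdev


def sma(closes, window):
    """Simple moving average; the first value corresponds to closes[window - 1]."""
    return [mean(closes[i - window + 1 : i + 1]) for i in range(window - 1, len(closes))]


def crossovers(closes, short_window, long_window):
    """Return (index, label) pairs where the short average crosses the long one."""
    if short_window >= long_window:
        raise ValueError("short_window must be smaller than long_window")
    long_ma = sma(closes, long_window)
    # Drop the early short-average values so both lists cover the same dates.
    short_ma = sma(closes, short_window)[long_window - short_window :]
    signals = []
    for i in range(1, len(long_ma)):
        prev_diff = short_ma[i - 1] - long_ma[i - 1]
        curr_diff = short_ma[i] - long_ma[i]
        if prev_diff <= 0 < curr_diff:
            signals.append((i + long_window - 1, "golden cross"))  # from below
        elif prev_diff >= 0 > curr_diff:
            signals.append((i + long_window - 1, "dead cross"))    # from above
    return signals


def bollinger_bands(closes, window, num_std=2.0):
    """Return (lower, middle, upper) for each full window of closing prices."""
    bands = []
    for i in range(window - 1, len(closes)):
        recent = closes[i - window + 1 : i + 1]
        middle = mean(recent)
        spread = num_std * pstdev(recent)  # wider when prices are more volatile
        bands.append((middle - spread, middle, middle + spread))
    return bands
```

On the closes 5, 4, 3, 2, 1, 2, 3, 4, 5, 6 with a 2-period and a 4-period average, `crossovers` reports a single golden cross at index 6, where the recovering short-term average moves above the long-term one.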


Terms & Definitions

68

Terms Definitions

What is an oscillator? group of technical tools TAs use to identify overbought/oversold markets; based on market prices but scaled so that they "oscillate" around a given value such as zero or between two values such as zero and 100; extremely high values indicate overbought condition whereas extremely low values indicate oversold condition; can be used to identify convergence or divergence.

What does convergence indicate? price trend is likely to continue

What does divergence indicate? potential change in price trend

What are four examples of oscillators? 1) ROC (rate of change); 2) RSI (relative strength index); 3) MACD (moving average convergence/divergence); 4) stochastic oscillator

What is the ROC oscillator? 100 x (latest closing price - closing price from n periods earlier); buy when ROC changes from negative to positive during an uptrend and sell when ROC changes from positive to negative during a downtrend
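A small Python sketch of the ROC oscillator follows. It reads the ROC entry above as 100 x (latest closing price - closing price from n periods earlier), which oscillates around zero. The function names are hypothetical, and the sketch deliberately leaves out the uptrend/downtrend filter mentioned in the definition, so it flags every sign change.

```python
def rate_of_change(closes, n):
    """ROC for each close that has a close n periods earlier."""
    if n <= 0:
        raise ValueError("n must be positive")
    return [100 * (closes[i] - closes[i - n]) for i in range(n, len(closes))]


def roc_signals(roc):
    """Return (index, label) pairs where ROC changes sign.

    Simplification: the prevailing trend is not checked here.
    """
    signals = []
    for i in range(1, len(roc)):
        if roc[i - 1] < 0 < roc[i]:
            signals.append((i, "buy"))    # negative to positive
        elif roc[i - 1] > 0 > roc[i]:
            signals.append((i, "sell"))   # positive to negative
    return signals
```

For closes of 10, 11, 9, 12 with n = 1, the ROC values are 100, -200 and 300, giving a sell signal followed by a buy signal.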


Questions

1. Which of the following is most likely to be considered a momentum indicator?

A. Put-call ratio

B. Breadth of market

C. Mutual fund cash position

2. A low price range in which buying activity is sufficient to stop a price decline is best described as:

A. Support

B. Resistance

C. Change in polarity

3. A technical analyst has detected a price chart pattern with three segments. The left segment shows a decline followed by a reversal to the starting price level. The middle segment shows a more pronounced decline than in the first segment and again a reversal to near the starting price level. The third segment is roughly a mirror image of the first segment. This chart pattern is most accurately described as:

A. A triple bottom

B. A head and shoulders

C. An inverse head and shoulders

69


Solution

1. B. Breadth of market is a momentum indicator. Put-call ratio and mutual fund cash position are contrary-opinion rules.

2. A. Support is defined as a low price range in which buying activity is sufficient to stop the decline in price.

3. C. An inverse head and shoulders pattern consists of a left segment that shows a decline followed by a reversal to the starting price level, a middle segment that shows a more pronounced decline than in the first segment and again a reversal to near the starting price level, and a third segment that is roughly a mirror image of the first segment.

70


Five Minute Recap

71


The means of all possible samples of size n drawn from a population will be approximately normally distributed when n is sufficiently large.

The mean of the sampling distribution is equal to μP and its standard deviation is equal to σP/√n. This is known as the standard error.
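As a quick numeric check of the standard error formula σP/√n, here is a minimal Python sketch. The function name and the example figures are illustrative only.

```python
import math


def standard_error(population_std, n):
    """Standard error of the sample mean: population_std / sqrt(n)."""
    if n <= 0:
        raise ValueError("n must be positive")
    return population_std / math.sqrt(n)
```

With a population standard deviation of 20 and a sample size of 100, the standard error is 2.0. Quadrupling the sample size to 400 only halves it, to 1.0.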

Methods of Sampling: Simple Random Sampling, Systematic Random Sampling, Stratified Random Sampling

Desirable Properties of an Estimator: Unbiasedness, Efficiency, Consistency

Sampling Biases: Data-Mining Bias, Sample Selection Bias, Look-Ahead Bias, Time-Period Bias

Inference \ Actual | H0 is True | H0 is False

H0 is True | Correct Decision; Confidence Level = 1-α | Type-II Error; P(Type-II Error) = β

H0 is False | Type-I Error; Significance Level = α | Power = 1-β

Chi-Square Test: Used for testing hypotheses concerning the variance of a single population.

F-Test: Used for testing hypotheses about the difference between the variances of two populations.
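The two test statistics can be sketched in Python using the standard formulas: chi-square = (n - 1) x s² / σ0² for a single variance, and F = s1² / s2² for two variances. The function names are illustrative, and putting the larger sample variance in the numerator is a common convention assumed here. Critical values from the chi-square and F tables are still needed to reach a decision.

```python
from statistics import variance


def chi_square_statistic(sample, hypothesized_variance):
    """(n - 1) * sample variance / hypothesized population variance."""
    n = len(sample)
    return (n - 1) * variance(sample) / hypothesized_variance


def f_statistic(sample_a, sample_b):
    """Ratio of the two sample variances, larger one in the numerator.

    Assumes both sample variances are greater than zero.
    """
    var_a, var_b = variance(sample_a), variance(sample_b)
    return max(var_a, var_b) / min(var_a, var_b)
```

Because the larger variance goes on top, `f_statistic` always returns a value of at least 1, in line with the F-statistic being a positive ratio of two variances.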

[Figure: t-Distribution]
