ec203i2ee lec3 kt

7/30/2019 Ec203I2EE Lec3 Kt


Lecture 3: Basic probability theory

Lectures 1 and 2 (and Labs 1 and 2): basic work with economic data

  Very valuable!!

However, economists often want to use more sophisticated statistical techniques to examine relationships between economic variables in more detail

The remainder of the class works towards basic single-variable regression analysis

  First step: basic probability theory

Gujarati Chapters 2 and 3

  read all

  here, key points

EC203 Introduction to Empirical Economics. KT. 1


Random variables

A random (or statistical) experiment is a process leading to at least two possible outcomes

  There will be uncertainty as to which outcome will occur

Example: rolling a fair die

  observe the number shown uppermost

  possible outcomes: 1, 2, 3, 4, 5 or 6

For a random experiment, we know in advance all the possible outcomes

What we do NOT know in advance is which outcome will occur in any particular experiment

Sample space (population)

  The set of all possible outcomes of the experiment: here, {1,2,3,4,5,6}



A random variable is a variable whose (numerical) value is determined by the outcome of a random experiment.

Example: toss two fair coins

  Let H denote a head, T a tail

  There are four possible outcomes: {HH, HT, TH, TT}

Now consider a variable X, defined as the number of heads observed in the throw of two fair coins ("number of heads" for short)

The situation is as follows

Possible outcome    Number of heads
TT                  0
TH                  1
HT                  1
HH                  2
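The mapping from outcomes to the number of heads can be reproduced in a few lines of Python. This is only an illustrative sketch: the names `sample_space` and `number_of_heads` are made up for this example and do not come from Gujarati.

```python
from itertools import product

# Sample space for tossing two fair coins: every ordered pair of H/T.
sample_space = ["".join(pair) for pair in product("HT", repeat=2)]

def number_of_heads(outcome: str) -> int:
    """The random variable X: maps an outcome such as 'HT' to its head count."""
    return outcome.count("H")

# Print each outcome next to the value X takes for it.
for outcome in sample_space:
    print(outcome, number_of_heads(outcome))

# The distinct values X can take.
possible_values = sorted({number_of_heads(o) for o in sample_space})
```

The point to notice is that X is a function of the outcome: four outcomes collapse to three values because HT and TH both give one head.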



The variable X, "number of heads", is a random (or stochastic) variable and has 3 possible values:

  X: 0, 1, 2

Random variables (r.v.s) may be discrete or continuous:

  A discrete r.v. takes on only a finite (or countably infinite) number of particular values

  A continuous r.v. can take on any value in some interval of values

Both the roll of the die and the toss of the coins that we have looked at are discrete r.v.s

An example of a continuous r.v. is the annual rainfall in Glasgow.

Focussing initially on discrete r.v.s makes the concepts easier to grasp.



Probability

Logical reasoning and/or empirical evidence may give us some feeling of how likely different outcomes are

  E.g. throw of the die: outcomes {1,2,3,4,5,6} are all equally likely

  Basic coin toss, H or T: both outcomes equally likely

For the two-coin toss example,

  we can expect the value 1 to occur with twice the likelihood of the value 0

  the values 0 and 2 are equally likely

Let's use the notation Pr for the probability of a particular outcome; we can now deduce some probabilities for this example

Possible outcome    Pr of outcome
0 heads             1/4
1 head              2/4
2 heads             1/4
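These probabilities can be deduced by counting equally likely outcomes, which is easy to sketch in Python. The variable names here are illustrative only, and exact fractions are used so that 2/4 shows up as 1/2.

```python
from collections import Counter
from fractions import Fraction
from itertools import product

# Four equally likely outcomes of tossing two fair coins.
outcomes = list(product("HT", repeat=2))

# Count how many outcomes give each number of heads.
counts = Counter(outcome.count("H") for outcome in outcomes)

# Classical (a priori) probability: favourable outcomes / total outcomes.
pr = {heads: Fraction(n, len(outcomes)) for heads, n in sorted(counts.items())}

for heads, p in pr.items():
    print(f"Pr({heads} heads) = {p}")

# The probabilities of all possible values add up to exactly 1.
total = sum(pr.values())
```

Nothing is estimated here: the probabilities follow purely from listing the outcomes and assuming each is equally likely.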



Or, we can write Pr(1 head) = 2/4 (or 1/2), etc.

Note that

  the probabilities sum to one, as we are distributing a total of 1

  the value of 1 corresponds to a certainty: we know that one of the outcomes will occur

  the outcomes are mutually exclusive (i.e. they cannot occur at the same time)

Note that this classical definition of probability is what we call an a priori definition

  the probabilities are derived from purely deductive reasoning

However, what if the outcomes of an experiment are not finite and cannot be stated with certainty?



E.g. what is the probability that GDP will rise by a certain amount?

Relative frequency or empirical definition of probability

Distinguish between absolute and relative frequency

  the absolute frequency is the number of occurrences of a given event

    e.g. 10 students in this class get an exam mark of 70% or more

  if there are 50 students in the class, the relative frequency of the event "achieving first-class marks" is 10/50 = 1/5

The frequency distribution of marks achieved by all 50 students in the class would show the different marking bands and how students are distributed across them, in both relative and absolute terms.



Can we treat relative frequencies as probabilities?

  Yes, provided the number of observations that the relative frequencies are based on is reasonably large

  The empirical, or relative frequency, definition of probability
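A small simulation can illustrate why the number of observations matters. The sketch below uses simulated rolls of a fair die rather than real data; the function name, the seed and the roll counts are all arbitrary choices for this example.

```python
import random

def relative_frequency_of_six(n_rolls: int, seed: int = 0) -> float:
    """Roll a simulated fair die n_rolls times; return the share of sixes."""
    rng = random.Random(seed)  # fixed seed (arbitrary) so runs are repeatable
    sixes = sum(1 for _ in range(n_rolls) if rng.randint(1, 6) == 6)
    return sixes / n_rolls

# Relative frequency of a six for increasingly large numbers of rolls.
for n in (10, 100, 10_000, 100_000):
    print(n, relative_frequency_of_six(n))
```

With only a handful of rolls the relative frequency can be far from the a priori value of 1/6 (about 0.167); as the number of rolls grows it tends to settle close to it.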

See Gujarati on properties of probabilities, but ignore Bayes' Theorem

Probability of random variables

First, discrete r.v.s

  take only a finite number of values

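If X is an r.v. with distinct values $x_1, x_2, \dots, x_n$, the function $f$ is defined by

$$f(x) = \begin{cases} P(X = x_i) & \text{for } x = x_i,\ i = 1, 2, \dots, n \\ 0 & \text{for } x \neq x_i \end{cases}$$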



$f$ is called the probability mass function (PMF) or probability function (PF)

Note that

  $0 \le f(x_i) \le 1$

  i.e. the probability of X taking the value $x_i$ lies between 0 and 1, and

  $\sum_{i=1}^{n} f(x_i) = 1$

From the two-coin toss example (slide 5)

Number of heads (X)    PF f(X)
0 heads                1/4
1 head                 1/2
2 heads                1/4
Sum                    1

Geometrically? (Insert the PMF of the number of heads in a two-coin toss; see Gujarati Fig 2.2)

Expected value of a random variable, X



$$E(X) = \sum_{i=1}^{n} x_i \, f(x_i)$$
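As a quick worked check (not taken from the slides), weighting each value by its probability for the two-coin PMF, with values 0, 1, 2 and probabilities 1/4, 1/2, 1/4, gives:

```latex
E(X) = 0 \cdot \tfrac{1}{4} + 1 \cdot \tfrac{1}{2} + 2 \cdot \tfrac{1}{4} = 1
```

So, on average, we expect one head per toss of two fair coins.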



Probability distribution of a continuous r.v.

Instead of a probability mass function we have a probability density function (PDF)

Because a continuous r.v. can take an infinite number of values, the probability of it taking any one exact value is zero, so probability is always measured over an interval

Formally, this means using an integral rather than the summation operator (used for discrete r.v.s)

$$P(x_1 < X < x_2) = \int_{x_1}^{x_2} f(x)\,dx \quad \text{for all } x_1 < x_2$$


(Insert diagram and/or see Gujarati Fig 2.3)

Note that $P(X = x_i) = 0$: the probability that a continuous r.v. takes any single exact value is zero

Properties of a PDF

1. Total area under the curve $f(x)$ is 1

2. $P(x_1 < X < x_2)$ is the area under the curve between $x_1$ and $x_2$, where $x_1 < x_2$
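The idea that probability is an area under the PDF can be checked numerically. The sketch below assumes a made-up density, f(x) = 2x on the interval [0, 1], chosen only because its areas are easy to verify by hand; it is not a distribution discussed in Gujarati. The integral is approximated with a simple midpoint rule.

```python
def f(x: float) -> float:
    """Hypothetical PDF chosen for illustration: f(x) = 2x on [0, 1], else 0."""
    return 2.0 * x if 0.0 <= x <= 1.0 else 0.0

def area_under_pdf(pdf, x1: float, x2: float, steps: int = 10_000) -> float:
    """Approximate the integral of pdf from x1 to x2 with the midpoint rule."""
    width = (x2 - x1) / steps
    return sum(pdf(x1 + (i + 0.5) * width) for i in range(steps)) * width

total_area = area_under_pdf(f, 0.0, 1.0)     # whole curve: should be close to 1
prob_interval = area_under_pdf(f, 0.5, 1.0)  # P(0.5 < X < 1): close to 0.75
prob_point = area_under_pdf(f, 0.5, 0.5)     # zero-width interval: no area

print(total_area, prob_interval, prob_point)
```

The last line of the calculation shows why a single point carries zero probability: an interval of zero width has no area, whatever the height of the curve.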


Gujarati

  Skip the material on cumulative distribution functions for now

  Also skip the section on multivariate probability density functions (later class)

  Statistical independence will also be key in later courses

For now, move on to Chapter 3

Characteristics of probability distributions

  * also referred to as moments of PDFs (next slide)



Moments of PDFs

The first moment of a PDF is the expected value of the random variable it represents

  the weighted average of all possible values, where the probabilities of these values serve as weights

  also called the average or mean value (the population mean value)

E.g. throwing a die

  outcomes are {1,2,3,4,5,6}

  each with a probability of 1/6

So, EV(X) = 1/6 + 2/6 + 3/6 + 4/6 + 5/6 + 6/6 = 21/6 = 3.5

Odd, since this is a discrete r.v. and 3.5 is not a possible outcome? Suppose someone gave you £1 for each number on the die (i.e. £6 for the 6, £1 for the 1): after a large number of rolls, you would anticipate receiving an average of £3.50 per roll
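The payout interpretation can be illustrated with a short sketch: first the exact expected value from the PMF, then the average payout over many simulated rolls. The seed and the number of rolls are arbitrary assumptions for this example.

```python
import random
from fractions import Fraction

# PMF of a fair die: each face 1..6 has probability 1/6.
pmf = {face: Fraction(1, 6) for face in range(1, 7)}

# Expected value: each value weighted by its probability.
expected_value = sum(face * p for face, p in pmf.items())
print(expected_value, float(expected_value))

# Simulated average payout per roll; seed and roll count are arbitrary choices.
rng = random.Random(1)
n_rolls = 100_000
average_payout = sum(rng.randint(1, 6) for _ in range(n_rolls)) / n_rolls
print(average_payout)
```

No single roll ever pays 3.5, but the long-run average per roll should land close to it.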



Geometrically: insert or take Gujarati Fig 3.1

Gujarati: read the section on properties of the expected value

Key here is that the expected value is a measure of central tendency of the PDF

Ignore for now the section on the EV of multivariate PDFs

Our next focus is the second moment of the PDF (taken about the mean): the variance, a measure of dispersion



Variance of a PDF

In Lecture 2, we looked at the standard deviation

$$s = \sqrt{\frac{\sum_{i=1}^{N} (Y_i - \bar{Y})^2}{N - 1}}$$

i.e. we sum the squared deviations of each observation of Y from the sample mean, divide by the number of observations minus 1, and take the square root

In empirical economics we normally replace $s$ with $\sigma_x$ as notation for the standard deviation of variable X

The variance is defined as

$$\mathrm{var}(X) = \sigma_x^2$$

that is, the square of the standard deviation
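The link between the standard deviation and the variance can be sketched with a small calculation. The data below are invented purely for illustration, and the calculation uses the sample formula with the N - 1 divisor, which is also what Python's `statistics.stdev` and `statistics.variance` use.

```python
import math
import statistics

# Hypothetical sample, invented purely for illustration.
y = [2.0, 4.0, 4.0, 5.0, 7.0, 8.0]

n = len(y)
y_bar = sum(y) / n

# Sum of squared deviations from the sample mean, divided by N - 1.
sample_variance = sum((yi - y_bar) ** 2 for yi in y) / (n - 1)

# The standard deviation is the square root of the variance.
sample_sd = math.sqrt(sample_variance)

# Compare the hand calculation with the standard library.
print(sample_sd, statistics.stdev(y))
print(sample_variance, statistics.variance(y))
```

Squaring the standard deviation recovers the variance, so the two measures carry the same information in different units.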



We won't go into details on computing the variance here (see Gujarati); it is not very intuitive, so we'll move on.

Several r.v.s may have the same expected value but different variances.

Geometrically: insert or see Gujarati Fig 3.2

See Gujarati on the properties of variance. The key one for this particular class is the first: the variance of a constant is zero (by definition, a constant has no variability)
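For readers who do want to see the two points above in numbers, here is a minimal sketch. It uses the standard textbook formula for the variance of a discrete r.v. (probability-weighted squared deviations from the mean), which the slides leave to Gujarati. The `extremes` distribution is a made-up example, and the helper names are illustrative.

```python
from fractions import Fraction

def mean(pmf):
    """Expected value of a discrete r.v. given as {value: probability}."""
    return sum(x * p for x, p in pmf.items())

def variance(pmf):
    """Population variance: probability-weighted squared deviations from the mean."""
    mu = mean(pmf)
    return sum((x - mu) ** 2 * p for x, p in pmf.items())

# Number of heads in two fair coin tosses.
two_coins = {0: Fraction(1, 4), 1: Fraction(1, 2), 2: Fraction(1, 4)}
# Hypothetical r.v. with the same mean but all its weight at the extremes.
extremes = {0: Fraction(1, 2), 2: Fraction(1, 2)}
# A constant: the value 1 with certainty.
constant = {1: Fraction(1)}

for name, pmf in [("two_coins", two_coins), ("extremes", extremes), ("constant", constant)]:
    print(name, mean(pmf), variance(pmf))
```

All three have an expected value of 1, but the spread around that value differs, and the constant has no spread at all.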

The final 3 concepts today link to Lec 4.



In Gujarati, skip until we get to the coefficient of variation

  this is a measure of relative variation: the ratio of the standard deviation to the mean of an r.v.

  because the mean and standard deviation are in the same units of measurement, the coefficient of variation is independent of units

  therefore it is useful for comparison across different r.v.s
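Unit independence is easy to demonstrate with a sketch. The income figures below are invented for illustration, and the helper function is a hypothetical name; the calculation simply divides the standard deviation by the mean.

```python
import statistics

def coefficient_of_variation(values):
    """Standard deviation divided by the mean (population versions)."""
    return statistics.pstdev(values) / statistics.fmean(values)

# Hypothetical incomes in pounds, invented for illustration.
income_pounds = [18_000.0, 22_000.0, 25_000.0, 31_000.0, 40_000.0]
# The same incomes expressed in pence.
income_pence = [100 * v for v in income_pounds]

cv_pounds = coefficient_of_variation(income_pounds)
cv_pence = coefficient_of_variation(income_pence)
print(cv_pounds, cv_pence)
```

Changing the unit multiplies both the mean and the standard deviation by 100, so the ratio is unchanged (up to floating-point rounding).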

Then there is the covariance

  this is a special kind of EV that measures how two variables vary or move together

  it can be positive, negative or zero

Again, the computation of the covariance is not too intuitive and we won't be focussing on it here, other than its role in calculating the correlation coefficient, but you will in future applied classes



The last but one thing we are interested in from Gujarati Ch3 is the (population) correlation coefficient

  this is found by taking the covariance of 2 r.v.s and dividing by the product of their standard deviations

Thus, it is a measure of linear association between two variables; this is the focus of our next lecture
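To make the definition concrete, here is a minimal sketch built on the two-coin experiment. The choice of variables is an assumption made for this example only: X is 1 if the first coin shows a head (0 otherwise), and Y is the total number of heads. The covariance is computed as the expected product of deviations from the means.

```python
import math
from fractions import Fraction
from itertools import product

# Four equally likely outcomes of two fair coin tosses.
outcomes = list(product("HT", repeat=2))
p = Fraction(1, len(outcomes))

# X = 1 if the first coin is a head (else 0); Y = total number of heads.
pairs = [(int(o[0] == "H"), o.count("H")) for o in outcomes]

def expect(g):
    """Expected value of g(x, y) over the equally likely outcomes."""
    return sum(g(x, y) * p for x, y in pairs)

mu_x, mu_y = expect(lambda x, y: x), expect(lambda x, y: y)
cov_xy = expect(lambda x, y: (x - mu_x) * (y - mu_y))
var_x = expect(lambda x, y: (x - mu_x) ** 2)
var_y = expect(lambda x, y: (y - mu_y) ** 2)

# Correlation: covariance divided by the product of the standard deviations.
rho = float(cov_xy) / (math.sqrt(var_x) * math.sqrt(var_y))
print(cov_xy, rho)
```

The covariance is positive because a head on the first coin raises the total number of heads. Dividing by the two standard deviations rescales it to lie between -1 and 1.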

Finally, if you skip forward to the section titled "From the population to the sample" and read about the sample mean, variance, covariance and correlation coefficient, this will be our focus in Lecture 6

