7/30/2019 Ec203I2EE Lec3 Kt
Lecture 3: Basic probability theory
Lectures 1 and 2 (and Labs 1 and 2): basic work with economic data
Very valuable!!
However, economists often want to use
more sophisticated statistical techniques to
examine relationships between economic
variables in more detail
Remainder of class: work towards basic single variable regression analysis
First step: basic probability theory
Gujarati Chapters 2 and 3: read all; here, key points
EC203 Introduction to Empirical Economics. KT. 1
Random variables
A random (or statistical) experiment is a process leading to at least two possible outcomes
There will be uncertainty as to which
outcome will occur
Example: rolling a fair dice
observe the number shown uppermost
possible outcomes: 1, 2, 3, 4, 5 or 6
For a random experiment, we know in
advance all the possible outcomes
What we do NOT know in advance is which outcome will occur in any particular experiment
Sample space (population)
The set of all possible outcomes of the
experiment: here, {1,2,3,4,5,6}
A random variable is a variable whose
(numerical) value is determined by the
outcome of a random experiment.
Example: toss two fair coins
Let H denote a head, T a tail
There are four possible outcomes:
{HH, HT, TH, TT}
Now consider a variable X, defined as the number of heads observed when two fair coins are tossed
The situation is as follows
Possible outcome    Number of heads
TT                  0
TH                  1
HT                  1
HH                  2
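The mapping from outcomes to the number of heads can be sketched in a few lines of Python. This is only an illustration: the names `sample_space` and `number_of_heads` are not from the lecture.

```python
from itertools import product

# Sample space for tossing two coins: every ordered pair of H and T.
sample_space = ["".join(pair) for pair in product("HT", repeat=2)]

# The random variable X assigns each outcome its number of heads.
number_of_heads = {outcome: outcome.count("H") for outcome in sample_space}

for outcome, x in number_of_heads.items():
    print(outcome, x)
```

Four outcomes map onto only three distinct values of X, because TH and HT both give one head.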
The variable X, the number of heads, is a random (or stochastic) variable and has 3 possible values:
X
0
1
2
Random variables (r.v.s) may be discrete or continuous:
A discrete r.v. takes on only a finite (or countably infinite) number of particular values
A continuous r.v. can take on any value in some interval of values
Both the roll of the dice and the toss of the coins we have looked at are discrete r.v.s
An example of a continuous r.v. is the
rainfall falling in Glasgow per year.
Focussing initially on discrete r.v.s makes
concepts easier to grasp.
Probability
Logical reasoning and/or empirical evidence may give us some feeling of how likely different outcomes are
E.g. throw of the dice: outcomes {1,2,3,4,5,6} are all equally likely
Basic coin toss, H or T: both outcomes equally likely
For the two coin toss example,
we can expect the value 1 to occur with twice the likelihood of value 0
the values 0 and 2 are equally likely
Let's use the notation Pr for the probability of a particular outcome; we can now deduce some probabilities for this example
Possible outcome    Pr of outcome
0 heads             1/4
1 head              2/4
2 heads             1/4
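These probabilities follow from counting equally likely outcomes, which the sketch below does with exact fractions. It assumes fair coins, as in the example; the variable names are illustrative only.

```python
from collections import Counter
from fractions import Fraction
from itertools import product

# Four equally likely outcomes of tossing two fair coins.
outcomes = list(product("HT", repeat=2))

# Count how many outcomes give each number of heads.
counts = Counter(outcome.count("H") for outcome in outcomes)

# Probability = favourable outcomes / total outcomes.
pmf = {x: Fraction(n, len(outcomes)) for x, n in sorted(counts.items())}

for x, p in pmf.items():
    print(f"Pr({x} heads) = {p}")

total = sum(pmf.values())
print("Sum of probabilities:", total)
```

`Fraction` reduces 2/4 to 1/2 automatically, so the printed value for one head is 1/2.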
Or, we can write Pr(1 head) = 2/4 (or 1/2), etc.
Note that
the probabilities sum to one, as we are distributing a total of 1
the value of 1 corresponds to a certainty: we know that one of the outcomes will occur
the outcomes are mutually exclusive (i.e. they cannot occur at the same time)
Note that this classical definition of
probability is what we call an a priori
definition
the probabilities are derived from purely deductive reasoning
However, what if the outcomes of an
experiment are not finite and cannot be
stated with certainty?
E.g. what is the probability that GDP will
rise by a certain amount?
Relative frequency or empirical definition of probability
Distinguish between absolute and relative frequency
the absolute frequency is the number of occurrences of a given event
e.g. 10 students in this class get an exam mark of 70% or more
if there are 50 students in the class, the relative frequency of the event of achievement of first class marks is 10/50 = 1/5
The frequency distribution of marks achieved by all 50 students in the class would show the different marking bands and how students are distributed across them, in both relative and absolute terms.
Can we treat relative frequencies as
probabilities?
Yes, provided the number of observations that the relative frequencies are based on is reasonably large
This is the empirical, or relative frequency, definition of probability
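A small simulation can illustrate why the number of observations matters. The sketch below assumes a fair dice simulated with Python's `random` module; the function name and the fixed seed are choices made here, not part of the lecture.

```python
import random

def relative_frequency_of_six(n_rolls, seed=0):
    """Relative frequency of a six in n_rolls simulated rolls of a fair dice."""
    rng = random.Random(seed)  # fixed seed so the run is repeatable
    sixes = sum(1 for _ in range(n_rolls) if rng.randint(1, 6) == 6)
    return sixes / n_rolls

# The a priori probability of a six is 1/6, roughly 0.1667.
for n in (10, 100, 10_000, 100_000):
    print(n, round(relative_frequency_of_six(n), 4))
```

With only a handful of rolls the relative frequency can sit far from 1/6; with many rolls it settles close to it.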
See Gujarati on properties of probabilities, but ignore Bayes' Theorem
Probability of random variables
First, discrete r.v.s
a discrete r.v. takes only a finite number of values
If X is an r.v. with distinct values $x_1, x_2, \dots, x_n$, the function f defined by
$f(X = x_i) = P(X = x_i)$, $i = 1, 2, \dots, n$
$= 0$ if $x \neq x_i$
is called the probability mass function
(PMF) or probability function (PF)
Note that
$0 \le f(x_i) \le 1$
i.e. the probability of X taking the value of $x_i$ lies between 0 and 1, and
$\sum_{i=1}^{n} f(x_i) = 1$
From slide 5
Number of heads (X)    PF f(X)
0 heads                1/4
1 head                 1/2
2 heads                1/4
Sum                    1
Geometrically? (Insert the PMF of the number of heads in a two coin toss; see Gujarati Fig 2.2)
Expected value of a random variable, X
$E(X) = \sum_{i=1}^{n} x_i f(x_i)$
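As a worked instance of the expected value, each value of X is multiplied by its probability and the products are summed. The sketch below applies this to the two coin toss PMF from the earlier slides; the function name `expected_value` is illustrative.

```python
from fractions import Fraction

def expected_value(pmf):
    """Sum of x_i * f(x_i) over all values x_i of a discrete r.v."""
    return sum(x * p for x, p in pmf.items())

# PMF of the number of heads in a toss of two fair coins.
heads_pmf = {0: Fraction(1, 4), 1: Fraction(1, 2), 2: Fraction(1, 4)}

# 0 * 1/4 + 1 * 1/2 + 2 * 1/4
print(expected_value(heads_pmf))
```

The result is 1: on average, one head per toss of two coins.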
Probability distribution of a continuous
r.v
Instead of a probability mass function we
have a probability density function
(PDF)
Because a continuous r.v. can take an infinite number of values, the probability of it taking any one exact value is zero, so probability is always measured over an interval
Formally, this means use of the integral rather than the summation operator (used for discrete r.v.s)
$P(x_1 < X < x_2) = \int_{x_1}^{x_2} f(x)\,dx$
for all $x_1 < x_2$
(Insert diagram and/or see Gujarati Fig 2.3)
Note that $P(X = x_i) = 0$: the probability that a continuous r.v. takes any single exact value is zero
Properties of a PDF
1. Total area under the curve f(x) is 1
2. $P(x_1 < X < x_2)$ is the area under the curve between $x_1$ and $x_2$
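To make the area interpretation concrete, the sketch below approximates probabilities numerically for a hypothetical density, f(x) = 2x on the interval [0, 1]. This density is an assumption chosen only because its areas are easy to check by hand; it does not come from the lecture or from Gujarati.

```python
def pdf(x):
    """Hypothetical density: f(x) = 2x on [0, 1], zero elsewhere."""
    return 2 * x if 0 <= x <= 1 else 0.0

def probability(x1, x2, steps=10_000):
    """Approximate P(x1 < X < x2) as the area under pdf (midpoint rule)."""
    width = (x2 - x1) / steps
    return sum(pdf(x1 + (k + 0.5) * width) for k in range(steps)) * width

print(round(probability(0, 1), 6))      # total area under the curve
print(round(probability(0.2, 0.5), 6))  # area between 0.2 and 0.5
print(probability(0.3, 0.3))            # a single point has zero width
```

By hand, the area between 0.2 and 0.5 is 0.5² - 0.2² = 0.21, and the area over the whole interval is 1.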
Gujarati:
Miss material on cumulative distribution functions for now
Also miss section on multivariate probability density functions (later class)
Statistical independence will also be key in later courses
For now, move on to Chapter 3
Characteristics of probability
distributions
* also referred to as moments of PDFs
Next slide
Moments of PDFs
The first moment of a PDF is the expected value of the random variable it represents
the weighted average of all possible values
where the probabilities of these values serve as weights
also the average or mean value: the population mean value
E.g. throwing a dice
outcomes are {1,2,3,4,5,6}
each with a probability of 1/6
So, EV(X)=1/6+2/6+3/6+4/6+5/6+6/6 =
21/6 = 3.5
Odd, since this is a discrete r.v., with 3.5 not an option? Think of it this way: if someone gave you £1 for each number on the dice (i.e. £6 for the 6, £1 for the 1), then after a large number of rolls you would anticipate receiving £3.50 per roll on average
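The dice calculation and the per-roll interpretation can both be checked with a short sketch. The simulation part assumes a fair dice simulated with Python's `random` module and a fixed seed; neither is part of the lecture.

```python
import random
from fractions import Fraction

# Exact expected value: each face 1..6 weighted by probability 1/6.
expected = sum(Fraction(face, 6) for face in range(1, 7))
print(expected, float(expected))

# Simulated long-run average payout per roll.
rng = random.Random(1)  # fixed seed so the run is repeatable
n_rolls = 100_000
average_payout = sum(rng.randint(1, 6) for _ in range(n_rolls)) / n_rolls
print(round(average_payout, 2))
```

No single roll ever pays 3.5, but the average over many rolls settles close to it.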
Geometrically: (insert diagram or see Gujarati Fig 3.1)
Gujarati: read section on properties of the expected value
Key here is that the expected value is a measure of central tendency of the PDF
Ignore for now section on EV of multivariate PDFs
Our next focus is the second moment of the PDF: the variance, a measure of dispersion
Variance of a PDF
In Lecture 2, we looked at the standard deviation
$s = \sqrt{\frac{\sum_{i=1}^{N} (Y_i - \bar{Y})^2}{N - 1}}$
i.e. we take the deviation of each observation of Y from the sample mean, square it, sum the squared deviations across all observations, divide by the number of observations minus 1, and take the square root
In empirical economics we normally replace s with $\sigma_x$ as notation for the standard deviation of variable X
The variance is defined as
$\mathrm{var}(X) = \sigma_x^2$
that is, the square of the standard deviation
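The Lecture 2 standard deviation can be computed step by step as below. The exam marks are hypothetical values invented for illustration, and the cross-check against Python's `statistics` module is there only to confirm that the steps match the usual N - 1 formula.

```python
import math
import statistics

def sample_std(values):
    """Standard deviation with the N - 1 divisor."""
    n = len(values)
    mean = sum(values) / n
    # Sum the squared deviations of each observation from the sample mean.
    squared_deviations = sum((y - mean) ** 2 for y in values)
    return math.sqrt(squared_deviations / (n - 1))

marks = [52, 61, 48, 70, 65, 58]  # hypothetical exam marks

s = sample_std(marks)
variance = s ** 2  # the variance is the square of the standard deviation

print(round(s, 3), round(variance, 3))
print(math.isclose(s, statistics.stdev(marks)))
```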
We won't go into details on computing the variance here (see Gujarati); it is not very intuitive, so we'll move on.
Several r.v.s may have the same expected
value but different variances.
Geometrically: (insert diagram or see Gujarati Fig 3.2)
See Gujarati on the properties of variance.
The key one for this particular class is the first: the variance of a constant is zero (by definition, a constant has no variability)
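A small sketch can show three distributions that share an expected value of 3.5 but differ in variance. It uses the standard population definition, the probability-weighted average of squared deviations from the mean, which the lecture leaves to Gujarati; the two non-dice distributions are made up for illustration.

```python
from fractions import Fraction

def mean(pmf):
    """Expected value of a discrete r.v. given as {value: probability}."""
    return sum(x * p for x, p in pmf.items())

def variance(pmf):
    """Probability-weighted average of squared deviations from the mean."""
    mu = mean(pmf)
    return sum((x - mu) ** 2 * p for x, p in pmf.items())

fair_dice = {face: Fraction(1, 6) for face in range(1, 7)}
three_or_four = {3: Fraction(1, 2), 4: Fraction(1, 2)}  # hypothetical r.v.
constant = {Fraction(7, 2): Fraction(1)}                # always 3.5

for name, pmf in [("fair dice", fair_dice),
                  ("3 or 4", three_or_four),
                  ("constant", constant)]:
    print(name, mean(pmf), variance(pmf))
```

All three have mean 7/2; the variances are 35/12, 1/4 and 0, the last being the constant case.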
The final 3 concepts today link to Lec 4.
In Gujarati, skip until we get to the coefficient of variation
this is a measure of relative variation: the ratio of the standard deviation of an r.v. to its mean
because the mean and standard deviation are in the same units of measurement, the coefficient of variation is independent of units
therefore it is useful for comparison across different r.v.s
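The unit independence can be seen by computing the coefficient of variation for the same data in two units. The rainfall figures below are hypothetical, not real Glasgow data, and the sketch uses sample statistics from Python's `statistics` module rather than population values.

```python
import statistics

def coefficient_of_variation(values):
    """Ratio of the standard deviation to the mean (sample versions)."""
    return statistics.stdev(values) / statistics.mean(values)

# Hypothetical annual rainfall figures in millimetres.
rainfall_mm = [1100.0, 1250.0, 980.0, 1320.0, 1150.0]
# The same observations converted to inches (1 inch = 25.4 mm).
rainfall_inches = [x / 25.4 for x in rainfall_mm]

cv_mm = coefficient_of_variation(rainfall_mm)
cv_inches = coefficient_of_variation(rainfall_inches)

print(round(cv_mm, 4), round(cv_inches, 4))
```

The standard deviation and the mean both shrink by the same factor when the units change, so their ratio is unchanged.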
Then there is the covariance
this is a special kind of EV that measures how two variables vary or move together
it can be positive, negative or zero
Again, the computation of the covariance is not too intuitive and we won't be focussing on it here, other than its role in calculating the correlation coefficient, but you will in future applied classes
The last but one thing we are interested in in Gujarati Ch3 is the (population) correlation coefficient
this is found by taking the covariance of 2 r.v.s and dividing by the product of their standard deviations
Thus, it is a measure of linear association between two variables. This is the focus of our next lecture
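The recipe of covariance divided by the product of the standard deviations can be sketched as follows. The sketch assumes a small population of equally likely (x, y) pairs, and the three data sets are invented to show the positive, negative and zero cases.

```python
import math

def correlation(pairs):
    """Population correlation for equally likely (x, y) pairs."""
    n = len(pairs)
    mean_x = sum(x for x, _ in pairs) / n
    mean_y = sum(y for _, y in pairs) / n
    # Covariance: average product of deviations from the two means.
    cov = sum((x - mean_x) * (y - mean_y) for x, y in pairs) / n
    sd_x = math.sqrt(sum((x - mean_x) ** 2 for x, _ in pairs) / n)
    sd_y = math.sqrt(sum((y - mean_y) ** 2 for _, y in pairs) / n)
    return cov / (sd_x * sd_y)

linear_up = [(1, 3), (2, 5), (3, 7), (4, 9)]         # y = 2x + 1
linear_down = [(1, 9), (2, 7), (3, 5), (4, 3)]       # y = 11 - 2x
curved = [(-2, 4), (-1, 1), (0, 0), (1, 1), (2, 4)]  # y = x squared

for name, data in [("up", linear_up), ("down", linear_down), ("curved", curved)]:
    print(name, round(correlation(data), 6))
```

The curved data set is a reminder that the coefficient measures linear association only: y is completely determined by x there, yet the correlation is zero.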
Finally, skip forward to the section titled "From the population to the sample" and read about the sample mean, variance, covariance and correlation coefficient; this will be our focus in Lecture 6