Econometrics I
Topic 1. Introduction to Econometrics
Katarzyna Bech-Wysocka
SGH Warsaw School of Economics
Katarzyna Bech-Wysocka, Econometrics I, Introduction
Semester schedule and course structure
Topic
1. Introduction to Econometrics
2. Least Squares Method
3. Significance of explanatory variables
4. Model specification. Nonlinearity.
5. Verification: collinearity and normality
6. Verification: heteroskedasticity
7. Verification: autocorrelation
8. Dynamic models
9. Stationarity and cointegration
10. Econometric forecasting
11. Qualitative variable models
12. Endogeneity
13. Instrumental variables
14. Tests with instrumental variables
Lecture (30)
Classes (30)
Attendance at classes is mandatory; attendance at lectures is more than recommended.
Components of the final grade:
- 2 tests during the semester, 6 points each
- Project + in-class participation: 6 points
- Written final exam: 32 points
Grades:
Total points (∑) | Grade
(0, 25) | 2.0
[25, 30) | 3.0
[30, 35) | 3.5
[35, 40) | 4.0
[40, 45) | 4.5
[45, 50] | 5.0
Basic definitions & concepts
What is "Econometrics"?
ECONOMETRICS
Application of mathematical and statistical techniques to economics in the study of problems, the analysis of data, and the development and testing of theories and models (i.e. maths and statistics used in economics).
Interesting fact: the term "Econometrics" was introduced to the literature by Paweł Ciompa in his work "Zarys ekonometryi i teorya buchalteryi", published in Lviv in 1910.
The fathers of modern econometrics are the first Nobel Prize winners in Economics: Ragnar Frisch and Jan Tinbergen.
WHAT IS ECONOMETRICS USED FOR?
• To verify the existing economic theories
• To quantify the relationships between observable data
• To forecast (either the future or out-of-sample outcomes)
Econometrics is all about how we can use theory and data from economics, business and the social sciences, along with tools from statistics, to answer "how much" questions.
Some examples of research questions
1. A city council ponders the question of how much violent crime will be reduced if an additional million dollars is spent putting uniformed police on the street.
2. The owner of a local Pizza Hut must decide how much advertising space to purchase in the local newspaper, and thus must estimate the relationship between advertising and sales.
3. Louisiana State University must estimate how much enrollment will fall if tuition is raised by $300 per semester, and thus whether its revenue from tuition will rise or fall.
4. The CEO of Procter & Gamble must estimate how much demand there will be in ten years for the detergent Tide, and how much to invest in new plant and equipment.
5. You must decide how much of your savings will go into a stock fund, and how much into the money market. This requires you to predict the level of economic activity and interest rates.
What is the difference between Econometrics and Economics?
Econometricians and economists understand the word "model" differently.
For economists, a model is a given relationship between certain objects, as in the simple demand/supply model, where demand is simply a function of the price of a certain good:
$$D = a + bP$$
An econometrician would see this relationship as:
$$D_i = \alpha + \beta P_i + \varepsilon_i, \quad \text{where } \varepsilon_i \text{ is a random error term,}$$
and they would use this equation to estimate how price and demand are related.
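To make the econometrician's view concrete, here is a minimal Python sketch. It assumes hypothetical "true" values α = 100 and β = -2 and a normally distributed error term (none of these numbers come from the slides), simulates price and demand data, and then recovers the parameters with the simple least squares formulas. Least squares itself is the subject of Topic 2, so treat this only as a preview.

```python
import random

random.seed(0)  # fixed seed so the sketch is reproducible

# Hypothetical "true" parameters (assumptions for illustration only)
ALPHA, BETA = 100.0, -2.0

# Simulate N observations of D_i = ALPHA + BETA * P_i + eps_i
N = 500
prices = [random.uniform(1.0, 20.0) for _ in range(N)]
demand = [ALPHA + BETA * p + random.gauss(0.0, 5.0) for p in prices]

# Simple least squares: slope = sum of cross-deviations / sum of squared deviations
p_bar = sum(prices) / N
d_bar = sum(demand) / N
s_pd = sum((p - p_bar) * (d - d_bar) for p, d in zip(prices, demand))
s_pp = sum((p - p_bar) ** 2 for p in prices)
beta_hat = s_pd / s_pp
alpha_hat = d_bar - beta_hat * p_bar

print(f"alpha_hat = {alpha_hat:.2f}, beta_hat = {beta_hat:.2f}")
```

The estimates should land close to, but not exactly on, the assumed values: the error term is what separates the economist's exact relationship from the econometrician's estimated one.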
Different types of data
• Qualitative vs. Quantitative
• Discrete vs. Continuous
• Flow vs. Stock
• Experimental vs. Non-experimental
• Data might be collected at various levels of aggregation:
  • Microeconomic dataset (individual units: agents, firms, etc.)
  • Macroeconomic dataset (aggregate economic features: GDP, unemployment rates, etc.)
• Which type of dataset answers your research question best?
  • Cross-sectional data: a sample of different units (individuals, firms, etc.) observed at a given point in time
  • Time series data: the same unit (individual, firm, etc.) is observed at different points in time
  • Panel (longitudinal) data: the same cross-section of units is observed over time
Before choosing the most appropriate dataset, you must understand what type of economic elements participate in the phenomenon you want to analyse.
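The three dataset layouts (cross-section, time series, panel) differ only in how many units and how many time periods they cover. The sketch below illustrates this with made-up wage records; the names, years and values are hypothetical, and the classifier is a rough rule of thumb that does not check whether the same units appear in every period.

```python
# Hypothetical toy records: (unit, year, wage). All values are made up.
cross_section = [("Anna", 2020, 31.0), ("Ben", 2020, 27.5), ("Cara", 2020, 40.2)]
time_series = [("Anna", 2018, 28.0), ("Anna", 2019, 29.4), ("Anna", 2020, 31.0)]
panel = [
    ("Anna", 2019, 29.4), ("Anna", 2020, 31.0),
    ("Ben", 2019, 26.0), ("Ben", 2020, 27.5),
]


def dataset_type(records):
    """Classify a dataset by how many units and periods it covers."""
    units = {unit for unit, _, _ in records}
    periods = {period for _, period, _ in records}
    if len(units) > 1 and len(periods) == 1:
        return "cross-section"
    if len(units) == 1 and len(periods) > 1:
        return "time series"
    if len(units) > 1 and len(periods) > 1:
        return "panel"
    return "single observation"


for name, data in [("cross_section", cross_section),
                   ("time_series", time_series),
                   ("panel", panel)]:
    print(name, "->", dataset_type(data))
```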
Example of cross-section
Examples of research questions?
Example of time-series
Examples of research questions?
Example of panel data
Online Data Collections and Sources
A good place to start, the Economics Network's website: http://www.economicsnetwork.ac.uk/links/data_free.htm
  Provides a wide array of data sources in an organized fashion
The Econometrics Journal Links: http://www.feweb.vu.nl/econometriclinks/#data
  Financial, social, etc.; US and world
The World Bank's website: http://data.worldbank.org/topic
  Agriculture, education, climate change, poverty, etc.; by country
The OECD's website: http://www.oecd-ilibrary.org/statistics
  OECD regional, productivity, tax, energy, trade statistics, etc.; OECD and international
The CIC website: https://pwt.sas.upenn.edu
  Penn World Table, international comparisons of production, income and prices
The American Economic Association's website: http://rfe.org/showCat.php?cat_id=2
  Macro, regional, financial, etc.; US and world
The FRED's website: http://research.stlouisfed.org/fred2/categories/
  National accounts, labour market, population, finance, banking, production, etc.; US and international
The UK Data Archive's website: http://ukdataservice.ac.uk and http://data-archive.ac.uk/deposit/use/
Econometric model specification
What does an econometric model look like?
Theoretical model:
$$y_i = \beta_0 + \beta_1 x_{1i} + \cdots + \beta_K x_{Ki} + \varepsilon_i$$
• $y_i$ - dependent variable
• $x_{1i}, \ldots, x_{Ki}$ - regressors (explanatory variables)
• $\varepsilon_i$ - error (disturbance) term
• $\beta_0, \beta_1, \ldots, \beta_K$ - population parameters
• the subscript $i$ indicates the sample observation. Convention: $i$ in cross-section, $t$ in time series
• $N$ is the sample size in cross-section, $T$ in time series
• $K$ is the number of regressors, $K+1$ the number of betas to estimate
• This is an example of a linear model. Later in the semester you will see some nonlinear models.
Empirical model (after you estimate its parameters):
$$\hat{y}_i = \hat{\beta}_0 + \hat{\beta}_1 x_{1i} + \cdots + \hat{\beta}_K x_{Ki}$$
Econometric model specification: matrix notation
Typically it is more convenient to employ matrix and vector notation.
Theoretical model:
$$y = X\beta + \varepsilon$$
• $y$ ($N \times 1$) - vector of dependent variable sample observations
• $X$ ($N \times (K+1)$) - matrix of regressors
• $\varepsilon$ ($N \times 1$) - vector of error (disturbance) terms
• $\beta$ ($(K+1) \times 1$) - vector of parameters
• dimensions: $N$ sample size, $K+1$ number of parameters.
Empirical model (after you estimate its parameters):
$$\hat{y} = X\hat{\beta}$$
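Written out in full, the matrix form simply stacks the N observation-level equations on top of each other. The display below is a sketch that assumes the first column of X is a column of ones carrying the intercept (a common convention that the slide does not state explicitly), with the regressor index first and the observation index second.

```latex
\underbrace{\begin{pmatrix} y_1 \\ y_2 \\ \vdots \\ y_N \end{pmatrix}}_{y\ (N \times 1)}
=
\underbrace{\begin{pmatrix}
1 & x_{11} & \cdots & x_{K1} \\
1 & x_{12} & \cdots & x_{K2} \\
\vdots & \vdots & \ddots & \vdots \\
1 & x_{1N} & \cdots & x_{KN}
\end{pmatrix}}_{X\ (N \times (K+1))}
\underbrace{\begin{pmatrix} \beta_0 \\ \beta_1 \\ \vdots \\ \beta_K \end{pmatrix}}_{\beta\ ((K+1) \times 1)}
+
\underbrace{\begin{pmatrix} \varepsilon_1 \\ \varepsilon_2 \\ \vdots \\ \varepsilon_N \end{pmatrix}}_{\varepsilon\ (N \times 1)}
```

Row i of this system reproduces the scalar model for observation i, which is why the two notations are interchangeable.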
Approach to econometric model building
There are certain steps in econometric modeling:
1. Setting up a research question
2. Choosing a functional form and the set of explanatory variables
3. Collecting the data
4. Estimating the model
5. Verification process
6. Application
Statistical Primer
Random Variables and Probabilities
RANDOM VARIABLE
A variable whose value is unknown until it is observed.
A discrete random variable can take only a limited, or countable, number of values.
• Example: an indicator variable taking the value one if yes, or zero if no.
A random variable that can have any value is treated as a continuous random variable.
We summarize the probabilities of possible outcomes using a probability mass function (for discrete RVs) or probability density function (for continuous RVs):
PROBABILITY MASS FUNCTION
For a discrete random variable X the value of the probability mass function f(x) is the probability that the random variable X takes the value x, f(x) = P(X = x). It is obvious that $\sum_x f(x) = 1$.
PROBABILITY DENSITY FUNCTION
For continuous random variables, since P(X = x) = 0, the probability density function can be interpreted as the relative probability, so that
$$P(a \le X \le b) = \int_a^b f(x)\,dx \quad \text{and} \quad \int_{-\infty}^{+\infty} f(x)\,dx = 1.$$
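The two defining properties (a pmf sums to one, a pdf integrates to one) can be checked numerically. The sketch below uses two standard textbook examples that are not taken from the slides: a fair six-sided die for the discrete case and a uniform distribution on [0, 2] for the continuous case. The helper `integrate` is a simple midpoint rule written for this illustration.

```python
from fractions import Fraction

# pmf of a fair six-sided die (textbook example, not from the slides)
pmf = {x: Fraction(1, 6) for x in range(1, 7)}
total_mass = sum(pmf.values())                         # sum of f(x) over all x
p_even = sum(p for x, p in pmf.items() if x % 2 == 0)  # P(X is even)


def pdf_uniform(x, low=0.0, high=2.0):
    """Density of a uniform random variable on [low, high]."""
    return 1.0 / (high - low) if low <= x <= high else 0.0


def integrate(f, a, b, steps=10_000):
    """Midpoint-rule approximation of the integral of f over [a, b]."""
    h = (b - a) / steps
    return h * sum(f(a + (k + 0.5) * h) for k in range(steps))


total_area = integrate(pdf_uniform, 0.0, 2.0)   # integral of the pdf over its support
p_interval = integrate(pdf_uniform, 0.5, 1.5)   # P(0.5 <= X <= 1.5)

print(total_mass, p_even, round(total_area, 6), round(p_interval, 6))
```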
CUMULATIVE DISTRIBUTION FUNCTION
The cdf of the random variable X, denoted F(x), gives the probability that X is less than or equal to a specific value x:
$$F(x) = P(X \le x).$$
The cumulative probability can be calculated for any x between $-\infty$ and $+\infty$.
The First Fundamental Theorem of Calculus states:
$$\int_a^b f(x)\,dx = F(b) - F(a)$$
or
$$F'(x) = f(x).$$
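Both statements linking the pdf and the cdf can be verified numerically. The sketch below assumes an exponential distribution with rate 1, chosen only because its pdf and cdf have simple closed forms; it is not an example from the slides. It compares the integral of the pdf with the difference of cdf values, and a numerical derivative of the cdf with the pdf.

```python
import math


def pdf(x):
    """Density of an exponential distribution with rate 1 (illustrative choice)."""
    return math.exp(-x) if x >= 0 else 0.0


def cdf(x):
    """Closed-form cdf of the same distribution: F(x) = 1 - exp(-x) for x >= 0."""
    return 1.0 - math.exp(-x) if x >= 0 else 0.0


def integrate(f, a, b, steps=100_000):
    """Midpoint-rule approximation of the integral of f over [a, b]."""
    h = (b - a) / steps
    return h * sum(f(a + (k + 0.5) * h) for k in range(steps))


a, b = 0.5, 2.0
area = integrate(pdf, a, b)   # integral of f from a to b
diff = cdf(b) - cdf(a)        # F(b) - F(a)

# Central-difference approximation of F'(1), to compare with f(1)
h = 1e-6
slope = (cdf(1.0 + h) - cdf(1.0 - h)) / (2 * h)

print(round(area, 6), round(diff, 6), round(slope, 6), round(pdf(1.0), 6))
```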
Joint, Marginal and Conditional Probabilities
JOINT PROBABILITY
A joint probability is about the probability of two events occurring simultaneously:
$$f_{X,Y}(x, y) = P(X = x, Y = y).$$
MARGINAL PROBABILITY
Given a joint probability density function, we can obtain the probability distributions of individual random variables, which are also known as marginal distributions:
$$f_X(x) = P(X = x) = \sum_y f_{X,Y}(x, y).$$
For continuous RVs: $f_X(x) = \int f(x, y)\,dy$.
CONDITIONAL PROBABILITY
The conditional probability is the probability of the outcome X = x given that Y = y has occurred. The effect of the conditioning is to reduce the set of possible outcomes. The conditional probability that the random variable X takes the value x given that Y = y is written P(X = x | Y = y). This conditional probability is given by the conditional pdf f(x|y):
$$f(x|y) = P(X = x \mid Y = y) = \frac{f_{X,Y}(x, y)}{f_Y(y)}.$$
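A small numerical illustration of joint, marginal and conditional probabilities follows. The joint pmf below is hypothetical (the four probabilities are made up so that they sum to one); exact fractions are used to avoid rounding.

```python
from fractions import Fraction

# Hypothetical joint pmf f_{X,Y}(x, y) for X in {0, 1} and Y in {0, 1}
joint = {
    (0, 0): Fraction(1, 10), (0, 1): Fraction(3, 10),
    (1, 0): Fraction(2, 10), (1, 1): Fraction(4, 10),
}


def marginal_x(x):
    """f_X(x): sum the joint pmf over all values of y."""
    return sum(p for (xi, _), p in joint.items() if xi == x)


def marginal_y(y):
    """f_Y(y): sum the joint pmf over all values of x."""
    return sum(p for (_, yi), p in joint.items() if yi == y)


def conditional_x_given_y(x, y):
    """f(x|y) = f_{X,Y}(x, y) / f_Y(y)."""
    return joint[(x, y)] / marginal_y(y)


print(marginal_x(0), marginal_y(1), conditional_x_given_y(1, 1))
```

Conditioning on Y = 1 restricts attention to the outcomes (0, 1) and (1, 1) and rescales their probabilities so that they sum to one again.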
Statistical Independence
INDEPENDENCE
Two random variables are statistically independent if the conditional probability that X = x given that Y = y is the same as the unconditional probability that X = x:
$$f(x|y) = \frac{f_{X,Y}(x, y)}{f_Y(y)} = f_X(x).$$
We can also say that X and Y are statistically independent if their joint pdf factors into the product of their marginal pdfs:
$$f_{X,Y}(x, y) = f_X(x)\, f_Y(y).$$
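The factorization criterion is easy to test for a discrete joint pmf: compute both marginals and check every cell. The sketch below does this for two hypothetical tables, one built from two fair coins tossed separately and one with made-up probabilities that do not factor. The function name `is_independent` is introduced only for this illustration.

```python
from fractions import Fraction
from itertools import product


def is_independent(joint):
    """Return True if joint(x, y) == f_X(x) * f_Y(y) for every pair (x, y)."""
    xs = {x for x, _ in joint}
    ys = {y for _, y in joint}
    f_x = {x: sum(joint.get((x, y), 0) for y in ys) for x in xs}
    f_y = {y: sum(joint.get((x, y), 0) for x in xs) for y in ys}
    return all(joint.get((x, y), 0) == f_x[x] * f_y[y]
               for x, y in product(xs, ys))


# Two fair coins tossed separately: every cell has probability 1/4
independent_joint = {(x, y): Fraction(1, 4) for x, y in product((0, 1), repeat=2)}

# Hypothetical joint pmf that does not factor into its marginals
dependent_joint = {
    (0, 0): Fraction(1, 10), (0, 1): Fraction(3, 10),
    (1, 0): Fraction(2, 10), (1, 1): Fraction(4, 10),
}

print(is_independent(independent_joint), is_independent(dependent_joint))
```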
Expectations
EXPECTED VALUE
The mean of a random variable is given by its mathematical expectation. If X is a discrete random variable, then the mathematical expectation, or expected value, of X is:
$$\mu = E(X) = \sum_x x f(x).$$
For continuous RVs: $\mu = E(X) = \int x f(x)\,dx$.
The expected value of the random variable is the average value that occurs in many repeated trials of an experiment.
Note that population mean ($\mu$) $\neq$ sample average ($\bar{x}$).
Many economic questions are formulated in terms of conditional expectation, or the conditional mean. For a discrete random variable the conditional expected value is:
$$\mu_{X|Y} = E(X \mid Y = y) = \sum_x x f(x|y).$$
For continuous RVs: $\mu_{X|Y} = E(X \mid Y = y) = \int x f(x|y)\,dx$.
RULES:
• If g(X) is a function of the random variable X, then g(X) is also random.
• If X is a discrete random variable, then the expected value of g(X) is obtained using:
$$E[g(X)] = \sum_x g(x) f(x).$$
For example: if a is a constant, then g(X) = aX is a function of X, and:
$$E(aX) = aE(X).$$
Similarly, if a and b are constants, then:
$$E(aX + b) = aE(X) + b.$$
Also, if $g_1(X)$ and $g_2(X)$ are functions of X, then:
$$E[g_1(X) + g_2(X)] = E[g_1(X)] + E[g_2(X)].$$
Remember the phrase "the expected value of a sum is the sum of the expected values".
• Law of Iterated Expectations:
$$E(X) = E[E(X|Y)],$$
most commonly used in $E(XY) = E_X[X\,E(Y|X)]$.
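The Law of Iterated Expectations can be confirmed by brute force on a small discrete example. The joint pmf below is hypothetical (X takes the values 1 and 2, Y the values 0 and 1); the sketch computes both sides of each identity with exact fractions.

```python
from fractions import Fraction

# Hypothetical joint pmf of (X, Y); probabilities are made up and sum to one
joint = {
    (1, 0): Fraction(1, 10), (1, 1): Fraction(3, 10),
    (2, 0): Fraction(2, 10), (2, 1): Fraction(4, 10),
}
xs = sorted({x for x, _ in joint})
ys = sorted({y for _, y in joint})
f_x = {x: sum(joint[(x, y)] for y in ys) for x in xs}   # marginal pmf of X
f_y = {y: sum(joint[(x, y)] for x in xs) for y in ys}   # marginal pmf of Y

# E(X) directly, and as the average of the conditional means E(X | Y = y)
e_x = sum(x * f_x[x] for x in xs)
e_x_given_y = {y: sum(x * joint[(x, y)] / f_y[y] for x in xs) for y in ys}
e_of_cond = sum(e_x_given_y[y] * f_y[y] for y in ys)

# E(XY) directly, and as E_X[X * E(Y | X)]
e_xy = sum(x * y * p for (x, y), p in joint.items())
e_y_given_x = {x: sum(y * joint[(x, y)] / f_x[x] for y in ys) for x in xs}
e_x_times_cond = sum(x * e_y_given_x[x] * f_x[x] for x in xs)

print(e_x, e_of_cond, e_xy, e_x_times_cond)
```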
Variance & Covariance
VARIANCE
The variance of a discrete or continuous random variable X is:
$$Var(X) = \sigma_X^2 = E[(X - E(X))^2] = \int f(x)(x - \mu)^2\,dx$$
(with the integral replaced by a sum in the discrete case). Simple algebra gives:
$$Var(X) = E(X^2) - [E(X)]^2.$$
The square root of the variance is called the standard deviation.
RULES:
• Let a and b be constants, then: $Var(aX + b) = a^2 Var(X)$.
• $Var(X + Y) = Var(X) + Var(Y) + 2\,Cov(X, Y)$.
COVARIANCE
The covariance between X and Y is:
$$Cov(X, Y) = \sigma_{XY} = E[(X - E(X))(Y - E(Y))] = \int\!\!\int (x - \mu_X)(y - \mu_Y) f(x, y)\,dx\,dy.$$
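The variance shortcut and the rule for linear transformations can be checked on a small discrete distribution. The pmf and the constants a and b below are hypothetical; exact fractions keep the comparison free of rounding error.

```python
from fractions import Fraction

# Hypothetical pmf of a discrete random variable X
pmf = {0: Fraction(1, 2), 2: Fraction(1, 4), 6: Fraction(1, 4)}


def expect(g, pmf):
    """E[g(X)] = sum over x of g(x) * f(x)."""
    return sum(g(x) * p for x, p in pmf.items())


def variance(pmf):
    """Var(X) from the definition E[(X - E(X))^2]."""
    mu = expect(lambda x: x, pmf)
    return expect(lambda x: (x - mu) ** 2, pmf)


mu = expect(lambda x: x, pmf)
var_def = variance(pmf)                                # definition
var_short = expect(lambda x: x ** 2, pmf) - mu ** 2    # E(X^2) - [E(X)]^2

# Linear transformation aX + b: shift the support, keep the probabilities
a, b = 2, 7
pmf_lin = {a * x + b: p for x, p in pmf.items()}
var_lin = variance(pmf_lin)                            # should equal a^2 * Var(X)

print(mu, var_def, var_short, var_lin)
```

Note that the constant b drops out of the variance entirely: shifting a distribution changes its mean but not its spread.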
CORRELATION COEFFICIENT
The correlation between X and Y is:
$$Corr(X, Y) = \rho_{XY} = \frac{\sigma_{XY}}{\sigma_X \sigma_Y}.$$
NOTE: uncorrelatedness does not imply independence, unless the variables are jointly normally distributed.
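A classic counterexample (standard in textbooks, not taken from the slides) shows why zero correlation is weaker than independence: let X be uniform on {-1, 0, 1} and set Y = X². The sketch uses the standard identity Cov(X, Y) = E(XY) - E(X)E(Y) and then checks whether the joint pmf factors.

```python
from fractions import Fraction

# X is uniform on {-1, 0, 1} and Y = X^2, so Y is fully determined by X
joint = {(x, x * x): Fraction(1, 3) for x in (-1, 0, 1)}

e_x = sum(x * p for (x, _), p in joint.items())
e_y = sum(y * p for (_, y), p in joint.items())
e_xy = sum(x * y * p for (x, y), p in joint.items())
cov_xy = e_xy - e_x * e_y          # standard identity for the covariance

# Independence would require P(X=0, Y=0) == P(X=0) * P(Y=0)
p_x0 = sum(p for (x, _), p in joint.items() if x == 0)
p_y0 = sum(p for (_, y), p in joint.items() if y == 0)
p_joint_00 = joint[(0, 0)]

print("Cov(X, Y) =", cov_xy)
print("P(X=0, Y=0) =", p_joint_00, "but P(X=0) * P(Y=0) =", p_x0 * p_y0)
```

The covariance is exactly zero because the relationship between X and Y is symmetric around zero, yet knowing X pins down Y completely.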
Examples of standard distributions
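NORMAL
For $X \sim N(\mu, \sigma^2)$:
$$f(x) = \frac{1}{\sigma\sqrt{2\pi}} \exp\left(-\frac{(x - \mu)^2}{2\sigma^2}\right).$$
Standardization to $Z \sim N(0, 1)$:
$$Z = \frac{X - \mu}{\sigma} \quad \text{and} \quad f(z) = \frac{1}{\sqrt{2\pi}} \exp\left(-\frac{1}{2} z^2\right).$$
Calculating probabilities:
$$P(a \le X \le b) = P\left(\frac{a - \mu}{\sigma} \le Z \le \frac{b - \mu}{\sigma}\right) = \Phi\left(\frac{b - \mu}{\sigma}\right) - \Phi\left(\frac{a - \mu}{\sigma}\right),$$
where $\Phi$ is the cdf of Z.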
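The standardization recipe can be reproduced with Python's standard library, whose `statistics.NormalDist` class provides the normal cdf. The values of μ, σ, a and b below are hypothetical; the sketch computes the probability once via standardization and once directly, as a consistency check.

```python
from statistics import NormalDist

# Hypothetical example: X ~ N(mu, sigma^2) with mu = 10 and sigma = 2
mu, sigma = 10.0, 2.0
a, b = 8.0, 13.0

# Route 1: standardize the bounds and use the standard normal cdf (Phi)
std_normal = NormalDist(0.0, 1.0)
z_a = (a - mu) / sigma
z_b = (b - mu) / sigma
prob_standardized = std_normal.cdf(z_b) - std_normal.cdf(z_a)

# Route 2: use the cdf of X directly (NormalDist takes the standard deviation)
x_dist = NormalDist(mu, sigma)
prob_direct = x_dist.cdf(b) - x_dist.cdf(a)

print(f"P({a} <= X <= {b}) = {prob_standardized:.4f}")
```

Here the standardized bounds are -1 and 1.5, so the result is Φ(1.5) - Φ(-1), roughly 0.77. In Gretl the same lookup is available under Tools, as Exercise 1.4 asks you to explore.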
CHI-SQUARE
For independent $X_i \sim N(0, 1)$, $i = 1, \ldots, m$, the variable
$$V = X_1^2 + \cdots + X_m^2 \sim \chi^2(m)$$
has a chi-square distribution with m degrees of freedom and
$$E(V) = m \quad \text{and} \quad Var(V) = 2m.$$
[Figure: chi-square pdf]
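The two moments of the chi-square distribution can be checked by simulation. The sketch below assumes the summed variables are independent standard normal, N(0, 1), draws, and uses hypothetical settings (m = 5 degrees of freedom, 20,000 replications, a fixed seed); the sample mean and variance should come out near m and 2m.

```python
import random
import statistics

random.seed(1)   # fixed seed so the sketch is reproducible
m = 5            # degrees of freedom (hypothetical choice)
draws = 20_000   # number of simulated chi-square variables


def chi_square_draw(m):
    """Sum of m squared independent standard normal draws."""
    return sum(random.gauss(0.0, 1.0) ** 2 for _ in range(m))


sample = [chi_square_draw(m) for _ in range(draws)]
sample_mean = statistics.fmean(sample)     # should be close to m
sample_var = statistics.variance(sample)   # should be close to 2m

print(f"mean = {sample_mean:.3f} (theory: {m}), "
      f"variance = {sample_var:.3f} (theory: {2 * m})")
```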
STUDENT T
For independent $Z \sim N(0, 1)$ and $V \sim \chi^2(m)$ the variable
$$t = \frac{Z}{\sqrt{V/m}} \sim t(m)$$
has a t-distribution with m degrees of freedom.
Important: for large m, the t distribution converges to the standard normal, i.e. $t(m \to \infty)$ is indistinguishable from $N(0, 1)$. In practice, m of about 150 is enough.
F
For independent $V_1 \sim \chi^2(m)$ and $V_2 \sim \chi^2(k)$ the variable
$$F = \frac{V_1/m}{V_2/k} \sim F(m, k)$$
has an F-distribution with m and k degrees of freedom.
Important: for large k, $m \cdot F(m, k)$ converges in distribution to chi-square(m).
Linear Algebra
General definitions
MATRIX
A matrix A is a rectangular array of elements with, say, m rows and n columns. We then say that A is of order $m \times n$.
TRANSPOSE
Given the ($m \times n$) matrix A, its transpose A' is defined as the ($n \times m$) matrix obtained by interchanging rows and columns of A.
Properties:
$(A + B)' = A' + B'$
$(A')' = A$
$(\lambda A)' = \lambda A'$
$(AB)' = B'A'$
SUM
Given two matrices of order $m \times n$, call them A and B, their sum C = A + B is an $m \times n$ matrix such that $c_{ij} = a_{ij} + b_{ij}$ for $i = 1, \ldots, m$ and $j = 1, \ldots, n$.
PRODUCT WITH SCALAR
The product of a matrix A ($m \times n$) by a scalar $\lambda$ is an $m \times n$ matrix $\lambda A$ with (i,j) element given by $\lambda a_{ij}$ for $i = 1, \ldots, m$ and $j = 1, \ldots, n$.
PRODUCT OF TWO MATRICES
Let A and B be two matrices of order $m \times n$ and $n \times p$, respectively. The product of A by B, denoted AB, is the $m \times p$ matrix C such that
$$c_{ik} = \sum_{j=1}^{n} a_{ij} b_{jk}, \quad i = 1, \ldots, m \text{ and } k = 1, \ldots, p.$$
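The element-wise definitions translate directly into code. The sketch below implements them with plain Python lists (a deliberately bare-bones choice; in practice a numerical library would be used) and checks the transpose rule for products, (AB)' = B'A', on two hypothetical matrices.

```python
def transpose(a):
    """Interchange rows and columns."""
    return [list(row) for row in zip(*a)]


def mat_add(a, b):
    """Element-wise sum of two matrices of the same order."""
    return [[x + y for x, y in zip(ra, rb)] for ra, rb in zip(a, b)]


def scalar_mul(lam, a):
    """Multiply every element of a by the scalar lam."""
    return [[lam * x for x in row] for row in a]


def mat_mul(a, b):
    """Matrix product: c_ik = sum over j of a_ij * b_jk."""
    if len(a[0]) != len(b):
        raise ValueError("columns of a must equal rows of b")
    return [[sum(a[i][j] * b[j][k] for j in range(len(b)))
             for k in range(len(b[0]))]
            for i in range(len(a))]


# Hypothetical matrices: A is 2 x 3 and B is 3 x 2, so AB is 2 x 2
A = [[1, 2, 3], [4, 5, 6]]
B = [[1, 0], [2, 1], [0, 3]]

AB = mat_mul(A, B)
lhs = transpose(AB)                           # (AB)'
rhs = mat_mul(transpose(B), transpose(A))     # B'A'
print(AB, lhs == rhs)
```

Note that BA is also defined here but is a 3 x 3 matrix, so AB and BA cannot even be compared element by element.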
SQUARE MATRIX
Let A be an $m \times n$ matrix. A is square if $m = n$.
Note: for square matrices A and B of the same order, AB and BA are well defined but, in general, $AB \neq BA$.
SYMMETRIC MATRIX
Let A be a square matrix of order n. A is symmetric if $A = A'$.
IDENTITY MATRIX
A square matrix of order n is the identity matrix, denoted $I_n$, if its diagonal terms are equal to one and all other terms are zero. Note that for A and B of the right orders, $A I_n = A$ and $I_n B = B$.
INVERSE
Let A be a square matrix of order n. The matrix B is the inverse of A if $AB = BA = I_n$. If such B exists, then A is said to be invertible (or non-singular). Note that a square matrix has at most one inverse.
Properties:
$(A^{-1})^{-1} = A$
$(A')^{-1} = (A^{-1})'$
$(AB)^{-1} = B^{-1}A^{-1}$
DETERMINANT
The determinant of a square matrix A of order n, denoted |A|, is defined (expanding along any row i) as
$$|A| = \sum_{j=1}^{n} a_{ij} (-1)^{i+j} A_{ij},$$
where $A_{ij}$ is the $(i, j)$th minor, i.e. the determinant of the matrix formed by deleting the ith row and jth column of A.
Note: if a row (or column) can be expressed as a linear combination of the other rows (or columns), then the determinant is zero. A matrix is invertible if and only if $|A| \neq 0$.
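The cofactor expansion is naturally recursive: the determinant of an n x n matrix is built from determinants of (n-1) x (n-1) minors. The sketch below expands along the first row (any row would do) and uses two hypothetical matrices, one invertible and one whose second row is twice the first. It is meant to mirror the definition, not to be efficient.

```python
def minor_matrix(a, i, j):
    """Matrix a with row i and column j deleted (0-based indices)."""
    return [row[:j] + row[j + 1:] for r, row in enumerate(a) if r != i]


def det(a):
    """Determinant by cofactor expansion along the first row."""
    n = len(a)
    if n == 1:
        return a[0][0]
    # With 0-based indices the sign (-1)^(i+j) for the first row is (-1)^j
    return sum((-1) ** j * a[0][j] * det(minor_matrix(a, 0, j))
               for j in range(n))


# Hypothetical examples
A = [[2, 1], [5, 3]]                      # invertible: determinant is non-zero
S = [[1, 2, 3], [2, 4, 6], [0, 1, 1]]     # row 2 = 2 * row 1, so singular

det_A = det(A)
det_S = det(S)
print(det_A, det_S)
```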
LINEAR INDEPENDENCE - RANK
A set of vectors is said to be linearly independent if none of them can be expressed as a linear combination of the rest.
Let A be an $m \times n$ matrix. We define the row rank of A as the number of linearly independent row vectors and the column rank of A as the number of linearly independent column vectors.
Note that an invertible square matrix must be of full rank.
Exercises
Exercise 1.1.
a) Expand the matrix product
$$X = (AB + (CD)')((EF)^{-1} + GH)'.$$
Assume that all matrices are square and E and F are non-singular.
b) If X is a non-null matrix of order $N \times k$, $N > k$, show that X'X is symmetric.
Exercise 1.2.
a) Think of at least two examples of each type of variable described during the lecture:
continuous
count
nominal binary
ordinal multi-categorical.
b) From the variables you listed above choose one from each type (call it y) and write down a full econometric linear model where your dependent variable is y.
Exercise 1.3.
Let X take 4 values $x_1 = 1$, $x_2 = 3$, $x_3 = 5$, $x_4 = 3$.
a) Calculate the arithmetic average $\bar{x} = \frac{1}{4}\sum_{i=1}^{4} x_i$.
b) Calculate $\sum_{i=1}^{4} (x_i - \bar{x})$.
c) Calculate $\sum_{i=1}^{4} (x_i - \bar{x})^2$.
d) Calculate $\sum_{i=1}^{4} x_i^2 - 4\bar{x}^2$.
Exercise 1.4.
Open Gretl. Upload the file cps5.gdt and do the following:
a) Learn what can be found under Tools—Statistical tables, Tools—P-value finder, Tools—Test statistic calculator.
b) Generate and discuss the descriptive statistics of the variable WAGE.
c) Plot WAGE against EDUC. Comment on this relationship.
d) Find the matrix of the correlation coefficients between WAGE, EDUC and EXPER.
e) Generate new variables: $EDUC^2$, $\ln(EDUC)$, $\sqrt{EDUC}$, $WAGE/EXPER$.
f) Generate a new variable, which takes value 1 for the first 300 observations, 0 for the rest. Edit the last observation in the sample, so that the value of this new variable is also 1.
g) Limit the sample of observations to 1-500.
h) Write down the theoretical model explaining variation in wages. Choose explanatory variables based on the economic knowledge you have gained so far and using common sense.
i) What are your expectations on the signs of the above relationship? Discuss whether the impact of selected explanatory variables on wages is positive or negative and why.
Homework
As homework, please revise the material in the Statistical Primer and Linear Algebra sections. If you are not familiar with these concepts, please go back to your notes from your previous Mathematics and Statistics modules.
Additionally, please spend some time experimenting with Gretl.