
Econometrics I

Topic 1. Introduction to Econometrics

Katarzyna Bech-Wysocka
SGH Warsaw School of Economics

Katarzyna Bech-Wysocka, Econometrics I, Introduction

Semester schedule and course structure

Topic

1. Introduction to Econometrics
2. The Least Squares Method
3. Significance of explanatory variables
4. Model specification. Nonlinearity.
5. Verification: collinearity and normality
6. Verification: heteroskedasticity
7. Verification: autocorrelation
8. Dynamic models
9. Stationarity and cointegration
10. Econometric forecasting
11. Qualitative variable models
12. Endogeneity
13. Instrumental variables
14. Tests with instrumental variables

Lecture (30)

Tutorials (30)

Attendance at tutorials is mandatory; attendance at lectures is more than recommended.

Components of the final grade:

- 2 tests during the semester: 6 points each
- Project + in-class participation: 6 points
- Written final exam: 32 points

Grades:

Total points (∑)    Grade
(0, 25)             2.0
[25, 30)            3.0
[30, 35)            3.5
[35, 40)            4.0
[40, 45)            4.5
[45, 50]            5.0

Basic definitions & concepts



What is “Econometrics”?

ECONOMETRICS

Application of mathematical and statistical techniques to economics in the study of problems, the analysis of data, and the development and testing of theories and models (i.e. Maths and Statistics used in Economics).

Interesting facts: the term ‘Econometrics’ was introduced to the literature by Paweł Ciompa in his work ‘Zarys ekonometryi i teorya buchalteryi’, published in Lviv in 1910.

The fathers of modern Econometrics are the first Nobel Prize winners in Economics: Ragnar Frisch and Jan Tinbergen.

WHAT IS ECONOMETRICS USED FOR?

• To verify existing economic theories
• To quantify the relationships between observable data
• To forecast (either the future or out-of-sample outcomes)

Econometrics is all about how we can use theory and data from economics, business and the social sciences, along with tools from statistics, to answer “how much” questions.


Some examples of research questions

1. A city council ponders the question of how much violent crime will be reduced if an additional million dollars is spent putting uniformed police on the street.

2. The owner of a local Pizza Hut must decide how much advertising space to purchase in the local newspaper, and thus must estimate the relationship between advertising and sales.

3. Louisiana State University must estimate how much enrollment will fall if tuition is raised by $300 per semester, and thus whether its revenue from tuition will rise or fall.

4. The CEO of Procter & Gamble must estimate how much demand there will be in ten years for the detergent Tide, and how much to invest in new plant and equipment.

5. You must decide how much of your savings will go into a stock fund, and how much into the money market. This requires you to predict the level of economic activity and interest rates.


What is the difference between Econometrics and Economics?

Econometricians and Economists understand the word ‘Model’ differently.

For Economists, a model is a given relationship between certain objects, as in a simple demand/supply model, where demand is simply a function of the price of a certain good:

$$D = a + bP$$

An Econometrician would see this relationship as:

$$D_i = \alpha + \beta P_i + \varepsilon_i$$

and they would use this equation to estimate how price and demand are related.
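The econometrician's view can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the "true" values α = 100 and β = −2, the price range and the normally distributed error are all hypothetical, and the estimator used is simple least squares (the method of Topic 2), written out by hand rather than taken from any library.

```python
import random


def ols_simple(x, y):
    """Least-squares intercept and slope for y = alpha + beta * x + error."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    s_xy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    s_xx = sum((xi - x_bar) ** 2 for xi in x)
    beta_hat = s_xy / s_xx
    alpha_hat = y_bar - beta_hat * x_bar
    return alpha_hat, beta_hat


# Hypothetical "true" parameters, chosen only for illustration.
ALPHA, BETA = 100.0, -2.0

rng = random.Random(0)  # fixed seed so the run is reproducible
prices = [rng.uniform(5.0, 25.0) for _ in range(500)]
# Observed demand = deterministic economic model + random disturbance.
demand = [ALPHA + BETA * p + rng.gauss(0.0, 3.0) for p in prices]

alpha_hat, beta_hat = ols_simple(prices, demand)
print(f"alpha_hat = {alpha_hat:.2f}, beta_hat = {beta_hat:.2f}")
```

The estimates should land close to, but not exactly on, the hypothetical parameters; the gap is the sampling error that the error term introduces.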


Different types of data

Before choosing the most appropriate dataset, you must understand what type of economic elements participate in the phenomenon you want to analyse.

• Qualitative vs. Quantitative
• Discrete vs. Continuous
• Flow vs. Stock
• Experimental vs. Non-experimental

• Data might be collected at various levels of aggregation:
  • Microeconomic dataset (individual units: agents, firms, etc.)
  • Macroeconomic dataset (aggregate economic features: GDP, unemployment rates, etc.)

• Which type of dataset answers your research question best?
  • Cross-sectional data: a sample of different units (individuals, firms, etc.) observed at a given point in time
  • Time series data: the same unit (individual, firm, etc.) is observed at different points in time
  • Panel (longitudinal) data: the same cross-section of units is observed over time.
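The cross-section, time series and panel layouts described above differ only in how many units and how many time periods they cover. A minimal sketch, assuming invented sales figures and a hypothetical helper `dataset_type` (neither comes from the lecture):

```python
# Hypothetical sales figures, invented purely to show the three layouts.
cross_section = [  # several units, one period
    {"unit": "firm A", "year": 2020, "sales": 120.0},
    {"unit": "firm B", "year": 2020, "sales": 95.5},
    {"unit": "firm C", "year": 2020, "sales": 143.2},
]
time_series = [  # one unit, several periods
    {"unit": "firm A", "year": 2018, "sales": 101.0},
    {"unit": "firm A", "year": 2019, "sales": 110.4},
    {"unit": "firm A", "year": 2020, "sales": 120.0},
]
panel = [  # the same units followed over several periods
    {"unit": "firm A", "year": 2019, "sales": 110.4},
    {"unit": "firm A", "year": 2020, "sales": 120.0},
    {"unit": "firm B", "year": 2019, "sales": 90.1},
    {"unit": "firm B", "year": 2020, "sales": 95.5},
]


def dataset_type(rows):
    """Classify a dataset by counting distinct units and time periods.

    Simplification: this does not check that every unit appears in
    every period, which a balanced panel would require.
    """
    units = {r["unit"] for r in rows}
    periods = {r["year"] for r in rows}
    if len(units) > 1 and len(periods) > 1:
        return "panel"
    if len(units) > 1:
        return "cross-section"
    if len(periods) > 1:
        return "time series"
    return "single observation"


for name, data in [("cross_section", cross_section),
                   ("time_series", time_series),
                   ("panel", panel)]:
    print(name, "->", dataset_type(data))
```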


Example of cross-section


Examples of research questions?


Example of time-series


Examples of research questions?


Example of panel data



Online Data Collections and Sources

A good place to start: The Economics Network’s website: http://www.economicsnetwork.ac.uk/links/data_free.htm
Provides a wide array of data sources in an organized fashion.

The Econometrics Journal Links: http://www.feweb.vu.nl/econometriclinks/#data
Financial, social, etc.; US and world.

The World Bank’s website: http://data.worldbank.org/topic
Agriculture, education, climate change, poverty, etc.; by country.

The OECD’s website: http://www.oecd-ilibrary.org/statistics
OECD regional, productivity, tax, energy, trade statistics, etc.; OECD and international.

The CIC website: https://pwt.sas.upenn.edu
Penn World Table: international comparisons of production, income and prices.

The American Economic Association’s website: http://rfe.org/showCat.php?cat_id=2
Macro, regional, financial, etc.; US and world.

The FRED website: http://research.stlouisfed.org/fred2/categories/
National accounts, labour market, population, finance, banking, production, etc.; US and international.

The UK Data Archive’s website: http://ukdataservice.ac.uk and http://data-archive.ac.uk/deposit/use/


Econometric model specification

What does the econometric model look like?

Theoretical model:

$$y_i = \beta_0 + \beta_1 x_{1i} + \dots + \beta_K x_{Ki} + \varepsilon_i$$

• $y_i$ - dependent variable
• $x_{1i}, \dots, x_{Ki}$ - regressors (explanatory variables)
• $\varepsilon_i$ - error (disturbance) term
• $\beta_0, \beta_1, \dots, \beta_K$ - population parameters
• subscript i indicates the sample observation. Convention: i in cross-section, t in time series
• N is the sample size in cross-section, T in time series
• K is the number of regressors, K+1 the number of betas to estimate
• This is an example of a linear model. Later in the semester you will see some nonlinear models.

Empirical model (after you estimate its parameters):

$$\hat{y}_i = \hat{\beta}_0 + \hat{\beta}_1 x_{1i} + \dots + \hat{\beta}_K x_{Ki}$$

Econometric model specification – matrix notation

Typically it is more convenient to employ matrix and vector notation.

Theoretical model:

$$y = X\beta + \varepsilon$$

• $y$ ($N \times 1$) - vector of dependent variable sample observations
• $X$ ($N \times (K+1)$) - matrix of regressors
• $\varepsilon$ ($N \times 1$) - vector of error (disturbance) terms
• $\beta$ ($(K+1) \times 1$) - vector of parameters
• dimensions: N sample size, K+1 number of parameters.

Empirical model (after you estimate its parameters):

$$\hat{y} = X\hat{\beta}$$
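To make the dimensions in the matrix notation concrete, the sketch below builds a regressor matrix with a column of ones for the intercept and computes the fitted values as the product of that matrix and a vector of estimates. The data and the estimates in `beta_hat` are hypothetical numbers chosen for illustration, not results from any real estimation.

```python
def matvec(X, b):
    """Multiply an N x (K+1) matrix (list of rows) by a (K+1)-vector."""
    return [sum(x_ij * b_j for x_ij, b_j in zip(row, b)) for row in X]


# Hypothetical sample: N = 4 observations, K = 2 regressors.
x1 = [1.0, 2.0, 3.0, 4.0]
x2 = [0.0, 1.0, 0.0, 1.0]

# The first column of ones multiplies the intercept beta_0,
# which is why X has K + 1 columns.
X = [[1.0, a, b] for a, b in zip(x1, x2)]

# Hypothetical estimates (beta_0_hat, beta_1_hat, beta_2_hat).
beta_hat = [0.5, 2.0, -1.0]

y_hat = matvec(X, beta_hat)  # fitted values, an N-vector

N, K_plus_1 = len(X), len(X[0])
print(f"X is {N} x {K_plus_1}, y_hat has {len(y_hat)} elements")
print(y_hat)
```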


Approach to econometric model building

There are certain steps in econometric modeling:

1. Setting up a research question
2. Choosing a functional form and the set of explanatory variables
3. Collecting the data
4. Estimating the model
5. Verification process
6. Application

Statistical Primer



Random Variables and Probabilities

RANDOM VARIABLE

A variable whose value is unknown until it is observed.

A discrete random variable can take only a limited, or countable, number of values.
• Example: an indicator variable taking the value one if yes, or zero if no.

A random variable that can take any value within an interval is treated as a continuous random variable.

We summarize the probabilities of possible outcomes using a probability mass function (for discrete RVs) or a probability density function (for continuous RVs):

PROBABILITY MASS FUNCTION
For a discrete random variable X the value of the probability mass function f(x) is the probability that the random variable X takes the value x, $f(x) = P(X = x)$. It follows that $\sum_x f(x) = 1$.

PROBABILITY DENSITY FUNCTION
For continuous random variables, since $P(X = x) = 0$, the probability density function can be interpreted as the relative probability, so that

$$P(a \le X \le b) = \int_a^b f(x)\,dx \quad \text{and} \quad \int_{-\infty}^{\infty} f(x)\,dx = 1.$$


Random variables and Probabilities

CUMULATIVE DISTRIBUTION FUNCTION

The cdf of the random variable X, denoted F(x), gives the probability that X is less than or equal to a specific value x:

$$F(x) = P(X \le x).$$

The cumulative probability can be calculated for any x between $-\infty$ and $+\infty$.

The Fundamental Theorem of Calculus links the pdf and the cdf:

$$\int_a^b f(x)\,dx = F(b) - F(a)$$

or

$$F'(x) = f(x).$$
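The link between the pdf and the cdf can be checked numerically. The sketch below integrates a pdf over an interval with a simple midpoint rule and compares the result with the difference of the cdf at the endpoints. The standard normal distribution and the interval [−1, 2] are arbitrary choices for illustration; `NormalDist` is the class from Python's standard `statistics` module.

```python
from statistics import NormalDist


def integrate(f, a, b, steps=10_000):
    """Approximate the integral of f over [a, b] with the midpoint rule."""
    h = (b - a) / steps
    return h * sum(f(a + (k + 0.5) * h) for k in range(steps))


dist = NormalDist(mu=0.0, sigma=1.0)  # example distribution
a, b = -1.0, 2.0

area = integrate(dist.pdf, a, b)             # integral of the pdf over [a, b]
cdf_difference = dist.cdf(b) - dist.cdf(a)   # F(b) - F(a)

print(round(area, 6), round(cdf_difference, 6))
```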


Joint, Marginal and Conditional Probabilities

JOINT PROBABILITY

A joint probability is the probability of two events occurring simultaneously:

$$f_{X,Y}(x, y) = P(X = x, Y = y).$$

MARGINAL PROBABILITY

Given a joint probability density function, we can obtain the probability distributions of individual random variables, which are also known as marginal distributions:

$$f_X(x) = P(X = x) = \sum_y f_{X,Y}(x, y).$$

For continuous RVs: $f_X(x) = \int f(x, y)\,dy$.

CONDITIONAL PROBABILITY

The conditional probability is the probability of the outcome X = x given that Y = y has occurred. The effect of the conditioning is to reduce the set of possible outcomes. The conditional probability that the random variable X takes the value x given that Y = y is written P(X = x | Y = y). This conditional probability is given by the conditional pdf f(x|y):

$$f(x|y) = P(X = x \mid Y = y) = \frac{f_{X,Y}(x, y)}{f_Y(y)}.$$
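The three definitions above can be traced on a small joint probability table. The table below is hypothetical (two binary variables with invented probabilities), and exact fractions are used so that no rounding obscures the arithmetic.

```python
from fractions import Fraction

# Hypothetical joint pmf f(x, y); keys are (x, y) pairs.
joint = {
    (0, 0): Fraction(1, 10), (0, 1): Fraction(3, 10),
    (1, 0): Fraction(2, 10), (1, 1): Fraction(4, 10),
}


def marginal_x(joint, x):
    """f_X(x): sum the joint pmf over all values of y."""
    return sum(p for (xi, _), p in joint.items() if xi == x)


def marginal_y(joint, y):
    """f_Y(y): sum the joint pmf over all values of x."""
    return sum(p for (_, yi), p in joint.items() if yi == y)


def conditional_x_given_y(joint, x, y):
    """f(x | y) = f(x, y) / f_Y(y)."""
    return joint[(x, y)] / marginal_y(joint, y)


print("f_X(0) =", marginal_x(joint, 0))
print("f_Y(0) =", marginal_y(joint, 0))
print("f(x=1 | y=0) =", conditional_x_given_y(joint, 1, 0))
```

Conditioning on Y = 0 restricts attention to the first column of the table and rescales it so that the conditional probabilities again sum to one.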


Statistical Independence

INDEPENDENCE
Two random variables are statistically independent if the conditional probability that X = x given that Y = y is the same as the unconditional probability that X = x:

$$f(x|y) = \frac{f_{X,Y}(x, y)}{f_Y(y)} = f_X(x).$$

We can also say that X and Y are statistically independent if their joint pdf factors into the product of their marginal pdfs:

$$f_{X,Y}(x, y) = f_X(x) f_Y(y).$$
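The factorization condition gives a direct way to test independence for discrete variables: compute both marginals and compare every cell of the joint table with their product. A minimal sketch with two hypothetical tables (one built to be dependent, one built as a product of marginals):

```python
from fractions import Fraction
from itertools import product


def is_independent(joint):
    """True if f(x, y) == f_X(x) * f_Y(y) for every (x, y) pair."""
    xs = sorted({x for x, _ in joint})
    ys = sorted({y for _, y in joint})
    f_x = {x: sum(joint.get((x, y), 0) for y in ys) for x in xs}
    f_y = {y: sum(joint.get((x, y), 0) for x in xs) for y in ys}
    return all(joint.get((x, y), 0) == f_x[x] * f_y[y]
               for x, y in product(xs, ys))


# Hypothetical tables; both have marginals f_X = (0.4, 0.6), f_Y = (0.3, 0.7).
dependent = {
    (0, 0): Fraction(10, 100), (0, 1): Fraction(30, 100),
    (1, 0): Fraction(20, 100), (1, 1): Fraction(40, 100),
}
independent = {
    (0, 0): Fraction(12, 100), (0, 1): Fraction(28, 100),
    (1, 0): Fraction(18, 100), (1, 1): Fraction(42, 100),
}

print(is_independent(dependent), is_independent(independent))
```

Both tables share the same marginal distributions, which shows that marginals alone do not determine whether two variables are independent.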


Expectations

EXPECTED VALUE
The mean of a random variable is given by its mathematical expectation. If X is a discrete random variable, then the mathematical expectation, or expected value, of X is:

$$\mu = E(X) = \sum_x x f(x).$$

For continuous RVs: $\mu = E(X) = \int x f(x)\,dx$.

The expected value of the random variable is the average value that occurs in many repeated trials of an experiment.

Note that population mean ($\mu$) $\neq$ sample average ($\bar{x}$).

Many economic questions are formulated in terms of conditional expectation, or the conditional mean. For a discrete random variable the conditional expected value is:

$$\mu_{X|Y} = E(X \mid Y = y) = \sum_x x f(x|y).$$

For continuous RVs: $\mu_{X|Y} = E(X \mid Y = y) = \int x f(x|y)\,dx$.
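The difference between the population mean and a sample average can be seen with a fair six-sided die, used here purely as an example. The population mean follows from the pmf; the sample average comes from simulated rolls and only approximates it.

```python
import random
from fractions import Fraction

# pmf of a fair die: f(x) = 1/6 for x = 1, ..., 6.
pmf = {x: Fraction(1, 6) for x in range(1, 7)}

# Population mean: mu = sum over x of x * f(x).
mu = sum(x * p for x, p in pmf.items())

# Sample average from many repeated trials (fixed seed for reproducibility).
rng = random.Random(42)
draws = [rng.randint(1, 6) for _ in range(100_000)]
x_bar = sum(draws) / len(draws)

print("population mean:", mu)
print("sample average :", round(x_bar, 4))
```

The sample average changes with the seed and the number of rolls, while the population mean is a fixed property of the distribution.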


Expectations

EXPECTED VALUE

RULES:
• If g(X) is a function of the random variable X, then g(X) is also random.
• If X is a discrete random variable, then the expected value of g(X) is obtained using:

$$E[g(X)] = \sum_x g(x) f(x).$$

For example: if a is a constant, then g(X) = aX is a function of X, and:

$$E(aX) = aE(X).$$

Similarly, if a and b are constants, then:

$$E(aX + b) = aE(X) + b.$$

Also, if $g_1(X)$ and $g_2(X)$ are functions of X, then:

$$E[g_1(X) + g_2(X)] = E[g_1(X)] + E[g_2(X)].$$

Remember the phrase ‘‘the expected value of a sum is the sum of the expected values’’.

• Law of Iterated Expectations:

$$E(X) = E[E(X|Y)],$$

most commonly used in $E(XY) = E_X[X\,E(Y|X)]$.
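The Law of Iterated Expectations can be verified on a small discrete example: average the conditional means E(X | Y = y) using the marginal probabilities of Y as weights, and compare with E(X) computed directly. The joint table below is hypothetical.

```python
from fractions import Fraction

# Hypothetical joint pmf f(x, y); X takes values 1 and 2, Y takes 0 and 1.
joint = {
    (1, 0): Fraction(1, 10), (2, 0): Fraction(2, 10),
    (1, 1): Fraction(3, 10), (2, 1): Fraction(4, 10),
}


def f_y(joint, y):
    """Marginal probability P(Y = y)."""
    return sum(p for (_, yi), p in joint.items() if yi == y)


def cond_mean_x(joint, y):
    """E(X | Y = y) = sum over x of x * f(x, y) / f_Y(y)."""
    return sum(x * p for (x, yi), p in joint.items() if yi == y) / f_y(joint, y)


# Left-hand side: E(X) computed directly from the joint pmf.
e_x = sum(x * p for (x, _), p in joint.items())

# Right-hand side: E[E(X | Y)], the conditional means weighted by f_Y(y).
ys = sorted({y for _, y in joint})
e_of_cond = sum(cond_mean_x(joint, y) * f_y(joint, y) for y in ys)

print(e_x, e_of_cond)
```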


Variance & Covariance

VARIANCE

The variance of a discrete or continuous random variable X is:

$$Var(X) = \sigma_X^2 = E\left[(X - E(X))^2\right] = \int f(x)(x - \mu)^2\,dx$$

(the integral form applies to continuous X; for discrete X it becomes a sum over x).

Simple algebra gives:

$$Var(X) = E(X^2) - [E(X)]^2.$$

The square root of the variance is called the standard deviation.

RULES:

• Let a and b be constants, then: $Var(aX + b) = a^2 Var(X)$.

• $Var(X + Y) = Var(X) + Var(Y) + 2Cov(X, Y)$.

COVARIANCE

The covariance between X and Y is:

$$Cov(X, Y) = \sigma_{XY} = E[(X - E(X))(Y - E(Y))] = \int\!\!\int (x - \mu_X)(y - \mu_Y) f(x, y)\,dx\,dy.$$
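The "simple algebra" behind the shortcut formula for the variance uses only the rules of expectation (linearity, and the fact that $\mu = E(X)$ is a constant). Written out, with the analogous result for the covariance:

```latex
\begin{align*}
Var(X) &= E\left[(X - \mu)^2\right] \\
       &= E\left[X^2 - 2\mu X + \mu^2\right] \\
       &= E(X^2) - 2\mu E(X) + \mu^2 \\
       &= E(X^2) - 2\mu^2 + \mu^2 \\
       &= E(X^2) - [E(X)]^2, \\[1ex]
Cov(X, Y) &= E\left[(X - \mu_X)(Y - \mu_Y)\right] \\
          &= E(XY) - \mu_X E(Y) - \mu_Y E(X) + \mu_X \mu_Y \\
          &= E(XY) - E(X)E(Y).
\end{align*}
```

Setting Y = X in the covariance result reproduces the variance formula, so the variance is the covariance of a variable with itself.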


Variance & Covariance

CORRELATION COEFFICIENT
The correlation between X and Y is:

$$Corr(X, Y) = \rho_{XY} = \frac{\sigma_{XY}}{\sigma_X \sigma_Y}.$$

NOTE: uncorrelatedness does not imply independence, unless the variables are jointly normal.
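A standard counterexample makes the note concrete: let X take the values −1, 0 and 1 with equal probability and let Y = X². Y is completely determined by X, yet the covariance is zero. The sketch below, an illustration rather than part of the lecture, checks both facts with exact fractions.

```python
from fractions import Fraction

# X is uniform on {-1, 0, 1}; Y = X^2 is a deterministic function of X.
p = Fraction(1, 3)
joint = {(x, x * x): p for x in (-1, 0, 1)}


def expect(g):
    """E[g(X, Y)] under the joint pmf."""
    return sum(g(x, y) * pr for (x, y), pr in joint.items())


e_x = expect(lambda x, y: x)
e_y = expect(lambda x, y: y)
cov = expect(lambda x, y: (x - e_x) * (y - e_y))

# Independence would require f(0, 0) == f_X(0) * f_Y(0).
f_x0 = sum(pr for (x, _), pr in joint.items() if x == 0)
f_y0 = sum(pr for (_, y), pr in joint.items() if y == 0)
joint_00 = joint.get((0, 0), Fraction(0))

print("Cov(X, Y) =", cov)
print("f(0, 0) =", joint_00, "but f_X(0) * f_Y(0) =", f_x0 * f_y0)
```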


Examples of standard distributions

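NORMAL
For $X \sim N(\mu, \sigma^2)$:

$$f(x) = \frac{1}{\sigma\sqrt{2\pi}} \exp\left(-\frac{(x - \mu)^2}{2\sigma^2}\right).$$

Standardization to $Z \sim N(0, 1)$: $Z = \frac{X - \mu}{\sigma}$ and $f(z) = \frac{1}{\sqrt{2\pi}} \exp\left(-\frac{1}{2} z^2\right)$.

Calculating probabilities:

$$P(a \le X \le b) = P\left(\frac{a - \mu}{\sigma} \le Z \le \frac{b - \mu}{\sigma}\right) = \Phi\left(\frac{b - \mu}{\sigma}\right) - \Phi\left(\frac{a - \mu}{\sigma}\right),$$

where $\Phi$ is the cdf of Z.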
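The standardization formula translates directly into code. In the sketch below the values μ = 10, σ = 2, a = 8 and b = 14 are hypothetical, and Φ is taken from `NormalDist` in Python's standard `statistics` module. The result is cross-checked against the cdf of the unstandardized distribution.

```python
from statistics import NormalDist


def normal_interval_prob(a, b, mu, sigma):
    """P(a <= X <= b) for X ~ N(mu, sigma^2), via standardization."""
    phi = NormalDist().cdf  # cdf of the standard normal Z ~ N(0, 1)
    return phi((b - mu) / sigma) - phi((a - mu) / sigma)


# Hypothetical example: X ~ N(10, 2^2), interval [8, 14],
# i.e. Phi(2) - Phi(-1) after standardization.
prob = normal_interval_prob(8.0, 14.0, mu=10.0, sigma=2.0)

# Cross-check without standardizing.
direct = NormalDist(10.0, 2.0)
check = direct.cdf(14.0) - direct.cdf(8.0)

print(round(prob, 4), round(check, 4))
```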



CHI-SQUARE
For independent $X_i \sim N(0, 1)$, $i = 1, \dots, m$, the variable

$$V = X_1^2 + \dots + X_m^2 \sim \chi^2(m)$$

has a chi-square distribution with m degrees of freedom and

$$E(V) = m \quad \text{and} \quad Var(V) = 2m.$$

[Figure: chi-square pdf]
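The mean and variance of the chi-square distribution can be checked by simulation. The sketch assumes standard normal components, X_i ~ N(0, 1), which is what the chi-square construction requires; the choice m = 5, the number of draws and the seed are arbitrary.

```python
import random
from statistics import fmean, variance


def chi_square_draw(m, rng):
    """One draw of V = X_1^2 + ... + X_m^2 with independent X_i ~ N(0, 1)."""
    return sum(rng.gauss(0.0, 1.0) ** 2 for _ in range(m))


m = 5                      # degrees of freedom, chosen for illustration
rng = random.Random(123)   # fixed seed for reproducibility
draws = [chi_square_draw(m, rng) for _ in range(20_000)]

sample_mean = fmean(draws)
sample_var = variance(draws)

print(f"mean     = {sample_mean:.2f} (theory: {m})")
print(f"variance = {sample_var:.2f} (theory: {2 * m})")
```

The simulated moments are close to m and 2m but not identical to them, since they are sample quantities.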



STUDENT T
For independent $Z \sim N(0, 1)$ and $V \sim \chi^2(m)$ the variable

$$t = \frac{Z}{\sqrt{V/m}} \sim t(m)$$

has a t-distribution with m degrees of freedom.

Important: for large m, the t distribution converges to the standard normal, i.e. $t(m \to \infty)$ is indistinguishable from $N(0, 1)$. In practice, $m > 150$ is enough.

F
For independent $V_1 \sim \chi^2(m)$ and $V_2 \sim \chi^2(k)$ the variable

$$F = \frac{V_1/m}{V_2/k} \sim F(m, k)$$

has an F-distribution with m and k degrees of freedom.

Important: for large k, $m \cdot F(m, k)$ converges to chi-square(m).

Linear Algebra



General definitions

MATRIX

A matrix A is a rectangular array of elements with, say, m rows and n columns. We then say that A is of order $m \times n$.

TRANSPOSE

Given the ($m \times n$) matrix A, its transpose A' is defined as the ($n \times m$) matrix obtained by interchanging rows and columns of A.

Properties:
$(A + B)' = A' + B'$
$(A')' = A$
$(\lambda A)' = \lambda A'$
$(AB)' = B'A'$
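The transpose properties can be checked on small examples. The sketch below stores matrices as lists of rows and uses two hypothetical matrices of orders 2 × 3 and 3 × 2; the helper names `transpose` and `matmul` are illustrative and not taken from any library.

```python
def transpose(A):
    """Interchange rows and columns of a matrix stored as a list of rows."""
    return [list(col) for col in zip(*A)]


def matmul(A, B):
    """Product of an m x n matrix A and an n x p matrix B."""
    Bt = transpose(B)  # rows of Bt are the columns of B
    return [[sum(a * b for a, b in zip(row, col)) for col in Bt] for row in A]


A = [[1, 2, 3],
     [4, 5, 6]]   # 2 x 3
B = [[1, 0],
     [2, 1],
     [0, 3]]      # 3 x 2

lhs = transpose(matmul(A, B))              # (AB)'
rhs = matmul(transpose(B), transpose(A))   # B'A'

print("(AB)' =", lhs)
print("B'A'  =", rhs)
print("(A')' == A:", transpose(transpose(A)) == A)
```

Note the reversed order on the right-hand side: A'B' would be a 3 × 3 matrix here, so it could not equal the 2 × 2 matrix (AB)'.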


General definitions

SUM

Given two matrices of order $m \times n$, call them A and B, their sum C = A + B is an $m \times n$ matrix such that $c_{ij} = a_{ij} + b_{ij}$ for $i = 1, \dots, m$ and $j = 1, \dots, n$.

PRODUCT WITH SCALAR

The product of an $m \times n$ matrix A by a scalar $\lambda$ is an $m \times n$ matrix $\lambda A$ with (i,j) element given by $\lambda a_{ij}$ for $i = 1, \dots, m$ and $j = 1, \dots, n$.

PRODUCT OF TWO MATRICES

Let A and B be two matrices of order $m \times n$ and $n \times p$, respectively. The product of A by B, denoted AB, is the $m \times p$ matrix C such that

$$c_{ik} = \sum_{j=1}^{n} a_{ij} b_{jk}, \quad i = 1, \dots, m \text{ and } k = 1, \dots, p.$$
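The element-wise definitions of the sum, the scalar product and the matrix product map one-to-one onto code. The following sketch uses plain lists of rows and two hypothetical 2 × 2 matrices; the function names are illustrative.

```python
def mat_add(A, B):
    """C = A + B with c_ij = a_ij + b_ij (A and B of the same order)."""
    return [[a + b for a, b in zip(row_a, row_b)] for row_a, row_b in zip(A, B)]


def scalar_mul(lam, A):
    """lam * A with (i, j) element lam * a_ij."""
    return [[lam * a for a in row] for row in A]


def mat_mul(A, B):
    """C = AB with c_ik = sum over j of a_ij * b_jk (A is m x n, B is n x p)."""
    n, p = len(B), len(B[0])
    if any(len(row) != n for row in A):
        raise ValueError("columns of A must match rows of B")
    return [[sum(A[i][j] * B[j][k] for j in range(n)) for k in range(p)]
            for i in range(len(A))]


A = [[1, 2],
     [3, 4]]
B = [[0, 1],
     [1, 0]]

print("A + B =", mat_add(A, B))
print("2A    =", scalar_mul(2, A))
print("AB    =", mat_mul(A, B))
print("BA    =", mat_mul(B, A))
```

In this example AB swaps the columns of A while BA swaps its rows, so the two products differ even though both are well defined.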


General definitions

SQUARE MATRIX

Let A be an $m \times n$ matrix. A is square if $m = n$. Note: for square matrices A and B of the same order, AB and BA are both well defined, but in general $AB \neq BA$.

SYMMETRIC MATRIX

Let A be a square matrix of order n. A is symmetric if $A = A'$.

IDENTITY MATRIX

A square matrix of order n is the identity matrix, denoted $I_n$, if its diagonal terms are equal to one and all other elements are zero. Note that for A and B of conformable orders, $A I_n = A$ and $I_n B = B$.


General definitions

INVERSE

Let A be a square matrix of order n. The matrix B is the inverse of A if $AB = BA = I_n$. If such B exists, then A is said to be invertible (or non-singular). Note that a square matrix has at most one inverse.

Properties:
$(A^{-1})^{-1} = A$
$(A')^{-1} = (A^{-1})'$
$(AB)^{-1} = B^{-1}A^{-1}$

DETERMINANT

The determinant of a square matrix A of order n, denoted |A|, is defined as

$$|A| = \sum_{j=1}^{n} a_{ij} (-1)^{i+j} A_{ij}$$

for any fixed row i, where $A_{ij}$ is the (i,j)th minor, i.e. the determinant of the matrix formed by deleting the ith row and jth column of A.

Note: if a row (or column) can be expressed as a linear combination of the other rows (or columns), then the determinant is zero. A matrix is invertible if and only if $|A| \neq 0$.
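The cofactor expansion can be implemented recursively. The sketch below expands along the first row, which is one valid choice of the fixed row i; it is meant to mirror the definition, not to be efficient for large matrices. The example matrices are hypothetical, and the second one has a row equal to twice another row.

```python
def minor_matrix(A, i, j):
    """Matrix formed by deleting row i and column j of A (0-based indices)."""
    return [row[:j] + row[j + 1:] for r, row in enumerate(A) if r != i]


def det(A):
    """Determinant by cofactor expansion along the first row."""
    n = len(A)
    if any(len(row) != n for row in A):
        raise ValueError("the determinant is defined for square matrices only")
    if n == 1:
        return A[0][0]
    # With 0-based indices the sign (-1)^(i+j) for the first row is (-1)^j.
    return sum(A[0][j] * (-1) ** j * det(minor_matrix(A, 0, j))
               for j in range(n))


invertible = [[2, 1],
              [1, 3]]
singular = [[1, 2, 3],
            [2, 4, 6],   # twice the first row
            [0, 1, 1]]

print("det(invertible) =", det(invertible))
print("det(singular)   =", det(singular))
```

The zero determinant of the second matrix illustrates the note above: a linearly dependent row makes the matrix non-invertible.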


General definitions

LINEAR INDEPENDENCE - RANK

A set of vectors is said to be linearly independent if none of them can be expressed as a linear combination of the rest.

Let A be an $m \times n$ matrix. We define the row rank of A as the number of linearly independent row vectors and the column rank of A as the number of linearly independent column vectors.

Note that an invertible square matrix must be of full rank.

Exercises



Exercises

Exercise 1.1.

a) Expand the matrix product

$$X = (AB + (CD)')((EF)^{-1} + GH)'.$$

Assume that all matrices are square and E and F are non-singular.

b) If X is a non-null matrix of order $n \times k$ ($n > k$), show that X’X is symmetric.


Exercises

Exercise 1.2.

a) Think of at least two examples of each type of variables described during lecture:

continuous

count

nominal binary

ordinal multi-categorical.

b) From the variables you listed above choose one from each type (call it y) and write down a full econometric linear model where your dependent variable is y.



Exercises

Exercise 1.3.

Let X take 4 values $x_1 = 1$, $x_2 = 3$, $x_3 = 5$, $x_4 = 3$.

a) Calculate the arithmetic average $\bar{x} = \frac{\sum_{i=1}^{4} x_i}{4}$.

b) Calculate $\sum_{i=1}^{4} (x_i - \bar{x})$.

c) Calculate $\sum_{i=1}^{4} (x_i - \bar{x})^2$.

d) Calculate $\sum_{i=1}^{4} x_i^2 - 4\bar{x}^2$.


Exercises

Exercise 1.4.

Open Gretl. Upload the file cps5.gdt and do the following:

a) Learn what can be found under Tools—Statistical tables, Tools—P-value finder, Tools—Test statistic calculator.

b) Generate and discuss the descriptive statistics of the variable WAGE.

c) Plot WAGE against EDUC. Comment on this relationship.

d) Find the matrix of the correlation coefficients between WAGE, EDUC and EXPER.

e) Generate new variables: $EDUC^2$, $\ln(EDUC)$, [remaining formulas illegible in source].

f) Generate a new variable, which takes value 1 for the first 300 observations, 0 for the rest. Edit the last observation in the sample, so that the value of this new variable is also 1.

g) Limit the sample of observations to 1-500.

h) Write down the theoretical model explaining variation in wages. Choose explanatory variables based on the economic knowledge you have gained so far and using common sense.

i) What are your expectations on the signs of the above relationship? Discuss whether the impact of selected explanatory variables on wages is positive or negative and why.


Homework



As homework, please revise the material in the Statistical Primer and Linear Algebra sections. If you are not familiar with these concepts, please go back to your notes from your previous Mathematics and Statistics modules.

Additionally, please spend some time experimenting with Gretl.
