Introduction to Linear Regression

Upload: susan-burke

Post on 22-Dec-2015

Page 1: Introduction to Linear Regression.  You have seen how to find the equation of a line that connects two points

Introduction to Linear Regression

You have seen how to find the equation of a line that connects two points.

Often, we have more than two data points, and usually the data points do not all lie on a single line.

It is possible to find the equation of a line that most closely fits a set of data points. Such a line is called a regression line or a linear regression equation.

Our goal here is to learn what a regression line is. You can then watch the presentation on how to find the equation of a regression line on Excel.

Consider the following table, which shows the average price of a two-bedroom apartment in downtown New York City from 1994 to 2004, where t=0 represents 1994.

We can plot each of these data points on a graph. Each point is of the form (t, p), so we have 6 points to plot.

They are (0, 0.38), (2, 0.40), (4, 0.60), (6, 0.95), (8, 1.20), and (10, 1.60). Just looking at them like this doesn’t give much indication of a pattern, although we can see that the values of p increase as t increases.

When we plot the points all together on a set of axes, we get the following scatter plot:

It seems that the data do follow a somewhat linear pattern.

[Figure: scatter plot of the data points. Horizontal axis: Time t in years since 1994. Vertical axis: Price p in millions of $.]

We can find the line that most closely fits the data and graph it over the data points.

Notice that the line does not go through all of the data points.

[Figure: scatter plot with the fitted line drawn over the data points. Horizontal axis: Time t in years since 1994. Vertical axis: Price p in millions of $.]

We can also find the equation of this “line of best fit”.

We can also get a number that measures how well the line fits, based on what’s called the correlation coefficient.

You will be able to do all of this on Excel once you watch the instructional video and read the PDFs for this material. For now, we just want to get an idea of what the regression line is and what the correlation coefficient tells us about the regression equation.

[Figure: scatter plot with the regression line. Horizontal axis: Time t in years since 1994. Vertical axis: Price p in millions of $. Chart label: f(x) = 0.126428571428571 x + 0.222857142857143, R² = 0.947611989957957]

What does the regression equation tell us about the relationship between time and sale price?

The slope and the vertical intercept (usually the y-intercept, here the p-intercept) tell us different things.

In this case, the p-intercept tells us what the sale price is predicted to be when t=0 (that is, in the year 1994).

The regression equation is p=0.1264t+0.2229. Recall that price is in millions of dollars.

Thus, if t=0, the regression equation predicts a price of $0.2229 million or $222,900.

According to the table, the actual price was $0.38 million, or $380,000. These values don’t have to be the same, however, since the regression equation can’t match every point exactly. It is only a model that most closely fits the data points.

What does the slope of the regression equation tell us?

The slope of our regression equation is 0.1264.

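We can always write a number x as x divided by 1, so we can write this slope as \( \frac{0.1264}{1} \). Recall that the definition of slope is

\( \text{slope} = \frac{\text{change in } y}{\text{change in } x} \).

In this case we are using p and t, so it’s \( \frac{\text{change in } p}{\text{change in } t} \).

So for our problem, we have \( \frac{\text{change in } p}{\text{change in } t} = \frac{0.1264}{1} \).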

We can interpret this to mean that when t increases by 1, we can expect that p will increase by 0.1264.

For this problem, t is measured in years and p is measured in millions of dollars.

So more specifically, the slope can be interpreted to mean that if t increases by 1 year, the model predicts that the average price p of a two-bedroom apartment will increase by about $0.1264 million, or $126,400.

Even more plainly, we can say that the model predicts that the average price of a two-bedroom apartment in New York City will increase by about $126,400 per year.
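To see the slope acting as a yearly rate of change, here is a minimal Python sketch. It uses the rounded coefficients quoted above; the names SLOPE, INTERCEPT, and predicted_price are chosen for illustration and are not part of the course materials.

```python
# Coefficients of the regression equation p = 0.1264 t + 0.2229 quoted above.
SLOPE = 0.1264      # millions of dollars per year
INTERCEPT = 0.2229  # millions of dollars at t = 0 (1994)


def predicted_price(t):
    """Predicted average price in millions of dollars, t years after 1994."""
    return SLOPE * t + INTERCEPT


# Whatever year we start from, moving one year ahead changes the
# prediction by the slope (up to floating-point rounding).
for t in (0, 3, 7):
    change = predicted_price(t + 1) - predicted_price(t)
    print(f"t={t} to t={t + 1}: change of about ${change * 1_000_000:,.0f}")
```

The starting year does not matter: for a linear model, the predicted change over one year is the same everywhere on the line.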

We can now use the linear regression model to predict future prices. For example, if we wanted to predict what the price of an apartment was in 2008, we could plug in 14 for t in the regression equation (since t=0 is 1994).

Plugging in 14 for t into the regression equation gives p=0.1264(14)+0.2229=1.9925.

This means that, if the trend continued, the average price of a two-bedroom apartment would have been around $1,992,500 in 2008.

You can also use the regression equation to check how closely the model matches the actual price in some years that were given on the table. For example, for 2000 the equation predicts a price of p=0.1264(6)+0.2229=0.9813, or $981,300.

According to the table, the actual price was $950,000, so the regression equation is pretty close.
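The two calculations above can be reproduced with a short Python sketch. The helper predict_for_year and the constant BASE_YEAR are illustrative names, and the coefficients are the rounded ones from the regression equation.

```python
BASE_YEAR = 1994    # t = 0 corresponds to 1994
SLOPE = 0.1264      # rounded slope from the regression equation
INTERCEPT = 0.2229  # rounded p-intercept from the regression equation


def predict_for_year(year):
    """Predicted average price in millions of dollars for a calendar year."""
    t = year - BASE_YEAR  # convert the year into years since 1994
    return SLOPE * t + INTERCEPT


p_2008 = predict_for_year(2008)  # a forecast outside the data range
p_2000 = predict_for_year(2000)  # a year that appears in the table
actual_2000 = 0.95               # table value for 2000, in millions of dollars

print(f"2008 forecast: {p_2008:.4f} million")
print(f"2000 model: {p_2000:.4f} million, table: {actual_2000:.2f} million")
print(f"2000 gap: {abs(p_2000 - actual_2000):.4f} million")
```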

It is important to remember that the regression equation is just a model, and it won’t give the exact values.

If the equation is a good fit to the data, however, it can give a good approximation, so it can be used to forecast what may happen in the future if the current trend continues.

Next, let’s take a quick look at how a regression equation is derived, and then take a look at what the correlation coefficient (or the r-squared value on Excel) tells us about the regression equation.

Let’s take another look at the data points and the regression line.

Why does this particular line give the best “fit” for the data? Why not some other line?

[Figure: scatter plot with the regression line. Horizontal axis: Time t in years since 1994. Vertical axis: Price p in millions of $.]

It has to do with what is called a residual.

A residual is the difference between the actual value of a particular data point and the value that the regression line predicts for it.

[Figure: scatter plot with the regression line. Horizontal axis: Time t in years since 1994. Vertical axis: Price p in millions of $.]

If we zoom in on a particular data point, we can see what a residual is.

Let’s zoom in on this particular data point.

Zooming into this box: We see the data point and the line.

The vertical distance between the line and the data point is the residual.

The idea behind linear regression is to keep the residuals as small as possible.
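As a rough illustration, the residuals for all six data points can be computed in a few lines of Python. This sketch assumes the common sign convention of actual value minus predicted value, so a point above the line has a positive residual; the variable names are illustrative, and the coefficients are the rounded ones from the regression equation.

```python
# The six data points (t in years since 1994, p in millions of dollars).
T = [0, 2, 4, 6, 8, 10]
P = [0.38, 0.40, 0.60, 0.95, 1.20, 1.60]

SLOPE, INTERCEPT = 0.1264, 0.2229  # rounded regression coefficients

# Residual = actual price minus the price the line predicts at the same t.
residuals = [p - (SLOPE * t + INTERCEPT) for t, p in zip(T, P)]

for t, p, res in zip(T, P, residuals):
    predicted = SLOPE * t + INTERCEPT
    print(f"t={t:2d}  actual={p:.2f}  predicted={predicted:.4f}  residual={res:+.4f}")
```

The absolute value of each residual is the vertical distance between the point and the line.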

There is a method that finds the line for which the sum of the squares of all of the residuals is as small as possible.

This is called the least-squares method. You can read about it in the PDF for linear regression.

Since these formulas can get fairly complicated, you will not be required to use them in the course.

You will only need to know how to find a regression line using Excel. You can watch the video on how to do this, or read through the PDF, or both.
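For readers who are curious, here is a minimal Python sketch of what the least-squares method computes for our six data points. It uses the standard textbook formulas for a single-variable least-squares line, which may be presented differently in the course PDF; all names in the code are illustrative.

```python
# The six data points (t in years since 1994, p in millions of dollars).
T = [0, 2, 4, 6, 8, 10]
P = [0.38, 0.40, 0.60, 0.95, 1.20, 1.60]

n = len(T)
t_mean = sum(T) / n
p_mean = sum(P) / n

# Standard least-squares formulas for a line p = slope * t + intercept.
s_tp = sum((t - t_mean) * (p - p_mean) for t, p in zip(T, P))
s_tt = sum((t - t_mean) ** 2 for t in T)
slope = s_tp / s_tt
intercept = p_mean - slope * t_mean


def sum_squared_residuals(m, b):
    """Sum of squared residuals for the line p = m * t + b."""
    return sum((p - (m * t + b)) ** 2 for t, p in zip(T, P))


best = sum_squared_residuals(slope, intercept)
print(f"slope = {slope:.4f}, intercept = {intercept:.4f}")
print(f"sum of squared residuals for this line: {best:.5f}")

# Any other line, such as one with a slightly steeper slope, does worse.
print(f"with a steeper slope: {sum_squared_residuals(slope + 0.01, intercept):.5f}")
```

The slope and intercept computed this way agree with the equation shown on the chart, which suggests that the chart's trendline is a least-squares fit.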

Next, we look at what the correlation coefficient tells us about the regression equation.

Recall that in our graph, a number labeled R² was given. It is the square of what is called the correlation coefficient, denoted by the letter r.

The correlation coefficient tells us how closely the regression line “fits” the data points.

It has a value between -1 and 1. A value very close to 1 indicates a very good fit with a positive sloping linear function.

A value very close to -1 indicates a very good fit with a negative sloping linear function.

A value very close to 0 indicates a very poor fit with the data, so there is little or no linear relationship between the variables in this case.

Excel will not give the value of r; instead, it gives the value of r squared.

The r-squared value basically tells us the same thing, but it will only be between 0 and 1. Because squaring removes the sign, it does not tell us whether the line slopes up or down.

If the r-squared value is close to 1, there is a very good linear fit for the data points.

If the r-squared value is close to 0, there is a very poor linear fit for the data points.
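To connect r and r squared, here is a small Python sketch that computes both for the apartment data. It assumes the standard formula for the (Pearson) correlation coefficient; the function name correlation is illustrative. For a line fitted to one variable, squaring r gives the R² value shown on the chart.

```python
import math

# The six data points (t in years since 1994, p in millions of dollars).
T = [0, 2, 4, 6, 8, 10]
P = [0.38, 0.40, 0.60, 0.95, 1.20, 1.60]


def correlation(xs, ys):
    """Correlation coefficient r of two equal-length lists of numbers."""
    n = len(xs)
    x_mean = sum(xs) / n
    y_mean = sum(ys) / n
    s_xy = sum((x - x_mean) * (y - y_mean) for x, y in zip(xs, ys))
    s_xx = sum((x - x_mean) ** 2 for x in xs)
    s_yy = sum((y - y_mean) ** 2 for y in ys)
    return s_xy / math.sqrt(s_xx * s_yy)


r = correlation(T, P)
print(f"r = {r:.4f}, r squared = {r ** 2:.4f}")

# If the same prices were listed in reverse order (falling over time),
# r would change sign while r squared would stay the same.
r_down = correlation(T, P[::-1])
print(f"reversed prices: r = {r_down:.4f}, r squared = {r_down ** 2:.4f}")
```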

We will now look at some examples of what it looks like with an r-squared value close to 1 and with an r-squared value close to 0.

Consider the following set of data points.

They follow a clear linear pattern, so we should expect the r-squared value to be close to 1.

And it is.

[Figure: scatter plot with the regression line. Chart label: f(x) = 0.509090909090909 x + 1.94, R² = 0.994318181818182]

Now consider the following set of data points.

These points seem to be scattered everywhere and don’t follow any linear pattern.

We expect the r-squared value to be close to 0.

And it is.

[Figure: scatter plot with the regression line. Chart label: f(x) = −0.183030303030303 x + 8.32666666666667, R² = 0.00837766319008881]

So, to summarize, a linear regression equation is a line that most closely fits a given set of data points.

The regression equation can be used to predict future values, or values that are outside of the given data range.

We can find a regression equation for any set of data points, no matter how scattered the data look, but we can tell how closely the data follow a linear pattern by looking at the r-squared value.

An r-squared value close to 1 indicates a very good fit to the given data, and an r-squared value close to zero indicates a very poor fit to the data.

The topic of linear regression is very deep, and we have only given a very brief introduction to it here.

You can read more about it in the PDF given on the Assigned Reading for section 1.4.

Be sure you also watch the video about how to find a linear regression on Excel! You can find the video link in the Assigned Reading for section 1.4.