3.6 – Approximate Linear Relationships
In this section, we deal with relationships that are not exactly linear. In nature, business, and countless other settings, we find relationships that are almost linear but show some variation. To create an equation that approximately describes such a situation, we can start by graphing the data points. This graph is called a scatter plot. We will see that even the best linear models have some variation. This deviation is due to many factors: some we cannot control, and some we can change, such as measurement error. A point that doesn’t seem to fit the data, when most others do, is called an outlier.
To make a scatter plot, we create a graph with given points:
Make a scatter plot of the following ordered pairs: |
---|
![]() |
We can see that the points in the previous example are roughly in a straight line. This makes it a reasonable approximate linear relationship. In fact, we can draw a line through the points to show this approximate relationship. We call this line the “best fit” line, or the line of “regression”.
Using the points in the previous example, we will draw a best fit line |
---|
Make a scatter plot of the following ordered pairs and add a best fit line |
---|
![]() |
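A best fit line is often drawn by eye, but it can also be computed. The sketch below uses the least-squares method, one standard way to find a regression line; it is not necessarily how the lines in the examples above were drawn. The function name `best_fit_line` and the data points are hypothetical and chosen only for illustration.

```python
def best_fit_line(points):
    """Return (slope, intercept) of the least-squares line through points."""
    n = len(points)
    mean_x = sum(x for x, _ in points) / n
    mean_y = sum(y for _, y in points) / n
    # Sum of products of deviations, and sum of squared x-deviations.
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in points)
    sxx = sum((x - mean_x) ** 2 for x, _ in points)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


# Hypothetical ordered pairs, NOT the points from the examples above.
points = [(1, 2), (2, 4), (3, 5), (4, 4), (5, 5)]
slope, intercept = best_fit_line(points)
print(f"y = {slope:.2f}x + {intercept:.2f}")
```

For these sample points the computed line has a slope of about 0.6 and a y-intercept of about 2.2. The points do not lie exactly on that line, which is the variation we expect in an approximate linear relationship.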
Two scientists (Margarette Brooks and Amos Dolbear) counted the number of cricket chirps heard on various days and recorded both the counts and the temperature on each of those days. A table of such findings is posted below. Create a scatter plot with “chirps counted in 15 seconds” as the ![]() ![]() |
---|
![]() |
The two scientists of the previous example found that there is a relationship between the frequency of cricket chirps and the current temperature. They even created the equation of the best fit line. It is often given as T = N + 40, where T is the temperature (in degrees Fahrenheit), and N is the number of chirps counted in 15 seconds.
We can use the best fit line to predict values. In the cricket example, we saw that the linear regression line could have the equation T = N + 40, with N the number of chirps counted in 15 seconds. Thus, if we counted 25 chirps in 15 seconds, we could substitute 25 into the equation to estimate the temperature.
Using the equation ![]() |
---|
We simply evaluate ![]() Checking this value on the scatter plot, we can see that the values would fit very nicely with the data. |
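The same evaluation can be written as a short function. This is a minimal sketch that assumes the commonly quoted form of the cricket equation, T = N + 40 (temperature in degrees Fahrenheit, N chirps in 15 seconds). The function name `estimate_temperature` is hypothetical.

```python
def estimate_temperature(chirps_in_15_seconds):
    """Estimate temperature (degrees Fahrenheit) from a 15-second chirp count.

    Assumed model: T = N + 40. This is an assumption about the best fit
    line, not a value taken from the data table.
    """
    return chirps_in_15_seconds + 40


print(estimate_temperature(25))  # 25 + 40 = 65
```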
Estimating a value this way is called interpolation as long as the value being estimated falls within the experimental data (the data that has been collected). We can reasonably assume that our estimations are useful as long as this is the case.
When the value we wish to estimate falls outside the experimental data, we call this extrapolation. One must be extremely careful when extrapolating: quite often, the trendline no longer fits outside the range of collected values.
For the cricket chirps, find the estimated temperature when 90 chirps are heard in 15 seconds. |
---|
90 is well outside the range of chirp counts we have collected in the data. Let’s see if the estimated temperature makes sense here.
It is fairly unreasonable to assume that the crickets would be chirping in such a hyper manner when they are slowly cooking to death! |
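One way to guard against this mistake is to check whether an input lies inside the range of collected values before trusting the estimate. The sketch below makes two assumptions: the list `observed_chirps` is hypothetical (it is not the data table from the cricket example), and the model is assumed to be T = N + 40 in degrees Fahrenheit.

```python
def classify_estimate(x, observed_xs):
    """Label an estimate at x as interpolation or extrapolation."""
    lowest, highest = min(observed_xs), max(observed_xs)
    if lowest <= x <= highest:
        return "interpolation"
    return "extrapolation"


# Hypothetical chirp counts (per 15 seconds), NOT the table from the text.
observed_chirps = [18, 20, 21, 23, 27, 30, 34, 39]

for chirps in (25, 90):
    estimate = chirps + 40  # assumed model: T = N + 40, degrees Fahrenheit
    print(chirps, estimate, classify_estimate(chirps, observed_chirps))
```

Under these assumptions, 25 chirps is an interpolation with an estimate of 65 degrees, while 90 chirps is an extrapolation with an estimate of 130 degrees, a value that should be treated with suspicion.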
An outlier is a data point that does not seem to fit the rest of the data and that skews the line of best fit.
A teacher tests her students’ reading levels (on a scale from 1 to 20) and their IQ scores, and finds the data listed below. Make a scatter plot to see if the data are linearly related. |
---|
![]() |
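To see how much a single outlier can skew a line of best fit, we can compute the line with and without the suspect point. The sketch below uses the least-squares method and entirely hypothetical numbers; they are not the teacher's data, and the helper name `fit_line` is an assumption made for illustration.

```python
def fit_line(points):
    """Return (slope, intercept) of the least-squares line through points."""
    n = len(points)
    mean_x = sum(x for x, _ in points) / n
    mean_y = sum(y for _, y in points) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in points)
    sxx = sum((x - mean_x) ** 2 for x, _ in points)
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


# Hypothetical points lying exactly on y = 2x + 1 ...
clean_points = [(1, 3), (2, 5), (3, 7), (4, 9), (5, 11)]
# ... plus one hypothetical outlier far above that line.
with_outlier = clean_points + [(6, 30)]

slope_clean, intercept_clean = fit_line(clean_points)
slope_skewed, intercept_skewed = fit_line(with_outlier)

print(f"without outlier: y = {slope_clean:.2f}x + {intercept_clean:.2f}")
print(f"with outlier:    y = {slope_skewed:.2f}x + {intercept_skewed:.2f}")
```

In this made-up data, one outlier more than doubles the slope of the fitted line, even though the other five points agree perfectly with each other.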