D.3.3.:  Curve Fitting

Find the curve of best fit to a given set of data.  This means that we need to find an equation which will model the data closely with the smallest error.  You have several options:

Linear Regression                              y = ax + b

Quadratic Regression                        y = ax^2 + bx + c

Cubic Regression                              y = ax^3 + bx^2 + cx + d

Quartic Regression                            y = ax^4 + bx^3 + cx^2 + dx + e

Logarithmic Regression                      y = a + b ln(x)

Exponential Regression                      y = ab^x

Power Regression                              y = ax^b

Logistic Regression                            y = c / (1 + a*e^(-bx))

Sinusoidal Regression                        y = a sin (bx+c) + d

The most commonly used curve-fitting models are Linear, Quadratic, Cubic, Quartic, Logarithmic and Exponential Regression. The model you choose depends on the data, so it helps to graph your data first using a scatter plot.

1.)  Linear Regression

Let’s use the data Height versus Weight from the previous section. Is there a relationship between the height and weight of an individual? You can determine the line that will best fit the data by performing a linear regression using your calculator.

Height (in.)    Weight (lbs.)
62              105
67              124
58              120
60              90
66              130
65              120
69              162
69              134
74              122
76              185

Graph the data and find the best fit line using LinReg:

1.)     Use a scatter plot to graph the data

2.)     Input your data in L1 and L2

3.)     STAT

4.)     CALC

5.)     Select 4:LinReg(ax + b)

6.)     ENTER … ENTER   Note: If you do not get r^2 and r, turn your Diagnostic On.

Interpretation:  Your linear regression line is as follows:

y = 3.45x - 100.62

Slope a = 3.45

y-intercept b = -100.62

Correlation coefficient r = 0.7322

Coefficient of determination r^2 = 0.5361
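
For readers who want to check the calculator output by hand, here is a minimal Python sketch of the least-squares computation behind a linear regression, applied to the height and weight data above. The function name linear_regression and the variable names are illustrative assumptions and are not part of the calculator procedure.

```python
from math import sqrt

heights = [62, 67, 58, 60, 66, 65, 69, 69, 74, 76]            # x, in inches
weights = [105, 124, 120, 90, 130, 120, 162, 134, 122, 185]   # y, in pounds


def linear_regression(xs, ys):
    """Return (a, b, r) for the least-squares line y = a*x + b."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sums of squared deviations and of cross products
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    a = sxy / sxx                 # slope
    b = mean_y - a * mean_x       # y-intercept
    r = sxy / sqrt(sxx * syy)     # correlation coefficient
    return a, b, r


a, b, r = linear_regression(heights, weights)
print(f"slope a = {a:.2f}, y-intercept b = {b:.2f}")
print(f"r = {r:.4f}, r^2 = {r * r:.4f}")
```

Running the sketch should reproduce the slope, y-intercept, r and r^2 listed above, up to rounding.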

Explain the slope a, y-intercept, r and r^2. What do they mean in terms of our data?

Slope:  _______________________________________________

y-intercept:  ___________________________________________

r  =  _________________________________________________

r^2 = _________________________________________________

Activity:  Airline Data

American Airlines Flights Departing from Chicago

Flight                  Gate-to-Gate Minutes (taxiing and flight time)    Miles
To Boston               130                                               868
To Dallas               140                                               803
To Denver               150                                               902
To Indianapolis         52                                                178
To Nashville            85                                                410
To New Orleans          125                                               838
To New York City        120                                               734
To Orlando              157                                               1006
To Toronto              86                                                438
To Washington, D.C.     104                                               613

1. Graph the data. Is there a linear trend?
2. Find the equation of the line of best fit using LinReg:  y =  ________
3. Interpret the slope, y-intercept, r, and r^2.
4. Find the x-intercept. What are the airplanes doing during that time?

2.) Quadratic, Cubic and Quartic Regression

If the data do not follow a straight line or show a linear trend, you may want to explore other options such as QuadReg, CubicReg and QuartReg.  Among the models you try, the one with the highest r^2 or R^2 generally indicates the best fit.  However, curve fitting can be cumbersome and tricky. Sometimes none of these models will fit the data very well.

Example:  Gas consumption vs. speed of car on a 250-mile trip

Speed of Car (mph) = X    Gas Consumption (in gal.) = Y
20                        5.2
30                        5.8
40                        6.2
50                        6.4
60                        6.4
70                        6.2
80                        5.8

Find the equation of best fit:

1.)     Graph the data.

The points show the shape of a parabola, hence use 5:QuadReg

2.)     STAT

3.)     CALC

4.)     Select 5:QuadReg

5.)     ENTER … ENTER   The equation of best fit is:  y = -0.001x^2 + 0.11x + 3.4 with R^2 = 1.
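
The same kind of fit can be sketched outside the calculator. The Python sketch below builds the least-squares normal equations for a polynomial of a chosen degree and solves them by Gaussian elimination. It assumes that this is comparable to what QuadReg computes, which this handout does not confirm, and the helper names solve, poly_fit, evaluate and r_squared are hypothetical.

```python
speeds = [20, 30, 40, 50, 60, 70, 80]            # x, speed of car
gallons = [5.2, 5.8, 6.2, 6.4, 6.4, 6.2, 5.8]    # y, gas consumption in gallons


def solve(matrix, rhs):
    """Solve a small linear system by Gaussian elimination with partial pivoting."""
    n = len(rhs)
    aug = [row[:] + [rhs[i]] for i, row in enumerate(matrix)]
    for col in range(n):
        # Swap in the row with the largest entry in this column
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for row in range(col + 1, n):
            factor = aug[row][col] / aug[col][col]
            for k in range(col, n + 1):
                aug[row][k] -= factor * aug[col][k]
    # Back substitution
    solution = [0.0] * n
    for row in range(n - 1, -1, -1):
        tail = sum(aug[row][k] * solution[k] for k in range(row + 1, n))
        solution[row] = (aug[row][n] - tail) / aug[row][row]
    return solution


def poly_fit(xs, ys, degree):
    """Return least-squares polynomial coefficients, highest power first."""
    powers = range(degree, -1, -1)
    matrix = [[sum(x ** (p + q) for x in xs) for q in powers] for p in powers]
    rhs = [sum(y * x ** p for x, y in zip(xs, ys)) for p in powers]
    return solve(matrix, rhs)


def evaluate(coeffs, x):
    """Evaluate the polynomial (highest power first) at x."""
    result = 0.0
    for c in coeffs:
        result = result * x + c
    return result


def r_squared(xs, ys, coeffs):
    """Return R^2 = 1 - (residual sum of squares) / (total sum of squares)."""
    mean_y = sum(ys) / len(ys)
    ss_res = sum((y - evaluate(coeffs, x)) ** 2 for x, y in zip(xs, ys))
    ss_tot = sum((y - mean_y) ** 2 for y in ys)
    return 1 - ss_res / ss_tot


a, b, c = poly_fit(speeds, gallons, 2)
print(f"y = {a:.3f}x^2 + {b:.2f}x + {c:.2f}")
print(f"R^2 = {r_squared(speeds, gallons, [a, b, c]):.4f}")
```

Compare the printed coefficients and R^2 with what your calculator displays for the same lists.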

Graph the equation in Y1 and compare it with the data.  Is it a good fit?

Suggestion: When the x-values are equally spaced, a quick way to decide which polynomial model to use is to find the differences between consecutive y-values.  Create a table and compute the first, second, third, etc. differences of the y's.  The first level at which the differences are all about the same indicates the degree of the model.  For example, if the second differences are equal you should use a quadratic model, and if the third differences are equal you should use a cubic model.

Speed of Car (mph) = X    Gas Consumption (in gal.) = Y    D1Y    D2Y    D3Y
20                        5.2
30                        5.8
40                        6.2
50                        6.4
60                        6.4
70                        6.2
80                        5.8
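
The difference columns can also be filled in with a few lines of Python. This is a minimal sketch that assumes equally spaced x-values, as in the table; the function name differences is hypothetical, and the rounding to 10 decimal places is only there to hide floating-point noise.

```python
def differences(values):
    """Return successive differences values[i+1] - values[i].

    Results are rounded to 10 decimal places to hide floating-point noise.
    """
    return [round(later - earlier, 10) for earlier, later in zip(values, values[1:])]


# y-values from the table; the x-values (20, 30, ..., 80) are equally spaced,
# which the difference method requires.
gallons = [5.2, 5.8, 6.2, 6.4, 6.4, 6.2, 5.8]

d1 = differences(gallons)   # first differences, column D1Y
d2 = differences(d1)        # second differences, column D2Y
d3 = differences(d2)        # third differences, column D3Y

for label, column in (("D1Y", d1), ("D2Y", d2), ("D3Y", d3)):
    print(label, column)
```

Look for the first column whose entries are all about the same; its level suggests the degree of the polynomial model to try.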

Activity: Fit a curve to the following data.

x    y
1    -4
2    0
3    1
4    0
5    -2
6    0
7    3
8    15

1. What model should you choose?
2. Find the equation:  y = ______________________
3. What is R^2?  _____________
4. Graph the data and the equation you found.  Is this a good model?  __________

Activity:  Fit a curve to the following data.

x    y
1    4
2    1
3    0.5
4    0.2
5    2
6    5
7    7
8    15

1. What model should you choose?
2. Find the equation:  y = ______________________
3. What is R^2?  _____________
4. Graph the data and the equation you found.  Is this a good model?  __________