Programmatic Fitting

MATLAB Functions for Polynomial Models

Two MATLAB functions can model your data with a polynomial.

Polynomial Fit Functions

Function    Description

polyfit     polyfit(x,y,n) finds the coefficients of a polynomial p(x) of degree n that fits the y data by minimizing the sum of the squares of the deviations of the data from the model (least-squares fit).

polyval     polyval(p,x) returns the value of a polynomial of degree n that was determined by polyfit, evaluated at x.
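As a minimal sketch of how the two functions work together, the following fits a quadratic to points generated from a known polynomial and then evaluates the fit. The data here is illustrative and not part of the original example; the variable name yfit is an assumption.

```matlab
% Illustrative data generated from y = 2x^2 - 3x + 1 (not from the original text)
x = 0:4;
y = 2*x.^2 - 3*x + 1;

% Fit a degree-2 polynomial; coefficients come back in descending powers
p = polyfit(x,y,2);

% Evaluate the fitted polynomial at the original x values
yfit = polyval(p,x);
```

Because the data lies exactly on a quadratic, p should reproduce the generating coefficients up to round-off.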

For example, suppose you measure a quantity y at several values of time t:

t = [0 0.3 0.8 1.1 1.6 2.3];
y = [0.6 0.67 1.01 1.35 1.47 1.25];
plot(t,y,'o')

Plot of y Versus t

Plot of time (t) on the x axis and quantity (y) on the y axis.

You can try modeling this data using a second-degree polynomial function:

$y = a_0 + a_1 t + a_2 t^2$

The unknown coefficients a0, a1, and a2 are computed by minimizing the sum of the squares of the deviations of the data from the model (least-squares fit).
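Written out, the least-squares criterion for this model can be sketched as the quantity below. The symbols S (the sum of squared deviations) and m (the number of data points) are notation introduced here for illustration and do not appear in the original.

```latex
S(a_0, a_1, a_2) = \sum_{i=1}^{m} \left( y_i - \left( a_0 + a_1 t_i + a_2 t_i^2 \right) \right)^2
```

The fitted coefficients are the values of a0, a1, and a2 for which S is smallest.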

To find the polynomial coefficients, type the following at the MATLAB prompt:

p=polyfit(t,y,2)

MATLAB calculates the polynomial coefficients in descending powers:

p =
   -0.2942    1.0231    0.4981

The second-degree polynomial model of the data is given by the following equation:

$y = 0.4981 + 1.0231\,t - 0.2942\,t^2$
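As a cross-check, the same least-squares problem can be posed directly as an overdetermined linear system and solved with the backslash operator. This is a sketch for comparison, not a description of how polyfit is implemented; the names V, pCheck, and maxDiff are assumptions.

```matlab
t = [0 0.3 0.8 1.1 1.6 2.3];
y = [0.6 0.67 1.01 1.35 1.47 1.25];

% Coefficients from polyfit, in descending powers
p = polyfit(t,y,2);

% Matrix whose columns are t.^2, t, and 1, matching the descending order
V = [t'.^2  t'  ones(numel(t),1)];

% Solve V*c = y' in the least-squares sense and return a row vector
pCheck = (V\y')';

% Largest difference between the two coefficient vectors
maxDiff = max(abs(p - pCheck));
```

The two coefficient vectors should agree to within round-off.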

To plot the model with the data, evaluate the polynomial at uniformly spaced times t2 and overlay the original data on a plot:

t2 = 0:0.1:2.8;     % Define a uniformly spaced time vector
y2=polyval(p,t2);   % Evaluate the polynomial at t2
figure
plot(t,y,'o',t2,y2) % Plot the fit on top of the data 
                    % in a new Figure window

Plot of Data (Points) and Model (Line)

Plot of the data and the quadratic fit

Use the following syntax to calculate the residuals:

y2 = polyval(p,t);      % Evaluate the model at the data time vector
res = y - y2;           % Calculate the residuals (data minus model)
figure, plot(t,res,'+') % Plot the residuals

Plot of the Residuals

Notice that the second-degree fit roughly follows the basic shape of the data, but does not capture the smooth curve on which the data seems to lie. There appears to be a pattern in the residuals, which indicates that a different model might be necessary. A fifth-degree polynomial (shown next) does a better job of following the fluctuations in the data.

Fifth-Degree Polynomial Fit
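One way to compare the two fits numerically is to compute the sum of squared residuals for each degree, as in the sketch below. Note that with six data points, a fifth-degree polynomial has six coefficients and therefore passes through every point, so its near-zero residuals do not by themselves show that it is a better model. The names p2, p5, sse2, and sse5 are assumptions.

```matlab
t = [0 0.3 0.8 1.1 1.6 2.3];
y = [0.6 0.67 1.01 1.35 1.47 1.25];

% Second-degree and fifth-degree least-squares fits
p2 = polyfit(t,y,2);
p5 = polyfit(t,y,5);

% Sum of squared residuals at the data points for each fit
sse2 = sum((y - polyval(p2,t)).^2);
sse5 = sum((y - polyval(p5,t)).^2);

% Evaluate the fifth-degree fit on a finer grid to see its behavior between points
t2 = 0:0.1:2.8;
y5 = polyval(p5,t2);
```

Plotting y5 against t2 together with the data shows how the higher-degree polynomial behaves between and beyond the measured times.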

Linear Model with Nonpolynomial Terms

When a polynomial function does not produce a satisfactory model of your data, you can try using a linear model with nonpolynomial terms. For example, consider the following function that is linear in the parameters a0, a1, and a2, but nonlinear in the t data:

$y = a_0 + a_1 e^{-t} + a_2 t e^{-t}$

You can compute the unknown coefficients a0, a1, and a2 by constructing and solving a set of simultaneous equations. The following syntax accomplishes this by forming a design matrix, where each column represents a variable used to predict the response (a term in the model) and each row corresponds to one observation of those variables:

% Enter t and y as column vectors
t = [0 0.3 0.8 1.1 1.6 2.3]';
y = [0.6 0.67 1.01 1.35 1.47 1.25]';

% Form the design matrix
X = [ones(size(t))  exp(-t)  t.*exp(-t)];

% Calculate model coefficients
a = X\y

a =
    1.3983
   -0.8860
    0.3085

Therefore, the model of the data is given by

$y = 1.3983 - 0.8860\,e^{-t} + 0.3085\,t e^{-t}$

Now evaluate the model at regularly spaced points and plot the model with the original data, as follows:

T = (0:0.1:2.5)';
Y = [ones(size(T))  exp(-T)  T.*exp(-T)]*a;
plot(T,Y,'-',t,y,'o'), grid on

Linear Fit with Nonpolynomial Terms
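The residuals of this model can be computed in the same way as for the polynomial fit. The sketch below also checks a standard property of a least-squares solution: the residual vector is orthogonal to every column of the design matrix. The names res, normRes, and orthCheck are assumptions.

```matlab
% Data as column vectors
t = [0 0.3 0.8 1.1 1.6 2.3]';
y = [0.6 0.67 1.01 1.35 1.47 1.25]';

% Design matrix and least-squares coefficients
X = [ones(size(t))  exp(-t)  t.*exp(-t)];
a = X\y;

% Residuals (data minus model) and their 2-norm
res = y - X*a;
normRes = norm(res);

% For a least-squares solution, X'*res should be zero up to round-off
orthCheck = norm(X'*res);
```

A value of orthCheck near zero indicates that no further adjustment of the three coefficients can reduce the sum of squared residuals.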

Multiple Regression

When y is a function of more than one predictor variable, the matrix equations that express the relationships among the variables must be expanded to accommodate the additional data. This is called multiple regression.

Suppose you measure a quantity y for several values of x1 and x2. Enter these variables in the MATLAB Command Window, as follows:

x1 = [.2 .5 .6 .8 1.0 1.1]';
x2 = [.1 .3 .4 .9 1.1 1.4]';
y  = [.17 .26 .28 .23 .27 .24]';

A model of this data is of the form

$y = a_0 + a_1 x_1 + a_2 x_2$

Multiple regression solves for the unknown coefficients a0, a1, and a2 by minimizing the sum of the squares of the deviations of the data from the model (least-squares fit).

Construct and solve the set of simultaneous equations by forming a design matrix, X, and solving for the parameters by using the backslash operator:

X = [ones(size(x1))  x1  x2];
a = X\y

a =
    0.1018
    0.4844
   -0.2847

The least-squares fit model of the data is

$y = 0.1018 + 0.4844\,x_1 - 0.2847\,x_2$

To validate the model, find the maximum of the absolute value of the deviation of the data from the model:

Y = X*a;
MaxErr = max(abs(Y - y))

MaxErr = 
     0.0038

This value is much smaller than any of the data values, indicating that this model accurately follows the data.
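Once the coefficients are known, the model can be evaluated at a new pair of predictor values, and the overall quality of the fit can be summarized with the coefficient of determination. The sketch below shows both. The new point (x1New, x2New) is a hypothetical value chosen for illustration, and R2 is a summary statistic not used in the original text.

```matlab
x1 = [.2 .5 .6 .8 1.0 1.1]';
x2 = [.1 .3 .4 .9 1.1 1.4]';
y  = [.17 .26 .28 .23 .27 .24]';

% Design matrix and least-squares coefficients
X = [ones(size(x1))  x1  x2];
a = X\y;

% Predict the response at a hypothetical new point
x1New = 0.7;
x2New = 0.6;
yNew = [1 x1New x2New]*a;

% Coefficient of determination: 1 - (residual sum of squares)/(total sum of squares)
Yfit = X*a;
SSres = sum((y - Yfit).^2);
SStot = sum((y - mean(y)).^2);
R2 = 1 - SSres/SStot;
```

A value of R2 close to 1 means the model accounts for most of the variation in y over the measured points.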

Example: Programmatic Fitting

In this example, you use MATLAB functions to calculate correlation coefficients, fit a polynomial to the data, and plot and calculate confidence bounds.

This example uses the data in census.mat, which contains U.S. population data for the years 1790 to 1990.

To load and plot the data, type the following commands at the MATLAB prompt:

load census
plot(cdate,pop,'ro')

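This adds two variables to the MATLAB workspace: cdate, a column vector containing the census years from 1790 to 1990, and pop, a column vector containing the U.S. population values that correspond to each year in cdate.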

The following plot of the data shows a strong pattern, which indicates a high correlation between the variables.

U.S. Population from 1790 to 1990

Calculating Correlation Coefficients

In this portion of the example, you determine the statistical correlation between the variables cdate and pop to justify modeling the data. For more information about correlation coefficients, see Linear Correlation.

Type the following syntax at the MATLAB prompt:

corrcoef(cdate,pop)

MATLAB calculates the following correlation-coefficient matrix:

ans =

    1.0000    0.9597
    0.9597    1.0000

The diagonal matrix elements represent the perfect correlation of each variable with itself and are equal to 1. The off-diagonal elements are very close to 1, indicating that there is a strong statistical correlation between the variables cdate and pop.
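To make the meaning of the off-diagonal element concrete, the sketch below computes it by hand from the covariance matrix and compares it with the output of corrcoef. It uses the small t and y vectors from the polynomial example so that it runs without census.mat; the names C and rManual are assumptions.

```matlab
% Small data set reused from the polynomial example, as column vectors
t = [0 0.3 0.8 1.1 1.6 2.3]';
y = [0.6 0.67 1.01 1.35 1.47 1.25]';

% Correlation-coefficient matrix from corrcoef
R = corrcoef(t,y);

% 2-by-2 covariance matrix of the two variables
C = cov([t y]);

% Correlation coefficient: covariance divided by the product of standard deviations
rManual = C(1,2) / sqrt(C(1,1)*C(2,2));
```

R(1,2) and rManual should agree to within round-off.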

Fitting a Polynomial to the Data

This portion of the example applies the polyfit and polyval MATLAB functions to model the data:

% Calculate fit parameters
[p,ErrorEst] = polyfit(cdate,pop,2);
% Evaluate the fit
pop_fit = polyval(p,cdate,ErrorEst);
% Plot the data and the fit
plot(cdate,pop_fit,'-',cdate,pop,'+');
% Annotate the plot
legend('Polynomial Model','Data');
xlabel('Census Year');
ylabel('Population (millions)');

The following figure shows that the quadratic-polynomial fit provides a good approximation to the data:

Quadratic Polynomial Fit to the Census Data

To calculate the residuals for this fit, type the following syntax at the MATLAB prompt:

res = pop - pop_fit;
figure, plot(cdate,res,'+')

Residuals for the Quadratic Polynomial Model

Notice that the plot of the residuals exhibits a pattern, which indicates that a second-degree polynomial might not be appropriate for modeling this data.
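When the predictor values are large, as census years are, a polynomial fit on the raw values can be poorly conditioned. polyfit accepts a third output that requests centering and scaling of the predictor, which usually helps. The sketch below uses synthetic year and value data as a stand-in (yr and val are assumptions, not the census values).

```matlab
% Synthetic stand-in data on the same year grid (not the census values)
yr  = (1790:10:1990)';
val = 0.005*(yr - 1790).^2 + 4;

% Requesting mu makes polyfit fit in the variable (yr - mu(1))/mu(2),
% where mu(1) is mean(yr) and mu(2) is std(yr)
[p,S,mu] = polyfit(yr,val,2);

% Pass mu back to polyval so that it applies the same transformation
valFit = polyval(p,yr,S,mu);

% Residuals of the centered and scaled fit
res = val - valFit;
```

Note that the coefficients in p then refer to the transformed variable, not to yr itself, so they should always be evaluated with polyval and the same mu.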

Plotting and Calculating Confidence Bounds

Confidence bounds are confidence intervals for a predicted response. The width of the interval indicates the degree of certainty of the fit.

This example applies polyfit and polyval to the census sample data to produce confidence bounds for a second-order polynomial model.

The following syntax uses an interval of $\pm 2\Delta$ (twice the prediction error estimate delta returned by polyval), which corresponds to a 95% confidence interval for large samples:

% Evaluate the fit and the prediction error estimate (delta)
[pop_fit,delta] = polyval(p,cdate,ErrorEst);
% Plot the data, the fit, and the confidence bounds
plot(cdate,pop,'+',...
     cdate,pop_fit,'g-',...
     cdate,pop_fit+2*delta,'r:',...
     cdate,pop_fit-2*delta,'r:'); 
% Annotate the plot
xlabel('Census Year');
ylabel('Population (millions)');
grid on

The 95% interval indicates that you have a 95% chance that a new observation will fall within the bounds.

Quadratic Polynomial Fit with Confidence Bounds

Plot of error bounds for a second-order polynomial model
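The same bounds can be checked numerically by counting how many observations fall inside them. The sketch below does this for the small t and y data set from the polynomial example so that it runs without census.mat; the names loBound, hiBound, inside, and fracInside are assumptions.

```matlab
t = [0 0.3 0.8 1.1 1.6 2.3];
y = [0.6 0.67 1.01 1.35 1.47 1.25];

% Fit a quadratic and keep the error-estimation structure S
[p,S] = polyfit(t,y,2);

% Evaluate the fit and the prediction error estimate at the data points
[yfit,delta] = polyval(p,t,S);

% Bounds at plus and minus two error estimates
loBound = yfit - 2*delta;
hiBound = yfit + 2*delta;

% Logical vector marking observations inside the bounds, and their fraction
inside = (y >= loBound) & (y <= hiBound);
fracInside = mean(inside);
```

With so few points this is only a sanity check, since the 95% interpretation applies to large samples.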

  



© 1984-2009 The MathWorks, Inc.