polyfit

Polynomial curve fitting

Syntax

p = polyfit(x,y,n)

[p,S] = polyfit(x,y,n)

[p,S,mu] = polyfit(x,y,n)

Description

p = polyfit(x,y,n) returns the coefficients for a polynomial p(x) of degree n that is a best fit (in a least-squares sense) for the data in y. The coefficients in p are in descending powers, and the length of p is n+1, where

$p(x) = p_{1} x^{n} + p_{2} x^{n-1} + \dots + p_{n} x + p_{n+1}.$
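As a quick check of the coefficient ordering, the following sketch fits data generated from a known quadratic. The sample values are chosen only for illustration and are not part of the original page.

```matlab
% Sample data from y = 2x^2 - 3x + 1 (illustrative values)
x = 0:4;
y = 2*x.^2 - 3*x + 1;

% Fit a degree-2 polynomial; p has n+1 = 3 elements
p = polyfit(x,y,2);

% p(1) multiplies x^2, p(2) multiplies x, and p(3) is the constant term,
% so p should be close to [2 -3 1] up to round-off
disp(p)
```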


[p,S] = polyfit(x,y,n) also returns a structure S that can be used as an input to polyval to obtain error estimates.


[p,S,mu] = polyfit(x,y,n) performs centering and scaling to improve the numerical properties of both the polynomial and the fitting algorithm. This syntax additionally returns mu, which is a two-element vector with centering and scaling values. mu(1) is mean(x), and mu(2) is std(x). Using these values, polyfit centers x at zero and scales it to have unit standard deviation,

$\hat{x} = \frac{x - \bar{x}}{\sigma_{x}}.$
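The relationship between mu and the scaled variable can be sketched as follows. The data is an assumption made for this illustration; the point is that p is expressed in terms of the scaled variable, so it must be evaluated there.

```matlab
% Illustrative data with large x values (assumed for this sketch)
x = (2000:2010)';
y = 3*(x - 2005).^2 + 1;

% Three outputs request centering and scaling
[p,~,mu] = polyfit(x,y,2);

% mu holds the centering and scaling values described above
muCheck = [mean(x); std(x)];

% p is expressed in the scaled variable, so evaluate it there
xhat = (x - mu(1))/mu(2);
fManual = polyval(p,xhat);
fAuto = polyval(p,x,[],mu);   % same evaluation, performed by polyval
```

Both evaluations should agree to within round-off, and both should reproduce y because the data is exactly quadratic.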


Examples


Fit Polynomial to Trigonometric Function


Generate 10 points equally spaced along a sine curve in the interval [0,4*pi].

x = linspace(0,4*pi,10);
y = sin(x);

Use polyfit to fit a 7th-degree polynomial to the points.

p = polyfit(x,y,7);

Evaluate the polynomial on a finer grid and plot the results.

x1 = linspace(0,4*pi);
y1 = polyval(p,x1);
figure
plot(x,y,'o')
hold on
plot(x1,y1)
hold off

Figure: the sampled points (circle markers) and the fitted polynomial (line).

Fit Polynomial to Set of Points


Create a vector of 5 equally spaced points in the interval [0,1], and evaluate $y (x) = (1 + x)^{- 1}$ at those points.

x = linspace(0,1,5);
y = 1./(1+x);

Fit a polynomial of degree 4 to the 5 points. In general, for n points, you can fit a polynomial of degree n-1 to exactly pass through the points.

p = polyfit(x,y,4);
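One way to confirm that the degree n-1 fit passes through all n points is to evaluate the residual at the data points. This is a small sketch that reuses the values from this example; the variable name maxResidual is introduced here for illustration.

```matlab
x = linspace(0,1,5);
y = 1./(1+x);
p = polyfit(x,y,4);

% With 5 points and degree 4 the fit interpolates the data,
% so the residual at the data points should be at round-off level
maxResidual = max(abs(polyval(p,x) - y));
disp(maxResidual)
```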

Evaluate the original function and the polynomial fit on a finer grid of points between 0 and 2.

x1 = linspace(0,2);
y1 = 1./(1+x1);
f1 = polyval(p,x1);

Plot the function values and the polynomial fit in the wider interval [0,2], with the points used to obtain the polynomial fit highlighted as circles. The polynomial fit is good in the original [0,1] interval, but quickly diverges from the fitted function outside of that interval.

figure
plot(x,y,'o')
hold on
plot(x1,y1)
plot(x1,f1,'r--')
legend('y','y1','f1')

Figure: the data points y (circle markers), the function values y1 (line), and the polynomial fit f1 (dashed red line).

Fit Polynomial to Error Function


First generate a vector of x points, equally spaced in the interval [0,2.5], and then evaluate erf(x) at those points.

x = (0:0.1:2.5)';
y = erf(x);

Determine the coefficients of the approximating polynomial of degree 6.

p = polyfit(x,y,6)

p = 1×7

    0.0084   -0.0983    0.4217   -0.7435    0.1471    1.1064    0.0004

To see how good the fit is, evaluate the polynomial at the data points and generate a table showing the data, fit, and error.

f = polyval(p,x);
T = table(x,y,f,y-f,'VariableNames',{'X','Y','Fit','FitError'})

T=26×4 table
     X        Y          Fit         FitError  
    ___    _______    __________    ___________

      0          0    0.00044117    -0.00044117
    0.1    0.11246       0.11185     0.00060836
    0.2     0.2227       0.22231     0.00039189
    0.3    0.32863       0.32872    -9.7429e-05
    0.4    0.42839        0.4288    -0.00040661
    0.5     0.5205       0.52093    -0.00042568
    0.6    0.60386       0.60408    -0.00022824
    0.7     0.6778       0.67775     4.6383e-05
    0.8     0.7421       0.74183     0.00026992
    0.9    0.79691       0.79654     0.00036515
      1     0.8427       0.84238      0.0003164
    1.1    0.88021       0.88005     0.00015948
    1.2    0.91031       0.91035    -3.9919e-05
    1.3    0.93401       0.93422      -0.000211
    1.4    0.95229       0.95258    -0.00029933
    1.5    0.96611       0.96639    -0.00028097
      ⋮

In this interval, the interpolated values and the actual values agree fairly closely. Create a plot to show how outside this interval, the extrapolated values quickly diverge from the actual data.

x1 = (0:0.1:5)';
y1 = erf(x1);
f1 = polyval(p,x1);
figure
plot(x,y,'o')
hold on
plot(x1,y1,'-')
plot(x1,f1,'r--')
axis([0  5  0  2])
hold off

Figure: the data points (circle markers), erf(x) (solid line), and the polynomial fit (dashed red line) on the interval [0,5].

Use Centering and Scaling to Improve Numerical Properties


Create a table of population data for the years 1750 - 2000 and plot the data points.

year = (1750:25:2000)';
pop = 1e6*[791 856 978 1050 1262 1544 1650 2532 6122 8170 11560]';
T = table(year, pop)

T=11×2 table
    year       pop   
    ____    _________

    1750     7.91e+08
    1775     8.56e+08
    1800     9.78e+08
    1825     1.05e+09
    1850    1.262e+09
    1875    1.544e+09
    1900     1.65e+09
    1925    2.532e+09
    1950    6.122e+09
    1975     8.17e+09
    2000    1.156e+10

plot(year,pop,'o')

Figure: population plotted against year (circle markers).

Use polyfit with three outputs to fit a 5th-degree polynomial using centering and scaling, which improves the numerical properties of the problem. polyfit centers the data in year at 0 and scales it to have a standard deviation of 1, which avoids an ill-conditioned Vandermonde matrix in the fit calculation.

[p,~,mu] = polyfit(T.year, T.pop, 5);
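To see why centering and scaling helps here, the following sketch compares the condition numbers of the Vandermonde matrices built from the raw and the scaled years. It builds the matrices directly rather than calling polyfit, and it assumes implicit expansion (R2016b or later).

```matlab
year = (1750:25:2000)';
n = 5;

% Vandermonde matrix built from the raw years (columns year.^5 ... year.^0)
Vraw = year.^(n:-1:0);

% Vandermonde matrix built from centered and scaled years
z = (year - mean(year))/std(year);
Vscaled = z.^(n:-1:0);

% Compare 2-norm condition numbers
condRaw = cond(Vraw);
condScaled = cond(Vscaled);
fprintf('raw: %g, scaled: %g\n',condRaw,condScaled)
```

The scaled matrix should have a far smaller condition number, which is the improvement the three-output syntax is designed to provide.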

Use polyval with four inputs to evaluate p with the scaled years, (year-mu(1))/mu(2). Plot the results against the original years.

f = polyval(p,year,[],mu);
hold on
plot(year,f)
hold off

Figure: population data (circle markers) with the fitted polynomial (line).

Simple Linear Regression


Fit a simple linear regression model to a set of discrete 2-D data points.

Create a few vectors of sample data points (x,y). Fit a first-degree polynomial to the data.

x = 1:50; 
y = -0.3*x + 2*randn(1,50); 
p = polyfit(x,y,1);

Evaluate the fitted polynomial p at the points in x. Plot the resulting linear regression model with the data.

f = polyval(p,x); 
plot(x,y,'o',x,f,'-') 
legend('data','linear fit')

Figure: the data (circle markers) and the linear fit (line).

Linear Regression with Error Estimate


Fit a linear model to a set of data points and plot the results, including an estimate of a 95% prediction interval.

Create a few vectors of sample data points (x,y). Use polyfit to fit a first-degree polynomial to the data. Specify two outputs to return the coefficients for the linear fit as well as the error estimation structure.

x = 1:100; 
y = -0.3*x + 2*randn(1,100); 
[p,S] = polyfit(x,y,1)

p = 1×2

   -0.3142    0.9614

S = struct with fields:
           R: [2×2 double]
          df: 98
       normr: 22.7673
    rsquared: 0.9407

Evaluate the first-degree polynomial fit in p at the points in x. Specify the error estimation structure as the third input so that polyval calculates an estimate of the standard error. The standard error estimate is returned in delta.

[y_fit,delta] = polyval(p,x,S);

Plot the original data, linear fit, and 95% prediction interval $y \pm 2 Δ$ .

plot(x,y,'bo')
hold on
plot(x,y_fit,'r-')
plot(x,y_fit+2*delta,'m--',x,y_fit-2*delta,'m--')
title('Linear Fit of Data with 95% Prediction Interval')
legend('Data','Linear Fit','95% Prediction Interval')

Figure contains an axes object. The axes object with title Linear Fit of Data with 95% Prediction Interval contains 4 objects of type line. One or more of the lines displays its values using only markers These objects represent Data, Linear Fit, 95% Prediction Interval.
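As a rough sanity check on the interval, you can count how many of the observations fall inside y_fit ± 2*delta. This sketch fixes the random seed with rng(0) so that it is repeatable; the seed and the variable names inside and coverage are assumptions made for illustration.

```matlab
rng(0)  % fix the random seed so the sketch is repeatable (assumption)
x = 1:100;
y = -0.3*x + 2*randn(1,100);

[p,S] = polyfit(x,y,1);
[y_fit,delta] = polyval(p,x,S);

% Fraction of observations inside y_fit +/- 2*delta
inside = abs(y - y_fit) <= 2*delta;
coverage = mean(inside);
fprintf('fraction inside the interval: %.2f\n',coverage)
```

For normally distributed noise, the fraction is expected to be close to 0.95, although it varies from sample to sample.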

Input Arguments


`x` — Query points
vector

Query points, specified as a vector. The points in x correspond to the fitted function values contained in y. If x is not a vector, then polyfit converts it into a column vector x(:).

Warning messages result when x has repeated (or nearly repeated) points or if x might need centering and scaling.

Data Types: single | double
Complex Number Support: Yes

`y` — Fitted values at query points
vector

Fitted values at query points, specified as a vector. The values in y correspond to the query points contained in x. If y is not a vector, then polyfit converts it into a column vector y(:).

Data Types: single | double
Complex Number Support: Yes
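The conversion of non-vector inputs can be sketched as follows. The matrices X and Y are illustrative values; the fit on the matrices should match the fit on their column-vector forms.

```matlab
% Matrix inputs (illustrative values lying on the line y = 2x + 1)
X = [0 1; 2 3];
Y = 2*X + 1;

% polyfit treats the inputs as X(:) and Y(:)
pMatrix = polyfit(X,Y,1);
pVector = polyfit(X(:),Y(:),1);

disp(pMatrix)
disp(pVector)
```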

`n` — Degree of polynomial fit
positive integer scalar

Degree of polynomial fit, specified as a positive integer scalar. n specifies the polynomial power of the left-most coefficient in p.

Output Arguments


`p` — Least-squares fit polynomial coefficients
vector

Least-squares fit polynomial coefficients, returned as a vector. p has length n+1 and contains the polynomial coefficients in descending powers, with the highest power being n. If either x or y contains NaN values and n < length(x), then all elements in p are NaN. If you specify three output arguments to center and scale the data, then polyfit returns different coefficients in p compared to when the data is not centered and scaled.

Use polyval to evaluate p at query points.
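Because a single NaN makes every coefficient NaN, one common workaround is to drop the affected points before fitting. The following is a minimal sketch with made-up data; the mask name ok is hypothetical.

```matlab
x = 1:6;
y = [2 4 NaN 8 10 12];   % illustrative data on y = 2x with one missing value

% Fitting directly returns NaN for every coefficient
pBad = polyfit(x,y,1);

% Keep only the points where both x and y are valid
ok = ~isnan(x) & ~isnan(y);
pGood = polyfit(x(ok),y(ok),1);

disp(pBad)
disp(pGood)
```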

`S` — Error estimation structure
structure

Error estimation structure. This optional output structure is primarily used as an input to the polyval function to obtain error estimates. S contains the fields in this table.

| Field | Description |
| --- | --- |
| `R` | Triangular `R` factor (possibly permuted) from a QR decomposition of the Vandermonde matrix of `x` |
| `df` | Degrees of freedom |
| `normr` | Norm of the residuals |
| `rsquared` | Coefficient of determination, or (unadjusted) R-squared |

If the data in y is random, then an estimate of the covariance matrix of p is (Rinv*Rinv')*normr^2/df, where Rinv is the inverse of R.
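The covariance estimate above can be computed directly from the fields of S. This is a sketch under assumed data and an assumed random seed; the names C and se are introduced here for illustration.

```matlab
rng(1)  % fix the random seed so the sketch is repeatable (assumption)
x = (1:20)';
y = 0.5*x + 1 + 0.1*randn(20,1);

[p,S] = polyfit(x,y,1);

% Covariance estimate of p, following the formula in the text
Rinv = inv(S.R);
C = (Rinv*Rinv')*S.normr^2/S.df;

% Standard errors of the coefficients are the square roots of the diagonal
se = sqrt(diag(C));
disp(se')
```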

If the errors in the data in y are independent and normal with constant variance, then [y,delta] = polyval(...) produces error bounds that contain at least 50% of the predictions. That is, y ± delta contains at least 50% of the predictions of future observations at x.

`mu` — Centering and scaling values
two-element vector

Centering and scaling values, returned as a two-element vector. mu(1) is mean(x), and mu(2) is std(x). These values center the query points in x at zero with unit standard deviation.

Use mu as the fourth input to polyval to evaluate p at the scaled points, (x - mu(1))/mu(2).

Limitations

- In problems with many points, increasing the degree of the polynomial fit using polyfit does not always result in a better fit. High-order polynomials can be oscillatory between the data points, leading to a poorer fit to the data. In those cases, you might use a low-order polynomial fit (which tends to be smoother between points) or a different technique, depending on the problem.
- Polynomials are unbounded, oscillatory functions by nature. Therefore, they are not well-suited to extrapolating bounded data or monotonic (increasing or decreasing) data.
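The oscillation between data points can be illustrated with Runge's function, a standard test case that is not part of the original page. The sketch compares a low-order fit with a high-order fit on the same samples and measures the error on a finer grid.

```matlab
% Runge's function sampled at 11 equally spaced points (illustrative choice)
f = @(x) 1./(1 + 25*x.^2);
x = linspace(-1,1,11);
y = f(x);

pLow = polyfit(x,y,4);     % low-order least-squares fit
pHigh = polyfit(x,y,10);   % degree n-1, passes through all 11 points

% Measure the error between the data points on a fine grid
xf = linspace(-1,1,401);
errLow = max(abs(polyval(pLow,xf) - f(xf)));
errHigh = max(abs(polyval(pHigh,xf) - f(xf)));
fprintf('degree 4: %.3g, degree 10: %.3g\n',errLow,errHigh)
```

Although the degree-10 polynomial matches every sample, its error between the samples is expected to be larger than that of the degree-4 fit because of oscillation near the ends of the interval.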

Algorithms

polyfit uses x to form Vandermonde matrix V with n+1 columns and m = length(x) rows, resulting in the linear system

$\begin{pmatrix} x_{1}^{n} & x_{1}^{n-1} & \cdots & 1 \\ x_{2}^{n} & x_{2}^{n-1} & \cdots & 1 \\ \vdots & \vdots & \ddots & \vdots \\ x_{m}^{n} & x_{m}^{n-1} & \cdots & 1 \end{pmatrix} \begin{pmatrix} p_{1} \\ p_{2} \\ \vdots \\ p_{n+1} \end{pmatrix} = \begin{pmatrix} y_{1} \\ y_{2} \\ \vdots \\ y_{m} \end{pmatrix},$

which polyfit solves with p = V\y. Because the columns of the Vandermonde matrix are powers of the vector x, the condition number of V is often large for high-order fits, and the coefficient matrix can become singular to working precision. In those cases, centering and scaling can improve the numerical properties of the system to produce a more reliable fit.
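The linear system can be reproduced by hand for a small problem. The sketch below builds V explicitly and solves it with the backslash operator; the data is illustrative, and building V with implicit expansion assumes R2016b or later.

```matlab
x = (0:0.5:3)';
y = cos(x);
n = 3;

% Vandermonde matrix with columns x.^n, x.^(n-1), ..., x.^0
V = x.^(n:-1:0);

% Least-squares solution of V*p = y
pManual = V\y;

% polyfit returns the same coefficients as a row vector
pFit = polyfit(x,y,n);
disp([pManual'; pFit])
```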

Extended Capabilities


Tall Arrays
Calculate with arrays that have more rows than fit in memory.

The polyfit function supports tall arrays with the following usage notes and limitations:

x and y must be column vectors.

For more information, see Tall Arrays.

C/C++ Code Generation
Generate C and C++ code using MATLAB® Coder™.

Usage notes and limitations:

For input arguments x and y:
- You must specify the input vector as a fixed-size or variable-length vector at code generation time. Either the first or the second dimension of the vector can be variable size. All other dimensions must have a fixed size of 1.

GPU Code Generation
Generate CUDA® code for NVIDIA® GPUs using GPU Coder™.

Refer to the usage notes and limitations in the C/C++ Code Generation section. The same usage notes and limitations apply to GPU code generation.

Thread-Based Environment
Run code in the background using MATLAB® `backgroundPool` or accelerate code with Parallel Computing Toolbox™ `ThreadPool`.

The polyfit function fully supports thread-based environments. For more information, see Run MATLAB Functions in Thread-Based Environment.

GPU Arrays
Accelerate code by running on a graphics processing unit (GPU) using Parallel Computing Toolbox™.

The polyfit function fully supports GPU arrays. To run the function on a GPU, specify the input data as a gpuArray (Parallel Computing Toolbox). For more information, see Run MATLAB Functions on a GPU (Parallel Computing Toolbox).

Distributed Arrays
Partition large arrays across the combined memory of your cluster using Parallel Computing Toolbox™.

The polyfit function fully supports distributed arrays. For more information, see Run MATLAB Functions with Distributed Arrays (Parallel Computing Toolbox).

Version History

Introduced before R2006a


R2024a: Error estimation structure includes R-squared value

When you return the error estimation structure S as a second output, the structure includes field rsquared. rsquared is the coefficient of determination, or (unadjusted) R-squared value. Use S with the polyval function to obtain error estimates.
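As a sketch of how the rsquared field relates to the fit, the following compares it with the coefficient of determination computed by hand. The data is made up for illustration, and the code requires R2024a or later because it reads S.rsquared.

```matlab
x = (1:10)';
y = [2.1 3.9 6.2 7.8 10.1 12.2 13.8 16.1 18.0 20.2]';   % illustrative data

[p,S] = polyfit(x,y,1);
yfit = polyval(p,x);

% Unadjusted R-squared: 1 - (residual sum of squares)/(total sum of squares)
r2 = 1 - sum((y - yfit).^2)/sum((y - mean(y)).^2);

% Value reported by polyfit (R2024a or later)
r2Reported = S.rsquared;
fprintf('manual: %.6f, reported: %.6f\n',r2,r2Reported)
```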
