Fitting Data

The Fitting Process

You fit data using the Fitting GUI. To open the Fitting GUI, click the Fitting button in Curve Fitting Tool.

The Fitting GUI is shown below for the census data described in Getting Started, followed by the general steps you use when fitting any data set.

  1. Select a data set and fit name.

  2. Select an exclusion rule.

    If you want to exclude data from a fit, select an exclusion rule from the Exclusion rule list. The list contains only exclusion rules that are compatible with the current data set. An exclusion rule is compatible with the current data set if their lengths are identical, or if it is created by sectioning only.

  3. Select a fit type and fit options, fit the data, and evaluate the goodness of fit.

  4. Compare fits.

  5. Save the fit results.

    If the fit is good, save the results as a structure to the MATLAB workspace. Otherwise, modify the fit options or select another model.
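For readers who prefer the command line, the five steps above can be sketched with the toolbox functions excludedata and fit. This is only an illustration under stated assumptions: the census variables cdate and pop, the two polynomial models, and the exclusion domain are choices made here, not part of the GUI procedure.

```matlab
load census                                   % step 1: data set (cdate, pop)

% Step 2: exclusion rule created by sectioning.
% Points outside the domain [1800 1990] are marked as excluded (assumed domain).
excl = excludedata(cdate, pop, 'domain', [1800 1990]);

% Step 3: choose a fit type and options, fit, and collect goodness-of-fit statistics.
[fitQuad, gofQuad] = fit(cdate, pop, 'poly2', 'Exclude', excl, 'Normalize', 'on');

% Step 4: compare against a second candidate model.
[fitCubic, gofCubic] = fit(cdate, pop, 'poly3', 'Exclude', excl, 'Normalize', 'on');
fprintf('RMSE: poly2 = %.3f, poly3 = %.3f\n', gofQuad.rmse, gofCubic.rmse);

% Step 5: keep the preferred result as a structure in the workspace.
results = struct('fit', fitQuad, 'gof', gofQuad);
```

Which fit to keep remains a judgment call based on the residuals and the goodness-of-fit statistics; the sketch keeps the quadratic fit only as an example.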

Parametric Fitting

Introduction

Parametric fitting involves finding coefficients (parameters) for one or more models that you fit to data. The data is assumed to be statistical in nature and is divided into two components: a deterministic component and a random component.

data = deterministic component + random component

The deterministic component is given by a parametric model and the random component is often described as error associated with the data.

data = model + error

The model is a function of the independent (predictor) variable and one or more coefficients. The error represents random variations in the data that follow a specific probability distribution (usually Gaussian). The variations can come from many different sources, but are always present at some level when you are dealing with measured data. Systematic variations can also exist, but they can lead to a fitted model that does not represent the data well.

The model coefficients often have physical significance. For example, suppose you have collected data that corresponds to a single decay mode of a radioactive nuclide, and you want to estimate the half-life (T1/2) of the decay. The law of radioactive decay states that the activity of a radioactive substance decays exponentially in time. Therefore, the model to use in the fit is given by

$$y = y_0 e^{-\lambda t}$$

where y0 is the number of nuclei at time t = 0, and λ is the decay constant. The data can be described by

$$\text{data} = y_0 e^{-\lambda t} + \text{error}$$

Both y0 and λ are coefficients that are estimated by the fit. Because T1/2 = ln(2)/λ, the fitted value of the decay constant yields the fitted half-life. However, because the data contains some error, the deterministic component of the equation cannot be determined exactly from the data. Therefore, the coefficients and the half-life calculation have some uncertainty associated with them. If the uncertainty is acceptable, then you are done fitting the data. If the uncertainty is not acceptable, you might have to reduce it, either by collecting more data or by reducing the measurement error, collecting new data, and repeating the model fit.

In other situations where there is no theory to dictate a model, you might also modify the model by adding or removing terms, or substitute an entirely different model.
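As a rough command-line illustration of the decay example, the sketch below fits the library model exp1 (which has the form a*exp(b*x), so y0 = a and λ = -b) to synthetic data and converts the fitted decay constant and its confidence bounds into a half-life. The decay constant, amplitude, noise level, and time grid are all assumptions invented for the example.

```matlab
% Synthetic single-mode decay data (every number here is an illustrative assumption).
rng(0);                                   % reproducible noise (rng needs a recent MATLAB)
lambdaTrue = 0.25;                        % assumed decay constant
t = (0:0.5:20)';                          % time
y = 1000*exp(-lambdaTrue*t) + 5*randn(size(t));

% Library model 'exp1' is a*exp(b*x), so lambda = -b.
f = fit(t, y, 'exp1');
lambdaFit = -f.b;
halfLife  = log(2)/lambdaFit;

% 95% confidence bounds on b translate directly into bounds on the half-life.
ci = confint(f);                          % rows: lower/upper, columns: a, b
halfLifeBounds = sort(log(2)./(-ci(:,2)));
fprintf('T1/2 = %.3f (95%% bounds %.3f to %.3f)\n', halfLife, halfLifeBounds);
```

If the bounds are wider than you can accept, the options described above apply: collect more data or reduce the measurement error, then refit.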

Library Models

Curve Fitting Toolbox parametric library models are described below.

Exponentials.   The toolbox provides a one-term and a two-term exponential model.

$$y = a e^{bx}$$

$$y = a e^{bx} + c e^{dx}$$

Exponentials are often used when the rate of change of a quantity is proportional to the current amount of the quantity. If the coefficient in the exponent (b or d) is negative, y represents exponential decay. If the coefficient is positive, y represents exponential growth.

For example, a single radioactive decay mode of a nuclide is described by a one-term exponential. a is interpreted as the initial number of nuclei, b is the decay constant, x is time, and y is the number of remaining nuclei after a specific amount of time passes. If two decay modes exist, then you must use the two-term exponential model. For each additional decay mode, you add another exponential term to the model.

Examples of exponential growth include contagious diseases for which a cure is unavailable, and biological populations whose growth is uninhibited by predation, environmental factors, and so on.

Fourier Series.   The Fourier series is a sum of sine and cosine functions that is used to describe a periodic signal. It is represented in either the trigonometric form or the exponential form. The toolbox provides the trigonometric Fourier series form shown below,

$$y = a_0 + \sum_{i=1}^{n} \left[ a_i \cos(i w x) + b_i \sin(i w x) \right]$$

where a0 models a constant (intercept) term in the data and is associated with the i = 0 cosine term, w is the fundamental frequency of the signal, n is the number of terms (harmonics) in the series, and 1 ≤ n ≤ 8.

For more information about the Fourier series, refer to Fourier Transforms in the MATLAB documentation.

Gaussian.   The Gaussian model is used for fitting peaks, and is given by the equation

$$y = \sum_{i=1}^{n} a_i e^{-\left[(x - b_i)/c_i\right]^2}$$

where a is the amplitude, b is the centroid (location), c is related to the peak width, n is the number of peaks to fit, and 1 ≤ n ≤ 8.

Gaussian peaks are encountered in many areas of science and engineering. For example, line emission spectra and chemical concentration assays can be described by Gaussian peaks.

Polynomials.   Polynomial models are given by

$$y = \sum_{i=1}^{n+1} p_i x^{n+1-i}$$

where n + 1 is the order of the polynomial, n is the degree of the polynomial, and 1 ≤ n ≤ 9. The order gives the number of coefficients to be fit, and the degree gives the highest power of the predictor variable.

In this guide, polynomials are described in terms of their degree. For example, a third-degree (cubic) polynomial is given by

$$y = p_1 x^3 + p_2 x^2 + p_3 x + p_4$$

Polynomials are often used when a simple empirical model is required. The model can be used for interpolation or extrapolation, or it can be used to characterize data using a global fit. For example, the temperature-to-voltage conversion for a Type J thermocouple in the 0 °C to 760 °C temperature range is described by a seventh-degree polynomial.

The main advantages of polynomial fits are reasonable flexibility for data that is not too complicated and linearity in the coefficients, which makes the fitting process simple. The main disadvantage is that high-degree fits can become unstable. Additionally, polynomials of any degree can provide a good fit within the data range, but can diverge wildly outside that range. Therefore, you should exercise caution when extrapolating with polynomials. Refer to Determining the Best Fit for examples of good and poor polynomial fits to census data.

Note that when you fit with high-degree polynomials, the fitting procedure uses the predictor values as the basis for a matrix with very large values, which can result in scaling problems. To deal with this, you should normalize the data by centering it at zero mean and scaling it to unit standard deviation. You normalize data by selecting the Center and scale X data check box on the Fitting GUI.
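The scaling problem can be seen directly by comparing the condition number of the polynomial design matrix before and after centering and scaling. The sketch below uses the census variables cdate and pop; the degree of 6 is an arbitrary choice for illustration, and the 'Normalize' fit option is used as the command-line counterpart of the Center and scale X data check box.

```matlab
load census                               % cdate and pop
n = 6;                                    % polynomial degree, chosen for illustration

% Design (Vandermonde) matrices for the raw and the centered-and-scaled predictor.
z    = (cdate - mean(cdate))/std(cdate);
Vraw = bsxfun(@power, cdate, n:-1:0);
Vz   = bsxfun(@power, z,     n:-1:0);
condRaw = cond(Vraw);
condZ   = cond(Vz);
fprintf('cond(raw) = %.3g, cond(normalized) = %.3g\n', condRaw, condZ);

% Fit with the predictor centered and scaled by the toolbox.
f = fit(cdate, pop, 'poly6', 'Normalize', 'on');
```

Note that the coefficients of a normalized fit refer to the scaled predictor z, not to the original predictor values.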

Power Series.   The toolbox provides a one-term and a two-term power series model.

$$y = a x^b$$

$$y = a x^b + c$$

Power series models are used to describe a variety of data. For example, the rate at which reactants are consumed in a chemical reaction is generally proportional to the concentration of the reactant raised to some power.

Rationals.   Rational models are defined as ratios of polynomials and are given by

$$y = \frac{\sum_{i=1}^{n+1} p_i x^{n+1-i}}{x^m + \sum_{i=1}^{m} q_i x^{m-i}}$$

where n is the degree of the numerator polynomial and 0 ≤ n ≤ 5, while m is the degree of the denominator polynomial and 1 ≤ m ≤ 5. Note that the coefficient associated with x^m is always 1. This makes the numerator and denominator unique when the polynomial degrees are the same.

In this guide, rationals are described in terms of the degree of the numerator/the degree of the denominator. For example, a quadratic/cubic rational equation is given by

$$y = \frac{p_1 x^2 + p_2 x + p_3}{x^3 + q_1 x^2 + q_2 x + q_3}$$

Like polynomials, rationals are often used when a simple empirical model is required. The main advantage of rationals is their flexibility with data that has complicated structure. The main disadvantage is that they become unstable when the denominator is around zero. For an example that uses rational polynomials of various degrees, refer to Example: Rational Fit.

Sum of Sines.   The sum of sines model is used for fitting periodic functions, and is given by the equation

$$y = \sum_{i=1}^{n} a_i \sin(b_i x + c_i)$$

where a is the amplitude, b is the frequency, and c is the phase constant for each sine wave term. n is the number of terms in the series and 1 ≤ n ≤ 8. This equation is closely related to the Fourier series described previously. The main difference is that the sum of sines equation includes the phase constant, and does not include a constant (intercept) term.
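The relationship between a sine term with a phase constant and a cosine/sine pair can be made explicit with the angle-addition identity. The symbols A and B below are introduced only for this worked step and do not appear in the toolbox:

```latex
a \sin(bx + c) = \underbrace{a \sin c}_{A}\,\cos(bx) + \underbrace{a \cos c}_{B}\,\sin(bx),
\qquad a = \sqrt{A^2 + B^2}, \quad \tan c = \frac{A}{B}
```

Each sine term with a phase therefore carries the same information as one cosine/sine pair at frequency b. What differs is that each b in the sum of sines is fitted independently, whereas the Fourier series uses integer multiples of a single fundamental frequency w.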

Weibull Distribution.   The Weibull distribution is widely used in reliability and life (failure rate) data analysis. The toolbox provides the two-parameter Weibull distribution

$$y = a b x^{b-1} e^{-a x^b}$$

where a is the scale parameter and b is the shape parameter. Note that there is also a three-parameter Weibull distribution with x replaced by x – c where c is the location parameter. Additionally, there is a one-parameter Weibull distribution where the shape parameter is fixed and only the scale parameter is fitted. To use these distributions, you must create a custom equation.

Curve Fitting Toolbox software does not fit Weibull probability distributions to a sample of data. Instead, it fits curves to response and predictor data such that the curve has the same shape as a Weibull distribution.

Custom Models

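Custom Models vs. Library Models.   If the toolbox library does not contain a desired parametric equation, you can create your own custom equation. Library models, however, offer the best chance for rapid convergence. One reason, shown in Default Starting Points and Constraints, is that most nonlinear library models start from optimized coefficient values, whereas custom nonlinear equations start from random values.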

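Creating Custom Models.   Create custom equations with the New Custom Equation GUI. One way to open the GUI is to select Custom Equations from the Type of fit list on the Fitting GUI, and then click the New Equation button.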

The GUI contains two panes: one for creating linear custom equations and one for creating general (nonlinear) custom equations.

Linear Equations.  

Linear models are linear combinations of (perhaps nonlinear) terms. They are defined by equations that are linear in the parameters. Use the Linear Equations pane on the New Custom Equation GUI to create custom linear equations. Interface controls are described below.

General Equations.  

General models are, in general, nonlinear combinations of (perhaps nonlinear) terms. They are defined by equations that may be nonlinear in the parameters. Use the General Equations pane on the New Custom Equation GUI to create custom general equations. Interface controls are described below.

Editing and Saving Custom Models.   When you click OK on the New Custom Equation GUI, the displayed Equation name is saved for the current session in the Custom Equations list on the Fitting GUI. The list is highlighted in the picture of the Fitting GUI below.

To edit a custom equation, select the equation in the Custom Equations list and click the Edit button. The Edit Custom Equation GUI appears. The Edit Custom Equation GUI is identical to the New Custom Equation GUI, but is pre-populated with the selected equation. After editing an equation in the Edit Custom Equation GUI, click OK to save it back to the Custom Equations list for further use in the current session. A button to Copy and Edit is also available, if you want to save both the original and edited equations for the current session.

To save custom equations for future sessions, select the File > Save Session menu item in Curve Fitting Tool.

Example: Legendre Polynomial.   This example fits data using several custom linear equations. The data is generated, and is based on the nuclear reaction 12C(e,e'α)8Be. The equations use sums of Legendre polynomial terms.

Consider an experiment in which 124 MeV electrons are scattered from 12C nuclei. In the subsequent reaction, alpha particles are emitted and produce the residual nuclei 8Be. By analyzing the number of alpha particles emitted as a function of angle, you can deduce certain information regarding the nuclear dynamics of 12C. The reaction kinematics are shown below.

The data is collected by placing solid state detectors at values of Θα ranging from 10° to 240° in 10° increments.

It is sometimes useful to describe a variable expressed as a function of angle in terms of Legendre polynomials

$$y(x) = \sum_{n=0}^{\infty} a_n P_n(x)$$

where Pn(x) is a Legendre polynomial of degree n, x is cos(Θα), and an are the coefficients of the fit. Refer to the legendre function for information about generating Legendre polynomials.

For the alpha-emission data, you can directly associate the coefficients with the nuclear dynamics by invoking a theoretical model. Additionally, the theoretical model introduces constraints for the infinite sum shown above. In particular, by considering the angular momentum of the reaction, a fourth-degree Legendre polynomial using only even terms should describe the data effectively.

You can generate Legendre polynomials with Rodrigues' formula:

$$P_n(x) = \frac{1}{2^n n!} \frac{d^n}{dx^n} \left(x^2 - 1\right)^n$$

The Legendre polynomials up to fourth degree are given below.

Legendre Polynomials up to Fourth Degree

n    Pn(x)
0    1
1    x
2    (1/2)(3x^2 - 1)
3    (1/2)(5x^3 - 3x)
4    (1/8)(35x^4 - 30x^2 + 3)
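As a quick sanity check, the closed forms for P2(x) and P4(x) can be compared with the output of the MATLAB legendre function. legendre(n,x) returns the associated Legendre functions of degree n for orders m = 0 to n, one order per row, so the first row is Pn(x). The grid of x values is an arbitrary choice.

```matlab
x = linspace(-1, 1, 9);                   % arbitrary test points in [-1, 1]

% Closed forms for the second- and fourth-degree Legendre polynomials.
P2 = (1/2)*(3*x.^2 - 1);
P4 = (1/8)*(35*x.^4 - 30*x.^2 + 3);

% Row 1 of legendre(n,x) is the order m = 0 function, that is, P_n(x).
L2 = legendre(2, x);
L4 = legendre(4, x);
err2 = max(abs(L2(1,:) - P2));
err4 = max(abs(L4(1,:) - P4));
fprintf('max difference: P2 %.2g, P4 %.2g\n', err2, err4);
```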

The first step is to load the 12C alpha-emission data from the file carbon12alpha.mat, which is provided with the toolbox.

load carbon12alpha

The workspace now contains two new variables, angle and counts.

Import these two variables into Curve Fitting Tool and name the data set C12Alpha.

The Fit Editor for a custom equation fit type is shown below.

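Fit the data using a fourth-degree Legendre polynomial with only even terms:

$$y_1(x) = a_0 + a_2 \tfrac{1}{2}\left(3x^2 - 1\right) + a_4 \tfrac{1}{8}\left(35x^4 - 30x^2 + 3\right)$$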

Because the Legendre polynomials depend only on the predictor variable and constants, you use the Linear Equations pane on the Create Custom Equation GUI. This pane is shown below for the model given by y1(x). Note that because angle is given in radians, the argument of the Legendre terms is given by cos(Θα).

The fit and residuals are shown below. The fit appears to follow the trend of the data well, while the residuals appear to be randomly distributed and do not exhibit any systematic behavior.

The numerical fit results are shown below. The 95% confidence bounds indicate that the coefficients associated with P0(x) and P4(x) are known fairly accurately, but that the P2(x) coefficient has a relatively large uncertainty.

To confirm the theoretical argument that the alpha-emission data is best described by a fourth-degree Legendre polynomial with only even terms, fit the data using both even and odd terms:

$$y_2(x) = y_1(x) + a_1 x + a_3 \tfrac{1}{2}\left(5x^3 - 3x\right)$$

The Linear Equations pane of the Create Custom Equation GUI is shown below for the model given by y2(x).

The numerical results indicate that the odd Legendre terms do not contribute significantly to the fit, and the even Legendre terms are essentially unchanged from the previous fit. This confirms that the initial model choice is the best one.
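The two Legendre fits can also be sketched at the command line. A linear custom model can be passed to fittype as a cell array with one expression per term, and fit then estimates one coefficient per term. The coefficient names a0 to a4 are assumptions chosen here to match the polynomial degrees, and the variables angle and counts come from carbon12alpha.mat as described in the example.

```matlab
load carbon12alpha                        % angle (radians) and counts

% One cell per linear term; x is the angle, so the Legendre argument is cos(x).
evenTerms = {'1', '(1/2)*(3*cos(x)^2 - 1)', '(1/8)*(35*cos(x)^4 - 30*cos(x)^2 + 3)'};
oddTerms  = {'cos(x)', '(1/2)*(5*cos(x)^3 - 3*cos(x))'};

ft1 = fittype(evenTerms, 'coefficients', {'a0', 'a2', 'a4'});
ft2 = fittype([evenTerms oddTerms], 'coefficients', {'a0', 'a2', 'a4', 'a1', 'a3'});

f1 = fit(angle(:), counts(:), ft1);       % even terms only
f2 = fit(angle(:), counts(:), ft2);       % even and odd terms

% Confidence bounds that straddle zero suggest a term contributes little.
ci2 = confint(f2)
```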

Example: Fourier Series.   This example fits the ENSO data using several custom nonlinear equations. The ENSO data consists of monthly averaged atmospheric pressure differences between Easter Island and Darwin, Australia. This difference drives the trade winds in the southern hemisphere.

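As shown in Example: Smoothing Data, the ENSO data is clearly periodic, which suggests it can be described by a Fourier series

$$y(x) = a_0 + \sum_{i=1}^{\infty} \left[ a_i \cos\left(2\pi \frac{x}{c_i}\right) + b_i \sin\left(2\pi \frac{x}{c_i}\right) \right]$$

where ai and bi are the amplitudes, and ci are the periods (cycles) of the data. The question to be answered in this example is how many cycles exist. As a first attempt, assume a single cycle and fit the data using one sine term and one cosine term.

$$y_1(x) = a_0 + a_1 \cos\left(2\pi \frac{x}{c_1}\right) + b_1 \sin\left(2\pi \frac{x}{c_1}\right)$$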

If the fit does not describe the data well, add additional sine and cosine terms with unique period coefficients until a good fit is obtained.

Because there is an unknown coefficient c1 included as part of the trigonometric function arguments, the equation is nonlinear. Therefore, you must specify the equation using the General Equations pane of the Create Custom Equation GUI.

This pane is shown below for the equation given by y1(x).

Note that the toolbox includes the Fourier series as a nonlinear library equation. However, the library equation does not meet the needs of this example because its terms are defined as fixed multiples of the fundamental frequency w. Refer to Fourier Series for more information.

The numerical results shown below indicate that the fit does not describe the data well. In particular, the fitted value for c1 is unreasonably small. Because the starting points are randomly selected, your initial fit results might differ from the results shown here.

As you saw in Example: Smoothing Data, the data include a periodic component with a period of about 12 months. However, with c1 unconstrained and with a random starting point, this fit failed to find that cycle. To assist the fitting procedure, constrain c1 to a value between 10 and 14. To define constraints for unknown coefficients, use the Fit Options GUI, which you open by clicking the Fit options button in the Fitting GUI.

The fit, residuals, and numerical results are shown below.

The fit appears to be reasonable for some of the data points but clearly does not describe the entire data set very well. As predicted, the numerical results indicate a cycle of approximately 12 months. However, the residuals show a systematic periodic distribution indicating that there are additional cycles that you should include in the fit equation. Therefore, as a second attempt, add an additional sine and cosine term to y1(x)

$$y_2(x) = y_1(x) + a_2 \cos\left(2\pi \frac{x}{c_2}\right) + b_2 \sin\left(2\pi \frac{x}{c_2}\right)$$

and constrain the upper and lower bounds of c2 to be roughly twice the bounds used for c1.

The fit, residuals, and numerical results are shown below.

The fit appears to be reasonable for most of the data points. However, the residuals indicate that you should include another cycle in the fit equation. Therefore, as a third attempt, add an additional sine and cosine term to y2(x)

$$y_3(x) = y_2(x) + a_3 \cos\left(2\pi \frac{x}{c_3}\right) + b_3 \sin\left(2\pi \frac{x}{c_3}\right)$$

and constrain the lower bound of c3 to be roughly three times the value of c1.

The fit, residuals, and numerical results are shown below.

The fit is an improvement over the previous two fits, and appears to account for most of the cycles present in the ENSO data set. The residuals appear random for most of the data, although a pattern is still visible, indicating that additional cycles may be present or that the fitted amplitudes can be improved.

In conclusion, Fourier analysis of the data reveals three significant cycles. The annual cycle is the strongest, but cycles with periods of approximately 44 and 22 months are also present. These cycles correspond to El Niño and the Southern Oscillation (ENSO).
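The constrained single-cycle fit can be sketched at the command line as follows. The sketch assumes that enso.mat provides the variables month and pressure, names the coefficients explicitly so that the order of the bound vectors is unambiguous, and uses an invented starting point. The bounds of 10 and 14 on c1 are the ones given in the example.

```matlab
load enso                                 % month and pressure (assumed variable names)

% One-cycle model; the coefficient order is fixed explicitly as a0, a1, b1, c1.
ft = fittype('a0 + a1*cos(2*pi*x/c1) + b1*sin(2*pi*x/c1)', ...
    'independent', 'x', 'coefficients', {'a0', 'a1', 'b1', 'c1'});

% Constrain only the period c1; the starting point is an illustrative guess.
opts = fitoptions('Method', 'NonlinearLeastSquares', ...
    'Lower',      [-Inf -Inf -Inf 10], ...
    'Upper',      [ Inf  Inf  Inf 14], ...
    'StartPoint', [mean(pressure) 1 1 12]);

[f, gof] = fit(month(:), pressure(:), ft, opts);
fprintf('fitted period c1 = %.2f months, RMSE = %.3f\n', f.c1, gof.rmse);
```

Adding the second and third cycles follows the same pattern: append the extra terms and coefficient names, and extend the three option vectors accordingly.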

Example: Gaussian with Exponential Background.   This example fits two poorly resolved Gaussian peaks on a decaying exponential background using a general (nonlinear) custom model. To get started, load the data from the file gauss3.mat, which is provided with the toolbox.

load gauss3

The workspace now contains two new variables, xpeak and ypeak.

Import these two variables into Curve Fitting Tool and accept the default data set name ypeak vs. xpeak.

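You will fit the data with the following equation

$$y(x) = a e^{-bx} + a_1 e^{-\left[(x - b_1)/c_1\right]^2} + a_2 e^{-\left[(x - b_2)/c_2\right]^2}$$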

where ai are the peak amplitudes, bi are the peak centroids, and ci are related to the peak widths. Because there are unknown coefficients included as part of the exponential function arguments, the equation is nonlinear. Therefore, you must specify the equation using the General Equations pane of the Create Custom Equation GUI. This pane is shown below for y(x).

The data, fit, and numerical fit results are shown below. Clearly, the fit is poor.

Because the starting points are randomly selected, your initial fit results might differ from the results shown here.

The results include this warning message.

Fit computation did not converge:
Maximum number of function evaluations exceeded. Increasing
MaxFunEvals (in fit options) may allow for a better fit, or 
the current equation may not be a good model for the data.

To improve the fit for this example, specify reasonable starting points for the coefficients. Deducing the starting points is particularly easy for the current model because the Gaussian coefficients have a straightforward interpretation and the exponential background is well defined. Additionally, as the peak amplitudes and widths cannot be negative, constrain a1, a2, c1, and c2 to be greater than zero.

To define starting values and constraints for unknown coefficients, use the Fit Options GUI, which you open by clicking the Fit options button. The starting values and constraints are shown below.

The data, fit, residuals, and numerical results are shown below.
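A command-line sketch of the same idea is shown below. The starting values are rough guesses chosen for illustration and are not the values used in the example; in practice you would read them from a plot of the data. The coefficient names are listed explicitly so that the positions in StartPoint and Lower are unambiguous.

```matlab
load gauss3                               % xpeak and ypeak

% Exponential background plus two Gaussian peaks, with an explicit coefficient order.
ft = fittype('a*exp(-b*x) + a1*exp(-((x-b1)/c1)^2) + a2*exp(-((x-b2)/c2)^2)', ...
    'independent', 'x', ...
    'coefficients', {'a', 'b', 'a1', 'b1', 'c1', 'a2', 'b2', 'c2'});

% Order:       a     b     a1   b1    c1   a2   b2    c2
startGuess = [100   0.01  100  110   20   80   150   20];   % illustrative guesses
lowerBound = [-Inf  -Inf  0    -Inf  0    0    -Inf  0];    % amplitudes, widths >= 0

opts = fitoptions('Method', 'NonlinearLeastSquares', ...
    'StartPoint', startGuess, 'Lower', lowerBound);

[f, gof] = fit(xpeak(:), ypeak(:), ft, opts);
cv = coeffvalues(f);
```

If the fit still fails to converge from your own starting values, the warning shown earlier suggests increasing MaxFunEvals in the fit options.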

Specifying Fit Options

Introduction.   You specify fit options with the Fit Options GUI. The fit options for the single-term exponential are shown below. The coefficient starting values and constraints are for the census data.

The available GUI options depend on whether you are fitting your data using a linear model, a nonlinear model, or a nonparametric fit type. All the options described below are available for nonlinear models. Method, Robust, and coefficient constraints (Lower and Upper) are available for linear models. Interpolants and smoothing splines include Method, but no configurable options.

Fitting Method and Algorithm.  

Finite Differencing Parameters.  

Note that DiffMinChange and DiffMaxChange apply only to nonlinear equations. They do not apply to any linear equations.

Fit Convergence Criteria.  

Coefficient Parameters.  

For more information about these fit options, refer to the Optimization Toolbox documentation.

The default coefficient starting points and constraints for library and custom models are given below. If the starting points are optimized, then they are calculated heuristically based on the current data set. Random starting points are defined on the interval [0,1] and linear models do not require starting points.

If a model does not have constraints, the coefficients have neither a lower bound nor an upper bound. You can override the default starting points and constraints by providing your own values using the Fit Options GUI.

Default Starting Points and Constraints

Model               Starting Points    Constraints
Custom linear       N/A                None
Custom nonlinear    Random             None
Exponentials        Optimized          None
Fourier series      Optimized          None
Gaussians           Optimized          ci > 0
Polynomials         N/A                None
Power series        Optimized          None
Rationals           Random             None
Sum of sines        Optimized          bi > 0
Weibull             Random             a, b > 0

Note that the sum of sines and Fourier series models are particularly sensitive to starting points, and the optimized values might be accurate for only a few terms in the associated equations.
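The same overrides can be sketched at the command line with a fitoptions object. The property names below (StartPoint, Lower, Upper, MaxFunEvals, MaxIter, DiffMinChange, Robust) are fit options of the toolbox; the numeric values and the noise-free synthetic data are assumptions chosen only to keep the example self-contained.

```matlab
% Noise-free synthetic data for a one-term exponential (illustrative values).
x = (0:10)';
y = 2*exp(0.3*x);

% Override the default starting points and constraints for 'exp1' (a*exp(b*x)).
opts = fitoptions('Method', 'NonlinearLeastSquares');
opts.StartPoint    = [1 0.1];             % initial values for a and b
opts.Lower         = [0 -Inf];            % a >= 0, b unconstrained
opts.Upper         = [Inf Inf];
opts.MaxFunEvals   = 1000;                % convergence criteria
opts.MaxIter       = 600;
opts.DiffMinChange = 1e-8;                % finite differencing parameter
opts.Robust        = 'Off';

f = fit(x, y, 'exp1', opts);
fprintf('a = %.4f, b = %.4f\n', f.a, f.b);
```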

Example: Rational Fit

This example fits measured data using a rational model. The data describes the coefficient of thermal expansion for copper as a function of temperature in kelvin.

To get started, load the thermal expansion data from the file hahn1.mat, which is provided with the toolbox.

load hahn1

The workspace now contains two new variables, temp and thermex.

Import these two variables into Curve Fitting Tool and name the data set CuThermEx.

For this data set, you will find the rational equation that produces the best fit. As described in Library Models, rational models are defined as a ratio of polynomials

$$y = \frac{\sum_{i=1}^{n+1} p_i x^{n+1-i}}{x^m + \sum_{i=1}^{m} q_i x^{m-i}}$$

where n is the degree of the numerator polynomial and m is the degree of the denominator polynomial. Note that the rational equations are not associated with physical parameters of the data. Instead, they provide a simple and flexible empirical model that you can use for interpolation and extrapolation.

As you can see by examining the shape of the data, a reasonable initial choice for the rational model is quadratic/quadratic. The Fitting GUI configured for this equation is shown below.

The data, fit, and residuals are shown below.

The fit clearly misses the data for the smallest and largest predictor values. Additionally, the residuals show a strong pattern throughout the entire data set indicating that a better fit is possible.

For the next fit, try a cubic/cubic equation. The data, fit, and residuals are shown below.

The numerical results shown below indicate that the fit did not converge.

Although the message in the Results window indicates that you might improve the fit if you increase the maximum number of iterations, a better choice at this stage of the fitting process is to use a different rational equation because the current fit contains several discontinuities. These discontinuities are due to the function blowing up at predictor values that correspond to the zeros of the denominator.

As the next try, fit the data using a cubic/quadratic equation. The data, fit, and residuals are shown below.

The fit is well behaved over the entire data range, and the residuals are randomly scattered about zero. Therefore, you can confidently use this fit for further analysis.
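The three candidate models can be compared in a short loop. Library rational models are named ratNM, where N is the numerator degree and M is the denominator degree, so the three fits above correspond to rat22, rat33, and rat32. Because rational models use random starting points by default, the printed statistics can vary from run to run, and some runs may produce convergence warnings.

```matlab
load hahn1                                % temp and thermex

models = {'rat22', 'rat33', 'rat32'};     % quadratic/quadratic, cubic/cubic, cubic/quadratic
for k = 1:numel(models)
    [f, gof] = fit(temp(:), thermex(:), models{k});
    fprintf('%s: SSE = %.4g, adjusted R^2 = %.4f\n', ...
        models{k}, gof.sse, gof.adjrsquare);
end

% A cubic/quadratic model has 4 numerator and 2 free denominator coefficients.
nCoeffRat32 = numcoeffs(fittype('rat32'));
```

The statistics alone do not reveal the discontinuities described above, so plot each fit and its residuals before choosing a model.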

Example: Robust Fitting

This example fits data that is assumed to contain one outlier. The data consists of the 2000 United States presidential election results for the state of Florida. The fit model is a first degree polynomial and the fit method is robust linear least squares with bisquare weights.

In the 2000 presidential election, many residents of Palm Beach County, Florida, complained that the design of the election ballot was confusing, which they claim led them to vote for the Reform candidate Pat Buchanan instead of the Democratic candidate Al Gore. The so-called "butterfly ballot" was used only in Palm Beach County and only for the election-day ballots for the presidential race. As you will see, the number of Buchanan votes for Palm Beach is far removed from the bulk of data, which suggests that the data point should be treated as an outlier.

To get started, load the Florida election result data from the file flvote2k.mat, which is provided with the toolbox.

load flvote2k

The workspace now contains three new variables of vote counts: buchanan, bush, and gore.

Each variable contains 68 elements, which correspond to the 67 Florida counties plus the absentee ballots. The names of the counties are given in the variable counties. From these variables, create two data sets with the Buchanan votes as the response data: buchanan vs. bush and buchanan vs. gore.

For this example, assume that the relationship between the response and predictor data is linear with an offset of zero.

buchanan votes = (bush votes)(m1)

buchanan votes = (gore votes)(m2)

m1 is the number of Buchanan votes expected for each Bush vote, and m2 is the number of Buchanan votes expected for each Gore vote. The inverse slopes, 1/m1 and 1/m2, give the number of Bush or Gore votes for each Buchanan vote.

To create a first-degree polynomial equation with zero offset, you must create a custom linear equation. You create a custom equation using the Fitting GUI by selecting Custom Equations from the Type of fit list, and then clicking the New Equation button.

The Linear Equations pane of the Create Custom Equation GUI is shown below.

Before fitting, you should exclude the data point associated with the absentee ballots from each data set because these voters did not use the butterfly ballot. As described in Marking Outliers, you can exclude individual data points from a fit either graphically or numerically using the Exclude GUI. For this example, you should exclude the data numerically. The index of the absentee ballot data is given by

ind = find(strcmp(counties,'Absentee Ballots'))
ind =
    68

The Exclude GUI is shown below.

The exclusion rule is named AbsenteeVotes. You use the Fitting GUI to associate an exclusion rule with the data set to be fit.

For each data set, perform a robust fit with bisquare weights using the FlaElection equation defined above. For comparison purposes, also perform a regular linear least-squares fit.

You can identify the Palm Beach County data in the scatter plot by using the data tips feature, and knowing the index number of the data point.

ind = find(strcmp(counties,'Palm Beach'))
ind =
    50

The Fit Editor and the Fit Options GUI are shown below for a robust fit.

The data, robust and regular least-squares fits, and residuals for the buchanan vs. bush data set are shown below.

The graphical results show that the linear model is reasonable for the majority of data points, and the residuals appear to be randomly scattered around zero. However, two residuals stand out. The largest residual corresponds to Palm Beach County. The other residual is at the largest predictor value, and corresponds to Miami/Dade County.

The numerical results are shown below. The inverse slope of the robust fit indicates that Buchanan should receive one vote for every 197.4 Bush votes.

The data, robust and regular least-squares fits, and residuals for the buchanan vs. gore data set are shown below.

Again, the graphical results show that the linear model is reasonable for the majority of data points, and the residuals appear to be randomly scattered around zero. However, three residuals stand out. The largest residual corresponds to Palm Beach County. The other residuals are at the two largest predictor values, and correspond to Miami/Dade County and Broward County.

The numerical results are shown below. The inverse slope of the robust fit indicates that Buchanan should receive one vote for every 189.3 Gore votes.

Using the fitted slope value, you can determine the expected number of votes that Buchanan should have received for each fit. For the Buchanan versus Bush data, you evaluate the fit at a predictor value of 152,951. For the Buchanan versus Gore data, you evaluate the fit at a predictor value of 269,732. These results are shown below for both data sets and both fits.

Expected Buchanan Votes in Palm Beach County

Data Set             Fit                       Expected Buchanan Votes
Buchanan vs. Bush    Ordinary least squares    814
Buchanan vs. Bush    Robust least squares      775
Buchanan vs. Gore    Ordinary least squares    1246
Buchanan vs. Gore    Robust least squares      1425

The robust results for the Buchanan versus Bush data suggest that Buchanan received 3411 – 775 = 2636 excess votes, while robust results for the Buchanan versus Gore data suggest that Buchanan received 3411 – 1425 = 1986 excess votes.

The margin of victory for George Bush is given by

margin = sum(bush)-sum(gore)
margin =

   537

Therefore, voter intention comes into play: in both cases, the margin of victory is less than the number of excess Buchanan votes.

In conclusion, the analysis of the 2000 United States presidential election results for the state of Florida suggests that the Reform Party candidate received an excess number of votes in Palm Beach County, and that this excess number was a crucial factor in determining the election outcome. However, additional analysis is required before a final conclusion can be made.
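The Buchanan versus Bush analysis can be sketched at the command line as follows. The zero-offset line is written as a linear custom model with a single term, the absentee ballots are excluded with a logical vector, and the robust fit uses the 'Robust' option with bisquare weights. The coefficient name m and the variable names introduced here are illustrative choices.

```matlab
load flvote2k                             % buchanan, bush, gore, counties

% Zero-offset line y = m*x as a one-term linear custom model.
ft = fittype({'x'}, 'coefficients', {'m'});

% Logical vectors that locate the absentee ballots and Palm Beach County.
excl = strcmp(counties(:), 'Absentee Ballots');
palm = strcmp(counties(:), 'Palm Beach');

% Ordinary and robust (bisquare) least-squares fits with the same exclusion rule.
fOLS    = fit(bush(:), buchanan(:), ft, 'Exclude', excl);
fRobust = fit(bush(:), buchanan(:), ft, 'Exclude', excl, 'Robust', 'Bisquare');

% Expected Buchanan votes in Palm Beach County and the excess over that expectation.
expectedRobust = fRobust(bush(palm));
excessRobust   = buchanan(palm) - expectedRobust;
fprintf('Bush votes per Buchanan vote (robust): %.1f\n', 1/fRobust.m);
fprintf('expected %.0f, excess %.0f\n', expectedRobust, excessRobust);
```

Replacing bush with gore gives the second data set.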

Nonparametric Fitting

Introduction

In some cases, you are not concerned about extracting or interpreting fitted parameters. Instead, you might simply want to draw a smooth curve through your data. Fitting of this type is called nonparametric fitting. The Curve Fitting Toolbox software supports two kinds of nonparametric fit: interpolants and smoothing splines.

For more information about interpolation, refer to Polynomials and the interp1 function in the MATLAB documentation.

Example: Nonparametric Fitting

This example fits the following data using a cubic spline interpolant and several smoothing splines.

x = (4*pi)*[0 1 rand(1,25)]; 
y = sin(x) + .2*(rand(size(x))-.5);

As shown below, you can fit the data with a cubic spline interpolant by selecting Interpolant from the Type of fit list.

The results shown below indicate that goodness-of-fit statistics are not defined for interpolants.

A cubic spline interpolation is defined as a piecewise polynomial that results in a structure of coefficients. The number of "pieces" in the structure is one less than the number of fitted data points, and the number of coefficients for each piece is four because the polynomial degree is three. The toolbox does not allow you to access the structure of coefficients.
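The piece and coefficient counts can be checked with the base MATLAB functions spline and unmkpp, which do expose the piecewise polynomial structure. This is an illustration only: spline uses not-a-knot end conditions, and the end conditions of the toolbox interpolant are not assumed to be the same.

```matlab
% Same kind of data as in this example, sorted so the breaks are increasing.
x = sort((4*pi)*[0 1 rand(1,25)]);
y = sin(x) + .2*(rand(size(x)) - .5);

pp = spline(x, y);                        % piecewise cubic polynomial structure
[breaks, coefs, pieces, order] = unmkpp(pp);

fprintf('%d data points, %d pieces, %d coefficients per piece\n', ...
    numel(x), pieces, order);
```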

As shown below, you can fit the data with a smoothing spline by selecting Smoothing Spline in the Type of fit list.

The level of smoothness is given by the Smoothing Parameter. The default smoothing parameter value depends on the data set, and is automatically calculated by the toolbox after you click the Apply button.

For this data set, the default smoothing parameter is close to 1, indicating that the smoothing spline is nearly cubic and comes very close to passing through each data point. Create a fit for the default smoothing parameter and name it Smooth1. If you do not like the level of smoothing produced by the default smoothing parameter, you can specify any value between 0 and 1. A value of 0 produces a linear polynomial fit, while a value of 1 produces a piecewise cubic polynomial fit that passes through all the data points. For comparison purposes, create another smoothing spline fit using a smoothing parameter of 0.5 and name the fit Smooth2.

The numerical results for the smoothing spline fit Smooth1 are shown below.

The data and fits are shown below. The default abscissa scale was increased to show the fit behavior beyond the data limits. You change the axes limits with the Tools > Axes Limit Control menu item.

Note that the default smoothing parameter produces a curve that is smoother than the interpolant, but is a good fit to the data. In this case, decreasing the smoothing parameter from the default value produces a curve that is smoother still, but is not a good fit to the data. As the smoothing parameter increases beyond the default value, the associated curve approaches the cubic spline interpolant.
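The three fits in this example can be sketched at the command line as follows. The sketch uses the library names splineinterp and smoothingspline and the SmoothingParam option. The default smoothing parameter chosen by the toolbox is returned in the p field of the third output. The random seed is an addition made here for repeatability.

```matlab
rng(1);                                   % repeatable data (an addition to the example)
x = (4*pi)*[0 1 rand(1,25)]';             % column vectors for fit
y = sin(x) + .2*(rand(size(x)) - .5);

fInterp = fit(x, y, 'splineinterp');                       % cubic spline interpolant
[fSmooth1, gof1, out1] = fit(x, y, 'smoothingspline');     % default smoothing parameter
[fSmooth2, gof2] = fit(x, y, 'smoothingspline', 'SmoothingParam', 0.5);

fprintf('default smoothing parameter p = %.6f\n', out1.p);
fprintf('SSE: Smooth1 = %.4g, Smooth2 = %.4g\n', gof1.sse, gof2.sse);
```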

  


© 1984-2008 The MathWorks, Inc.