# Cubic Smoothing Splines

This example shows how to use the `csaps`

and `spaps`

commands from Curve Fitting Toolbox™ to construct cubic smoothing splines.

### The CSAPS Command

The command `csaps`

provides the *smoothing* spline. This is a cubic spline that more or less follows the presumed underlying trend in noisy data. A smoothing parameter, to be chosen by you, determines just how closely the smoothing spline follows the given data. Here is the basic information, an abbreviated version of the documentation:

`CSAPS Cubic smoothing spline.`

`VALUES = CSAPS(X, Y, P, XX)`

`Returns the values at XX of the cubic smoothing spline for the`

`given data (X,Y) and depending on the smoothing parameter P, chosen`

`from the interval [0 .. 1]. This smoothing spline f minimizes`

` P * sum_i W(i)(Y(i) - f(X(i)))^2 + (1-P) * integral (D^2 f)^2`

### Example: Noisy Data From a Cubic Polynomial

Here are some trial runs. We start with data from a simple cubic, `q(x) := x^3`

, contaminate the values with some noise, and choose the value of the smoothing parameter to be .5. Then plot the resulting smoothed values, along with the underlying cubic, and the contaminated data.

xi = (0:.05:1); q = @(x) x.^3; yi = q(xi); randomStream = RandStream.create( 'mcg16807', 'Seed', 23 ); ybad = yi+.3*(rand(randomStream, size(xi))-.5); p = .5; xxi = (0:100)/100; ys = csaps(xi,ybad,p,xxi); plot(xi,yi,':',xi,ybad,'x',xxi,ys,'r-') title('Clean Data, Noisy Data, Smoothed Values') legend( 'Exact', 'Noisy', 'Smoothed', 'Location', 'NorthWest' )

The smoothing is way overdone here. By choosing the smoothing parameter `p`

closer to 1, we obtain a smoothing spline closer to the given data. We try `p = .6, .7, .8, .9, 1`

, and plot the resulting smoothing splines.

yy = zeros(5,length(xxi)); p = [.6 .7 .8 .9 1]; for j=1:5 yy(j,:) = csaps(xi,ybad,p(j),xxi); end hold on plot(xxi,yy); hold off title('Smoothing Splines for Various Values of the Smoothing Parameter') legend({'Exact','Noisy','p = 0.5','p = 0.6','p = 0.7','p = 0.8', ... 'p = 0.9', 'p = 1.0'}, 'Location', 'NorthWest' )

We see that the smoothing spline can be very sensitive to the choice of the smoothing parameter. Even for `p`

= 0.9, the smoothing spline is still far from the underlying trend, while for `p`

= 1, we get the interpolant to the (noisy) data.

In fact, the formulation used by `csapi`

(p.235ff of *A Practical Guide to Splines*) is very sensitive to scaling of the independent variable. A simple analysis of the equations used shows that the sensitive range for `p`

is around `1/(1+epsilon)`

, with `epsilon := h^3/16`

, and `h`

the average difference between neighboring sites. Specifically, you would expect a close following of the data when `p = 1/(1+epsilon/100)`

and some satisfactory smoothing when `p = 1/(1+epsilon*100)`

.

The plot below shows the smoothing spline for values of `p`

near this magic number `1/(1+epsilon)`

. For this case, it is more informative to look at `1-p`

since the magic number, `1/(1+epsilon)`

, is very close to 1.

epsilon = ((xi(end)-xi(1))/(numel(xi)-1))^3/16; 1 - 1/(1+epsilon)

ans = 7.8124e-06

plot(xi,yi,':',xi,ybad,'x') hold on labels = cell(1,5); for j=1:5 p = 1/(1+epsilon*10^(j-3)); yy(j,:) = csaps(xi,ybad,p,xxi); labels{j} = ['1-p= ',num2str(1-p)]; end plot(xxi,yy) title('Smoothing Splines for Smoothing Parameter Near Its ''Magic'' Value') legend( [{'Exact', 'Noisy'}, labels], 'Location', 'NorthWest' ) hold off

In this example, the smoothing spline is very sensitive to variation of the smoothing parameter near the magic number. The one farthest from 1 seems the best choice from these, but you may prefer the one beyond that.

p = 1/(1+epsilon*10^3); yy = csaps(xi,ybad,p,xxi); hold on plot( xxi, yy, 'y', 'LineWidth', 2 ) title( sprintf( 'The Smoothing Spline For 1-p = %s is Added, in Yellow', num2str(1-p) ) ) hold off

You can also supply `csaps`

with error weights, to pay more attention to some data points than others. Also, if you do not supply the evaluation sites `xx`

, then `csaps`

returns the ppform of the smoothing spline.

Finally, `csaps`

can also handle vector-valued data and even multivariate, gridded data.

### The SPAPS Command

The cubic smoothing spline provided by the command `spaps`

differs from the one constructed in `csaps`

only in the way it is selected. Here is an abbreviated version of the documentation for `spaps`

:

` SPAPS Smoothing spline.`

` [SP,VALUES] = SPAPS(X,Y,TOL) returns the B-form and, if asked,`

` the values at X, of a cubic smoothing spline f for the given`

` data (X(i),Y(:,i)), i=1,2, ..., n.`

` The smoothing spline f minimizes the roughness measure`

` F(D^2 f) := integral ( D^2 f(t) )^2 dt on X(1) < t < X(n)`

` over all functions f for which the error measure`

` E(f) := sum_j { W(j)*( Y(:,j) - f(X(j)) )^2 : j=1,...,n }`

` is no bigger than the given TOL. Here, D^M f denotes the M-th`

` derivative of f. The weights W are chosen so that E(f) is the`

` Composite Trapezoid Rule approximation for F(y-f).`

` f is constructed as the unique minimizer of`

` rho*E(f) + F(D^2 f),`

` with the smoothing parameter RHO so chosen so that E(f) equals`

` TOL. Hence, FN2FM(SP,'pp') should be (up to roundoff) the same`

` as the output from CPAPS(X,Y,RHO/(1+RHO)).`

### Tolerance vs. Smoothing Parameter

It may be easier to supply a suitable tolerance for `spaps`

than the smoothing parameter `p`

required by `csaps`

. In our earlier example, we added uniformly-distributed random noise from the interval `0.3*[-0.5 .. 0.5]`

. Hence, we can estimate a reasonable value for `tol`

as the value of the error measure `e`

at such noise.

tol = sum((.3*(rand(randomStream, size(yi))-.5)).^2);

This plot shows the resulting smoothing spline constructed by `spaps`

. Note that the error weights are specified to be uniform, which is their default value in `csaps`

.

[sp,ys,rho] = spaps(xi,ybad,tol,ones(size(xi))); plot(xi,yi,':',xi,ybad,'x',xi,ys,'r-') title( sprintf( 'Clean Data, Noisy Data, Smoothed Values (1-p = %s )', num2str(1/(1+rho)) ) ); legend( {'Exact','Noisy','Smoothed'}, 'location', 'NorthWest' )

The figure title shows the value of `p`

you would use in `csaps`

to obtain exactly this smoothing spline for these data.

Here, in addition, is the smoothing spline provided by `csaps`

when not given a smoothing parameter. In this case `csaps`

chooses the parameter by a certain ad hoc procedure that attempts to locate the region where the smoothing spline is most sensitive to the smoothing parameter (similar to the earlier discussion).

hold on plot(xxi,fnval(csaps(xi,ybad),xxi),'-') title('Clean Data, Noisy Data, Smoothed Values') legend({'Exact' 'Noisy' 'spaps, specified tolerance' ... 'csaps, default smoothing parameter'}, 'Location', 'NorthWest' ) hold off

### CSAPS vs. SPAPS

The `csaps`

and `spaps`

commands differ in the way in which you specify a particular smoothing spline, via a smoothing parameter vs. a tolerance. Another difference is that `spaps`

can provide a linear or a quintic smoothing spline, in addition to the cubic smoothing spline.

The quintic smoothing spline is better than the cubic smoothing spline in the situation when you would like the second derivative to move as little as possible.