Cubic smoothing spline

`pp = csaps(x,y) `

csaps(x,y,p)

[...,p] = csaps(...)

csaps(x,y,p,[],w)

values = csaps(x,y,p,xx)

csaps(x,y,p,xx,w)

[...] = csaps({x1,...,xm},y,...)

`pp = csaps(x,y) `

returns
the ppform of a cubic smoothing spline *f* to the
given data `x,y`

, with the value of *f* at
the data site `x(j)`

approximating the data value `y(:,j)`

,
for `j=1:length(x)`

. The values may be scalars, vectors,
matrices, even ND-arrays. Data points with the same site are replaced
by their (weighted) average, with its weight the sum of the corresponding
weights.

This smoothing spline *f* minimizes

$$p{\displaystyle \sum _{j=1}^{n}w(j)}{\left|y(:,j)-f(x(j))\right|}^{2}+(1-p){\displaystyle \int \lambda (t){\left|{D}^{2}f(t)\right|}^{2}\text{}dt}$$

Here, |*z*|^{2} stands
for the sum of the squares of all the entries of *z*, *n* is
the number of entries of `x`

, and the integral is
over the smallest interval containing all the entries of `x`

.
The default value for the weight vector `w`

in the *error measure* is `ones(size(x))`

.
The default value for the piecewise constant weight function *λ* in
the *roughness measure* is the constant
function 1. Further, *D*^{2}*f* denotes
the second derivative of the function *f*. The default
value for the *smoothing
parameter*, `p`

, is chosen in dependence
on the given data sites `x`

.

If the smoothing spline is to be evaluated outside its basic
interval, it must first be properly extrapolated, by the command ```
pp
= fnxtr(pp)
```

, to ensure that its second derivative is zero
outside the interval spanned by the data sites.

`csaps(x,y,p)`

lets you supply
the smoothing parameter. The smoothing parameter determines the relative
weight you would like to place on the contradictory demands of having *f* be
smooth *vs* having *f* be close
to the data. For `p`

= 0, *f* is
the least-squares straight line fit to the data, while, at
the other extreme, i.e., for `p`

= 1, *f* is
the variational,
or `natural' cubic spline interpolant. As `p`

moves
from 0 to 1, the smoothing spline changes from one extreme to the
other. The interesting range for `p`

is often near
1/(1 + *h*^{3}/6), with *h* the
average spacing of the data sites, and it is in this range that the
default value for `p`

is chosen. For uniformly spaced
data, one would expect a close following of the data for p = 1(1 + *h*^{3}/60)
and some satisfactory smoothing for p = 1/(1 + *h*^{3}/0.6). You can input a ```
p
> 1
```

, but this leads to a smoothing spline even rougher
than the variational cubic spline interpolant.

If the input `p`

is negative or empty, then
the default value for `p`

is used.

`[...,p] = csaps(...) `

also
returns the value of `p`

actually used whether or
not you specified `p`

. This is important for experimentation
which you might start with `[pp,p]=csaps(x,y)`

in
order to obtain a `reasonable' first guess for `p`

.

If you have difficulty choosing `p`

but have
some feeling for the size of the noise in `y`

, consider
using instead `spaps(x,y,tol)`

which, in effect,
chooses `p`

in such a way that the roughness measure

$${{\displaystyle \int \lambda (t)\left|{D}^{2}s(t)\right|}}^{2}\text{\hspace{0.05em}}dt$$

is as small as possible subject to the condition that the error measure

$$\sum w(j){\left|y(:,j)-s\left(x(j)\right)\right|}^{2}$$

does not exceed the specified `tol`

. This usually
means that the error measure equals the specified `tol`

.

The weight function *λ* in the roughness
measure can, optionally, be specified as a (nonnegative) piecewise
constant function, with breaks at the data sites `x `

,
by inputing for `p`

a *vector* whose `i`

th
entry provides the value of *λ* on the interval
(`x(i-1)`

.. `x(i)`

) for `i=2:length(x)`

.
The first entry of the input vector `p`

continues
to be used as the desired value of the smoothness parameter `p`

.
In this way, it is possible to insist that the resulting smoothing
spline be smoother (by making the weight function larger) or closer
to the data (by making the weight functions smaller) in some parts
of the interval than in others.

`csaps(x,y,p,[],w) `

lets
you specify the weights `w`

in the error measure,
as a vector of nonnegative entries of the same size as `x`

.

`values = csaps(x,y,p,xx) `

is the same as `fnval(csaps(x,y,p),xx)`

.

`csaps(x,y,p,xx,w) `

is the
same as `fnval(csaps(x,y,p,[],w),xx)`

.

`[...] = csaps({x1,...,xm},y,...) `

provides the ppform of an `m`

-variate tensor-product
smoothing spline to data on a rectangular grid. Here, the first argument
is a cell-array, containing the vectors `x1`

, ..., `xm`

,
of lengths `n1`

, ..., `nm`

, respectively.
Correspondingly, `y`

is an array of size `[n1,...,nm]`

(or
of size `[d,n1,...,nm]`

in case the data are `d`

-valued),
with y(:,*i*_{1}, ...,*i*_{m})
the given (perhaps noisy) value at the grid site *x*l(*i*_{1}),
...,*x*m(*i*_{m}).

In this case, `p`

if input must be a cell-array
with `m`

entries or else an `m`

-vector,
except that it may also be a scalar or empty, in which case it is
taken to be the cell-array whose `m`

entries all
equal the `p`

input. The optional second output argument
will always be a cell-array with `m`

entries.

Further, `w`

if input must be a cell-array
with `m`

entries, with `w{i}`

either
empty, to indicate the default choice, or else a nonnegative vector
of the same size as `xi`

.

**Example 1.**

x = linspace(0,2*pi,21); y = sin(x)+(rand(1,21)-.5)*.1; pp = csaps(x,y, .4, [], [ones(1,10), repmat(5,1,10), 0] );

returns a smooth fit to the (noisy) data that is much closer to the data in the right half, because of the much larger error weight there, except for the last data point, for which the weight is zero.

pp1 = csaps(x,y, [.4,ones(1,10),repmat(.2,1,10)], [], ... [ones(1,10), repmat(5,1,10), 0]);

uses the same data, smoothing parameter, and error weight but chooses the roughness weight to be only .2 in the right half of the interval and gives, correspondingly, a rougher but better fit there, except for the last data point, which is ignored.

A plot showing both examples for comparison can now be obtained by

fnplt(pp); hold on, fnplt(pp1,'r--'), plot(x,y,'ok'), hold off title(['cubic smoothing spline, with right half treated ',... 'differently:']) xlabel(['blue: larger error weights; ', ... 'red dashed: also smaller roughness weights'])

The resulting plot is shown below.

**Example 2. **This bivariate example
adds some uniform noise, from the interval [-1/2 .. 1/2], to values of the MATLAB^{®} `peaks`

function
on a 51-by-61 uniform grid, obtain smoothed values for these data
from `csaps`

, along with the smoothing parameters
chosen by `csaps`

, and then plot these smoothed values.

x = {linspace(-2,3,51),linspace(-3,3,61)}; [xx,yy] = ndgrid(x{1},x{2}); y = peaks(xx,yy); rng(0), noisy = y+(rand(size(y))-.5); [smooth,p] = csaps(x,noisy,[],x); surf(x{1},x{2},smooth.'), axis off

Note the need to transpose the array `smooth`

.
For a somewhat smoother approximation, use a slightly smaller value
of `p`

than the one, `.9998889`

,
used above by `csaps`

. The final plot is obtained
by the following:

smoother = csaps(x,noisy,.996,x); figure, surf(x{1},x{2},smoother.'), axis off

Was this topic helpful?