# Adventures in curve fitting

This example solves a complex curve-fitting problem that involves parameter pooling and stratification, using a nonlinear least-squares approach. It also takes advantage of two language features introduced in MATLAB 7:

- Anonymous functions
- Nested functions

## Model: gaussian peaks on sloped background

Consider a gaussian peak. The ubiquitous bell curve permeates science and engineering, from the normal distribution in statistics to spectroscopic line and peak shapes. A general gaussian peak can be characterized by three parameters: height `[a]`, center `[b]`, and width `[c]`. The peak equation can be defined with an anonymous function, `pk()`.

```matlab
pk = @(x,a,b,c) a*exp(-0.5*((x-b)/c).^2);
```

Also consider a linear background with two parameters: level `d` and slope `e`. The background equation can be defined with an anonymous function, `bg()`.

```matlab
bg = @(x,d,e) d*x+e;
```

In this example, anonymous functions provide two main benefits:

- concise notation, and
- function declaration within a script M-file.
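The demo itself is MATLAB, but the two anonymous functions translate naturally to lambdas in other languages. A minimal NumPy sketch (not part of the original demo) for readers following along in Python:

```python
import numpy as np

# Peak: height a, center b, width c (mirrors the MATLAB pk anonymous function)
pk = lambda x, a, b, c: a * np.exp(-0.5 * ((x - b) / c) ** 2)

# Linear background: slope d, level e (mirrors bg)
bg = lambda x, d, e: d * x + e

# At the peak center the gaussian factor is 1, so pk returns the height a.
print(pk(10.0, 50, 10, 2) + bg(10.0, 0.1, 5))  # 50 + (0.1*10 + 5) = 56.0
```

As in the MATLAB version, the payoff is concise notation: the model is declared inline rather than in a separate file.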

## Pooling & stratification

For added difficulty, let's complicate things. Suppose you have multiple data sets, each with its own peak; however, all peaks must be fitted against a common background. To do this you **stratify** the data set for the individual peak parameters `[a(i) b(i) c(i)]` and **pool** the whole data set for the common background parameters `[d e]`.

Define the stratified peak parameters.

```matlab
a = [50 10 20];
b = [10 15 25];
c = [ 2  4  1];
```

Define the pooled background parameters.

```matlab
d = 0.1;
e = 5;
```

Bundle and preserve baseline parameter values for comparison later.

```matlab
origParams = [a, b, c, d, e]
```

```
origParams =
  Columns 1 through 8
   50.0000   10.0000   20.0000   10.0000   15.0000   25.0000    2.0000    4.0000
  Columns 9 through 11
    1.0000    0.1000    5.0000
```

## Example data

Using the known model parameters, let's generate some data. To simulate measured data, we include normally distributed random noise. By default, the `randn` function generates random numbers with a mean of zero and a variance of one. Again, we can take advantage of the anonymous functions `pk()` and `bg()`.

```matlab
x = linspace(0,30,200)';
for i=1:3
    y(:,i) = pk(x,a(i),b(i),c(i)) + bg(x,d,e) + randn(size(x));
end
```
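The same data-generation step can be sketched in NumPy, where `rng.normal` plays the role of `randn` (a rough translation, not part of the original demo):

```python
import numpy as np

pk = lambda x, a, b, c: a * np.exp(-0.5 * ((x - b) / c) ** 2)
bg = lambda x, d, e: d * x + e

rng = np.random.default_rng(0)          # seeded for reproducibility
x = np.linspace(0, 30, 200)
a, b, c = [50, 10, 20], [10, 15, 25], [2, 4, 1]
d, e = 0.1, 5

# One column per data set: peak i, plus the shared background, plus unit noise
y = np.column_stack([pk(x, a[i], b[i], c[i]) + bg(x, d, e)
                     + rng.normal(size=x.size) for i in range(3)])
print(y.shape)  # (200, 3)
```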

Plot example data.

```matlab
plot(x,y,'.')
xlabel X
ylabel Y
title('Example Data')
legend('1','2','3')
```

## Initial values

Good initial values produce good fit results, so it usually pays to exploit characteristics of the data to estimate parameters robustly. This example relies on statistical properties of the data.

Estimate the pooled background parameters first.

```matlab
d = 0;
e = min(y(:))
```

```
e =
    2.4846
```

Estimate stratified peak parameters using background estimate.

```matlab
for i=1:3
    pdf = y(:,i) - e;
    a(i) = max(pdf);
    pdf = pdf/sum(pdf);
    b(i) = sum(pdf.*x);
    v = sum(pdf.*(x-b(i)).^2);
    c(i) = sqrt(v);
end
```
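The idea in the loop above is to treat each background-subtracted curve like an unnormalized probability density: its maximum estimates the height, its weighted mean estimates the center, and the square root of its weighted variance estimates the width. A NumPy sketch of the same moment estimates on a single noiseless peak (illustrative, not part of the demo):

```python
import numpy as np

x = np.linspace(0, 30, 2000)
peak = 50 * np.exp(-0.5 * ((x - 10) / 2) ** 2)   # noiseless gaussian: a=50, b=10, c=2

a_est = peak.max()                                # height ~ maximum value
pdf = peak / peak.sum()                           # normalize to a discrete pdf
b_est = (pdf * x).sum()                           # weighted mean ~ center
c_est = np.sqrt((pdf * (x - b_est) ** 2).sum())   # weighted std ~ width

print(round(a_est, 1), round(b_est, 1), round(c_est, 1))
```

With no noise, the estimates recover the true `(a, b, c)` almost exactly; on noisy data they are merely good starting points, which is all an iterative fitter needs.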

Bundle initial parameter values by simple array concatenation.

```matlab
estParams = [a, b, c, d, e]
```

```
estParams =
  Columns 1 through 8
   55.2101   14.9888   25.1841   12.2905   16.1393   19.0872    6.0431    6.9333
  Columns 9 through 11
    8.2397         0    2.4846
```

Plot the initially estimated model, again reusing the anonymous functions `pk()` and `bg()`.

```matlab
for i=1:3
    f(:,i) = pk(x,a(i),b(i),c(i)) + bg(x,d,e);
end
hold on, plot(x,f,':'), hold off
title('Estimated Model')
legend('Data1','Data2','Data3','Est.1','Est.2','Est.3')
```

## Parameter fitting

Fitting generally involves minimizing an error function. Here the error function is defined in a function M-file, `PkBgErr()`. We can use the `type` command to display the M-file contents.

```matlab
type PkBgErr
```

```
function h = PkBgErr(x,y,pk,bg)
%PKBGERR returns function handle.
h = @errFcn;
    function err = errFcn(param) %nested
        a = param(1:3);
        b = param(4:6);
        c = param(7:9);
        d = param(10);
        e = param(11);
        err = zeros(size(y));
        for i=1:3
            err(:,i) = pk(x,a(i),b(i),c(i)) + bg(x,d,e) - y(:,i);
        end %for loop
    end %nested function
end %main function
```

In this example `PkBgErr()` takes several inputs. The data values, `[x]` and `[y]` are passed in along with function handles, `pk` and `bg`. A function handle, `errFcn` is returned for the actual error function, `errFcn()`.

```matlab
errFcn = PkBgErr(x,y,pk,bg);
```

As shown above, `errFcn()` takes as input the bundled initial parameter values, `[param]`. It returns a vector of error values, `[err]`. These are simply the difference between the model and data `[y]` values at each `[x]` value. During fitting these error values are what gets minimized.

Note that `errFcn()` is a nested function, defined inside the primary M-file function, `PkBgErr()`. The variables `[x]`, `[y]`, `pk`, and `bg` are passed into `PkBgErr()`, but they are also in scope within the nested function, `errFcn()`. This simplifies the syntax and avoids passing the data through the error function explicitly.

Also note that `errFcn()` reuses the same anonymous functions, `pk()` and `bg()`. Since the function handles, `pk` and `bg` get passed into `PkBgErr()`, they too are in scope from within the nested function, `errFcn()`.
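This nested-function pattern, an outer function that captures the data and returns an error function of the parameters alone, is what other languages call a closure. A hypothetical Python equivalent of `PkBgErr` (names and layout are illustrative):

```python
import numpy as np

def make_err_fcn(x, y, pk, bg):
    """Return a residual function of the bundled parameter vector only.
    x, y, pk, and bg are captured from the enclosing scope, just as the
    nested errFcn sees the arguments of PkBgErr."""
    def err_fcn(param):
        a, b, c = param[0:3], param[3:6], param[6:9]
        d, e = param[9], param[10]
        err = np.empty_like(y)
        for i in range(3):
            err[:, i] = pk(x, a[i], b[i], c[i]) + bg(x, d, e) - y[:, i]
        return err.ravel()          # flatten for least-squares routines
    return err_fcn

# Quick check: with noiseless data, the residual at the true parameters is zero.
pk = lambda x, a, b, c: a * np.exp(-0.5 * ((x - b) / c) ** 2)
bg = lambda x, d, e: d * x + e
x = np.linspace(0, 30, 200)
true = np.array([50, 10, 20, 10, 15, 25, 2, 4, 1, 0.1, 5.0])
a, b, c, d, e = true[0:3], true[3:6], true[6:9], true[9], true[10]
y = np.column_stack([pk(x, a[i], b[i], c[i]) + bg(x, d, e) for i in range(3)])
err_fcn = make_err_fcn(x, y, pk, bg)
print(abs(err_fcn(true)).max())  # 0.0
```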

At this point we are ready to do the actual parameter fitting. There are several approaches one can take, as described in Technical Note 1508 on the MathWorks web site: http://www.mathworks.com/support/tech-notes/1500/1508.html

This example uses the `lsqnonlin` command from the Optimization Toolbox.

```matlab
fitParams = lsqnonlin(errFcn,estParams)
```

```
Optimization terminated: relative function value
 changing by less than OPTIONS.TolFun.

fitParams =
  Columns 1 through 8
   49.8370    9.7545   19.8606    9.9965   15.1315   25.0110    1.9854    3.9756
  Columns 9 through 11
    1.0019    0.1005    5.0461
```

The `lsqnonlin` command performs nonlinear least-squares minimization. `errFcn()` already knows the data and the model. `lsqnonlin` starts with `[estParams]` and evaluates `errFcn()`, then iterates with different variations of the parameter values. Once convergence is achieved, `[fitParams]` are returned.
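Outside MATLAB, SciPy's `least_squares` plays the same role as `lsqnonlin`: it takes a residual function and the bundled starting values and iterates to a minimum. A sketch of the whole fit, assuming SciPy is available (the model, seed, and perturbed starting guess are illustrative choices, not from the demo):

```python
import numpy as np
from scipy.optimize import least_squares

pk = lambda x, a, b, c: a * np.exp(-0.5 * ((x - b) / c) ** 2)
bg = lambda x, d, e: d * x + e

rng = np.random.default_rng(0)
x = np.linspace(0, 30, 200)
true = np.array([50, 10, 20, 10, 15, 25, 2, 4, 1, 0.1, 5.0])
a, b, c, d, e = true[0:3], true[3:6], true[6:9], true[9], true[10]
y = np.column_stack([pk(x, a[i], b[i], c[i]) + bg(x, d, e) for i in range(3)])
y += rng.normal(size=y.shape)           # unit-variance noise, as with randn

def residuals(p):
    pa, pb, pc, pd, pe = p[0:3], p[3:6], p[6:9], p[9], p[10]
    model = np.column_stack([pk(x, pa[i], pb[i], pc[i]) + bg(x, pd, pe)
                             for i in range(3)])
    return (model - y).ravel()          # what gets minimized

p0 = true * 1.05                        # deliberately perturbed starting guess
fit = least_squares(residuals, p0)
print(fit.success)
```

As with `lsqnonlin`, the solver only ever sees the residual function and the parameter vector; the data are baked in via the closure over `x` and `y`.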

Unbundle fitted parameters.

```matlab
a = fitParams(1:3);
b = fitParams(4:6);
c = fitParams(7:9);
d = fitParams(10);
e = fitParams(11);
```

Now we can plot the fitted model (again reusing the anonymous functions `pk()` and `bg()`).

```matlab
for i=1:3
    f(:,i) = pk(x,a(i),b(i),c(i)) + bg(x,d,e);
end
hold on, plot(x,f), hold off
title('Fitted Model')
legend('Data1','Data2','Data3','Est.1','Est.2','Est.3','Fit1','Fit2','Fit3')
```

## Final analysis

One way to assess how well a model fits the data is to compare parameters. In this case we started with known parameters as
a baseline.

```matlab
origParams
```

```
origParams =
  Columns 1 through 8
   50.0000   10.0000   20.0000   10.0000   15.0000   25.0000    2.0000    4.0000
  Columns 9 through 11
    1.0000    0.1000    5.0000
```

Using these parameters we generated a data set (plus some random variation). Fitting the model to the data resulted in a set
of fitted model parameters.

```matlab
fitParams
```

```
fitParams =
  Columns 1 through 8
   49.8370    9.7545   19.8606    9.9965   15.1315   25.0110    1.9854    3.9756
  Columns 9 through 11
    1.0019    0.1005    5.0461
```

We can display the fitted parameters relative to their baseline values by simple division (using the element-wise division operator, `./`).

```matlab
fitParams./origParams
```

```
ans =
  Columns 1 through 8
    0.9967    0.9755    0.9930    0.9996    1.0088    1.0004    0.9927    0.9939
  Columns 9 through 11
    1.0019    1.0049    1.0092
```

For a perfect fit, all ratios would be exactly one. Ratios above one indicate overestimation. Ratios below one indicate underestimation.

For reference, this was the order of bundled parameters.

```matlab
params = [a(1:3) b(1:3) c(1:3) d e]
```