# Least squares fit to multiple differently-sized data sets simultaneously

Richard Cobley on 12 Aug 2016
Commented: Vipultomar on 9 Feb 2017
I can use lsqcurvefit to simultaneously fit a system of non-linear equations using matrix arguments. However, because every element of a matrix must be defined, I think all the data sets have to be the same size, and I can't find a way around that.
To make it clearer, if I make up some simple data sets which are the same size:
x=[1 3 5 7; 5 7 9 11; 9 11 13 15];
y=[10 11 12 13; 10 11 12 13; 10 11 12 13];
and want to fit each with a straight line, where they all share the same gradient (so the solver must simultaneously fit that gradient across all three data sets, but fit an individual offset to each set separately):
%variables v(1:4) are m, c1, c2, c3
fn = @(v,xdata)[v(1).*xdata(1,:) + v(2); v(1).*xdata(2,:) + v(3); v(1).*xdata(3,:) + v(4)];
I can set up a guess and solve using:
x0 = [1; 10; 9; 8];
fitvars = lsqcurvefit(fn,x0,x,y);
The problem comes when the data sets don't have the same number of points - if the second row of xdata and ydata only had three points, how do I pass the data to lsqcurvefit?
I've tried padding the matrices with NaN but then the objective function returns undefined numbers and the solver complains. I tried converting the data and the function to cell arrays to allow different lengths, but lsqcurvefit won't take it.
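For reference, one way the padding idea can be made to work is to mask the padded slots so they always contribute a zero residual. This is only a minimal sketch under assumptions: the names `xpad`, `ypad`, `mask`, `model` and `fnMasked` are made up for illustration, and the data is the example above with the last point of the second row dropped.

```matlab
% Sketch: NaN-padded rows, with the padded slots masked out of the residual.
xpad = [1 3 5 7; 5 7 9 NaN; 9 11 13 15];
ypad = [10 11 12 13; 10 11 12 NaN; 10 11 12 13];

mask = ~isnan(xpad);   % true where a real data point exists
xpad(~mask) = 0;       % any finite placeholder works; it is masked below
ypad(~mask) = 0;       % target value 0 in the padded slots

% variables v(1:4) are m, c1, c2, c3
model = @(v,xd) [v(1).*xd(1,:) + v(2); v(1).*xd(2,:) + v(3); v(1).*xd(3,:) + v(4)];
fnMasked = @(v,xd) mask .* model(v,xd);   % model output forced to 0 where padded

x0 = [1; 10; 9; 8];
fitvars = lsqcurvefit(fnMasked, x0, xpad, ypad);
```

Because both the masked model output and `ypad` are 0 in the padded slot, that slot never affects the sum of squares, so the solver only sees the real points.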
I'm aware this particular example can be solved trivially, but the real functions and data sets are non-linear (which explains why I'm using lsqcurvefit) and the system has a minimum of five data sets to solve with one shared variable between each function, as well as several unique variables and constants to each.
Thanks for any help.

John D'Errico on 2 Sep 2016
Edited: John D'Errico on 2 Sep 2016
Yes. There is a way. Simply store the data in a cell array. Make the vectors COLUMN vectors. I was lazy here when I did that, using transposes.
x={[1 3 5 7]' [5 7 9 11]' [9 11 13 15]'};
y={[10 11 12 13]' [10 11 12 13]' [10 11 12 13]'};
yy = vertcat(y{:});
%variables v(1:4) are m, c1, c2, c3
fn = @(v,xdata) [v(1).*xdata{1} + v(2); v(1).*xdata{2} + v(3); v(1).*xdata{3} + v(4)];
x0 = [1; 10; 9; 8];
fitvars = lsqcurvefit(fn,x0,x,yy);
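If passing a cell array as xdata is a concern in a particular release, a variant of the same stacking idea is to concatenate the x values as well and let the model look up each point's offset through a group index. This is a sketch, not the method from the answer above: `xx`, `n` and `g` are hypothetical names, the data has one point dropped from the third set to show unequal lengths, and `repelem` needs R2015a or later.

```matlab
% Sketch: stack x and y into single columns and index the offsets by group.
x = {[1 3 5 7]', [5 7 9 11]', [9 11 15]'};
y = {[10 11 12 13]', [10 11 12 13]', [10 11 13]'};

xx = vertcat(x{:});
yy = vertcat(y{:});

n = cellfun(@numel, x);             % number of points in each data set
g = repelem((1:numel(x))', n(:));   % data set index of every stacked point

% variables v(1:4) are m, c1, c2, c3; v(1+g) picks the offset for each point
fn = @(v,xd) v(1).*xd + reshape(v(1+g), [], 1);

x0 = [1; 10; 9; 8];
fitvars = lsqcurvefit(fn, x0, xx, yy);
```

The group index also scales to more data sets without having to write out one term per set in the anonymous function.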

Richard Cobley on 18 Sep 2016
Thanks John - this approach worked.
I tidied up a couple of typos, dropped out one point from the data to check it works, and wrote a (terrible) loop approach to re-shape the output from the function to match the original cell array format. The working solution is below for anyone else who has this problem.
Thanks a lot,
Richard.
x={[1 3 5 7]' [5 7 9 11]' [9 11 15]'};
y={[10 11 12 13]' [10 11 12 13]' [10 11 13]'};
yy = vertcat(y{:});
%variables v(1:4) are m, c1, c2, c3
fn = @(v,xdata) [v(1).*xdata{1} + v(2); v(1).*xdata{2} + v(3); v(1).*xdata{3} + v(4)];
x0 = [1; 10; 9; 8];
fitvars = lsqcurvefit(fn,x0,x,yy);
yfit = fn(fitvars,x);
%re-shape fitted y points back to original format
yfitcell=cell(size(y));
yyc = yfit;
for counter1 = 1 : size(y,2)
    n = length(y{counter1}); %number of points in this data set
    yfitcell{counter1} = yyc(1:n); %take this set's fitted values from the front
    yyc(1:n)=[]; %then remove them from the stack
end
figure; hold on;
cellfun(@plot,x,y,{'.' '.' '.'})
cellfun(@plot,x,yfitcell,{'--' '--' '--'})
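The re-shaping loop can probably be replaced with a single `mat2cell` call. A minimal sketch, assuming the stacked model output is a column in the same order as the data sets; the parameter vector `v` here is just an illustrative value, not a solver result.

```matlab
% Sketch: split a stacked column of fitted values back into per-set cells.
x = {[1 3 5 7]', [5 7 9 11]', [9 11 15]'};
y = {[10 11 12 13]', [10 11 12 13]', [10 11 13]'};
fn = @(v,xdata) [v(1).*xdata{1} + v(2); v(1).*xdata{2} + v(3); v(1).*xdata{3} + v(4)];

v = [0.5; 9.5; 7.5; 5.5];     % illustrative values for m, c1, c2, c3
yfit = fn(v, x);              % stacked column, one entry per data point

% Row block sizes come from the original data; transpose to match y's 1x3 shape
yfitcell = mat2cell(yfit, cellfun(@numel, y), 1)';
```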
Vipultomar on 9 Feb 2017
Hi. Thanks for sharing the solution. So overall, all the ydata (of different lengths) have been concatenated into a single column and are compared with the theoretical ydata values returned by the function (which must therefore be stacked in the same order). Right?
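One way to check that reading is to compare the stacked residual with the `resnorm` output of lsqcurvefit, which is documented as the squared 2-norm of the residual at the solution. This is a sketch under assumptions: the model captures the cell array `x` directly and ignores the xdata argument, and `model`, `xx` and `r` are hypothetical names.

```matlab
% Sketch: the model output and yy must be stacked in the same order.
x = {[1 3 5 7]', [5 7 9 11]', [9 11 15]'};
y = {[10 11 12 13]', [10 11 12 13]', [10 11 13]'};
xx = vertcat(x{:});
yy = vertcat(y{:});

% variables v(1:4) are m, c1, c2, c3; the cell array x is captured by the handle
model = @(v) [v(1).*x{1} + v(2); v(1).*x{2} + v(3); v(1).*x{3} + v(4)];
fn = @(v,~) model(v);         % xdata argument is unused in this sketch

x0 = [1; 10; 9; 8];
[fitvars, resnorm] = lsqcurvefit(fn, x0, xx, yy);

r = fn(fitvars, xx) - yy;     % element k of the model is compared with yy(k)
```

If the model rows were stacked in a different order from `yy`, the sizes would still match but each residual would pair the wrong points, so the ordering has to be kept consistent by construction.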

Qu Cao on 24 Aug 2016
The data sets should have the same number of points. The 'xdata' and 'ydata' should be well-defined vectors or matrices.

#### 1 Comment

Richard Cobley on 2 Sep 2016
The data sets I need to fit do not have the same number of points - that is the problem I am trying to solve.