Can you use Batched partitioned nonlinear least squares to fit 5 parameters?

Hi,
I've seen people use batched partitioned nonlinear least squares to solve nonlinear problems with two unknown parameters.
I have a function with 5 unknown parameters and I want to fit it nonlinearly. I fitted my model using lsqnonlin, but it is extremely slow.
I was looking for ways to speed it up and came across batched partitioned nonlinear least squares, but I am not sure whether I can use it with 5 unknown parameters.
Thank you in advance.

 Accepted Answer

As the person who introduced the concept of batched partitioned nonlinear least squares here, I'll assume you are asking about using my BATCHPLEAS utility, found on the file exchange.
Yes, in theory, you can use it to solve problems with any number of parameters. Of course, theory and practice need not always coincide. There is no limit (implicit or explicit) on the number of parameters you can estimate.
But first, are you talking about a problem with 5 NONLINEAR parameters, and thus really more variables than that? Or, as is common in this arena, do you have 5 total parameters, of which perhaps 2 or 3 are nonlinear and the remainder linear? Remember that the partitioning reduces the search space for the estimation.
If you have no clue as to what partitioned nonlinear least squares is, or how it works, and the docs provided with BATCHPLEAS and/or FMINSPLEAS do not help you enough, then one of the chapters in the Seber & Wild book on nonlinear regression does cover the concept. And since Bates & Watts were the people who wrote the original papers on the topic, as I recall, I'd bet that the book by Bates & Watts on nonlinear regression does so too. I think Seber and Wild called the idea something like separable nonlinear least squares.
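For readers who want to see the idea before opening a book, here is a minimal sketch of separable least squares on a toy model, y = c1*exp(-k*x) + c2. It is not BATCHPLEAS code; the model, the data, and the names basis, rss, kfit, and cfit are all made up for illustration. Only k is searched over; c1 and c2 come from a linear solve at each trial k.

```matlab
% Toy separable model: y = c1*exp(-k*x) + c2 (k nonlinear, c1 and c2 linear)
x = linspace(0, 5, 50)';
ktrue = 1.3;
ctrue = [2; 0.5];
y = ctrue(1)*exp(-ktrue*x) + ctrue(2);   % noise-free synthetic data

% Basis matrix for a given k: one column per linear coefficient
basis = @(k) [exp(-k*x), ones(size(x))];

% Reduced objective: solve the linear part with backslash for each trial k,
% then return the residual sum of squares
rss = @(k) norm(y - basis(k)*(basis(k)\y))^2;

% One-dimensional search over k only (fminsearch ships with base MATLAB)
kfit = fminsearch(rss, 0.5);
cfit = basis(kfit)\y;                     % linear coefficients at the optimum

fprintf('k = %.4f, c1 = %.4f, c2 = %.4f\n', kfit, cfit(1), cfit(2));
```

The optimizer sees a one-parameter problem even though the model has three parameters, which is the reduction in search space described above.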
Next, remember that regardless of how many nonlinear parameters there are in your problem, nonlinear regression is often sensitive to starting values. BATCHPLEAS presumes the SAME set of starting values will be used for every member of the set. Problems with many nonlinear parameters might have local solutions that are not the global optimum. Or in some cases, some sub-problems might diverge, again due to sensitivity to the starting values. The more variables you have, the greater the risk of this happening. If partitioning does reduce the search space, so that in reality you have only 2 or 3 unknowns, then the partitioning typically also increases the robustness of the problem, so it is less sensitive to the need for good starting values.

5 Comments

Hi John,
Thank you for your reply. Indeed, I was wondering whether I can use your BATCHPLEAS utility.
My function has the following form:
W( x, y) = a1[a2 * exp(- x * a3 - y/ a4)+(1-a2)* exp(- x * c1)( a5 * exp (- y/ c2)+(1-a5)exp(y/ c3))],
where c1,c2,c3 are constants and a1,a2,a3,a4,a5 are the 5 unknown parameters to be estimated.
Is it possible to use your code to solve a function of the above form?
Is it possible? Yes. You have ONE linear parameter there, a1. The others, a2,a3,a4,a5 all appear in there in ways that force them to be nonlinear. So you have one nonlinear term there, with 4 nonlinear parameters.
syms y x a1 a2 a3 a4 a5 c1 c2 c3
pretty(a1*[a2 * exp(- x * a3 - y/ a4)+(1-a2)* exp(- x * c1)*( a5 * exp (- y/ c2)+(1-a5)*exp(y/ c3))])
ans =
   /       /          y  \              /                            /   y  \ \          \
a1 | a2 exp| - a3 x - -- | + exp(-c1 x) | exp(y/c3) (a5 - 1) - a5 exp| - -- | | (a2 - 1) |
   \       \          a4 /              \                            \   c2 / /          /
The parameters a2 and a5 act as weights in linear combinations of terms, while a3 and a4 sit inside the exponential. It is parameters that are inside nonlinear functions like exponentials that tend to cause problems.
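To make the role of a1 concrete, here is a small sketch showing that once a2 through a5 are held fixed, a1 drops out of a plain linear least squares solve. The constants c1, c2, c3, the x and y grids, and the "true" parameter values are hypothetical, chosen only so the example runs; the handle g is my own naming for the part of W that multiplies a1.

```matlab
% Hypothetical constants and parameter values, for illustration only
c1 = 0.05;  c2 = 40;  c3 = 80;
x = linspace(0, 50, 41)';
y = linspace(0, 100, 41)';
a = [3.0, 0.6, 0.02, 30, 0.7];            % [a1 a2 a3 a4 a5]

% Nonlinear part g, so that W = a1 * g, with q = [a2 a3 a4 a5]
g = @(q, x, y) q(1)*exp(-x*q(2) - y/q(3)) + ...
    (1 - q(1))*exp(-x*c1).*(q(4)*exp(-y/c2) + (1 - q(4))*exp(y/c3));

W = a(1) * g(a(2:5), x, y);                % synthetic, noise-free data

% With the nonlinear parameters fixed, a1 is an ordinary least squares estimate
G = g(a(2:5), x, y);
a1hat = G \ W;

fprintf('a1 recovered as %.6f\n', a1hat);
```

A partitioned solver repeats that linear solve inside every evaluation of the nonlinear objective, so the optimizer only has to search over the four nonlinear parameters.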
As well, you need to be careful that these exponentials are not causing underflows or overflows. So starting values are always important where exponentials are involved, but this is simple enough that reasonable starting values should be easy to choose.
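A quick way to see the underflow and overflow limits in double precision is sketched below. The time grid t and the rate k are hypothetical values picked to trigger the effect.

```matlab
% Limits of exp() in double precision
tiny = exp(-746);      % underflows to exactly 0
huge = exp(710);       % overflows to Inf

% A poorly scaled starting value can wipe out part of the model:
t = (0:10:1000)';      % hypothetical sample points
k = 1;                 % hypothetical (too large) rate constant
nzero = sum(exp(-k*t) == 0);   % samples where the term has vanished entirely

fprintf('exp(-746) = %g, exp(710) = %g, vanished samples: %d\n', tiny, huge, nzero);
```

When a term evaluates to exactly 0 or Inf over much of the data, the residuals stop responding to the parameter, and the optimizer has nothing to work with.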
You also need sufficient data to be able to support estimating 5 total unknown parameters. That means more than just 5 data points per sub-problem. I'd normally suggest at the very least 10-15 data points per sub-problem, more is always better than less.
Will the batched solver work, and be more efficient than simply using a loop? Probably.
In practice, be careful not to overdo it on the batchsize. The batchsize is a scheme where I group sub-problems into one batch. So perhaps you might have 10000 sub-problems to solve, with 4 nonlinear parameters per problem. A single problem with 40000 nonlinear unknowns will probably be far slower than just a big loop, because the linear algebra will get in the way for a problem that large. Instead, it might be better to break it down internally into 100 blocks of subproblems, with 100 in each batch. BATCHPLEAS does all the work there for you, as long as you specify the batchsize.
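The arithmetic behind that blocking can be sketched as follows. This only illustrates how 10000 sub-problems split into batches of 100; it is not the internal code of BATCHPLEAS, and the variable names are my own.

```matlab
nprob     = 10000;   % number of sub-problems (figure from the text above)
nnlp      = 4;       % nonlinear parameters per sub-problem
batchsize = 100;     % sub-problems grouped into one solve

nbatch   = ceil(nprob/batchsize);
unknowns = zeros(nbatch, 1);

for b = 1:nbatch
    % Indices of the sub-problems that belong to batch b
    idx = (b-1)*batchsize + 1 : min(b*batchsize, nprob);
    % Each batch is one optimization with this many nonlinear unknowns
    unknowns(b) = nnlp * numel(idx);
end

fprintf('%d batches, up to %d nonlinear unknowns each\n', nbatch, max(unknowns));
```

So each solve handles 400 unknowns rather than 40000, which keeps the linear algebra in each step manageable.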
Thank you for the detailed explanation, that was very useful. I struggle a bit to define the funlist. I have a function of two variables x and y (x and y are vectors).
I have the following objective function:
W( x, y) = p(1)[p(2) * exp(- x * p(3) - y/ p(4))+(1-p(2))* exp(- x * c1)( p(5) * exp (- y/ c2)+(1-p(5))exp(y/ c3))],
As explained above the p(1) is a linear parameter and the other 4 nonlinear. That means that in my funlist I should only include the 4 nonlinear parameters, right?
Following the examples provided in the demo file I wrote the code:
INLPlb = 0;
INLPub = [1 1 0.1 1];
batchsize = 1000;
INLPstart = [p0(2), p0(3), p0(4), p0(5)]; %initial values of nonlinear parameters
funlist = {@(p,x,y) exp(-x*p(3)-y/p(4)), @(p,x,y) exp(-x*c1),
@(p,x,y) exp(-y/c2), @(p,x,y) exp(-y/c3)} ;
Did I define the funlist correctly?
No. First, setting batchsize as large as 1000 will probably not be fast. It may even be slower than a loop that processes each data set separately.
As I explained, batchsize for 4 nonlinear variables will probably be best somewhere between 25 and 200. The code will do the work for you. I might try 50 or 100 to see how it does.
Next, there is only ONE function in funlist here, since there is only ONE linear variable. Also you will pass in ONE array of independent variables. I'll call it xy.
You also need to learn about the use of .* and ./ as operators. You can freely multiply or divide anything by a scalar variable. But if you multiply two vectors, then you need to use .* to do element-wise multiplication. Likewise, ./ is needed if you divide one vector into another element-wise, or if you divide a scalar by the elements of a vector.
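A few lines with made-up vectors u and v show the difference:

```matlab
u = [1; 2; 3];
v = [4; 5; 6];

a = 2*u;      % scalar times vector: plain * is fine
b = u.*v;     % element-wise product
c = u./v;     % element-wise quotient
d = 1./v;     % scalar divided by each element needs ./

% u*v throws an error here, because * is matrix multiplication and the
% inner dimensions (3-by-1 times 3-by-1) do not agree.
% 1/v is matrix right-division, not an element-wise reciprocal.
disp([a, b, c, d]);
```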
xy = [x(:),y(:)];
funlist = {@(p,xy) (p(2) * exp(-xy(:,1)*p(3) - xy(:,2)/p(4))+(1-p(2)).*exp(-xy(:,1)*c1).*(p(5)*exp(-xy(:,2)/c2)+(1-p(5)).*exp(xy(:,2)/c3)))};
I think I got that right. At least, MATLAB accepts it as valid syntax.
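One way to go a step beyond a syntax check is to evaluate the handle once on dummy inputs and confirm it returns one finite value per data point. The constants and the test vector ptest below are hypothetical. Note that the handle as written indexes p(2) through p(5), so this check passes a five-element vector with an unused first entry. If the solver passes only the four nonlinear parameters to the handle (an assumption worth checking against the BATCHPLEAS documentation), the indices would need to shift to p(1) through p(4).

```matlab
% Hypothetical constants and grids, for illustration only
c1 = 0.05;  c2 = 40;  c3 = 80;
x = linspace(0, 50, 41);
y = linspace(0, 100, 41);
xy = [x(:), y(:)];

% Same expression as the funlist entry above, stored as a plain handle
f = @(p,xy) (p(2) * exp(-xy(:,1)*p(3) - xy(:,2)/p(4))+(1-p(2)).*exp(-xy(:,1)*c1).*(p(5)*exp(-xy(:,2)/c2)+(1-p(5)).*exp(xy(:,2)/c3)));

ptest = [NaN, 0.6, 0.02, 30, 0.7];   % p(1) is never read by the handle
out = f(ptest, xy);

% One finite value per row of xy is what a least squares solver needs
fprintf('output is %d-by-%d, all finite: %d\n', size(out,1), size(out,2), all(isfinite(out)));
```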
The following code performs nonlinear least squares fitting using the Matlab function lsqnonlin.
img = data(:,:,:,:); % 4D array: 26 images (x-pixels: 150, y-pixels: 190) and 41 time points
% c1 is a 3D array with dimensions [150 190 26], calculated beforehand.
% This array doesn't change and is treated as fixed.
% c2 and c3 are both scalars
% x and y are 1D vectors with 41 measurements
params=zeros(size(img,1),size(img,2),size(img,3),10);
for i=1:size(img,1)
    for j=1:size(img,2)
        for k=1:size(img,3)
            [p,residual]=lsqnonlin(@(p)(s(isfinite(s))-p(1)*(p(2)*exp(-x(isfinite(s))*p(3)-y(isfinite(s))/p(4))+(1-p(2))*exp(-x(isfinite(s))*c1(i,j,k)).*(p(5)*exp(-y(isfinite(s))/c2) + (1-p(5))*exp(-y(isfinite(s))/c3)) )),[p0(1),p0(2),p0(3),p0(4),p0(5),p0(6),p0(7)],[0,0,c1(i,j,k),0,0,0,0],[100,1,1,1,10000,1000,1000],options);
            params(i,j,k,1:9)=[p,c1(i,j,k),residual];
        end
    end
end
To speed it up I tried to use batchpleas instead, but I'm getting errors. One of them is “funlist must be a cell array of functions, even if only one fun”.
My code looks as follows:
INLPlb = 0;
INLPub = [1 1 0.1 1];
batchsize = 50;
INLPstart = [p0(2), p0(3), p0(4), p0(5)]; %initial values of nonlinear parameters
%Here I defined the Ydata to be my objective function using the initial parameter values
Ydata = p0(1)*(p0(2)*exp(-x*p0(3)-y/p0(4))+(1-p0(2))*exp(-x*c1).*(p0(5)*exp(-y/c2)+ (1-p0(5))*exp(-y/c3)));
xy = [x(:),y(:)];
%Here I defined the Xdata to be the inputs of the Data
Xdata = xy;
funlist = {@(p,xy) (p(2) * exp(-xy(:,1)*p(3) - xy(:,2)/p(5))+(1-p(2)).*exp(-xy(:,1)*c1).*(p(4)*exp(-xy(:,2)/c2)+(1-p(4)).*exp(xy(:,2)/c3)))};
[INLP,ILP] = batchpleas(Xdata,Ydata,funlist,INLPstart,batchsize,INLPlb,INLPub);
One more question I have is whether the way I defined Xdata and Ydata is the correct one to use with the batchpleas function.
