Non-constant statistical weights in non-linear regression

Question

0 votes

Hello everyone, hope this message finds you and your families safe and healthy.

I am trying to perform weighted non-linear regression, where the weight factors are not contant. To clarify everything consider the following example:

exp_data = importdata('data.txt');
xdata = exp_data(:,1);
ydata = [exp_data(:,2),exp_data(:,3)];
x0 = [1 1 1 1];
fitfcn = @(c,xdata)test_function(c,xdata);
[lb] = [1 1 1 1]; ub = [10 10 10 10];
problem = createOptimProblem('lsqcurvefit','x0',x0,'objective',fitfcn,...
'lb',lb,'ub',ub,'xdata',xdata,'ydata',ydata);
ms = MultiStart('PlotFcns',@gsplotbestf);
[xmulti,errormulti] = run(ms,problem,500)

Normally, when a constant weight, w, is used one can simply multiply ydata and the expression in the function handle with w:

exp_data = importdata('data.txt');
xdata = exp_data(:,1);
ydata = [exp_data(:,2),exp_data(:,3)];
w = 1./sqrt(((ydata(:,1).^2)+(ydata(:,2).^2)));
ydata2 = ydata.*w;
x0 = [1 1 1 1];
fitfcn = @(c,xdata)test_function(c,xdata).*w;
[lb] = [1 1 1 1]; ub = [10 10 10 10];
problem = createOptimProblem('lsqcurvefit','x0',x0,'objective',fitfcn,...
'lb',lb,'ub',ub,'xdata',xdata,'ydata',ydata2);
ms = MultiStart('PlotFcns',@gsplotbestf);
[xmulti,errormulti] = run(ms,problem,500)

In my case I have two options. Either calculating w by using the experimental (imported) values or the calculated ones. The first case is straightforward (since the weight vector is defined and constant) and can be solved as above. However, due to the existence of random errors and an increased possibility of bias, I have to resort to the second case which is more complicated (at least for me), since the weight vector changes each time the c vector changes (from the x0 guess values) to produce an output based on the objective function. The objective function produces as an ouput a matrix of n x 2 dimensions (the real and imaginary part of a complex number). My goal is to use as weight the inverse of the modulus of the produced (calculated) values, i.e. the ydata in w above would be the calculated ydata for each c.

My question is how I can assign these calculated values to w.

Any help/suggestions would be highly appreciated.

Thank you in advance for your time.

Regards

0 Comments
Show -2 older comments Hide -2 older comments

Sign in to comment.

Sign in to answer this question.

Follow Question

Answer 1

Bjorn Gustavsson on 6 May 2020

0 votes

When I have this type of problems (fitting of functions to photon-counting-data where the weights should be calculated from the function - Poisson-statistics using the expected values), I typically get good enough results with your first approach (the weights calculated from the data doesn't differ much from the weights calculated from the fitted function). For the next-level attempt I've thought about running this iteratively, first use the data-weights, then repeat with weights calculated from the first solution - which hopefully should be close to the optimal solution.

If that is not good enough, you might have to do this "properly" and calculate the weights as you should - you might have to be careful and make sure that the weights don't blow up/down. If your data doesn't have normal-distributed noise you might have to use the proper "norm" to minimize - this might force you away from any lsq-function - unfortunately.

HTH

3 Comments
Show 1 older comment Hide 1 older comment

Bjorn Gustavsson on 6 May 2020

What you want to do is to get a maximum-likelihood solution, right? This requires you to have some knowledge/grasp/estimate of the statistical characteristics of your data (if all data can be assumed to be samples from some normal-distributions we get the least-square-solution, weighted or not. If you have some other noise-characteristics, for example Cauchy-distribution (with much longer tails) you have to maximize another likelihood-function). If you think that you have data that is reasonably normal-distributed except for some fraction of outliers you could take a look at Huber-like norms, that's one way to reduce the impact of outliers.

thanos on 7 May 2020

Edited: thanos on 7 May 2020

Thank you very much for your illuminating comments. Electrical impedance measurements are usually measured by the application of a perturbation signal (sinuosidal excitation signal). The excitation and response signals (voltage and current) are recorded using a constant sample rate and then are transferred to the frequency domain (by FT) to calculate the impedance of the system. For an impedance spectrum, several frequencies are applied. The noise component of the signal at each frequency (isolated by FT) results in a Gaussian distribution around the true value of the impedance. I believe that this is why the least-square-solution is implemented (as you pointed out). However, I will totally have in mind the different distributions you mentioned and try to see if they can be applied to my case (especially the Huber function).

Sign in to comment.

Non-constant statistical weights in non-linear regression

0 Comments
Show -2 older comments Hide -2 older comments

Answers (1)

3 Comments
Show 1 older comment Hide 1 older comment

Categories

Products

Release

Tags

Community Treasure Hunt

Non-constant statistical weights in non-linear regression

0 Comments Show -2 older comments Hide -2 older comments

Answers (1)

3 Comments Show 1 older comment Hide 1 older comment

Categories

Products

Release

Tags

See Also

Community Treasure Hunt

0 Comments
Show -2 older comments Hide -2 older comments

3 Comments
Show 1 older comment Hide 1 older comment