How can I calculate the confidence interval for "lsqnonneg" regression?
John D'Errico on 3 May 2017
Edited: John D'Errico on 3 May 2017
The standard solutions for confidence intervals do not apply to bound constrained regression (lsqnonneg).
That normally leaves a bootstrap/jackknife solution, which can in theory compute confidence intervals on any such problem.
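As a rough illustration of the bootstrap idea, here is a minimal sketch using case resampling and percentile intervals. The data (C, d, xTrue), the sample size of 50, and the 1000 resamples are all made up for the example and are not from the question; this is one possible variant, not the only way to bootstrap.

```matlab
% Hypothetical data, NOT from the original question: 50 rows, 3 unknowns
rng(0);                              % reproducible random numbers
m = 50; n = 3;
C = rand(m, n);
xTrue = [1; 0; 2];                   % assumed "true" nonnegative coefficients
d = C*xTrue + 0.1*randn(m, 1);       % noisy right-hand side

xHat = lsqnonneg(C, d);              % point estimate on the full data

nBoot = 1000;                        % number of bootstrap resamples (assumption)
xBoot = zeros(nBoot, n);
for k = 1:nBoot
    idx = randi(m, m, 1);            % resample rows with replacement
    xBoot(k, :) = lsqnonneg(C(idx, :), d(idx)).';
end

% 95% percentile intervals, computed by sorting (no toolbox functions needed)
xSorted = sort(xBoot, 1);
lo = xSorted(max(1, floor(0.025*nBoot)), :);
hi = xSorted(ceil(0.975*nBoot), :);
ci = [lo; hi];                       % row 1: lower bounds, row 2: upper bounds
```

With only 5 rows, many resamples would repeat the same rows and give a rank deficient system, which is part of why the intervals would be so poorly determined.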
However, you have far too few data points (only 5, with 3 unknowns). That suggests that any confidence intervals will be extremely wide, and VERY poorly estimable. For me to say this is a silly idea (i.e., a waste of time) might be taken wrongly. But IMHO, it is just that.
Walter Roberson on 4 May 2017
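% symbolic 5 x 3 coefficient matrix, 5 x 1 right-hand side, 3 x 1 unknowns
c = sym('c', [5 3], 'real');
d = sym('d', [5 1], 'real');
x = sym('x', [3 1], 'real');
% sum of squared residuals to be minimized
residue = sum((c*x - d).^2);
% eliminate x(1): set the partial derivative to zero, solve, substitute back
solx1 = simplify( solve( diff(residue, x(1)), x(1)) );
eq2 = simplify(subs(residue, x(1), solx1));
% eliminate x(2) the same way
solx2 = simplify( solve( diff(eq2, x(2)), x(2)) );
eq3 = simplify(subs(eq2, x(2), solx2));
% solve for x(3), which now depends only on c and d
solx3 = simplify( solve( diff(eq3, x(3)), x(3)) );
% back-substitute to express every unknown in terms of c and d
X3 = solx3;
X2 = (subs(solx2, x(3), solx3));
X1 = ( subs( subs(solx1, x(2), X2), x(3), X3) );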
Now X1, X2, and X3 are the general symbolic solution that minimizes the sum of squared residuals for a 5 x 3 system, but with no range limitations (that is, without the nonnegativity constraint). You can convert your actual values into rationals and substitute them in to get exact solutions. You can also add a symbolic fudge factor to each of your actual values, and study the derivative of your X1, X2, X3 with respect to each of the fudge factors. For example,
C = sym(randi([-10 10], [5, 3])); %actual values
D = sym(randi([0 20], [5, 1])); %actual values
dC = sym('deltaC_', [5, 3]); %per-variable fudge factor
dD = sym('deltaD_', [5, 1]); %per-variable fudge factor
fudged_X3 = subs(subs(X3, c, C+dC), d, D+dD);
and now you can study fudged_X3 with respect to deltaC_4_2 for example.
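One way to "study" the dependence on a single fudge factor is to differentiate and then evaluate at zero. The sketch below is self-contained, so instead of the elimination-based X3 above it builds the unconstrained least-squares solution from the normal equations (a swapped-in but equivalent route when C has full column rank). The numbers in C and D are hypothetical, and only the (4,2) entry is fudged to keep the expressions short.

```matlab
% Requires the Symbolic Math Toolbox. C and D are made-up example values.
C = sym([2 -1 0; 1 3 1; 0 1 4; -2 0 1; 1 1 1]);
D = sym([1; 2; 3; 4; 5]);

t = sym('deltaC_4_2', 'real');       % fudge factor on entry (4,2) only
A = C;
A(4,2) = C(4,2) + t;

% Unconstrained least-squares solution via the normal equations
fudged_X = (A.'*A) \ (A.'*D);

% Sensitivity of the third unknown to the fudge factor, evaluated at zero
sens  = diff(fudged_X(3), t);
sens0 = double(subs(sens, t, 0));
```

Here sens0 approximates how much the third coefficient moves per unit change in that one matrix entry.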
In particular you would be interested in delta values that are within +/- eps(value) of the actual value, i.e. roughly the spacing between adjacent floating point numbers at that value.
The expressions will be quite long. You might want to do statistical studies, substituting in a number of random values in the appropriate eps range and seeing how the X1, X2, X3 are affected.
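A purely numeric version of such a statistical study might look like the sketch below. It is an assumption-laden stand-in: it uses backslash in double precision rather than substituting into the symbolic X1, X2, X3, the data are random example values, and the number of trials is arbitrary.

```matlab
% Hypothetical example values, in the same ranges as the example above
rng(1);
C = randi([-10 10], 5, 3);
D = randi([0 20], 5, 1);

x0 = C \ D;                          % unperturbed unconstrained solution

nTrials = 2000;                      % number of random perturbations (assumption)
X = zeros(nTrials, 3);
for k = 1:nTrials
    dC = (2*rand(5,3) - 1) .* eps(C);   % each entry within +/- eps of its value
    dD = (2*rand(5,1) - 1) .* eps(D);
    X(k, :) = ((C + dC) \ (D + dD)).';
end

% Range of each coefficient over all perturbations
spread = max(X, [], 1) - min(X, [], 1);
```

Comparing spread against abs(x0).' gives a feel for how sensitive each coefficient is to rounding-level changes in the inputs.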
Note: the above general solution could involve negative values. At the moment I am not sure of the best way to minimize the residue for that case.
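For a numeric (not symbolic) comparison, lsqnonneg itself minimizes the same sum of squares subject to x >= 0. The small sketch below uses made-up values chosen so that the unconstrained solution has a negative component; it only illustrates the difference between the two fits and does not address the symbolic question.

```matlab
% Hypothetical data for which the unconstrained fit goes negative
C = [1 2; 3 4; 5 6];
d = [1; 0; -1];

xls = C \ d;                 % unconstrained least squares
xnn = lsqnonneg(C, d);       % same objective, with x >= 0 enforced

resLs = norm(C*xls - d);     % residual norm without the constraint
resNn = norm(C*xnn - d);     % residual norm with the constraint (never smaller)
```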