version 1.6 (166 KB) by
John D'Errico

Numerical derivative of an analytically supplied function, also gradient, Jacobian & Hessian

The DERIVESTsuite provides a fully adaptive numerical differentiation tool for both scalar and vector valued functions. Tools for derivatives (up to 4th order) of a scalar function are provided, as well as the gradient vector, directional derivative, Jacobian matrix, and Hessian matrix. Error estimates are provided for all tools.

DERIVEST provides a robust adaptive numerical differentiation (up to the fourth derivative) of a user supplied function, much as quad does for integration. It is semi-intelligent, trying to use that step size which minimizes its estimate of the uncertainty in the derivative.

High order methods are used, although full control is provided to the user when you want it. You can direct the order of the method to be used, the general class of difference method employed (forward, backward, or central differences), the number of terms employed in its generalized Richardson acceleration scheme, step sizes, etc.

Although you can not provide a user supplied tolerance, DERIVEST does return an estimate of its uncertainty in the final result.

For example, the derivative of exp(x), at x=1 is exp(1)==2.71828182845905. DERIVEST does quite well.

[d,err]=derivest(@(x) exp(x),1)

d =

2.71828182845904

err =

1.02015503167879e-14

See the provided demos for many more examples.

John D'Errico (2021). Adaptive Robust Numerical Differentiation (https://www.mathworks.com/matlabcentral/fileexchange/13490-adaptive-robust-numerical-differentiation), MATLAB Central File Exchange. Retrieved .

Created with
R14SP1

Compatible with any release

**Inspired by:**
Numerical derivative of analytic function

**Inspired:**
Adaptive numerical limit (and residue) estimation, Numerical Differentiation, Accelerated Failure Time (AFT) models, Fit distributions to censored data, Phase Portrait Plotter, modified_newton, hessianAnalysisDemo, Object tracking with an Iterative Extended Kalman Filter (IEKF)

Find the treasures in MATLAB Central and discover how the community can help you!

Start Hunting!Create scripts with code, output, and formatted text in a single executable document.

Ian HunterA good routine, but beware:

Hi all,

While I found this to be a good method for many matrices, I found it failed by several orders of magnitude when the Jacobian has near 0 eigenvalues. I compared this solver to central differencing approximations of the Jacobian and analytic expressions for the eigenvalues in this limit, and found central differencing worked better than this function.

All that said, I found it broadly very useful and more accurate than central differencing in most cases.

Cheers,

Ian

Antonios KoskinasHello again John,

After spending some time in your code and comparing it to the correctly working derivest.m I think that the problem comes when transposing some matrices. More specifically, instead of using the ' operator (which is the hermitian operator) the .' operator should be used (that is the transpose operator).

I would propose the following 3 changes to solve the aforementioned issue:

In the jacobianest.m file:

1. line 189: change from: rombcoefs = rromb\(qromb'*rhs); to rombcoefs = rromb\(qromb.'*rhs);

2. line 190: change from: der_romb = rombcoefs(1,:)'; to der_romb = rombcoefs(1,:).';

3. line 196: change from: errest = s'*12.7062047361747*sqrt(cov1(1)); to errest = s.'*12.7062047361747*sqrt(cov1(1));

Please feel free to check for yourself the above changes. Thanks again for time.

Antonios KoskinasHello John, I would like to also thank you for this really cool and valuable tool.

As I was using it, I noticed an issue, that I guess is quite similar to what liuxbin xbin spotted some time ago. It occurs when trying to calculate the Jacobian of a vector function with complex input, where the result obtained is the conjucate of the expected result. A small code to reproduce is as follows:

myfun = @(x) [exp(x(1)),exp(x(2))]

value = [1+1i,1+1i]

res1 = myfun(value)

[res2, err] = jacobianest(@(x) myfun(x),value)

It would have been really cool if you could have a look at that. Thank you in advance

Matt JJohn, as a wish list item, I wonder if you'd consider extending the code to handle functions that return gpuArray output. For example, this will fail currently,

>> c=gpuArray(2);

>> fun=@(x) c*x;

>> jacobianest(fun,1)

You could, of course, insist that the user define the function so that the result always gets pulled back to the host:

>> fun=@(x) gather(c*x);

but this will introduce significant overhead in higher dimensional examples.

Dae Seok HanGreat job! Extremely valuable for my research! Thanks for your sharing!

John D'Errico@Tim - to try using fminsearch of any sort, with 15 unknowns carries unrealistic expectations. 15 unknowns is just too much for fminsearch to handle well. fminsearchcon is no better, since it is merely an overlay on top of fminsearch. So it is the same basic algorithm.

Worse, it looks like your objective function seems to have a singularity there. That suggests your model is over-parameterized. At least one of your unknowns seems to be unnecessary for the fit at least locally so. (Generally a singularity tells you that some linear combination of the parameters allows changing one set of parameters for another with no cost to the objective function.) That may mean your model is overly complex, or perhaps your data is inadequate to fit a model of that complexity.

That you got an excellent fit is not terribly relevant. Another set of parameters would have given you just as good results. In turn that means trying to compute any kind of confidence intervals is the wrong thing. Those confidence intervals tell you nothing, except what I've explained above.

TimIt is a great package! Thanks a lot!

I am using the jacobianest function in order to evaluate the Jacobian and the confidence interval for my fitted parameters based on John D'Errico`s comment here: https://groups.google.com/forum/#!topic/comp.soft-sys.matlab/JJOqLCNJjF0

My problem that one a column of the Jacobian matrix has very small values (1e-29) and when I calculate Sigma which is proportional to inv(J'*J) (see the link above), I have a message: Warning: Matrix is close to singular or badly

scaled. Results may be inaccurate. RCOND =

5.516957e-117.

However, I have an excellent fit, the fitted parameters have a lot of sense but their error is so huge, it is totally unrealistic. I know that the approach above assumes that I have a Gaussian error system, but honestly, I do not know how I can be sure about it. I have a nonlinear equation with 15 parameters and I am using fminsearchcon to find the best fit with the least square method.

Is there a way to get a confidence interval using existing Matlab functions when the Gaussian assumption is not valid?

Every suggestion is welcome.

TithaniumThanks a lot for this nice tool and documentation. Saves a lot of time, this was exacly what I was looking for... and even better !

BERGHOUT Tarekgood job sir

Duc Hong HoangI use the Hessian matrix to compute confidence interval estimates on parameters in a maximum likelihood estimation. However the diagonal of inverse matrix of hessian is negative. How should I deal with this problem?

Chang LiuThis code is great. But caution should be taken as it seems to give wrong results some times. Other reviewers have mentioned this. The case I encounter was when computing the gradient of the 2-D langermann function. The gradients in the range [0,10]*[0,10] seem to be incorrect compared to the analytic gradient.

Oleksandr FreiC/C++ port of DERIVESTsuite:

https://github.com/ofrei/cppDERIVEST

Oleksandr FreiThe software is truly excellent!

Of note, there is one case where DERIVESTsuite seems to give wrong answer: 2nd and 4th derivative calculated in 'backward' mode are reported with an opposite sign.

>> derivest(@exp, 1, 'DerivativeOrder', 2, 'Style', 'backward')

ans =

-2.7183

>> derivest(@exp, 1, 'DerivativeOrder', 4, 'Style', 'backward')

ans =

-2.7183

Carlo CabreraMatt J@Yaguang Yang

It's definitely better than fminunc, but I would caution you on assuming "the code has never failures". There are still wrinkles in there, like this one reported by Paul (Nov. 2015)

>> jacobianest(@(x) x+10, 1e-12)

ans =

0

Yaguang YangVery reliable tool. I build my modified Newton code based on John's code and compared modified Newton to fminunc in matlab optimization toolbox. For millions trials of randomly generated problems (an engineer problem will different size and parameters), the code has never failures; but fminunc has numerous failures. The problem for modified Newton including John's DERIVESTsuite is the speed is much slower than fminunc. My question to John is: can you build mex-file for your matlab code so that people can use mex-file instead of m-file for large problems?

Matt J@Abhinav,

Even if you are computing the derivative at a positive x, there is no reason a priori that the finite differencing used by jacobianest should avoid evaluating the function at negative x. Finite differencing always involves sampling some distance to the right and left of the desired point. However, if you are working with scalar x (it sounds like you are from your example), you could use derivest instead. derivest offers input options to limit the finite differencing stepsize, overriding the adaptive selection.

Abhinav@matt

Thanks a lot for your reply. I don't intend to compute derivative at x=0. The problem is that the 'jacobianest' still tries to evaluate my function at negative values. For example, when I try to compute the derivative at x=2.5, jacobianest is evaluating my function at 252.50 and -247.50.

Matt J@Abhinav

For a function to be differentiable at a point, it must be defined on an open region around that point. It is a fundamental part of the definition of a derivative. So f(x) defined only for x>=0 cannot be differentiable at x=0.

AbhinavHi John,

Thanks for a great code!

I want to find out the derivatives of a function which has positive support, i.e., the function is defined only for the positive parameter. I have not been able to figure out a way to put constraints on the variable. Can you please suggest a way to do it?

Best

AG.

Thomas LigonIs there any issue with running DERIVEST within a parfor loop? I have been using it successfully for some time and today I embedded it in a parfor loop, and I got an error that I haven't been able to reproduce singly.

Mahdi S. HosseiniA generalized framework called MaxPol has been recently published and made available here

https://www.mathworks.com/matlabcentral/fileexchange/63294-maxpol-smoothing-and-differentiation-package

MaxPol provides a framework to design variety of numerical differentiation kernels with properties like:

(1) Cutoff (lowpass) design with no side-lob artifacts (for noise-robust case)

(2) Arbitrary order of differentiation

(3) Arbitrary polynomial accuracy

(4) Derivative matrix design

(5) 2D Derivative Kernels with Steering moments

(6) Intuitive examples in Signal and Image processing

Mark BrandonUsing this code for an MLE calculation, and I found it to be fast, seamless, and well conceived. No doubt about it... John D'Errico produces codes that are simple, innovative, and bulletproof. I thank him for his contributions.

Mark

Miguel Martinez de Bujo Alonso de LinajedengpingjunIt's a good toolbox but some serious issues may be in it when I use the function jacobianest/hessianest. For example, when I define the very simple function fun1 in the script file fun1.m:

function res=fun1(x)

if x<=20

error('x must larger than 20');

end

res=sin(x);

end

then fun1 can be calculated at the point 21, the value is 0.8367. However, when using the function jacobianest as the following code

jacobianest(@sin, 21);

I find the following error notice:

error using fun1 (line 3)

20!

error JacobianEst (line 112)

fun(swapelement(x0,i,x0_i - delta(j)));

I don't know why. Hoping for your reply.

Best regards

Pingjun Deng

Filipe FigueiredoJanM. F.leo nidasclose all

clear all

clc

n=100;

m1=1;m2=2;

s1=1;s2=1;

x1=normrnd(m1,s1,n,1);

x2=normrnd(m2,s2,n,1);

h1=0.9*min(std(x1),iqr(x1)/1.34)*length(x1)^(-0.2);

h2=0.9*min(std(x2),iqr(x2)/1.34)*length(x2)^(-0.2);

F1inv =@(at) ksdensity(x1,at,'function','icdf','width',h1);%'support', 'positive');

F2 =@(at) ksdensity(x2,at,'function','cdf','width',h2);%'support', 'positive');

rochat=@(t) 1-F2(F1inv(1-t));

t=0:0.01:1;

plot(t,rochat(t));

title('I want the derivative of this curve at any given poit 0<t<1')

%=============================

%I want to find the derivative of rochat at say "at=0.8, 0.6, 0.4, 0.2, 0.1"

myderiv1=derivest(@(x) rochat(x), 0.8)

myderiv1=derivest(@(x) rochat(x), 0.6)

myderiv1=derivest(@(x) rochat(x), 0.4)

myderiv1=derivest(@(x) rochat(x), 0.2)

myderiv1=derivest(@(x) rochat(x), 0.1)

%WHY IS IT THE CASE THAT FOR THE FIRST THREE IT IS 0?

RodrigoFantastic tool. Great for spotting typos in gradients and jacobians that are being fed to solvers and optimizers.

Rodrigo Duran@Peng, derivest computes the derivative of a function of one variable, i.e. another function of one variable. The hessian is a square matrix of the second derivatives of a function of an n-dimensional variable.

Peng WangHi,

I met some problems when using the codes. I wrote:

d=derivest(@(x) exp([1 3]*x-0.1)+exp([1 -3]*x-0.1)+exp([-1 0]*x-0.1),[0;0])

in MATLAB, but it returns an error:

Error using *

Inner matrix dimensions must agree.

Error in GeneralFunction>@(x)exp([1,3]*x-0.1)+exp([1,-3]*x-0.1)+exp([-1,0]*x-0.1)

Error in derivest (line 362)

f_plusdel = fun(x0i+h*delta);

But we I use the command hessian in MATLAB:

h=hessian(@(x) exp([1 3]*x-0.1)+exp([1 -3]*x-0.1)+exp([-1 0]*x-0.1),[0;0])

it returns the right answer.

Can you help me to see what is wrong with derivest?

neihoHi John,

Thanks for sharing this great package!:) I have a small question:

Can I choose the method (forward, etc.) and precision in jacobianest?

Only (?) jacobianest allows for vectorized code, but no options are provided.

Huck FebboThanks Matt J.

I am having other issues with this now though, is hessdiag.m, it goes into this for loop and

for ind = 1:nx

[HD(ind),err(ind),finaldelta(ind)] = derivest( ...

@(xi) fun(swapelement(x0,ind,xi)), ...

x0(ind),'deriv',2,'vectorized','no');

end

and it passes x0(ind), so it passed the first element of the x0 vector to derivest. Then in derivest:

% will we need fun(x0)?

if (rem(par.DerivativeOrder,2) == 0) || ~strncmpi(par.Style,'central',7)

if strcmpi(par.Vectorized,'yes')

f_x0 = fun(x0);

else

% not vectorized, so loop

f_x0 = zeros(size(x0));

for j = 1:numel(x0)

f_x0(j) = fun(x0(j));

end

end

else

f_x0 = [];

end

it fails:

K>> f_x0(j) = fun(x0(j))

In an assignment A(I) = B, the number of elements in B and I must be the same.

Because we passed x0(ind), which is a single number and not the entire vector. Any ideas? I don't want to muddle around in this doe and try to fix it as I assume that I have done something wrong in the first place. Thanks

Matt J@Huck,

This doesn't require any modification of DERIVEST. You can turn any function with extra parameters into a single-argument function using the techniques described here,

http://www.mathworks.com/help/optim/ug/passing-extra-parameters.html

Huck FebboCan you please modify this toolbox so that the constraint function can also accept a set of parameters? For instance if I am trying to calculate the Jacobain of my constraints and my constraint function looks like this:

c = constraints(x, auxdata);

% where auxdata is a cell array of parameters that my constraint function needs to run

Thanks!

Matt JIt definitely seems a lot more robust than the Optimization Toolbox finite differencers. However, I also see failure cases like that reported by Paul (23 Nov 2015). I'll up my score to 5 stars once those wrinkles are ironed out.

rihabPaulHi John,

thanks for sharing this code. I found some numerical issues when estimating the Jacobian if the starting point is approaching zero.

For example:

jacobianest(@(x) x+10, 1e-12)

is returning zero even though it should be one. A starting point of 1e-11 works fine. Zero works as well because it is handled as a special case in the code. The problem is combining very small deltas with the relatively large +10 in the objective function, because delta is scaled by the starting point if the starting point is ~= 0 (line 100 in jacobianest.m). A workaround is just not scaling the deltas by the starting point.

Rodrigo DuranJohn,

It is because I know that difference formulae are very ill-conditioned that I am testing to see when does a fancy finite-difference scheme break down. I am looping over different values of coeff to get a feeling for this and every time i hit the mentioned value of coeff I see that derivest does very well at every point in vector v except in a vicinity of v==4. In other words your assumption about my belief is wrong and the question remains: what is so special about v==4 when noise is uniformly and randomly distributed in the interval [-1e-10,1e-10]?

If you run my code several times with the derivest version available here you should see what I mean.

John D'ErricoNo, no, no. There is no bug, except in your belief that derivest is a tool designed to solve the problem you pose.

A function that contains random noise is by definition not differentiable everywhere. Derivest is not designed to differentiate a noisy function, nor should it be expected to do so. While derivest tries to be insensitive to tiny amounts of noise (essentially in the least significant bits) you invalidate the use of a tool that tries to differentiate a function that actually HAS a derivative.

While the underlying function in your problem HAS a derivative, once you add noise to it, the result is NOT differentiable.

Note that differentiation is an ill-posed problem, in the sense that it tends to amplify any noise in the function. I'm not sure that it matters, but then you tried to use a higher order method to generate the derivative. This will induce more noise amplification yet.

You might wish to use a tool that can essentially smooth away the noise, differentiating the result. There are many such tools, depending on your application and what you need out the end. This might be anything from a Savitsky-Golay tool to a least squares spline or local polynomial model, or other tools.

Rodrigo DuranJohn,

I was testing finite differences with noise as follows:

coeff=1e-10;

v=0:0.1:2*pi;

[d,~]=derivest(@(x) sin(x)+coeff*(-1+2*rand(size(x))),v,'deriv',2,'MethodOrder',4);

plot(v,d,'-o')

Every time I run this code (i.e. with different noise generated with rand in the interval [-coeff coeff]), the error is much larger near v==4. This seems unusual and suggests a bug to me. Didn't look into derivest but thought I would report it.

Thanks for your submissions!!

Rodrigo.

MatthiasHello John,

great work. Very helpful.

I think the hessian calculation can be speeded up significantly, if the evaluation of the function under test is slow (as it is in my case):

In hessian.m you call hessiandiag.m. It needs to calculate the function values at certain sample points.

gradest.m, which is also called by hessian.m, calculates the function values at the same sample points as hessiandiag.m. So it's unnecessary to do it twice.

@Simon Tardivel: I can confirm your obseravtion as well as your workaround. Thank you.

Simon TardivelHello John,

Thank you very much for this code. It's well written, simple to use, fast enough for its purpose. It is strange that these functions are still missing in the official built-in functions.

However, I may have found a bug/unexpected behavior. I can't unfortunately share the code (plus it requires thousands of lines of code...) but here's the problem and a working fix:

Situation: a hessdiag I was computing was supposed to yield [a b c] with a,b,c all different from 0. I know they are all different from 0, the laws of physics require it. However b is rather small, usually a couple orders of magnitude smaller than a. a and c are comparable, of opposite sign.

Problem: hessdiag outputted [a,0,0] and the estimated error was [10^-7*a, 0, 0].

Fix: I went to derivest and changed the default RombergTerms from 2 to 3. b and c reappeared. b was estimated to about 10^-3*a and c was approximately -a -- and it is their correct values.

I have no idea what the problem fundamentally was. I've just noticed that hessian has default Romberg terms = 3 but doesn't pass this argument to the computation of the gradient that assumes Romberg terms = 2. It seems that, in fact, Romberg terms = 3 is used only for the upper triangle (3+1 really). Maybe that caused the problem? I do not know anything about numerical differentiation methods so that may be completely off point...

Best,

Simon.

JacobHello,

I was wondering if you can give me your opinion on the following issue.

jacobianest fails to work for my function and I think the reason may be that I am trying to evaluate the Jacobian at a point near the boundary of the function domain. Is this a reasonable guess? If so, how I can I fix this? My intuition is that the MaxStep is too large and the code tries to evaluate the function where it is not defined. But can I just pick a smaller value of MaxStep?

Thanks!

John D'ErricoYou do realize that the function you typed is not actually valid MATLAB syntax?

f=@(x)(1/2*x'(Ax) + x'b)

For example, x'b may be ok when you write it on paper, but MATLAB requires a * operator between them to understand that a multiply (dot product) is intended there.

Next, you write this:

f=@(x)(1/2*x'(Ax) + x'b)

derivest(f0,x0),

So, you define the function f, but then try to call derivest with f0. How should it know that you intended something other than what you wrote?

My guess is you are not being careful in what you do. USE VALID SYNTAX.

ak135AKI have a quadratic function:

1/2x'Ax + x'b

but it seems to be not working, does it work in general? or am I making mistake in calling it?

I have tried:

f=@(x)(1/2*x'(Ax) + x'b)

derivest(f0,x0), but always get following error:

Not enough input arguments.

Error in derivest (line 362)

f_plusdel = fun(x0i+h*delta);

Is there something to do in order to get it working?

Thanks !

John D'ErricoYou might THINK of the gradient of a function of multiple variables as the derivative, but it is not. A derivative is defined on a function of ONE variable, at least, unless they have changed calculus since I took it.

derivest computes the derivative, dy/dx, of a function y(x), a function of one variable.

The fact is, I do provide a gradient calculator in the package, so you might have tried that! Look at the help. Look at the demos.

Finally, I'm not going to look that carefully at your code to see if it makes sense. But a quick glance shows lines like this:

p = max(p,0.0000001);

This will make your function non-differentiable, and should have no purpose at all since you are computing a log-likelihood function in the first place! It does seem irrelevant of course, since you fail to use p after that in your code. Why compute something if you intend to dump it in the bit bucket?

These next lines do however get used, and they will indeed make the function non-differentiable.

y = reshape(y,max(alt),max(ncs));

x = reshape(exp(v(b)),max(alt),max(ncs));

So I think you need to consider what it is you are doing, and maybe if there is a reason why an optimizer fails, if it presumes the function it is trying to optimize is differentiable, when your function is not so.

SebastianHi there,

I am currently working on getting MATLAB to optimize LogLikelihood functions for some econometric models. The problem is that the internal optimiser seems to have issues with maximising these functions. Thus, I want to see if things get better if I provide a good numerical estimate of the gradient. Thus, I found this toolbox. However, I have some problems to make it run. Here is a very easy example of the function I am trying to optimize:

function logli = logl(b,v,ncs,alt,y)

y = reshape(y,max(alt),max(ncs));

x = reshape(exp(v(b)),max(alt),max(ncs));

p = x(y == 1)./sum(x)';

p = max(p,0.0000001);

logli = -sum(log(x(y == 1)./sum(x)'));

end

where

v is an anonymous function like

v = @(b) b(1)*x1 + b(2)*x2 + b(3)*x3

b is the parameter vector with starting values

b = [0,0,0]

ncs and alt are identifyer variables

y is the independent variable

Now if I want to obtain the first derivative I use:

[der,errest] = derivest(@(b)logl(b,v,ncs,alt,y),b)

but get the error message:

Error using derivest (line 409)

fun did not return the correct size result (fun must be vectorized)

If I specify:

[der,errest] = derivest(@(b)logl(b,v,ncs,alt,y),b,'Vectorized','no')

It seems to pass only one dimension of b as I get

Attempted to access b(2); index out of bounds because numel(b)=1.

The data structure of these problems is:

ncs alt y x1 x2 x3

1 1 1 x11 x21 x31

1 2 0 x12 x22 x32

1 3 0 x13 x23 x33

2 1 0 x11 x21 x31

2 2 1 x12 x22 x32

2 3 0 x13 x23 x33

...

I would be very happy about any recommendation on how to use the derivest function in that case.

Cheers

Sebastian

John D'ErricoThe simple answer is to use the hessian function. For example, I'll compute a second derivative

d^2f/dxdy

of the function:

y = sin(x + y^2)

at the point (x,y) = (2,3). I'll do it using these tools, then do it again using the symbolic toolbox to verify the answer.

fun = @(xy) sin(xy(1) + xy(2).^2);

format long g

hessian(fun,[2 3])

ans =

0.999990206550529 5.99994123927655

5.99994123927655 36.0084988318012

So the second partial you want is the (1,2) element, thus the off diagonal element of the hessian matrix.

Now verify that value using symbolic means.

syms x y

fs = sin(x + y.^2);

dfs_dxdy = diff(diff(fs,x),y)

dfs_dxdy =

-2*y*sin(y^2 + x)

subs(dfs_dxdy,{x,y},[2 3])

ans =

-6*sin(11)

vpa(ans)

ans =

5.9999412393042207423093893941531

So that worked well enough, although it did compute additional terms that we did not really need.

I could do it using derivest too, although it looks like I had to tell derivest that this function is not properly vectorized.

fun = @(x,y) sin(x + y.^2);

So as a function of y, at x == 2, we would have

dfdx = @(y) derivest(@(x)fun(x,y), 2, 'vectorized', 'no');

derivest(dfdx,3,'vectorized','no')

ans =

5.99994123930711

Muhammad AwaisGreat file, i have a question. I have an objective function and i taking its derivative wrt to input in matlab. I want to take the derivative two times, one time with respect to input and other time with respect to parameter. But if once i have use GRADEST to get the gradient wrt input now i have answer in an number, that is a gradient. How can second time i can take the gradient when i dont have the first derivative in some kind of a formulation.

GuilhermeGreat File, many thanks for it.

However, I am currently facing some issues with my function (bad scaled parameters), and therefore with the second derivatives matrix (Hessian).

Here is the deal: I am minimizing a cost function (xˆ2 = (y - ymodel)^2), where my model is 3 nonlinear parameter one, using a Nelder Mead algo.

Using the |hessian| function from your package using best-fit parameters, yielded me some strange results (since Parameter_1 ranges from [0.1:0.9], Parameter_2 from [10:30] and Parameter_3 from [150:250]), that is, I need to re-escale them. My question is, what is the best way to do that and does your code endogenously scale the results by any chance ?

Thank you very much in advance,

Niklaskartik srinivasanAndres TovarAdam AutonTimLoads better than fmincon's output

DannyCamilleGreat submission, saves me a lot of work, thanks!

MatthewJohn D'Erricoretrobytes - not all functions will be able to use complex arguments, so derivest avoids that approach. Hyper-dual numbers would appear to have the same flaw, at least without a great deal of effort invested.

retrobytesA very useful tool, John, thanks!

Did any of you guys think about implementing the hyper-dual numbers approach for this? I recently came across this:

http://www.stanford.edu/~jfike/SNUpresentation_rev4.pdf

and:

http://www.stanford.edu/~jfike/hyperdual.html

and I was wondering if any of you tried it instead of using complex numbers?

Anders Munk-NielsenAbsolutely awesome work! So easy to use, brilliantly done. This is probably the best thing to happen to mankind since sliced bread. And to put it up for free? John, you sir are a good man!

Roberto CalandraJust great

daisyMd. Ikramul HasanehsWoodeastThanks a lot for this useful tool!

I have a question about how far away will the program evaluate the fun from the initial point. I have a function which only exits in a certain range around the initial point(but big enough in my sense, e.g. starting at .1, requiring >0). However, the program will break down for evaluating out of range points after a fare amount of calculation. Could you please give me some idea about why this happens?

(BTW, my function has some randomness, does this matter?)

Thanks so much!

Andrew NewellExcellent! Some of its predecessors also do accurate derivatives, but this one is particularly easy to use for gradients and Hessians.

MarkH John,

This looks great. I'm trying to use the Hessian function to find the Fisher Information Matrix for Max-likelihood. However the Hessian program is choosing parameters which are not allowable in my chosen model. How can I restrict the Hessian algorythmn for permissible parameter values

John D'ErricoYour problem is in the way you called the code.

H=hessian(@(p) @reg2002, [0.03 3 -5])

Here is part of the error message:

"??? The following error occurred converting from function_handle to double:

Error using ==> double

Conversion to double from function_handle is not possible."

Is reg2002 a function m-file, into which you are trying to pass the variable p? If so, then you could have done this

H=hessian('reg2002', [0.03 3 -5])

or, you could have done this

H=hessian(@reg2002, [0.03 3 -5])

In either case, it assumes that reg2002 is an m-file, and passes in a variable with two elements in it asa vector. It would return a 2x2 Hessian matrix. The use of @reg2002 turns that into a function handle.

See the error message you had though. Matlab sees this:

@(p) @reg2002

and tries to return a function handle as the result of a function. Then my code looks at it, and tries to use the function handle as a number. Of course it fails.

Finally, you might also have done this

H=hessian(@(p) reg2002(p), [0.03 3 -5])

Here, you would be creating an anonymous function, a function of the variable p. It is yet another function handle that the code can use properly.

emma liuhi John, many thanks for the wonderful suite!

I met a problem when I use the function from the m-file. my code is:

H=hessian(@(p) @reg2002, [0.03 3 -5])

??? The following error occurred converting from function_handle to double:

Error using ==> double

Conversion to double from function_handle is not possible.

Error in ==> derivest at 330

f_x0(j) = fun(x0(j));

Error in ==> hessdiag at 59

[HD(ind),err(ind),finaldelta(ind)] = derivest( ...

Error in ==> hessian at 74

[hess,err] = hessdiag(fun,x0);

I don't what is wrong here. is there mistakes I made? many thanks in advance.

Emma

John D'ErricoFelipe - Note that the submissions #15235 and #26807 actually appeared AFTER my tool (#13490) did. Submissions are assigned sequential numbers on the FEX. I see no reason to acknowledge a later submission on the FEX. In general, since forward acknowledgement would seem to require precognition of some sort, it is rather difficult to accomplish. In fact, most of the references in that Wikipedia article referenced appeared several years after this tool was written and posted on the FEX.

I will modify the title in light of your comment though, since the phrase might conceivably confuse some people.

Felipe G. Nievinski"Automatic" in the title might confound viewers that it implements "automatic differentiation", see <http://en.wikipedia.org/wiki/Automatic_differentiation> and submission # 26807.

You might want to use the acknowledgement field so that related submission #15235 back-links to yours.

Haven't tried it yet, reason why I'm not rating it.

KulanSturla KvamsdalHi John,

Thank you very much for the great suite!

The information on the Hessian-function explicitly states that it is not for frequent use on an expensive to evaluate objective function. I tried it anyway, because that is what I need. Most of the time, I get something reasonable (hard to check, but I'll worry about that later), but sometimes I get NaN's. Any idea what the problem might be?

Perhaps some more details are warranted: The function I want to differentiate is a likelihood-function. It is a function of time, and I need the Hessian at the maximum for each timestep. (The maximum is one point associated to the last time period.) The function takes 5 parameters.

Any help or ideas appreciated!

Thank you very much,

Sturla

John D'ErricoIf you are looking for all of the second partials, then yes, a loop on the Hessian computation seems the solution.

Aydin BulucHi John,

Thanks much for the quick response.

jacobianest was indeed what I needed, should have looked into the whole package first.

I was looking at the hessian function, which doesn't allow me to use a vector function (perhaps understandably, as the output would be a tensor otherwise).

Proceeding with the matrix-vector example

@(x) A*x;

Should I loop over the rows of and call hessian m times to get all the Hessians?

for i=1:m

f = @(x) A(i,:)*x; // Returns a scalar

[h(i) herr(i)] = hessian(f, x0);

end

Thanks,

- Aydin

John D'ErricoHi Aydin,

The trick is, derivest itself takes only a function of a single variable. Suppose I had a scalar function of three variables. Thus it takes a set of three variables, returning a single number. If you wish to compute a partial derivative of that function, then you get a single number out. But that is a partial derivative, NOT a derivative. "The" derivative of a scalar function of three variables is given by the gradient vector, composed of three partial derivatives. So I have provided the tool gradest. Use it for scalar functions of multiple variables.

Now, suppose I have a function of n variables, that produces m output results? (This is what you have by the way.) Now you must compute a Jacobian MATRIX, an m by n matrix of partial derivatives. Again, derivest is not the tool, nor is gradest. This is a different problem that requires a different tool yet, so I wrote jacobianest.

>> A = magic(3)

A =

8 1 6

3 5 7

4 9 2

>> [J,Jerr] = jacobianest(fun,[1;2;3])

J =

8 1 6

3 5 7

4 9 2

Jerr =

7.7667e-14 9.7084e-15 5.3057e-14

2.6528e-14 4.3706e-14 7.09e-14

3.8833e-14 9.4059e-14 1.9417e-14

In the case that you really only wanted to know the derivative of a specific element of the result, with respect to a specific single variable out of the several inputs, you need to define a function where the OTHER variables are held fixed. Thus...

>> x1 = 1;

>> x2 = 2;

>> fun2 = @(x3) [0 1 0]*fun([x1;x2;x3]);

>> [d23,derr] = derivest(fun2,3,'vectorized','no')

d23 =

7

derr =

4.785e-14

Thus, fun2 fixes the first two variables, then extracts out the second output result. derivest can now operate, as long as I tell it that this function is not vectorized. It is not set up to take a vector of inputs and produce one result for each. fun2 takes only a scalar input, and produces only a scalar output.

John

Aydin BulucHi John,

Thanks for the lovely tool.

I couldn't figure out how to differentiate multivariable functions with it though. For example,

f = @(x) A*x

where A is a matrix and x is a vector.

Your code seems to do the following:

% Loop over the elements of x0, reducing it to

% a scalar problem. Sorry, vectorization is not

% complete here, but this IS only a single loop.

der = zeros(nx0);

errest = der;

finaldelta = der;

for i = 1:n

...

end

while A*x doesn't really reduce to a single scalar problem as far as I can see. Any suggestions?

AlessandroSorry John, sorted out the problem, was my fault... thanks again for these useful codes.

Alessandro

AlessandroHi John,

I tried to run the Rosenbrock example included in the help of your routine hessian, i.e. I ran

rosen = @(x) (1-x(1)).^2 + 105*(x(2)-x(1).^2).^2;

[h,err] = hessian(rosen,[1 1])

I get the following error:

"??? Error using ==> subsindex

Function 'subsindex' is not defined for values of class 'function_handle'."

do you know how to solve this?

thank you.

Alessandro

John D'ErricoI believe that fmincon has improved since my statement in the past, so that the hessian as returned is now a better approximation. Of course, if you supply the hessian, no updates are needed.

JanujHi Errico,

In another post, you stated that matlab does not compute the Hessian very well. Is this why the Hessian may not get updated in fmincon? Secondly, would providing a user-suppled Hessian (through the DERIVEST routine, for e.g.) be able to improve the ability of fmincon to update the Hessian in all cases? Thanks!

Kimonas FountoulakisHI Errico

I tried to expand your code in order to derivate Bilinear Matrix inequalities. The program is still in a rough form, in particular I didn't implement the error part of the estimation, but I was able to produce derivatives of a BMI function correctly. If you are interested in sending you the programm please let me know. I will try though one of these days to implement the error term as well.

Kimonas

John D'ErricoHi Andrew,

The point is, derivest uses a sequence of nested approximations, using a geometric sequence of points for the steps. For example, suppose we chose to compute the derivative of tan(x), at x = pi*2/3.

[D,errest] = derivest(@tan,pi*2/3)

D =

4

errest =

7.0302e-13

Tan(x) has a singularity at x = pi/2, so if the step size is chosen to be large enough, the approximations will actually span that singularity. This will cause the corresponding estimate to be useless, but since there are multiple estimates produced, some of them will be good, and I can figure out which estimates in the sequence ARE good and which ones are bad.

I pick that estimate with the lowest predicted error, along with some additional criteria.

A nice thing about the nested sequence of approximations is it allows me to deal with functions that have problem scaling. I can almost always find a viable solution.

John

Andrew NewellIn derivest_demo.m, I found the following statement:

"Although a central rule may put some samples in the wrong places, it may still succeed"

How would it put samples in the "wrong places"?

John D'ErricoHi Prasanna,

In theory, I could have written derivest to provide higher derivatives. In practice, high order numerical differentiation is difficult to do. Differentiation is a noise amplifier. It takes any tiny noise in your system, and boosts it. By the time you get up to the 4th derivative or so, the noise down in the least significant bits can start to cause numerical problems. For this reason, I limited it there. Any further, and I was no longer able to trust what was coming out in some test cases. And since I had to put a limit somewhere, I chose 4 as the maximum order of derivative it would provide.

You can use the error estimate provided as a measure of how well I think I was able to do. As an example, try a simple trig function. Differentiate sin(x), at x = pi/4. The result will always be plus or minus sqrt(2)/2, depending on the derivative order.

>> fun = @sin;

>> [D,DE] = derivest(fun,pi/4,'derivativeorder',1)

D =

0.70711

DE =

7.0617e-15

For the first derivative, I can give virtually full precision. However, the second derivative is not quite so verifiably computable, and the error estimate reflects that fact.

>> [D,DE] = derivest(fun,pi/4,'derivativeorder',2)

D =

-0.70711

DE =

1.666e-12

>> [D,DE] = derivest(fun,pi/4,'derivativeorder',3)

D =

-0.70711

DE =

6.6146e-11

>> [D,DE] = derivest(fun,pi/4,'derivativeorder',4)

D =

0.70711

DE =

1.9122e-05

See that by the time I've gotten to the 4th derivative, derivest tells me only to trust the first 4 or 5 digits in the result. Had I allowed higher orders here, the result would rapidly become a virtual random number generator.

Now, with care, one could modify the code to work for 5th derivatives. Just look for par.DerivativeOrder, and see what was done, extending the code for 5th order derivatives. A quick check through the code finds only one or two spots where serious understanding of the methods employed was necessary. (To be honest, leave such a modification of the code to me.)

Can you nest derivest calls? Yes, in theory you can do so to produce higher order derivatives, and the error estimate will tell you if it thinks it is generating garbage for a result. This test computes the 5th derivative of sin(x), at pi/4.

>> format long g

>> d2 = @(x) derivest(fun,pi/4 + x,'derivativeorder',2);

>> [d5,d5err] = derivest(d2,0,'derivativeorder',3)

d5 =

0.707106775613828

d5err =

4.17633428512851e-08

>> sqrt(2)/2

ans =

0.707106781186548

Note that derivest thinks it was able to do so reasonably well, and it was just about right on where it tells you to no longer trust the digits in the result. In fact, in this case, the nested result is probably more accurate than a direct computation of the 5th derivative. So the 3rd derivative of the second derivative is more accurate (but a bit slower!) than a direct computation of the 5th derivative.

HTH,

John

Prasanna VenkateshHello John,

This piece of code is great. I had one question, I have a function for which I have no analytical expression, hence cannot vectorize it. But I can do pretty fast numerical computation of it and I know that it has smooth first, second, third and fourth derivatives by applying your DerivTest function on it. I need one more derivative to complete my calculation and I was wondering if I should make an anonymous function out of the fourth derivative calculation using DerivTest and take its first derivative to get the fifth. I wanted to know if there is some specific reason for which this code goes only upto 4th derivative?

J WestThanks very much, that was exceptionally helpful. These are a nice set of functions and very much appreciated.

Unsurprisingly, calling a loop to gradest for several points inside fminsearch is not the fastest thing in the world, but such are the trade-offs, and it does seem to do the job!

John D'ErricoI have provided the related function gradest to compute a gradient, zipped with derivest. For example, I'll pick arbitrary values for the parameters k1,k2,k3,k4,k5, and then define a function handle.

k1 = 1/2;

k2 = 1/3;

k3 = 1/4;

k4 = 1/5;

k5 = 1/6;

fun = @(xyz) k1*xyz(1)*(1- ...

exp(-k2*(xyz(2)^k3)*(xyz(3)^k4)*(xyz(1)^k5)))

Now, call gradest...

gradest(fun,[1 2 3])

ans =

0.21997 0.018836 0.010046

For multiple points, wrap a loop around the call to gradest.

John

J WestI may be missing something obvious here, but is there a clever way to use derivest to calculate a set of partial derivatives?

I have a function something like

f = k1 * x * (1 - exp(-k2*(y^k3)*(z^k4)*(x^k5))

and would like to be able to calculate partial derivatives df/dx, df/dy, and df/dz for sets of specific values of x, y, and z.

I am doing this in order to calculate an error weighting for least squares fitting -- I have uncertainties in x, y, and z which I would like to map onto f in order to determine a weighting factor for the minimization (so I want the gradient at each point). I would prefer to determine the derivative numerically because that will allow me the most flexibility in being able to try different forms of the function, etc.

Many thanks.

John D'ErricoLarge relative reported errors might mean that the derivative computed is small compared to the uncertainty in it. It may also mean that there is some problem in the computation, leaving the derivative uncertain. It is difficult to know what has happened from only this description.

One option is to compute the derivative in question using derivest in stead of the hessian tool. While this will be less efficient, there are many more options for overcoming problems that I have provided in derivest itself. That way you could be sure of what is the problem, and hopefully overcome it.

pekkaHi,

I have some problem that sometimes when I compute the hessian I get that err contains terms that are larger than terms in the reported hessian. What are the reasons for this and are there any general ways to rewrite something to avoid this problem?

Vthank you very much for your quick reply

i'll get release 14 or 7

John D'ErricoHi V,

You must be using an old MATLAB release. Here is what happens when I try your example in R2010a:

>> myfun1 = inline('exp(x)');

[d,err]=derivest(myfun1,1,'Vectorized','no')

d =

2.7183

err =

2.1641e-14

So then I looked more carefully at your comment. I see the warning message gives the answer away.

Warning: File: C:\MATLAB6p5\work\derivest.m Line: 463 Column: 1

Unmatched "end".

You are running version 6.5 (i.e., release 13.) Derivest was written in Release 14, or version 7. As such, it uses several things that will cause your MATLAB release to be quite unhappy.

Sorry,

John

Vhi john,

i tried this

myfun1 = inline('exp(x)');

[d,err]=derivest(myfun1,1,'Vectorized','no')

and i got the error below,

thanks for any help you can provide

Warning: File: C:\MATLAB6p5\work\derivest.m Line: 463 Column: 1

Unmatched "end".

(Type "warning off MATLAB:m_warning_end_without_block" to suppress this warning.)

Warning: File: C:\MATLAB6p5\work\derivest.m Line: 523 Column: 1

Unmatched "end".

(Type "warning off MATLAB:m_warning_end_without_block" to suppress this warning.)

Warning: File: C:\MATLAB6p5\work\derivest.m Line: 538 Column: 1

Unmatched "end".

(Type "warning off MATLAB:m_warning_end_without_block" to suppress this warning.)

Warning: File: C:\MATLAB6p5\work\derivest.m Line: 573 Column: 1

Unmatched "end".

(Type "warning off MATLAB:m_warning_end_without_block" to suppress this warning.)

Warning: File: C:\MATLAB6p5\work\derivest.m Line: 659 Column: 1

Unmatched "end".

(Type "warning off MATLAB:m_warning_end_without_block" to suppress this warning.)

Warning: File: C:\MATLAB6p5\work\derivest.m Line: 759 Column: 1

Unmatched "end".

(Type "warning off MATLAB:m_warning_end_without_block" to suppress this warning.)

??? Error using ==> factorial

N must be a positive integer.

Error in ==> C:\MATLAB6p5\work\derivest.m (fdamat)

On line 564 ==> c = 1./factorial(1:2:(2*nterms));

Error in ==> C:\MATLAB6p5\work\derivest.m

On line 275 ==> fdarule = [1 0]/fdamat(par.StepRatio,1,2);

Richard CrozierAnother really excellent piece of code, I hope TMW buy you a nice christmas present every year for extending their software so extensively for them.

David DoriaAdamHello,

Hessian does not work, neither in any example nor when I try to use it. It gives the message:

Function 'subsindex' is not defined for values of class 'function_handle'.

All other functions seem to be great though..

Best,

Adam

MichelleHi John,

Thanks for the timely and detailed reply! I had assumed that derivest handles all cases. Now I see jacobianest is a standalone function. Thanks again! And great code!

Michelle

John D'ErricoHi Michelle,

The problem is, when you define myfun as you have, it is not a scalar valued function. It returns a vector argument. Thus we can do this:

myfun=inline('[exp(x);exp(2*x)]');

myfun(2)

ans =

7.3891

54.598

See that there are two numbers returned for a scalar input.

(By the way, I'd strongly recommend the use of function handles and anonymous functions instead of inline functions. An inline function is comparatively very slow to evaluate.)

The derivest code computes the derivative of a scalar valued function. Change your function to return a scalar value, as do myfun1 and myfun2 below, and derivest works as designed.

myfun1 = inline('exp(x)');

[d,err]=derivest(myfun1,1,'Vectorized','no')

d =

2.7183

err =

9.3127e-15

myfun2 = inline('exp(2*x)');

[d,err]=derivest(myfun2,1,'Vectorized','no')

d =

14.778

err =

1.8401e-13

Now, how might one define the derivative of a vector valued function? This is defined mathematically as the Jacobian. In fact, in the derivest suite, I have provided the jacobianest function to compute what you wish. It works on a vector valued function, returning a jacobian matrix. Here, since the parameter of these functions is a scalar, the Jacobian matrix is just a vector of length 2.

[d,err] = jacobianest(myfun,1)

d =

2.7183

14.778

err =

3.6841e-14

1.4736e-13

John

MichelleThanks so much for making such a useful suite available!

I was experimenting with the code and typed in the following two commands:

>> myfun=inline('[exp(x);exp(2*x)]')

>> [d,err]=derivest(myfun,1,'Vectorized','no')

but got back error messages that read

??? In an assignment A(I) = B, the number of elements in B and

I must be the same.

Error in ==> <a href="error:/a/fmalx1/lcl/fma/res1/mxw/research/min/code/matlab/DERIVESTsuite/derivest.m,362,1">derivest at 362</a>

f_plusdel(j) = fun(x0i+h*delta(j));

I'm posting it here because I thought you might be able to spot the problem much more quickly than I could. Could you let me know?

Thanks in advance for your help!

John D'ErricoBenn,

My guess is that you have written your own factorial function, probably as a homework assignment, since the error messages show a recursive implementation of factorial. This causes a failure in derivest, since your version is not properly vectorized. By vectorized, I mean that when called with a list of numbers, the Mathworks version of factorial computes the factorial of each number.

You can learn which factorial version is called by typing this at the command line:

which factorial -all

MATLAB will show which version of the factorial function will be used. You should see the version that you wrote listed at the very top.

At first, I thought that possibly you simply are using an old version of MATLAB, but this is unlikely, since the Mathworks would never have written the factorial function to have a recursive definition as I see. (Recursive factorial definitions are a nice way to teach a student recursive coding styles, but they are a terrible coding style in practice because of the extraneous, wasted function call overhead just to multiply a list of consecutive numbers together.)

The solution is to remove the bad (homework assignment) version of factorial from your search path. It will be inefficient anyway compared to the supplied version of factorial that MATLAB already comes with, and your version is not properly vectorized. At the very least, move it to another place far down on your search path so that matlab will use the correct version of factorial instead.

John

Benn Eiferthelp! just downloaded the toolbox and tried to run the most basic example in the derivest documentation:

[d,e]=derivest(@(x) exp(x),1)

and got a bunch of errors of this form:

Error in ==> factorial at 8

x = n*factorial(n-1);

Error in ==> factorial at 8

x = n*factorial(n-1);

Error in ==> derivest>fdamat at 564

c = 1./factorial(1:2:(2*nterms));

Error in ==> derivest at 283

fdarule = [0 1 0]/fdamat(par.StepRatio,1,3);

Serter YILMAZWorks really fine! very helpful! Thanks...

Mark A. MikofskiThis worked better than fminsearch when used as a damped newton method for finding a root with extremely high sensitivity. fminsearch was overly slow, and often overshot the root, even though the tolerance was high, and fzero didn't work at all. Derivest was fast.

This is the second time I've used great code from John D'Erico.

Thanks!

John D'ErricoI've just uploaded a new release that repairs the problem of complex domain differentiation. For example, the current release now gets this correct:

exp(1+i)

ans =

1.4687 + 2.2874i

derivest(@(z) exp(z),1+i)

ans =

1.4687 + 2.2874i

liuxbin xbinWhen I use this function to get the derivative at the complex numbers,problem appears.For example:

derivest(@(z) exp(z),1+i)

ans =

1.46869393991590 - 2.28735528717885i

However,the derivative of exp(z) at z(complex numbers) should be exp(z)

exp(1+i)

ans =

1.46869393991589 + 2.28735528717884i

which is the conjucate of the above.

Golam Kibria ChowdhuryRock 'n' Roll

Golam Kibria ChowdhuryRock 'n' Roll

Scott MillerWorks just great. I was in a situation where using hessiancs was problematic because the functions being evaluated had a transiently complex nature. hessiancs works well with functions that never use any complex arithmetic, but watch out for using complex steps if the internals use any complex arithmetic, even if the inputs and outputs are both real. Downloaded the package, changed function names, and it worked right out of the box - slower, but 'bullet-proof' sometimes beats 'fast'. Good documentation, etc., too. I appreciate that hessiancs used a similar form to the one you used - I'll go back to it when I need more speed and have the time to fix my loss functions.

Scott

John D'ErricoCan anyone let me know what conflicted with lightspeed? I'll try to fix it if possible. It might be a function name that I chose. Otherwise, this is just basic Matlab code.

Matlab UserVery useful! I was able to use this to check my symbolically derived hessian matrix in an optimization routine... optimset('DerivativeCheck','on') checks the gradient numerically, but apparently not the hessian.

In case anyone else has a problem, I found that this was incompadible with the matlab Lightspeed toolbox, but that is no doubt the Lightspeed toolbox's fault. Removing lightspeed fixed the issue.

Johannes WunschArun kumarTHIS WEBSITE IS VERY HELPFULL :)

serge koulayevI get this error sometimes:

??? Improper assignment with rectangular empty matrix.

Error in ==> derivest at 457

[errest(i),ind] = min(errors);

- why that would be?

Zahid Ullah KhanThe best!