Code covered by the BSD License

Highlights from Adaptive Robust Numerical Differentiation

4.8 | 41 ratings | 120 Downloads (last 30 days) | File Size: 166 KB | File ID: #13490 | Version: 1.6

John D'Errico

27 Dec 2006

Numerical derivative of an analytically supplied function, also gradient, Jacobian & Hessian

File Information
Description

The DERIVEST suite provides a fully adaptive numerical differentiation tool for both scalar and vector-valued functions. Tools for derivatives (up to 4th order) of a scalar function are provided, as well as the gradient vector, directional derivative, Jacobian matrix, and Hessian matrix. Error estimates are provided for all tools.
DERIVEST provides robust, adaptive numerical differentiation (up to the fourth derivative) of a user-supplied function, much as quad does for integration. It is semi-intelligent, trying to use the step size that minimizes its estimate of the uncertainty in the derivative.

High-order methods are used, and full control is provided to the user when you want it. You can choose the order of the method, the general class of difference method employed (forward, backward, or central differences), the number of terms employed in its generalized Richardson acceleration scheme, step sizes, etc.

Although you cannot provide a user-supplied tolerance, DERIVEST does return an estimate of its uncertainty in the final result.

For example, the derivative of exp(x) at x=1 is exp(1) = 2.71828182845905. DERIVEST does quite well:

[d,err]=derivest(@(x) exp(x),1)
d =
2.71828182845904

err =
1.02015503167879e-14

See the provided demos for many more examples.

Acknowledgements

Numerical Derivative Of Analytic Function inspired this file.

Required Products MATLAB
MATLAB release MATLAB 7.0.1 (R14SP1)
MATLAB Search Path
```/
/DERIVESTsuite
/DERIVESTsuite/demo
/DERIVESTsuite/demo/html
/DERIVESTsuite/doc```
14 Jun 2016 Rodrigo


Fantastic tool. Great for spotting typos in gradients and Jacobians that are being fed to solvers and optimizers.

12 Apr 2016 Rodrigo Duran


@Peng, derivest computes the derivative of a function of one variable, which is itself another function of one variable. The Hessian is the square matrix of second derivatives of a function of an n-dimensional variable.
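To make the distinction concrete, here is a small sketch (assuming the suite's derivest, gradest and hessian functions are on the MATLAB path, as in the package demos):

```matlab
% One variable: derivest returns d/dx at a point.
f1 = @(x) exp(2*x);
[d, err] = derivest(f1, 0);        % derivative of exp(2x) at 0, approximately 2

% n-dimensional variable: use gradest for the first partials,
% hessian for the square matrix of second partials.
f2 = @(xy) exp(xy(1)) + sin(xy(2));
[g, gerr] = gradest(f2, [0 0]);    % gradient vector, approximately [1 1]
[H, herr] = hessian(f2, [0 0]);    % 2x2 Hessian matrix
```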

12 Apr 2016 Peng Wang

Hi,
I met some problems when using the codes. I wrote:
d=derivest(@(x) exp([1 3]*x-0.1)+exp([1 -3]*x-0.1)+exp([-1 0]*x-0.1),[0;0])
in MATLAB, but it returns an error:
Error using *
Inner matrix dimensions must agree.

Error in GeneralFunction>@(x)exp([1,3]*x-0.1)+exp([1,-3]*x-0.1)+exp([-1,0]*x-0.1)

Error in derivest (line 362)
f_plusdel = fun(x0i+h*delta);
But when I use the hessian command in MATLAB it works:
h=hessian(@(x) exp([1 3]*x-0.1)+exp([1 -3]*x-0.1)+exp([-1 0]*x-0.1),[0;0])
Can you help me to see what is wrong with derivest?

17 Mar 2016 neiho

Hi John,

Thanks for sharing this great package!:) I have a small question:

Can I choose the method (forward, etc.) and precision in jacobianest?
Only (?) jacobianest allows for vectorized code, but no options are provided.

25 Feb 2016 Huck Febbo

Thanks Matt J.

I am having other issues with this now, though. In hessdiag.m, it goes into this for loop:

for ind = 1:nx
  [HD(ind),err(ind),finaldelta(ind)] = derivest( ...
    @(xi) fun(swapelement(x0,ind,xi)), ...
    x0(ind),'deriv',2,'vectorized','no');
end

It passes x0(ind), a single element of the x0 vector, to derivest. Then in derivest:

% will we need fun(x0)?
if (rem(par.DerivativeOrder,2) == 0) || ~strncmpi(par.Style,'central',7)
  if strcmpi(par.Vectorized,'yes')
    f_x0 = fun(x0);
  else
    % not vectorized, so loop
    f_x0 = zeros(size(x0));
    for j = 1:numel(x0)
      f_x0(j) = fun(x0(j));
    end
  end
else
  f_x0 = [];
end

it fails:
K>> f_x0(j) = fun(x0(j))
In an assignment A(I) = B, the number of elements in B and I must be the same.

Because we passed x0(ind), which is a single number and not the entire vector. Any ideas? I don't want to muddle around in this code and try to fix it, as I assume that I have done something wrong in the first place. Thanks

21 Feb 2016 Matt J

@Huck,

This doesn't require any modification of DERIVEST. You can turn any function with extra parameters into a single-argument function using the techniques described here,

http://www.mathworks.com/help/optim/ug/passing-extra-parameters.html
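A minimal sketch of that technique, using Huck's names (constraints, auxdata and x0 come from his description and are hypothetical, not part of the toolbox):

```matlab
auxdata = {1.5, [2 3]};                  % example parameter set (hypothetical values)
wrapped = @(x) constraints(x, auxdata);  % auxdata is captured inside the handle
% jacobianest only ever sees a function of the single argument x:
[J, Jerr] = jacobianest(wrapped, x0);
```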

21 Feb 2016 Huck Febbo

Can you please modify this toolbox so that the constraint function can also accept a set of parameters? For instance, if I am trying to calculate the Jacobian of my constraints and my constraint function looks like this:

c = constraints(x, auxdata);

% where auxdata is a cell array of parameters that my constraint function needs to run

Thanks!

02 Dec 2015 Matt J

It definitely seems a lot more robust than the Optimization Toolbox finite differencers. However, I also see failure cases like that reported by Paul (23 Nov 2015). I'll up my score to 5 stars once those wrinkles are ironed out.

25 Nov 2015 anuribs

23 Nov 2015 Paul


Hi John,
thanks for sharing this code. I found some numerical issues when estimating the Jacobian as the starting point approaches zero.

For example:
jacobianest(@(x) x+10, 1e-12)
is returning zero even though it should be one. A starting point of 1e-11 works fine. Zero works as well because it is handled as a special case in the code. The problem is combining very small deltas with the relatively large +10 in the objective function, because delta is scaled by the starting point if the starting point is ~= 0 (line 100 in jacobianest.m). A workaround is just not scaling the deltas by the starting point.
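One possible workaround that avoids editing jacobianest itself (my own sketch, an assumption rather than a tested fix): differentiate a shifted copy of the function at zero, where the special-case handling applies:

```matlab
x0 = 1e-12;
f  = @(x) x + 10;
g  = @(dx) f(x0 + dx);          % same slope, but evaluated at the base point 0
[J, Jerr] = jacobianest(g, 0);  % should recover J ~ 1
```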

19 Nov 2015 Rodrigo Duran

John,

It is because I know that difference formulae are very ill-conditioned that I am testing to see when a fancy finite-difference scheme breaks down. I am looping over different values of coeff to get a feeling for this, and every time I hit the mentioned value of coeff I see that derivest does very well at every point in vector v except in a vicinity of v==4. In other words, your assumption about my belief is wrong, and the question remains: what is so special about v==4 when noise is uniformly and randomly distributed in the interval [-1e-10,1e-10]?
If you run my code several times with the derivest version available here you should see what I mean.

19 Nov 2015 John D'Errico

No, no, no. There is no bug, except in your belief that derivest is a tool designed to solve the problem you pose.

A function that contains random noise is by definition not differentiable everywhere. Derivest is not designed to differentiate a noisy function, nor should it be expected to do so. While derivest tries to be insensitive to tiny amounts of noise (essentially in the least significant bits), adding noise invalidates the use of a tool designed to differentiate a function that actually HAS a derivative.

While the underlying function in your problem HAS a derivative, once you add noise to it, the result is NOT differentiable.

Note that differentiation is an ill-posed problem, in the sense that it tends to amplify any noise in the function. I'm not sure that it matters, but then you tried to use a higher order method to generate the derivative. This will induce more noise amplification yet.

You might wish to use a tool that can essentially smooth away the noise, then differentiate the result. There are many such tools, depending on your application and what you need out the end. This might be anything from a Savitzky-Golay filter to a least squares spline or local polynomial model, or other tools.
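A rough sketch of the "smooth first, then differentiate" idea (my illustration, not part of the DERIVEST suite): fit a least squares polynomial to the noisy samples and differentiate the fit analytically.

```matlab
coeff = 1e-10;
v  = 0:0.1:2*pi;
fv = sin(v) + coeff*(-1 + 2*rand(size(v)));  % noisy samples of sin(x)
p  = polyfit(v, fv, 7);                      % low-order LS fit smooths the noise
dv = polyval(polyder(p), v);                 % analytic derivative of the fit, ~cos(v)
```

A spline or a Savitzky-Golay filter would scale better than a single global polynomial; this is only the simplest instance of the approach.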

19 Nov 2015 Rodrigo Duran

John,

I was testing finite differences with noise as follows:
coeff=1e-10;
v=0:0.1:2*pi;
[d,~]=derivest(@(x) sin(x)+coeff*(-1+2*rand(size(x))),v,'deriv',2,'MethodOrder',4);
plot(v,d,'-o')

Every time I run this code (i.e. with different noise generated with rand in the interval [-coeff coeff]), the error is much larger near v==4. This seems unusual and suggests a bug to me. Didn't look into derivest but thought I would report it.

Rodrigo.

16 Sep 2015 Matthias

Hello John,

I think the Hessian calculation can be sped up significantly if the evaluation of the function under test is slow (as it is in my case):
In hessian.m you call hessdiag.m, which needs to calculate the function values at certain sample points.
gradest.m, which is also called by hessian.m, calculates the function values at the same sample points as hessdiag.m, so it's unnecessary to do it twice.

@Simon Tardivel: I can confirm your observation as well as your workaround. Thank you.

12 Aug 2015 Simon Tardivel


Hello John,

Thank you very much for this code. It's well written, simple to use, fast enough for its purpose. It is strange that these functions are still missing in the official built-in functions.

However, I may have found a bug/unexpected behavior. I can't unfortunately share the code (plus it requires thousands of lines of code...) but here's the problem and a working fix:

Situation: a hessdiag I was computing was supposed to yield [a b c] with a,b,c all different from 0. I know they are all different from 0, the laws of physics require it. However b is rather small, usually a couple orders of magnitude smaller than a. a and c are comparable, of opposite sign.

Problem: hessdiag outputted [a,0,0] and the estimated error was [10^-7*a, 0, 0].

Fix: I went to derivest and changed the default RombergTerms from 2 to 3. b and c reappeared. b was estimated at about 10^-3*a and c was approximately -a -- and these are their correct values.

I have no idea what the problem fundamentally was. I've just noticed that hessian has default Romberg terms = 3 but doesn't pass this argument to the computation of the gradient, which assumes Romberg terms = 2. It seems that, in fact, Romberg terms = 3 is used only for the upper triangle (3+1 really). Maybe that caused the problem? I do not know anything about numerical differentiation methods, so that may be completely off point...

Best,
Simon.
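Editorial note: if the 'RombergTerms' property Simon mentions is accepted per call (plausible, since it is a named default inside derivest, but an assumption here), the fix above could be requested without editing the file:

```matlab
fun = @(x) x.^3;   % stand-in function; Simon's actual model is not shown
% request 3 Romberg terms for this one second-derivative estimate
[d2, e2] = derivest(fun, 0.5, 'DerivativeOrder', 2, 'RombergTerms', 3);
```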

08 Aug 2015 Jacob


Hello,

I was wondering if you can give me your opinion on the following issue.
jacobianest fails to work for my function, and I think the reason may be that I am trying to evaluate the Jacobian at a point near the boundary of the function domain. Is this a reasonable guess? If so, how can I fix this? My intuition is that the MaxStep is too large and the code tries to evaluate the function where it is not defined. But can I just pick a smaller value of MaxStep?

Thanks!

30 Jul 2015 John D'Errico

You do realize that the function you typed is not actually valid MATLAB syntax?

f=@(x)(1/2*x'(Ax) + x'b)

For example, x'b may be ok when you write it on paper, but MATLAB requires a * operator between them to understand that a multiply (dot product) is intended there.

Next, you write this:

f=@(x)(1/2*x'(Ax) + x'b)
derivest(f0,x0),

So, you define the function f, but then try to call derivest with f0. How should it know that you intended something other than what you wrote?

My guess is you are not being careful in what you do. USE VALID SYNTAX.

30 Jul 2015 ak135AK

I am trying to differentiate 1/2*x'*A*x + x'*b, but it seems not to be working. Does it work in general, or am I making a mistake in calling it?

I have tried:
f=@(x)(1/2*x'(Ax) + x'b)
derivest(f0,x0), but I always get the following error:
Not enough input arguments.

Error in derivest (line 362)
f_plusdel = fun(x0i+h*delta);

Is there something to do in order to get it working?
Thanks !

17 Jun 2015 John D'Errico


You might THINK of the gradient of a function of multiple variables as the derivative, but it is not. A derivative is defined on a function of ONE variable, at least, unless they have changed calculus since I took it.

derivest computes the derivative, dy/dx, of a function y(x), a function of one variable.

The fact is, I do provide a gradient calculator in the package, so you might have tried that! Look at the help. Look at the demos.

Finally, I'm not going to look that carefully at your code to see if it makes sense. But a quick glance shows lines like this:

p = max(p,0.0000001);

This will make your function non-differentiable, and should have no purpose at all since you are computing a log-likelihood function in the first place! It does seem irrelevant of course, since you fail to use p after that in your code. Why compute something if you intend to dump it in the bit bucket?

These next lines do however get used, and they will indeed make the function non-differentiable.

y = reshape(y,max(alt),max(ncs));
x = reshape(exp(v(b)),max(alt),max(ncs));

So I think you need to consider what it is you are doing, and maybe if there is a reason why an optimizer fails, if it presumes the function it is trying to optimize is differentiable, when your function is not so.

16 Jun 2015 Sebastian

Hi there,
I am currently working on getting MATLAB to optimize log-likelihood functions for some econometric models. The problem is that the internal optimizer seems to have issues with maximizing these functions, so I want to see if things get better if I provide a good numerical estimate of the gradient, which is how I found this toolbox. However, I have some problems making it run. Here is a very easy example of the function I am trying to optimize:

function logli = logl(b,v,ncs,alt,y)
y = reshape(y,max(alt),max(ncs));
x = reshape(exp(v(b)),max(alt),max(ncs));
p = x(y == 1)./sum(x)';
p = max(p,0.0000001);
logli = -sum(log(x(y == 1)./sum(x)'));
end

where
v is an anonymous function like
v = @(b) b(1)*x1 + b(2)*x2 + b(3)*x3
b is the parameter vector with starting values
b = [0,0,0]
ncs and alt are identifier variables
y is the independent variable

Now if I want to obtain the first derivative I use:

[der,errest] = derivest(@(b)logl(b,v,ncs,alt,y),b)

but get the error message:
Error using derivest (line 409)
fun did not return the correct size result (fun must be vectorized)

If I specify:
[der,errest] = derivest(@(b)logl(b,v,ncs,alt,y),b,'Vectorized','no')
It seems to pass only one dimension of b as I get
Attempted to access b(2); index out of bounds because numel(b)=1.

The data structure of these problems is:

ncs alt y x1 x2 x3
1 1 1 x11 x21 x31
1 2 0 x12 x22 x32
1 3 0 x13 x23 x33
2 1 0 x11 x21 x31
2 2 1 x12 x22 x32
2 3 0 x13 x23 x33
...

I would be very happy about any recommendation on how to use the derivest function in that case.

Cheers
Sebastian

19 Feb 2015 John D'Errico

The simple answer is to use the hessian function. For example, I'll compute a second derivative

d^2f/dxdy

of the function:

f = sin(x + y^2)

at the point (x,y) = (2,3). I'll do it using these tools, then do it again using the symbolic toolbox to verify the answer.

fun = @(xy) sin(xy(1) + xy(2).^2);
format long g
hessian(fun,[2 3])
ans =
0.999990206550529 5.99994123927655
5.99994123927655 36.0084988318012

So the second partial you want is the (1,2) element, thus the off diagonal element of the hessian matrix.

Now verify that value using symbolic means.

syms x y
fs = sin(x + y.^2);

dfs_dxdy = diff(diff(fs,x),y)
dfs_dxdy =
-2*y*sin(y^2 + x)

subs(dfs_dxdy,{x,y},[2 3])
ans =
-6*sin(11)

vpa(ans)
ans =
5.9999412393042207423093893941531

So that worked well enough, although it did compute additional terms that we did not really need.

I could do it using derivest too, although it looks like I had to tell derivest that this function is not properly vectorized.

fun = @(x,y) sin(x + y.^2);

So as a function of y, at x == 2, we would have

dfdx = @(y) derivest(@(x)fun(x,y), 2, 'vectorized', 'no');

derivest(dfdx,3,'vectorized','no')
ans =
5.99994123930711


Great file, I have a question. I have an objective function and I am taking its derivative with respect to the input in MATLAB. I want to take the derivative twice: once with respect to the input, and a second time with respect to a parameter. But once I have used GRADEST to get the gradient with respect to the input, the answer I have is a number, that is, a gradient value. How can I take the gradient a second time when I don't have the first derivative in some kind of a formulation?

27 Sep 2014 Guilherme

Great File, many thanks for it.

However, I am currently facing some issues with my function (bad scaled parameters), and therefore with the second derivatives matrix (Hessian).

Here is the deal: I am minimizing a cost function (chi^2 = (y - ymodel)^2), where my model is a nonlinear one with 3 parameters, using a Nelder-Mead algo.

Using the |hessian| function from your package with the best-fit parameters yielded some strange results (since Parameter_1 ranges over [0.1:0.9], Parameter_2 over [10:30] and Parameter_3 over [150:250]); that is, I need to re-scale them. My question is: what is the best way to do that, and does your code endogenously scale the results by any chance?

Thank you very much in advance,

17 Apr 2014 Niklas


28 Mar 2014 kartik srinivasan


06 Feb 2014 Andres Tovar

18 Nov 2013 Tim


11 Nov 2013 Danny


11 Sep 2013 Camille


Great submission, saves me a lot of work, thanks!

13 Aug 2013 Matthew


13 Jul 2013 John D'Errico


retrobytes - not all functions will be able to use complex arguments, so derivest avoids that approach. Hyper-dual numbers would appear to have the same flaw, at least without a great deal of effort invested.

12 Jul 2013 retrobytes

A very useful tool, John, thanks!

Did any of you guys think about implementing the hyper-dual numbers approach for this? I recently came across this:
http://www.stanford.edu/~jfike/SNUpresentation_rev4.pdf
and:
http://www.stanford.edu/~jfike/hyperdual.html
and I was wondering if any of you tried it instead of using complex numbers?

18 May 2013 Anders Munk-Nielsen


Absolutely awesome work! So easy to use, brilliantly done. This is probably the best thing to happen to mankind since sliced bread. And to put it up for free? John, you sir are a good man!

30 Jan 2013 Roberto Calandra


Just great

01 Nov 2012 daisy


12 May 2012 Md. Ikramul Hasan

20 Dec 2011 ehs


11 Jul 2011 Woodeast


Thanks a lot for this useful tool!

I have a question about how far from the initial point the program will evaluate fun. I have a function which only exists in a certain range around the initial point (but big enough in my sense, e.g. starting at .1, requiring >0). However, the program breaks down by evaluating out-of-range points after a fair amount of calculation. Could you please give me some idea about why this happens?

(BTW, my function has some randomness, does this matter?)

Thanks so much!

26 Jun 2011 Andrew Newell

Excellent! Some of its predecessors also do accurate derivatives, but this one is particularly easy to use for gradients and Hessians.

18 Apr 2011 Mark


Hi John,

This looks great. I'm trying to use the hessian function to find the Fisher information matrix for maximum likelihood. However, the hessian program is choosing parameters which are not allowable in my chosen model. How can I restrict the hessian algorithm to permissible parameter values?

04 Apr 2011 John D'Errico


Your problem is in the way you called the code.

H=hessian(@(p) @reg2002, [0.03 3 -5])

Here is part of the error message:

"??? The following error occurred converting from function_handle to double:
Error using ==> double
Conversion to double from function_handle is not possible."

Is reg2002 a function m-file, into which you are trying to pass the variable p? If so, then you could have done this

H=hessian('reg2002', [0.03 3 -5])

or, you could have done this

H=hessian(@reg2002, [0.03 3 -5])

In either case, it assumes that reg2002 is an m-file, and passes in a variable with three elements in it as a vector. It would return a 3x3 Hessian matrix. The use of @reg2002 turns that into a function handle.

See the error message you had though. Matlab sees this:

@(p) @reg2002

and tries to return a function handle as the result of a function. Then my code looks at it, and tries to use the function handle as a number. Of course it fails.

Finally, you might also have done this

H=hessian(@(p) reg2002(p), [0.03 3 -5])

Here, you would be creating an anonymous function, a function of the variable p. It is yet another function handle that the code can use properly.

04 Apr 2011 emma liu

hi John, many thanks for the wonderful suite!

I met a problem when I use the function from the m-file. my code is:

H=hessian(@(p) @reg2002, [0.03 3 -5])

??? The following error occurred converting from function_handle to double:
Error using ==> double
Conversion to double from function_handle is not possible.

Error in ==> derivest at 330
f_x0(j) = fun(x0(j));

Error in ==> hessdiag at 59
[HD(ind),err(ind),finaldelta(ind)] = derivest( ...

Error in ==> hessian at 74
[hess,err] = hessdiag(fun,x0);

I don't know what is wrong here. Are there mistakes I made? Many thanks in advance.

Emma

04 Apr 2011 John D'Errico


Felipe - Note that the submissions #15235 and #26807 actually appeared AFTER my tool (#13490) did. Submissions are assigned sequential numbers on the FEX. I see no reason to acknowledge a later submission on the FEX. In general, since forward acknowledgement would seem to require precognition of some sort, it is rather difficult to accomplish. In fact, most of the references in that Wikipedia article referenced appeared several years after this tool was written and posted on the FEX.

I will modify the title in light of your comment though, since the phrase might conceivably confuse some people.

04 Apr 2011 Felipe G. Nievinski

"Automatic" in the title might mislead viewers into thinking it implements "automatic differentiation"; see <http://en.wikipedia.org/wiki/Automatic_differentiation> and submission #26807.

You might want to use the acknowledgement field so that related submission #15235 back-links to yours.

Haven't tried it yet, which is why I'm not rating it.

28 Mar 2011 Kulan

22 Feb 2011 Sturla Kvamsdal


Hi John,

Thank you very much for the great suite!

The information on the hessian function explicitly states that it is not for frequent use on an expensive-to-evaluate objective function. I tried it anyway, because that is what I need. Most of the time, I get something reasonable (hard to check, but I'll worry about that later), but sometimes I get NaNs. Any idea what the problem might be?

Perhaps some more details are warranted: the function I want to differentiate is a likelihood function. It is a function of time, and I need the Hessian at the maximum for each timestep. (The maximum is one point associated with the last time period.) The function takes 5 parameters.

Any help or ideas appreciated!

Thank you very much,
Sturla

27 Jan 2011 John D'Errico

If you are looking for all of the second partials, then yes, a loop on the Hessian computation seems the solution.

26 Jan 2011 Aydin Buluc

Hi John,
Thanks much for the quick response.
jacobianest was indeed what I needed, should have looked into the whole package first.

I was looking at the hessian function, which doesn't allow me to use a vector function (perhaps understandably, as the output would be a tensor otherwise).

Proceeding with the matrix-vector example
@(x) A*x;
Should I loop over the rows of A and call hessian m times to get all the Hessians?
for i=1:m
  f = @(x) A(i,:)*x; % returns a scalar
  [h(i) herr(i)] = hessian(f, x0);
end

Thanks,
- Aydin

26 Jan 2011 John D'Errico

Hi Aydin,

The trick is, derivest itself takes only a function of a single variable. Suppose I had a scalar function of three variables. Thus it takes a set of three variables, returning a single number. If you wish to compute a partial derivative of that function, then you get a single number out. But that is a partial derivative, NOT a derivative. "The" derivative of a scalar function of three variables is given by the gradient vector, composed of three partial derivatives. So I have provided the tool gradest. Use it for scalar functions of multiple variables.

Now, suppose I have a function of n variables, that produces m output results? (This is what you have by the way.) Now you must compute a Jacobian MATRIX, an m by n matrix of partial derivatives. Again, derivest is not the tool, nor is gradest. This is a different problem that requires a different tool yet, so I wrote jacobianest.

>> A = magic(3)
A =
8 1 6
3 5 7
4 9 2

>> fun = @(x) A*x;
>> [J,Jerr] = jacobianest(fun,[1;2;3])
J =
8 1 6
3 5 7
4 9 2

Jerr =
7.7667e-14 9.7084e-15 5.3057e-14
2.6528e-14 4.3706e-14 7.09e-14
3.8833e-14 9.4059e-14 1.9417e-14

In the case that you really only wanted to know the derivative of a specific element of the result, with respect to a specific single variable out of the several inputs, you need to define a function where the OTHER variables are held fixed. Thus...

>> x1 = 1;
>> x2 = 2;
>> fun2 = @(x3) [0 1 0]*fun([x1;x2;x3]);
>> [d23,derr] = derivest(fun2,3,'vectorized','no')
d23 =
7
derr =
4.785e-14

Thus, fun2 fixes the first two variables, then extracts out the second output result. derivest can now operate, as long as I tell it that this function is not vectorized. It is not set up to take a vector of inputs and produce one result for each. fun2 takes only a scalar input, and produces only a scalar output.

John

25 Jan 2011 Aydin Buluc

Hi John,
Thanks for the lovely tool.
I couldn't figure out how to differentiate multivariable functions with it though. For example,
f = @(x) A*x

where A is a matrix and x is a vector.

Your code seems to do the following:

% Loop over the elements of x0, reducing it to
% a scalar problem. Sorry, vectorization is not
% complete here, but this IS only a single loop.
der = zeros(nx0);
errest = der;
finaldelta = der;
for i = 1:n
...
end

while A*x doesn't really reduce to a single scalar problem as far as I can see. Any suggestions?

21 Jan 2011 Alessandro


Sorry John, sorted out the problem, was my fault... thanks again for these useful codes.
Alessandro

21 Jan 2011 Alessandro

Hi John,

I tried to run the Rosenbrock example included in the help of your routine hessian, i.e. I ran

rosen = @(x) (1-x(1)).^2 + 105*(x(2)-x(1).^2).^2;
[h,err] = hessian(rosen,[1 1])

I get the following error:

"??? Error using ==> subsindex
Function 'subsindex' is not defined for values of class 'function_handle'."

do you know how to solve this?
thank you.
Alessandro

18 Dec 2010 John D'Errico


I believe that fmincon has improved since my statement in the past, so that the hessian as returned is now a better approximation. Of course, if you supply the hessian, no updates are needed.

17 Dec 2010 Januj

Hi Errico,

In another post, you stated that MATLAB does not compute the Hessian very well. Is this why the Hessian may not get updated in fmincon? Secondly, would providing a user-supplied Hessian (through the DERIVEST routine, for e.g.) improve the ability of fmincon to update the Hessian in all cases? Thanks!

08 Dec 2010 Kimonas Fountoulakis

Hi Errico,

I tried to extend your code in order to differentiate bilinear matrix inequalities. The program is still in a rough form; in particular, I didn't implement the error part of the estimation, but I was able to produce derivatives of a BMI function correctly. If you are interested, I can send you the program. I will try one of these days to implement the error term as well.

Kimonas

06 Dec 2010 John D'Errico

Hi Andrew,

The point is, derivest uses a sequence of nested approximations, using a geometric sequence of points for the steps. For example, suppose we chose to compute the derivative of tan(x), at x = pi*2/3.

[D,errest] = derivest(@tan,pi*2/3)

D =
4

errest =
7.0302e-13

Tan(x) has a singularity at x = pi/2, so if the step size is chosen to be large enough, the approximations will actually span that singularity. This will cause the corresponding estimate to be useless, but since there are multiple estimates produced, some of them will be good, and I can figure out which estimates in the sequence ARE good and which ones are bad.

I pick that estimate with the lowest predicted error, along with some additional criteria.

A nice thing about the nested sequence of approximations is it allows me to deal with functions that have problem scaling. I can almost always find a viable solution.

John

06 Dec 2010 Andrew Newell

In derivest_demo.m, I found the following statement:
"Although a central rule may put some samples in the wrong places, it may still succeed"

How would it put samples in the "wrong places"?

29 Nov 2010 John D'Errico

Hi Prasanna,
In theory, I could have written derivest to provide higher derivatives. In practice, high order numerical differentiation is difficult to do. Differentiation is a noise amplifier. It takes any tiny noise in your system, and boosts it. By the time you get up to the 4th derivative or so, the noise down in the least significant bits can start to cause numerical problems. For this reason, I limited it there. Any further, and I was no longer able to trust what was coming out in some test cases. And since I had to put a limit somewhere, I chose 4 as the maximum order of derivative it would provide.

You can use the error estimate provided as a measure of how well I think I was able to do. As an example, try a simple trig function. Differentiate sin(x), at x = pi/4. The result will always be plus or minus sqrt(2)/2, depending on the derivative order.

>> fun = @sin;
>> [D,DE] = derivest(fun,pi/4,'derivativeorder',1)
D =
0.70711
DE =
7.0617e-15

For the first derivative, I can give virtually full precision. However, the second derivative is not quite so verifiably computable, and the error estimate reflects that fact.

>> [D,DE] = derivest(fun,pi/4,'derivativeorder',2)
D =
-0.70711
DE =
1.666e-12

>> [D,DE] = derivest(fun,pi/4,'derivativeorder',3)
D =
-0.70711
DE =
6.6146e-11

>> [D,DE] = derivest(fun,pi/4,'derivativeorder',4)
D =
0.70711
DE =
1.9122e-05

See that by the time I've gotten to the 4th derivative, derivest tells me only to trust the first 4 or 5 digits in the result. Had I allowed higher orders here, the result would rapidly become a virtual random number generator.

Now, with care, one could modify the code to work for 5th derivatives. Just look for par.DerivativeOrder, and see what was done, extending the code for 5th order derivatives. A quick check through the code finds only one or two spots where serious understanding of the methods employed was necessary. (To be honest, leave such a modification of the code to me.)

Can you nest derivest calls? Yes, in theory you can do so to produce higher order derivatives, and the error estimate will tell you if it thinks it is generating garbage for a result. This test computes the 5th derivative of sin(x), at pi/4.

>> format long g
>> d2 = @(x) derivest(fun,pi/4 + x,'derivativeorder',2);

>> [d5,d5err] = derivest(d2,0,'derivativeorder',3)
d5 =
0.707106775613828
d5err =
4.17633428512851e-08

>> sqrt(2)/2
ans =
0.707106781186548

Note that derivest thinks it was able to do reasonably well, and its error estimate was just about right about where to stop trusting digits in the result. In this case, the 3rd derivative of the 2nd derivative is quite accurate (but a bit slower!), and probably more accurate than a direct computation of the 5th derivative would have been.

HTH,
John

Comment only
28 Nov 2010 Prasanna Venkatesh

Prasanna Venkatesh (view profile)

Hello John,
This piece of code is great. I had one question. I have a function for which I have no analytical expression, so I cannot vectorize it, but I can evaluate it numerically quite quickly, and by applying your derivest function I know it has smooth first through fourth derivatives. I need one more derivative to complete my calculation, and I was wondering if I should wrap the fourth-derivative calculation from derivest in an anonymous function and take its first derivative to get the fifth. Is there a specific reason this code goes only up to the 4th derivative?

19 Nov 2010 J West

J West (view profile)

Thanks very much, that was exceptionally helpful. These are a nice set of functions and very much appreciated.

Unsurprisingly, calling gradest in a loop over several points inside fminsearch is not the fastest thing in the world, but such are the trade-offs, and it does seem to do the job!

18 Nov 2010 John D'Errico

John D'Errico (view profile)

I have provided the related function gradest to compute a gradient, zipped with derivest. For example, I'll pick arbitrary values for the parameters k1,k2,k3,k4,k5, and then define a function handle.

k1 = 1/2;
k2 = 1/3;
k3 = 1/4;
k4 = 1/5;
k5 = 1/6;

fun = @(xyz) k1*xyz(1)*(1- ...
exp(-k2*(xyz(2)^k3)*(xyz(3)^k4)*(xyz(1)^k5)))

ans =
0.21997 0.018836 0.010046

For multiple points, wrap a loop around the call to gradest.
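That loop might look like the following sketch, assuming fun as defined above; the points in xyz0 are made up for illustration, and gradest takes a function handle and a point, returning the gradient row and its error estimate:

```matlab
% Sketch: gradient of fun at several (x,y,z) points via gradest.
xyz0 = [1 2 3; 2 3 5; 4 1 2];        % hypothetical points, one per row
G    = zeros(size(xyz0));            % gradients
Gerr = zeros(size(xyz0));            % error estimates
for i = 1:size(xyz0,1)
  [G(i,:), Gerr(i,:)] = gradest(fun, xyz0(i,:));
end
```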

John

Comment only
17 Nov 2010 J West

J West (view profile)

I may be missing something obvious here, but is there a clever way to use derivest to calculate a set of partial derivatives?

I have a function something like
f = k1 * x * (1 - exp(-k2*(y^k3)*(z^k4)*(x^k5)))
and would like to be able to calculate the partial derivatives df/dx, df/dy, and df/dz for sets of specific values of x, y, and z.

I am doing this in order to calculate an error weighting for least squares fitting -- I have uncertainties in x, y, and z which I would like to map onto f in order to determine a weighting factor for the minimization (so I want the gradient at each point). I would prefer to determine the derivative numerically because that will allow me the most flexibility in being able to try different forms of the function, etc.

Many thanks.

Comment only
27 Mar 2010 John D'Errico

John D'Errico (view profile)

Large relative reported errors might mean that the computed derivative is small compared to the uncertainty in it. It may also mean that there is some problem in the computation, leaving the derivative uncertain. It is difficult to know what has happened from this description alone.

One option is to compute the derivative in question using derivest instead of the hessian tool. While this will be less efficient, derivest itself provides many more options for overcoming problems. That way you could be sure of what the problem is, and hopefully overcome it.

Comment only
17 Mar 2010 pekka

pekka (view profile)

Hi,

I have a problem: sometimes when I compute the Hessian, err contains terms that are larger than the corresponding terms in the reported Hessian. What are the reasons for this, and are there general ways to rewrite things to avoid it?

Comment only
01 Mar 2010 V

V (view profile)

I'll get Release 14 (version 7).

28 Feb 2010 John D'Errico

John D'Errico (view profile)

Hi V,

You must be using an old MATLAB release. Here is what happens when I try your example in R2010a:

>> myfun1 = inline('exp(x)');
[d,err]=derivest(myfun1,1,'Vectorized','no')

d =

2.7183

err =

2.1641e-14

So then I looked more carefully at your comment. I see the warning message gives the answer away.

Warning: File: C:\MATLAB6p5\work\derivest.m Line: 463 Column: 1
Unmatched "end".

You are running version 6.5 (i.e., Release 13). Derivest was written for Release 14 (version 7). As such, it uses several language features that will make your MATLAB release quite unhappy.

Sorry,
John

Comment only
28 Feb 2010 V

V (view profile)

hi john,

i tried this

myfun1 = inline('exp(x)');
[d,err]=derivest(myfun1,1,'Vectorized','no')

and i got the error below,

Warning: File: C:\MATLAB6p5\work\derivest.m Line: 463 Column: 1
Unmatched "end".
(Type "warning off MATLAB:m_warning_end_without_block" to suppress this warning.)
Warning: File: C:\MATLAB6p5\work\derivest.m Line: 523 Column: 1
Unmatched "end".
(Type "warning off MATLAB:m_warning_end_without_block" to suppress this warning.)
Warning: File: C:\MATLAB6p5\work\derivest.m Line: 538 Column: 1
Unmatched "end".
(Type "warning off MATLAB:m_warning_end_without_block" to suppress this warning.)
Warning: File: C:\MATLAB6p5\work\derivest.m Line: 573 Column: 1
Unmatched "end".
(Type "warning off MATLAB:m_warning_end_without_block" to suppress this warning.)
Warning: File: C:\MATLAB6p5\work\derivest.m Line: 659 Column: 1
Unmatched "end".
(Type "warning off MATLAB:m_warning_end_without_block" to suppress this warning.)
Warning: File: C:\MATLAB6p5\work\derivest.m Line: 759 Column: 1
Unmatched "end".
(Type "warning off MATLAB:m_warning_end_without_block" to suppress this warning.)
??? Error using ==> factorial
N must be a positive integer.

Error in ==> C:\MATLAB6p5\work\derivest.m (fdamat)
On line 564 ==> c = 1./factorial(1:2:(2*nterms));

Error in ==> C:\MATLAB6p5\work\derivest.m
On line 275 ==> fdarule = [1 0]/fdamat(par.StepRatio,1,2);

Comment only
10 Jan 2010 Richard Crozier

Richard Crozier (view profile)

Another really excellent piece of code, I hope TMW buy you a nice christmas present every year for extending their software so extensively for them.

23 Nov 2009 David Doria

David Doria (view profile)

Hello,

Hessian does not work, either in the examples or when I try to use it. It gives the message:
Function 'subsindex' is not defined for values of class 'function_handle'.
All other functions seem to be great though.
Best,

19 Aug 2009 Michelle

Michelle (view profile)

Hi John,

Thanks for the timely and detailed reply! I had assumed that derivest handles all cases. Now I see jacobianest is a standalone function. Thanks again! And great code!

Michelle

Comment only
18 Aug 2009 John D'Errico

John D'Errico (view profile)

Hi Michelle,

The problem is that myfun, as you have defined it, is not a scalar-valued function: it returns a vector. Thus we can do this:

myfun=inline('[exp(x);exp(2*x)]');

myfun(2)
ans =
7.3891
54.598

See that there are two numbers returned for a scalar input.

(By the way, I'd strongly recommend the use of function handles and anonymous functions instead of inline functions. An inline function is comparatively very slow to evaluate.)

The derivest code computes the derivative of a scalar valued function. Change your function to return a scalar value, as do myfun1 and myfun2 below, and derivest works as designed.

myfun1 = inline('exp(x)');
[d,err]=derivest(myfun1,1,'Vectorized','no')

d =
2.7183
err =
9.3127e-15

myfun2 = inline('exp(2*x)');
[d,err]=derivest(myfun2,1,'Vectorized','no')
d =
14.778
err =
1.8401e-13

Now, how might one define the derivative of a vector valued function? This is defined mathematically as the Jacobian. In fact, in the derivest suite, I have provided the jacobianest function to compute what you wish. It works on a vector valued function, returning a jacobian matrix. Here, since the parameter of these functions is a scalar, the Jacobian matrix is just a vector of length 2.

[d,err] = jacobianest(myfun,1)
d =
2.7183
14.778
err =
3.6841e-14
1.4736e-13

John

Comment only
18 Aug 2009 Michelle

Michelle (view profile)

Thanks so much for making such a useful suite available!

I was experimenting with the code and typed in the following two commands:

>> myfun=inline('[exp(x);exp(2*x)]')
>> [d,err]=derivest(myfun,1,'Vectorized','no')

but got back error messages that read

??? In an assignment A(I) = B, the number of elements in B and
I must be the same.

Error in ==> derivest at 362
f_plusdel(j) = fun(x0i+h*delta(j));

I'm posting it here because I thought you might be able to spot the problem much more quickly than I could. Could you let me know?

Comment only
27 Jun 2009 John D'Errico

John D'Errico (view profile)

Benn,

My guess is that you have written your own factorial function, probably as a homework assignment, since the error messages show a recursive implementation of factorial. This causes a failure in derivest, since your version is not properly vectorized. By vectorized, I mean that when called with a list of numbers, the Mathworks version of factorial computes the factorial of each number.

You can learn which factorial version is called by typing this at the command line:

which factorial -all

MATLAB will show which version of the factorial function will be used. You should see the version that you wrote listed at the very top.

At first, I thought that you might simply be using an old version of MATLAB, but that is unlikely, since the Mathworks would never have written factorial recursively, as I see here. (Recursive factorial definitions are a nice way to teach a student recursive coding styles, but they are a terrible coding style in practice because of the wasted function-call overhead just to multiply a list of consecutive numbers together.)

The solution is to remove the bad (homework assignment) version of factorial from your search path. It will be inefficient anyway compared to the supplied version of factorial that MATLAB already comes with, and your version is not properly vectorized. At the very least, move it to another place far down on your search path so that matlab will use the correct version of factorial instead.
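As a side note on what "properly vectorized" means here: derivest calls factorial on a whole vector of orders at once, so factorial must operate elementwise. A quick check in MATLAB:

```matlab
% MATLAB's built-in factorial operates elementwise on arrays, which is
% what derivest relies on when building its finite difference rules:
factorial(1:2:7)
% ans = [1 6 120 5040]
% A recursive scalar version, x = n*factorial(n-1), fails on the vector
% input 1:2:(2*nterms) that derivest passes inside fdamat.
```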

John

Comment only
27 Jun 2009 Benn Eifert

Benn Eifert (view profile)

help! just downloaded the toolbox and tried to run the most basic example in the derivest documentation:

[d,e]=derivest(@(x) exp(x),1)

and got a bunch of errors of this form:

Error in ==> factorial at 8
x = n*factorial(n-1);

Error in ==> factorial at 8
x = n*factorial(n-1);

Error in ==> derivest>fdamat at 564
c = 1./factorial(1:2:(2*nterms));

Error in ==> derivest at 283
fdarule = [0 1 0]/fdamat(par.StepRatio,1,3);

Comment only
16 Feb 2009 Serter YILMAZ

Serter YILMAZ (view profile)

Works really fine! very helpful! Thanks...

16 Feb 2009 Mark A. Mikofski

Mark A. Mikofski (view profile)

This worked better than fminsearch when used as a damped newton method for finding a root with extremely high sensitivity. fminsearch was overly slow, and often overshot the root, even though the tolerance was high, and fzero didn't work at all. Derivest was fast.
This is the second time I've used great code from John D'Erico.
Thanks!

13 Jan 2009 John D'Errico

John D'Errico (view profile)

I've just uploaded a new release that repairs the problem of complex domain differentiation. For example, the current release now gets this correct:

exp(1+i)
ans =
1.4687 + 2.2874i

derivest(@(z) exp(z),1+i)
ans =
1.4687 + 2.2874i

Comment only
27 Dec 2008 liuxbin xbin

liuxbin xbin (view profile)

When I use this function to get the derivative at a complex number, a problem appears. For example:
derivest(@(z) exp(z),1+i)
ans =
1.46869393991590 - 2.28735528717885i
However, the derivative of exp(z) at a complex z should be exp(z):
exp(1+i)
ans =
1.46869393991589 + 2.28735528717884i
which is the conjugate of the result above.

Comment only
08 Aug 2008 Golam Kibria Chowdhury

Rock 'n' Roll

06 Mar 2008 Scott Miller

Works just great. I was in a situation where using hessiancs was problematic because the functions being evaluated had a transiently complex nature. hessiancs works well with functions that never use any complex arithmetic, but watch out for using complex steps if the internals use any complex arithmetic, even if the inputs and outputs are both real. Downloaded the package, changed function names, and it worked right out of the box - slower, but 'bullet-proof' sometimes beats 'fast'. Good documentation, etc., too. I appreciate that hessiancs used a similar form to the one you used - I'll go back to it when I need more speed and have the time to fix my loss functions.

Scott

08 Feb 2008 John D'Errico

Can anyone let me know what conflicted with lightspeed? I'll try to fix it if possible. It might be a function name that I chose. Otherwise, this is just basic Matlab code.

Comment only
08 Feb 2008 Matlab User

Very useful! I was able to use this to check my symbolically derived hessian matrix in an optimization routine... optimset('DerivativeCheck','on') checks the gradient numerically, but apparently not the hessian.

In case anyone else has a problem: I found that this was incompatible with the MATLAB Lightspeed toolbox, but that is no doubt the Lightspeed toolbox's fault. Removing Lightspeed fixed the issue.

20 Dec 2007 Johannes Wunsch
07 Oct 2007 Arun kumar

This website is very helpful :)

01 Sep 2007 serge koulayev

I sometimes get this error:
??? Improper assignment with rectangular empty matrix.

Error in ==> derivest at 457
[errest(i),ind] = min(errors);

Why would that be?

Comment only
02 Apr 2007 Zahid Ullah Khan

The best!

27 Dec 2006

Minor changes

08 Feb 2007

Added an option for differentiating a non-vectorized function.

12 Feb 2007

Fixed a bug in derivest which sometimes occurred for very small values of the variable.

08 Mar 2007

05 Sep 2007

Repaired a bug that occurred when operating at fixed step sizes with the 2nd order method. Removed Mac-origin .DS_Store files.

01 Jun 2011 1.5

Repaired problem when the point to evaluate the derivative happens to be the same as the period of a periodic function.

03 Dec 2014 1.6

Flagged as a toolbox.