UseParallel for hessian?
Will MATLAB at some point support parallel computation of the finite-difference Hessian? More specifically, I've been using UseParallel in my fminunc settings (for a problem with a lot of parameters), but computing the Hessian takes a fair amount of time.
4 Comments
ruobing han
on 2 Oct 2022
Same question here, have you figured it out?
Matt J
on 2 Oct 2022
What makes you suppose UseParallel applies to gradient, but not to Hessian computations?
Walter Roberson
on 2 Oct 2022
The option description is "When true, fminunc estimates gradients in parallel." But that refers to gradients, not the Hessian.
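For reference, a minimal sketch of the kind of setup being discussed, assuming Optimization Toolbox is installed (and Parallel Computing Toolbox for UseParallel to have any effect). The objective and starting point are toy placeholders, not taken from the original question.

```matlab
% Sketch only: quasi-newton fminunc with parallel finite-difference gradients.
% The objective below is a hypothetical stand-in for an expensive function.
fun  = @(x) sum((x - [1; 2; 3]).^2);
x0   = zeros(3, 1);
opts = optimoptions('fminunc', 'Algorithm', 'quasi-newton', ...
                    'UseParallel', true, 'Display', 'off');
[x, fval] = fminunc(fun, x0, opts);   % gradients are estimated by finite differences
```

Per the option description quoted above, the parallelism here applies to the gradient estimates.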
Jonne Guyt
on 4 Oct 2022
Answers (1)
I don't speak for MathWorks, but I think the issue is that finite difference Hessians are only relevant to the trust-region algorithm, since the quasi-newton algorithm does not use Hessian computations. But in the trust-region algorithm, the user is required to provide an analytical gradient computation via SpecifyObjectiveGradient=true. It seems a rather narrow use case that an analytical gradient calculation would be tractable, but not an analytical Hessian computation, assuming the memory footprint of such a matrix is not prohibitive. If the memory footprint of the Hessian is prohibitive, the user is meant to use the HessianMultiplyFcn or HessPattern options.
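As an illustration of the trust-region configuration described here, a small sketch assuming Optimization Toolbox. The function name rosenWithGrad is hypothetical; it supplies the analytical gradient, and no Hessian function is given. In a script file, the local function goes at the end.

```matlab
% Sketch only: trust-region fminunc with a user-supplied analytical gradient.
x0   = [-1; 2];
opts = optimoptions('fminunc', 'Algorithm', 'trust-region', ...
                    'SpecifyObjectiveGradient', true, 'Display', 'off');
% The sixth output is the Hessian reported by fminunc at the solution.
[x, fval, exitflag, output, grad, hess] = fminunc(@rosenWithGrad, x0, opts);

function [f, g] = rosenWithGrad(x)
    % Rosenbrock function and its analytical gradient (hypothetical helper).
    f = 100*(x(2) - x(1)^2)^2 + (1 - x(1))^2;
    if nargout > 1
        g = [-400*x(1)*(x(2) - x(1)^2) - 2*(1 - x(1));
              200*(x(2) - x(1)^2)];
    end
end
```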
10 Comments
Jonne Guyt
on 4 Oct 2022
Edited: Jonne Guyt
on 4 Oct 2022
Matt J
on 4 Oct 2022
"but using a quasi-newton algorithm (fminunc) as well as an interior-point (fmincon) will use finite differences to calculate the hessian and these are not fringe/edge cases."
No, fminunc's quasi-newton algorithm does not do a full Hessian computation. Only gradients are used.
fmincon's interior-point algorithm does have an option to compute the Hessian by finite differences but, similar to fminunc's trust-region algorithm, it requires the user to supply an analytical gradient computation, which means the Hessian is likely to be analytically tractable as well (or at least I've yet to see a counter-example). So, I wonder why this option would ever be used.
Jonne Guyt
on 4 Oct 2022
As I've been saying, not supplying gradients is common, but not in the specific algorithms where full Hessians are used.
Another reason why finite difference Hessians may be discouraged is that the Hessian needs to be inverted, which can be sensitive to finite differencing errors.
Bruno Luong
on 4 Oct 2022
Edited: Bruno Luong
on 4 Oct 2022
@Jonne Guyt "but using a quasi-newton algorithm (fminunc) as well as an interior-point (fmincon) will use finite differences to calculate the hessian and these are not fringe/edge cases."
This statement is wrong. If the Hessian function is not provided by the user, the quasi-Newton Hessian used by both algorithms results from bookkeeping of the gradients evaluated at different points. There is no need for finite differences on top of the gradient.
The documentation mentions the "sparse finite difference algorithm on the gradients" as being performed only in the trust-region algorithm, as Matt correctly stated.
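To make the "bookkeeping of the gradients" concrete, here is one textbook BFGS update on a toy quadratic. This is a generic illustration, not necessarily the exact update MathWorks implements; all variable names and values are assumptions.

```matlab
% Sketch only: one textbook BFGS update of a Hessian approximation B,
% built from gradients at two points. No finite differencing is involved.
Q     = [4 1; 1 3];            % Hessian of the toy quadratic f(x) = 0.5*x'*Q*x
gradf = @(x) Q*x;              % gradient of the toy quadratic
B     = eye(2);                % initial Hessian approximation
x_old = [1; 1];
x_new = [0.5; 0.2];            % hypothetical next iterate
s     = x_new - x_old;                 % step taken
y     = gradf(x_new) - gradf(x_old);   % change in gradient
Bs    = B*s;
B_new = B - (Bs*Bs.')/(s.'*Bs) + (y*y.')/(y.'*s);
```

By construction, B_new satisfies the secant condition B_new*s = y, which is how curvature information accumulates from gradient differences alone.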
Jonne Guyt
on 4 Oct 2022
"if you use fminunc or fmincon and do not supply gradients/hessians, but do ask for the hessian to be calculated and if it is calculated via finite-differences."
That case does not exist in fmincon/fminunc. There is no fmincon/fminunc algorithm that performs a finite difference Hessian calculation when an analytical gradient is not provided.
So the only question is, do you know of a case where it would make sense to supply an analytical gradient, but not an analytical Hessian?
"This use case is common within for example discrete choice modeling. The hessian is used to calculate the standard errors, so you do need it and there's no simple way to get it via alternative means"
If you need the Hessian for the purposes of computing standard errors (and not iterative optimization), then I agree it may make sense to have a parallelized finite differencer for that. However, it is not clear why that belongs in fminunc/fmincon. Because the Hessian is not being recomputed iteratively, you would use a standalone Hessian computing routine for that.
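A rough sketch of what such a standalone routine might look like, using central differences with one Hessian column per parfor iteration (parfor falls back to serial execution without Parallel Computing Toolbox). The normal-model likelihood, the data, and the step size h are all assumptions chosen so the result can be checked by hand.

```matlab
% Sketch only: central-difference Hessian of a negative log-likelihood,
% computed column by column in parfor, then used for standard errors.
y     = [1; 2; 3; 4; 5];                    % toy data (hypothetical)
n_obs = numel(y);
% Parameters p = [mu; s] of a normal model; additive constants are dropped.
nll   = @(p) n_obs*log(p(2)) + sum((y - p(1)).^2) / (2*p(2)^2);
% Closed-form MLE stands in for the optimizer's solution here.
pHat  = [mean(y); sqrt(mean((y - mean(y)).^2))];
n     = numel(pHat);
h     = 1e-4;                               % step size: an assumption, scale to the problem
E     = eye(n);
H     = zeros(n);
parfor j = 1:n
    col = zeros(n, 1);
    ej  = E(:, j);
    for i = 1:n
        ei = E(:, i);
        col(i) = (nll(pHat + h*ei + h*ej) - nll(pHat + h*ei - h*ej) ...
                - nll(pHat - h*ei + h*ej) + nll(pHat - h*ei - h*ej)) / (4*h^2);
    end
    H(:, j) = col;                          % sliced output, one column per iteration
end
H  = (H + H.') / 2;                         % symmetrize against rounding noise
se = sqrt(diag(H \ eye(n)));                % standard errors from the inverse Hessian
```

For this toy model the analytical Hessian at the MLE is diag(2.5, 5), so the finite-difference result is easy to sanity-check. For an expensive objective, the four evaluations per entry are where the parallel speedup would come from.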
Bruno Luong
on 4 Oct 2022
And furthermore, the Hessian returned by minimization algorithms is usually NOT suitable for computing error standard deviations.
Jonne Guyt
on 4 Oct 2022