Automatic differentiation makes it easier to create custom training loops, custom layers, and other deep learning customizations.

Generally, the simplest way to customize deep learning training is to create a `dlnetwork`. Include the layers you want in the network. Then perform training in a custom loop by using some sort of gradient descent, where the gradient is the gradient of the objective function. The objective function can be classification error, cross-entropy, or any other relevant scalar function of the network weights. See List of Functions with dlarray Support.

This example is a high-level version of a custom training loop. Here, `f` is the objective function, such as loss, and `g` is the gradient of the objective function with respect to the weights in the network `net`. The `update` function represents some type of gradient descent.


```matlab
% High-level training loop
n = 1;
while (n < nmax)
    [f,g] = dlfeval(@model,net,dlX,t);
    net = update(net,g);
    n = n + 1;
end
```
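
The `update` step is left abstract here. As a minimal sketch, assuming plain stochastic gradient descent with a hypothetical fixed learning rate `lr`, you could write it using `dlupdate`:

```matlab
function net = update(net,g)
    % Minimal SGD sketch: move each learnable parameter against its gradient
    lr = 0.01; % hypothetical learning rate, not part of the original example
    net.Learnables = dlupdate(@(w,dw) w - lr*dw, net.Learnables, g);
end
```

In practice, you might use a built-in solver function such as `sgdmupdate` or `adamupdate` instead.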

You call `dlfeval` to compute the numeric value of the objective and gradient. To enable the automatic computation of the gradient, the data `dlX` must be a `dlarray`.

```matlab
dlX = dlarray(X);
```
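
Depending on the network input, `forward` can require a formatted `dlarray`. Assuming `X` holds a batch of images, one possibility is:

```matlab
% 'SSCB' labels the dimensions as spatial, spatial, channel, batch
% (an assumed layout for image data)
dlX = dlarray(X,'SSCB');
```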

The objective function has a `dlgradient` call to calculate the gradient. The `dlgradient` call must be inside the function that `dlfeval` evaluates.

```matlab
function [f,g] = model(net,dlX,T)
    % Calculate objective using supported functions for dlarray
    y = forward(net,dlX);
    f = fcnvalue(y,T);                % crossentropy or similar
    g = dlgradient(f,net.Learnables); % Automatic gradient
end
```
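
Here `fcnvalue` is a placeholder. As a concrete sketch, assuming `dlX` is a formatted `dlarray`, the network output contains predicted probabilities (for example, from a softmax layer), and `T` holds one-hot encoded targets, the built-in `crossentropy` loss could fill that role:

```matlab
function [f,g] = model(net,dlX,T)
    y = forward(net,dlX);             % forward pass; y is a formatted dlarray
    f = crossentropy(y,T);            % scalar loss; assumes y holds probabilities
    g = dlgradient(f,net.Learnables); % automatic gradient
end
```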

For an example using a `dlnetwork` with a simple `dlfeval`-`dlgradient`-`dlarray` syntax, see Grad-CAM Reveals the Why Behind Deep Learning Decisions. For a more complex example using a custom training loop, see Train Generative Adversarial Network (GAN). For further details on custom training using automatic differentiation, see Define Custom Training Loops, Loss Functions, and Networks.

Use `dlgradient` and `dlfeval` Together for Automatic Differentiation

To use automatic differentiation, you must call `dlgradient` inside a function and evaluate the function using `dlfeval`. Represent the point where you take a derivative as a `dlarray` object, which manages the data structures and enables tracing of evaluation. For example, the Rosenbrock function is a common test function for optimization.

```matlab
function [f,grad] = rosenbrock(x)
    f = 100*(x(2) - x(1).^2).^2 + (1 - x(1)).^2;
    grad = dlgradient(f,x);
end
```

Calculate the value and gradient of the Rosenbrock function at the point `x0` = [–1,2]. To enable automatic differentiation in the Rosenbrock function, pass `x0` as a `dlarray`.

```matlab
x0 = dlarray([-1,2]);
[fval,gradval] = dlfeval(@rosenbrock,x0)
```

```
fval =

  1x1 dlarray

   104

gradval =

  1x2 dlarray

   396   200
```

For an example using automatic differentiation, see Grad-CAM Reveals the Why Behind Deep Learning Decisions.

To evaluate a gradient numerically, a `dlarray` constructs a data structure for reverse mode differentiation, as described in Automatic Differentiation Background. This data structure is the *trace* of the derivative computation. Keep in mind these guidelines when using automatic differentiation and the derivative trace:

- Do not introduce a new `dlarray` inside of an objective function calculation and attempt to differentiate with respect to that object. For example:

  ```matlab
  function [dy,dy1] = fun(x1)
      x2 = dlarray(0);
      y = x1 + x2;
      dy = dlgradient(y,x2);  % Error: x2 is untraced
      dy1 = dlgradient(y,x1); % No error even though y has an untraced portion
  end
  ```

- Do not use `extractdata` with a traced argument. Doing so breaks the tracing. For example:

  ```matlab
  fun = @(x)dlgradient(x + atan(extractdata(x)),x);
  % Gradient for any point is 1 due to the leading 'x' term in fun.
  dlfeval(fun,dlarray(2.5))
  ```

  ```
  ans =

    1x1 dlarray

     1
  ```

  However, you can use `extractdata` to introduce a new independent variable from a dependent one (see the first sketch after this list).

- Use only supported functions. See List of Functions with dlarray Support. To use an unsupported function *f*, try to implement *f* using supported functions.

- You can evaluate gradients using automatic differentiation only for scalar-valued functions. Intermediate calculations can have any number of variables, but the final function value must be scalar. If you need to take derivatives of a vector-valued function, take derivatives of one component at a time. In this case, consider setting the `dlgradient` `'RetainData'` name-value pair argument to `true` (see the second sketch after this list).

- A call to `dlgradient` evaluates derivatives at a particular point. The software generally makes an arbitrary choice for the value of a derivative when there is no theoretical value. For example, the `relu` function, `relu(x) = max(x,0)`, is not differentiable at `x = 0`. However, `dlgradient` returns a value for the derivative.

  ```matlab
  x = dlarray(0);
  y = dlfeval(@(t)dlgradient(relu(t),t),x)
  ```

  ```
  y =

    1x1 dlarray

     0
  ```

  The value at the nearby point `eps` is different.

  ```matlab
  x = dlarray(eps);
  y = dlfeval(@(t)dlgradient(relu(t),t),x)
  ```

  ```
  y =

    1x1 dlarray

     1
  ```

- Currently, `dlarray` does not allow higher order derivatives. In other words, you cannot calculate a second derivative by calling `dlgradient` twice.
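
To illustrate the `extractdata` guideline, here is a minimal sketch of introducing a new independent variable from a dependent one. The function `fun` and its values are illustrative assumptions, not from the original examples:

```matlab
function [f,dfdx1] = fun(x1)
    % Detach the current value of x1 and re-wrap it; x2 is a new,
    % independent variable with no trace back to x1.
    x2 = dlarray(extractdata(x1));
    f = x1.^2 + 2*x2;         % the 2*x2 branch is constant with respect to x1
    dfdx1 = dlgradient(f,x1); % equals 2*x1; the untraced branch contributes 0
end
```

Evaluating `dlfeval(@fun,dlarray(3))` returns `f = 15` and `dfdx1 = 6` under these assumptions.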
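
Similarly, for the scalar-valued-function guideline, this sketch takes derivatives of a two-component function one component at a time. Setting `'RetainData'` to `true` preserves the trace for the second `dlgradient` call; the function `jacobianByRows` is a hypothetical name:

```matlab
function J = jacobianByRows(x)
    % Each dlgradient call differentiates one scalar component
    y1 = x(1).^2 + x(2);
    y2 = x(1).*x(2);
    row1 = dlgradient(y1,x,'RetainData',true); % keep the trace alive
    row2 = dlgradient(y2,x);                   % second call reuses the trace
    J = [row1; row2];                          % Jacobian assembled row by row
end
```

For example, `dlfeval(@jacobianByRows,dlarray([2 3]))` returns `[4 1; 3 2]` under these assumptions.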

See Also: `dlarray` | `dlfeval` | `dlgradient` | `dlnetwork`