dlode45

Deep learning solution of nonstiff ordinary differential equation (ODE)

Description

The neural ordinary differential equation (ODE) operation returns the solution of a specified ODE.

The dlode45 function applies the neural ODE operation to dlarray data. Using dlarray objects makes working with high dimensional data easier by allowing you to label the dimensions. For example, you can label which dimensions correspond to spatial, time, channel, and batch dimensions using the "S", "T", "C", and "B" labels, respectively. For unspecified and other dimensions, use the "U" label. For dlarray object functions that operate over particular dimensions, you can specify the dimension labels by formatting the dlarray object directly, or by using the DataFormat option.

Note

The dlode45 function best suits neural ODE and custom training loop workflows. To solve ODEs for other workflows, use ode45.

example

dlY = dlode45(odefun,tspan,dlY0,theta) integrates the system of ODEs given by odefun on the time interval defined by the first and last elements of tspan, with the initial conditions dlY0 and parameters theta.

dlY = dlode45(odefun,tspan,dlY0,theta,DataFormat=FMT) specifies the data format for the unformatted initial conditions dlY0. The format must contain "S" (spatial), "C" (channel), and "B" (batch) dimension labels only.

dlY = dlode45(odefun,tspan,dlY0,theta,Name=Value) specifies additional options using one or more name-value arguments. For example, dlY = dlode45(odefun,tspan,dlY0,theta,GradientsMode="adjoint") integrates the system of ODEs given by odefun and computes gradients by solving the associated adjoint ODE system.

Examples

collapse all

For the initial conditions, create a formatted dlarray object containing a batch of 128 28-by-28 images with 64 channels. Specify the format "SSCB" (spatial, spatial, channel, batch).

miniBatchSize = 128;
inputSize = [28 28];
numChannels = 64;
Y0 = rand(inputSize(1),inputSize(2),numChannels,miniBatchSize);
dlY0 = dlarray(Y0,"SSCB");

View the size and format of the initial conditions.

size(dlY0)
ans = 1×4

28    28    64   128

dims(dlY0)
ans =
'SSCB'

Specify the ODE function. Define the function odeModel, listed in the ODE Function section of the example, which applies a convolution operation followed by a hyperbolic tangent operation to the input data.

odefun = @odeModel;

Initialize the parameters for the convolution operation in the ODE function. The output size of the ODE function must match the size of the initial conditions, so specify the same number of filters as the number of input channels.

filterSize = [3 3];
numFilters = numChannels;

parameters = struct;
parameters.Weights = dlarray(rand(filterSize(1),filterSize(2),numChannels,numFilters));
parameters.Bias = dlarray(zeros(1,numFilters));

Specify an interval of integration of [0 0.1].

tspan = [0 0.1];

Apply the neural ODE operation.

dlY = dlode45(odefun,tspan,dlY0,parameters);

View the size and format of the output.

size(dlY)
ans = 1×4

28    28    64   128

dims(dlY)
ans =
'SSCB'

ODE Function

The ODE function odeModel takes as input the function inputs t (unused) and y, and the ODE function parameters p containing the convolution weights and biases, and returns the output of the convolution-tanh block operation. The convolution operation applies padding such that the output size matches the input size.

function z = odeModel(t,y,p)

weights = p.Weights;
bias = p.Bias;
z = tanh(z);

end

Input Arguments

collapse all

Function to solve, specified as a function handle that defines the function to integrate.

Specify odefun as a function handle with syntax z = fcn(t,y,p), where t is a scalar, y is a dlarray, and p is a set of parameters. The function returns a dlarray with the same size and format as y. The function must accept all three input arguments t, y, and p, even if not all the arguments are used in the function. The size of the ODE function output z must match the size of the initial conditions.

For example, specify the ODE function that applies a convolution operation followed by a tanh operation.

function z = dlconvtanh(t,y,p)

weights = p.Weights;
bias = p.Bias;
z = tanh(z);

end
Note here that the t argument is unused.

Data Types: function_handle

Interval of integration, specified as a numeric vector or an unformatted dlarray vector with two or more elements. The elements in tspan must be all increasing or all decreasing.

The solver imposes the initial conditions given by dlY0 at the initial time tspan(1), then integrates the ODE function from tspan(1) to tspan(end).

• If tspan has two elements, [t0 tf], then the solver returns the solution evaluated at point tf.

• If tspan has more than two elements, [t0 t1 t2 ... tf], then the solver returns the solution evaluated at the given points [t1 t2 ... tf]. The solver does not step precisely to each point specified in tspan. Instead, the solver uses its own internal steps to compute the solution, then evaluates the solution at the points specified in tspan. The solutions produced at the specified points are of the same order of accuracy as the solutions computed at each internal step.

Specifying several intermediate points has little effect on the efficiency of computation, but for large systems it can affect memory management.

Note

The behavior of the dlode45 function differs from the ode45 function.

If InitialStep or MaxStep is [], then the software uses the values of tspan to initialize the values.

• If InitialStep is [], then the software uses the elements of tspan as an indication of the scale of the task. When you specify tspan with different numbers of elements, the solution of the solver can change.

• If MaxStep is [], then the software calculates the maximum step size using the first and last elements of tspan. When you change the initial or final values of tspan, the solution of the solver can change because the solver uses a different step sequence.

Initial conditions, specified as a formatted or unformatted dlarray object.

If dlY0 is an unformatted dlarray, then you must specify the format using the DataFormat option.

For neural ODE operations, the data format must contain "S", "C", and "B" dimension labels only. The initial conditions must not have a "T" or "U" dimension.

Parameters of ODE function, specified as one of the following:

• dlarray object

• Cell array of dlarray objects

• Structure of dlarray objects or nested structures of dlarray objects

• Table with the variables Layer, Parameter, and Value, where Layer and Parameter contain the layer and parameter names, and Value contains the parameter value. Specify the variables as dlarray objects.

Name-Value Arguments

Specify optional pairs of arguments as Name1=Value1,...,NameN=ValueN, where Name is the argument name and Value is the corresponding value. Name-value arguments must appear after other arguments, but the order of the pairs does not matter.

Dimension order of unformatted input data, specified as a character vector or string scalar FMT that provides a label for each dimension of the data.

When you specify the format of a dlarray object, each character provides a label for each dimension of the data and must be one of the following:

• "S" — Spatial

• "C" — Channel

• "B" — Batch (for example, samples and observations)

• "T" — Time (for example, time steps of sequences)

• "U" — Unspecified

For neural ODE operations, the data format must contain "S", "C", and "B" dimension labels only. The initial conditions must not have a "T" or "U" dimension.

You must specify DataFormat when the dlY0 is not a formatted dlarray.

Example: DataFormat="SSCB"

Data Types: char | string

Method to compute gradients with respect to the initial conditions and parameters when using the dlgradient function, specified as one of the following:

• "direct" – Compute gradients by backpropagating through the operations undertaken by the numerical solver. This option best suits large mini-batch sizes or when tspan contains many values.

• "adjoint" – Compute gradients by solving the associated adjoint ODE system. This option best suits small mini-batch sizes or when tspan contains a small number of values.

Warning

The dlaccelerate function does not support accelerating the dlode45 function when the GradientMode option is "direct". The resulting accelerated function might return unexpected results. To accelerate the code that calls the dlode45 function, set the GradientMode option to "adjoint" or accelerate parts of your code that do not call the dlode45 with the GradientMode option set to "direct".

Initial step size, specified as a positive scalar or [].

If InitialStepSize is [], then the function automatically determines the initial step size based on the interval of integration and the output of the ODE function corresponding to the initial conditions.

Data Types: single | double | int8 | int16 | int32 | int64 | uint8 | uint16 | uint32 | uint64

Maximum step size, specified as a positive scalar or [].

If MaxStepSize is [], then the function uses a tenth of the interval of integration size.

Data Types: single | double | int8 | int16 | int32 | int64 | uint8 | uint16 | uint32 | uint64

Relative error tolerance, specified as a positive scalar. The relative tolerance applies to all components of the solution.

Data Types: single | double | int8 | int16 | int32 | int64 | uint8 | uint16 | uint32 | uint64

Absolute error tolerance, specified as a positive scalar. The relative tolerance applies to all components of the solution.

Data Types: single | double | int8 | int16 | int32 | int64 | uint8 | uint16 | uint32 | uint64

Output Arguments

collapse all

Solution of the neural ODE at the times given by tspan(2:end), returned as a dlarray object with the same underlying data type as dlY0.

If dlY0 is a formatted dlarray and tspan contains exactly two elements, then dlY has the same format as dlY0. If dlY0 is not a formatted dlarray and tspan contains exactly two elements, then dlY is an unformatted dlarray with the same dimension order as dlY0.

If dlY0 is a formatted dlarray and tspan contains more than two elements, then dlY has the same format as dlY0 with an additional appended "T" (time) dimension. If dlY0 is not a formatted dlarray and tspan contains more than two elements, then dlY is an unformatted dlarray with the same dimension order as dlY0 with an additional appended dimension corresponding to time.

Algorithms

The neural ordinary differential equation (ODE) operation returns the solution of a specified ODE. In particular, given an input, a neural ODE operation outputs the numerical solution of the ODE $y\prime =f\left(t,y,\theta \right)$ for the time horizon (t0,t1) and with the initial condition y(t0) = y0, where t and y denote the ODE function inputs and θ is a set of learnable parameters. Typically, the initial condition y0 is either the network input or the output of another deep learning operation.

The dlode45 function uses the ode45 function, which is based on an explicit Runge-Kutta (4,5) formula, the Dormand-Prince pair. It is a single-step solver–in computing y(tn), it needs only the solution at the immediately preceding time point, y(tn-1)  .

 Chen, Ricky T. Q., Yulia Rubanova, Jesse Bettencourt, and David Duvenaud. “Neural Ordinary Differential Equations.” Preprint, submitted June 19, 2018. https://arxiv.org/abs/1806.07366.

 Dormand, J. R., and P. J. Prince. “A Family of Embedded Runge-Kutta Formulae.” Journal of Computational and Applied Mathematics 6, no. 1 (March 1980): 19–26. https://doi.org/10.1016/0771-050X(80)90013-3.

 Shampine, Lawrence F., and Mark W. Reichelt. “The MATLAB ODE Suite.” SIAM Journal on Scientific Computing 18, no. 1 (January 1997): 1–22. https://doi.org/10.1137/S1064827594276424.