Create and Train DAG Network for Deep Learning

This example shows how to create and train a directed acyclic graph (DAG) network for deep learning. A DAG network is a neural network whose layers are arranged as a directed acyclic graph, allowing a more complex architecture in which a layer can receive inputs from, and send outputs to, multiple layers.

To create and train a DAG network:

  • Create a LayerGraph object using layerGraph. The layer graph specifies the network architecture. You can create an empty layer graph and then add layers to it. You can also create a layer graph directly from an array of network layers. The layers in the graph are automatically connected sequentially.

  • Add layers to the layer graph using addLayers and remove layers from the graph using removeLayers.

  • Connect layers of the layer graph using connectLayers and disconnect layers using disconnectLayers. (For a minimal sketch of these graph-editing functions, see the example after this list.)

  • Plot the network architecture using plot.

  • Train the network using the layer graph as the layers input argument to trainNetwork. The trained network is a DAGNetwork object.

  • Perform classification and prediction on new data using classify and predict.
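
Of these functions, removeLayers and disconnectLayers do not appear in the worked example below. The following minimal sketch shows them on a hypothetical two-layer graph (the layer names 'in' and 'fc' are chosen for this illustration only):

lgraph = layerGraph();                                  % empty layer graph
lgraph = addLayers(lgraph,imageInputLayer([28 28 1],'Name','in'));
lgraph = addLayers(lgraph,fullyConnectedLayer(10,'Name','fc'));
lgraph = connectLayers(lgraph,'in','fc');               % connect layers by name
lgraph = disconnectLayers(lgraph,'in','fc');            % undo the connection
lgraph = removeLayers(lgraph,'fc');                     % remove a layer by name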

You can also load a pretrained DAG network by installing the Neural Network Toolbox™ Model for GoogLeNet Network add-on. For more information, see googlenet.
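
Assuming the add-on is installed, loading the pretrained network takes a single call:

net = googlenet;      % returns a pretrained DAGNetwork (requires the add-on)
net.Layers            % inspect the layers of the pretrained network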

Create Simple DAG Network

Create a simple directed acyclic graph (DAG) network for deep learning. Train the network to classify images of digits. The simple network in this example consists of:

  • A main branch with layers connected sequentially.

  • A shortcut connection containing a single 1-by-1 convolutional layer. Shortcut connections enable the parameter gradients to flow more easily from the output layer to the earlier layers of the network.

Create the main branch of the network as a layer array. The addition layer sums multiple inputs element-wise. Specify the number of inputs that the addition layer should sum. All layers must have names and all names must be unique.

layers = [
    imageInputLayer([28 28 1],'Name','input')
    
    convolution2dLayer(5,16,'Padding','same','Name','conv_1')
    batchNormalizationLayer('Name','BN_1')
    reluLayer('Name','relu_1')
    
    convolution2dLayer(3,32,'Padding','same','Stride',2,'Name','conv_2')
    batchNormalizationLayer('Name','BN_2')
    reluLayer('Name','relu_2')
    convolution2dLayer(3,32,'Padding','same','Name','conv_3')
    batchNormalizationLayer('Name','BN_3')
    reluLayer('Name','relu_3')
    
    additionLayer(2,'Name','add')
    
    averagePooling2dLayer(2,'Stride',2,'Name','avpool')
    fullyConnectedLayer(10,'Name','fc')
    softmaxLayer('Name','softmax')
    classificationLayer('Name','classOutput')];

Create a layer graph from the layer array. layerGraph connects all the layers in layers sequentially. Plot the layer graph.

lgraph = layerGraph(layers);
figure
plot(lgraph)

Create the 1-by-1 convolutional layer and add it to the layer graph. Specify the number of convolutional filters and the stride so that the activation size matches the activation size of the 'relu_3' layer: because 'conv_2' downsamples the 28-by-28 input to 14-by-14 with a stride of 2, and 'conv_3' outputs 32 channels, 'skipConv' also needs a stride of 2 and 32 filters to produce matching 14-by-14-by-32 activations. This enables the addition layer to add the outputs of the 'skipConv' and 'relu_3' layers. To check that the layer has been added, plot the layer graph.

skipConv = convolution2dLayer(1,32,'Stride',2,'Name','skipConv');
lgraph = addLayers(lgraph,skipConv);
figure
plot(lgraph)

Create the shortcut connection from the 'relu_1' layer to the 'add' layer. Because you specified the number of inputs to the addition layer to be two when you created the layer, the layer has two inputs with the names 'in1' and 'in2'. The 'relu_3' layer is already connected to the 'in1' input. Connect the 'relu_1' layer to the 'skipConv' layer and the 'skipConv' layer to the 'in2' input of the 'add' layer. The addition layer now sums the outputs of the 'relu_3' and 'skipConv' layers. To check that the layers are correctly connected, plot the layer graph.

lgraph = connectLayers(lgraph,'relu_1','skipConv');
lgraph = connectLayers(lgraph,'skipConv','add/in2');
figure
plot(lgraph)

Load training and validation data, consisting of 28-by-28 grayscale images of digits.

[XTrain,YTrain] = digitTrain4DArrayData;
[XValidation,YValidation] = digitTest4DArrayData;

Specify training options and train the network. trainNetwork validates the network using the validation data every ValidationFrequency iterations.

options = trainingOptions('sgdm',...
    'MaxEpochs',6,...
    'Shuffle','every-epoch',...
    'ValidationData',{XValidation,YValidation},...
    'ValidationFrequency',20,...
    'Verbose',false,...
    'Plots','training-progress');
net = trainNetwork(XTrain,YTrain,lgraph,options);

The trained network is a DAGNetwork object.

net
net = 
  DAGNetwork with properties:

         Layers: [16x1 nnet.cnn.layer.Layer]
    Connections: [16x2 table]

Classify the validation images and calculate the accuracy.

YPredicted = classify(net,XValidation);
accuracy = mean(YPredicted == YValidation)
accuracy = 0.9946
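
classify returns predicted labels directly. To work with the underlying class scores instead, you can call predict (a brief sketch; predict returns one row of class probabilities per image):

scores = predict(net,XValidation);    % N-by-10 matrix of class probabilities
[~,idx] = max(scores,[],2);           % index of the highest-scoring class per image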
