trainNetwork

Train a convolutional neural network

Use trainNetwork to train your convolutional neural network (ConvNet, CNN) for a classification or regression problem after defining the layers of your network and specifying the training options. You can train a ConvNet on a CPU, a GPU, multiple GPUs, or in parallel. Training on a GPU or in parallel requires the Parallel Computing Toolbox™. Using a GPU requires a CUDA®-enabled NVIDIA® GPU with compute capability 3.0 or higher. Specify the training parameters, including the execution environment, using the trainingOptions function.
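
For example, a minimal sketch of selecting the execution environment (this assumes your release's trainingOptions accepts the 'ExecutionEnvironment' name-value pair; adjust the value to match your hardware):

% Request GPU training; use 'cpu' if no supported GPU is available
options = trainingOptions('sgdm', ...
    'ExecutionEnvironment','gpu', ...
    'MaxEpochs',20, ...
    'InitialLearnRate',0.0001);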

Syntax

trainedNet = trainNetwork(imds,layers,options)
trainedNet = trainNetwork(X,Y,layers,options)
trainedNet = trainNetwork(tbl,layers,options)
trainedNet = trainNetwork(tbl,responseName,layers,options)
trainedNet = trainNetwork(tbl,responseNames,layers,options)
[trainedNet,traininfo] = trainNetwork(___)

Description

trainedNet = trainNetwork(imds,layers,options) returns a trained network for classification problems. imds stores the input image data, layers defines the convolutional neural network (ConvNet) architecture, and options defines the training options.

trainedNet = trainNetwork(X,Y,layers,options) returns a trained network for classification and regression problems. X contains the predictor variables and Y contains the categorical labels or numeric responses.

trainedNet = trainNetwork(tbl,layers,options) returns a trained network for classification and regression problems. tbl contains the predictors and the targets or response variables. The predictors must be in the first column of tbl. For information on the targets or response variables, see the tbl argument description.

trainedNet = trainNetwork(tbl,responseName,layers,options) returns a trained network for classification and regression problems. The predictors must be in the first column of tbl. The responseName argument specifies the response variable in the table tbl.

trainedNet = trainNetwork(tbl,responseNames,layers,options) returns a trained network for regression problems. The predictors must be in the first column of tbl. The responseNames argument specifies the response variables in the table tbl.

[trainedNet,traininfo] = trainNetwork(___) also returns information on the training for any of the input arguments.
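
For instance, a brief sketch of the two-output form, reusing the imds, layers, and options names from the first syntax:

[trainedNet,traininfo] = trainNetwork(imds,layers,options);
traininfo.TrainingLoss   % vector with one loss value per iteration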

Examples

Load the data as an ImageDatastore object.

digitDatasetPath = fullfile(matlabroot,'toolbox','nnet','nndemos',...
    'nndatasets','DigitDataset');
digitData = imageDatastore(digitDatasetPath,...
        'IncludeSubfolders',true,'LabelSource','foldernames');

The datastore contains 10,000 synthetic images of the digits 0-9. The images are generated by applying random transformations to digit images created with different fonts. Each digit image is 28-by-28 pixels.

Display some of the images in the datastore.

figure;
perm = randperm(10000,20);
for i = 1:20
    subplot(4,5,i);
    imshow(digitData.Files{perm(i)});
end

Check the number of images in each digit category.

digitData.countEachLabel
ans =

  10×2 table

    Label    Count
    _____    _____

    0        1000 
    1        1000 
    2        1000 
    3        1000 
    4        1000 
    5        1000 
    6        1000 
    7        1000 
    8        1000 
    9        1000 

The data contains an equal number of images per category.

Divide the data set so that each category in the training set has 750 images and the testing set has the remaining images from each label.

trainingNumFiles = 750;
rng(1) % For reproducibility
[trainDigitData,testDigitData] = splitEachLabel(digitData,...
    trainingNumFiles,'randomize');

splitEachLabel splits the image files in digitData into two new datastores, trainDigitData and testDigitData.

Define the convolutional neural network architecture.

layers = [imageInputLayer([28 28 1]);
          convolution2dLayer(5,20);
          reluLayer();
          maxPooling2dLayer(2,'Stride',2);
          fullyConnectedLayer(10);
          softmaxLayer();
          classificationLayer()];

Set the options to the default settings for stochastic gradient descent with momentum. Set the maximum number of epochs to 20, and start the training with an initial learning rate of 0.0001.

options = trainingOptions('sgdm','MaxEpochs',20,...
    'InitialLearnRate',0.0001);

Train the network.

convnet = trainNetwork(trainDigitData,layers,options);
Training on single GPU.
Initializing image normalization.
|=========================================================================================|
|     Epoch    |   Iteration  | Time Elapsed |  Mini-batch  |  Mini-batch  | Base Learning|
|              |              |  (seconds)   |     Loss     |   Accuracy   |     Rate     |
|=========================================================================================|
|            1 |            1 |         0.06 |       3.0845 |       13.28% |       0.0001 |
|            1 |           50 |         0.72 |       1.0945 |       65.63% |       0.0001 |
|            2 |          100 |         1.43 |       0.7276 |       74.22% |       0.0001 |
|            3 |          150 |         2.14 |       0.4743 |       83.59% |       0.0001 |
|            4 |          200 |         2.85 |       0.3086 |       91.41% |       0.0001 |
|            5 |          250 |         3.56 |       0.2324 |       92.97% |       0.0001 |
|            6 |          300 |         4.25 |       0.1542 |       97.66% |       0.0001 |
|            7 |          350 |         4.95 |       0.1315 |       97.66% |       0.0001 |
|            7 |          400 |         5.63 |       0.0944 |       96.09% |       0.0001 |
|            8 |          450 |         6.33 |       0.0668 |       98.44% |       0.0001 |
|            9 |          500 |         7.02 |       0.0458 |       99.22% |       0.0001 |
|           10 |          550 |         7.73 |       0.0544 |      100.00% |       0.0001 |
|           11 |          600 |         8.43 |       0.0660 |       99.22% |       0.0001 |
|           12 |          650 |         9.12 |       0.0338 |      100.00% |       0.0001 |
|           13 |          700 |         9.82 |       0.0340 |      100.00% |       0.0001 |
|           13 |          750 |        10.51 |       0.0370 |       99.22% |       0.0001 |
|           14 |          800 |        11.21 |       0.0264 |      100.00% |       0.0001 |
|           15 |          850 |        11.91 |       0.0182 |      100.00% |       0.0001 |
|           16 |          900 |        12.61 |       0.0234 |      100.00% |       0.0001 |
|           17 |          950 |        13.32 |       0.0224 |      100.00% |       0.0001 |
|           18 |         1000 |        14.01 |       0.0160 |      100.00% |       0.0001 |
|           19 |         1050 |        14.70 |       0.0233 |      100.00% |       0.0001 |
|           19 |         1100 |        15.39 |       0.0245 |      100.00% |       0.0001 |
|           20 |         1150 |        16.09 |       0.0154 |      100.00% |       0.0001 |
|           20 |         1160 |        16.23 |       0.0146 |      100.00% |       0.0001 |
|=========================================================================================|

Run the trained network on the test set that was not used to train the network and predict the image labels (digits).

YTest = classify(convnet,testDigitData);
TTest = testDigitData.Labels;

Calculate the accuracy.

accuracy = sum(YTest == TTest)/numel(TTest)
accuracy =

    0.9852

Accuracy is the ratio of the number of test-set labels that match the classifications from classify to the total number of images in the test data. In this case, about 98.5% of the predicted digits match the true digit values in the test set.

Load the training data.

load lettersTrainSet

XTrain contains 1500 28-by-28 grayscale images of the letters A, B, and C in a 4-D array. There are equal numbers of each letter in the data set. TTrain contains the categorical array of the letter labels.

Display some of the letter images.

figure;
perm = randperm(1500,20);
for i = 1:20
    subplot(4,5,i);
    imshow(XTrain(:,:,:,perm(i)));
end

Define the convolutional neural network architecture.

layers = [imageInputLayer([28 28 1]);
          convolution2dLayer(5,16);
          reluLayer();
          maxPooling2dLayer(2,'Stride',2);
          fullyConnectedLayer(3);
          softmaxLayer();
          classificationLayer()];

Set the options to the default settings for stochastic gradient descent with momentum.

options = trainingOptions('sgdm');

Train the network.

rng('default') % For reproducibility
net = trainNetwork(XTrain,TTrain,layers,options);
Training on single GPU.
Initializing image normalization.
|=========================================================================================|
|     Epoch    |   Iteration  | Time Elapsed |  Mini-batch  |  Mini-batch  | Base Learning|
|              |              |  (seconds)   |     Loss     |   Accuracy   |     Rate     |
|=========================================================================================|
|            1 |            1 |         3.55 |       1.0994 |       27.34% |       0.0100 |
|            5 |           50 |         4.66 |       0.2175 |       98.44% |       0.0100 |
|           10 |          100 |         5.42 |       0.0238 |      100.00% |       0.0100 |
|           14 |          150 |         6.18 |       0.0108 |      100.00% |       0.0100 |
|           19 |          200 |         6.93 |       0.0088 |      100.00% |       0.0100 |
|           23 |          250 |         7.68 |       0.0048 |      100.00% |       0.0100 |
|           28 |          300 |         8.44 |       0.0035 |      100.00% |       0.0100 |
|           30 |          330 |         8.88 |       0.0052 |      100.00% |       0.0100 |
|=========================================================================================|

Run the trained network on a test set that was not used to train the network and predict the image labels (letters).

load lettersTestSet;

XTest contains 1500 28-by-28 grayscale images of the letters A, B, and C in a 4-D array. There are again equal numbers of each letter in the data set. TTest contains the categorical array of the letter labels.

YTest = classify(net,XTest);

Calculate the accuracy.

accuracy = sum(YTest == TTest)/numel(TTest)
accuracy =

    0.9273

Input Arguments

Images, specified as an ImageDatastore object with categorical labels. You can use an ImageDatastore only for classification problems.

ImageDatastore allows batch reading of JPG or PNG image files using prefetching. If you use a custom function for reading the images, then prefetching does not occur. For more information about this data type, see ImageDatastore.
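
As a hedged sketch, the following builds a datastore with a custom read function; the folder path is hypothetical, and imresize requires Image Processing Toolbox:

imds = imageDatastore('pathToImageFolder', ...
    'IncludeSubfolders',true, ...
    'LabelSource','foldernames', ...
    'ReadFcn',@(file) imresize(imread(file),[28 28]));   % custom reader disables prefetching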

Images, specified as a 4-D numeric array. The first three dimensions must be the height, width, and channels, and the last dimension must index the individual images.

If there are NaNs in the array, then they are propagated through training; however, in most cases training fails to converge.

Data Types: single | double
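
For example, a sketch of the expected layout for 100 grayscale 28-by-28 images, using random data as a placeholder:

X = rand(28,28,1,100);   % height-by-width-by-channels-by-numImages
size(X)                  % returns 28 28 1 100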

Responses for a classification or a regression problem, specified as one of the following:

  • For a classification problem, Y is a categorical vector containing the image labels.

  • For a regression problem, Y can be an

    • n-by-r numeric matrix, where n is the number of observations and r is the number of responses

    • h-by-w-by-c-by-n numeric array, where n is the number of observations and h-by-w-by-c is the size of a single response.

Responses must not contain NaNs.

Data Types: categorical | double
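
To make both layouts concrete, here is a sketch using random placeholder data for 100 observations:

% Classification: one categorical label per image
Ycls = categorical(randi(10,100,1));   % 100 labels drawn from 10 classes

% Regression: n-by-r numeric matrix, here 100 observations and 2 responses
Yreg = rand(100,2);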

Input data, specified as a table. tbl must contain the predictors in the first column as either absolute or relative image paths or images. The type and location of the responses depend on the problem:

  • For a classification problem, the response must be a categorical variable containing labels for the images. If the name of the response variable is not specified in the call to trainNetwork, the responses must be in the second column. If the responses are in a different column of tbl, then you must specify the response variable name using the responseName positional argument.

  • For a regression problem, the responses must be numerical values in the column or columns after the first one. The responses can be either in multiple columns as scalars or in a single column as numeric vectors or cell arrays containing numeric 3-D arrays. When you do not specify the name of the response variable or variables, trainNetwork accepts the remaining columns of tbl as the response variables. You can specify the response variable names using the responseName positional argument.

Responses must not contain NaNs. If there are NaNs in the predictor data, then they are propagated through training; however, in most cases training fails to converge.

Data Types: table
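
The following sketch builds such a table for a classification problem; the file names are hypothetical, and 'labels' is the response variable name passed to the call:

files  = {'digits/img001.png'; 'digits/img002.png'; 'digits/img003.png'};
labels = categorical({'1'; '2'; '3'});
tbl    = table(files, labels);
trainedNet = trainNetwork(tbl,'labels',layers,options);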

Name of the response variable for a classification or regression problem, specified as a character vector containing the name of the variable in tbl that holds the responses.

Data Types: char

Names of the response variables for a regression problem, specified as a cell array of character vectors containing the names of the variables in tbl that hold the responses.

Data Types: cell
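
For example, a sketch of a regression call with two response variables; the variable names are hypothetical:

responseNames = {'angle','distance'};   % columns of tbl holding the responses
trainedNet = trainNetwork(tbl,responseNames,layers,options);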

An array of network layers, specified as a Layer object. layers can be the layers of a checkpoint network that trainNetwork previously saved. In that case, access the network layers using dot notation. For example, if the name of the checkpoint network is net, then enter net.Layers for the layers argument.

Training options, specified as a TrainingOptionsSGDM object returned by the trainingOptions function. SGDM stands for the stochastic gradient descent with momentum solver.

Output Arguments

Trained network, returned as a SeriesNetwork object.

Information on the training, returned as a structure with the following fields.

  • TrainingLoss — Loss function value at each iteration

  • TrainingAccuracy — Training accuracy at each iteration if the network is a classification network

  • TrainingRMSE — Training RMSE at each iteration if the network is a regression network

  • BaseLearnRate — Learning rate at each iteration
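
For example, a sketch of plotting the recorded training loss, reusing XTrain, TTrain, layers, and options from the second example:

[trainedNet,traininfo] = trainNetwork(XTrain,TTrain,layers,options);
figure
plot(traininfo.TrainingLoss)
xlabel('Iteration')
ylabel('Training loss')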

More About

Save Checkpoint Networks and Resume Training

trainNetwork enables you to save checkpoint networks as .mat files during training. You can then resume training from any of these checkpoint networks. If you want trainNetwork to save checkpoint networks, then you must specify the path using the 'CheckpointPath' name-value pair argument in the call to trainingOptions. If the path you specify does not exist, then trainingOptions returns an error.
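
A minimal sketch of enabling checkpointing; the folder path shown is hypothetical and must already exist:

options = trainingOptions('sgdm', ...
    'MaxEpochs',20, ...
    'CheckpointPath','C:\Temp\checkpoints');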

trainNetwork automatically assigns unique names to these checkpoint network files. For example, convnet_checkpoint__351__2016_11_09__12_04_23.mat, where 351 is the iteration number, 2016_11_09 is the date, and 12_04_23 is the time at which trainNetwork saved the network. You can load any of these files by double-clicking them or typing, for example,

load convnet_checkpoint__351__2016_11_09__12_04_23.mat
at the command line. You can then resume training by using the layers of this network in the call to trainNetwork, for example,

trainNetwork(Xtrain,Ytrain,net.Layers,options)
You must manually specify the training options and the input data, because the checkpoint network does not contain this information.

Introduced in R2016a
