Compute convolutional neural network layer activations
You can extract features using a trained convolutional neural network
(ConvNet, CNN) on either a CPU or GPU. Using a GPU requires
Parallel Computing Toolbox™ and a CUDA®-enabled NVIDIA® GPU with compute capability 3.0 or higher. Specify the hardware requirements using the
ExecutionEnvironment name-value pair argument.
features = activations(net,X,layer)
features = activations(net,X,layer,Name,Value)
Load the sample data.
[XTrain,TTrain] = digitTrain4DArrayData;
digitTrain4DArrayData loads the digit training set as 4-D array data.
XTrain is a 28-by-28-by-1-by-4940 array, where 28 is the height and 28 is the width of the images, 1 is the number of channels, and 4940 is the number of synthetic images of handwritten digits.
TTrain is a categorical vector containing the labels for each observation.
Construct the convolutional neural network architecture.
layers = [imageInputLayer([28 28 1]);
    convolution2dLayer(5,20);
    reluLayer();
    maxPooling2dLayer(2,'Stride',2);
    fullyConnectedLayer(10);
    softmaxLayer();
    classificationLayer()];
Set the options to default settings for the stochastic gradient descent with momentum. Specify the GPU as the hardware to train on. This option requires Parallel Computing Toolbox™ and a CUDA®-enabled NVIDIA® GPU with compute capability 3.0 or higher.
options = trainingOptions('sgdm','ExecutionEnvironment','gpu');
Train the network.
rng('default')
net = trainNetwork(XTrain,TTrain,layers,options);
Initializing image normalization.
|=========================================================================================|
|     Epoch    |   Iteration  | Time Elapsed |  Mini-batch  |  Mini-batch  | Base Learning|
|              |              |  (seconds)   |     Loss     |   Accuracy   |     Rate     |
|=========================================================================================|
|            1 |            1 |         0.01 |       2.3026 |        7.81% |       0.0100 |
|            2 |           50 |         0.46 |       2.2735 |       33.59% |       0.0100 |
|            3 |          100 |         0.92 |       1.6613 |       48.44% |       0.0100 |
|            4 |          150 |         1.38 |       1.1803 |       64.06% |       0.0100 |
|            6 |          200 |         1.89 |       1.0499 |       64.06% |       0.0100 |
|            7 |          250 |         2.37 |       0.8392 |       76.56% |       0.0100 |
|            8 |          300 |         2.86 |       0.6981 |       77.34% |       0.0100 |
|            9 |          350 |         3.34 |       0.7084 |       77.34% |       0.0100 |
|           11 |          400 |         3.87 |       0.4902 |       87.50% |       0.0100 |
|           12 |          450 |         4.36 |       0.3839 |       91.41% |       0.0100 |
|           13 |          500 |         4.83 |       0.2986 |       92.19% |       0.0100 |
|           15 |          550 |         5.31 |       0.2583 |       93.75% |       0.0100 |
|           16 |          600 |         5.79 |       0.2009 |       97.66% |       0.0100 |
|           17 |          650 |         6.27 |       0.2642 |       92.97% |       0.0100 |
|           18 |          700 |         6.77 |       0.1448 |       97.66% |       0.0100 |
|           20 |          750 |         7.28 |       0.1314 |       96.88% |       0.0100 |
|           21 |          800 |         7.77 |       0.1232 |       97.66% |       0.0100 |
|           22 |          850 |         8.25 |       0.1009 |       98.44% |       0.0100 |
|           24 |          900 |         8.72 |       0.1051 |      100.00% |       0.0100 |
|           25 |          950 |         9.20 |       0.1483 |       97.66% |       0.0100 |
|           26 |         1000 |         9.67 |       0.0743 |       99.22% |       0.0100 |
|           27 |         1050 |        10.15 |       0.0603 |      100.00% |       0.0100 |
|           29 |         1100 |        10.64 |       0.0769 |       99.22% |       0.0100 |
|           30 |         1150 |        11.11 |       0.0524 |      100.00% |       0.0100 |
|           30 |         1170 |        11.31 |       0.0566 |      100.00% |       0.0100 |
|=========================================================================================|
trainNetwork, by default, uses a GPU to train the network when one is available. If there is no available GPU, then it uses a CPU. Training a convolutional neural network on a GPU or in parallel requires Parallel Computing Toolbox™ and a CUDA®-enabled NVIDIA® GPU with compute capability 3.0 or higher. There are also other hardware options, such as training in parallel or using multiple GPUs. You can specify these options using the
'ExecutionEnvironment' name-value pair argument in the call to the trainingOptions function.
Make predictions, but rather than taking the output from the last layer, specify the softmax layer (the sixth layer) as the output layer.
trainFeatures = activations(net,XTrain,6);
These outputs from an inner layer are known as activations or features.
The activations method, by default, also uses a CUDA-enabled GPU with compute capability 3.0 or higher, when available. You can choose to run activations on a CPU instead by using the
'ExecutionEnvironment','cpu' name-value pair argument.
You can use the returned features to train a support vector machine using the Statistics and Machine Learning Toolbox™ function fitcecoc.
svm = fitcecoc(trainFeatures,TTrain);
Load the test data.
[XTest,TTest] = digitTest4DArrayData;
Extract the features from the same layer (the sixth layer) for the test data, and use the trained support vector machine to predict the labels of the test data.
testFeatures = activations(net,XTest,6);
testPredictions = predict(svm,testFeatures);
Plot the confusion matrix. Convert the data into the format that plotconfusion accepts.
ttest = dummyvar(double(TTest))'; % dummyvar requires Statistics and Machine Learning Toolbox
tpredictions = dummyvar(double(testPredictions))';
plotconfusion(ttest,tpredictions);
The overall accuracy for the test data using the trained network
net is 97.8%.
Manually compute the overall accuracy.
accuracy = sum(TTest == testPredictions)/numel(TTest)
accuracy = 0.9778
net — Trained network
Trained network, specified as a
SeriesNetwork object. You can get a trained network by importing a pretrained network (for example, by using the
alexnet function) or by training your own network using the trainNetwork function.
activations does not support input networks that contain
recurrent layers (for example, LSTM networks).
X — Data for extracting features
ImageDatastore object | table
Data for extracting features, specified as an array representing a single image, a
4-D array of images, images stored as an ImageDatastore object,
or images or image paths in a table.
If X is a single image, then the dimensions
correspond to the height, width, and channels of the image. If
X is an array of images, then the first
three dimensions correspond to the height, width, and channels of an
image, and the fourth dimension corresponds to the image index.
Images that are stored as an ImageDatastore
object. For more information about this data type, see ImageDatastore.
A table, where the first column contains either image paths or images.
If you use the 'OutputAs','channels' option, the input data can be
of a different size than the data used for training. For other output options,
the input data must be the same size as the data used for training.
layer — Layer to extract features from
Layer to extract features from, specified as a numeric index for the layer or a character vector that corresponds with one of the network layer names.
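For example, both of the following calls extract features from the sixth layer of the network trained above. The layer name used here is an assumption; the automatically assigned names depend on the network, so inspect net.Layers to see the actual names.

```matlab
% Specify the layer by numeric index:
f1 = activations(net,XTrain,6);

% Or by layer name (the name 'softmax' is an assumption;
% check net.Layers for the actual automatically assigned names):
f2 = activations(net,XTrain,'softmax');
```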
Specify optional comma-separated pairs of
Name,Value arguments. Name is the argument name and
Value is the corresponding value.
Name must appear inside single quotes
(' '). You can specify several name and value pair arguments
in any order as Name1,Value1,...,NameN,ValueN.
'OutputAs' — Format of output activations
Format of output activations, specified as the comma-separated pair consisting of
'OutputAs' and one of the following:

'rows' — features is
an n-by-m matrix, where
n is the number of observations, and
m is the number of output elements from
the chosen layer. Each row of the matrix is the output for a
single observation.

'columns' — features
is an m-by-n matrix, where
m is the number of output elements from
the chosen layer, and n is the number of
observations. Each column of the matrix is the output for a
single observation.

'channels' — features is an h-by-w-by-c-by-n
array, where h, w, and
c are the height, width, and number of
channels for the output of the chosen layer, and
n is the number of observations. Each
sub-array is the output for a single observation.
If you use the 'OutputAs','channels' option, the input data in
X can be of a different size than the data used
for training. For other output options, the data in
X must be the same size as the data used for
training.
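As a sketch of how the three formats relate, using the network and training data from the example above (the exact sizes depend on the chosen layer):

```matlab
% The same activations in the three documented output formats:
rowsF = activations(net,XTrain,6,'OutputAs','rows');     % n-by-m
colsF = activations(net,XTrain,6,'OutputAs','columns');  % m-by-n
chanF = activations(net,XTrain,6,'OutputAs','channels'); % h-by-w-by-c-by-n

% The row and column formats are transposes of one another:
isequal(rowsF,colsF')
```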
'ExecutionEnvironment' — Hardware resource
Hardware resource, specified as the comma-separated pair consisting of
'ExecutionEnvironment' and one of the following:
'auto' — Use a GPU if one is available; otherwise, use the CPU.
'gpu' — Use the GPU. Using a GPU requires Parallel
Computing Toolbox and a CUDA-enabled NVIDIA GPU with compute capability 3.0 or higher. If Parallel
Computing Toolbox or a suitable GPU is not available, then the software returns an error.
'cpu' — Use the CPU.
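For example, to force feature extraction onto the CPU, which can be useful when no suitable GPU is present (this reuses the net and XTrain from the example above):

```matlab
% Extract features on the CPU regardless of GPU availability:
trainFeatures = activations(net,XTrain,6,'ExecutionEnvironment','cpu');
```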
features — Activations from a network layer
Activations from a network layer, returned as a matrix or a numeric array,
depending on the value of the
'OutputAs' name-value pair argument.