Compute convolutional neural network layer activations
You can extract features using a trained convolutional neural network (ConvNet, CNN) on either a CPU or GPU. Using a GPU requires the Parallel Computing Toolbox™ and a CUDA®-enabled NVIDIA® GPU with compute capability 3.0 or higher. Specify the hardware requirements using the 'ExecutionEnvironment' name-value pair argument.
features = activations(net,X,layer)
features = activations(net,X,layer,Name,Value)
net — Trained network
X — Data for extracting features
3-D array | 4-D array | ImageDatastore object | table
Data for extracting features, specified as an array representing a single image, a 4-D array of images, images stored as an ImageDatastore object, or images or image paths in a table.

If X is a single image, then the dimensions correspond to the height, width, and channels of the image. If X is an array of images, then the first three dimensions correspond to the height, width, and channels of an image, and the fourth dimension corresponds to the image number.

Images can also be stored as an ImageDatastore object. For more information about this data type, see ImageDatastore.

Alternatively, X can be a table, where the first column contains either image paths or images.

If you use the 'OutputAs','channels' option, the input data can be of a different size than the data used for training. For other output options, the input data has to be the same size as the data used for training.
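As a sketch of the accepted input forms, assuming a trained network net and training array XTrain as in the example below, with layer 3 the ReLU layer (the folder name digitImages is hypothetical):

```matlab
% Single image: an H-by-W-by-C array
act1 = activations(net,XTrain(:,:,:,1),3);

% 4-D array of images: H-by-W-by-C-by-N
actAll = activations(net,XTrain,3);

% ImageDatastore of image files (hypothetical folder name)
imds = imageDatastore('digitImages');
actDs = activations(net,imds,3);
```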
layer — Layer to extract features from
Layer to extract features from, specified as a numeric layer index or a character vector that corresponds to one of the network layer names.
Specify optional comma-separated pairs of Name,Value arguments. Name is the argument name and Value is the corresponding value. Name must appear inside single quotes (' '). You can specify several name and value pair arguments in any order as Name1,Value1,...,NameN,ValueN.
'OutputAs' — Format of output activations
'rows' (default) | 'columns' | 'channels'
Format of output activations, specified as the comma-separated pair consisting of 'OutputAs' and one of the following:
'rows' — an n-by-m matrix, where n is the number of observations, and m is the number of output elements from the chosen layer.
'columns' — an m-by-n matrix, where m is the number of output elements from the chosen layer, and n is the number of observations. Each column of the matrix is the output for a single observation.
'channels' — an h-by-w-by-c-by-n array, where h, w, and c are the height, width, and number of channels for the output of the chosen layer, and n is the number of observations. Each h-by-w-by-c sub-array is the output for a single observation.
If you use the 'OutputAs','channels' option, the input data X can be of a different size than the data used for training. For other output options, the data in X has to be the same size as the data used for training.
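As an illustrative sketch (assuming net and XTrain from the example below, with layer 3 the ReLU layer), the three formats arrange the same activations differently:

```matlab
actRows = activations(net,XTrain,3,'OutputAs','rows');      % n-by-m matrix
actCols = activations(net,XTrain,3,'OutputAs','columns');   % m-by-n matrix
actChan = activations(net,XTrain,3,'OutputAs','channels');  % h-by-w-by-c-by-n array
```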
'MiniBatchSize' — Size of mini-batches for prediction
Size of mini-batches for prediction, specified as a positive integer. Larger mini-batch sizes require more memory, but lead to faster predictions.
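For example, assuming a GPU with enough memory, you might trade memory for speed by raising the mini-batch size (256 here is an arbitrary illustrative value; net and XTrain are from the example below):

```matlab
features = activations(net,XTrain,3,'MiniBatchSize',256);
```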
'ExecutionEnvironment' — Hardware resource for running the network
Hardware resource for activations to run the network, specified as the comma-separated pair consisting of 'ExecutionEnvironment' and one of the following:
'auto' — Use a GPU if one is available; otherwise, use the CPU.
'gpu' — Use the GPU. To use a GPU, you must have Parallel Computing Toolbox and a CUDA-enabled NVIDIA GPU with compute capability 3.0 or higher. If a suitable GPU is not available, activations returns an error message.
'cpu' — Use the CPU.
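For instance, to force the computation onto the CPU even when a GPU is present (a sketch using net and XTrain from the example below):

```matlab
features = activations(net,XTrain,3,'ExecutionEnvironment','cpu');
```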
features — Activations from a network layer
Activations from a network layer, returned as a matrix or array, depending on the value of the 'OutputAs' name-value pair argument.
Load the sample data.
[XTrain,TTrain] = digitTrain4DArrayData;
digitTrain4DArrayData loads the digit training set as 4-D array data.
XTrain is a 28-by-28-by-1-by-4940 array, where 28 is the height and 28 is the width of the images. 1 is the number of channels and 4940 is the number of synthetic images of handwritten digits.
TTrain is a categorical vector containing the labels for each observation.
Construct the convolutional neural network architecture.
layers = [imageInputLayer([28 28 1])
          convolution2dLayer(5,20)
          reluLayer
          maxPooling2dLayer(2,'Stride',2)
          fullyConnectedLayer(10)
          softmaxLayer
          classificationLayer];
Set the options to default settings for the stochastic gradient descent with momentum. Specify the GPU as the hardware to train on. This option requires Parallel Computing Toolbox™ and a CUDA®-enabled NVIDIA® GPU with compute capability 3.0 or higher.
options = trainingOptions('sgdm','ExecutionEnvironment','gpu');
Train the network.
rng('default')
net = trainNetwork(XTrain,TTrain,layers,options);
Initializing image normalization.
|=========================================================================================|
|     Epoch    |   Iteration  | Time Elapsed |  Mini-batch  |  Mini-batch  | Base Learning|
|              |              |  (seconds)   |     Loss     |   Accuracy   |     Rate     |
|=========================================================================================|
|            1 |            1 |         0.01 |       2.3026 |        7.81% |       0.0100 |
|            2 |           50 |         0.46 |       2.2735 |       33.59% |       0.0100 |
|            3 |          100 |         0.92 |       1.6613 |       48.44% |       0.0100 |
|            4 |          150 |         1.38 |       1.1803 |       64.06% |       0.0100 |
|            6 |          200 |         1.89 |       1.0499 |       64.06% |       0.0100 |
|            7 |          250 |         2.37 |       0.8392 |       76.56% |       0.0100 |
|            8 |          300 |         2.86 |       0.6981 |       77.34% |       0.0100 |
|            9 |          350 |         3.34 |       0.7084 |       77.34% |       0.0100 |
|           11 |          400 |         3.87 |       0.4902 |       87.50% |       0.0100 |
|           12 |          450 |         4.36 |       0.3839 |       91.41% |       0.0100 |
|           13 |          500 |         4.83 |       0.2986 |       92.19% |       0.0100 |
|           15 |          550 |         5.31 |       0.2583 |       93.75% |       0.0100 |
|           16 |          600 |         5.79 |       0.2009 |       97.66% |       0.0100 |
|           17 |          650 |         6.27 |       0.2642 |       92.97% |       0.0100 |
|           18 |          700 |         6.77 |       0.1448 |       97.66% |       0.0100 |
|           20 |          750 |         7.28 |       0.1314 |       96.88% |       0.0100 |
|           21 |          800 |         7.77 |       0.1232 |       97.66% |       0.0100 |
|           22 |          850 |         8.25 |       0.1009 |       98.44% |       0.0100 |
|           24 |          900 |         8.72 |       0.1051 |      100.00% |       0.0100 |
|           25 |          950 |         9.20 |       0.1483 |       97.66% |       0.0100 |
|           26 |         1000 |         9.67 |       0.0743 |       99.22% |       0.0100 |
|           27 |         1050 |        10.15 |       0.0603 |      100.00% |       0.0100 |
|           29 |         1100 |        10.64 |       0.0769 |       99.22% |       0.0100 |
|           30 |         1150 |        11.11 |       0.0524 |      100.00% |       0.0100 |
|           30 |         1170 |        11.31 |       0.0566 |      100.00% |       0.0100 |
|=========================================================================================|
trainNetwork, by default, uses a GPU to train the network when one is available. If there is no available GPU, then it uses a CPU. Training a convolutional neural network on a GPU or in parallel requires Parallel Computing Toolbox™ and a CUDA®-enabled NVIDIA® GPU with compute capability 3.0 or higher. There are also other hardware options, such as training in parallel or using multiple GPUs. You can specify these options using the 'ExecutionEnvironment' name-value pair argument in the call to the trainingOptions function.
Make predictions, but rather than taking the output from the last layer, specify the ReLU layer (the third layer) as the output layer.
trainFeatures = activations(net,XTrain,3);
These outputs from an inner layer are known as activations or features.
The activations method, by default, also uses a CUDA-enabled GPU with compute capability 3.0 or higher, when available. You can also choose to run activations on a CPU using the 'ExecutionEnvironment','cpu' name-value pair argument.
You can use the returned features to train a support vector machine using the Statistics and Machine Learning Toolbox™ function fitcecoc (Statistics and Machine Learning Toolbox).
svm = fitcecoc(trainFeatures,TTrain);
Load the test data.
[XTest,TTest] = digitTest4DArrayData;
Extract the features from the same ReLU layer (the third layer) for the test data, and use the trained SVM to classify the test images.
testFeatures = activations(net,XTest,3);
testPredictions = predict(svm,testFeatures);
Plot the confusion matrix. Convert the labels and predictions into the format that plotconfusion accepts.
ttest = dummyvar(double(TTest))';    % dummyvar requires Statistics and Machine Learning Toolbox
tpredictions = dummyvar(double(testPredictions))';
plotconfusion(ttest,tpredictions);
The overall accuracy for the test data using the trained network
net is 97.8%.
Manually compute the overall accuracy.
accuracy = sum(TTest == testPredictions)/numel(TTest)
accuracy = 0.9778