This example demonstrates code generation for an image classification application that uses deep learning. It uses the codegen
command to generate a MEX function that runs prediction using popular image classification networks such as AlexNet, ResNet, and GoogLeNet.
CUDA® enabled NVIDIA® GPU with compute capability 3.2 or higher.
NVIDIA CUDA toolkit and driver.
NVIDIA cuDNN library (v7).
Environment variables for the compilers and libraries. For information on the supported versions of the compilers and libraries, see Third-party Products. For setting up the environment variables, see Setting Up the Prerequisite Products.
Computer Vision System Toolbox™ for the video reader and viewer used in the example.
Deep Learning Toolbox™ for using SeriesNetwork or DAGNetwork objects.
Image Processing Toolbox™ for reading and displaying images.
GPU Coder™ for generating CUDA code.
GPU Coder Interface for Deep Learning Libraries support package. To install this support package, use the Add-On Explorer.
Use the coder.checkGpuInstall function to verify that the compilers and libraries needed to run this example are set up correctly.
coder.checkGpuInstall('gpu','codegen','cudnn','quiet');
The alexnet_predict.m function takes an image input and runs prediction on the image using the deep learning network saved in the alexnet.mat file. The function loads the network object from alexnet.mat into a persistent variable mynet. On subsequent calls to the function, the persistent object is reused for prediction.
type('alexnet_predict.m')
% Copyright 2017 The MathWorks, Inc.

function out = alexnet_predict(in)
%#codegen

% A persistent object mynet is used to load the series network object.
% At the first call to this function, the persistent object is constructed and
% setup. When the function is called subsequent times, the same object is reused
% to call predict on inputs, thus avoiding reconstructing and reloading the
% network object.
persistent mynet;

if isempty(mynet)
    mynet = coder.loadDeepLearningNetwork('alexnet.mat','alexnet');
end

% pass in input
out = mynet.predict(in);
Download the AlexNet network and save it to alexnet.mat if the file does not already exist.
net = getAlexnet();
The network contains 25 layers, including convolution, fully connected, and classification output layers.
net.Layers
ans =
  25x1 Layer array with layers:

     1   'data'     Image Input                   227x227x3 images with 'zerocenter' normalization
     2   'conv1'    Convolution                   96 11x11x3 convolutions with stride [4 4] and padding [0 0 0 0]
     3   'relu1'    ReLU                          ReLU
     4   'norm1'    Cross Channel Normalization   cross channel normalization with 5 channels per element
     5   'pool1'    Max Pooling                   3x3 max pooling with stride [2 2] and padding [0 0 0 0]
     6   'conv2'    Convolution                   256 5x5x48 convolutions with stride [1 1] and padding [2 2 2 2]
     7   'relu2'    ReLU                          ReLU
     8   'norm2'    Cross Channel Normalization   cross channel normalization with 5 channels per element
     9   'pool2'    Max Pooling                   3x3 max pooling with stride [2 2] and padding [0 0 0 0]
    10   'conv3'    Convolution                   384 3x3x256 convolutions with stride [1 1] and padding [1 1 1 1]
    11   'relu3'    ReLU                          ReLU
    12   'conv4'    Convolution                   384 3x3x192 convolutions with stride [1 1] and padding [1 1 1 1]
    13   'relu4'    ReLU                          ReLU
    14   'conv5'    Convolution                   256 3x3x192 convolutions with stride [1 1] and padding [1 1 1 1]
    15   'relu5'    ReLU                          ReLU
    16   'pool5'    Max Pooling                   3x3 max pooling with stride [2 2] and padding [0 0 0 0]
    17   'fc6'      Fully Connected               4096 fully connected layer
    18   'relu6'    ReLU                          ReLU
    19   'drop6'    Dropout                       50% dropout
    20   'fc7'      Fully Connected               4096 fully connected layer
    21   'relu7'    ReLU                          ReLU
    22   'drop7'    Dropout                       50% dropout
    23   'fc8'      Fully Connected               1000 fully connected layer
    24   'prob'     Softmax                       softmax
    25   'output'   Classification Output         crossentropyex with 'tench' and 999 other classes
To generate CUDA code from the design file alexnet_predict.m, create a GPU code configuration object for a MEX target and set the target language to C++. Use the coder.DeepLearningConfig function to create a cuDNN deep learning configuration object and assign it to the DeepLearningConfig property of the GPU code configuration object. Run the codegen command, specifying an input of size [227,227,3]. This value corresponds to the input layer size of the AlexNet network.
cfg = coder.gpuConfig('mex');
cfg.TargetLang = 'C++';
cfg.DeepLearningConfig = coder.DeepLearningConfig('cudnn');
codegen -config cfg alexnet_predict -args {ones(227,227,3)} -report
Code generation successful: To view the report, open('codegen/mex/alexnet_predict/html/report.mldatx').
The Series Network is generated as a C++ class containing an array of 25 layer classes and functions to set up, predict, and clean up the network.
class b_alexnet
{
    ....
  public:
    b_alexnet();
    void setup();
    void predict();
    void cleanup();
    ~b_alexnet();
};
The setup() method of the class sets up handles and allocates memory for each layer of the network object. The predict() method invokes prediction for each of the 25 layers in the network.
The entry-point function alexnet_predict() in the generated code file alexnet_predict.cu constructs a static object of b_alexnet class type and invokes setup and predict on this network object.
static b_alexnet mynet;
static boolean_T mynet_not_empty;

/* Function Definitions */
void alexnet_predict(alexnet_predictStackData *SD, const real_T in[154587], real32_T out[1000])
{
  if (!mynet_not_empty) {
    DeepLearningNetwork_setup(&mynet);
    mynet_not_empty = true;
  }

  DeepLearningNetwork_predict(SD, &mynet, in, out);
}
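The array sizes in the generated signature follow from the example input passed to codegen and from the network's final layer. The short check below is only an illustrative sketch of that arithmetic; the variable names inSize, numIn, and numOut are hypothetical and do not appear in the example files.

```matlab
% Example input given to codegen: a 227-by-227-by-3 double array.
% Doubles map to real_T in the generated code.
inSize = [227 227 3];
numIn  = prod(inSize);   % element count of the flattened input array

% AlexNet's last fully connected layer ('fc8') has 1000 outputs,
% one score per class; predict returns them as single (real32_T).
numOut = 1000;

fprintf('in[%d], out[%d]\n', numIn, numOut);   % prints in[154587], out[1000]
```

Because the MEX function is specialized for this size and class, the input passed at run time should match the array used in the -args option.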
Binary files are exported for layers with parameters such as fully connected and convolution layers in the network. For instance, files cnn_alexnet_conv*_w and cnn_alexnet_conv*_b correspond to weights and bias parameters for the convolution layers in the network.
dir(fullfile(pwd, 'codegen', 'mex', 'alexnet_predict'))
.                               alexnet_predict_terminate.h
..                              alexnet_predict_terminate.o
DeepLearningNetwork.cu          alexnet_predict_types.h
DeepLearningNetwork.h           buildInfo.mat
DeepLearningNetwork.o           cnn_alexnet_avg
MWCNNLayerImpl.cu               cnn_alexnet_conv1_b
MWCNNLayerImpl.hpp              cnn_alexnet_conv1_w
MWCNNLayerImpl.o                cnn_alexnet_conv2_b
MWCudaDimUtility.cu             cnn_alexnet_conv2_w
MWCudaDimUtility.h              cnn_alexnet_conv3_b
MWCudaDimUtility.o              cnn_alexnet_conv3_w
MWFusedConvReLULayer.cpp        cnn_alexnet_conv4_b
MWFusedConvReLULayer.hpp        cnn_alexnet_conv4_w
MWFusedConvReLULayer.o          cnn_alexnet_conv5_b
MWFusedConvReLULayerImpl.cu     cnn_alexnet_conv5_w
MWFusedConvReLULayerImpl.hpp    cnn_alexnet_fc6_b
MWFusedConvReLULayerImpl.o      cnn_alexnet_fc6_w
MWTargetNetworkImpl.cu          cnn_alexnet_fc7_b
MWTargetNetworkImpl.hpp         cnn_alexnet_fc7_w
MWTargetNetworkImpl.o           cnn_alexnet_fc8_b
_coder_alexnet_predict_api.o    cnn_alexnet_fc8_w
_coder_alexnet_predict_info.o   cnn_alexnet_labels.txt
_coder_alexnet_predict_mex.o    cnn_api.cpp
alexnet_predict.cu              cnn_api.hpp
alexnet_predict.h               cnn_api.o
alexnet_predict.o               cpp_mexapi_version.cpp
alexnet_predict_data.cu         cpp_mexapi_version.o
alexnet_predict_data.h          gpu_codegen_info.mat
alexnet_predict_data.o          html
alexnet_predict_initialize.cu   interface
alexnet_predict_initialize.h    predict.cu
alexnet_predict_initialize.o    predict.h
alexnet_predict_mex.mexa64      predict.o
alexnet_predict_mex.mk          rt_nonfinite.h
alexnet_predict_mex.mki         rtwtypes.h
alexnet_predict_mex.sh          setEnv.sh
alexnet_predict_mex_mex.map
alexnet_predict_terminate.cu
Load an input image.
im = imread('peppers.png');
imshow(im);
Call AlexNet predict on the input image.
im = imresize(im, [227,227]);
predict_scores = alexnet_predict_mex(double(im));
Map the top five prediction scores to words in the synset dictionary.
fid = fopen('synsetWords.txt');
synsetOut = textscan(fid,'%s', 'delimiter', '\n');
synsetOut = synsetOut{1};
fclose(fid);
[val,indx] = sort(predict_scores, 'descend');
scores = val(1:5)*100;
labels = synsetOut(indx(1:5));
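The mapping from scores to labels relies on the second output of sort, which gives the original positions of the sorted values. The sketch below illustrates the idea on a small made-up score vector and label list; the values and the variable topK are hypothetical stand-ins, not output from the network or entries taken from synsetWords.txt.

```matlab
% Made-up stand-ins for the network output and the label dictionary.
predict_scores = single([0.05 0.60 0.10 0.02 0.15 0.08]);
synsetOut = {'labelA'; 'labelB'; 'labelC'; 'labelD'; 'labelE'; 'labelF'};

topK = 3;   % the example uses 5; 3 keeps this sketch short

% indx(k) is the position of the k-th largest score in predict_scores,
% so it can index directly into the label list.
[val, indx] = sort(predict_scores, 'descend');
scores = val(1:topK) * 100;          % convert to percentages
labels = synsetOut(indx(1:topK));

for k = 1:topK
    fprintf('%-8s %6.2f%%\n', labels{k}, scores(k));
end
```

Here the largest score sits at position 2, so the first label returned is 'labelB'.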
Display the top five classification labels.
imfull = zeros(227,400,3, 'uint8');
for k = 1:3
    imfull(:,174:end,k) = im(:,:,k);
end
h = imshow(imfull, 'InitialMagnification',200);
text(get(h, 'Parent'), 1, 20, 'Classification with AlexNet' , 'color', 'w','FontSize', 20);
scol = 1;
srow = 50;
for k = 1:5
    t = text(get(h, 'Parent'), scol, srow, labels{k}, 'color', 'w','FontSize', 15);
    pos = get(t, 'Extent');
    text(get(h, 'Parent'), pos(1)+pos(3)+5, srow, sprintf('%2.2f%%', scores(k)), 'color', 'w', 'FontSize', 15);
    srow = srow + 20;
end
The included example file alexnet_live.m grabs frames from a webcam, invokes prediction, and displays the classification results on each of the captured video frames. Note: This example uses the webcam function, which is supported through the MATLAB® Support Package for USB Webcams™. You can download and install the support package through the Support Package Installer.
camera = webcam;
while true
    % Take a picture
    ipicture = camera.snapshot;

    % Resize and cast the picture to single
    picture = imresize(ipicture,[227,227]);

    % Call MEX function for AlexNet prediction
    tic;
    pout = alexnet_predict(single(picture));
    newt = toc;

    % fps
    fps = .9*fps + .1*(1/newt);

    % top 5 scores
    [top5labels, scores] = getTopFive(pout, synsetOut);

    % display
    dispResults(ax, imfull, picture, top5labels, scores, fps);
end
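The fps line in the loop is an exponential moving average: each new frame contributes 10% of its instantaneous rate (1/newt) and the running value keeps 90% of its previous weight, which smooths out per-frame jitter. The sketch below replays that update on a few made-up frame times. The latencies are hypothetical, and the starting value fps = 0 is an assumption, since the excerpt does not show how fps is initialized.

```matlab
% Hypothetical per-frame prediction times in seconds (not measured data).
frameTimes = [0.050 0.040 0.040 0.100 0.040];

fps = 0;   % assumed initial value; not shown in the excerpt above
for k = 1:numel(frameTimes)
    newt = frameTimes(k);
    % Same update rule as in the loop above: 90% history, 10% new sample.
    fps = 0.9*fps + 0.1*(1/newt);
    fprintf('frame %d: instantaneous %5.1f fps, smoothed %5.2f fps\n', ...
        k, 1/newt, fps);
end
```

With a zero starting value, the smoothed figure takes a number of frames to approach the true rate, and a single slow frame (the fourth one here) moves it only slightly.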
Clear the static network object loaded in memory.
clear mex;
We can also use the popular DAG network ResNet-50 for image classification. A pretrained ResNet-50 model for MATLAB is available in the ResNet-50 support package of the Deep Learning Toolbox. To download and install the support package, use the Add-On Explorer. To learn more about finding and installing add-ons, see Get Add-Ons (MATLAB).
net = resnet50;
disp(net)
  DAGNetwork with properties:

         Layers: [177×1 nnet.cnn.layer.Layer]
    Connections: [192×2 table]
Generate CUDA code from the design file resnet_predict.m. This design file calls the function resnet50 to load the network and run predict on the input image. To generate code from this file, create a GPU code configuration object for a MEX target as before.
cfg = coder.gpuConfig('mex');
cfg.TargetLang = 'C++';
cfg.DeepLearningConfig = coder.DeepLearningConfig('cudnn');
codegen -config cfg resnet_predict -args {ones(224,224,3)} -report
Code generation successful: To view the report, open('codegen/mex/resnet_predict/html/report.mldatx').
Call predict on the input image.
im = imresize(im, [224,224]);
predict_scores = resnet_predict_mex(double(im));
[val,indx] = sort(predict_scores, 'descend');
scores = val(1:5)*100;
labels = synsetOut(indx(1:5));
Clear the static network object loaded in memory.
clear mex;
A pretrained GoogLeNet model for MATLAB is available in the GoogLeNet support package of the Deep Learning Toolbox. To download and install the support package, use the Add-On Explorer. To learn more about finding and installing add-ons, see Get Add-Ons (MATLAB).
net = googlenet;
disp(net)
  DAGNetwork with properties:

         Layers: [144×1 nnet.cnn.layer.Layer]
    Connections: [170×2 table]
Generate CUDA code from the design file googlenet_predict.m. This design file calls the function googlenet to load the network and run predict on the input image. To generate code from this file, create a GPU code configuration object for a MEX target as before.
cfg = coder.gpuConfig('mex');
cfg.TargetLang = 'C++';
cfg.DeepLearningConfig = coder.DeepLearningConfig('cudnn');
codegen -config cfg googlenet_predict -args {ones(224,224,3)} -report
Code generation successful: To view the report, open('codegen/mex/googlenet_predict/html/report.mldatx').
Call predict on the input image.
im = imresize(im, [224,224]);
predict_scores = googlenet_predict_mex(double(im));
[val,indx] = sort(predict_scores, 'descend');
scores_googlenet = val(1:5)*100;
labels_googlenet = synsetOut(indx(1:5));
Clear the static network object loaded in memory.
clear mex;