Code Generation for Deep Learning on ARM Targets

This example shows how to generate and deploy code for prediction on an ARM®-based device without using a hardware support package.

When you generate code for prediction using the ARM Compute Library and a hardware support package, codegen generates code on the host computer, copies the generated files to the target hardware, and builds the executable on the target hardware. Without a hardware support package, codegen generates code on the host computer. You must run commands to copy the files and build the executable program on the target hardware.


Prerequisites

This example requires:

  • ARM processor that supports the NEON extension

  • ARM Compute Library (on the target ARM hardware)

  • Open Source Computer Vision Library

  • Environment variables for the compilers and libraries

  • MATLAB® Coder™

  • The support package MATLAB Coder Interface for Deep Learning

  • Deep Learning Toolbox™

  • The support package Deep Learning Toolbox Model for SqueezeNet Network

For supported versions of libraries and for information about setting up environment variables, see Prerequisites for Deep Learning with MATLAB Coder (MATLAB Coder).
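For reference, the environment setup on the target might look like the following shell commands. The install path is an assumption for illustration; it must match where you built the ARM Compute Library on your device.

```shell
# Hypothetical install location of the ARM Compute Library on the target;
# adjust the path to match your own installation.
export ARM_COMPUTELIB=/usr/local/arm_compute
# Make the shared libraries visible to the dynamic loader at run time.
export LD_LIBRARY_PATH="$ARM_COMPUTELIB/lib:$LD_LIBRARY_PATH"
```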

This example is supported on Linux® platforms only and is not supported for MATLAB Online.

squeezenet_predict Function

This example uses the DAG network SqueezeNet to show image classification with the ARM Compute Library. A pretrained SqueezeNet for MATLAB is available in the support package Deep Learning Toolbox Model for SqueezeNet Network. The squeezenet_predict function loads the SqueezeNet network into a persistent network object. On subsequent calls to the function, the persistent object is reused.

type squeezenet_predict
% Copyright 2018 The MathWorks, Inc.

function out = squeezenet_predict(in) 

% A persistent object mynet is used to load the network object. At the first
% call to this function, the persistent object is constructed and set up. When
% the function is called subsequent times, the same object is reused to call
% predict on inputs, avoiding reconstructing and reloading the network object.

persistent mynet;

if isempty(mynet)
    mynet = coder.loadDeepLearningNetwork('squeezenet','squeezenet');
end

out = mynet.predict(in);

Set Up a Code Generation Configuration Object for a Static Library

When you generate code targeting an ARM-based device and do not use a hardware support package, create a configuration object for a library. Do not create a configuration object for an executable program.

Set up the configuration object for generation of C++ code and generation of code only.

cfg = coder.config('lib');
cfg.TargetLang = 'C++';
cfg.GenCodeOnly = true;

Set Up a Configuration Object for Deep Learning Code Generation

Create a coder.ARMNEONConfig object. Specify the library version and the architecture of the target ARM processor. For example, suppose that the target board is a HiKey board with ARMv8 architecture and ARM Compute Library version 19.02.

dlcfg = coder.DeepLearningConfig('arm-compute');
dlcfg.ArmComputeVersion = '19.02';
dlcfg.ArmArchitecture = 'armv8';

Attach the Deep Learning Configuration Object to the Code Generation Configuration Object

Set the DeepLearningConfig property of the code generation configuration object to the deep learning configuration object.

cfg.DeepLearningConfig = dlcfg;

Configure Code Generation Parameters Specific to the Hardware

To configure code generation parameters that are specific to the target hardware, set the ProdHWDeviceType property of the HardwareImplementation object. For example, for the ARMv8 architecture, use 'ARM Compatible->ARM 64-bit (LP64)'. For the ARMv7 architecture, use 'ARM Compatible->ARM Cortex'.

cfg.HardwareImplementation.ProdHWDeviceType = 'ARM Compatible->ARM 64-bit (LP64)';

For more information about hardware-specific settings, see coder.HardwareImplementation.

Generate Source C++ Code by Using codegen

To generate C++ source code for the squeezenet_predict function, run the codegen command. The -d option specifies the output folder.

codegen -config cfg squeezenet_predict -args {ones(227, 227, 3, 'single')} -d arm_compute

The code is generated in the arm_compute folder in the current working folder on the host computer.

Copy the Generated Files to the Target Hardware

In the following command, replace password with your password, username with your user name, targetname with the name of your device, and targetloc with the destination folder on the device.

system('sshpass -p password scp -r arm_compute username@targetname:targetloc/');

Copy Example Files to the Target Hardware

Copy these files from the host computer to the target hardware:

  • C++ main file that runs prediction on an input image, main_squeezenet_arm_generic.cpp

  • Input image, coffeemug.png

  • Makefile for building the executable program

  • Synset dictionary, synsetWords.txt

In the makefile, modify CODEGEN_LIB, ARM_COMPUTELIB, and targetDirName for your target.
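As a sketch, the edited variables might look like the following fragment. The values shown are hypothetical; point CODEGEN_LIB at the generated static library, ARM_COMPUTELIB at your ARM Compute Library installation, and targetDirName at the folder that holds the copied files on your device.

```makefile
# Hypothetical settings -- adjust every path for your own target.
CODEGEN_LIB    = arm_compute/squeezenet_predict.a
ARM_COMPUTELIB = /usr/local/arm_compute
targetDirName  = /home/username/targetloc/arm_compute
```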

In the following commands, replace:

  • password with your password

  • username with your user name

  • targetname with the name of your device

  • targetloc with the destination folder for the files
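As an illustration of how the placeholders expand, the following shell sketch builds one of the copy commands from hypothetical values (myuser, hikey01, /home/myuser) and prints it as a dry run; substitute your own credentials and paths before running scp for real.

```shell
# Hypothetical placeholder values; substitute your own.
username=myuser
targetname=hikey01
targetloc=/home/myuser
# Expanding the placeholders gives a concrete copy command (dry run via echo):
cmd="scp main_squeezenet_arm_generic.cpp ${username}@${targetname}:${targetloc}/arm_compute/"
echo "$cmd"
```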

system('sshpass -p password scp main_squeezenet_arm_generic.cpp username@targetname:targetloc/arm_compute/');
system('sshpass -p password scp coffeemug.png username@targetname:targetloc/arm_compute/');
system('sshpass -p password scp username@targetname:targetloc/arm_compute/');
system('sshpass -p password scp synsetWords.txt username@targetname:targetloc/arm_compute/');

Build the Library on the Target Hardware

To build the library on the target hardware, execute the generated makefile on the ARM hardware.

Make sure that you set the environment variables ARM_COMPUTELIB and LD_LIBRARY_PATH on the target hardware. See Prerequisites for Deep Learning with MATLAB Coder (MATLAB Coder).

system('sshpass -p password ssh username@targetname "make -C targetloc/arm_compute/ -f"');

Create Executable from the Library on the Target Hardware

To create the executable program, run the makefile again on the target hardware.

system('sshpass -p password ssh username@targetname "make -C targetloc/arm_compute/ -f"');

Run the Executable on the Target Hardware

Run the executable with an input image file.

system('sshpass -p password ssh username@targetname "cd targetloc/arm_compute/; ./squeezenet coffeemug.png"');

Top 5 Predictions:
88.299% coffee mug
7.309% cup
1.098% candle
0.634% paper towel
0.591% water jug