Deploying a Neural Network with Simulink Embedded Coder: Reducing Code Size and Improving Execution Speed

My goal is to deploy a fully connected neural network model to an STM32MP1 microcontroller using Embedded Coder.
To this end, I've adapted the following workflow (any suggestions appreciated):
  1. Transform the neural network from .pth to .onnx file format (via torch.onnx.export),
  2. Import the network into MATLAB using importONNXNetwork() and save it to a .mat file (sketched after this list),
  3. Load this .mat file inside a Simulink Predict block (see image),
  4. Generate C-Code via the Embedded Coder App, and
  5. Build and run the model on the STM32MP1 microcontroller.
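A minimal MATLAB sketch of steps 2-3, assuming the Deep Learning Toolbox Converter for ONNX Model Format support package is installed (the file names are placeholders):

% Step 2: import the ONNX model into MATLAB
net = importONNXNetwork("mlp.onnx", ...           % placeholder file name
    OutputLayerType="regression");                % MLP with continuous outputs

% Step 3 (preparation): save the network to a MAT-file; the Predict
% block can then load it with its "Network" parameter set to
% "Network from MAT-file".
save("mlp_net.mat", "net");
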
The constraints of the microcontroller (RAM) and the task (frequency) are the following:
  • Network Size: Given the training results, the minimum viable network is an MLP with three hidden layers (256x128x64) and I/O = 10/2
  • Code Size: The generated code needs to be < 200 kB in order to fit into the RAM (see the estimate after this list)
  • Network Speed: The generated code needs to run in < 1 ms
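For scale, the weights of the target network alone come close to the RAM budget. A quick back-of-the-envelope check (layer sizes taken from the constraints above):

% Parameter count for the 10 -> 256 -> 128 -> 64 -> 2 fully connected MLP
sizes  = [10 256 128 64 2];
params = sum(sizes(1:end-1) .* sizes(2:end) + sizes(2:end)); % weights + biases
fprintf("Parameters: %d\n", params)                          % 44098
fprintf("single (4 B/param): ~%.0f kB\n", params*4/1024)     % ~172 kB
fprintf("half   (2 B/param): ~%.0f kB\n", params*2/1024)     % ~86 kB

So roughly 172 kB of the 200 kB budget would go to single-precision weights alone, which is also what motivates the float16 idea in the comments below.
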
Currently, with this workflow, I am able to generate and build C code for a smaller network (128x64x32), which now runs on the STM32MP1 with an execution time of around 0.9 ms, while the actual network (256x128x64) fails to build on the microcontroller due to a RAM overflow of 76 kB.
Hence, the two related questions are (in order of priority):
  1. How can the size of the generated C code be reduced (from 276 kB to < 200 kB)?
  2. How can the execution speed of the generated C code be improved? (One candidate for both is sketched below.)
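One avenue that might help with both points is generating code against the ARM Compute Library, since the STM32MP1 has Cortex-A7 (Armv7-A) cores. A hedged MATLAB Coder sketch, as an alternative to the Simulink flow above - mlp_predict is a hypothetical entry-point function wrapping predict(net, x), and the MATLAB Coder Interface for Deep Learning support package is assumed to be installed:

% Library code generation targeting the ARM Compute Library
cfg = coder.config("lib");
cfg.TargetLang = "C++";                          % the ARM Compute flow generates C++
dlcfg = coder.DeepLearningConfig("arm-compute");
dlcfg.ArmArchitecture = "armv7";                 % Cortex-A7 cores on the STM32MP1
cfg.DeepLearningConfig = dlcfg;

% mlp_predict is a hypothetical entry point that calls predict(net, x)
codegen -config cfg mlp_predict -args {zeros(10,1,"single")}
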

Answers (1)

Prateek on 6 Jan 2023
  2 Comments
Julian on 6 Jan 2023
Hi Prateek,
Thanks for offering help on this topic and providing some links to the docs!
Having checked your pointers, I've tested the Model Advisor (#2) and selected different objectives (#3) for improving the generated code, without success so far. While the docs on code generation in general seem comprehensive, they rarely focus on the deployment of neural networks - do you have specific pointers for the code optimization of neural networks as well?
Another idea to reduce the size of the generated C code for the NN could be to use the float16 data type (half precision) instead of the default float32 (single precision), which would essentially cut the weight storage almost in half. However, using onnxmltools.utils.float16_converter to convert the ONNX model to float16 and importing it into MATLAB with importONNXNetwork() results in the following error:
"Error using nnet.internal.cnn.onnx.getDataFromTensorProto
The datatype of initializer 'model._model.a2c_network.sigma' ('FLOAT16') is not supported."
Is MATLAB (and the Coder) even able to handle neural networks with float16 data types at this point in time? An older forum entry said this was under "active development".
Best, Julian


Release

R2022b
