Deep Learning Toolbox Model Quantization Library

Quantize and Compress Deep Learning models


Updated 14 Oct 2020

Deep Learning Toolbox Model Quantization Library lets you quantize and compress your deep learning models. It provides instrumentation services that collect layer-level data on the weights, activations, and intermediate computations during the calibration step. Using this instrumentation data, the library/add-on quantizes your model and provides metrics to validate the accuracy of the quantized network.

The library/add-on enables an iterative workflow to optimize the quantization approach to meet the required accuracy. It provides heuristics to choose the right quantization strategy.
You can validate the quantized network and compare the accuracy against the single precision baseline.

The library/add-on provides a Quantization app that lets you analyze and visualize the instrumentation data to understand how quantizing the weights and biases of selected layers affects accuracy.
The library/add-on supports INT8 quantization for FPGAs and NVIDIA GPUs, for supported layers.
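
The calibrate-then-validate workflow described above can be sketched roughly as follows for a GPU target. This is a minimal illustration, not a complete example: the function name quantizeAndValidate and the inputs net, calData, and valData are hypothetical, and the sketch assumes a trained network that dlquantizer supports plus datastores of representative calibration and validation data.

```matlab
function [calResults, valResults] = quantizeAndValidate(net, calData, valData)
% Sketch of the calibrate/validate loop for a GPU target.
% Assumptions (not from this page): net is a trained network supported by
% dlquantizer, and calData/valData are datastores of representative inputs.

% Create the quantizer object for the network.
quantObj = dlquantizer(net);

% Calibration: run the network on calData and record the dynamic ranges
% of weights, biases, and activations for each instrumented layer.
calResults = calibrate(quantObj, calData);

% Validation: quantize the network and compare it with the original
% single-precision network on valData using the default metric.
valResults = validate(quantObj, valData);

% Show the metric values for the original and quantized networks.
disp(valResults.MetricResults.Result);
end
```

A call such as [calResults, valResults] = quantizeAndValidate(net, calData, valData) then returns the calibration statistics and the validation results, which you can inspect before deciding whether to adjust the calibration data and repeat.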

Please refer to the documentation here:

This hardware support package is functional for R2020a and beyond. Quantization of a neural network targeting GPUs requires the GPU Coder™ Interface for Deep Learning Libraries support package. R2020b adds support for quantization of a neural network targeting FPGAs and requires Deep Learning HDL Toolbox™.

If you have download or installation problems, please contact Technical Support -

Comments and Ratings (14)



To Address Invalid GPU Execution Environment Error:

Quantization of a neural network requires a GPU, the GPU Coder™ Interface for Deep Learning Libraries support package, and the Deep Learning Toolbox™ Model Quantization Library support package.
Using a GPU requires a CUDA® enabled NVIDIA® GPU with compute capability 6.1 or higher excluding 6.2.

If you still run into an "Invalid execution environment" error even with the above dependencies satisfied, the environment variables for cuDNN and TensorRT are probably not set correctly. To check, execute coder.checkGpuInstall in the Command Window.
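
As a rough sketch, the checks above could be scripted like this. It assumes Parallel Computing Toolbox (for gpuDevice) and GPU Coder (for coder.gpuEnvConfig and coder.checkGpuInstall, spelled with a lowercase "pu") are installed; the helper name isSupportedCC is made up for illustration.

```matlab
% Compute capability rule stated above: 6.1 or higher, excluding 6.2.
% isSupportedCC is a hypothetical helper, not a toolbox function.
isSupportedCC = @(cc) cc >= 6.1 && abs(cc - 6.2) > 1e-6;

% Report whether the selected GPU meets the rule.
if gpuDeviceCount > 0
    d = gpuDevice;
    cc = str2double(d.ComputeCapability);
    fprintf('%s: compute capability %s, supported = %d\n', ...
        d.Name, d.ComputeCapability, isSupportedCC(cc));
else
    disp('No CUDA-enabled GPU detected.');
end

% Check the code generation environment for each deep learning library.
for lib = {'cudnn', 'tensorrt'}
    envCfg = coder.gpuEnvConfig('host');
    envCfg.DeepLibTarget = lib{1};   % library to check
    envCfg.DeepCodegen = 1;          % include the deep learning codegen check
    envCfg.Quiet = 1;                % suppress the printed report
    results = coder.checkGpuInstall(envCfg);
    disp(results);                   % struct of pass/fail flags
end
```

A failed flag for cuDNN or TensorRT in the displayed results suggests that the corresponding library path is not configured.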

Instructions for fixing invalid cuDNN and TensorRT environments are in the answer by Jaya here:

As always, if there are any additional issues, please contact Technical Support -

Ashwathi Nambiar

Ashwathi Nambiar

Denis Navarro

After I installed the support package, the command calResults = calibrate(quantObj, aug_calData) returns:

Error using dlquantization.instrument
The value of 'executionEnvironment' is invalid. No GPU available. dlquantizer requires a GPU machine to quantize a network object.
Error in dlquantizer/calibrate (line 25)
results = dlquantization.instrument(obj.NetworkObject,, obj.DLAccelData,'BatchSize',p.Results.batchSize,'MiniBatchSize',p.Results.miniBatchSize,'ExecutionEnvironment',obj.ExecutionEnvironment);

But I have a GPU.

CUDADevice with properties:

Name: 'GeForce GTX 1080 Ti'
Index: 1
ComputeCapability: '6.1'
SupportsDouble: 1
DriverVersion: 10.2000
ToolkitVersion: 10.1000
MaxThreadsPerBlock: 1024
MaxShmemPerBlock: 49152
MaxThreadBlockSize: [1024 1024 64]
MaxGridSize: [2.1475e+09 65535 65535]
SIMDWidth: 32
TotalMemory: 1.1811e+10
AvailableMemory: 9.9493e+09
MultiprocessorCount: 28
ClockRateKHz: 1683000
ComputeMode: 'Default'
GPUOverlapsTransfers: 1
KernelExecutionTimeout: 1
CanMapHostMemory: 1
DeviceSupported: 1
DeviceSelected: 1


Hi Yang,
Thanks for your feedback! Based on the error message you provided, it was a generic error reported from an incorrect function call. It is a bit hard to locate the issue without reproduction steps.
We recently updated the documentation. Please check whether it helps:
Can the same network run successfully in Deep Network Designer App?
Please contact our Technical Support and report the issue. We will follow up with you directly.

yang li

I am quantizing a model of my own. The network model comes from the ONNX import layer.
I created a new imds for training, and I use the same imds for calibration.
During calibration it reported an error: "Cannot perform assignment with 0 elements on the right".

Are there any errors in the above steps?
Thank you.

Dor Rubin

Vaidehi Venkatesan

Frequently Asked Questions (April 2020)
Q: After I installed the support package, running the command dlquantizer(net) errors out in "coder.internal.getSupportedLayerTypes".
A: The quantization functions require a GPU and the GPU Coder Interface for Deep Learning Libraries:
Installing both packages successfully eliminates the errors.

Q: What is INT8 Quantization and what is it for?
A: A technical article on the concepts can be found here:
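
As a quick illustration of the idea, the following sketch applies a generic symmetric INT8 scheme to a handful of made-up single-precision weights. It shows the concept only and is not claimed to be the algorithm this library uses internally; all variable names and values are illustrative.

```matlab
% Generic symmetric INT8 quantization of a single-precision vector.
% Concept illustration only; not the library's internal algorithm.
w = single([-1.5 -0.25 0 0.4 1.2]);   % example weights (made up)

scale = max(abs(w)) / 127;            % map the largest magnitude to 127
q = int8(round(w / scale));           % quantize; int8() saturates to [-128, 127]
wHat = single(q) * scale;             % dequantize to compare with the original

maxErr = max(abs(w - wHat));          % rounding error, about scale/2 at most

bytesSingle = numel(w) * 4;           % single: 4 bytes per value
bytesInt8 = numel(q) * 1;             % int8: 1 byte per value
reduction = 1 - bytesInt8 / bytesSingle;

fprintf('scale = %g, max error = %g, storage reduction = %.0f%%\n', ...
    scale, maxErr, 100 * reduction);
```

Each value is stored in one byte instead of four, at the cost of a rounding error bounded by about half the scale factor.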

Q: How can I start with a simple example?
A: You can try quantizing the squeezenet neural network after retraining it to classify new images, following the Train Deep Learning Network to Classify New Images example. Quantization reduces the memory required for the network by approximately 75%, while the accuracy of the network stays almost the same.
Type >> help dlquantizer and find the documented examples to get started.

Q: My network object does not seem to be supported. Why?
A: Leave a comment here and we will contact you.


MATLAB Release Compatibility
Created with R2020a
Compatible with R2020a to R2020b
Platform Compatibility
Windows macOS Linux
