Deep Learning Toolbox Model Quantization Library enables quantization and compression of your deep learning models. It provides instrumentation services that let you collect layer-level data on the weights, activations, and intermediate computations during the calibration step. Using this instrumentation data, the library quantizes your model and provides metrics to validate the accuracy of the quantized network.
The library supports an iterative workflow for tuning the quantization approach until it meets the required accuracy. It provides heuristics to help choose the right quantization strategy.
You can validate the quantized network and compare its accuracy against the single-precision baseline.
The library provides a Quantization app that lets you analyze and visualize the instrumentation data to understand how quantizing the weights and biases of selected layers affects accuracy.
The library supports INT8 quantization for FPGAs and NVIDIA GPUs, for supported layers.
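The calibrate, quantize, and validate steps described above can be sketched at the command line roughly as follows. This is a minimal sketch, not a definitive recipe: the function name quantizeWorkflowSketch and its inputs net, calData, and valData are hypothetical placeholders for your own trained network and datastores, and it assumes a supported GPU plus the required support packages are installed. The result fields it displays are the ones used in the documented examples; they may differ in your release.

```matlab
function [quantObj, calResults, valResults] = quantizeWorkflowSketch(net, calData, valData)
% quantizeWorkflowSketch  Hypothetical helper illustrating the workflow.
%   net     - a trained network (placeholder, supplied by you)
%   calData - datastore of representative calibration data (placeholder)
%   valData - datastore of labeled validation data (placeholder)

% Create the quantizer object for a GPU target.
quantObj = dlquantizer(net, 'ExecutionEnvironment', 'GPU');

% Calibration: run the network on sample data and collect the dynamic
% ranges of weights, biases, and activations for each layer.
calResults = calibrate(quantObj, calData);

% Validation: compare the quantized network with the single-precision
% baseline using the default metric.
quantOpts = dlquantizationOptions;
valResults = validate(quantObj, valData, quantOpts);

% Show the metric for both implementations (field names assumed to
% match the documented examples).
disp(valResults.MetricResults.Result)
end
```

If the validation metric is not acceptable, one option is to calibrate again with more representative data and rerun validate.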
Please refer to the documentation here: https://www.mathworks.com/help/deeplearning/ref/deepnetworkquantizer-app.html
This hardware support package is functional for R2020a and beyond. Quantization of a neural network targeting GPUs requires the GPU Coder™ Interface for Deep Learning Libraries support package. R2020b adds support for quantization of a neural network targeting FPGAs and requires Deep Learning HDL Toolbox™.
Quantization Workflow Prerequisites can be found on this page:
If you have download or installation problems, please contact Technical Support - www.mathworks.com/contact_ts
Hi Ali and Yang-
The dlquantizer object created in the 2020b release is intended to be used for code generation. Here are links to examples for generating code containing the quantized network for GPU and FPGA targets.
I am facing the same issue that yang li mentioned on 5 Dec 2020.
After quantizing my network and exporting it as a dlquantizer object, the object still contains the same network as before quantization. How can I get the quantized version of the network?
Hi Yukui Luo -
A prerequisite for supporting FPGAs is MATLAB® Coder™ Interface for Deep Learning Libraries.
I get an error when setting the execution environment to 'FPGA':
dlquantObj = dlquantizer(snet,'ExecutionEnvironment','FPGA');
and I got:
Error using dlquantizer
Unable to resolve the name dltargets.mkldnn.SupportedLayerImpl.m_sourceFiles.
Can anybody help?
When I use the Deep Learning Toolbox Model Quantization Library to quantize a network, how do I get the network weights in the int8 data type?
From what I can see, the weights of the exported network are still of type single, and identical to the network parameters before quantization.
Even after I generate CUDA code with TensorRT and read back the bin file, the weights are still of type single.
Did I miss a step needed to get the network weights in the int8 data type?
Does this library support INT8 quantization on the Jetson Nano? The Jetson Nano has compute capability 5.3, so is it not supported? According to the documentation, compute capability 6.1 or higher is required.
Another question: when will import and export of calibrated, quantized ONNX models (opset version 10+) be supported, for example a quantized ResNet-50? Link: https://pan.baidu.com/s/1oMcm2w4r5bUFU-RAqV8AUg Extraction code: afba
Internal error in deep neural network quantization?
To Address Invalid GPU Execution Environment Error:
Quantization of a neural network requires a GPU, the GPU Coder™ Interface for Deep Learning Libraries support package, and the Deep Learning Toolbox™ Model Quantization Library support package.
Using a GPU requires a CUDA® enabled NVIDIA® GPU with compute capability 6.1 or higher excluding 6.2.
If you still run into an "Invalid execution environment" error even with the above dependencies satisfied, it is likely that the environments required for cuDNN and TensorRT are not set correctly. To check, execute coder.checkGpuInstall in the Command Window.
Instructions for fixing invalid cuDNN and TensorRT environments are in the answer by Jaya here: https://www.mathworks.com/matlabcentral/answers/508318-getting-error-for-nvidia-cudnn-with-matlab-2019b-in-windows-10#answer_420160
As always, if there are any additional issues, please contact Technical Support - www.mathworks.com/contact_ts.
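The checks described in this answer can be scripted. The following is a sketch under stated assumptions: isSupportedCC is a hypothetical helper written only for this illustration, the hardware check needs Parallel Computing Toolbox, and the environment check needs GPU Coder together with the deep learning interface support package. It only reports what it finds; it does not repair the environment.

```matlab
% Hypothetical helper: true for compute capability 6.1 or higher,
% excluding 6.2, as stated in the requirement above.
isSupportedCC = @(cc) (str2double(cc) >= 6.1) && ~strcmp(cc, '6.2');

% The rule applied to a few capability strings.
fprintf('5.3 supported: %d\n', isSupportedCC('5.3'));
fprintf('6.1 supported: %d\n', isSupportedCC('6.1'));
fprintf('6.2 supported: %d\n', isSupportedCC('6.2'));

if canUseGPU
    % Query the selected GPU (requires Parallel Computing Toolbox).
    gpu = gpuDevice;
    fprintf('%s: compute capability %s, supported: %d\n', ...
        gpu.Name, gpu.ComputeCapability, isSupportedCC(gpu.ComputeCapability));

    % Check the code generation environment for deep learning
    % (requires GPU Coder). Use 'cudnn' instead to check cuDNN.
    envCfg = coder.gpuEnvConfig('host');
    envCfg.DeepLibTarget = 'tensorrt';
    envCfg.DeepCodegen = 1;
    envCfg.Quiet = 1;
    results = coder.checkGpuInstall(envCfg);
    disp(results)
else
    disp('No usable GPU was detected from MATLAB.');
end
```

A failed entry in the displayed results for cuDNN or TensorRT points to the environment setup rather than to the quantization library itself.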
Why doesn't my MATLAB have the deepNetworkQuantizer app?
After I installed the support package, the command calResults = calibrate(quantObj, aug_calData) returns:
Error using dlquantization.instrument
The value of 'executionEnvironment' is invalid. No GPU available. dlquantizer requires a GPU machine to quantize a network object.
Error in dlquantizer/calibrate (line 25)
results = dlquantization.instrument(obj.NetworkObject, p.Results.data, obj.DLAccelData,'BatchSize',p.Results.batchSize,'MiniBatchSize',p.Results.miniBatchSize,'ExecutionEnvironment',obj.ExecutionEnvironment);
But I have a GPU.
CUDADevice with properties:
Name: 'GeForce GTX 1080 Ti'
MaxThreadBlockSize: [1024 1024 64]
MaxGridSize: [2.1475e+09 65535 65535]
Thanks for your feedback! Based on the error message you provided, it was a generic error reported from an incorrect function call. It is a bit hard to locate the issue without reproduction steps.
We recently updated the documentation. Please see if it provides any reference:
Can the same network run successfully in Deep Network Designer App?
Please contact our Technical Support and report the issue. We will follow up with you directly.
I am quantizing a model of my own. The network was created by importing layers from ONNX.
I created a new imds for training and used the same imds for calibration.
Calibration fails with the error "Cannot perform assignment with 0 elements on the right".
Is there anything wrong with the steps above?
Frequently Asked Questions (April 2020)
Q: After I installed the support package, running command dlquantizer(net) errors out in "coder.internal.getSupportedLayerTypes".
A: The quantization functions require a GPU and the GPU Coder Interface for Deep Learning Libraries: https://www.mathworks.com/matlabcentral/fileexchange/68642-gpu-coder-interface-for-deep-learning-libraries
Successful installation of both packages will eliminate the errors.
Q: What is INT8 Quantization and what is it for?
A: A technical article on the concepts can be found here: https://www.mathworks.com/company/newsletters/articles/what-is-int8-quantization-and-why-is-it-popular-for-deep-neural-networks.html
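As a small numeric illustration of the idea, the following sketch quantizes a few made-up single-precision weights to int8 with a simple symmetric scale and then maps them back. It uses only base MATLAB. The scaling scheme is a generic textbook choice and is not necessarily the one this library applies.

```matlab
% Example single-precision weights (made-up values for illustration).
w = single([-0.82 0.10 0.47 1.30]);

% Symmetric scaling: map the largest magnitude onto 127.
scale = max(abs(w)) / 127;

% Quantize: int8() rounds to the nearest integer and saturates
% to the range [-128, 127].
q = int8(w / scale);

% Dequantize to see which real value each int8 code represents.
wHat = single(q) * scale;

% The rounding error per weight is at most about half a step.
maxErr = max(abs(w - wHat));

% Storage: 4 bytes per single value versus 1 byte per int8 value.
bytesSingle = numel(w) * 4;
bytesInt8 = numel(q) * 1;

fprintf('scale = %g, max error = %g\n', scale, maxErr);
fprintf('storage: %d bytes -> %d bytes (%.0f%% smaller)\n', ...
    bytesSingle, bytesInt8, 100 * (1 - bytesInt8 / bytesSingle));
```

The 4-to-1 ratio between single and int8 storage is where the memory savings of INT8 quantization come from.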
Q: How can I start with a simple example?
A: You can try quantizing the squeezenet neural network after retraining it to classify new images, following the Train Deep Learning Network to Classify New Images (https://www.mathworks.com/help/deeplearning/ug/train-deep-learning-network-to-classify-new-images.html) example. The memory required for the network can be reduced by approximately 75% while the accuracy of the network stays almost the same.
Type >> help dlquantizer and find the documented examples to get started.
Q: My network object does not seem to be supported. Why?
A: Leave a comment here and we will contact you.