Deep Learning Toolbox Model Quantization Library lets you quantize and compress your deep learning models. It provides instrumentation services that collect layer-level data on the weights, activations, and intermediate computations during the calibration step. Using this instrumentation data, the library quantizes your model and provides metrics to validate the accuracy of the quantized network.
The library supports an iterative workflow for tuning the quantization approach until it meets the required accuracy, and it provides heuristics to help choose a quantization strategy.
You can validate the quantized network and compare the accuracy against the single precision baseline.
The library provides a Quantization app that lets you analyze and visualize the instrumentation data to understand how quantizing the weights and biases of selected layers affects accuracy.
The library supports INT8 quantization of supported layers for FPGAs and NVIDIA GPUs.
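The calibrate-then-validate workflow described above might look roughly like the sketch below. It is an outline under stated assumptions, not the library's reference example: `net` is assumed to be a trained network built from supported layers, `calData` and `valData` are assumed to be datastores of representative images, and the helper name `quantizeSketch` is hypothetical. It also assumes a supported GPU and the required support packages are installed.

```matlab
function [calResults, valResults] = quantizeSketch(net, calData, valData)
% quantizeSketch  Hypothetical helper outlining the quantization workflow.
%   net     - trained network with supported layers (assumption)
%   calData - datastore used for calibration (assumption)
%   valData - datastore used for validation (assumption)

% Create the quantizer object for the network.
quantObj = dlquantizer(net);

% Exercise the network on the calibration data to collect the
% ranges of weights, biases, and activations per layer.
calResults = calibrate(quantObj, calData);

% Quantize the network and compare its accuracy against the
% single-precision baseline using the default metric.
valResults = validate(quantObj, valData);
end
```

If the default metric does not suit your network, dlquantizationOptions can be passed to validate as a third argument; see the dlquantizer documentation for details.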
Please refer to the documentation here: https://www.mathworks.com/help/deeplearning/ref/deepnetworkquantizer-app.html
This hardware support package is functional for R2020a and beyond. Quantization of a neural network targeting GPUs requires the GPU Coder™ Interface for Deep Learning Libraries support package. R2020b adds support for quantization of a neural network targeting FPGAs and requires Deep Learning HDL Toolbox™.
If you have download or installation problems, please contact Technical Support - www.mathworks.com/contact_ts
Internal error in deep neural network quantization?
To Address Invalid GPU Execution Environment Error:
Quantization of a neural network requires a GPU, the GPU Coder™ Interface for Deep Learning Libraries support package, and the Deep Learning Toolbox™ Model Quantization Library support package.
Using a GPU requires a CUDA®-enabled NVIDIA® GPU with compute capability 6.1 or higher, excluding 6.2.
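As a quick way to test the compute capability rule above, the following sketch parses a compute capability string such as '6.1' and applies the "6.1 or higher, excluding 6.2" condition. The function name `isSupportedComputeCapability` is hypothetical and is not part of any MathWorks product.

```matlab
function ok = isSupportedComputeCapability(cc)
% isSupportedComputeCapability  Hypothetical check of the stated rule:
%   compute capability 6.1 or higher, excluding 6.2.
%   cc - character vector such as '6.1'

parts = sscanf(cc, '%d.%d');   % e.g. '6.1' -> [6; 1]
major = parts(1);
minor = parts(2);

atLeast61 = (major > 6) || (major == 6 && minor >= 1);
is62      = (major == 6 && minor == 2);

ok = atLeast61 && ~is62;
end
```

On a machine with Parallel Computing Toolbox, one way to use it is `d = gpuDevice; isSupportedComputeCapability(d.ComputeCapability)`.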
If you still run into an "Invalid execution environment" error even with the above dependencies satisfied, it is likely that the required cuDNN and TensorRT environments are not set up correctly. To check, run coder.checkGpuInstall in the Command Window.
Instructions for fixing invalid cuDNN and TensorRT environments can be found in the answer by Jaya here: https://www.mathworks.com/matlabcentral/answers/508318-getting-error-for-nvidia-cudnn-with-matlab-2019b-in-windows-10#answer_420160
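A more targeted environment check might look like the sketch below. It assumes GPU Coder is installed and that the cuDNN and TensorRT locations are supplied through the NVIDIA_CUDNN and NVIDIA_TENSORRT environment variables; treat those variable names and the chosen settings as assumptions to confirm against the GPU Coder documentation for your release.

```matlab
% Show the library locations that the environment currently advertises
% (variable names are an assumption; empty output means "not set").
fprintf('NVIDIA_CUDNN    = %s\n', getenv('NVIDIA_CUDNN'));
fprintf('NVIDIA_TENSORRT = %s\n', getenv('NVIDIA_TENSORRT'));

% Configure a check of the host GPU for deep learning code generation.
envCfg = coder.gpuEnvConfig('host');
envCfg.DeepLibTarget = 'cudnn';   % switch to 'tensorrt' to check TensorRT
envCfg.DeepCodegen   = 1;         % include the deep learning codegen check
envCfg.Quiet         = 1;         % suppress the verbose report

% Run the check and display the pass/fail flags it returns.
results = coder.checkGpuInstall(envCfg);
disp(results);
```

A failed cuDNN or TensorRT flag in the displayed results would point to the environment setup rather than to the quantization library itself.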
As always, if there are any additional issues, please contact Technical Support - www.mathworks.com/contact_ts.
Why doesn't my MATLAB have the deepNetworkQuantizer app?
After I installed the support package, the command calResults = calibrate(quantObj, aug_calData) returns:
Error using dlquantization.instrument
The value of 'executionEnvironment' is invalid. No GPU available. dlquantizer requires a GPU machine to quantize a network object.
Error in dlquantizer/calibrate (line 25)
results = dlquantization.instrument(obj.NetworkObject, p.Results.data, obj.DLAccelData,'BatchSize',p.Results.batchSize,'MiniBatchSize',p.Results.miniBatchSize,'ExecutionEnvironment',obj.ExecutionEnvironment);
But I have a GPU.
CUDADevice with properties:
Name: 'GeForce GTX 1080 Ti'
MaxThreadBlockSize: [1024 1024 64]
MaxGridSize: [2.1475e+09 65535 65535]
Thanks for your feedback! Based on the error message you provided, it was a generic error reported from an incorrect function call. It is a bit hard to locate the issue without reproduction steps.
We recently updated the documentation. Please see if it provides any reference:
Can the same network run successfully in Deep Network Designer App?
Please contact our Technical Support and report the issue. We will follow up with you directly.
I am quantizing a model of my own. The network model was imported from ONNX as layers.
I created a new imds for training and used the same imds for calibration.
During calibration, it reported an error: "Cannot perform assignment with 0 elements on the right".
Are there any errors in the above steps?
Frequently Asked Questions (April 2020)
Q: After I installed the support package, running command dlquantizer(net) errors out in "coder.internal.getSupportedLayerTypes".
A: The quantization functions require a GPU and the GPU Coder Interface for Deep Learning Libraries: https://www.mathworks.com/matlabcentral/fileexchange/68642-gpu-coder-interface-for-deep-learning-libraries
Successful installation of both packages will eliminate the errors.
Q: What is INT8 Quantization and what is it for?
A: A technical article on the concepts can be found here: https://www.mathworks.com/company/newsletters/articles/what-is-int8-quantization-and-why-is-it-popular-for-deep-neural-networks.html
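As a rough illustration of the idea, the sketch below maps single-precision values to 8-bit integers using one symmetric linear scale factor and then maps them back. This is a generic textbook scheme chosen for illustration; it is an assumption and not necessarily the scaling scheme the library itself uses. The function name `int8QuantizeSketch` is hypothetical.

```matlab
function [q, scale, xHat] = int8QuantizeSketch(x)
% int8QuantizeSketch  Hypothetical symmetric linear INT8 quantization.
%   x     - array of real values (for example, layer weights)
%   q     - int8 codes in the range [-127, 127]
%   scale - real value represented by one integer step
%   xHat  - values reconstructed from the int8 codes

x = single(x);
maxAbs = max(abs(x(:)));

% Choose the scale so the largest magnitude maps to 127.
if maxAbs == 0
    scale = single(1);
else
    scale = maxAbs / 127;
end

% Round to the nearest integer step and clamp to the int8 range used here.
q = int8(max(min(round(x / scale), 127), -127));

% Reconstruct approximate values to see the quantization error.
xHat = single(q) * scale;
end
```

Each value is stored in 1 byte instead of the 4 bytes used by single precision, at the cost of a rounding error of at most about half a step (`scale/2`) per value.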
Q: How can I start with a simple example?
A: You can try quantizing the SqueezeNet neural network after retraining it to classify new images according to the Train Deep Learning Network to Classify New Images (https://www.mathworks.com/help/deeplearning/ug/train-deep-learning-network-to-classify-new-images.html) example. The memory required for the network can be reduced by approximately 75% while the accuracy of the network stays almost the same.
Type >> help dlquantizer and find the documented examples to get started.
Q: My network object does not seem to be supported. Why?
A: Leave a comment here and we will contact you.