ONNX Model Import and Inference Performance in MATLAB Is Significantly Slower Than Python (CPU & GPU)
I am experiencing very slow ONNX model import and inference performance in MATLAB compared to Python when using the same ONNX file and hardware.
When running inference in MATLAB, both model loading time and prediction time (especially on GPU) are orders of magnitude slower than in Python.
MATLAB Reproduction Code
hTime = tic();
oNet = importNetworkFromONNX("rfdetr.onnx");
loadTime = toc(hTime);
fprintf('ONNX Model Load Time: %.4f seconds\n', loadTime);
sizeImg = 384;
% Load an image
tImg = imread("img.png");
tImgSmall = imresize(tImg,[sizeImg sizeImg]);
mImg = single(tImgSmall);
% Convert to dlarray
dlImg = dlarray(mImg,"SSC");
% Measure time for prediction on CPU
hTime = tic();
cpuPrediction = oNet.predict(dlImg);
cpuTime = toc(hTime);
fprintf('CPU Prediction Time: %.4f seconds\n', cpuTime);
%% Measure time for prediction on GPU
hTime = tic();
mGPUImg = gpuArray(dlImg);
[dlBoxes, dlLabels] = oNet.predict(mGPUImg);
gpuTime = toc(hTime);
fprintf('GPU Prediction Time: %.4f seconds\n', gpuTime);
MATLAB Performance Results
ONNX Model Load Time: 30.5475 seconds
CPU Prediction Time: 10.3682 seconds
GPU Prediction Time: 512.8456 seconds
Python Performance Results (Same ONNX File, Same Hardware)
ONNX Load: 0.56 seconds
CPU Inference: 57.57 ms
GPU Inference: 5.35 ms
Questions
- Why is importNetworkFromONNX taking ~30 seconds in MATLAB while loading the same ONNX file in Python takes less than 1 second?
- Why is inference significantly slower in MATLAB compared to Python: ~10 seconds vs ~57 ms on CPU, and ~500 seconds vs ~5 ms on GPU?
- What is the recommended way to run ONNX object detection models efficiently in MATLAB?
Thanks!
2 Comments
Jiangyin
on 19 Dec 2025 at 15:18
Hi IIan,
Re: "Why is importNetworkFromONNX taking ~30 seconds in MATLAB while loading the same ONNX file in Python takes less than 1 second":
importNetworkFromONNX translates the operators in the ONNX model into Deep Learning Toolbox layers or auto-generated custom layers in MATLAB, and that translation takes time. Importing the model is a one-time cost: once it has been translated into a MATLAB network, you can save the network to a MAT-file and load it back whenever you need it.
% Assuming the imported network is in the variable 'net'
filename = 'trained_network.mat';
save(filename, 'net');  % one-time save after import
S = load(filename);     % fast in later sessions
net = S.net;
Saving and reloading the MAT-file should be much faster than re-importing the ONNX file each time.
Re: the inference performance comparison between MATLAB and Python:
Which MATLAB version are you using? We are actively enhancing Deep Learning Toolbox as well as the ONNX model importer. I can try the version you are using and investigate the reason for the performance differences.
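One more note on the GPU number: in MATLAB, the first predict call on the GPU pays one-time costs (kernel compilation and host-to-device transfer), so timing a single cold call with tic/toc will look far worse than steady-state performance. A sketch of a fairer measurement, assuming the imported network is in the variable net and Parallel Computing Toolbox is installed:

```matlab
% Move the input to the GPU once, outside the timed region
dlImgGPU = gpuArray(dlImg);

% Warm-up call: absorbs one-time kernel compilation overhead
predict(net, dlImgGPU);

% gputimeit synchronizes the device and averages repeated runs
gpuTime = gputimeit(@() predict(net, dlImgGPU));
fprintf('GPU prediction time (steady state): %.4f seconds\n', gpuTime);
```

Unlike tic/toc around an asynchronous GPU call, gputimeit waits for the device to finish before recording the time, so the reported number reflects actual compute rather than launch latency.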
Answers (0)