ONNX Model Import and Inference Performance in MATLAB Is Significantly Slower Than Python (CPU & GPU)

I am experiencing very slow ONNX model import and inference performance in MATLAB compared to Python when using the same ONNX file and hardware.
The ONNX model was exported from Python using the RF-DETR repository (nano model).
When running inference in MATLAB, both model loading time and prediction time (especially on GPU) are orders of magnitude slower than in Python.
MATLAB Reproduction Code
hTime = tic();
oNet = importNetworkFromONNX("rfdetr.onnx");
loadTime = toc(hTime);
fprintf('ONNX Model Load Time: %.4f seconds\n', loadTime);
sizeImg = 384;
% Load an image
tImg = imread("img.png");
tImgSmall = imresize(tImg,[sizeImg sizeImg]);
mImg = single(tImgSmall);
% Convert to dlarray
dlImg = dlarray(mImg,"SSC");
% Measure time for prediction on CPU
hTime = tic();
cpuPrediction = oNet.predict(dlImg);
cpuTime = toc(hTime);
fprintf('CPU Prediction Time: %.4f seconds\n', cpuTime);
%% predict Img with GPU
hTime = tic();
mGPUImg = gpuArray(dlImg);
[dlBoxes, dlLabels] = oNet.predict(mGPUImg);
gpuTime = toc(hTime);
fprintf('GPU Prediction Time: %.4f seconds\n', gpuTime);
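(Note on the GPU number: the timed span above includes the host-to-device transfer and the first-call setup that MATLAB performs on the initial GPU `predict`, and GPU execution is asynchronous, so `toc` may not reflect steady-state inference time. A fairer steady-state measurement might look like this sketch, reusing the same `oNet` and `dlImg` from above:)

```matlab
% Warm up: the first GPU predict call includes one-time setup overhead
mGPUImg = gpuArray(dlImg);
oNet.predict(mGPUImg);

% Timed run: wait(gpuDevice) blocks until all queued GPU work finishes,
% so toc measures actual execution time rather than just kernel launch
hTime = tic();
[dlBoxes, dlLabels] = oNet.predict(mGPUImg);
wait(gpuDevice);
gpuSteadyTime = toc(hTime);
fprintf('GPU Prediction Time (steady state): %.4f seconds\n', gpuSteadyTime);
```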
MATLAB Performance Results
ONNX Model Load Time: 30.5475 seconds
CPU Prediction Time: 10.3682 seconds
GPU Prediction Time: 512.8456 seconds
Python Performance Results (Same ONNX, Same Hardware)
ONNX Load: 0.56 seconds
CPU Inference: 57.57 ms
GPU Inference: 5.35 ms
Questions
  • Why is importNetworkFromONNX taking ~30 seconds in MATLAB while loading the same ONNX file in Python takes less than 1 second?
  • Why is inference significantly slower in MATLAB than in Python: ~10 seconds vs. ~57 ms on CPU, and ~500 seconds vs. ~5 ms on GPU?
  • What is the recommended way to run ONNX object detection models efficiently in MATLAB?
Thanks!
  2 Comments
Jiangyin on 19 Dec 2025 at 15:18
Hi Ilan,
Re: "Why is importNetworkFromONNX taking ~30 seconds in MATLAB while loading the same ONNX file in Python takes less than 1 second":
"importNetworkFromONNX" translates each operation in the ONNX model into a Deep Learning Toolbox layer or an auto-generated custom layer, and that translation takes time. Importing the model is a one-time cost: once it has been translated into a MATLAB network, you can save the network and load it back whenever you need it.
% if the model is with variable name 'net'
filename = 'trained_network.mat';
save(filename, 'net');
load('trained_network.mat')
Saving and reloading the network this way should give you much better load performance.
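Putting the two steps together, a common pattern is to run the slow ONNX import only once and reuse the cached MAT-file on every later run (a sketch; the cache file name here is just an example):

```matlab
% Import the ONNX model only the first time; afterwards load the
% already-translated MATLAB network, which is much faster
cacheFile = 'rfdetr_net.mat';
if isfile(cacheFile)
    s = load(cacheFile, 'net');   % fast path: cached network
    net = s.net;
else
    net = importNetworkFromONNX('rfdetr.onnx');  % slow one-time translation
    save(cacheFile, 'net');
end
```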
Re: the performance comparison between MATLAB and Python:
Which MATLAB version are you using? We are actively enhancing Deep Learning Toolbox as well as the ONNX model importer. If you let me know the version, I can try it and investigate the reason for the performance differences.
Answers (0)

Release: R2024b
