ONNX Model Import and Inference Performance in MATLAB Is Significantly Slower Than Python (CPU & GPU)

I am experiencing very slow ONNX model import and inference performance in MATLAB compared to Python when using the same ONNX file and hardware.
The ONNX model was exported from Python using the RF-DETR repository (nano model).
When running inference in MATLAB, both model loading time and prediction time (especially on GPU) are orders of magnitude slower than in Python.
MATLAB Reproduction Code
hTime = tic();
oNet = importNetworkFromONNX("rfdetr.onnx");
loadTime = toc(hTime);
fprintf('ONNX Model Load Time: %.4f seconds\n', loadTime);
sizeImg = 384;
% Load an image
tImg = imread("img.png");
tImgSmall = imresize(tImg,[sizeImg sizeImg]);
mImg = single(tImgSmall);
% Convert to dlarray
dlImg = dlarray(mImg,"SSC");
% Measure time for prediction on CPU
hTime = tic();
cpuPrediction = oNet.predict(dlImg);
cpuTime = toc(hTime);
fprintf('CPU Prediction Time: %.4f seconds\n', cpuTime);
%% predict Img with GPU
hTime = tic();
mGPUImg = gpuArray(dlImg);
[dlBoxes, dlLabels] = oNet.predict(mGPUImg);
gpuTime = toc(hTime);
fprintf('GPU Prediction Time: %.4f seconds\n', gpuTime);
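(Note on the GPU number: the timed span above includes the host-to-device transfer and the first-call setup that MATLAB performs on the initial GPU `predict`, and GPU execution is asynchronous, so `toc` may not reflect steady-state inference time. A fairer steady-state measurement might look like this sketch, reusing the same `oNet` and `dlImg` from above:)

```matlab
% Warm up: the first GPU predict call includes one-time setup overhead
mGPUImg = gpuArray(dlImg);
oNet.predict(mGPUImg);

% Timed run: wait(gpuDevice) blocks until all queued GPU work finishes,
% so toc measures actual execution time rather than just kernel launch
hTime = tic();
[dlBoxes, dlLabels] = oNet.predict(mGPUImg);
wait(gpuDevice);
gpuSteadyTime = toc(hTime);
fprintf('GPU Prediction Time (steady state): %.4f seconds\n', gpuSteadyTime);
```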
MATLAB Performance Results
ONNX Model Load Time: 30.5475 seconds
CPU Prediction Time: 10.3682 seconds
GPU Prediction Time: 512.8456 seconds
Python Performance Results (Same ONNX, Same Hardware)
ONNX Load: 0.56 seconds
CPU Inference: 57.57 ms
GPU Inference: 5.35 ms
Questions
  • Why is importNetworkFromONNX taking ~30 seconds in MATLAB while loading the same ONNX file in Python takes less than 1 second?
  • Why is inference significantly slower in MATLAB than in Python: ~10 seconds vs. ~57 ms on CPU, and ~500 seconds vs. ~5 ms on GPU?
  • What is the recommended way to run ONNX object detection models efficiently in MATLAB?
Thanks!
  2 Comments
Jiangyin on 19 Dec 2025 at 15:18
Hi Ilan,
Re: "Why is importNetworkFromONNX taking ~30 seconds in MATLAB while loading the same ONNX file in Python takes less than 1 second":
"importNetworkFromONNX" translates each operation in the ONNX model into a Deep Learning Toolbox layer or an auto-generated custom layer, and that translation takes time. Importing the model is a one-time cost: once it has been translated into a MATLAB network, you can save the network and load it back whenever you need it.
% if the model is with variable name 'net'
filename = 'trained_network.mat';
save(filename, 'net');
load('trained_network.mat')
Saving and reloading the network this way should give you much better load performance.
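Putting the two steps together, a common pattern is to run the slow ONNX import only once and reuse the cached MAT-file on every later run (a sketch; the cache file name here is just an example):

```matlab
% Import the ONNX model only the first time; afterwards load the
% already-translated MATLAB network, which is much faster
cacheFile = 'rfdetr_net.mat';
if isfile(cacheFile)
    s = load(cacheFile, 'net');   % fast path: cached network
    net = s.net;
else
    net = importNetworkFromONNX('rfdetr.onnx');  % slow one-time translation
    save(cacheFile, 'net');
end
```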
Re: the performance comparison between MATLAB and Python:
Which MATLAB version are you using? We are actively enhancing Deep Learning Toolbox as well as the ONNX model importer. If you let me know the version, I can try it and investigate the reason for the performance differences.
Answers (0)

Release: R2024b
