Execution Speed
Use code generation options and optimizations to improve the execution speed
of the generated code. You can modify or disable dynamic memory allocation,
which can affect execution speed.
You can generate parallelized code by using parfor-loops.
When available, take advantage of preexisting optimized C code and specialized
libraries to speed up execution.
For more information about how to optimize your code for specific conditions, see Optimization Strategies.
Functions
coder.areUnboundedVariableSizedArraysSupported | Check if current configuration settings allow unbounded variable-size arrays (Since R2024a) |
coder.ceval | Call C/C++ function from generated code |
coder.const | Fold expressions into constants in generated code |
coder.inline | Control inlining of current function in generated code |
coder.inlineCall | Inline called function in generated code (Since R2024a) |
coder.loop.interchange | Interchange loop indices in generated code (Since R2023a) |
coder.loop.parallelize | Parallelize specific for loops in generated code; disable automatic parallelization (Since R2021a) |
coder.loop.reverse | Reverse loop iteration order in generated code (Since R2023a) |
coder.loop.tile | Tile for-loops in the generated code (Since R2023a) |
coder.loop.unrollAndJam | Unroll and jam for-loops in the generated code (Since R2023a) |
coder.loop.vectorize | Vectorize for loops in generated code (Since R2023a) |
coder.nonInlineCall | Prevent inlining of called function in generated code (Since R2024a) |
coder.unroll | Unroll for-loop by making a copy of the loop body for each loop iteration |
coder.varsize | Declare variable-size data |
parfor | Parallel for-loop |
Classes
coder.BLASCallback | Abstract class for specifying the BLAS library and CBLAS header and data type information for BLAS calls in generated code |
coder.LAPACKCallback | Abstract class for specifying the LAPACK library and LAPACKE header file for LAPACK calls in generated code |
coder.fftw.StandaloneFFTW3Interface | Abstract class for specifying an FFTW library for FFTW calls in generated code |
coder.loop.Control | Loop optimization control object (Since R2023a) |
Topics
Generated Code Optimizations
- Optimization Strategies
Optimize the execution speed or memory usage of generated code.
- MATLAB Coder Optimizations in Generated Code
To improve the performance of generated code, the code generator uses optimizations.
- Optimize Implicit Expansion in Generated Code
Implicit expansion in the generated code is enabled by default.
memcpy and memset Optimizations
- memcpy Optimization
The code generator optimizes generated code by using memcpy.
- memset Optimization
The code generator optimizes generated code by using memset.
Variable-Size Arrays
- Dynamic Memory Allocation and Performance
Dynamic memory allocation can slow down execution.
- Minimize Dynamic Memory Allocation
Improve execution time by minimizing dynamic memory allocation.
- Provide Maximum Size for Variable-Size Arrays
Use techniques to help the code generator determine the upper bound for a variable-size array.
- Disable Dynamic Memory Allocation During Code Generation
Disable dynamic memory allocation in the app or at the command line.
- Set Dynamic Memory Allocation Threshold
Disable dynamic memory allocation for arrays smaller than a certain size.
- Optimize Dynamic Array Access
Improve execution time of dynamic arrays in generated C code.
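As a minimal sketch of how an upper bound might be provided, the following function combines assert with coder.varsize. The function name boundedFill and the limit of 100 elements are illustrative assumptions, not taken from the linked topics. With a known bound, the code generator may be able to allocate the array statically, depending on the configured dynamic memory allocation settings.

```matlab
function y = boundedFill(n) %#codegen
% Hypothetical example: build a row vector whose length depends on n
% but never exceeds 100 elements (an assumed limit for illustration).
assert(n <= 100);             % helps the code generator bound the loop count
y = zeros(1, 0);              % start with an empty row vector
coder.varsize('y', [1 100]);  % declare y as variable-size, at most 1-by-100
for k = 1:n
    y = [y, k];               %#ok<AGROW> growth is limited by the bound above
end
end
```

In MATLAB, coder.varsize has no effect on execution, so the function can be tested as ordinary code before generating C/C++ from it.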
Array Layout
- Generate Code That Uses Row-Major Array Layout
Generate C/C++ code with row elements stored contiguously in memory.
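The sketch below illustrates one way row-major layout might be requested for a single function by calling coder.rowMajor. The function name rowSums is hypothetical. The loop order is chosen so that the inner loop walks along a row, which is the contiguous direction under row-major layout.

```matlab
function s = rowSums(A) %#codegen
% Hypothetical example: sum each row of the matrix A.
coder.rowMajor;               % request row-major layout for this function
[m, n] = size(A);
s = zeros(m, 1);
for r = 1:m                   % outer loop over rows
    for c = 1:n               % inner loop visits the elements of one row
        s(r) = s(r) + A(r, c);
    end
end
end
```

Row-major layout can also be requested for all functions with the -rowmajor option of the codegen command.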
Loops
- Algorithm Acceleration Using Parallel for-Loops (parfor)
Generate MEX functions for parfor-loops.
- Classification of Variables in parfor-Loops
Variables inside parfor-loops are classified as loop, sliced, broadcast, reduction, or temporary.
- Generate Code with Parallel for-Loops (parfor)
Generate a loop that runs in parallel on shared-memory multicore platforms.
- Specify Maximum Number of Threads in parfor-Loops
Generate a MEX function that executes loop iterations in parallel on a specific number of available cores.
- Specify Maximum Number of Threads to Run Parallel for-Loops in the Generated Code
Run parallel for-loops on a specific number of available cores in the generated code.
- Reduction Assignments in parfor-Loops
A reduction variable accumulates a value that depends on all the loop iterations together.
- Control Compilation of parfor-Loops
Treat parfor-loops as for-loops that run on a single thread.
- Install OpenMP Library on macOS Platform
Install the OpenMP library to generate parallel for-loops on the macOS platform.
- Minimize Redundant Operations in Loops
Move operations outside of the loop when possible.
- Unroll for-Loops and parfor-Loops
Control loop unrolling.
- Automatically Parallelize for Loops in Generated Code
Iterations of parallel for-loops can run simultaneously on multiple cores on the target hardware.
- Reduction Operations Supported for Automatic Parallelization of for-loops
Supported operations for automatic parallelization of for-loops.
- Generate SIMD Code from MATLAB Functions for Intel Platforms
Improve the execution speed of the generated code using Intel® SSE and Intel AVX technology.
- Optimize Loops in Generated Code
Generate code with loop transformations according to your performance requirements.
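To make the variable classification and thread limit described above concrete, here is a small sketch of a parfor-loop with a reduction. The function name sumOfSquares and the limit of 4 threads are assumptions chosen for illustration.

```matlab
function s = sumOfSquares(x) %#codegen
% Hypothetical example of a parfor-loop with a reduction.
%   i : loop variable
%   x : sliced input (each iteration reads only x(i))
%   s : reduction variable (accumulates across all iterations)
s = 0;
parfor (i = 1:numel(x), 4)    % assumption: use at most 4 threads
    s = s + x(i)^2;
end
end
```

Because the iterations can run in any order, floating-point results of a reduction such as this one can differ slightly from those of an ordinary for-loop.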
Function Calls
- Avoid Data Copies of Function Inputs in Generated Code
Generate code that passes input arguments by reference.
- Control Inlining to Fine-Tune Performance and Readability of Generated Code
Inlining eliminates the overhead of function calls but can produce larger C/C++ code and reduce code readability.
- Fold Function Calls into Constants
Reduce execution time by replacing an expression with a constant in the generated code.
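The following sketch shows how constant folding and inlining control might look in source code. The function names applyGain, lookupGain, and addOffset are hypothetical. coder.const asks the code generator to evaluate a call with constant inputs at code generation time, and coder.inline('always') requests inlining of the function that contains it.

```matlab
function y = applyGain(u) %#codegen
% Hypothetical entry-point function.
g = coder.const(lookupGain(3));  % evaluated during code generation, folded to a constant
y = g * addOffset(u);
end

function g = lookupGain(n)
% Hypothetical helper with a constant input; its result can be folded.
g = 2^n / 10;
end

function v = addOffset(u)
% Hypothetical helper that is small enough to be worth inlining.
coder.inline('always');          % request inlining at each call site
v = u + 1;
end
```

Both directives have no effect when the code runs in MATLAB, so the function behaves the same before and after code generation.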
Numerical Edge Cases
- Disable Support for Integer Overflow or Nonfinites
Improve performance by suppressing generation of supporting code to handle integer overflow or nonfinites.
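A configuration along the following lines is one way these checks might be turned off at the command line. The entry-point name myFunction and its input type are placeholders. This is only appropriate when the algorithm does not rely on saturation, Inf, or NaN behavior, because the generated code no longer handles those cases.

```matlab
% Sketch of a static library configuration with the supporting code disabled.
cfg = coder.config('lib');
cfg.SaturateOnIntegerOverflow = false;  % omit integer overflow saturation code
cfg.SupportNonFinite = false;           % omit support code for Inf and NaN

% myFunction is a hypothetical entry point taking a 1-by-10 double input.
codegen -config cfg myFunction -args {zeros(1,10)}
```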
External Code Integration
- LAPACK Calls in Generated Code
LAPACK function calls improve the execution speed of code generated for certain linear algebra functions.
- BLAS Calls in Generated Code
BLAS function calls improve the execution speed of code generated for certain low-level vector and matrix operations.
- Optimize Generated Code for Fast Fourier Transform Functions
Choose the correct fast Fourier transform implementation for your workflow and target hardware.
- Integrate External/Custom Code
Improve performance by integrating your own optimized code.
- Speed Up Linear Algebra in Generated Standalone Code by Using LAPACK Calls
Generate LAPACK calls for certain linear algebra functions. Specify the LAPACK library to use.
- Speed Up Matrix Operations in Generated Standalone Code by Using BLAS Calls
Generate BLAS calls for certain low-level matrix operations. Specify the BLAS library to use.
- Speed Up Fast Fourier Transforms in Generated Standalone Code by Using FFTW Library Calls
Generate FFTW library calls for fast Fourier transforms. Specify the FFTW library.
- Synchronize Multithreaded Access to FFTW Planning in Generated Standalone Code
Implement FFT library callback class methods and provide supporting C code to prevent concurrent access to FFTW planning.
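As a simple illustration of integrating existing C code, the sketch below calls the standard C library function fabs through coder.ceval. The wrapper name absViaC is hypothetical, and fabs stands in for whatever optimized routine you might actually want to call.

```matlab
function y = absViaC(x) %#codegen
% Hypothetical wrapper: use MATLAB abs when running in MATLAB,
% and the C library function fabs in generated code.
y = 0;                              % define the type and size of the output first
if coder.target('MATLAB')
    y = abs(x);                     % normal MATLAB execution path
else
    coder.cinclude('<math.h>');     % header that declares fabs
    y = coder.ceval('fabs', x);     % call the C function from generated code
end
end
```

The coder.target check keeps the function runnable in MATLAB, where coder.ceval cannot execute.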
Troubleshooting
Diagnose errors for code generation of parfor-loops.
Resolve Issue: coder.inline("never") and coder.nonInlineCall Do Not Prevent Function Inlining
Troubleshoot instances of coder.inline('never') not preventing inlining.
MEX Generated on macOS Platform Stays Loaded in Memory
Troubleshoot issues that occur when the source MATLAB® code contains global or persistent variables that are reachable from the body of a parfor-loop.