Code Generation for Interpolated FIR Filter on ARM Cortex-M Target Using CMSIS
This example shows how to generate and run optimized code using the ARM™ Cortex™-M code replacement library (CRL) for an interpolated finite impulse response (IFIR) filter on the STM32F746G-Discovery hardware.
The IFIR filter is a computationally efficient way to implement high-order FIR filters. It uses multirate signal processing techniques to reduce the number of operations that a single-stage filter of the same specification requires. The model contains a Gaussian noise source block, an FIR Decimation block, a Discrete FIR Filter block, and an FIR Interpolation block.
The FIR Decimation block downsamples the input signal and reduces the sample rate. The lower-order Discrete FIR Filter block filters the decimated signal at the reduced sample rate, thereby minimizing computational complexity. After the filtering stage, the FIR Interpolation block restores the signal to its original sample rate. Because the bulk of the filtering runs at the lower rate with a lower-order filter, the convolution needs significantly fewer multiplications.
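As a rough illustration of where the savings come from, the following back-of-the-envelope estimate uses the common rule of thumb that the order of an equiripple FIR filter is approximately inversely proportional to its transition width. It is a sketch under that assumption, not a result stated by this example; the band edges and interpolation factor are the ones specified later in this example, and the symbols $N_{\text{single}}$, $N_h$, and $N_g$ are introduced here only for illustration.

```latex
% Single-stage design: transition width set directly by the specification
\Delta\omega = \omega_s - \omega_p = 0.101\pi - 0.1\pi = 0.001\pi,
\qquad N_{\text{single}} \propto \frac{1}{\Delta\omega}

% IFIR model filter: band edges are stretched by the interpolation factor M = 7
\Delta\omega_h = M\,\Delta\omega = 0.007\pi
\quad\Longrightarrow\quad N_h \approx \frac{N_{\text{single}}}{M}

% Image suppressor g(z): passes up to \omega_p and rejects the first image
\Delta\omega_g = \left(\frac{2\pi}{M} - \omega_s\right) - \omega_p
              \approx 0.1847\pi - 0.1\pi = 0.0847\pi
\quad\Longrightarrow\quad N_g \ll N_{\text{single}}
```

Under these assumptions, the number of nonzero coefficients in the cascade is roughly $N_{\text{single}}/M + N_g$, compared with $N_{\text{single}}$ for a single-stage design.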
Required Hardware
ARM Cortex-M STM32F746G-Discovery board
Simulate Interpolated FIR Filter
The filter coefficients of the FIR Decimation, Discrete FIR Filter, and FIR Interpolation blocks are derived using the ifir (DSP System Toolbox) function. The ifir function designs a periodic filter, h(z), which provides the coefficients for the Discrete FIR Filter block. The ifir function also designs an image-suppressor filter, g(z), which provides the coefficients for the FIR Decimation and FIR Interpolation blocks shown in this model.
The cascade of these filters represents the optimal minimax FIR approximation of the desired response.
Design h(z) and g(z) for a Low Pass Filter Response
Set the pass band peak ripple (deviation) to 0.005 dB, the stop band peak ripple (deviation) to 80 dB, the interpolation factor to 7, the pass band edge frequency to 0.1π rad/sample, and the stop band edge frequency to 0.101π rad/sample.
Apass = 0.005; % dB
Astop = 80; % dB
Fstop = .101;
M = 7;
F = [.1 Fstop];
Convert the pass band and stop band ripple from dB to linear scale, and design the h(z) and g(z) filters, thereby deriving the filter coefficients.
A = [convertmagunits(Apass,'db','linear','pass') convertmagunits(Astop,'db','linear','stop')];
[h,g] = ifir(M,'low',F,A);
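One way to sanity-check the design is to compute the response of the cascade and compare it with the specification. The following is a minimal sketch, assuming that h is returned as the periodic (zero-stuffed) filter so that the cascade at the original sample rate is conv(h,g), and that Signal Processing Toolbox (freqz, firpmord) is available. The variable names hCascade, maxPassDev, maxStopMag, and nSingle are illustrative and not part of the example model.

```matlab
% Sketch only: repeats the design above so the block runs on its own.
Apass = 0.005; Astop = 80; M = 7; F = [0.1 0.101];
A = [convertmagunits(Apass,'db','linear','pass') ...
     convertmagunits(Astop,'db','linear','stop')];
[h,g] = ifir(M,'low',F,A);

% Overall impulse response, assuming h is the periodic filter and both
% filters are cascaded at the original sample rate.
hCascade = conv(h,g);
[H,w] = freqz(hCascade,1,8192);                 % w in rad/sample, 0 to pi
mag = abs(H);
maxPassDev = max(abs(mag(w <= F(1)*pi) - 1));   % compare with A(1)
maxStopMag = max(mag(w >= F(2)*pi));            % compare with A(2)
fprintf('Pass band deviation %.3g (spec %.3g)\n',maxPassDev,A(1));
fprintf('Stop band magnitude %.3g (spec %.3g)\n',maxStopMag,A(2));

% Nonzero coefficients of the two stages versus the estimated length of
% an equivalent single-stage equiripple design (firpmord is an estimate).
nSingle = firpmord(F,[1 0],A);
fprintf('IFIR nonzero taps: %d, single-stage estimate: %d\n', ...
        nnz(h)+numel(g),nSingle+1);
```

Because the frequency grid is finite, the measured deviations are approximate; a denser grid gives a tighter check near the narrow transition band.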
Ensure that the commands to compute h(z) and g(z) are set in the PreLoadFcn of the model. To open PreLoadFcn, follow these steps:
In the Simulink® Toolstrip, on the Modeling tab, in the Design gallery, select Property Inspector.
With no selection at the top level of the model, on the Properties tab, in the Callbacks section, select PreLoadFcn.
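The callback can also be inspected from the MATLAB command line. This is a small sketch that assumes the example model is on the MATLAB path; mdl is just a local variable introduced here.

```matlab
mdl = 'stm32f746g_ifir_filter';      % model name used in this example
load_system(mdl);                    % load the model without opening the editor
disp(get_param(mdl,'PreLoadFcn'));   % print the callback code as text
```

If the printed text does not contain the ifir design commands, the block coefficient variables h and g are not created when the model loads.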
Open the model.
open_system('stm32f746g_ifir_filter');
You can observe the magnitude of the output from the IFIR filter in the Spectrum Analyzer. To execute the host simulation, on the Simulation tab, click Run. The default simulation time is 1 second.
Configure the Model
You can configure the model either interactively, using the Configuration Parameters in Simulink, or programmatically, using the MATLAB® programming interface.
Interactive Approach
Configure the model for code generation targeting ARM Cortex-M hardware.
Press Ctrl+E (Model settings) to open the Configuration Parameters dialog box. Alternatively, open the Modeling tab and select Model Settings from the toolstrip.
Go to Hardware Implementation. Set Hardware board to STM32F746G-Discovery.
Select Code Generation and specify these settings.
Set System target file to ert.tlc.
Set Build configuration to Faster Runs.
Select Interface under Code Generation. Set Code replacement libraries to ARM Cortex-M.
Select Report under Code Generation. Enable Create code generation report, Open report automatically, and Summarize which blocks triggered code replacements.
Programmatic Approach
To programmatically configure the Simulink model stm32f746g_ifir_filter.slx for deployment on the STM32F746G-Discovery board, enter these commands. Specify ert.tlc as the system target file, to optimize the code for embedded real-time systems, and Faster Runs for the build configuration, to prioritize execution speed.
set_param('stm32f746g_ifir_filter','HardwareBoard','STM32F746G-Discovery');
set_param('stm32f746g_ifir_filter','SystemTargetFile','ert.tlc');
set_param('stm32f746g_ifir_filter','BuildConfiguration','Faster Runs');
Set the code replacement library to ARM Cortex-M to generate optimized code for ARM Cortex-M hardware.
set_param('stm32f746g_ifir_filter','CodeReplacementLibrary','ARM Cortex-M');
Finally, enable the code generation report and the code replacement report. The code replacement report summarizes which blocks triggered code replacements.
set_param('stm32f746g_ifir_filter','GenerateReport','on');
set_param('stm32f746g_ifir_filter','GenerateCodeReplacementReport','on');
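To confirm that the settings took effect, you can read them back with get_param. The sketch below only uses the parameter names that appear in the commands above; the commented LaunchReport line is an assumption about which parameter corresponds to the Open report automatically check box.

```matlab
mdl = 'stm32f746g_ifir_filter';
load_system(mdl);

% Parameter names taken from the set_param calls in this example.
params = {'HardwareBoard','SystemTargetFile','BuildConfiguration', ...
          'CodeReplacementLibrary','GenerateReport', ...
          'GenerateCodeReplacementReport'};
for k = 1:numel(params)
    fprintf('%-30s %s\n',params{k},get_param(mdl,params{k}));
end

% Assumption: 'LaunchReport' corresponds to "Open report automatically".
% set_param(mdl,'LaunchReport','on');
```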
Generate Code
To initiate the code generation and build process for the model, press Ctrl+B or click Build.
Once the code has been generated, click View Code to view the generated code.
You can also verify the code replacements by selecting Open Report > Code Replacement Report.
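The build can also be started from the command line. This is a minimal sketch, assuming the model is already configured as described above and that the required code generation products and hardware support package are installed.

```matlab
mdl = 'stm32f746g_ifir_filter';
load_system(mdl);
slbuild(mdl);   % generate code and build it for the configured hardware board
```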
Verify on Target Using SIL/PIL Manager
To verify the numerical accuracy of the generated code against the simulated output, you can either use the SIL/PIL Manager app or run the model programmatically in processor-in-the-loop (PIL) mode.
To use the SIL/PIL Manager app, follow these steps:
On the Embedded Coder app tab, in the Verify section, click Verify Code > SIL/PIL Manager.
Set Mode to Automated Verification.
Set the SIL/PIL Mode to Processor-in-the-Loop (PIL).
Click Run Verification.
To instead run the model programmatically in PIL mode, enter these commands.
set_param('stm32f746g_ifir_filter','SimulationMode','processor-in-the-loop (pil)');
outputWithCRL = sim('stm32f746g_ifir_filter.slx');
You can verify the numerical accuracy of the generated code by using the Simulation Data Inspector. To open the Simulation Data Inspector, on the SIL/PIL tab of the toolstrip, in the Results section, click the bottom section of the Compare Runs split button and select Data Inspector.
To set the tolerance for the output signal in the Simulation Data Inspector window, click [+] More and modify the Absolute or Relative tolerance.
From the plot of the two interpolated signals, observe that the simulation output completely overlaps the PIL output. The plot of the difference between the signals confirms this: the difference is constant at 0, which indicates that the absolute difference between samples is less than 1e-6.
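The same comparison can be scripted with the Simulation Data Inspector programmatic interface. The sketch below is written under two assumptions that the example does not state: the output signal is logged to the Simulation Data Inspector, and the two most recent runs are the host simulation and the PIL simulation. Property names can differ between releases.

```matlab
% Assumes the last two Simulation Data Inspector runs are the host
% simulation and the PIL simulation, in that order.
runIDs = Simulink.sdi.getAllRunIDs;
assert(numel(runIDs) >= 2,'Need at least two logged runs to compare.');

% Compare the runs with the absolute tolerance used in this example.
diffRun = Simulink.sdi.compareRuns(runIDs(end-1),runIDs(end),'AbsTol',1e-6);

% Report the comparison status of each aligned signal.
for k = 1:diffRun.Count
    sigResult = getResultByIndex(diffRun,k);
    fprintf('%s: %s\n',sigResult.Name,string(sigResult.Status));
end
```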
Compare Performance
Compare the performance of a particular block with plain C code (without CRL) and CMSIS code (with CRL).
Enable Code Profiling
To enable code profiling, follow these steps:
Open the Simulink model Configuration Parameters dialog box and go to Code Generation > Verification. Select Measure task execution time.
Set Measure function execution times to Detailed (all function call sites). Set it to Coarse (referenced models and subsystems only) if you are instead looking for overall model performance.
On the SIL/PIL tab, run automated verification again.
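Profiling can also be enabled programmatically. The parameter names below are the ones I understand to correspond to the dialog box settings above; treat them as assumptions and check them against your release. The variable name executionProfile matches the name used when reading the results later in this example.

```matlab
mdl = 'stm32f746g_ifir_filter';
load_system(mdl);

% Assumed to correspond to "Measure task execution time".
set_param(mdl,'CodeExecutionProfiling','on');

% Assumed to correspond to "Measure function execution times";
% use 'coarse' for overall model performance.
set_param(mdl,'CodeProfilingInstrumentation','detailed');

% Workspace variable that receives the profiling results.
set_param(mdl,'CodeExecutionProfileVariable','executionProfile');
```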
Obtain Profiling Information
To obtain profiling information, you can use a code execution profiling report or the Code Profile Analyzer.
Code Execution Profiling Report
To obtain profiling information using a code execution profiling report, follow these steps:
On the SIL/PIL tab, in the Results section, select Generate Report.
The Profiled Sections of Code section of the report lists the execution time obtained for each of the model functions.
In this section, verify the block profiling. Click the MATLAB icon in line with stm32f746g_ifir_filter_step.
ticksWithCRL = outputWithCRL.get('executionProfile').Sections(2).TotalExecutionTimeInTicks
ticksWithCRL = uint64
57287533
The returned value of TotalExecutionTimeInTicks indicates that the step function consumes 57287533 cycles with the ARM Cortex-M CRL enabled. Now, calculate the number of cycles consumed without the CRL enabled, and compare the two values.
set_param('stm32f746g_ifir_filter','CodeReplacementLibrary','None');
outputWithoutCRL = sim('stm32f746g_ifir_filter.slx');
ticksWithoutCRL = outputWithoutCRL.get('executionProfile').Sections(2).TotalExecutionTimeInTicks
ticksWithoutCRL = uint64
94821678
speedUp = double(ticksWithoutCRL)/double(ticksWithCRL)
speedUp = 1.6552
bar(["ARM Cortex-M CRL","plain C"],[ticksWithCRL; ticksWithoutCRL],0.4)
ylabel('Execution Time (ticks)');
title('Performance Comparison of ARM Cortex-M CRL vs. plain C');
The step function consumes 94821678 cycles without the CRL. Thus, using the CRL improves performance by a factor of about 1.66. The tick counts are converted to double before dividing because uint64 division rounds the result to the nearest integer, which would report the ratio as 2.
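The commands above read the step function from Sections(2). To see which profiled section each index refers to, you can list all sections by name. This sketch assumes that outputWithCRL from the PIL run above is still in the workspace; profileData and sections are illustrative local names.

```matlab
% Assumes outputWithCRL is the PIL simulation output obtained earlier.
profileData = outputWithCRL.get('executionProfile');
sections = profileData.Sections;

% Print the index, name, and total tick count of every profiled section.
for k = 1:numel(sections)
    fprintf('%d  %-40s %d ticks\n',k,sections(k).Name, ...
            sections(k).TotalExecutionTimeInTicks);
end
```

With detailed instrumentation enabled, the list also includes the functions called by the step function, so the index of a given section can differ from model to model.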
Code Profile Analyzer
To obtain profiling information using the Code Profile Analyzer, follow these steps:
To open Code Profile Analyzer, on the SIL/PIL tab, select the bottom section of the Compare Runs split button and click Code Profile Analyzer.
In the Analysis section of the Time Profiling tab, select Function Execution.
In the Function Execution pane, in the Function Execution Times section, verify the Maximum Execution Time and Average Execution Time for the FIR Decimation, Discrete FIR Filter, and FIR Interpolation blocks. You can also consider just the Average Execution Time for the whole simulation.
Optionally, you can also verify the relative execution times of the caller and called functions in the generated code.