Tip | Reduces ROM | Reduces Model Execution Time |
---|---|---|
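Restrict Data Type Word Lengths | Yes | Yes |
Avoid Fixed-Point Scalings with Bias | Yes | Yes |
Wrap and Round to Floor or Simplest | Yes | Yes |
Limit the Use of Custom Storage Classes | Yes | No |
Limit the Use of Unevenly Spaced Lookup Tables | Yes | Yes |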
Minimize the Variety of Similar Fixed-Point Utility Functions | Yes | No |
Use Division to Handle Net Slope Computation | Dependent on model configuration, compiler, and target hardware | Dependent on model configuration, compiler, and target hardware |
Optimize Generated Code Using Specified Minimum and Maximum Values | Yes | Yes |
If possible, restrict the fixed-point data type word lengths in your model so that they are equal to or less than the integer size of your target microcontroller. This results in fewer mathematical instructions in the microcontroller, and reduces ROM and execution time.
This recommendation strongly applies to global variables that consume global RAM. For example, Unit Delay blocks have discrete states that have the same word lengths as their input and output signals. These discrete states are global variables that consume global RAM, which is a scarce resource on many embedded systems.
For temporary variables that only occupy a CPU register or stack location briefly, the space consumed by a long is less critical. However, depending on the operation, the use of long variables in math operations can be expensive. Addition and subtraction of long integers generally require the same effort as adding and subtracting regular integers, so those operations are not a concern. In contrast, multiplication and division with long integers can require significantly larger and slower code.
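As a rough illustration of the RAM cost, the sketch below compares two hypothetical state structures: one stores four Unit Delay states as 16-bit values, the other stores them as 32-bit values. The type and function names are invented for this example and are not what the code generator emits.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical discrete-state structures for four Unit Delay blocks. */
typedef struct {
    int16_t delay_state[4];   /* states sized to a 16-bit target integer */
} States16;

typedef struct {
    int32_t delay_state[4];   /* the same states stored as 32-bit values */
} States32;

/* Global RAM, in bytes, consumed by each variant. */
size_t ram_bytes_16(void) { return sizeof(States16); }
size_t ram_bytes_32(void) { return sizeof(States32); }
```

On typical targets the 32-bit variant occupies twice the global RAM of the 16-bit variant.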
Whenever possible, avoid using fixed-point numbers with bias. In certain cases, if you choose biases carefully, you can avoid significant increases in ROM and execution time. Refer to Recommendations for Arithmetic and Scaling for more information on how to choose appropriate biases in cases where it is necessary; for example if you are interfacing with a hardware device that has a built-in bias. In general, however, it is safer to avoid using fixed-point numbers with bias altogether.
Inputs to lookup tables are an important exception to this recommendation. If a lookup table input and the associated input data use the same bias, then there is no penalty associated with nonzero bias for that operation.
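To see where the extra cost of bias can come from, consider multiplying two values that use the [Slope Bias] encoding $${V}_{i}={S}_{i}{Q}_{i}+{B}_{i}$$. Expanding the product of the real-world values gives:

$${V}_{a}{V}_{b}=({S}_{a}{Q}_{a}+{B}_{a})({S}_{b}{Q}_{b}+{B}_{b})={S}_{a}{S}_{b}{Q}_{a}{Q}_{b}+{S}_{a}{B}_{b}{Q}_{a}+{S}_{b}{B}_{a}{Q}_{b}+{B}_{a}{B}_{b}$$

With zero bias, only the first term remains, so the stored integers need a single multiplication. With nonzero biases, the cross terms and the constant term typically add further multiplications and additions.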
For most fixed-point and integer operations, the Simulink® software provides you with options on how overflows are handled and how calculations are rounded. Traditional handwritten code, especially for control applications, almost always uses the "no effort" rounding mode. For example, to reduce the precision of a variable, that variable is shifted right. For unsigned integers and two's complement signed integers, shifting right is equivalent to rounding to floor. To get results comparable to or better than what you expect from traditional handwritten code, you should round to floor in most cases.
The primary exception to this rule is the rounding behavior of signed integer division. The C89 standard leaves this rounding behavior implementation-defined when an operand is negative (C99 and later require rounding toward zero), but for most targets the "no effort" mode is round to zero. For unsigned division, everything is nonnegative, so rounding to floor and rounding to zero are identical.
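The difference between the two "no effort" modes can be sketched in C as follows. The function names are hypothetical. The shift version assumes an arithmetic right shift on a two's complement target, which is what mainstream compilers provide but which the C standard leaves implementation-defined for negative values.

```c
#include <stdint.h>

/* Reduce precision by one bit with a right shift. For nonnegative values
 * this is floor(q / 2). For negative values the result is
 * implementation-defined in C; this sketch assumes the common arithmetic
 * shift on a two's complement target, which also rounds to floor. */
int32_t halve_by_shift(int32_t q) { return q >> 1; }

/* Reduce precision with signed division. C99 and later truncate the
 * quotient toward zero. */
int32_t halve_by_division(int32_t q) { return q / 2; }
```

For q = 7 both functions return 3. For q = -7 the division returns -3 (toward zero), while the shift returns -4 (floor of -3.5) under the stated assumption.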
You can improve code efficiency by setting the value of the Model Configuration Parameters > Hardware Implementation > Production hardware > Signed integer division rounds to parameter to describe how your production target handles rounding for signed division. For Product blocks that are doing only division, setting the Integer rounding mode parameter to the rounding mode of your production target gives the best results. You can also use the Simplest rounding mode on blocks where it is available. For more information, refer to Rounding Mode: Simplest.
The options for overflow handling also have a big impact on the efficiency of your generated code. Using software to detect overflow situations and saturate the results requires the code to be much bigger and slower compared to simply ignoring the overflows. When overflows are ignored for unsigned integers and two's complement signed integers, the results usually wrap around modulo 2^N, where N is the number of bits. Unhandled overflows that wrap around are highly undesirable for many situations.
However, because of code size and speed needs, traditional handwritten code contains very little software saturation. Typically, the fixed-point scaling is very carefully set so that overflow does not occur in most calculations. The code for these calculations safely ignores overflow. To get results comparable to or better than what you would expect from traditional handwritten code, the Saturate on integer overflow parameter should not be selected for Simulink blocks doing those calculations.
In a design, there might be a few places where overflow can occur and saturation protection is needed. Traditional handwritten code includes software saturation for these few places where it is needed. To get comparable generated code, the Saturate on integer overflow parameter should only be selected for the few Simulink blocks that correspond to these at-risk calculations.
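The cost difference between wrapping and saturating can be sketched with two hand-written helpers. The function names are hypothetical, and the code is an illustration rather than output from the code generator.

```c
#include <stdint.h>

/* Wrapping add: the result wraps modulo 2^16 when it is converted back
 * to uint16_t, with no extra code. */
uint16_t add_wrap_u16(uint16_t a, uint16_t b)
{
    return (uint16_t)(a + b);
}

/* Saturating add: compute in a wider type, then clamp to the int16 range.
 * The two comparisons are the extra cost that saturation introduces. */
int16_t add_sat_s16(int16_t a, int16_t b)
{
    int32_t sum = (int32_t)a + (int32_t)b;
    if (sum > INT16_MAX) { return INT16_MAX; }
    if (sum < INT16_MIN) { return INT16_MIN; }
    return (int16_t)sum;
}
```

The saturating version needs a wider intermediate and two branches per operation, which is why it is worth reserving for the few calculations that are actually at risk.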
A secondary benefit of using the most efficient options for overflow handling and rounding is that calculations often reduce from multiple statements requiring several lines of C code to small expressions that can be folded into downstream calculations. Expression folding is a code optimization technique that produces benefits such as minimizing the need to store intermediate computations in temporary buffers or variables. This can reduce stack size and make it more likely that calculations can be efficiently handled using only CPU registers. An automatic code generator can carefully apply expression folding across parts of a model and often see optimizations that might not be obvious. Automatic optimizations of this type often allow generated code to exceed the efficiency of typical examples of handwritten code.
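As a hand-written illustration of expression folding (not output from the code generator), the two hypothetical functions below compute the same scaled product. The first stores every intermediate result in a temporary. The second folds the computation into a single expression.

```c
#include <stdint.h>

/* Without folding: each intermediate result is stored in its own temporary. */
int16_t scale_unfolded(int16_t u, int16_t k)
{
    int32_t product = (int32_t)u * k;
    int32_t shifted = product >> 4;
    int16_t y = (int16_t)shifted;
    return y;
}

/* With folding: the same computation collapses into one expression. */
int16_t scale_folded(int16_t u, int16_t k)
{
    return (int16_t)(((int32_t)u * k) >> 4);
}
```

Folding is only possible here because the operation wraps and uses the shift's natural rounding. A saturating or otherwise rounded version would need extra statements between the multiplication and the final cast.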
In addition to the tip mentioned in Wrap and Round to Floor or Simplest, to obtain the maximum benefits of expression folding you also need to make sure that the Storage class field in the Signal Properties dialog box is set to Auto for each signal. When you choose a setting other than Auto, you need to name the signal, and a separate statement is created in the generated code. Therefore, only use a setting other than Auto when it is necessary for global variables.
You can access the Signal Properties dialog box by selecting any connection between blocks in your model, and then selecting Signal Properties from the Simulink Edit menu.
If possible, use lookup tables with nontunable, evenly spaced axes. A table with an unevenly spaced axis requires a search routine and memory for each input axis, which increases ROM and execution time. However, keep in mind that an unevenly spaced lookup table might provide greater accuracy. You need to consider the needs of your algorithm to determine whether you can forgo some accuracy with an evenly spaced table in order to reduce ROM and execution time. Also note that this decision applies only to lookup tables with nontunable input axes, because tables with tunable input axes always have the potential to be unevenly spaced.
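The difference in index computation can be sketched as follows. Both functions are hypothetical, hand-written helpers that return the index of the interval containing the input. They assume at least two breakpoints and an input that is not below the first breakpoint.

```c
#include <stddef.h>
#include <stdint.h>

/* Evenly spaced axis: the interval index follows directly from the first
 * breakpoint and the spacing, so no breakpoint array or search is needed.
 * Assumes x0 <= u, spacing > 0, and n_points >= 2. */
size_t index_even(int32_t u, int32_t x0, int32_t spacing, size_t n_points)
{
    size_t idx = (size_t)((u - x0) / spacing);
    return (idx > n_points - 2) ? n_points - 2 : idx;
}

/* Unevenly spaced axis: the breakpoints must be stored and searched.
 * Assumes bp is sorted ascending, bp[0] <= u, and n_points >= 2. */
size_t index_uneven(int32_t u, const int32_t *bp, size_t n_points)
{
    size_t idx = 0;
    while (idx < n_points - 2 && u >= bp[idx + 1]) {
        idx++;
    }
    return idx;
}
```

The evenly spaced version needs only two constants and one division (or a shift, if the spacing is a power of two), while the unevenly spaced version needs the breakpoint array in ROM plus a loop.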
The Embedded Coder® product generates fixed-point utility functions that are designed to handle specific situations efficiently. The Simulink Coder™ product can generate multiple versions of these optimized utility functions depending on what a specific model requires. For example, the division of long integers can, in theory, require eight varieties that are combinations of the output and the two inputs being signed or unsigned. A model that uses all these combinations can generate utility functions for all these combinations.
In some cases, it is possible to make small adjustments to a model that reduce the variety of required utility functions. For example, suppose that across most of a model signed data types are used, but in a small part of a model, a local decision to use unsigned data types is made. If it is possible to switch that portion of the model to use signed data types, then the overall variety of generated utility functions can potentially be reduced.
The best way to identify these opportunities is to inspect the generated code. For each utility function that appears in the generated code, you can search for all the call sites. If relatively few calls to the function are made, then trace back from the call site to the Simulink model. By modifying those places in the Simulink model, it is possible for you to eliminate the few cases that need a rarely used utility function.
The Fixed-Point Designer™ software provides an optimization parameter, Use division for fixed-point net slope computation, that controls how the software handles net slope computation. To learn how to enable this optimization, see Use Division to Handle Net Slope Computation.
When a change of fixed-point slope is not a power of two, net slope computation is necessary. Normally, net slope computation is implemented using an integer multiplication followed by shifts. Under certain conditions, net slope computation can be implemented using integer division or a rational approximation of the net slope. One of the conditions is that the net slope can be accurately represented as a rational fraction or as the reciprocal of an integer. Under this condition, the division implementation gives more accurate numerical behavior. Depending on your compiler and embedded hardware, a division implementation might be more desirable than the multiplication and shifts implementation. The generated code for the rational approximation and/or integer division implementation might require less ROM or improve model execution time.
This optimization works if:
- The net slope can be approximated with a fraction or is the reciprocal of an integer.
- Division is more efficient than multiplication followed by shifts on the target hardware.

Note: The Fixed-Point Designer software is not aware of the target hardware. Before selecting this option, verify that division is more efficient than multiplication followed by shifts on your target hardware.

This optimization does not work if:
- The software cannot perform the division using the production target long data type and therefore must use multiword operations.
  Using multiword division does not produce code suitable for embedded targets. Therefore, do not use division to handle net slope computation in models that use multiword operations. If your model contains blocks that use multiword operations, change the word length of these blocks to avoid these operations.
- The net slope is a power of 2, or a rational approximation of the net slope contains division by a power of 2.
  Binary-point-only scaling, where the net slope is a power of 2, involves moving the binary point within the fixed-point word. This scaling mode already minimizes the number of processor arithmetic operations.
To enable this optimization:
1. In the Configuration Parameters dialog box, set Optimization > Use division for fixed-point net slope computation to On or Use division for reciprocals of integers only.
   For more information, see Use division for fixed-point net slope computation.
2. On the Hardware Implementation > Production hardware pane, set the Signed integer division rounds to configuration parameter to Floor or Zero, as appropriate for your target hardware. The optimization does not occur if the Signed integer division rounds to parameter is Undefined.
   Note: You must set this parameter to a value that is appropriate for the target hardware. Failure to do so might result in division operations that comply with the definition on the Hardware Implementation pane, but are inappropriate for the target hardware.
3. Set the Integer rounding mode of the blocks that require net slope computation (for example, Product, Gain, and Data Type Conversion) to Simplest or match the rounding mode of your target hardware.

Note: You can use the Model Advisor to alert you if you have not configured your model correctly for this optimization. Open the Model Advisor and run the Identify questionable fixed-point operations check. For more information, see Use the Model Advisor to Optimize Fixed-Point Operations in Generated Code.
This example illustrates how setting the Optimization > Use division for fixed-point net slope computation parameter to On improves numerical accuracy. It uses the following model.
For the Product block in this model,
$${V}_{a}={V}_{b}\times {V}_{c}$$
These values are represented by the general [Slope Bias] encoding scheme described in Scaling: $${V}_{i}={S}_{i}{Q}_{i}+{B}_{i}$$.
Because there is no bias for the inputs or outputs:
$${S}_{a}{Q}_{a}={S}_{b}{Q}_{b}\cdot {S}_{c}{Q}_{c}$$, or
$${Q}_{a}=\frac{{S}_{b}{S}_{c}}{{S}_{a}}\cdot {Q}_{b}{Q}_{c}$$
where the net slope is:
$$\frac{{S}_{b}{S}_{c}}{{S}_{a}}$$
The net slope for the Product block is 7/11. Because the net slope can be represented as a fractional value consisting of small integers, you can use the On setting of the Use division for fixed-point net slope computation optimization parameter if your model and hardware configuration are suitable. For more information, see When to Use Division for Fixed-Point Net Slope Computation.
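The value 7/11 follows from the slopes configured for this example: 0.7 for Vb, 1 for Vc, and 1.1 for the Product block output. Substituting them into the net slope expression gives:

$$\frac{{S}_{b}{S}_{c}}{{S}_{a}}=\frac{0.7\times 1}{1.1}=\frac{7}{11}$$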
To set up the model and run the simulation:
1. For the Constant block Vb, set the Output data type to fixdt(1, 8, 0.7, 0). For the Constant block Vc, set the Output data type to fixdt(1, 8, 0).
2. For the Product block, set the Output data type to fixdt(1, 16, 1.1, 0). Set the Integer rounding mode to Simplest.
3. Set the Hardware Implementation > Production hardware > Signed integer division rounds to configuration parameter to Zero.
4. Set the Optimization > Use division for fixed-point net slope computation to Off.
5. In your Simulink model window, select Simulation > Run.
   Because the simulation uses multiplication followed by shifts to handle the net slope computation, net slope precision loss occurs. This precision loss results in numerical inaccuracy: the calculated product is 306.9, not 308, as you expect.
   Note: You can set up the Fixed-Point Designer software to provide alerts when precision loss occurs in fixed-point constants. For more information, see Net Slope and Net Bias Precision.
6. Set the Optimization > Use division for fixed-point net slope computation to On.
7. Save your model, and simulate again.
   The software implements the net slope computation using a rational approximation instead of multiplication followed by shifts. The calculated product is 308, as you expect.
The optimization works for this model because:
- The net slope is representable as a fraction with small integers in the numerator and denominator.
- The Hardware Implementation > Production hardware > Signed integer division rounds to configuration parameter is set to Zero.
  Note: This setting must match your target hardware rounding mode.
- The Integer rounding mode of the Product block in the model is set to Simplest.
- The model does not use multiword operations.
This example shows how setting the optimization parameter Optimization > Use division for fixed-point net slope computation to On improves the efficiency of generated code.
Note: The generated code is more efficient only if division is more efficient than multiplication followed by shifts on your target hardware.
This example uses the following model.
For the Product block in this model,
$${V}_{m}={V}_{a}\times {V}_{b}$$
These values are represented by the general [Slope Bias] encoding scheme described in Scaling: $${V}_{i}={S}_{i}{Q}_{i}+{B}_{i}$$.
Because there is no bias for the inputs or outputs:
$${S}_{m}{Q}_{m}={S}_{a}{Q}_{a}\cdot {S}_{b}{Q}_{b}$$, or
$${Q}_{m}=\frac{{S}_{a}{S}_{b}}{{S}_{m}}\cdot {Q}_{a}{Q}_{b}$$
where the net slope is:
$$\frac{{S}_{a}{S}_{b}}{{S}_{m}}$$
The net slope for the Product block is 9/10.
Similarly, for the Data Type Conversion block in this model,
$${S}_{a}{Q}_{a}+{B}_{a}={S}_{b}{Q}_{b}+{B}_{b}$$
There is no bias. Therefore, the net slope is $$\frac{{S}_{b}}{{S}_{a}}$$.
The net slope for this block is also 9/10.
Because the net slope can be represented as a fraction, you can set the Optimization > Use division for fixed-point net slope computation optimization parameter to On if your model and hardware configuration are suitable. For more information, see When to Use Division for Fixed-Point Net Slope Computation.
To set up the model and generate code:
1. For the Inport block Va, set the Data type to fixdt(1, 8, 9/10, 0); for the Inport block Vb, set the Data type to int8.
2. For the Data Type Conversion block, set the Integer rounding mode to Simplest. Set the Output data type to int16.
3. For the Product block, set the Integer rounding mode to Simplest. Set the Output data type to int16.
4. Set the Hardware Implementation > Production hardware > Signed integer division rounds to configuration parameter to Zero.
5. Set the Optimization > Use division for fixed-point net slope computation to Off.
6. From the Simulink model menu, select Code > C/C++ Code > Build Model.
Conceptually, the net slope computation is 9/10 or 0.9:

    Vc = 0.9 * Va;
    Vm = 0.9 * Va * Vb;

The generated code uses multiplication with shifts:

    /* For the conversion */
    Vc = (int16_T)(Va * 115 >> 7);
    /* For the multiplication */
    Vm = (int16_T)((Va * Vb >> 1) * 29491 >> 14);
The ideal value of the net slope computation is 0.9. In the generated code, the approximate value of the net slope computation is 29491 >> 15 = 29491/2^15 = 0.899993896484375. This approximation introduces numerical inaccuracy. For example, using the same model with constant inputs produces the following results.
In the original model with inputs Va and Vb, set the Optimization > Use division for fixed-point net slope computation parameter to On, update the diagram, and generate code again.
The generated code now uses integer division instead of multiplication followed by shifts:

    /* For the conversion */
    Vc = (int16_T)(Va * 9/10);
    /* For the multiplication */
    Vm = (int16_T)(Va * Vb * 9/10);
In the generated code, the value of the net slope computation is now the ideal value of 0.9. Using division, the results are numerically accurate.
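The numerical difference between the two implementations can be reproduced with a small hand-written sketch. The function names are hypothetical and the bodies mirror the multiplication expressions shown earlier, using the standard int8_t and int16_t types in place of the generated type names. The shift version assumes nonnegative inputs.

```c
#include <stdint.h>

/* Net slope 9/10 approximated as 29491 / 2^15, as in the
 * multiply-and-shift code shown earlier. */
int16_t product_mul_shift(int8_t va, int8_t vb)
{
    return (int16_t)((((int32_t)va * vb) >> 1) * 29491 >> 14);
}

/* Net slope 9/10 applied exactly with one multiplication by 9 and one
 * division by 10. */
int16_t product_division(int8_t va, int8_t vb)
{
    return (int16_t)((int32_t)va * vb * 9 / 10);
}
```

For stored integer inputs of 100 and 10, the ideal result is 900. The division version returns 900, while the multiply-and-shift version returns 899 because 29491/2^15 is slightly less than 0.9.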
In the model with constant inputs, set the Optimization > Use division for fixed-point net slope computation parameter to On and simulate the model.
The optimization works for this model because:
- The net slope is representable as a fraction with small integers in the numerator and denominator.
- The Hardware Implementation > Production hardware > Signed integer division rounds to configuration parameter is set to Zero.
  Note: This setting must match your target hardware rounding mode.
- For the Product and Data Type Conversion blocks in the model, the Integer rounding mode is set to Simplest.
- The model does not use multiword operations.
Setting the Optimization > Use division for fixed-point net slope computation parameter to Use division for reciprocals of integers only triggers the optimization only in cases where the net slope is the reciprocal of an integer. This setting results in a single integer division to handle net slope computations.
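As a minimal sketch of this case, assume a hypothetical conversion from a slope of 1 to a slope of 3, which gives a net slope of 1/3. The function name and the slopes are invented for illustration.

```c
#include <stdint.h>

/* Hypothetical conversion whose net slope is 1/3 (input slope 1, output
 * slope 3). One integer division handles the net slope; the quotient
 * truncates toward zero (C99 and later). */
int16_t convert_reciprocal_slope(int16_t q_in)
{
    return (int16_t)(q_in / 3);
}
```

A stored input of 300 (real-world value 300) becomes a stored output of 100 (real-world value 100 × 3 = 300).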
The Fixed-Point Designer software uses representable minimum and maximum values and constant values to determine if it is possible to optimize the generated code, for example, by eliminating unnecessary utility functions and saturation code from the generated code.
This optimization results in:
Reduced ROM and RAM consumption
Improved execution speed
When you select the Optimize using specified minimum and maximum values configuration parameter, the software takes into account input range information, also known as design minimum and maximum, that you specify for signals and parameters in your model. It uses these minimum and maximum values to derive range information for downstream signals in the model and then uses this derived range information to simplify mathematical operations in the generated code whenever possible.
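The kind of simplification that range information allows can be sketched with a hypothetical saturating addition. Both functions are hand-written for illustration and are not output from the code generator. The second one is valid only under the stated design range.

```c
#include <stdint.h>

/* Without range information: the sum of two uint8 values can exceed 255,
 * so saturation code is required. */
uint8_t add_saturating(uint8_t a, uint8_t b)
{
    uint16_t sum = (uint16_t)(a + b);
    return (sum > UINT8_MAX) ? UINT8_MAX : (uint8_t)sum;
}

/* With a design range of [0, 100] on both inputs, the sum is at most 200
 * and always fits in uint8, so the saturation check can be dropped. */
uint8_t add_known_range(uint8_t a, uint8_t b)
{
    return (uint8_t)(a + b);
}
```

If the inputs ever leave the specified range at run time, the simplified version wraps instead of saturating, which is why the design ranges need to be verified before you rely on them.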
The Optimize using specified minimum and maximum values parameter appears for ERT-based targets only and requires an Embedded Coder license when generating code.
To make optimization more likely:
- Provide as much design minimum and maximum information as possible. Specify minimum and maximum values for signals and parameters in the model for:
  - Inport and Outport blocks
  - Block outputs
  - Block inputs, for example, for the MATLAB Function and Stateflow Chart blocks
  - Simulink.Signal objects
- Before generating code, test the minimum and maximum values for signals and parameters. Otherwise, optimization might result in numerical mismatch with simulation. You can simulate your model with simulation range checking enabled. If errors or warnings occur, fix these issues before generating code.
- Use fixed-point data types with binary-point-only (power-of-two) scaling.
- Provide design minimum and maximum information upstream of blocks as close to the inputs of the blocks as possible. If you specify minimum and maximum values for a block output, these values are most likely to affect the outputs of the blocks immediately downstream. For more information, see Eliminate Unnecessary Utility Functions Using Specified Minimum and Maximum Values.
1. In the Configuration Parameters dialog box, set the Code Generation > System target file to select an Embedded Real-Time (ERT) target (requires an Embedded Coder license).
2. Specify design minimum and maximum values for signals and parameters in your model using the tips in How to Configure Your Model.
3. Select the Optimization > Optimize using specified minimum and maximum values configuration parameter.
   For more information, see Optimize using the specified minimum and maximum values.
This optimization does not occur for:
- Multiword operations
- Fixed-point data types with slope and bias scaling
- Addition unless the fraction length is zero

This optimization does not take into account minimum and maximum values for:
- Merge block inputs. To work around this issue, use a Simulink.Signal object on the Merge block output and specify the range on this object.
- Bus elements.
- Conditionally-executed subsystem (such as a triggered subsystem) block outputs that are directly connected to an Outport block.
  Outport blocks in conditionally-executed subsystems can have an initial value specified for use only when the system is not triggered. In this case, the optimization cannot use the range of the block output because the range might not cover the initial value of the block.
There are limitations on precision because you specify the minimum and maximum values as double-precision values. If the true value of a minimum or maximum value cannot be represented as a double, ensure that you round the minimum and maximum values correctly so that they cover the true design range.
If your model contains multiple instances of a reusable subsystem and each instance uses input signals with different specified minimum and maximum values, this optimization might result in different generated code for each subsystem so code reuse does not occur. Without this optimization, the Simulink Coder software generates code once for the subsystem and shares this code among the multiple instances of the subsystem.
This example shows how the Fixed-Point Designer software uses the input range for a division operation to determine whether it can eliminate unnecessary utility functions from the generated code. It uses the fxpdemo_min_max_optimization model.
First, you generate code without using the specified minimum and maximum values to see that the generated code contains utility functions to ensure that division by zero does not occur. You then turn on the optimization, and generate code again. With the optimization, the generated code does not contain the utility function because it is not necessary for the input range.
First, generate code without taking into account the design minimum and maximum values for the first input of the division operation to show the code without the optimization. In this case, the software uses the representable ranges for the two inputs, which are both uint16. With these input ranges, it is not possible to implement the division with the specified precision using shifts, so the generated code includes a division utility function.
1. Run the fxpdemo_min_max_optimization example.
2. In the example window, double-click the View Optimization Configuration button.
   The Optimization pane of the Configuration Parameters dialog box appears.
   Note that the Optimize using specified minimum and maximum values parameter is not selected.
3. Double-click the Generate Code button.
   The code generation report appears.
4. In the model, right-click the Division with increased fraction length output type block.
   The context menu appears.
5. From the context menu, select C/C++ Code > Navigate To C/C++ Code.
   The code generation report highlights the code generated for this block. The generated code includes a call to the div_repeat_u32 utility function.

       rtY.Out3 = div_repeat_u32((uint32_T)rtU.In5 << 16, (uint32_T)rtU.In6, 1U);

6. Click the div_repeat_u32 link to view the utility function, which contains code for handling division by zero.
Next, generate code for the same division operation, this time taking into account the design minimum and maximum values for the first input of the Product block. These minimum and maximum values are specified on the Inport block directly upstream of the Product block. With these input ranges, the generated code implements the division by simply using a shift. It does not need to generate a division utility function, reducing both memory usage and execution time.
1. Double-click the Inport block labelled 5 to open the block parameters dialog box.
2. On the block parameters dialog box, select the Signal Attributes pane and note that:
   - The Minimum value for this signal is 1.
   - The Maximum value for this signal is 100.
3. Click OK to close the dialog box.
4. Double-click the View Optimization Configuration button.
   The Optimization pane of the Configuration Parameters dialog box appears.
5. On this pane, select the Optimize using specified minimum and maximum values parameter and click Apply.
6. Double-click the Generate Code button.
   The code generation report appears.
7. In the model, right-click the Division with increased fraction length output type block.
   The context menu appears.
8. From the context menu, select C/C++ Code > Navigate To C/C++ Code.
   The code generation report highlights the code generated for this block. This time the generated code implements the division with a shift operation and there is no division utility function.

       tmp = rtU.In6;
       rtY.Out3 = (uint32_T)tmp == (uint32_T)0 ? MAX_uint32_T : ((uint32_T)rtU.In5 << 17) / (uint32_T)tmp;
Finally, modify the minimum and maximum values for the first input to the division operation so that its input range is too large to guarantee that the value does not overflow when shifted. Here, you cannot shift an arbitrary 16-bit number 17 bits to the left without overflowing the 32-bit container. Generate code for the division operation, again taking into account the minimum and maximum values. With these input ranges, the generated code includes a division utility function to ensure that no overflow occurs.
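The overflow condition can be checked with a small hand-written helper. The function name is hypothetical, and the check is a sketch of the reasoning, not the analysis that the software performs.

```c
#include <stdbool.h>
#include <stdint.h>

/* Returns true if every value up to max_value can be shifted left by
 * `shift` bits without exceeding the 32-bit unsigned range. The check is
 * done in 64 bits so that it cannot itself overflow. Assumes shift < 32. */
bool shift_fits_u32(uint16_t max_value, unsigned shift)
{
    return ((uint64_t)max_value << shift) <= UINT32_MAX;
}
```

With a maximum of 100, a 17-bit shift fits (100 × 2^17 = 13,107,200). With a maximum of 40000, it does not (40000 × 2^17 = 5,242,880,000, which exceeds 2^32 - 1). A 16-bit shift fits for any uint16 value.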
1. Double-click the Inport block labelled 5 to open the block parameters dialog box.
2. On the block parameters dialog box, select the Signal Attributes pane and set the Maximum value to 40000, then click OK to close the dialog box.
3. Double-click the Generate Code button.
   The code generation report appears.
4. In the model, right-click the Division with increased fraction length output type block.
   The context menu appears.
5. From the context menu, select C/C++ Code > Navigate To C/C++ Code.
   The code generation report highlights the code generated for this block. The generated code includes a call to the div_repeat_u32 utility function.

       rtY.Out3 = div_repeat_u32((uint32_T)rtU.In5 << 16, (uint32_T)rtU.In6, 1U);