# Documentation

## Operations with Fixed-Point Data

### Supported Operations with Fixed-Point Operands

#### Binary Operations

These binary operations work with fixed-point operands in the following order of precedence (0 = highest, 8 = lowest). Operations with equal precedence evaluate from left to right:

| Example | Precedence | Description |
| --- | --- | --- |
| `a %% b` | 0 | Remainder |
| `a * b` | 1 | Multiplication |
| `a / b` | 1 | Division |
| `a + b` | 2 | Addition |
| `a - b` | 2 | Subtraction |
| `a > b` | 3 | Comparison, greater than |
| `a < b` | 3 | Comparison, less than |
| `a >= b` | 3 | Comparison, greater than or equal to |
| `a <= b` | 3 | Comparison, less than or equal to |
| `a == b` | 4 | Comparison, equality |
| `a ~= b` | 4 | Comparison, inequality |
| `a != b` | 4 | Comparison, inequality |
| `a <> b` | 4 | Comparison, inequality |
| `a & b` | 5 | Bitwise AND when Enable C-bit operations is selected in the Chart properties dialog box (see Specify Chart Properties); operands are cast to integers before the operation is performed. Logical AND when Enable C-bit operations is cleared. |
| `a \| b` | 6 | Bitwise OR when Enable C-bit operations is selected in the Chart properties dialog box (see Specify Chart Properties); operands are cast to integers before the operation is performed. Logical OR when Enable C-bit operations is cleared. |
| `a && b` | 7 | Logical AND |
| `a \|\| b` | 8 | Logical OR |

#### Unary Operations and Actions

These unary operations and actions work with fixed-point operands:

| Example | Description |
| --- | --- |
| `-a` | Unary minus |
| `!a` | Logical NOT |
| `a++` | Increment |
| `a--` | Decrement |

#### Assignment Operations

These assignment operations work with fixed-point operands:

| Example | Description |
| --- | --- |
| `a = expression` | Simple assignment |
| `a := expression` | Assignment that uses the type of `a` as the result type of the expression. See Assignment Operator :=. |
| `a += expression` | Equivalent to `a = a + expression` |
| `a -= expression` | Equivalent to `a = a - expression` |
| `a *= expression` | Equivalent to `a = a * expression` |
| `a /= expression` | Equivalent to `a = a / expression` |
| `a \|= expression` | Equivalent to `a = a \| expression` (bit operation). See operation `a \| b` in Binary Operations. |
| `a &= expression` | Equivalent to `a = a & expression` (bit operation). See operation `a & b` in Binary Operations. |

### Promotion Rules for Fixed-Point Operations

Operations with at least one fixed-point operand require rules for selecting the type of the intermediate result for that operation. For example, in the action statement `c = a + b`, where `a` or `b` is a fixed-point number, an intermediate result type for `a + b` must first be chosen before the result is calculated and assigned to `c`.

The rules for selecting the numeric types used to hold the results of operations with a fixed-point number are called fixed-point promotion rules. The goal of these rules is to maintain computational efficiency and usability.

### Note

You can use the `:=` assignment operator to override the fixed-point promotion rules and obtain greater accuracy. However, in this case, greater accuracy can require more computational steps. See Assignment Operator :=.

The following topics describe the process of selecting an intermediate result type for binary operations with at least one fixed-point operand.

#### Default Selection of the Number of Bits of the Result Type

A fixed-point number with S = 1 and B = 0 is treated as an integer. In operations with integers, the C language promotes any integer input with fewer bits than the type `int` to the type `int` and then performs the operation.

The type `int` is the integer word size for C on a given platform. Result word size is increased to the integer word size because processors can perform operations at this size efficiently.

To maintain consistency with the C language, this default rule applies to assigning the number of bits for the result type of an operation with fixed-point numbers:

When both operands are fixed-point numbers, the number of bits in the result type is the maximum number of bits in the input types or the number of bits in the integer word size for the target machine, whichever is larger.
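As a minimal sketch of this default rule, the following C helper returns the result word size from the two operand word sizes and the target integer word size. The function name `default_result_bits` and its parameters are hypothetical and are not part of Stateflow or of the generated code.

```c
/* Hypothetical helper: default word size for the result of an operation
 * on two fixed-point operands. The result is the larger of the widest
 * operand and the integer word size of the target. */
int default_result_bits(int bits_a, int bits_b, int int_bits)
{
    int widest = (bits_a > bits_b) ? bits_a : bits_b;
    return (widest > int_bits) ? widest : int_bits;
}
```

For example, two 8-bit operands on a target with a 16-bit `int` would give a 16-bit result under this rule, while a 32-bit operand on the same target would keep the result at 32 bits.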

### Note

The preceding rule is a default rule for selecting the bit size of the result for operations with fixed-point numbers. This rule is overruled for specific operations as described in the sections that follow.

#### Set the Integer Word Size for a Target

The preceding default rule for selecting the bit size of the result for operations with fixed-point numbers relies on the definition of the integer word size for your target. You can set the integer word size for the targets that you build in Simulink® models with these steps:

1. In the Stateflow® Editor, select Simulation > Model Configuration Parameters.

2. Select Hardware Implementation in the left navigation panel.

The right panel displays configuration parameters for production hardware and test hardware.

3. To set integer word size for production hardware, follow these steps:

• In the drop-down menu for the Device type field, select `Custom`.

• In the int field, enter a word size in bits.

4. To set integer word size for test hardware, follow these steps:

• If no configuration fields appear, clear the None check box.

• In the drop-down menu for the Device type field, select `Custom`.

• In the int field, enter a word size in bits.

5. Click OK to accept the changes.

When you build any target after making this change, the generated code uses this integer size to select result types for your fixed-point operations.

### Note

Set all available integer sizes because they affect code generation. The integer sizes do not affect the implementation of the fixed-point promotion rules in generated code.

#### Unary Promotions

Only the unary minus (-) operation requires a promotion of its result type. The word size of the result is given by the default procedure for selecting the bit size of the result type for an operation involving fixed-point data. See Default Selection of the Number of Bits of the Result Type. The bias, B, of the result type is the negative of the bias of the operand.

#### Binary Operation Promotion for Integer Operand with Fixed-Point Operand

Integers as operands in binary operations with fixed-point numbers are treated as fixed-point numbers of the same word size with slope, S, equal to 1, and a bias, B, equal to 0. The operation now becomes a binary operation between two fixed-point operands. See Binary Operation Promotion for Two Fixed-Point Operands.

#### Binary Operation Promotion for Double Operand with Fixed-Point Operand

When one operand is of type `double` in a binary operation with a fixed-point type, the result type is `double`. In this case, the fixed-point operand is cast to type `double`, and the operation is performed.

#### Binary Operation Promotion for Single Operand with Fixed-Point Operand

When one operand is of type `single` in a binary operation with a fixed-point type, the result type is `single`. In this case, the fixed-point operand is cast to type `single`, and the operation is performed.

#### Binary Operation Promotion for Two Fixed-Point Operands

Operations with both operands of fixed-point type produce an intermediate result of fixed-point type. The resulting fixed-point type is chosen through the application of a set of operator-specific rules. The procedure for producing an intermediate result type from an operation with operands of different fixed-point types is summarized in these topics:

Addition (+) and Subtraction (-).  The output type for addition and subtraction is chosen so that the maximum positive range of either input can be represented in the output while preserving maximum precision. The base word type of the output follows the rule in Default Selection of the Number of Bits of the Result Type. To simplify calculations and yield efficient code, the biases of the two inputs are added for an addition operation and subtracted for a subtraction operation.

### Note

Mixing signed and unsigned operands can yield unexpected results and is not recommended.

Multiplication (*) and Division (/).  The output type for multiplication and division is chosen to yield the most efficient code implementation. You cannot use nonzero biases for multiplication and division in Stateflow charts (see note).

The slope for the result type of the product of the multiplication of two fixed-point numbers is the product of the slopes of the operands. Similarly, the slope of the result type of the quotient of the division of two fixed-point numbers is the quotient of the slopes. The base word type is chosen to conform to the rule in Default Selection of the Number of Bits of the Result Type.

### Note

Because nonzero biases are computationally very expensive, those biases are not supported for multiplication and division.
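The slope rules for multiplication and division can be sketched in C for the common case of power-of-2 slopes, where multiplying slopes adds exponents and dividing slopes subtracts them. The `fx_type` descriptor and the function names are hypothetical and only illustrate the rule; biases are assumed to be zero, as required for these operations.

```c
/* Hypothetical type descriptor: slope is 2^slope_exp, bias is zero. */
typedef struct {
    int word_bits;  /* word size in bits */
    int slope_exp;  /* exponent of the power-of-2 slope */
} fx_type;

/* Larger of the widest operand and the target integer word size. */
int fx_result_bits(fx_type a, fx_type b, int int_bits)
{
    int widest = (a.word_bits > b.word_bits) ? a.word_bits : b.word_bits;
    return (widest > int_bits) ? widest : int_bits;
}

/* Product: the result slope is the product of the operand slopes. */
fx_type fx_mul_type(fx_type a, fx_type b, int int_bits)
{
    fx_type r;
    r.word_bits = fx_result_bits(a, b, int_bits);
    r.slope_exp = a.slope_exp + b.slope_exp;
    return r;
}

/* Quotient: the result slope is the quotient of the operand slopes. */
fx_type fx_div_type(fx_type a, fx_type b, int int_bits)
{
    fx_type r;
    r.word_bits = fx_result_bits(a, b, int_bits);
    r.slope_exp = a.slope_exp - b.slope_exp;
    return r;
}
```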

Relational Operations (>, <, >=, <=, ==, ~=, !=, <>).  You can use the following relational (comparison) operations on all fixed-point types: `>`, `<`, `>=`, `<=`, `==`, `~=`, `!=`, `<>`. See Supported Operations with Fixed-Point Operands for an example and description of these operations. Both operands in a comparison must have equal biases (see note).

Comparing fixed-point values of different types can yield unexpected results because each operand must convert to a common type for comparison. Because of rounding or overflow errors during the conversion, values that do not appear equal might be equal and values that appear to be equal might not be equal.

### Note

To preserve precision and minimize unexpected results, both operands in a comparison operation must have equal biases.

For example, compare these two unsigned 8-bit fixed-point numbers, `a` and `b`, in an 8-bit target environment:

| Fixed-Point Number `a` | Fixed-Point Number `b` |
| --- | --- |
| $S_a = 2^{-4}$ | $S_b = 2^{-2}$ |
| $B_a = 0$ | $B_b = 0$ |
| $V_a = 43.8125$ | $V_b = 43.75$ |
| $Q_a = 701$ | $Q_b = 175$ |

By rule, the result type for comparison is 8-bit. Converting `b`, the least precise operand, to the type of `a`, the most precise operand, could result in overflow. Consequently, `a` is converted to the type of `b`. Because the bias values for both operands are 0, the conversion occurs as follows:

`$S_b \cdot \mathrm{new}Q_a = S_a Q_a$`

`$\mathrm{new}Q_a = (S_a / S_b)\, Q_a = (2^{-4}/2^{-2}) \times 701 = 701/4 \approx 175$`

Although they represent different values, `a` and `b` are considered equal as fixed-point numbers.
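The conversion in this comparison can be sketched in C. The sketch assumes zero bias, power-of-2 slopes, non-negative stored integers, and rounding to the floor; the stored integers are held in plain `unsigned` variables for illustration. The function names are hypothetical.

```c
/* Hypothetical helper: convert a stored integer from slope 2^from_exp to
 * the coarser slope 2^to_exp (to_exp >= from_exp), rounding to the floor. */
unsigned rescale_to_coarser(unsigned q, int from_exp, int to_exp)
{
    return q >> (to_exp - from_exp);
}

/* Compare two zero-bias values after converting the more precise operand
 * to the type of the less precise one. Returns 1 when they compare equal. */
int fx_equal(unsigned qa, int exp_a, unsigned qb, int exp_b)
{
    if (exp_a < exp_b) {
        return rescale_to_coarser(qa, exp_a, exp_b) == qb;
    }
    return qa == rescale_to_coarser(qb, exp_b, exp_a);
}
```

With the values from the example, `fx_equal(701, -4, 175, -2)` returns 1 because 701 shifted right by two bits is 175, even though 43.8125 and 43.75 are different real-world values.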

Logical Operations (&, |, &&, ||).  If `a` is a fixed-point number used in a logical operation, it is interpreted with the equivalent substitution `a != 0.0C`, where `0.0C` is an expression for zero in the fixed-point type of `a` (see Fixed-Point Context-Sensitive Constants). For example, if `a` is a fixed-point number in the logical operation `a && b`, this operation is equivalent to the following:

`(a != 0.0C) && b`

The preceding operation is not a check to see whether the quantized integer for `a`, $Q_a$, is not 0. If the real-world value for a fixed-point number `a` is 0, this implies that $V_a = S_a Q_a + B_a = 0.0$. Therefore, the expression `a != 0`, for fixed-point number `a`, is equivalent to this expression:

`$Q_a \neq -B_a / S_a$`

For example, if a fixed-point number, `a`, has a slope of $2^{-2}$ and a bias of 5, the test `a != 0` is equivalent to the test $Q_a \neq -20$.
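The difference between testing the stored integer and testing the real-world value can be illustrated with a short C sketch. It uses `double` arithmetic for clarity rather than efficiency, and the function names are hypothetical.

```c
/* Real-world value of a fixed-point number: V = S * Q + B. */
double fx_real_value(long q, double slope, double bias)
{
    return slope * (double)q + bias;
}

/* Logical interpretation of a fixed-point operand: true when the
 * real-world value is nonzero, regardless of whether Q itself is zero. */
int fx_is_true(long q, double slope, double bias)
{
    return fx_real_value(q, slope, bias) != 0.0;
}
```

With a slope of 0.25 and a bias of 5, a stored integer of -20 evaluates as false because its real-world value is 0, while a stored integer of 0 evaluates as true because its real-world value is 5.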

### Assignment (=, :=) Operations

You can use the assignment operations `LHS = RHS` and `LHS := RHS` between a left-hand side (`LHS`) and a right-hand side (`RHS`). The topics that follow contrast the two assignment operations.

#### Assignment Operator =

An assignment statement of the type `LHS` = `RHS` is equivalent to casting the right-hand side to the type of the left-hand side. You can use any assignment between fixed-point types and therefore, implicitly, any cast.

A cast converts the stored integer Q from its original fixed-point type while preserving its value as accurately as possible using the online conversions (see Fixed-Point Conversion Operations). Assignments are most efficient when both types have the same bias, and slopes that are equal or both powers of 2.

#### Assignment Operator :=

Ordinarily, the fixed-point promotion rules determine the result type for an operation. Using the := assignment operator overrides this behavior by using the type of the `LHS` as the result type of the `RHS` operation.

These rules apply to the `:=` assignment operator:

• The `RHS` can contain at most one binary operator.

• If the `RHS` contains anything other than an addition (`+`), subtraction (`-`), multiplication (`*`), or division (`/`) operation, or a constant, then the `:=` assignment behaves like regular assignment (`=`).

• Constants on the `RHS` of an `LHS := RHS` assignment are converted to the type of the left-hand side using offline conversion (see Fixed-Point Conversion Operations). Ordinary assignment always casts the `RHS` using online conversions.

#### When to Use the := Operator Instead of the = Operator

Use the := assignment operator instead of the = assignment operator in these cases:

• Arithmetic operations where you want to avoid overflow

• Multiplication and division operations where you want to retain precision

### Caution

Using the := assignment operator to produce a more accurate result can generate code that is less efficient than the code you generate using the normal fixed-point promotion rules.

#### Avoid Overflow Using the := Operator for Addition and Subtraction

This model contains a Stateflow chart with two inputs and eight outputs.

The chart contains a graphical function that compares the use of the = and := assignment operators.

If you generate code for this model, you see code similar to this.

```c
/* Exported block signals */
int16_T x1;                            /* '<Root>/Input' */
int16_T x2;                            /* '<Root>/Input1' */
int32_T y1;                            /* '<Root>/Chart' */
int32_T y2;                            /* '<Root>/Chart' */
int32_T z1;                            /* '<Root>/Chart' */
int32_T z2;                            /* '<Root>/Chart' */
int16_T y3;                            /* '<Root>/Chart' */
int16_T y4;                            /* '<Root>/Chart' */
int16_T z3;                            /* '<Root>/Chart' */
int16_T z4;                            /* '<Root>/Chart' */

...

/* Model step function */
void doc_sf_colon_equal_step(void)
{
  /* Case "=" - general */
  y1 = x1 + x2;
  y2 = x1 - x2;
  y3 = x1 * x2 >> 3;
  y4 = div_s16_floor(x1, x2) << 3U;

  /* Case ":=" - better computation of the expression */
  z1 = (int32_T)x1 + (int32_T)x2;
  z2 = (int32_T)x1 - (int32_T)x2;
  z3 = (int16_T)((int32_T)x1 * (int32_T)x2 >> 3);
  z4 = (int16_T)(((int32_T)x1 << 3) / (int32_T)x2);
}
```

The inputs `x1` and `x2` are signed 16-bit integers with 3 fraction bits. For addition and subtraction, the outputs are signed 32-bit integers with 3 fraction bits.

Assume that the integer word size for production targets is 16 bits. To learn how to change the integer word size for a target, see Set the Integer Word Size for a Target.

Because the target `int` size is 16 bits, you can avoid overflow by using the := operator instead of the = operator. For example, assume that the inputs have these values:

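• `x1` = $2^{15} - 1$

• `x2` = 1

| Operator | Computation | Result | Overflow |
| --- | --- | --- | --- |
| `=` | Adds the inputs in 16 bits before casting the sum to 32 bits | `y1` = $-2^{15}$ | Yes |
| `:=` | Casts the inputs to 32 bits before computing the sum | `z1` = $+2^{15}$ | No |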

Similarly, you can avoid overflow for subtraction if you use the := operator instead of the = operator.
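The addition case can be reproduced on a host machine with a C sketch that emulates a 16-bit `int`. The sketch assumes two's-complement wraparound when the 16-bit sum overflows, which matches the `y1` value shown above; on a real target the overflow behavior can differ. The function names are hypothetical.

```c
#include <stdint.h>

/* Wrap a 32-bit value into the signed 16-bit range, emulating
 * two's-complement arithmetic on a target whose int is 16 bits wide. */
int32_t wrap_int16(int32_t v)
{
    uint32_t u = (uint32_t)v & 0xFFFFu;
    return (u >= 0x8000u) ? (int32_t)u - 65536 : (int32_t)u;
}

/* "=" case: the sum is formed in 16 bits, then widened to 32 bits. */
int32_t add_in_16_bits(int16_t x1, int16_t x2)
{
    return wrap_int16((int32_t)x1 + (int32_t)x2);
}

/* ":=" case: the operands are widened to 32 bits before the sum. */
int32_t add_in_32_bits(int16_t x1, int16_t x2)
{
    return (int32_t)x1 + (int32_t)x2;
}
```

With stored integers 32767 and 1, `add_in_16_bits` returns -32768 and `add_in_32_bits` returns 32768.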

#### Avoid Overflow Using the := Operator for Multiplication

The following example contrasts the := and = assignment operators for multiplication. You can use the := operator to avoid overflow in the multiplication `c` = `a` * `b`, where `a` and `b` are two fixed-point operands. The operands and result for this operation are 16-bit unsigned integers with these assignments:

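| Fixed-Point Number `a` | Fixed-Point Number `b` | Fixed-Point Number `c` |
| --- | --- | --- |
| $S_a = 2^{-4}$ | $S_b = 2^{-4}$ | $S_c = 2^{-5}$ |
| $B_a = 0$ | $B_b = 0$ | $B_c = 0$ |
| $V_a = 20.1875$ | $V_b = 15.3125$ | $V_c = ?$ |
| $Q_a = 323$ | $Q_b = 245$ | $Q_c = ?$ |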

where S is the slope, B is the bias, V is the real-world value, and Q is the quantized integer.

c = a*b.  In this case, first calculate an intermediate result for `a*b` in the fixed-point type given by the rules in the section Fixed-Point Operations. Then cast that result to the type for `c`.

The calculation of intermediate value occurs as follows:

`${Q}_{iv}={Q}_{a}{Q}_{b}=323\times 245=79135$`

Because the maximum value of a 16-bit unsigned integer is $2^{16} - 1 = 65535$, the preceding result overflows its word size. An operation that overflows its type produces an undefined result.

You can capture overflow errors like the preceding example during simulation. See Detect Overflow for Fixed-Point Types.

c := a*b.  In this case, calculate `a*b` directly in the type of `c`. Use the solution for Qc given in Fixed-Point Operations with the requirement of zero bias, which occurs as follows:

`${Q}_{c}=\left(\left({S}_{a}{S}_{b}/{S}_{c}\right){Q}_{a}{Q}_{b}\right)=\left({2}^{-4}\times {2}^{-4}/{2}^{-5}\right)\left(323\times 245\right)=79135/8\approx 9892$`

No overflow occurs in this case, and the approximate real-world value is as follows:

`${\stackrel{\sim }{V}}_{c}={S}_{c}{Q}_{c}={2}^{-5}\times 9892=9892/32=309.125$`

This value is very close to the actual result of 309.121.
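The two multiplication paths can be sketched in C. The sketch forms the product in 32 bits and rounds to the nearest integer so that it reproduces the value 9892 from the worked example; a plain right shift, which rounds to the floor, would give 9891 instead. The function names are hypothetical, and the scaling factor is assumed to be a power of 2 with a non-negative shift count.

```c
#include <stdint.h>

/* Returns 1 when the product of two 16-bit stored integers does not fit
 * in an unsigned 16-bit word, as happens for the "=" case above. */
int mul_overflows_16(uint16_t qa, uint16_t qb)
{
    return (uint32_t)qa * (uint32_t)qb > 0xFFFFu;
}

/* ":=" case: form the product in 32 bits, then scale by
 * S_a * S_b / S_c = 2^-shift with rounding to nearest.
 * Assumes that the scaled result fits in 16 bits. */
uint16_t mul_direct(uint16_t qa, uint16_t qb, unsigned shift)
{
    uint32_t wide = (uint32_t)qa * (uint32_t)qb;
    uint32_t half = (shift > 0u) ? (1u << (shift - 1u)) : 0u;
    return (uint16_t)((wide + half) >> shift);
}
```

For the example operands, `mul_overflows_16(323, 245)` reports an overflow and `mul_direct(323, 245, 3)` returns 9892.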

#### Improve Precision Using the := Operator for Division

The following example contrasts the := and = assignment operators for division. You can use the := operator to obtain a more precise result for the division of two fixed-point operands, `a` and `b`, in the statement `c := a/b`.

This example uses the following fixed-point numbers, where S is the slope, B is the bias, V is the real-world value, and Q is the quantized integer:

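| Fixed-Point Number `a` | Fixed-Point Number `b` | Fixed-Point Number `c` |
| --- | --- | --- |
| $S_a = 2^{-4}$ | $S_b = 2^{-3}$ | $S_c = 2^{-6}$ |
| $B_a = 0$ | $B_b = 0$ | $B_c = 0$ |
| $V_a = 2$ | $V_b = 3$ | $V_c = ?$ |
| $Q_a = 32$ | $Q_b = 24$ | $Q_c = ?$ |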

c = a/b.  In this case, first calculate an intermediate result for `a/b` in the fixed-point type given by the rules in the section Fixed-Point Operations. Then cast that result to the type for `c`.

The calculation of intermediate value occurs as follows:

`${Q}_{iv}={Q}_{a}/{Q}_{b}=32/24=1$`

The intermediate value is then cast to the result type for `c` as follows:

`$S_c Q_c = S_{iv} Q_{iv}$`

`$Q_c = (S_{iv} / S_c)\, Q_{iv}$`

The calculation for slope of the intermediate value for a division operation occurs as follows:

`${S}_{iv}={S}_{a}/{S}_{b}={2}^{-4}/{2}^{-3}={2}^{-1}$`

Substitution of this value into the preceding result yields the final result.

`${Q}_{c}=\left({S}_{iv}/{S}_{c}\right){Q}_{iv}=\left({2}^{-1}/{2}^{-6}\right)\times 1={2}^{5}=32$`

In this case, the approximate real-world value is ${\stackrel{\sim }{V}}_{c}=32/64=0.5$, which is not a very good approximation of the actual result of 2/3.

c := a/b.  In this case, calculate `a/b` directly in the type of `c`. Use the solution for Qc given in Fixed-Point Operations with the simplification of zero bias, which is as follows:

`${Q}_{c}=\left({S}_{a}{Q}_{a}\right)/\left({S}_{c}\left({S}_{b}{Q}_{b}\right)\right)=\left({S}_{a}/\left({S}_{b}{S}_{c}\right)\right)\times \left({Q}_{a}/{Q}_{b}\right)=\left({2}^{-4}/\left({2}^{-3}\times {2}^{-6}\right)\right)\times \left(32/24\right)=\lfloor 1024/24\rfloor =42$`

In this case, the approximate real-world value is as follows:

`${\stackrel{\sim }{V}}_{c}=42/64=0.6563$`

This value is a much better approximation to the precise result of 2/3.
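The two division paths differ only in the order of the scaling and the integer division, which a short C sketch makes visible. The function names are hypothetical; `shift` is the exponent of the power-of-2 factor that relates the slopes (5 in this example), and the stored integers are assumed to be non-negative.

```c
#include <stdint.h>

/* "=" case: divide the stored integers first, which discards the
 * fractional part, then rescale the intermediate value to the type of c. */
int32_t div_then_cast(int32_t qa, int32_t qb, unsigned shift)
{
    int32_t q_iv = qa / qb;
    return q_iv * (int32_t)(1u << shift);
}

/* ":=" case: scale the numerator first so that the integer division
 * keeps the extra fraction bits of the result type. */
int32_t div_direct(int32_t qa, int32_t qb, unsigned shift)
{
    return (qa * (int32_t)(1u << shift)) / qb;
}
```

For the example operands, `div_then_cast(32, 24, 5)` returns 32 (real-world value 0.5) and `div_direct(32, 24, 5)` returns 42 (real-world value 0.65625). The second form follows the same pattern as the `z4` expression in the generated code shown earlier, which shifts the numerator before dividing.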

#### := Assignment and Context-Sensitive Constants

In a := assignment operation, the type of the left-hand side (`LHS`) determines part of the context used for inferring the type of a right-hand side (`RHS`) context-sensitive constant.

These rules apply to `RHS` context-sensitive constants in assignments with the := operator:

• If the `LHS` is floating-point data (type `double` or `single`), the `RHS` context-sensitive constant becomes a floating-point constant.

• For addition and subtraction, the type of the `LHS` determines the type of the context-sensitive constant on the `RHS`.

• For multiplication and division, the type of the context-sensitive constant is chosen independently of the `LHS`.

### Fixed-Point Conversion Operations

Real numbers are converted into fixed-point data during data initialization and as part of casting operations in the application. These conversions compute a quantized integer, Q, from a real number input. Offline conversions initialize data, and online conversions perform casting operations in the running application. The topics that follow describe each conversion type and give examples of the results.

#### Offline Conversions for Initialized Data

Offline conversions are performed during code generation and are designed to maximize accuracy. These conversions round the resulting quantized integer to its nearest integer value. If the conversion overflows, the result saturates the value for Q.

Offline conversions are performed for these operations:

• Initialization of data (both variables and constants) in the Stateflow hierarchy

• Initialization of constants or variables from the MATLAB® workspace

#### Online Conversions for Casting Operations

Online conversions are performed for casting operations that take place during execution of the application. Designed to maximize computational efficiency, they are faster and more efficient than offline conversions, but less precise. Instead of rounding Q to its nearest integer, online conversions round to the floor (with the exception of division, which can round to 0, depending on the C compiler you have). If the conversion overflows the type to which you convert, the result is undefined.

#### Offline and Online Conversion Examples

The following examples show the difference in the results of offline and online conversions of real numbers to a fixed-point type defined by a 16-bit word size, a slope (S) equal to $2^{-4}$, and a bias (B) equal to 0:

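| V | V/S | Offline Conversion: Q | Offline Conversion: $\stackrel{\sim }{V}$ | Online Conversion: Q | Online Conversion: $\stackrel{\sim }{V}$ |
| --- | --- | --- | --- | --- | --- |
| 3.45 | 55.2 | 55 | 3.4375 | 55 | 3.4375 |
| 1.0375 | 16.6 | 17 | 1.0625 | 16 | 1 |
| 2.06 | 32.96 | 33 | 2.0625 | 32 | 2 |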

In the preceding example,

• V is the real-world value represented as a fixed-point value.

• V/S is the floating-point computation for the quantized integer Q.

• Q is the rounded value of V/S.

• $\stackrel{\sim }{V}$ is the approximate real-world value resulting from Q for each conversion.
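The two rounding behaviors in the table can be sketched in C. The sketch assumes a signed 16-bit stored integer and zero bias, neither of which the example specifies beyond the word size, and the function names are hypothetical. It avoids the math library by implementing the floor with an integer cast.

```c
#include <stdint.h>

/* Floor of a double that fits in a long, without using the math library. */
long floor_to_long(double x)
{
    long q = (long)x;                 /* truncates toward zero */
    return ((double)q > x) ? q - 1 : q;
}

/* Offline conversion: round to nearest and saturate on overflow.
 * The signed 16-bit range is an assumption of this sketch. */
int16_t offline_q(double v, double slope)
{
    double scaled = v / slope;
    if (scaled >= 32767.0) {
        return INT16_MAX;
    }
    if (scaled <= -32768.0) {
        return INT16_MIN;
    }
    return (int16_t)floor_to_long(scaled + 0.5);
}

/* Online conversion: round to the floor. The caller must make sure that
 * the value fits, because overflow is not handled here. */
int16_t online_q(double v, double slope)
{
    return (int16_t)floor_to_long(v / slope);
}
```

With a slope of 0.0625, `offline_q(1.0375, 0.0625)` returns 17 and `online_q(1.0375, 0.0625)` returns 16, which corresponds to the second row of the table.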

### Automatic Scaling of Stateflow Fixed-Point Data

Automatic scaling tools can change the settings of Stateflow fixed-point data. You can prevent automatic scaling by selecting the Lock data type setting against changes by the fixed-point tools check box in the Data properties dialog box for fixed-point data (see Set Data Properties for details). Selecting this check box prevents replacement of the current fixed-point type with a type that the Fixed-Point Tool (Fixed-Point Designer) or Fixed-Point Advisor (Fixed-Point Designer) chooses. For methods on autoscaling fixed-point data, see Choosing a Range Collection Method (Fixed-Point Designer).