Fixed-point numbers use integers and integer arithmetic to approximate real numbers. They are an efficient means for performing computations involving real numbers without requiring floating-point support in underlying system hardware.
A fixed-point number represents a real number with the following encoding scheme:
$$V\approx \stackrel{\sim}{V}=SQ+B$$
where
V is a precise real-world value that you want to approximate with a fixed-point number.
$$\stackrel{\sim}{V}$$ is the approximate real-world value that results from fixed-point representation.
Q is an integer that encodes $$\stackrel{\sim}{V}$$. This value is the quantized integer: the actual stored integer used to represent the fixed-point number. If a fixed-point number changes, its quantized integer, Q, changes but S and B remain unchanged.
S is a coefficient of Q, or the slope.
B is an additive correction, or the bias.
Fixed-point numbers encode real quantities (for example, 15.375) using the stored integer Q. You set the value of Q by solving the equation
$$\stackrel{\sim}{V}=SQ+B$$
for Q and rounding the result to an integer value as follows:
Q = round((V – B)/S)
For example, suppose you want to represent the number 15.375 in a fixed-point type with the slope S = 0.5 and the bias B = 0.1. This means that
Q = round((15.375 – 0.1)/0.5) = round(30.55) = 31
However, because Q is rounded to an integer, you lose some precision in representing the number 15.375. If you calculate the number that Q actually represents, you get a slightly different answer:
$$V\approx \stackrel{\sim}{V}=SQ+B=0.5\times 31+0.1=15.6$$
Using fixed-point numbers to represent real numbers with integers involves the loss of some precision. However, if you choose S and B correctly, you can minimize this loss to acceptable levels.
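As an illustration of the encoding step, the following Python sketch quantizes a value and reconstructs its approximation for several slopes. The function names `encode` and `decode` and the sample value 2.7 are hypothetical, chosen only for this example. Python's built-in `round`, which resolves exact ties to the even integer, stands in for the rounding step; that tie-handling rule is an assumption.

```python
def encode(v, slope, bias):
    """Return the stored integer Q = round((V - B) / S)."""
    return round((v - bias) / slope)


def decode(q, slope, bias):
    """Return the approximate real-world value S*Q + B."""
    return slope * q + bias


v = 2.7  # hypothetical real-world value to encode
for slope in (0.5, 0.25, 0.0625):
    q = encode(v, slope, bias=0.0)
    approx = decode(q, slope, bias=0.0)
    # With round-to-nearest, the error cannot exceed half the slope.
    print(f"S={slope}: Q={q}, approx={approx}, error={abs(v - approx):.4f}")
```

In this sketch the reconstruction error stays within half the slope, so a smaller S gives a tighter approximation at the cost of a larger stored integer.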
Now that you can express fixed-point numbers as $$\stackrel{\sim}{V}=SQ+B$$, you can define operations between two fixed-point numbers.
The general equation for an operation between fixed-point operands is as follows:
c = a <op> b
where a, b, and c are all fixed-point numbers, and <op> refers to a binary operation: addition, subtraction, multiplication, or division.
The general form for a fixed-point number x is S_{x}Q_{x} + B_{x} (see Fixed-Point Numbers). Substituting this form for the result and operands in the preceding equation yields this expression:
(S_{c}Q_{c} + B_{c}) = (S_{a}Q_{a} + B_{a}) <op> (S_{b}Q_{b} + B_{b})
Stateflow® software chooses the values for S_{c} and B_{c} for each operation (see Promotion Rules for Fixed-Point Operations), based on the values for S_{a}, S_{b}, B_{a}, and B_{b} that you enter for each fixed-point data object (see Specify Fixed-Point Data).
Note: You can be more precise in choosing the values for S_{c} and B_{c} when you use the := assignment operator (that is, c := a <op> b). See Assignment (=, :=) Operations.
Using the values for S_{a}, S_{b}, S_{c}, B_{a}, B_{b}, and B_{c}, you can solve the preceding equation for Q_{c} for each binary operation as follows:
The operation c = a + b implies that
Q_{c} = round((S_{a}/S_{c})Q_{a} + (S_{b}/S_{c})Q_{b} + (B_{a} + B_{b} – B_{c})/S_{c})
The operation c = a – b implies that
Q_{c} = round((S_{a}/S_{c})Q_{a} – (S_{b}/S_{c})Q_{b} + (B_{a} – B_{b} – B_{c})/S_{c})
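The addition and subtraction cases can be sketched in Python by solving S_{c}Q_{c} + B_{c} = (S_{a}Q_{a} + B_{a}) ± (S_{b}Q_{b} + B_{b}) for Q_{c}. The function names, the argument order, and the sample slopes and biases below are hypothetical. `fractions.Fraction` keeps the slope ratios exact, and rounding the result to the nearest integer is an assumption that mirrors the encoding rule Q = round((V – B)/S).

```python
from fractions import Fraction


def add_stored(qa, sa, ba, qb, sb, bb, sc, bc):
    """Stored integer Q_c for c = a + b."""
    exact = (sa / sc) * qa + (sb / sc) * qb + (ba + bb - bc) / sc
    return round(exact)


def sub_stored(qa, sa, ba, qb, sb, bb, sc, bc):
    """Stored integer Q_c for c = a - b."""
    exact = (sa / sc) * qa - (sb / sc) * qb + (ba - bb - bc) / sc
    return round(exact)


# Hypothetical operands: a = 0.25*11 + 0 = 2.75 and b = 0.125*6 + 1 = 1.75.
a = (11, Fraction(1, 4), Fraction(0))
b = (6, Fraction(1, 8), Fraction(1))
sc, bc = Fraction(1, 8), Fraction(0)  # assumed result slope and bias

q_sum = add_stored(*a, *b, sc, bc)
q_diff = sub_stored(*a, *b, sc, bc)
print(q_sum, float(sc * q_sum + bc))    # 36 4.5
print(q_diff, float(sc * q_diff + bc))  # 8 1.0
```

Here both results are exact because the chosen S_{c} can represent 4.5 and 1.0 without rounding; other choices of S_{c} and B_{c} would generally introduce a rounding error.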
The operation c = a * b implies that
Q_{c} = round((S_{a}S_{b}/S_{c})Q_{a}Q_{b} + (B_{b}S_{a}/S_{c})Q_{a} + (B_{a}S_{b}/S_{c})Q_{b} + (B_{a}B_{b} – B_{c})/S_{c})
The operation c = a / b implies that
Q_{c} = round((S_{a}Q_{a} + B_{a})/(S_{c}(S_{b}Q_{b} + B_{b})) – (B_{c}/S_{c}))
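Multiplication and division can be sketched the same way. The multiplication formula below comes from expanding the product (S_{a}Q_{a} + B_{a})(S_{b}Q_{b} + B_{b}) term by term and dividing by S_{c}. The function names and sample operands are hypothetical, and rounding to the nearest integer is again an assumption.

```python
from fractions import Fraction


def mul_stored(qa, sa, ba, qb, sb, bb, sc, bc):
    """Stored integer Q_c for c = a * b."""
    exact = ((sa * sb / sc) * qa * qb
             + (bb * sa / sc) * qa   # cross term S_a*Q_a * B_b
             + (ba * sb / sc) * qb   # cross term B_a * S_b*Q_b
             + (ba * bb - bc) / sc)
    return round(exact)


def div_stored(qa, sa, ba, qb, sb, bb, sc, bc):
    """Stored integer Q_c for c = a / b; b must not represent zero."""
    real_b = sb * qb + bb
    if real_b == 0:
        raise ZeroDivisionError("divisor represents the real value 0")
    exact = (sa * qa + ba) / (sc * real_b) - bc / sc
    return round(exact)


# Hypothetical operands: a = 0.25*9 + 0.5 = 2.75 and b = 0.125*6 + 1 = 1.75.
a = (9, Fraction(1, 4), Fraction(1, 2))
b = (6, Fraction(1, 8), Fraction(1))
sc, bc = Fraction(1, 16), Fraction(0)  # assumed result slope and bias

q_prod = mul_stored(*a, *b, sc, bc)
q_quot = div_stored(*a, *b, sc, bc)
print(q_prod, float(sc * q_prod + bc))  # 77 4.8125
print(q_quot, float(sc * q_quot + bc))  # 25 1.5625
```

The product 2.75 × 1.75 = 4.8125 is represented exactly with this S_{c}, while the quotient 2.75/1.75 ≈ 1.5714 is rounded to 1.5625.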
The preceding solutions for Q_{c} give the fixed-point approximation of the real-number result of the operation c = a <op> b. In this way, all fixed-point operations are performed using only the stored integer Q of each fixed-point number and integer arithmetic.
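To show how these formulas can reduce to pure integer arithmetic, the following Python sketch covers one special case, assumed here for illustration: every bias is zero and every slope is a power of two, S = 2^(-frac). The slope ratios then become bit shifts and no floating-point operation is needed. The function names and the `frac_*` parameters are hypothetical.

```python
def add_q(qa, frac_a, qb, frac_b, frac_c):
    """Add two zero-bias values with slopes 2**-frac_a and 2**-frac_b.

    Assumes frac_c >= max(frac_a, frac_b), so S_a/S_c and S_b/S_c are
    non-negative powers of two and the scaling is an exact left shift.
    """
    if frac_c < max(frac_a, frac_b):
        raise ValueError("result slope too coarse for an exact shift")
    return (qa << (frac_c - frac_a)) + (qb << (frac_c - frac_b))


def mul_q(qa, frac_a, qb, frac_b):
    """Multiply two zero-bias values.

    Choosing S_c = S_a * S_b makes the ratio S_a*S_b/S_c equal to 1, so
    Q_c is the plain integer product. Returns (Q_c, frac_c).
    """
    return qa * qb, frac_a + frac_b


# Hypothetical operands: 2.75 = 11 * 2**-2 and 1.75 = 14 * 2**-3.
q_sum = add_q(11, 2, 14, 3, frac_c=3)
q_prod, frac_prod = mul_q(11, 2, 14, 3)
print(q_sum, q_sum / 2**3)             # 36 4.5
print(q_prod, q_prod / 2**frac_prod)   # 154 4.8125
```

The divisions in the `print` calls only convert the stored integers back to real values for display; the operations themselves use shifts, integer addition, and integer multiplication.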