Fixed-Point Toolbox 3.0
Fixed-Point Arctangent Calculation
Developing an efficient fixed-point arctangent algorithm to estimate an angle is critical in many applications, including control of robotics, frequency tracking in wireless communications, and many more. This demo shows how to use the CORDIC algorithm and polynomial approximation to do a fixed-point calculation of the four quadrant inverse tangent. This implementation is equivalent to MATLAB®'s built-in function atan2, which only supports floating-point data types.
ATAN2(Y,X) is the four quadrant arctangent of the real parts of the elements of X and Y, where
.
Contents
- Calculate atan2(y,x) with the CORDIC Algorithm
- Visualize the Vectoring Mode CORDIC Iterations
- Convert the Floating-Point CORDIC Algorithm to Fixed Point
- Overall Error Analysis of the CORDIC Algorithm
- Accelerate the Fixed-Point CORDIC Algorithm Using emlmex
- Calculate atan2(y,x) Using Chebyshev Polynomial Approximation
- Algorithmic Error Comparison of the CORDIC and Polynomial Approximation Algorithms
- Convert the Floating-Point Chebyshev Polynomial Approximation Algorithm to Fixed Point
- Overall Error Analysis of the Polynomial Approximation Algorithm
- Cost Comparison of the Fixed-Point CORDIC and Polynomial Approximation Algorithms
- References
Calculate atan2(y,x) with the CORDIC Algorithm
CORDIC is an acronym for COordinate Rotation DIgital Computer. The Givens rotation-based CORDIC algorithm (see [1,2]) is among one of the most hardware efficient algorithms because it only requires shift-add iterative operations. The CORDIC algorithm is suitable for calculating a variety of functions, such as sine, cosine, arcsine, arccosine, arctangent, vector magnitude, divide, square root, hyperbolic and logarithmic functions.
Vectoring mode CORDIC equations are widely used to calculate atan(y/x). In vectoring mode, the CORDIC rotator rotates the input vector towards the positive X-axis in order to minimize the y component of the residual vector. For each iteration, if the y coordinate of the residual vector is positive, the CORDIC rotator will rotate using a negative angle (clockwise); otherwise, it will rotate with a positive angle (counter-clockwise). If the angle accumulator is initialized to 0, by the end of the iterations, the accumulated rotation angle is the angle of the original input vector.
In vectoring mode, the CORDIC equations are:


is the angle accumulator
where
if
, and
otherwise;
i = 0, 1, ..., N-1, and N is the total number of iterations.
As N approaches
:




As explained above, the arctangent can be directly computed using the vectoring mode CORDIC rotator with the angle accumulator being initialized to zero, i.e.,
and
.
Floating-Point CORDIC Code
The floating-point CORDIC arctangent algorithm is implemented in the cordic_atan_fltpt.m file. This function calculates arctangent in the range [-pi/2, pi/2] using the vectoring mode CORDIC algorithm. Both x and y must be real scalar inputs, and x must be greater than or equal to 0. The angle look-up table input is angleLUT = atan(2.^-(0:N-1)).
function [z, x, y] = cordic_atan_fltpt(y,x,N,angleLUT)
z = 0;
for i = 0:N-1,
x0 = x;
if y < 0 % negative y leads to counter clock-wise rotation
x = x0 - y*2^(-i);
y = y + x0*2^(-i);
z = z - angleLUT(i+1); % z_{i+1} = z_{i} + atan(2^{-
i})
else % positive y leads to clock-wise rotation
x = x0 + y*2^(-i);
y = y - x0*2^(-i);
z = z + angleLUT(i+1); % z_{i+1} = z_{i} - atan(2^{-
i})
end
endVisualize the Vectoring Mode CORDIC Iterations
The CORDIC algorithm is guaranteed to converge, but not always monotonically in a finite number of iterations. You can typically achieve greater accuracy by increasing the number of iterations. However, as you can see in the following example, intermediate iterations occasionally rotate the vector closer to the positive X-axis than the following iteration does. Even so, the CORDIC algorithm is usually run through a specified number of iterations. Ending the iterations early would break pipelined code, and the gain
would not be constant because
would vary.
In the following example, iteration 5 provides a better estimate of the angle than iteration 6, and the CORDIC algorithm converges in later iterations.
Initialize the input vector with angle
degrees, magnitude = 1
origFormat = get(0, 'format'); %store original format setting; % restore this at the end of the demo. format short % theta = 43*pi/180; % Input angle in radians Niter = 10; % Ten iterations inX = cos(theta); % x coordinate of the input vector inY = sin(theta); % y coordinate of the input vector % pre-allocate memories zf = zeros(1, Niter); xf = [inX, zeros(1, Niter)]; yf = [inY, zeros(1, Niter)]; angleLUT = atan(2.^-(0:Niter-1)); %pre-calculate the angle look-up table % Call floating-point CORDIC algorithm for k = 1:Niter [zf(k), xf(k+1), yf(k+1)] = cordic_atan_fltpt(inY, inX, k, angleLUT); end
The following output shows the CORDIC angle accumulation (in degrees) through 10 iterations. Note that the 5th iteration produced less error than the 6th iteration, and that the calculated angle quickly converged to the actual input angle after that.
angleAccumulator = zf*180/pi; angleError = angleAccumulator - theta*180/pi; fprintf('Iteration: %2d, Calculated angle: %7.3f, Error in degrees: %10g, Er ror in bits: %g\n',... [(1:Niter); angleAccumulator(:)'; angleError(:)';log2(abs(zf(:)'-the ta))]);
Iteration: 1, Calculated angle: 45.000, Error in degrees: 2, Erro r in bits: -4.84036 Iteration: 2, Calculated angle: 18.435, Error in degrees: -24.5651, Erro r in bits: -1.22182 Iteration: 3, Calculated angle: 32.471, Error in degrees: -10.5288, Erro r in bits: -2.44409 Iteration: 4, Calculated angle: 39.596, Error in degrees: -3.40379, Erro r in bits: -4.07321 Iteration: 5, Calculated angle: 43.173, Error in degrees: 0.172543, Erro r in bits: -8.37533 Iteration: 6, Calculated angle: 41.383, Error in degrees: -1.61737, Erro r in bits: -5.14671 Iteration: 7, Calculated angle: 42.278, Error in degrees: -0.722194, Erro r in bits: -6.3099 Iteration: 8, Calculated angle: 42.725, Error in degrees: -0.27458, Erro r in bits: -7.70506 Iteration: 9, Calculated angle: 42.949, Error in degrees: -0.0507692, Erro r in bits: -10.1403 Iteration: 10, Calculated angle: 43.061, Error in degrees: 0.0611365, Erro r in bits: -9.87218
As N approaches
, the CORDIC rotator gain
approaches 1.6476. In this example, the input
was on the unit circle, so the initial rotator magnitude is 1. The following output shows the rotator magnitude through 10 iterations:
rotatorMagnitude = sqrt(xf.^2+yf.^2); % CORDIC rotator gain through iteratio ns fprintf('Iteration: %2d, Rotator magnitude: %g\n',... [(0:Niter); rotatorMagnitude(:)']);
Iteration: 0, Rotator magnitude: 1 Iteration: 1, Rotator magnitude: 1.41421 Iteration: 2, Rotator magnitude: 1.58114 Iteration: 3, Rotator magnitude: 1.6298 Iteration: 4, Rotator magnitude: 1.64248 Iteration: 5, Rotator magnitude: 1.64569 Iteration: 6, Rotator magnitude: 1.64649 Iteration: 7, Rotator magnitude: 1.64669 Iteration: 8, Rotator magnitude: 1.64674 Iteration: 9, Rotator magnitude: 1.64676 Iteration: 10, Rotator magnitude: 1.64676
Note that
approaches 0, and
approaches
because
.
y_n = yf(end)
y_n = -0.0018
x_n = xf(end)
x_n =
1.6468
figno = 1;
fixpt_atan2_demo_plot(figno, xf, yf) %Vectoring Mode CORDIC Iterations
figno = figno + 1; %Cumulative Angle and Rotator Magnitude Through Iteration
s
fixpt_atan2_demo_plot(figno,Niter, theta, angleAccumulator, rotatorMagnitude
)
Convert the Floating-Point CORDIC Algorithm to Fixed Point
Compared with fixed-point calculations, floating-point calculations have no overflow issues and suffer much less precision loss from rounding operations.
To convert a floating-point algorithm to fixed point, we need to consider the hardware constraints, and the trade-offs between dynamic ranges and finite precision. Assume the input and output word lengths are limited to 16 bits, and the dynamic range of the input is [-1, +1]. Due to the CORDIC rotator gain, the dynamic range of the x and y register is within (-2,+2). To avoid overflow, we pick a signed fixed point input data type with a word length of 16 bits and a fraction length of 14 bits. This allows us to reuse the x and y registers in each CORDIC iteration.
Because the four quadrant CORDIC atan2 algorithm outputs estimated angles within
, we pick an output fraction length of 13 bits to avoid overflow and provide a dynamic range of [-4, +3.9998779296875].
The fixed-point algorithm uses the default full precision mode of the fimath object. When the numerator is a power-of-2 number, all division operations are replaced by bitshift operations.
originalDefaultFimath = fimath; % Save the current default fimath object % so that it can be restored at the end of t he demo. % Specify and set the default fimath object to be used in this demo. % To produce efficient code, Floor rounding and wrap overflow are used. F = fimath('RoundMode', 'floor', ... 'OverflowMode', 'wrap', ... 'ProductMode', 'FullPrecision', ... 'SumMode', 'FullPrecision'); setdefaultfimath(F);
CORDIC Rotator Gain
Although the CORDIC rotator gain
does not affect the final calculated angle, it does affect the intermediate quantities. Thus, to avoid overflow, the CORDIC rotator gain needs to be considered when selecting fraction lengths for the input and output data types during fixed-point algorithm development. The gain
is a constant for a given N, and quickly approaches a value of 1.64676. Thus, because the gain is always greater than 1 and less than 2, only one extra bit needs to be added to account for the growth in fixed-point algorithms. The following code shows the CORDIC rotator gain
for N=0 through N=16, where N=0 corresponds to no rotations.
for N=0:16 A = prod(sqrt(1+2.^(-2*(0:N-1)))); fprintf('A_%2d = %.14f\n',N,A) end
A_ 0 = 1.00000000000000 A_ 1 = 1.41421356237310 A_ 2 = 1.58113883008419 A_ 3 = 1.62980060130066 A_ 4 = 1.64248406575224 A_ 5 = 1.64568891575726 A_ 6 = 1.64649227871248 A_ 7 = 1.64669325427364 A_ 8 = 1.64674350659690 A_ 9 = 1.64675607020488 A_10 = 1.64675921113982 A_11 = 1.64675999637562 A_12 = 1.64676019268469 A_13 = 1.64676024176197 A_14 = 1.64676025403129 A_15 = 1.64676025709862 A_16 = 1.64676025786545
Fixed-Point Algorithm
The fixed-point CORDIC arctangent algorithm is implemented in the cordic_atan_fixpt.m file. In fixed-point arithmetic, multiplication by negative powers of two can be done more efficiently using bit shift right arithmetic operations. To do this, the fixed-point implementation of the CORDIC arctangent algorithm uses the bitsra function. The bitsra function uses sign extension to shift bits in on the left side, and shifts the extra bits off the right side without rounding.
function [z,x,y] = cordic_atan_fixpt(y,x,N,angleLUT)
z = angleLUT(1); z(:) = 0; % z is the same data type as angleLUT
for i = 0:N-1,
x0 = x;
if y < 0 % negative y leads to counter clock-wise rotation
x(:) = x0 - bitsra(y,i);
y(:) = y + bitsra(x0,i);
z(:) = z - angleLUT(i+1); % z_{i+1} = z_{i} + atan(2^
3;-i})
else % positive y leads to clock-wise rotation
x(:) = x0 + bitsra(y,i);
y(:) = y - bitsra(x0,i);
z(:) = z + angleLUT(i+1); % z_{i+1} = z_{i} - atan(2^
3;-i})
end
endFour-Quadrant CORDIC
The four quadrant CORDIC atan2 algorithm is implemented in the cordic_atan2.m file. It uses the 2-quadrant arctangent algorithm by passing in abs(x), and then using angle correction to calculate the second and third quadrant results.
function z = cordic_atan2(y,x,N)
if isfi(y)
% Fixed-point
Ty = numerictype(y);
Tz = numerictype(1, Ty.WordLength, Ty.WordLength - 3);
% Build the constant angle look-up-table with best possible rounding, the
n
% do arithmetic with floor rounding for efficiency since no bits will be
rounded
% off in addition.
angleLUT = fi(fi(atan(2.^-(0:N-1)), Tz,'RoundMode','convergent'),'Roundmo
de','floor');
z = fi(zeros(size(y)),Tz);
for k = 1:length(y)
z(k) = cordic_atan_fixpt(y(k),abs(x(k)),N,angleLUT);
end else
% Floating-point
angleLUT = atan(2.^-(0:N-1));
z = zeros(size(y));
for k = 1:length(y)
z(k) = cordic_atan_fltpt(y(k),abs(x(k)),N,angleLUT);
end
end for k = 1:length(y)
% Correct for second and third quadrant
if x(k) < 0
if y(k) >= 0
% Second quadrant
z(k) = pi - z(k);
else
% Third quadrant
z(k) = -pi - z(k);
end
end
endOverall Error Analysis of the CORDIC Algorithm
The overall error consists of two parts:
- The algorithmic error that results from the CORDIC rotation angle being represented by a finite number of basic angles.
- The quantization or rounding error that results from the finite precision representation of the angle look-up table, and the finite precision arithmetic used in fixed-point operations.
Calculate the CORDIC Algorithmic Error
theta = (-178:2:180)*pi/180; % angle in radians inXflt = cos(theta); % generates input vector inYflt = sin(theta); Niter = 12; % total number of iterations zflt = cordic_atan2(inYflt, inXflt, Niter); % floating-point algorithm
Calculate the maximum magnitude of the CORDIC algorithmic error by comparing the CORDIC computation to the builtin atan2 function.
format long
cordic_algErr_real_world_value = max(abs((atan2(inYflt, inXflt) - zflt)))
cordic_algErr_real_world_value =
4.753112306290497e-004
The log base 2 error is related to the number of iterations. In this example, we use 12 iterations and are accurate to 11 binary digits, so the magnitude of the error is less than 
cordic_algErr_bits = log2(cordic_algErr_real_world_value)
cordic_algErr_bits = -11.038839889583048
Calculate the CORDIC Overall Error
The Effect of Rounding Modes in CORDIC
Typically, Convergent, Round and Nearest rounding modes give better results than other rounding modes like Floor, Ceil and Fix. The sums and differences in the CORDIC algorithm are all done in full precision because all binary points are identical and we scaled the input such that it will never overflow. Similar to the >> operator in C, the bitsra operation used in the CORDIC algorithm shifts the bits of the operand to the right. Excess bits are shifted off the right side and discarded without regard to rounding mode. Hence, the rounding mode has no effect on fixed-point math in the CORDIC algorithm.
The only place that a more expensive rounding mode can increase precision in the CORDIC algorithm is in the building of the angle look-up table. In the cordic_atan2 function, we chose to use Convergent rounding to build the constant angle look-up table at initialization time, and then used Floor rounding to improve efficiency at run time.
Relationship Between Number of Iterations and Precision
Once the quantization error dominates the overall error, i.e., the quantization error is greater than the algorithmic error, increasing the total number of iterations won't significantly decrease the overall error of the fixed-point CORDIC algorithm.
It is recommended that you pick your fraction lengths and total number of iterations to ensure that the quantization error is smaller than the algorithmic error. In the CORDIC algorithm, the precision increases by one bit every iteration. Thus, there is no reason to pick a number of iterations greater than the precision of the input data. Another way to look at the relationship between the number of iterations and the precision, is in the right-shift step of the algorithm. For example, on the counter-clockwise rotation
x(:) = x0 - bitsra(y,i); y(:) = y + bitsra(x0,i);
if i is equal to the word-length of y and x0, then bitsra(y,i) and bitsra(x0,i) shift all the way to zero and do not contribute anything to the next step.
To ensure that we only measure the error from the fixed-point algorithm, and not the differences in input values, the floating-point reference is computed with the same inputs as the fixed-point CORDIC algorithm.
inXfix = fi(inXflt, 1, 16, 14); inYfix = fi(inYflt, 1, 16, 14); zref = atan2(double(inYfix), double(inXfix)); zfix8 = cordic_atan2(inYfix, inXfix, 8); zfix10 = cordic_atan2(inYfix, inXfix, 10); zfix12 = cordic_atan2(inYfix, inXfix, 12); zfix14 = cordic_atan2(inYfix, inXfix, 14); zfix15 = cordic_atan2(inYfix, inXfix, 15); cordic_err = bsxfun(@minus,zref,double([zfix8;zfix10;zfix12;zfix14;zfix15])) ;
The error depends on the number of iterations and the precision of the input data. In this example, the input data is in the range [-1, +1], and the number of fractional bits is 14. From the following tables showing the maximum error at each iteration, and the figure showing the overall error of the CORDIC algorithm, you can see that the error decreases by about 1 bit per iteration until the precision of the data is reached.
iterations = [8, 10, 12, 14, 15]; max_cordicErr_real_world_value = max(abs(cordic_err')); fprintf('Iterations: %2d, Max error in real-world-value: %g\n',... [iterations; max_cordicErr_real_world_value]);
Iterations: 8, Max error in real-world-value: 0.00784503 Iterations: 10, Max error in real-world-value: 0.00198566 Iterations: 12, Max error in real-world-value: 0.000609882 Iterations: 14, Max error in real-world-value: 0.000357782 Iterations: 15, Max error in real-world-value: 0.000357782
max_cordicErr_bits = log2(max_cordicErr_real_world_value);
fprintf('Iterations: %2d, Max error in bits: %g\n',[iterations; max_cordicEr
r_bits]);
Iterations: 8, Max error in bits: -6.994 Iterations: 10, Max error in bits: -8.97617 Iterations: 12, Max error in bits: -10.6792 Iterations: 14, Max error in bits: -11.4486 Iterations: 15, Max error in bits: -11.4486
figno = figno + 1; fixpt_atan2_demo_plot(figno, theta, cordic_err)
Accelerate the Fixed-Point CORDIC Algorithm Using emlmex
A C-MEX function can be generated from M-code using the Embedded MATLAB™ C-MEX generation emlmex command. Typically, running the generated C-MEX function can improve the simulation speed (see [3]). The actual speed improvement depends on the simulation platform being used. The following example shows how to accelerate the fixed-point CORDIC atan2 algorithm using emlmex.
The emlmex function compiles the M-File into a C-MEX function. This step requires the creation of a temporary directory and write permissions in this directory.
emlmexdir = [tempdir 'emlmexdir']; if ~exist(emlmexdir,'dir') mkdir(emlmexdir); end emlcurdir = pwd; cd(emlmexdir)
Compile the cordic_atan2 M-File into a C-MEX file. Note that when you declare the number of iterations to be a constant 12 using emlcoder.egc(12), the angle look-up table will also be constant, and thus won't be computed at each iteration. Also, when you call cordic_atan2_mex, you no longer need to give it the input argument for the number of iterations. If you do try to pass in the number of iterations, the mex-function will error.
The data type of the input parameters determines whether the cordic_atan2 function performs fixed-point or floating-point calculations. When the Embedded MATLAB subset generates code for this file, code is only generated for the specific data type. In other words, if the inputs are fixed point, only fixed-point code is generated.
inp = {inYfix, inXfix, emlcoder.egc(12)}; %Example inputs for the funct
ion
emlmex('cordic_atan2', '-o', 'cordic_atan2_mex', '-eg', inp)
First, calculate a vector of 4 quadrant atan2 by calling the M-file cordic_atan2.
tstart = tic; cordic_atan2(inYfix,inXfix,Niter); telapsed_Mcordic_atan2 = toc(tstart);
Next, calculate a vector of 4 quadrant atan2 by calling the C-MEX code cordic_atan2_mex
cordic_atan2_mex(inYfix,inXfix); % load the C-MEX file
tstart = tic;
cordic_atan2_mex(inYfix,inXfix);
telapsed_MEXcordic_atan2 = toc(tstart);
Now, compare the speed. Type the following in the MATLAB command window to see the speed improvement on your specific platform:
emlmex_speedup = telapsed_Mcordic_atan2/telapsed_MEXcordic_atan2;
To clean up the temporary directory, run the following commands:
cd(emlcurdir); clear cordic_atan2_mex; status = rmdir(emlmexdir,'s');
Calculate atan2(y,x) Using Chebyshev Polynomial Approximation
Polynomial approximation is a Multiply ACcumulation (MAC) centric algorithm. It can be a good choice for DSP implementations of non-linear functions like atan(x).
For a given degree of polynomial, and a given function f(x) = atan(x) evaluated over the interval of [-1, +1], the polynomial approximation theory tries to find the polynomial that minimizes the maximum value of
, where P(x) is the approximating polynomial. In general, one can obtain polynomials very close to the optimal one by approximating the given function in terms of Chebyshev polynomials and cutting off the polynomial at the desired degree.
The approximation of arctangent over the interval of [-1, +1] using the Chebyshev polynomial of the first kind is summarized in the following formula:

where

![$$ x \in [-1, +1] $$](/products/demos/shipping/fixedpoint/fixpt_atan2_demo_eq34141.png)



Therefore, the 3rd order Chebyshev polynomial approximation is

The 5th order Chebyshev polynomial approximation is

The 7th order Chebyshev polynomial approximation is

You can obtain four quadrant output through angle correction based on the properties of the arctangent function.
Algorithmic Error Comparison of the CORDIC and Polynomial Approximation Algorithms
In general, higher degrees of polynomial approximation produce more accurate final results. However, higher degrees of polynomial approximation also increase the complexity of the algorithm and require more MAC operations and more memory. To be consistent with the CORDIC algorithm and the MATLAB atan2 function, the input arguments consist of both x and y coordinates instead of the ratio y/x.
To eliminate quantization error, floating-point implementations of the CORDIC and Chebyshev polynomial approximation algorithms are used in the comparison. An algorithmic error comparison reveals that increasing the number of CORDIC iterations results in less error. It also reveals that the CORDIC algorithm with 12 iterations provides a slightly better angle estimation than the 5th order Chebyshev polynomial approximation. The approximation error of the 3rd order Chebyshev Polynomial is about 8 times bigger than that of the 5th order Chebyshev polynomial. The order or degree of the polynomial can be chosen based on the required accuracy of the angle estimation and the hardware constraints.
The coefficients of the Chebyshev polynomial approximation for atan(x), are shown in ascending order of x.
constA3 = [0.970562748477141, -0.189514164974601]; % 3rd order constA5 = [0.994949366116654,-0.287060635532652,0.078037176446441]; %5th ord er constA7 = [0.999133448222780 -0.320533292381664 0.144982490144465... -0.038254464970299]; %7th order theta = (-90:1:90)*pi/180; % angle in radians inXflt = cos(theta); inYflt = sin(theta); zfltRef = atan2(inYflt, inXflt); %Ideal output from ATAN2 function zfltp3 = poly_atan2(inYflt,inXflt,3,constA3); % 3rd order zfltp5 = poly_atan2(inYflt,inXflt,5,constA5); % 5th order zfltp7 = poly_atan2(inYflt,inXflt,7,constA7); % 7th order poly_algErr = [zfltRef;zfltRef;zfltRef] - [zfltp3;zfltp5;zfltp7]; zflt8 = cordic_atan2(inYflt, inXflt, 8); % Cordic Alg with 8 iterations zflt12 = cordic_atan2(inYflt, inXflt, 12); % Cordic Alg with 12 iterations cordic_algErr = [zfltRef;zfltRef] - [zflt8;zflt12];
The maximum algorithmic error magnitude (or infinity norm of the algorithmic error) for the CORDIC algorithm with 8 and 12 iterations is shown below:
max_cordicAlgErr = max(abs(cordic_algErr')); fprintf('Iterations: %2d, CORDIC algorithmic error in real-world-value: %g\n ',... [[8,12]; max_cordicAlgErr(:)']);
Iterations: 8, CORDIC algorithmic error in real-world-value: 0.00772146 Iterations: 12, CORDIC algorithmic error in real-world-value: 0.000483258
The log base 2 error shows the number of binary digits of accuracy. The 12th iteration of the CORDIC algorithm has an estimated angle accuracy of
:
max_cordicAlgErr_bits = log2(max_cordicAlgErr); fprintf('Iterations: %2d, CORDIC algorithmic error in bits: %g\n',... [[8,12]; max_cordicAlgErr_bits(:)']);
Iterations: 8, CORDIC algorithmic error in bits: -7.01691 Iterations: 12, CORDIC algorithmic error in bits: -11.0149
The following code shows the magnitude of the maximum algorithmic error of the polynomial approximation for orders 3, 5, and 7:
max_polyAlgErr = max(abs(poly_algErr')); fprintf('Order: %d, Polynomial approximation algorithmic error in real-world -value: %g\n',... [3:2:7; max_polyAlgErr(:)']);
Order: 3, Polynomial approximation algorithmic error in real-world-value: 0. 00541647 Order: 5, Polynomial approximation algorithmic error in real-world-value: 0. 000679384 Order: 7, Polynomial approximation algorithmic error in real-world-value: 9. 16204e-005
The log base 2 error shows the number of binary digits of accuracy.
max_polyAlgErr_bits = log2(max_polyAlgErr); fprintf('Order: %d, Polynomial approximation algorithmic error in bits: %g\n ',... [3:2:7; max_polyAlgErr_bits(:)']);
Order: 3, Polynomial approximation algorithmic error in bits: -7.52843 Order: 5, Polynomial approximation algorithmic error in bits: -10.5235 Order: 7, Polynomial approximation algorithmic error in bits: -13.414
figno = figno + 1; fixpt_atan2_demo_plot(figno, theta, cordic_algErr, poly_algErr)
Convert the Floating-Point Chebyshev Polynomial Approximation Algorithm to Fixed Point
Assume the input and output word lengths are constrained to 16 bits by the hardware, and the 5th order Chebyshev polynomial is used in the approximation. Because the dynamic range of inputs x, y and y/x are all within [-1, +1], we can avoid overflow by picking a signed fixed-point input data type with a word length of 16 bits and a fraction length of 14 bits. The coefficients of the polynomial are purely fractional and within (-1, +1), so we can pick their data types as signed fixed point with a word length of 16 bits and a fraction length of 15 bits (best precision). The algorithm is robust because
is within [-1, +1], and the multiplication of the coefficients and
is within (-1, +1). Thus, the dynamic range won't grow, and because of the pre-determined fixed-point data types, overflow is not expected.
Similar to the CORDIC algorithm, the four quadrant polynomial approximation-based atan2 algorithm outputs estimated angles within
. Therefore, we can pick an output fraction length of 13 bits to avoid overflow and provide a dynamic range of [-4, +3.9998779296875].
The basic floating-point Chebyshev polynomial approximation of arctangent over the interval [-1, +1] is implemented in the chebyPoly_atan_fltpt.m file.
function z = chebyPoly_atan_fltpt(y,x,N,constA,Tz)
tmp = y/x;
switch N
case 3
z = constA(1)*tmp + constA(2)*tmp^3;
case 5
z = constA(1)*tmp + constA(2)*tmp^3 + constA(3)*tmp^5;
case 7
z = constA(1)*tmp + constA(2)*tmp^3 + constA(3)*tmp^5 + constA(4)
*tmp^7;
otherwise
disp('Supported order of Chebyshev polynomials are 3, 5 and 7');
endThe basic fixed-point Chebyshev polynomial approximation of arctangent over the interval [-1, +1] is implemented in the chebyPoly_atan_fixpt.m file.
function z = chebyPoly_atan_fixpt(y,x,N,constA,Tz)
z = fi(0,'NumericType', Tz); tmp = x(1); %tmp is the same data type as input x tmp(:) = Tx.divide(y, x); % y/x;
tmp2 = x(1); tmp3 = x(1); tmp2(:) = tmp*tmp; % (y/x)^2 tmp3(:) = tmp2*tmp; % (y/x)^3 z(:) = constA(1)*tmp + constA(2)*tmp3; % for order N = 3
if (N == 5) || (N == 7)
tmp5 = x(1);
tmp5(:) = tmp3 * tmp2; % (y/x)^5
z(:) = z + constA(3)*tmp5; % for order N = 5
if N == 7
tmp7 = x(1);
tmp7(:) = tmp5 * tmp2; % (y/x)^7
z(:) = z + constA(4)*tmp7; %for order N = 7
end
endThe universal four quadrant atan2 calculation using Chebyshev polynomial approximation is implemented in the poly_atan2.m file.
function z = poly_atan2(y,x,N,constA,Tz)
if nargin<5,
% floating-point algorithm
fhandle = @chebyPoly_atan_fltpt;
Tz = [];
z = zeros(size(y));
else
% fixed-point algorithm
fhandle = @chebyPoly_atan_fixpt;
%pre-allocate output
z = fi(zeros(size(y)), 'NumericType', Tz);
end for idx = 1:length(y)
% fist quadrant
if abs(x(idx)) >= abs(y(idx))
% (0, pi/4]
z(idx) = feval(fhandle, abs(y(idx)), abs(x(idx)), N, constA, Tz);
else
% (pi/4, pi/2)
z(idx) = pi/2 - feval(fhandle, abs(x(idx)), abs(y(idx)), N, constA
, Tz);
end if x(idx) < 0
% second and third quadrant
if y(idx) < 0
z(idx) = -pi + z(idx);
else
z(idx) = pi - z(idx);
end
else % fourth quadrant
if y(idx) < 0
z(idx) = -z(idx);
end
end
endOverall Error Analysis of the Polynomial Approximation Algorithm
Similar to the CORDIC algorithm, the overall error of the polynomial approximation algorithm consists of two parts, i.e., the algorithmic error and the quantization error. The algorithmic error of the polynomial approximation algorithm was analyzed and compared to the algorithmic error of the CORDIC algorithm in a previous section.
Calculate the Quantization Error
The quantization error is computed by comparing the fixed-point polynomial approximation to the floating-point polynomial approximation.
F = fimath('RoundMode','Floor','OverflowMode','Saturate'); setdefaultfimath(F) % Quantize the inputs and coefficients with convergent rounding inXfix = fi(fi(inXflt, 1, 16, 14,'RoundMode','Convergent'),'RoundMode','Floo r'); inYfix = fi(fi(inYflt, 1, 16, 14,'RoundMode','Convergent'),'RoundMode','Floo r'); constAfix3 = fi(fi(constA3, 1, 16,'RoundMode','Convergent'),'RoundMode','Flo or'); constAfix5 = fi(fi(constA5, 1, 16,'RoundMode','Convergent'),'RoundMode','Flo or'); constAfix7 = fi(fi(constA7, 1, 16,'RoundMode','Convergent'),'RoundMode','Flo or'); Tz = numerictype(1, 16, 13); % output data type zfix3p = poly_atan2(inYfix,inXfix,3,constAfix3,Tz); % 3rd order zfix5p = poly_atan2(inYfix,inXfix,5,constAfix5,Tz); % 5th order zfix7p = poly_atan2(inYfix,inXfix,7,constAfix7,Tz); % 7th order poly_quantErr = bsxfun(@minus, [zfltp3;zfltp5;zfltp7], double([zfix3p;zfix5p ;zfix7p])); % polyOrder = 3:2:7; max_polyQuantErr_real_world_value = max(abs(poly_quantErr')); max_polyQuantErr_bits = log2(max_polyQuantErr_real_world_value); fprintf('PolyOrder: %2d, Quant error in bits: %g\n',... [polyOrder; max_polyQuantErr_bits]);
PolyOrder: 3, Quant error in bits: -12.3514 PolyOrder: 5, Quant error in bits: -11.784 PolyOrder: 7, Quant error in bits: -11.7412
Calculate the Overall Error
The overall error is computed by comparing the fixed-point polynomial approximation to the builtin atan2 function. The ideal reference output is zfltRef. The overall error of the 7th order polynomial approximation is dominated by the quantization error, which is due to the finite precision of the input data, coefficients and the rounding effects from the fixed-point arithmetic operations.
poly_err = bsxfun(@minus, zfltRef, double([zfix3p;zfix5p;zfix7p])); max_polyErr_real_world_value = max(abs(poly_err')); max_polyErr_bits = log2(max_polyErr_real_world_value); fprintf('PolyOrder: %2d, Overall error in bits: %g\n',... [polyOrder; max_polyErr_bits]);
PolyOrder: 3, Overall error in bits: -7.51907 PolyOrder: 5, Overall error in bits: -10.2401 PolyOrder: 7, Overall error in bits: -11.5883
figno = figno + 1; fixpt_atan2_demo_plot(figno, theta, poly_err)
The Effect of Rounding Modes in Polynomial Approximation
Compared to the CORDIC algorithm with 12 iterations and a 13 bit fraction length in the angle accumulator, the fifth order Chebyshev polynomial approximation gives a similar order of quantization error. In the following example, Nearest, Round and Convergent rounding modes give smaller quantization error than the Floor rounding mode.
Maximum magnitude of the quantization error using Floor rounding
poly5_quantErrFloor = max(abs(poly_quantErr(2,:))); poly5_quantErrFloor_bits = log2(poly5_quantErrFloor)
poly5_quantErrFloor_bits = -11.783967700537794
For comparison, calculate the maximum magnitude of the quantization error using Nearest rounding.
F = fimath('RoundMode','Nearest','OverflowMode','Saturate'); setdefaultfimath(F) constAfix5.fimath = F; inXfix.fimath = F; inYfix.fimath = F; zfixp5n = poly_atan2(inYfix,inXfix,5,constAfix5, Tz); poly5_quantErrNearest = max(abs(zfltp5 - double(zfixp5n))); poly5_quantErrNearest_bits = log2(poly5_quantErrNearest)
poly5_quantErrNearest_bits = -13.175966487895451
Cost Comparison of the Fixed-Point CORDIC and Polynomial Approximation Algorithms
The fixed-point CORDIC algorithm requires the following operations per iteration:
- 1 table lookup
- 2 shifts
- 3 additions
As a comparison, the N-th order fixed-point Chebyshev polynomial approximation algorithm requires the following operations:
- 1 division (only required if the ratio is not available as an input)
- (N+1) multiplications
- (N-1)/2 additions
In real world applications, selecting an algorithm for the fixed-point arctangent calculation typically depends on the required accuracy, cost and hardware constraints.
% Reset default fimath to the original default fimath setdefaultfimath(originalDefaultFimath); set(0, 'format', origFormat); %reset MATLAB output format close all;
References
- Jack E. Volder, The CORDIC Trigonometric Computing Technique, IRE Transactions on Electronic Computers, Volume EC-8, September 1959, pp330-334.
- Ray Andraka, A survey of CORDIC algorithm for FPGA based computers, Proceedings of the 1998 ACM/SIGDA sixth international symposium on Field programmable gate arrays, Feb. 22-24, 1998, pp191-200
- Speeding Up Fixed-Point Execution with the emlmex Function, in section "Working with the Fixed-Point Embedded MATLAB Subset" of Fixed-Point Toolbox™ User's Guide
Store