FPGA Based Cell-Averaging Constant False Alarm Rate (CA-CFAR) Detector - Algorithm Design and HDL Code Generation

This example shows how to design an FPGA implementation-ready CA-CFAR detector. To verify that the implementation model is functionally correct, the example compares its simulation output with the output of a CFAR behavioral model from Phased Array System Toolbox™. The term deployment here means designing a model that is suitable for implementation on an FPGA. The example verifies that the model is implementation ready. The HDL workflow is designed in fixed point.

Phased Array System Toolbox™ provides the floating-point behavioral model for the CFAR detector through the phased.CFARDetector System object. This behavioral model is used to verify the results of both the implementation model and the automatically generated HDL code.

Fixed-Point Designer™ provides data types and tools for developing fixed-point and single-precision algorithms to optimize performance on embedded hardware. Bit-true simulations can be performed to observe the impact of limited range and precision without implementing the design in hardware.

This example uses HDL Coder™ to generate HDL code from the developed Simulink® model and verifies the HDL code using HDL Verifier™. HDL Verifier™ is used to generate a co-simulation test bench model that verifies the behavior of the automatically generated HDL code. The test bench uses ModelSim® for co-simulation.

Algorithm Design

In a radar system, target detection is achieved by comparing the received signal power to a global threshold. If the received power is greater than the threshold, a target is declared present; otherwise, the target is declared absent. This makes the choice of threshold critical. The appropriate threshold value balances maximizing the probability of detection against minimizing the probability of false alarm.

The threshold is chosen based on a priori knowledge (an estimate) of the interferer power. The interferer power is affected by many external factors, so its variance is large when measured globally. When the threshold is constant, an increase in interferer power can lead to an increase in false detections and, at the same time, if the interferer power drops significantly, the target might not be detected.

The CFAR detector, as the name suggests, maintains the specified false alarm rate by means of adaptive thresholding: the threshold is calculated from the locality of the Cell Under Test (CUT), which is the cell for which detection is required. The interference power of the neighboring cells is used to calculate the threshold for a CUT. The detection threshold is calculated as

$$T = -P_n^2 \ln(P_{FA}) = \alpha P_n^2$$

where $P_{FA}$ is the probability of false alarm, $\alpha = -\ln(P_{FA})$ is the threshold factor when the interference power is known exactly, and $P_n^2$ is the interference power level.

Cell-Averaging (CA) CFAR Detector

In a CA-CFAR detector, the lead and lag cells (the training cells) are used to calculate the interferer estimate. The number of lead cells is the same as the number of lag cells. CA-CFAR assumes that the cells neighboring the CUT share the same interference statistics (homogeneous interference) and that the target is present only in the CUT. To reinforce the second assumption, guard cells are placed immediately adjacent to the CUT and are excluded from the estimate.

For a CA-CFAR detector with independent and identically distributed (i.i.d.) Gaussian interference (standard normal), the average noise power is the mean of the square-law detector output over all the training cells:

$$\widehat{P_n^2} = \frac{1}{N}\sum\limits_{i=1}^N x_i$$

Here $x_i$ is the signal from the i-th training cell. For a given Probability of False Alarm, the threshold factor can be calculated as,

$$\alpha = N\left(P_{FA}^{-1/N} - 1\right)$$
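As a quick numerical check of the threshold factor, the following sketch evaluates the expression above in base MATLAB. The helper name cfarAlpha and the choice of N = 100 are illustrative assumptions, not part of the example model. The value -ln(P_FA), which applies when the noise power is known exactly, is included for comparison.

```matlab
% Threshold factor for CA-CFAR with N training cells (square-law detector).
% cfarAlpha is a hypothetical helper; N = 100 is an illustrative choice.
cfarAlpha = @(N, Pfa) N .* (Pfa.^(-1 ./ N) - 1);

Pfa        = 0.005;
alphaN100  = cfarAlpha(100, Pfa);   % finite number of training cells
alphaLimit = -log(Pfa);             % known-noise-power limit (N -> infinity)

fprintf('alpha (N = 100)  = %.4f\n', alphaN100);
fprintf('alpha (N -> inf) = %.4f\n', alphaLimit);
```

With a finite number of training cells the factor is slightly larger than the limiting value, which reflects the extra margin needed because the noise power is only estimated.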

The training cells and guard cells, together with the CUT, are called the CFAR window. The following figure shows a representation of a CFAR window.

Implementation Model

The implementation model is designed using HDL Coder™ compatible blocks from the Simulink® HDL Coder™ library. This example uses the following parameter values:

  • No. of Training Cells = 50

  • No. of Guard Cells = 2

  • Probability of False Alarm = 0.005

  • Total No. of Cells = 1000
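To make the role of these parameters concrete, here is a minimal floating-point CA-CFAR reference written in base MATLAB. It is only a sketch and is not the Simulink model or the phased.CFARDetector object. It assumes 50 training cells and 1 guard cell on each side of the CUT (consistent with the 2*NumberTrainingCells + NoGuardCells window used later), and the variable names and synthetic test signal are hypothetical.

```matlab
% Floating-point CA-CFAR reference on a square-law signal (base MATLAB only).
% Assumption: nTrainSide training cells and nGuardSide guard cells on EACH
% side of the CUT; these names and the per-side split are hypothetical.
nTrainSide = 50;
nGuardSide = 1;
Pfa        = 0.005;
nCells     = 1000;

N     = 2 * nTrainSide;                  % total training cells
alpha = N * (Pfa^(-1 / N) - 1);          % threshold factor

% Synthetic data: unit-power complex Gaussian noise plus one strong target.
rng(0);
x = (randn(nCells, 1) + 1j * randn(nCells, 1)) / sqrt(2);
x(500) = x(500) + 10;
p = abs(x).^2;                           % square-law detector output

threshold = nan(nCells, 1);              % NaN where the window does not fit
detection = false(nCells, 1);
half = nTrainSide + nGuardSide;          % cells needed on each side of the CUT
for cut = (half + 1):(nCells - half)
    lagCells  = p(cut - half : cut - nGuardSide - 1);
    leadCells = p(cut + nGuardSide + 1 : cut + half);
    noisePower     = (sum(lagCells) + sum(leadCells)) / N;
    threshold(cut) = alpha * noisePower;
    detection(cut) = p(cut) > threshold(cut);
end

fprintf('Cells flagged as detections: %d\n', nnz(detection));
```

Cells near the edges are left without a threshold here because the full window does not fit. A streaming implementation handles this with an initial delay instead.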

The following commands are used to open the Simulink model.

modelname = 'SimulinkCFARHDLWorkflowExample';
% Open the model so that it is visible and not obstructed by scopes
open_system(modelname);

The Simulink model consists of two branches from the input block. The top branch is the behavioral model, which uses the floating-point phased.CFARDetector System object. The bottom branch is the functionally equivalent fixed-point implementation model.

The input to the behavioral model is an NCell-by-1 (1000-by-1) matrix. The input passes through the Square-Law subsystem, which performs the square-law operation, and is then forwarded to the CFAR Detector behavioral model.

The input to the implementation model is provided through a buffer, which converts the multidimensional signal into a one-dimensional data stream for a deployable model. The input data type is then converted to fixed point using the Quantize block. The fixed-point data type has a word length of 24 bits and a fraction length of 12 bits. The trade-off between fixed-point settings, resource utilization, and accuracy is discussed later in this example. The input is then passed to the CFAR implementation model subsystem, which performs the CFAR detection.
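The effect of a 24-bit word with a 12-bit fraction can be approximated in base MATLAB without any fixed-point toolbox. The sketch below is an emulation under stated assumptions (signed values, round-to-nearest, saturation on overflow). The actual Quantize block settings in the model may differ, and the quantize helper is hypothetical.

```matlab
% Emulate signed fixed-point quantization (word length 24, fraction length 12).
% Rounding and saturation behavior are assumptions, not taken from the model.
wordLength     = 24;
fractionLength = 12;

step   = 2^(-fractionLength);                  % value of one LSB
maxVal = (2^(wordLength - 1) - 1) * step;      % largest representable value
minVal = -2^(wordLength - 1) * step;           % smallest representable value

% Hypothetical helper: round to the nearest step, then saturate.
quantize = @(v) min(max(round(v / step) * step, minVal), maxVal);

v  = [0.1, pi, -2.718281828, 5000];
vq = quantize(v);

fprintf('LSB = %.6g, range = [%.4f, %.4f]\n', step, minVal, maxVal);
disp([v; vq; abs(v - vq)]);                    % value, quantized value, error
```

In-range values are off by at most half an LSB, while the last value exceeds the representable range and saturates. For complex samples the same operation would be applied to the real and imaginary parts separately.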

The output of the CFAR Detector behavioral model is passed through a delay of 125 cycles to compensate for the latency of the implementation model output.

The scope block plots the threshold and detection outputs of the behavioral model and the implementation model. In addition, the error between the thresholds of the two models and the error between their detection outputs are also calculated and plotted.

The implementation model contains the following sub-systems:

  1. Square-Law HDL

  2. Alpha HDL

  3. CFAR Core HDL

  4. Validate

Square-Law HDL

The following command is used to open the Square-Law HDL subsystem model

open_system([modelname '/CFAR Implementation Model/Square Law HDL']);

The model computes the square-law envelope of the complex input signal.

For the implementation model, the square-law operation is designed as a deployable model with additional pipelining registers. It is implemented using adders and multipliers, which account for a latency of 6 cycles.
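The arithmetic behind the square-law operation can be sketched as follows. This shows only the multiply-and-add structure on illustrative sample values. It does not model the pipeline registers or the 6-cycle latency.

```matlab
% Square-law detector on complex samples: |x|^2 = real(x)^2 + imag(x)^2.
x  = [3 + 4j; 1 - 1j; -2 + 0j];   % illustrative complex samples
re = real(x);
im = imag(x);
p  = re .* re + im .* im;          % two multiplications and one addition per sample

disp(p.');
```

Computing the squared magnitude this way avoids a square root, which is why only multipliers and adders are needed.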

Alpha HDL

The following command is used to open the Alpha HDL subsystem model

open_system([modelname '/CFAR Implementation Model/Alpha HDL']);

The Alpha HDL block uses the number of training cells and the probability of false alarm to calculate the threshold factor ($\alpha$).

This subsystem uses Math HDL blocks and works in single precision for the native floating-point division operation. The result is converted to fixed point at the output. This block accounts for a pipeline latency of 7 cycles.


CFAR Core HDL

The following command is used to open the CFAR Core subsystem model.

open_system([modelname '/CFAR Implementation Model/CFAR Core']);

This subsystem extracts the training cells from the input stream and calculates the noise power. The noise power is then multiplied by alpha to generate the threshold which is later used in the detection process. There are two outputs from this block, namely, threshold and detection.

The threshold is the direct output of the product block. The detection output comes from a comparator block that compares the CUT signal value with the threshold and returns true if the CUT signal is greater than the threshold.

This subsystem accounts for an input streaming delay of 102 (2*NumberTrainingCells + NoGuardCells) clock cycles with an additional pipelining latency of another 23 cycles. The total HDL latency is 125 cycles.

This block contains the following subsystems:

  1. Training HDL

  2. CUT HDL

Training HDL

The following command is used to open the Training HDL subsystem model.

open_system([modelname '/CFAR Implementation Model/CFAR Core/Training HDL']);

The lead training HDL subsystem extracts the lead cells of the CUT and computes a running sum. At the same time, the lag training HDL subsystem extracts the lag cells of the CUT and computes a running sum. Each subsystem has a latency of 8 cycles. The lead and lag training subsystems are implemented much like a moving average filter, except that they output the sum of the window elements instead of the average.
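One common way to realize such a running sum on a sample stream is to add the newest sample and subtract the sample that just left the window. The sketch below illustrates that idea in base MATLAB. It is an assumption about the structure, not a description of the actual subsystem, and the input stream is made up for illustration.

```matlab
% Sliding-window sum updated once per sample: add the newest sample and
% subtract the one that just left the window.
windowLength = 50;                  % training cells on one side of the CUT
x = (1:200)';                       % illustrative input stream
runningSum = zeros(size(x));
acc = 0;
for n = 1:numel(x)
    acc = acc + x(n);
    if n > windowLength
        acc = acc - x(n - windowLength);   % drop the oldest sample
    end
    runningSum(n) = acc;
end

% Cross-check against a direct FIR formulation with all-ones coefficients.
reference = filter(ones(windowLength, 1), 1, x);
fprintf('Max difference from FIR reference: %g\n', max(abs(runningSum - reference)));
```

The update costs one addition and one subtraction per sample regardless of the window length, which is what makes this form attractive for hardware.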

The CA noise power HDL subsystem adds the lead power value and the lag power value and estimates the average noise power by dividing the sum by 100 (2*NoTrainingCells). This block accounts for a delay of 3 clock cycles.

The output of the training HDL subsystem is the noise power which is used to calculate the threshold.


CUT HDL

The following command is used to open the CUT HDL subsystem model.

open_system([modelname '/CFAR Implementation Model/CFAR Core/CUT HDL']);

This subsystem uses a single delay block with a delay of 102 (2*NumberTrainingCells + NoGuardCells) cycles to time-align the CUT with the threshold value generated by the Training HDL block. This is the minimum delay required before the CFAR detector can detect a target at the first cell.


Validate

The following command is used to open the Validate subsystem model.

open_system([modelname '/CFAR Implementation Model/Validate']);

The valid input, along with the latency, is used to check the validity of the output. When the output is not valid, this subsystem sends zero to the output.
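The gating behavior can be pictured as delaying the valid flag by the total latency and using it to mask the output. The following base MATLAB sketch assumes that interpretation. The signal names are hypothetical and the raw output is a stand-in, not data from the model.

```matlab
% Gate an output stream with a valid flag delayed by the total latency.
% Assumption: the output is valid exactly 'latency' samples after a valid input.
latency = 125;                                    % total HDL latency from the text
nCells  = 1000;

validIn  = [true(nCells, 1); false(latency, 1)];  % valid input, then flush cycles
rawOut   = ones(nCells + latency, 1);             % stand-in for the raw output
validOut = [false(latency, 1); validIn(1:end - latency)];  % delayed valid flag
gatedOut = rawOut .* validOut;                    % zero whenever output is not valid

fprintf('Valid output samples: %d\n', nnz(validOut));
```

The first 125 output samples are forced to zero because no complete result has emerged from the pipeline yet.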

Comparing the Results of Implementation Model to the Behavioral Model

The model can be simulated by clicking the Play button or by using the sim command as shown below.

sim(modelname);


The Scope blocks are used to compare the outputs of the implementation and behavioral models. One scope displays the detection and threshold from both models, and an additional scope displays the calculated error.

The implementation model has a data streaming latency of 102 cycles and a pipelining latency of 23 cycles, for an overall latency of 125 cycles. To time-align the behavioral model with the implementation model, an additional delay of 125 cycles is applied to the output of the behavioral model.
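The latency budget and the alignment step can be written out explicitly. In this sketch the numbers come from the text, while the variable names and the reference stream are hypothetical placeholders.

```matlab
% Latency budget quoted in the text, and a delay that aligns two streams.
nTrainingCells  = 50;
nGuardCells     = 2;
pipelineLatency = 23;

streamingDelay = 2 * nTrainingCells + nGuardCells;   % 102 cycles
totalLatency   = streamingDelay + pipelineLatency;   % 125 cycles

% Delay a reference stream by totalLatency samples (zeros fill the start),
% keeping the original length, as a chain of unit delays would.
reference        = (1:1000)';                        % placeholder data
delayedReference = [zeros(totalLatency, 1); reference(1:end - totalLatency)];

fprintf('Streaming delay = %d, total latency = %d cycles\n', ...
    streamingDelay, totalLatency);
```

After this shift, sample k of the reference lines up with sample k + 125 of the delayed stream, which is where the implementation model produces the corresponding result.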

With the 24-bit fixed-point data type and a fraction length of 12 bits, the error between the behavioral model threshold and the implementation model threshold is bounded by approximately 0.006. Because the detection output is Boolean, there is no significant error in the detection output.

Code Generation and Verification

This section covers the procedure for generating HDL code for the implementation model and for verifying that the generated code is functionally correct. The behavioral model provides the reference values used to validate the output of the HDL model.

If you start with a new model, you can run hdlsetup (HDL Coder™) to configure the Simulink model for HDL code generation. To configure the Simulink model for test bench creation, open Model Settings in Simulink, select Test Bench under HDL Code Generation in the left panel, and check HDL test bench and Co-simulation model in the Test Bench Generation Output properties group.

Model Settings

After the fixed-point implementation is verified and the implementation model produces the same results as the floating-point behavioral model, you can generate the HDL code and test bench. To do so, set the HDL Code Generation parameters in the Configuration Parameters dialog box. The following parameters in Model Settings are set under HDL Code Generation:

  • Target: Xilinx Vivado synthesis tool; Virtex7 family; Device xc7vx485t; package ffg1761, speed -1; and target frequency of 300 MHz.

  • Optimization: Uncheck all optimizations

  • Global Settings: Set the Reset type to Asynchronous

  • Test Bench: Select HDL test bench, Co-simulation model, and SystemVerilog DPI test bench

HDL Code Verification via Co-Simulation

After the model is set up, HDL Workflow Advisor can be invoked to generate the HDL code using HDL Coder™ and to generate a SystemVerilog DPI test bench using HDL Verifier™ to test the model. To invoke HDL Workflow Advisor, right-click the CFAR Implementation Model subsystem, navigate to HDL Code, and click HDL Workflow Advisor. Instead of using HDL Workflow Advisor, the following lines of code can also be used to generate the HDL code and test bench.

% Uncomment the following two lines to generate HDL code and test bench.
% makehdl([modelname '/CFAR Implementation Model']);   % Generate HDL code
% makehdltb([modelname '/CFAR Implementation Model']); % Generate co-simulation test bench

Because all the optimizations are unchecked, no extra delays need to be added to the behavioral output beyond the HDL latency added previously. This is because all the critical paths are manually pipelined in the implementation model.

After generating HDL code and test bench a new Simulink model named gm_<modelname>_mq containing a ModelSim® Simulator block is created in your working directory, which looks like this:

% To open the test bench model, uncomment the following lines of code
% modelname = ['gm_',modelname,'_mq'];
% open_system(modelname);

Launch ModelSim® and run the co-simulation model to display the simulation results. You can click the Play button at the top of the Simulink canvas to run the test bench, or you can run it from the command window using the code below.

% Uncomment the following line, to run the test bench.
% sim(modelname);

The Simulink® test bench model populates ModelSim® with the HDL model's signals and populates the Time Scopes in Simulink®.

The Simulink® scope shows detection output and threshold output for both the co-simulation and Design Under Test (DUT) as well as the error between them. The scopes comparing the results of the co-simulation can be found in test bench model inside the Compare subsystem, which is at the output of the CFAR_HDL_mq subsystem.

Fixed-Point Word Length and Fraction Length Tradeoffs

For this example, a fixed-point word length of 24 bits and a fraction length of 12 bits are used for simulation and implementation. The following figures show the trade-off: a longer fraction length increases the precision (reduces the error) but also increases the resource utilization.
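The precision side of this trade-off can be explored numerically. The sketch below sweeps a few fraction lengths and measures the worst-case rounding error on illustrative data. It assumes round-to-nearest quantization and ignores saturation, and it says nothing about resource utilization, which depends on synthesis results.

```matlab
% Worst-case rounding error versus fraction length (round-to-nearest assumed).
fractionLengths = 8:2:16;                  % illustrative sweep
v = linspace(0, 10, 1001);                 % illustrative test values
maxErr = zeros(size(fractionLengths));
for k = 1:numel(fractionLengths)
    step      = 2^(-fractionLengths(k));   % precision (one LSB)
    vq        = round(v / step) * step;    % quantized values
    maxErr(k) = max(abs(v - vq));
end

% Columns: fraction length, precision, observed worst-case error.
disp([fractionLengths; 2.^(-fractionLengths); maxErr].');
```

Each additional fraction bit halves the precision step, and the observed rounding error never exceeds half a step.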

The following plot shows fraction length associated with chosen word length

The following plot shows the Precision with respect to chosen word length

The following plot shows the Error with respect to chosen word length (Precision)

The following plots show the LUT/Registers/DSP Utilization with respect to chosen Word Length


Summary

This example demonstrated how to design a Simulink model for a Cell-Averaging Constant False Alarm Rate (CA-CFAR) detector and verify its results against an equivalent behavioral setup from Phased Array System Toolbox™. It also demonstrated how to automatically generate HDL code for the fixed-point equivalent algorithm and verify the generated code in Simulink®. The HDL code and a co-simulation test bench were generated for the Simulink subsystem, which was built with blocks that support HDL code generation. Finally, the example showed how to set up and launch ModelSim® to cosimulate the HDL code and compare its output to the output generated by the HDL implementation model.