IFFT HDL Optimized

Inverse fast Fourier transform—optimized for HDL code generation

Library

DSP System Toolbox/Transforms

dspxfrm3

Description

The IFFT HDL Optimized block provides two architectures to optimize either throughput or area. Use the streaming Radix 2^2 architecture for high-throughput applications. This architecture supports scalar or vector input data. You can achieve giga-sample-per-second (GSPS) throughput using vector input. Use the burst Radix 2 architecture for a minimum resource implementation, especially with large FFT sizes. Your system must be able to tolerate bursty data and higher latency. This architecture supports only scalar input data. The block accepts real or complex data, provides hardware-friendly control signals, and has optional output frame control signals.

Signal Attributes

This IFFT HDL Optimized block icon shows all optional ports available.

| Port | Direction | Description | Data Type |
|---|---|---|---|
| dataIn | Input | Scalar or column vector of real or complex input data. Vector input is supported with Streaming Radix 2^2 architecture only. The vector size must be a power of 2 from 1 through 64 that is not greater than the FFT length. | fixdt(), int64/32/16/8, or uint64/32/16/8. double/single are allowed for simulation but not for HDL code generation. |
| validIn | Input | Indicates that the input data is valid. When validIn is true, the block captures the value on dataIn. | boolean |
| reset | Input | Optional. Reset internal state. When reset is true, the block stops the current calculation and clears all internal state. The block begins fresh calculations when reset is false and validIn starts a new frame. | boolean |
| dataOut | Output | Frequency channel output data. The output order is bit reversed by default. | Same as dataIn. If scaling is disabled, the output word length grows to avoid overflow. See the Divide butterfly outputs by two parameter. |
| validOut | Output | Indicates that the output data is valid. The block sets validOut to true with each valid sample on dataOut. | boolean |
| ready | Output | This port appears when you select the burst architecture. Indicates when the block has memory available for new input data. | boolean |
| startOut | Output | Optional. When this port is enabled, the block sets startOut to true during the first valid cycle of a frame of output data. | boolean |
| endOut | Output | Optional. When this port is enabled, the block sets endOut to true during the last valid cycle of a frame of output data. | boolean |

Parameters

Main

FFT length

Specify the number of data points used for one IFFT calculation. The default value is 1024. For HDL code generation, the FFT length must be a power of 2 between 2^3 and 2^16.
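The length and vector-size limits can be checked before a model is built. The following Python sketch is only an illustration of the documented constraints (FFT length a power of 2 from 2^3 through 2^16 for HDL code generation, vector size a power of 2 from 1 through 64 and not greater than the FFT length). The function names are hypothetical and are not part of any MathWorks API.

```python
def is_power_of_two(n):
    """Return True when n is a positive power of two (1, 2, 4, ...)."""
    return n > 0 and (n & (n - 1)) == 0


def check_ifft_config(fft_length, vector_size=1):
    """Collect problems with a proposed configuration.

    Hypothetical helper: it only mirrors the limits stated in the text
    and returns an empty list when the configuration looks acceptable.
    """
    problems = []
    if not is_power_of_two(fft_length) or not (2**3 <= fft_length <= 2**16):
        problems.append("FFT length must be a power of 2 between 2^3 and 2^16")
    if not is_power_of_two(vector_size) or not (1 <= vector_size <= 64):
        problems.append("vector size must be a power of 2 from 1 through 64")
    elif vector_size > fft_length:
        problems.append("vector size must not exceed the FFT length")
    return problems
```

For example, check_ifft_config(1024, 16) returns an empty list, while check_ifft_config(8, 16) reports that the vector size exceeds the FFT length.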

Architecture
  • Streaming Radix 2^2 (default) — Low-latency architecture. Supports giga-sample-per-second (GSPS) throughput when you use vector input.

  • Burst Radix 2 — Minimum resource architecture. Vector input is not supported when you select this architecture.

For details of both architectures, see Algorithm.

Complex Multiplication

Select the HDL implementation of complex multipliers. Each multiplication is implemented with either 3 multipliers and 5 adders, or 4 multipliers and 2 adders. The faster or smaller option depends on your synthesis tool and target device. This option applies only when you set Architecture to Streaming Radix 2^2.
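The two multiplier options compute the same complex product with different operation counts. The Python sketch below shows one well-known way to trade a multiplier for three extra adders (often called Gauss's trick). It is an assumption that the generated HDL uses this particular factoring; the text states only the operation counts.

```python
def cmul_4m2a(a, b, c, d):
    """(a + jb) * (c + jd) using 4 multiplications and 2 additions."""
    real = a * c - b * d
    imag = a * d + b * c
    return real, imag


def cmul_3m5a(a, b, c, d):
    """Same product using 3 multiplications and 5 additions."""
    k1 = c * (a + b)  # addition 1, multiplication 1
    k2 = a * (d - c)  # addition 2, multiplication 2
    k3 = b * (c + d)  # addition 3, multiplication 3
    real = k1 - k3    # addition 4: equals a*c - b*d
    imag = k1 + k2    # addition 5: equals a*d + b*c
    return real, imag
```

Both functions return the same (real, imag) pair for integer inputs. In hardware, the choice affects DSP block use versus adder logic, which is why the better option depends on the synthesis tool and target device.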

Output in bit-reversed order

When you select this check box, the output elements are bit reversed relative to the input order. Clear the check box to output elements in linear order. By default, the check box is selected. The IFFT algorithm calculates output in the reverse order to the input. If you specify the output to be in the same order as the input, the algorithm performs an extra reversal operation. For vector data, input and output data must be in opposite orders, so select only one of Output in bit-reversed order or Input in bit-reversed order. For more information, see Linear and Bit-Reversed Output Order.

Input in bit-reversed order

When you select this check box, the block expects input data in bit-reversed order. By default, the check box is cleared and input is expected in linear order. The IFFT algorithm calculates output in the reverse order to the input. If you specify the output to be in the same order as the input, the algorithm performs an extra reversal operation. For vector data, input and output data must be in opposite orders, so select only one of Output in bit-reversed order or Input in bit-reversed order. For more information, see Linear and Bit-Reversed Output Order.
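Bit-reversed order means that the element at position p corresponds to the index obtained by reversing the log2(N)-bit binary representation of p. A minimal Python sketch of that index mapping follows; the helper names are hypothetical.

```python
def bit_reverse(index, num_bits):
    """Reverse the lowest num_bits bits of index."""
    result = 0
    for _ in range(num_bits):
        result = (result << 1) | (index & 1)
        index >>= 1
    return result


def bit_reversed_order(fft_length):
    """Index sequence for a power-of-two length in bit-reversed order."""
    num_bits = fft_length.bit_length() - 1  # log2(N) when N is a power of two
    return [bit_reverse(i, num_bits) for i in range(fft_length)]
```

For an FFT length of 8, bit_reversed_order(8) gives [0, 4, 2, 6, 1, 5, 3, 7]. Applying the mapping twice restores linear order, which is why a second reversal stage can return the output to the input order.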

Divide butterfly outputs by two

When you select this check box, the block implements an overall 1/N scale factor by dividing the output of each butterfly multiplication by 2. This adjustment keeps the output of the IFFT in the same amplitude range as its input. If scaling is disabled, the block avoids overflow by increasing the word length by one bit after each butterfly multiplication. The bit growth is the same for both architectures. By default, the check box is selected.
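The arithmetic behind this parameter can be sketched in Python. The sketch assumes that an N-point transform contains log2(N) butterfly steps, so dividing by 2 at every step gives 1/N overall, and that growth without scaling is one bit per step. The function names are hypothetical, and the actual output type is best confirmed in the model.

```python
from fractions import Fraction


def butterfly_count(fft_length):
    """Number of radix-2 butterfly steps, log2(N), for a power-of-two N."""
    return fft_length.bit_length() - 1


def overall_scale(fft_length):
    """Scale factor that results from dividing by 2 after every step."""
    return Fraction(1, 2) ** butterfly_count(fft_length)


def output_word_length(input_bits, fft_length, scaling_enabled):
    """Estimated output word length under the one-bit-per-step assumption."""
    if scaling_enabled:
        return input_bits  # word length stays the same as the input
    return input_bits + butterfly_count(fft_length)
```

Under these assumptions, a 1024-point IFFT with 16-bit input keeps 16 bits when scaling is enabled and grows to 26 bits when it is disabled.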

Data Types

Rounding Method

The default rounding method for internal fixed-point calculations is Floor. When the input is any integer or fixed-point data type, the IFFT algorithm uses fixed-point arithmetic for internal calculations. This option does not apply when the input is single or double type. Rounding applies to twiddle factor multiplication and scaling operations.

Control Ports

Enable reset input port

Select this check box to enable the reset port. When reset is true, the block stops the current calculation and clears all internal state. The block begins fresh calculations when reset is false and validIn starts a new frame. By default, the check box is not selected.

Enable start output port

Select this check box to enable the startOut port. When enabled, this port is present on the block icon, and this output signal is asserted (true) for the first cycle of an output frame. By default, the check box is not selected.

Enable end output port

Select this check box to enable the endOut port. When enabled, this port is present on the block icon, and this output signal is asserted (true) for the last cycle of an output frame. By default, the check box is not selected.

Algorithm

Streaming Radix 2^2

The streaming Radix 2^2 architecture implements a low-latency architecture. It saves resources compared to a streaming Radix 2 implementation by factoring and grouping the FFT equation. The architecture has log4(N) stages. Each stage contains two single-path delay feedback (SDF) butterflies with memory controllers. When you use vector input, each stage operates on fewer input samples, so some stages reduce to a simple butterfly, without SDF.

The first SDF stage is a regular butterfly. The second stage multiplies by –j by swapping the real and imaginary parts of the input, and swapping the imaginary parts of the output. Each stage rounds the result of the twiddle factor multiplication to the input word length. The twiddle factors have the same bit width as the input data. They use two integer bits, and the remainder are fractional bits.
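Multiplying by -j needs no hardware multiplier because (a + jb)(-j) = b - ja: the real and imaginary parts trade places and one of them changes sign. The Python sketch below shows only this identity; how the block wires the swap and the sign change internally is not described here, so treat the function as an illustration.

```python
def times_minus_j(real, imag):
    """Return (real + j*imag) * (-j) as a (real, imag) pair.

    No multiplication is used: the parts are swapped and the new
    imaginary part is negated.
    """
    return imag, -real
```

For example, times_minus_j(3, 4) returns (4, -3), which matches complex(3, 4) * -1j.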

If you enable scaling, the algorithm divides the result of each butterfly stage by 2. Scaling at each stage avoids overflow, keeps the word length the same as the input, and results in an overall scale factor of 1/N. If scaling is disabled, the algorithm avoids overflow by increasing the word length by 1 bit at each stage. The diagram shows the butterflies and internal word lengths of each stage, not including the memory.
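The effect of per-stage scaling can be seen in a small floating-point model. The Python sketch below is a plain radix-2 decimation-in-frequency IFFT, not the Radix 2^2 SDF pipeline that the block implements. It divides every butterfly output by 2, which yields the overall 1/N factor, and it leaves the result in bit-reversed order. All names are hypothetical.

```python
import cmath


def ifft_dif_scaled(freq):
    """Radix-2 DIF inverse FFT with a divide-by-2 after every butterfly.

    The input length must be a power of two. The returned list is in
    bit-reversed order and already carries the overall 1/N scaling.
    """
    n = len(freq)
    data = [complex(v) for v in freq]
    half = n // 2
    while half >= 1:
        for start in range(0, n, 2 * half):
            for k in range(half):
                a = data[start + k]
                b = data[start + k + half]
                # Inverse transform uses a positive exponent in the twiddle.
                twiddle = cmath.exp(2j * cmath.pi * k / (2 * half))
                data[start + k] = (a + b) / 2
                data[start + k + half] = (a - b) * twiddle / 2
        half //= 2
    return data


def idft_direct(freq):
    """Reference inverse DFT in linear order, straight from the definition."""
    n = len(freq)
    return [
        sum(freq[k] * cmath.exp(2j * cmath.pi * k * m / n) for k in range(n)) / n
        for m in range(n)
    ]


def bit_reverse(index, num_bits):
    """Reverse the lowest num_bits bits of index."""
    result = 0
    for _ in range(num_bits):
        result = (result << 1) | (index & 1)
        index >>= 1
    return result
```

Position p of the ifft_dif_scaled output equals element bit_reverse(p, log2(N)) of idft_direct, up to floating-point rounding. A fixed-point implementation would additionally round after each twiddle multiplication and each division.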

Burst Radix 2

The burst Radix 2 architecture implements the FFT by using a single complex butterfly multiplier. The algorithm cannot start until it has stored the entire input frame, and it cannot accept the next frame until computations are complete. The ready output signal indicates when the algorithm is ready for new data. The diagram shows the burst architecture, with pipeline registers.

Control Signals

The algorithm processes input data only when validIn is high. Output data is valid only when validOut is high.

When the optional reset input signal is high, the algorithm stops the current calculation and clears all internal state. The algorithm begins fresh calculations when reset is low and validIn starts a new frame.

Timing Diagram

This diagram shows validIn and validOut signals for contiguous scalar input data, streaming Radix 2^2 architecture, an FFT length of 1024, and a vector size of 16.

The diagram also shows the optional startOut and endOut signals that indicate frame boundaries. If you enable startOut, it pulses for one cycle with the first validOut of the frame. If you enable endOut, it pulses for one cycle with the last validOut of the frame.

If you apply continuous input frames, the output will also be continuous, after the initial latency.

The validIn signal can be noncontiguous. Data accompanied by a validIn signal is processed as it arrives, and the output is stored until a frame is filled. Then the algorithm returns contiguous output samples in a frame of N (FFT length) cycles. This diagram shows noncontiguous input and contiguous output for an FFT length of 512 and a vector size of 16.
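The control-signal behavior described above can be approximated with a small behavioral model. The Python sketch below only collects samples while validIn is high, discards a partial frame on reset, and marks the first and last samples of each completed frame. It ignores latency, the ready signal, and the transform itself, and every name in it is hypothetical.

```python
def collect_frames(cycles, fft_length):
    """Group valid input samples into frames of fft_length samples.

    cycles is an iterable of (sample, valid_in, reset) tuples, one per
    clock cycle. Samples are kept only while valid_in is true, so the
    input may be noncontiguous. A reset discards any partial frame.
    """
    frames = []
    current = []
    for sample, valid_in, reset in cycles:
        if reset:
            current = []  # clear internal state, start fresh afterwards
            continue
        if not valid_in:
            continue      # data on this cycle is ignored
        current.append(sample)
        if len(current) == fft_length:
            frames.append(current)
            current = []
    return frames


def output_flags(frame):
    """Return (sample, validOut, startOut, endOut) for each output cycle."""
    last = len(frame) - 1
    return [(s, True, i == 0, i == last) for i, s in enumerate(frame)]
```

With an FFT length of 4, input cycles carrying samples 1, 2, 3, 4 separated by idle cycles still produce one contiguous frame, and output_flags marks sample 1 with startOut and sample 4 with endOut.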

When you use the burst architecture, you cannot provide the next frame of input data until memory space is available. The ready signal indicates when the algorithm can accept new input data.

Latency

The latency varies with the FFT length and input vector size. After you update the model, the block icon displays the latency. The displayed latency is the number of cycles between the first valid input and the first valid output, assuming the input is contiguous.

When using the burst architecture with contiguous input, if your design waits for ready=0 before deasserting validIn, then one extra cycle of data arrives at the input. This data sample is the first sample of the next frame. The algorithm can save one sample while processing the current frame. Due to this one-sample advance, the observed latency of the later frames (validIn to validOut) is one cycle shorter than the reported latency. The number of cycles between ready low and validOut high is always latency - FFTLength.

HDL Code Generation

This block supports HDL code generation using HDL Coder™. HDL Coder provides additional configuration options that affect HDL implementation and synthesized logic. For more information on implementations, properties, and restrictions for HDL code generation, see IFFT HDL Optimized in the HDL Coder documentation.

Performance

These resource and performance data are the synthesis results from the generated HDL targeted to a Xilinx® Virtex®-6 (XC6VLX75T-1FF484) FPGA. The examples in the tables have this configuration:

  • 1024 FFT length (default)

  • Complex multiplication using 4 multipliers, 2 adders

  • Output scaling enabled

  • 16-bit complex input data

  • Clock enables minimized (HDL Coder parameter)

Performance of the synthesized HDL code varies with your target and synthesis options. For instance, natural-order output uses more RAM than bit-reversed output, and real input uses less RAM than complex input.

For a scalar input Radix 2^2 configuration, the design achieves 326 MHz clock frequency. The latency is 1116 cycles. The design uses these resources.

| Resource | Number Used |
|---|---|
| LUT | 4597 |
| FFS | 5353 |
| Xilinx LogiCORE® DSP48 | 12 |
| Block RAM (16K) | 6 |

When you vectorize the same Radix 2^2 implementation to process two 16-bit input samples in parallel, the design achieves 316 MHz clock frequency. The latency is 600 cycles. The design uses these resources.

| Resource | Number Used |
|---|---|
| LUT | 7653 |
| FFS | 9322 |
| Xilinx LogiCORE DSP48 | 24 |
| Block RAM (16K) | 8 |

The burst Radix 2 implementation is supported with scalar input data only. The burst design achieves 309 MHz clock frequency. The latency is 5811 cycles. The design uses these resources.

| Resource | Number Used |
|---|---|
| LUT | 971 |
| FFS | 1254 |
| Xilinx LogiCORE DSP48 | 3 |
| Block RAM (16K) | 6 |
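The reported clock frequencies and latencies can be converted into time and sample-rate figures. The Python sketch below does that arithmetic with hypothetical helper names. The throughput figure assumes that a streaming design accepts one input vector on every clock cycle, so it does not apply to the burst architecture.

```python
def latency_microseconds(latency_cycles, clock_mhz):
    """Latency in microseconds: cycles divided by cycles per microsecond."""
    return latency_cycles / clock_mhz


def streaming_throughput_msps(clock_mhz, vector_size=1):
    """Samples per second (in millions) for a streaming design.

    Assumes one vector of vector_size samples is accepted per clock cycle.
    """
    return clock_mhz * vector_size


# Figures quoted in this section for the 1024-point examples.
scalar_latency_us = latency_microseconds(1116, 326)   # about 3.42
vector_latency_us = latency_microseconds(600, 316)    # about 1.90
burst_latency_us = latency_microseconds(5811, 309)    # about 18.81
vector_rate_msps = streaming_throughput_msps(316, 2)  # 632
```

Under this assumption the two-sample vector configuration roughly doubles the sample rate of the scalar streaming design, at the cost of about twice the DSP48 blocks.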

Introduced in R2014a
