
Speed vs. Area Tradeoffs

Overview of Speed vs. Area Optimizations

The coder provides options that extend your control over speed vs. area tradeoffs in the realization of filter designs. To achieve the desired tradeoff, you can either specify a fully parallel architecture for generated HDL filter code, or choose one of several serial architectures. Supported architectures are described in Parallel and Serial Architectures.

The coder supports the full range of parallel and serial architecture options via properties passed in to the generatehdl command, as described in Specifying Speed vs. Area Tradeoffs via generatehdl Properties.

Alternatively, you can use the Architecture pop-up menu on the Generate HDL dialog box to choose parallel and serial architecture options, as described in Selecting Parallel and Serial Architectures in the Generate HDL Dialog Box.

The following table summarizes the filter types that are available for parallel and serial architecture choices.

Architecture | Available for Filter Types...
Fully parallel (default) | Filter types that are supported for HDL code generation
Fully serial
  • dfilt.dffir

  • dfilt.dfsymfir

  • dfilt.dfasymfir

  • dfilt.df1sos

  • dfilt.df2sos

  • mfilt.firdecim

  • mfilt.firinterp

Partly serial
  • dfilt.dffir

  • dfilt.dfsymfir

  • dfilt.dfasymfir

  • dfilt.df1sos

  • dfilt.df2sos

  • mfilt.firdecim

  • mfilt.firinterp

Cascade serial
  • dfilt.dffir

  • dfilt.dfsymfir

  • dfilt.dfasymfir

    Note:   The coder also supports distributed arithmetic (DA), another highly efficient architecture for realizing filters. See Distributed Arithmetic for FIR Filters for information about how to use this architecture.

Parallel and Serial Architectures

Fully Parallel Architecture

This option is the default selection. A fully parallel architecture uses a dedicated multiplier and adder for each filter tap; the taps execute in parallel. This type of architecture is optimal for speed. However, it requires more multipliers and adders than a serial architecture, and therefore consumes more chip area.

Serial Architectures

Serial architectures reuse hardware resources in time, saving chip area. The coder provides a range of serial architecture options. These architectures add one clock cycle of latency to the design (see Latency in Serial Architectures).

You can select from these serial architecture options:

  • Fully serial: A fully serial architecture conserves area by reusing multiplier and adder resources sequentially. For example, a four-tap filter design would use a single multiplier and adder, executing a multiply/accumulate operation once for each tap. The multiply/accumulate section of the design runs at four times the filter's input/output sample rate. This type of architecture saves area at the cost of some speed loss and higher power consumption.

    In a fully serial architecture, the system clock runs at a much higher rate than the sample rate of the filter. Thus, for a given filter design, the maximum speed achievable by a fully serial architecture will be less than that of a parallel architecture.

  • Partly serial: Partly serial architectures cover the full range of speed vs. area tradeoffs that lie between fully parallel and fully serial architectures.

    In a partly serial architecture, the filter taps are grouped into a number of serial partitions. The taps within each partition execute serially, but the partitions execute together in parallel. The outputs of the partitions are summed at the final output.

    When you select a partly serial architecture for a filter, you can define the serial partitioning in the following ways:

    • Define the serial partitions directly, as a vector of integers. Each element of the vector specifies the length of the corresponding partition.

    • Specify the desired hardware folding factor ff, an integer greater than 1. Given the folding factor, the coder computes the serial partition and the number of multipliers.

    • Specify the desired number of multipliers nmults, an integer greater than 1. Given the number of multipliers, the coder computes the serial partition and the folding factor.

    The Generate HDL dialog box lets you specify a partly serial architecture in terms of these three parameters. You can then view how a change in one parameter interacts with the other two. The coder also provides hdlfilterserialinfo, an informational function that helps you define an optimal serial partition for a filter.

  • Cascade-serial: A cascade-serial architecture closely resembles a partly serial architecture. As in a partly serial architecture, the filter taps are grouped into a number of serial partitions that execute together in parallel. However, the accumulated output of each partition cascades to the accumulator of the previous partition. The output of the partitions is therefore computed at the accumulator of the first partition. This technique is termed accumulator reuse. You do not require a final adder, which saves area.

    The cascade-serial architecture requires an extra cycle of the system clock to complete the final summation to the output. Therefore, the frequency of the system clock must be increased slightly with respect to the clock used in a noncascade partly serial architecture.

    To generate a cascade-serial architecture, you specify a partly serial architecture with accumulator reuse enabled. (See Specifying Speed vs. Area Tradeoffs via generatehdl Properties.) If you do not specify the serial partitions, the coder automatically selects an optimal partitioning.
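The relationship between a serial partition and its hardware cost can be sketched in a few lines of base MATLAB. This is an illustrative sketch, not coder output. It assumes that each partition uses one multiplier, that the folding factor equals the largest partition size, and that accumulator reuse adds one system clock cycle, as described above. The variable names (partition, reuseAccum, nmults, ff, clockMultiple) are chosen for illustration only.

```matlab
% Illustrative sketch (assumptions: one multiplier per partition, folding
% factor = largest partition size, accumulator reuse adds one clock cycle).
partition  = [4 3 2];    % serial partition for a 9-tap FIR filter
reuseAccum = true;       % true models a cascade-serial architecture

filterLength  = sum(partition);           % partitions must cover every tap
nmults        = numel(partition);         % one multiplier per partition
ff            = max(partition);           % cycles needed by the longest partition
clockMultiple = ff + double(reuseAccum);  % system clock relative to sample rate

fprintf('taps=%d multipliers=%d folding factor=%d clock=%dx\n', ...
    filterLength, nmults, ff, clockMultiple);
```

Setting partition = 9 and reuseAccum = false in the same sketch models a fully serial architecture for a 9-tap filter: one multiplier, with the multiply/accumulate section clocked at nine times the sample rate.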

Latency in Serial Architectures

Serialization of a filter increases the total latency of the design by one clock cycle. The serial architectures use an accumulator (an adder with a register) to sequentially add the products. An additional final register is used to store the summed result of each of the serial partitions. The operation requires an extra clock cycle.

Holding Input Data in a Valid State

Serial filters allow data to be delivered to the outputs N cycles (N >= 2) later than the inputs. Using the Hold input data between samples test bench option (or the HoldInputDataBetweenSamples CLI property), you can determine how long (in terms of clock cycles) input data values are held in a valid state, as follows:

  • When you select Hold input data between samples (the default), input data values are held in a valid state across N clock cycles.

  • When you clear Hold input data between samples, data values are held in a valid state for only one clock cycle. For the next N-1 cycles, data is in an unknown state (expressed as 'X') until the next input sample is clocked in.
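The two settings can be pictured as input stimulus sequences. The following base MATLAB sketch is a model for illustration only, not test bench code produced by the coder. It uses NaN as a stand-in for the unknown ('X') state, and the sample values and N = 3 are arbitrary assumptions.

```matlab
% Illustrative model of the input stimulus (NaN stands in for 'X').
samples = [0.5 -0.25 0.75];   % arbitrary input samples
N = 3;                        % assumed clock cycles per input sample

% Option selected: each value stays valid for all N clock cycles.
held = kron(samples, ones(1, N));

% Option cleared: valid for one cycle, then unknown for the next N-1 cycles.
notHeld = nan(1, N * numel(samples));
notHeld(1:N:end) = samples;

disp(held);
disp(notHeld);
```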

The following figure shows the Test Bench pane of the Generate HDL dialog box with Hold input data between samples set to its default setting.

See also HoldInputDataBetweenSamples

Specifying Speed vs. Area Tradeoffs via generatehdl Properties

By default, generatehdl generates filter code using a fully parallel architecture. If you want to generate filter code with a fully parallel architecture, you do not have to specify this architecture explicitly.

Two properties let you specify serial architecture options when generating code via generatehdl:

  • 'SerialPartition': This property specifies the serial partitioning of the filter.

  • 'ReuseAccum': This property enables or disables accumulator reuse.

The table below summarizes how to set these properties to generate the desired architecture. The table is followed by several examples.

To Generate This Architecture... | Set SerialPartition to... | Set ReuseAccum to...
Fully parallel | Omit this property | Omit this property
Fully serial | N, where N is the length of the filter | Not specified, or 'off'
Partly serial | [p1 p2 p3...pN]: a vector of integers having N elements, where N is the number of serial partitions. Each element of the vector specifies the length of the corresponding partition. The sum of the vector elements must be equal to the length of the filter. | 'off'

  When you define the partitioning for a partly serial architecture, consider the following:

  • The filter length should be divided as uniformly as you can into a vector of length equal to the number of multipliers intended. For example, if your design requires a filter of length 9 with 2 multipliers, the recommended partition is [5 4]. If your design requires 3 multipliers, the recommended partition is [3 3 3] rather than some less uniform division such as [1 4 4] or [3 4 2].

  • If your design is constrained by having to compute each output value (corresponding to each input value) in an exact number N of clock cycles, use N as the largest partition size and partition the other elements as uniformly as you can. For example, if the filter length is 9 and your design requires exactly 4 cycles to compute the output, define the partition as [4 3 2]. This partition executes in 4 clock cycles, at the cost of 3 multipliers.

  You can also specify a serial architecture in terms of a desired hardware folding factor, or in terms of the optimal number of multipliers. See hdlfilterserialinfo for detailed information.

Cascade-serial with explicitly specified partitioning | [p1 p2 p3...pN]: a vector of integers having N elements, where N is the number of serial partitions. Each element of the vector specifies the length of the corresponding partition. The sum of the vector elements must equal the length of the filter. The values of the vector elements must appear in descending order, except that the last two elements may be equal. For example, for a filter of length 9, partitions such as [5 4] or [4 3 2] would be legal, but the partitions [3 3 3] or [3 2 4] would raise an error at code generation time. | 'on'
Cascade-serial with automatically optimized partitioning | Omit this property | 'on'
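The partitioning guidelines in the table can be checked before calling generatehdl. The following base MATLAB sketch is one possible way to do this. The helper names (uniformPartition, isValidPartition, isValidCascade) are hypothetical and are not part of the coder. The cascade check encodes the rule implied by the table's examples: [5 4] and [4 3 2] are legal, while [3 3 3] and [3 2 4] are not.

```matlab
% Illustrative helpers (hypothetical names, not coder functions).
filterLength = 9;

% Divide L taps as uniformly as possible across m partitions.
uniformPartition = @(L, m) floor(L/m) * ones(1, m) + ...
    [ones(1, mod(L, m)), zeros(1, m - mod(L, m))];

% A partition must have positive entries that sum to the filter length.
isValidPartition = @(p, L) all(p > 0) && sum(p) == L;

% Cascade-serial: strictly descending, except the last two entries,
% which only need to be non-increasing.
isValidCascade = @(p, L) isValidPartition(p, L) && ...
    all(diff(p(1:end-1)) < 0) && (numel(p) < 2 || p(end) <= p(end-1));

disp(uniformPartition(filterLength, 2));      % partition for 2 multipliers
disp(uniformPartition(filterLength, 3));      % partition for 3 multipliers
disp(isValidCascade([4 3 2], filterLength));  % legal cascade partition
disp(isValidCascade([3 2 4], filterLength));  % not in descending order
```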

Specifying Parallel and Serial FIR Filter Architectures in generatehdl

The following examples show the use of the 'SerialPartition' and 'ReuseAccum' properties in generating code with the generatehdl function. The examples assume that a direct-form FIR filter has been created in the workspace as follows:

Hd = design(fdesign.lowpass('N,Fc',8,.4))
Hd.arithmetic = 'fixed'

This example generates a fully parallel architecture (by default).

generatehdl(Hd, 'Name','FullyParallel')
### Starting VHDL code generation process for filter: FullyParallel
### Generating: D:\Work\test\hdlsrc\FullyParallel.vhd
### Starting generation of FullyParallel VHDL entity
### Starting generation of FullyParallel VHDL architecture
### HDL latency is 2 samples
### Successful completion of VHDL code generation process for filter: FullyParallel

This example generates a fully serial architecture. Notice that the system clock rate is nine times the filter's sample rate. Also, the HDL latency reported is one sample greater than in the previous (parallel) example.

generatehdl(Hd,'SerialPartition',9, 'Name','FullySerial')
### Starting VHDL code generation process for filter: FullySerial
### Generating: D:\Work\test\hdlsrc\FullySerial.vhd
### Starting generation of FullySerial VHDL entity
### Starting generation of FullySerial VHDL architecture
### Clock rate is 9 times the input sample rate for this architecture.
### HDL latency is 3 samples
### Successful completion of VHDL code generation process for filter: FullySerial

This example generates a partly serial architecture with three equal partitions.

generatehdl(Hd,'SerialPartition',[3 3 3], 'Name', 'PartlySerial')
### Starting VHDL code generation process for filter: PartlySerial
### Generating: D:\Work\test\hdlsrc\PartlySerial.vhd
### Starting generation of PartlySerial VHDL entity
### Starting generation of PartlySerial VHDL architecture
### Clock rate is 3 times the input sample rate for this architecture.
### HDL latency is 3 samples
### Successful completion of VHDL code generation process for filter: PartlySerial

This example generates a cascade-serial architecture with three partitions. The partitions appear in descending order of size. Notice that the clock rate is higher than that in the previous (partly serial without accumulator reuse) example.

generatehdl(Hd,'SerialPartition',[4 3 2], 'ReuseAccum', 'on','Name','CascadeSerial')
### Starting VHDL code generation process for filter: CascadeSerial
### Generating: D:\Work\test\hdlsrc\CascadeSerial.vhd
### Starting generation of CascadeSerial VHDL entity
### Starting generation of CascadeSerial VHDL architecture
### Clock rate is 5 times the input sample rate for this architecture.
### HDL latency is 3 samples
### Successful completion of VHDL code generation process for filter: CascadeSerial

This example generates a cascade-serial architecture, with the partitioning automatically determined by the coder.

generatehdl(Hd,'ReuseAccum','on', 'Name','CascadeSerial')
### Starting VHDL code generation process for filter: CascadeSerial
### Generating: D:\Work\test\hdlsrc\CascadeSerial.vhd
### Starting generation of CascadeSerial VHDL entity
### Starting generation of CascadeSerial VHDL architecture
### Clock rate is 5 times the input sample rate for this architecture.
### Serial partition # 1 has 4 inputs.
### Serial partition # 2 has 3 inputs.
### Serial partition # 3 has 2 inputs.
### HDL latency is 3 samples
### Successful completion of VHDL code generation process for filter: CascadeSerial

Serial Partitions for Cascaded Filters

    Note:   Filter Design HDL Coder™ software supports this feature for the command-line interface (generatehdl) only.

To specify serial partitioning for one or more cascade stages, use the SerialPartition property. The following example defines different serial partitions for each filter in a two-stage cascade. The partition vectors are contained within a cell array.

Hd = design(fdesign.lowpass('N,Fc',8,.4))
Hd.arithmetic = 'fixed'
Hp = design(fdesign.highpass('N,Fc',8,.4))
Hp.arithmetic = 'fixed'
Hc = cascade(Hd,Hp)
generatehdl(Hc,'SerialPartition',{[5 4],[8 1]})

    Tip   Use the hdlfilterserialinfo function to display the effective filter length and partitioning options for each filter stage of a cascade.
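Before passing a cell array of partitions to generatehdl, it can help to confirm that there is one partition vector per stage and that each vector sums to the length of its stage. The following base MATLAB sketch assumes that each order-8 FIR stage in the example above has an effective length of 9 taps. The variable names are illustrative only.

```matlab
% Illustrative pre-check for per-stage partitions (assumed stage lengths).
stageLengths = [9 9];             % assumed: each order-8 FIR stage has 9 taps
partitions   = {[5 4], [8 1]};    % one partition vector per cascade stage

stageSums = cellfun(@sum, partitions);   % total taps covered in each stage

assert(numel(partitions) == numel(stageLengths), ...
    'One partition vector per cascade stage is expected.');
assert(isequal(stageSums, stageLengths), ...
    'Each partition must sum to the length of its stage.');
disp(stageSums);
```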

Serial Architectures for IIR SOS Filters

To specify a partly or fully serial architecture for an IIR SOS filter structure, specify either one of the following parameters:

  • 'FoldingFactor', ff: Specify the desired hardware folding factor ff, an integer greater than 1. Given the folding factor, the coder computes the number of multipliers.

  • 'NumMultipliers', nmults: Specify the desired number of multipliers nmults, an integer greater than 1. Given the number of multipliers, the coder computes the folding factor.

To obtain information about the folding factor options and the corresponding number of multipliers for a filter, call the hdlfilterserialinfo function. The following example creates a Direct Form I SOS (df1sos) filter and then calls hdlfilterserialinfo.

Fs = 48e3             % Sampling frequency 
Fc = 10.8e3           % Cut-off frequency 
N = 5                 % Filter Order 
f_lp = fdesign.lowpass('n,f3db',N,Fc,Fs) 
Hd = design(f_lp,'butter','FilterStructure','df1sos') 
Hd.arithmetic = 'fixed'
hdlfilterserialinfo(Hd)
  Table of folding factors with corresponding number of multipliers for the given filter.

   | Folding Factor | Multipliers |
   --------------------------------
   |        6       |      3      |
   |        9       |      2      |
   |       18       |      1      |
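In the table above, each folding factor multiplied by its number of multipliers gives 18. The following base MATLAB sketch reproduces the table rows under the assumption that 18 is the total number of multiplications per output sample for this particular filter. That interpretation is inferred from the table, not stated by the coder, and the variable names are illustrative.

```matlab
% Illustrative check of the table rows (the constant 18 is inferred
% from the table for this filter, not a documented coder constant).
foldingFactors = [6 9 18];   % folding factors listed in the table
multipliers    = [3 2 1];    % corresponding numbers of multipliers

products   = foldingFactors .* multipliers;   % multiplications per sample
totalMults = products(1);

% Multipliers needed if totalMults operations share each multiplier ff times.
estimated = ceil(totalMults ./ foldingFactors);

disp(products);
disp(estimated);
```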

The following example generates HDL code for the df1sos filter, specifying a folding factor of 18.

generatehdl(Hd, 'FoldingFactor',18)
### Starting VHDL code generation process for filter: Hd
### Generating: c:\work\hdlsrc\Hd.vhd
### Starting generation of Hd VHDL entity
### Starting generation of Hd VHDL architecture
### HDL latency is 2 samples
### Successful completion of VHDL code generation process for filter: Hd

Selecting Parallel and Serial Architectures in the Generate HDL Dialog Box

The Architecture pop-up menu, located on the Generate HDL dialog box, lets you select parallel and serial architecture options corresponding to those described in Parallel and Serial Architectures. The following topics describe the GUI options you must set for each Architecture choice.

Specifying a Fully Parallel Architecture

The default Architecture setting is Fully parallel, as shown in the following figure.

Specifying a Fully Serial Architecture

When you select the Fully serial Architecture option, the Generate HDL dialog box displays additional information about the filter's folding factor, number of multipliers, and serial partitioning. Because these parameters are determined by the length of the filter, they display in a read-only format, as shown in the following figure.

The Generate HDL dialog box also displays a View details link. When you click on this link, the coder displays an HTML report in a separate window. The report displays an exhaustive table of folding factor, multiplier, and serial partition settings for the current filter. You can use the table to help you choose optimal settings for your design.

Specifying Partitions for a Partly Serial Architecture

When you select the Partly serial Architecture option, the Generate HDL dialog box displays additional information and data entry fields related to serial partitioning. (See the following figure.)

The Generate HDL dialog box also displays a View details link. When you click this link, the coder displays an HTML report in a separate window. The report displays an exhaustive table of folding factor, multiplier, and serial partition settings for the current filter. You can use the table to help you choose optimal settings for your design.

The Specified by drop-down menu lets you decide how you define the partly serial architecture. Select one of the following options:

  • Folding factor: The drop-down menu to the right of Folding factor contains an exhaustive list of folding factors for the filter. When you select a value, the display of the current folding factor, multiplier, and serial partition settings updates.

  • Multipliers: The drop-down menu to the right of Multipliers contains an exhaustive list of value options for the number of multipliers for the filter. When you select a value, the display of the current folding factor, multiplier, and serial partition settings updates.

  • Serial partition: The drop-down menu to the right of Serial partition contains an exhaustive list of serial partition options for the filter. When you select a value, the display of the current folding factor, multiplier, and serial partition settings updates.

Specifying a Cascade Serial Architecture

When you select the Cascade serial Architecture option, the Generate HDL dialog box displays the Serial partition field, as shown in the following figure.

The Specified by menu lets you define the number and size of the serial partitions according to different criteria, as described in Specifying Speed vs. Area Tradeoffs via generatehdl Properties.

Specifying Serial Architectures for IIR SOS Filters

To specify a partly or fully serial architecture for an IIR SOS filter structure in the GUI, you set the following options:

  • Architecture: Select Fully parallel (the default), Fully serial, or Partly serial.

    If you select Partly serial, the GUI displays the Specified by drop-down menu.

  • Specified by: Select one of the following:

    • Folding factor: Specify the desired hardware folding factor ff, an integer greater than 1. Given the folding factor, the coder computes the number of multipliers.

    • Multipliers: Specify the desired number of multipliers nmults, an integer greater than 1. Given the number of multipliers, the coder computes the folding factor.

Example: Direct Form I SOS (df1sos) Filter.  The following example creates a Direct Form I SOS (df1sos) filter design and opens the GUI. The figure following the code example shows the coder options configured for a partly serial architecture specified with a Folding factor of 18.

Fs = 48e3             % Sampling frequency 
Fc = 10.8e3           % Cut-off frequency 
N = 5                 % Filter Order 
f_lp = fdesign.lowpass('n,f3db',N,Fc,Fs) 
Hd = design(f_lp,'butter','FilterStructure','df1sos') 
Hd.arithmetic = 'fixed' 
fdhdltool(Hd)

Example: Direct Form II SOS (df2sos) Filter.  The following example creates a Direct Form II SOS (df2sos) filter design using filterbuilder.

The filter is a lowpass df2sos filter with a filter order of 6. The filter arithmetic is set to Fixed-point.

On the Code Generation tab, the Generate HDL button activates the Filter Design HDL Coder GUI. The following figure shows the HDL coder options configured for this filter, using a partly serial architecture with a Folding factor of 9.

Specifying a Distributed Arithmetic Architecture

The Architecture pop-up menu also includes the Distributed arithmetic (DA) option. See Distributed Arithmetic for FIR Filters for information about this architecture.

Interactions Between Architecture Options and Other HDL Options

Selecting some Architecture menu options may change or disable other options.

  • When the Fully serial option is selected, the following options are set to their default values and disabled:

    • Coefficient multipliers

    • Add pipeline registers

    • FIR adder style

  • When the Partly serial option is selected:

    • The Coefficient multipliers option is set to its default value and disabled.

    • If the filter is multirate, the Clock inputs option is set to Single and disabled.

  • When the Cascade serial option is selected, the following options are set to their default values and disabled:

    • Coefficient multipliers

    • Add pipeline registers

    • FIR adder style
