Optimize speed or area of generated HDL code

Speed Optimization

`AddPipelineRegisters` — Optimize clock rate with pipeline registers
`'off'` (default) | `'on'`

Optimize clock rate with pipeline registers, specified as 'off' or 'on'. You cannot use this property with fully serial or cascade serial filters. When you set this property to 'on', the coder adds pipeline registers between filter computation stages. Although the registers add to the overall filter latency, they provide significant improvements to the clock rate.

Filter Type	Location of Added Pipeline Register
FIR transposed	Between coefficient multipliers and adders
Direct form FIR, antisymmetric FIR, and symmetric FIR	Between levels of a tree-based final adder. For an alternative tree-based summation technique, see the `FIRAdderStyle` property.
IIR	Between sections
CIC	Between comb sections

For more details, see Optimizing the Clock Rate with Pipeline Registers.
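As a minimal sketch of how this property might be set, the call below requests pipeline registers for a transposed FIR filter. The coefficients, the input data type, and the entity name `fir_pipelined` are placeholders chosen for illustration, not values from this page.

```matlab
% Transposed FIR filter with placeholder coefficients (assumed values).
filt = dsp.FIRFilter('Numerator', [0.1 0.2 0.4 0.2 0.1], ...
    'Structure', 'Direct form transposed');

% generatehdl needs the input data type for a System object; a signed
% 16-bit type with 15 fractional bits is assumed here.
generatehdl(filt, 'InputDataType', numerictype(1,16,15), ...
    'Name', 'fir_pipelined', ...            % hypothetical entity name
    'AddPipelineRegisters', 'on');          % registers between multipliers and adders
```

Running this requires DSP System Toolbox, Fixed-Point Designer, and Filter Design HDL Coder.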

`FIRAdderStyle` — Optimize clock rate with summation technique
`'linear'` (default) | `'tree'` | `'pipelined'`

Optimize clock rate with summation technique, specified as 'linear', 'tree', or 'pipelined'. This property applies only to direct form FIR, antisymmetric FIR, and symmetric FIR filters. You cannot use this property with fully serial or cascade serial filters. When you set this property to 'tree', the coder creates a final adder that performs pairwise addition on successive products that execute in parallel, rather than sequentially. When you set this property to 'pipelined', the coder creates a tree-based final adder with pipeline registers between the levels of the tree.

For more details, see Optimizing Final Summation for FIR Filters.

Dependencies

This property applies only when the AddPipelineRegisters property is set to 'off'.
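To see roughly why a tree-based final adder can shorten the critical path, the sketch below counts the adders a signal passes through in series for each style. This is back-of-envelope arithmetic rather than output from the coder, and `numProducts` is an assumed example value.

```matlab
% Number of products feeding the final adder (assumed example value).
numProducts = 16;

% Linear style: products are summed one after another, so the longest
% path passes through numProducts - 1 adders.
linearDepth = numProducts - 1;

% Tree style: products are summed pairwise, halving the operand count
% at every level of the tree.
treeDepth = ceil(log2(numProducts));

fprintf('linear: %d adders in series, tree: %d levels\n', ...
    linearDepth, treeDepth);
```

With the `'pipelined'` style, the registers sit between those tree levels, so each clock period only has to cover a single adder level.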

`AddInputRegister` — Extra input register
`'on'` (default) | `'off'`

Extra input register, specified as 'on' or 'off'. When this property is set to 'on', the coder generates a signal named input_register and includes a process statement that controls the register. If the added latency is a concern, or if the filter is incorporated into code that already has an input register, set this property to 'off'. For more details, see Specifying or Suppressing Registered Input and Output.

`AddOutputRegister` — Extra output register
`'on'` (default) | `'off'`

Extra output register, specified as 'on' or 'off'. When this property is set to 'on', the coder generates a signal named output_register and includes a process statement that controls the register. If the added latency is a concern, or if the filter is incorporated into code that already has an output register, set this property to 'off'. For more details, see Specifying or Suppressing Registered Input and Output.

`MultiplierInputPipeline` — Number of pipeline stages on multiplier inputs
`0` (default) | nonnegative integer

Number of pipeline stages on multiplier inputs, specified as a nonnegative integer. This property applies only to FIR filters. Multiplier pipelining can significantly increase clock rates. For more details, see Multiplier Input and Output Pipelining for FIR Filters.

Dependencies

To enable this property, set CoeffMultipliers to 'multiplier'.

`MultiplierOutputPipeline` — Number of pipeline stages on multiplier outputs
`0` (default) | nonnegative integer

Number of pipeline stages on multiplier outputs, specified as a nonnegative integer. This property applies only to FIR filters. Multiplier pipelining can significantly increase clock rates. For more details, see Multiplier Input and Output Pipelining for FIR Filters.

Dependencies

To enable this property, set CoeffMultipliers to 'multiplier'.
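A short sketch of both multiplier pipelining properties used together follows. The filter coefficients, the input data type, and the stage counts are assumptions picked for illustration.

```matlab
% Direct form FIR filter with placeholder coefficients (assumed values).
filt = dsp.FIRFilter('Numerator', [0.1 0.2 0.4 0.2 0.1]);

generatehdl(filt, 'InputDataType', numerictype(1,16,15), ...
    'CoeffMultipliers', 'multiplier', ...    % default; shown to make the dependency explicit
    'MultiplierInputPipeline', 1, ...        % one register stage before each multiplier
    'MultiplierOutputPipeline', 2);          % two register stages after each multiplier
```

Each added stage also adds to the filter latency, so the stage counts are a trade-off rather than a setting to maximize.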

Area Optimization

`OptimizeForHDL` — HDL code optimization
`'off'` (default) | `'on'`

HDL code optimization, specified as 'off' or 'on'. By default, the coder generates a literal implementation of the filter whose numeric behavior matches the filter object exactly. This implementation is not necessarily an optimal HDL implementation. When this property is set to 'on', the coder reduces the area of the hardware implementation and optimizes data types and quantization effects. For more details about the underlying tradeoffs, see Optimize for HDL.

`CoeffMultipliers` — Implementation of coefficient multiplications
`'multiplier'` (default) | `'csd'` | `'factored-csd'`

Implementation of coefficient multiplications, specified as 'multiplier', 'csd', or 'factored-csd'. You cannot use this property with multirate or serial filters.

'multiplier' — The coder retains multiplier logic in the generated HDL code.
'csd' — The coder implements multiplication using canonical signed digit (CSD) logic. The CSD technique replaces multipliers with shift and add logic. This technique also minimizes the number of adders used for constant multiplication by representing binary numbers with a minimum count of nonzero digits. This optimization decreases the area used by the filter while maintaining or increasing clock speed.
'factored-csd' — The coder implements multiplication using factored CSD logic. Factored CSD replaces multiplier operations with shift and add operations on prime factors of the coefficients. This option achieves a greater area reduction than CSD, at the cost of decreasing clock speed.

For more details, see CSD Optimizations for Coefficient Multipliers.
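To make the "minimum count of nonzero digits" idea concrete, the sketch below recodes one integer coefficient into signed digits from {-1, 0, 1} using the standard non-adjacent-form recoding. It illustrates the representation only; it is not necessarily how the coder derives its shift-and-add logic, and the coefficient 23 is an assumed example.

```matlab
coeff = 23;            % example integer coefficient (assumed value)

n = coeff;
digits = [];           % signed digits, least significant first
while n ~= 0
    if mod(n, 2) == 1
        d = 2 - mod(n, 4);   % +1 when n mod 4 is 1, -1 when it is 3
        n = n - d;           % leaves n divisible by 4
    else
        d = 0;
    end
    digits(end+1) = d;
    n = n / 2;
end

binaryOnes = nnz(dec2bin(coeff) == '1');   % nonzero digits in plain binary
csdNonzero = nnz(digits);                  % nonzero digits after recoding
```

Here 23 is `10111` in binary (four nonzero digits) but recodes to 32 - 8 - 1 (three nonzero digits), so multiplying by 23 takes two add or subtract operations on shifted inputs instead of three.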

`SerialPartition` — Partitions for serial filter architectures
`-1` (default) | effective filter length | `[p1 p2 ... pN]` | cell array of serial partitions

Partitions for serial filter architectures, specified as one of the following:

-1 — The coder generates a fully parallel architecture. This architecture is equivalent to a serial partition specified as a vector of ones whose length equals the effective filter length.
Effective filter length — The coder generates a fully serial architecture.
[p1 p2 ... pN] — The coder generates a partly serial architecture with N partitions. The integers in the vector specify the length of each partition. The sum of the vector elements must be equal to the effective filter length. To reduce the area further, you can generate a cascade-serial architecture by enabling the ReuseAccum property. For some examples, see Generate Serial Partitions for FIR Filter.
Cell array of serial partitions — The coder generates partitions for each filter stage in a cascaded filter. Specify the partitions for each filter stage as -1, the effective filter length, or a vector of integers. The elements of each vector must sum to the effective filter length of the associated filter in the cascade. For an example, see Generate Serial Partitions of Cascaded Filter.
When the serial partition of a filter stage is set to -1, you can specify a LUT partition for that stage by using the DALUTPartition and DARadix properties. For more details, see Architecture Options for Cascaded Filters.

You cannot use this property with IIR SOS filters. To generate serial architectures for IIR SOS filters, use the FoldingFactor or NumMultipliers properties instead.

Use this table as a guide for calculating the effective filter length. Alternatively, you can use the hdlfilterserialinfo function to display the effective filter length and possible partitions for a filter.

Filter Type	Effective Filter Length Calculation
Direct form	`FL = length(find(filt.Numerator~= 0))`
Direct form symmetric	`FL = ceil(length(find(filt.Numerator~= 0))/2)`
Direct form antisymmetric	`FL = ceil(length(find(filt.Numerator~= 0))/2)`

For more details, see Specifying Speed vs. Area Tradeoffs via generatehdl Properties.

For an overview of parallel and serial architectures and a list of filter types supported for each architecture, see Speed vs. Area Tradeoffs.
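The effective filter length formulas and the sum rule for partitions can be checked with plain arithmetic before calling the coder. In the sketch below, the coefficient vector and the candidate partition are assumed example values.

```matlab
% Example coefficients (assumed for illustration): 8 taps, two of them zero.
num = [0.5 0 0.25 0.125 0.125 0.25 0 0.5];

% Effective filter length, following the table in this section.
FL_direct    = length(find(num ~= 0));           % direct form
FL_symmetric = ceil(length(find(num ~= 0))/2);   % symmetric or antisymmetric

% Candidate partly serial partition for the direct form case.
partition = [3 2 1];
isValid = all(partition > 0) && sum(partition) == FL_direct;

% The two extremes for the direct form case.
fullyParallel = ones(1, FL_direct);   % same architecture as SerialPartition = -1
fullySerial   = FL_direct;            % a single partition holding every tap
```

A partition that passes this check can then be passed to `generatehdl` as the `SerialPartition` value. The `hdlfilterserialinfo` function remains the authoritative way to list the partitions the coder accepts for a given filter.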

`ReuseAccum` — Accumulator reuse for cascade-serial architecture
`'off'` (default) | `'on'`

Accumulator reuse for cascade-serial architecture, specified as 'off' or 'on'. When this property is set to 'on', the coder groups filter taps into several serial partitions. The accumulated output of each partition is cascaded to the accumulator of the previous partition. The output of the partitions is therefore computed at the accumulator of the first partition. This technique, called accumulator reuse, saves chip area. If the property SerialPartition is not defined, the coder generates an optimal partition. For more details, see Specifying Speed vs. Area Tradeoffs via generatehdl Properties.

For an overview of parallel and serial architectures and a list of filter types supported for each architecture, see Speed vs. Area Tradeoffs.

`DALUTPartition` — Lookup table partitions for distributed arithmetic
`-1` (default) | effective filter length | `[p1 p2 ... pN]` | `{p1 p2 ... pN; q1 q2 ... qN; ... }` | cell array of DALUT partitions

Lookup table (LUT) partitions for distributed arithmetic (DA), specified as one of the following:

-1 — The coder generates a fully parallel architecture.
Effective filter length — The coder generates a DA implementation without LUT partitioning.
[p1 p2 ... pN] — The coder generates a DA implementation with N LUT partitions. The integers in the vector specify the size of each partition. The maximum size for an individual partition is 12. The sum of the vector elements must be equal to the effective filter length. For multirate filters, each polyphase subfilter uses the same LUT partitions. For an example, see Distributed Arithmetic for Single Rate Filters.
{p1 p2 ... pN; q1 q2 ... qN; ... } — The coder generates a DA implementation with N unique LUT partitions for each polyphase subfilter of a multirate filter. Each row of the matrix specifies the partitions for one subfilter. The elements in each row must sum to the associated subfilter length, FLi. For an example, see Distributed Arithmetic for Multirate Filters.
Cell array of DALUT partitions — The coder generates DA implementation with different LUT partitions for each filter stage of the cascade. Specify the LUT partitions for each filter stage as -1, the effective filter length, or a vector of integers. The elements of each vector must sum to the effective filter length of the associated filter in the cascade. For an example, see Distributed Arithmetic for Cascaded Filters.
When the LUT partition of a filter stage is set to -1, you can specify a serial partition for that stage by using the SerialPartition property. For more details, see Architecture Options for Cascaded Filters.

Use this table as a guide for calculating the effective filter length. Alternatively, you can use the hdlfilterdainfo function to display the effective filter length, LUT partitioning options, and possible DARadix values for the filter.

Filter Type	Effective Filter Length Calculation
Direct form	`FL = length(find(filt.Numerator~= 0))`
Direct form symmetric	`FL = ceil(length(find(filt.Numerator~= 0))/2)`
Direct form antisymmetric	`FL = ceil(length(find(filt.Numerator~= 0))/2)`
Multirate with uniform LUT partitions for each polyphase subfilter	`FL = size(polyphase(filt),2)`
Multirate with unique LUT partitions for each polyphase subfilter	`p = polyphase(filt); FLi = length(find(p(i,:)))`, where `i` is the index of the `i`th row of the polyphase matrix of the filter. The `i`th row of the matrix `p` represents the `i`th subfilter.

For more details, see Distributed Arithmetic for FIR Filters.
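The two rules for a LUT partition vector (elements sum to the effective filter length, no element larger than 12) are easy to check up front. The sketch below does so for a few candidates and also estimates LUT size using the textbook rule that a LUT addressed by p taps holds 2^p words. That size estimate is an assumption for illustration; the LUTs the coder actually generates may differ. The filter length and candidates are example values.

```matlab
FL = 20;                                    % assumed effective filter length
candidates = {[10 10], [12 8], [16 4], [8 6 4]};

ok       = false(1, numel(candidates));
lutWords = zeros(1, numel(candidates));
for k = 1:numel(candidates)
    p = candidates{k};
    % Valid when the sizes sum to FL and no partition exceeds 12.
    ok(k) = sum(p) == FL && max(p) <= 12;
    % Textbook estimate: one LUT of 2^p words per partition.
    lutWords(k) = sum(2.^p);
end

unpartitionedWords = 2^FL;                  % single LUT covering all taps
```

Only `[10 10]` and `[12 8]` pass both checks. Under the textbook estimate, `[10 10]` needs 2048 words where a single unpartitioned LUT would need 2^20, which is why partitioning is attractive for area.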

`DARadix` — Number of bits processed simultaneously in distributed arithmetic
`2` (default) | `2^N` | `{2^N,2^M,...}`

Number of bits processed simultaneously in distributed arithmetic (DA), specified as 2, 2^N, or {2^N,2^M,...} where:

N > 0
mod(W,N) = 0, where W is the input word size of the filter
2^N <= 2^W

This property specifies the degree of parallelism in the DA architecture. Higher parallelism can improve clock speed at the expense of area.

2^1 — The coder implements a fully serial DA architecture that processes 1 bit at a time.
2^N — The coder generates a partly serial DA architecture when 1 < N < W.
2^W — The coder generates a fully parallel DA architecture.
{2^N,2^M,...} — The coder generates a DA implementation with different DARadix values for each filter stage in a cascaded filter. For an example, see Distributed Arithmetic for Cascaded Filters.
When the DARadix value of a filter stage is set to 2, you can specify a serial architecture for that stage by using the SerialPartition property. For more details, see Architecture Options for Cascaded Filters.

For more details, see Distributed Arithmetic for FIR Filters.
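The constraints on N can be enumerated directly for a given input word size. The sketch below lists the candidate radix values for an assumed 16-bit input and, on the assumption that N bits are consumed per clock cycle, the number of cycles needed per input sample.

```matlab
W = 16;                              % assumed input word size in bits

N = find(mod(W, 1:W) == 0);          % every N > 0 with mod(W,N) == 0
radix = 2.^N;                        % candidate DARadix values
cyclesPerSample = W ./ N;            % bits per sample / bits per cycle (assumption)

disp(table(N.', radix.', cyclesPerSample.', ...
    'VariableNames', {'N', 'DARadix', 'CyclesPerSample'}));
```

For W = 16 this gives N = 1, 2, 4, 8, 16, from the fully serial radix 2 up to the fully parallel radix 2^16. The `hdlfilterdainfo` function reports the values the coder actually accepts for a specific filter.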

`FoldingFactor` — Folding factor for IIR filter
`1` (default) | positive integer

Folding factor for IIR filter, specified as a positive integer. Use this property to define a serial architecture for direct form I or direct form II SOS filters. To reduce area in a serial architecture implementation, you can share multipliers at the cost of latency. The folding factor specifies the factor by which the clock rate increases in response to area optimization.

You can specify either the FoldingFactor property or the NumMultipliers property, but not both. If you do not specify either property, the coder generates a fully parallel architecture.

For an example, see Generate Serial Architectures for IIR Filter. To obtain information about the FoldingFactor options and the corresponding NumMultipliers, call the hdlfilterserialinfo function.
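Because the folding factor is the factor by which the clock rate increases, the clock the folded design must run at follows from one multiplication. The sample rate and folding factor below are assumed values; the folding factors valid for a particular filter are the ones `hdlfilterserialinfo` reports.

```matlab
sampleRate    = 48e3;   % input sample rate in Hz (assumed)
foldingFactor = 6;      % hypothetical FoldingFactor setting

% Clock rate the serial implementation needs to keep up with the input.
clockRate = foldingFactor * sampleRate;

fprintf('Required clock: %g Hz\n', clockRate);
```

If the result exceeds what the target device can reach, a smaller folding factor (more multipliers, more area) is the usual way out.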

`NumMultipliers` — Number of shared multipliers for IIR filter
positive integer

Number of shared multipliers for IIR filter, specified as a positive integer. Use this property to define a serial architecture for direct form I or direct form II SOS filters. Shared multipliers reduce area at the cost of an increased clock rate.

You can specify either the NumMultipliers property or the FoldingFactor property, but not both. If you do not specify either property, the coder generates a fully parallel architecture.

For an example, see Generate Serial Architectures for IIR Filter. To obtain information about the NumMultipliers options and the corresponding FoldingFactor, call the hdlfilterserialinfo function.

Tips

If you use the fdhdltool function to generate HDL code, you can set the corresponding properties in the Generate HDL dialog box.

Property	Location in Dialog Box
Add input register	Global Settings tab > Ports tab
Add output register	Global Settings tab > Ports tab
Additional optimization properties	Filter Architecture tab. See also Select Architectures in the Generate HDL Tool and Distributed Arithmetic Options in the Generate HDL Tool.
