Documentation

Resource Sharing For Area Optimization

This example shows how to use the subsystem level sharing optimization in HDL Coder.

Introduction

Sharing is a subsystem-level optimization supported by HDL Coder for implementing area-efficient hardware.

By default, the coder implements hardware that is a 1-to-1 mapping of Simulink blocks to hardware module implementations. The resource sharing optimization enables users to share hardware resources by enabling an N-to-1 mapping of 'N' functionally-equivalent Simulink blocks to a single hardware module. The user specifies 'N' using the 'SharingFactor' implementation parameter.

Since a time-multiplexed architecture incurs a longer latency to complete the operation, HDL Coder™ automatically manages the timing discrepancy depending on what resources are being shared. Suppose that the shared resources are operating at the base sample rate, then resource sharing is implemented as a local multi-rate architecture, which is described in this example. If the shared resources are operating at a slower sample rate than the base sample rate, then HDL Coder™ invokes clock rate pipelining to synthesizes an implementation that utilizes the latency budget defined in the rate differential. In this case, the resource shared architecture is a single rate implementation and takes multiple time steps to complete all the shared operations. The single rate scheduler example describes the details of this implementation.

The rest of this example illustrates the local multi-rate architecture of resource sharing. Consider the following symmetric FIR filter model. It contains 4 product blocks that are functionally equivalent and which map to 4 multipliers in hardware. The Resource Utilization Report lists the hardware resources used.

bdclose all;
load_system('sfir_fixed');
open_system('sfir_fixed/symmetric_fir');
hdlset_param('sfir_fixed', 'ResourceReport', 'on');
makehdl('sfir_fixed/symmetric_fir');
### Generating HDL for 'sfir_fixed/symmetric_fir'.
### Using the config set for model <a href="matlab:configset.showParameterGroup('sfir_fixed', { 'HDL Code Generation' } )">sfir_fixed</a> for HDL code generation parameters.
### Starting HDL check.
### Begin VHDL Code Generation for 'sfir_fixed'.
### Working on sfir_fixed/symmetric_fir as hdlsrc/sfir_fixed/symmetric_fir.vhd.
### Generating HTML files for code generation report at <a href="matlab:web('/tmp/BR2017bd_694400_188374/publish_examples1/tpaab8996d/hdlsrc/sfir_fixed/html/sfir_fixed_codegen_rpt.html');">sfir_fixed_codegen_rpt.html</a>
### Creating HDL Code Generation Check Report file:///tmp/BR2017bd_694400_188374/publish_examples1/tpaab8996d/hdlsrc/sfir_fixed/symmetric_fir_report.html
### HDL check for 'sfir_fixed' complete with 0 errors, 0 warnings, and 0 messages.
### HDL code generation complete.

Sharing to Realize an N-to-1 Mapping

To reduce area resources, you can invoke the sharing optimization by setting the 'SharingFactor' parameter on the subsystem to a positive integer value. This parameter specifies 'N' in the N-to-1 hardware mapping. In this example, there are 4 product blocks, so generating HDL with 'SharingFactor' set to 4 generates HDL with 1 multiplier.

The code generation model reflects the sharing architecture. The inputs to the shared blocks are time-multiplexed over the shared resource at a faster rate (in this case 4x faster, shown in red). The outputs are then routed to the respective consumers at a slower rate (shown in green).

hdlset_param('sfir_fixed/symmetric_fir', 'SharingFactor', 4);
hdlset_param('sfir_fixed', 'GenerateValidationModel', 'on');
makehdl('sfir_fixed/symmetric_fir');
open_system('gm_sfir_fixed/symmetric_fir');
set_param('gm_sfir_fixed', 'SimulationCommand', 'update');
### Generating HDL for 'sfir_fixed/symmetric_fir'.
### Using the config set for model <a href="matlab:configset.showParameterGroup('sfir_fixed', { 'HDL Code Generation' } )">sfir_fixed</a> for HDL code generation parameters.
### Starting HDL check.
### The DUT requires an initial pipeline setup latency. Each output port experiences these additional delays.
### Output port 0: 1 cycles.
### Output port 1: 1 cycles.
### Generating new validation model: <a href="matlab:open_system('gm_sfir_fixed_vnl')">gm_sfir_fixed_vnl</a>.
### Validation model generation complete.
### Begin VHDL Code Generation for 'sfir_fixed'.
### MESSAGE: The design requires 4 times faster clock with respect to the base rate = 1.
### Working on symmetric_fir_tc as hdlsrc/sfir_fixed/symmetric_fir_tc.vhd.
### Working on sfir_fixed/symmetric_fir as hdlsrc/sfir_fixed/symmetric_fir.vhd.
### Generating package file hdlsrc/sfir_fixed/symmetric_fir_pkg.vhd.
### Generating HTML files for code generation report at <a href="matlab:web('/tmp/BR2017bd_694400_188374/publish_examples1/tpaab8996d/hdlsrc/sfir_fixed/html/sfir_fixed_codegen_rpt.html');">sfir_fixed_codegen_rpt.html</a>
### Creating HDL Code Generation Check Report file:///tmp/BR2017bd_694400_188374/publish_examples1/tpaab8996d/hdlsrc/sfir_fixed/symmetric_fir_report.html
### HDL check for 'sfir_fixed' complete with 0 errors, 0 warnings, and 1 messages.
### HDL code generation complete.

The sharing optimization is implemented using time-division multiplexing. Simulink requires the outputs of the shared resource to be sampled at the predefined sample rate, so HDL Coder overclocks the shared resource at a faster rate than the data rate. In the above example, the shared architecture, which includes the shared resource, multiplexer-serializer at the inputs and demultiplexer-deserializer at the outputs, operates at 4 times the rate of the input data, because 'Sharingfactor' = 4.

Delay Balancing and Functional Equivalence

The rate transitions that implement time-multiplexing in the resource sharing architecture introduce additional latency. To maintain functional equivalence, delay balancing automatically inserts matching delays in parallel merging paths. See the Delay Balancing and Validation Model Workflow In HDL Coder™ example for more details. The generated validation model allows the user to verify functional equivalence by comparing the operation of the shared hardware architecture with the original model.

sim('gm_sfir_fixed_vnl');
open_system('gm_sfir_fixed_vnl/Compare/Assert_y_out/compare: y_out')
open_system('gm_sfir_fixed_vnl/Compare/Assert_delayed_xout/compare: delayed_xout')

Control Multiplicative Oversampling through SharingFactor

The net oversampling for the whole design is equivalent to the LCM of all 'SharingFactor' values set on the model. Consider the example, hdlcoder_uniform_oversampling.slx. It has two subsystems: subsystem 'Share3' has 3 gain blocks that can be shared and 'Share4' has 4 gain blocks that can be shared.

saved_warning_state = warning('off', 'hdlcoder:makehdl:DeprecateMaxOverSampling');
warning('off', 'hdlcoder:makehdl:DeprecateMaxComputationLatency');
bdclose('all');
load_system('hdlcoder_uniform_oversampling');
open_system('hdlcoder_uniform_oversampling/Subsystem');
set_param('hdlcoder_uniform_oversampling', 'SimulationCommand', 'update');
hdlsaveparams('hdlcoder_uniform_oversampling/Subsystem');
%% Set Model 'hdlcoder_uniform_oversampling' HDL parameters
hdlset_param('hdlcoder_uniform_oversampling', 'GenerateValidationModel', 'on');
hdlset_param('hdlcoder_uniform_oversampling', 'HDLSubsystem', 'hdlcoder_uniform_oversampling');

% Set SubSystem HDL parameters
hdlset_param('hdlcoder_uniform_oversampling/Subsystem/Share3', 'SharingFactor', 3);

% Set SubSystem HDL parameters
hdlset_param('hdlcoder_uniform_oversampling/Subsystem/Share4', 'SharingFactor', 4);

Notice that 'Share3' sets its 'SharingFactor' to 3 and 'Share4' sets its 'SharingFactor' to 4. HDL Coder™ applies local resource sharing to each subsystem and as a result, the HDL implementation requires LCM(3, 4) = 12x oversampling. This is reported in the message during HDL code generation.

makehdl('hdlcoder_uniform_oversampling/Subsystem');
### Generating HDL for 'hdlcoder_uniform_oversampling/Subsystem'.
### Using the config set for model <a href="matlab:configset.showParameterGroup('hdlcoder_uniform_oversampling', { 'HDL Code Generation' } )">hdlcoder_uniform_oversampling</a> for HDL code generation parameters.
### Starting HDL check.
### The DUT requires an initial pipeline setup latency. Each output port experiences these additional delays.
### Output port 0: 1 cycles.
### Output port 1: 1 cycles.
### Generating new validation model: <a href="matlab:open_system('gm_hdlcoder_uniform_oversampling_vnl')">gm_hdlcoder_uniform_oversampling_vnl</a>.
### Validation model generation complete.
### Begin VHDL Code Generation for 'hdlcoder_uniform_oversampling'.
### MESSAGE: The design requires 12 times faster clock with respect to the base rate = 0.1.
### Working on hdlcoder_uniform_oversampling/Subsystem/Share3 as hdlsrc/hdlcoder_uniform_oversampling/Share3.vhd.
### Working on hdlcoder_uniform_oversampling/Subsystem/Share4 as hdlsrc/hdlcoder_uniform_oversampling/Share4.vhd.
### Working on Subsystem_tc as hdlsrc/hdlcoder_uniform_oversampling/Subsystem_tc.vhd.
### Working on hdlcoder_uniform_oversampling/Subsystem as hdlsrc/hdlcoder_uniform_oversampling/Subsystem.vhd.
### Generating package file hdlsrc/hdlcoder_uniform_oversampling/Subsystem_pkg.vhd.
### Creating HDL Code Generation Check Report file:///tmp/BR2017bd_694400_188374/publish_examples1/tpaab8996d/hdlsrc/hdlcoder_uniform_oversampling/Subsystem_report.html
### HDL check for 'hdlcoder_uniform_oversampling' complete with 0 errors, 0 warnings, and 1 messages.
### HDL code generation complete.

One way to circumvent this multiplicative effect of oversampling is to set the 'SharingFactor' of all subsystems to the available oversampling budget. In the above example, if the oversampling budget is only 4x, then set 'SharingFactor' = 4 for both 'Share3' and 'Share4'. In this case, HDL Coder can share fewer resources than the SharingFactor and stay idle for the remaining cycles.

hdlset_param('hdlcoder_uniform_oversampling/Subsystem/Share3', 'SharingFactor', 4);
makehdl('hdlcoder_uniform_oversampling/Subsystem');
### Generating HDL for 'hdlcoder_uniform_oversampling/Subsystem'.
### Using the config set for model <a href="matlab:configset.showParameterGroup('hdlcoder_uniform_oversampling', { 'HDL Code Generation' } )">hdlcoder_uniform_oversampling</a> for HDL code generation parameters.
### Starting HDL check.
### The DUT requires an initial pipeline setup latency. Each output port experiences these additional delays.
### Output port 0: 1 cycles.
### Output port 1: 1 cycles.
### Generating new validation model: <a href="matlab:open_system('gm_hdlcoder_uniform_oversampling_vnl')">gm_hdlcoder_uniform_oversampling_vnl</a>.
### Validation model generation complete.
### Begin VHDL Code Generation for 'hdlcoder_uniform_oversampling'.
### MESSAGE: The design requires 4 times faster clock with respect to the base rate = 0.1.
### Working on hdlcoder_uniform_oversampling/Subsystem/Share3 as hdlsrc/hdlcoder_uniform_oversampling/Share3.vhd.
### Working on hdlcoder_uniform_oversampling/Subsystem/Share4 as hdlsrc/hdlcoder_uniform_oversampling/Share4.vhd.
### Working on Subsystem_tc as hdlsrc/hdlcoder_uniform_oversampling/Subsystem_tc.vhd.
### Working on hdlcoder_uniform_oversampling/Subsystem as hdlsrc/hdlcoder_uniform_oversampling/Subsystem.vhd.
### Generating package file hdlsrc/hdlcoder_uniform_oversampling/Subsystem_pkg.vhd.
### Creating HDL Code Generation Check Report file:///tmp/BR2017bd_694400_188374/publish_examples1/tpaab8996d/hdlsrc/hdlcoder_uniform_oversampling/Subsystem_report.html
### HDL check for 'hdlcoder_uniform_oversampling' complete with 0 errors, 0 warnings, and 1 messages.
### HDL code generation complete.

Notice that the oversampling factor is now LCM(4,4) = 4, and that this is the value reported during code generation. In general, it is a good idea to set the 'SharingFactor' values to the available oversampling budget. If your design contains fewer shareable resources than the 'SharingFactor' value you specify, HDL Coder shares the shareable resources available, and overclocks them by the 'SharingFactor' value. However, if you want to apply both resource sharing and other optimizations that uses overclocking, such as streaming, or apply resource sharing in multiple nested subsystems, this general guideline may result in a higher oversampling factor.

Block Support, Atomic Subsystems and Extensions

HDL Coder supports resource sharing of 4 block types: Product, Gain, Atomic Subsystem, and MATLAB Function.

Sharing functionally equivalent Product and Gain blocks means that the multipliers in the HDL implementation will be shared. Two Product blocks are functionally equivalent if: a) the data types of their inputs and outputs are identical, b) their block parameter settings are identical, and c) their HDL block properties are identical. For a Gain block, functional equivalence additionally requires that the constant value and data types are also identical. However, if the gain constant data types are identical for two Gain blocks with different gain constant values, HDL Coder can share them. Similarly, if a Gain block can be implemented as a Product block with constant input, and it has the same data types as another Product block in the design, the coder can share them.

The third block type, Atomic Subsystem, is useful for sharing functionally equivalent islands of logic encapsulated inside atomic subsystems. Two atomic subsystems are functionally equivalent and can be shared if: a) their Simulink checksums are identical (see Simulink.SubSystem.getChecksum for more details), and b) their HDL block properties are identical.

The fourth block type, MATLAB Function block, can be shared if it is stateless. A MATLAB Function block is stateless if it does not contain persistent variables, or if its persistent variables are not updated in one call to the function and read on a subsequent call. If you only want to share multipliers within a single MATLAB Function block, you can set its SharingFactor block property.

Sharing Atomic Subsystems

% The following example demonstrates an audio filtering model that applies
% the same filter on the left and right channels. By default, HDL Coder
% generates two filter modules in hardware.
bdclose all;
load_system('hdlcoder_audiofiltering');
open_system('hdlcoder_audiofiltering/Audio filter');

% The filters on the two audio channels can be shared by specifying a
% 'SharingFactor' value of 2 on the encompassing subsystem. This generates
% an architecture that uses only one filter, as shown below.
hdlset_param('hdlcoder_audiofiltering/Audio filter', 'SharingFactor', 2);
makehdl('hdlcoder_audiofiltering/Audio filter');
open_system('gm_hdlcoder_audiofiltering/Audio filter');
set_param('gm_hdlcoder_audiofiltering', 'SimulationCommand', 'update');
### Generating HDL for 'hdlcoder_audiofiltering/Audio filter'.
### Using the config set for model <a href="matlab:configset.showParameterGroup('hdlcoder_audiofiltering', { 'HDL Code Generation' } )">hdlcoder_audiofiltering</a> for HDL code generation parameters.
### Starting HDL check.
### The DUT requires an initial pipeline setup latency. Each output port experiences these additional delays.
### Output port 0: 1 cycles.
### Output port 1: 1 cycles.
### Begin VHDL Code Generation for 'hdlcoder_audiofiltering'.
### MESSAGE: The design requires 2 times faster clock with respect to the base rate = 0.00012207.
### Working on hdlcoder_audiofiltering/Audio filter/Filter_left as hdlsrc/hdlcoder_audiofiltering/Filter_left.vhd.
### Working on Audio filter_tc as hdlsrc/hdlcoder_audiofiltering/Audio_filter_tc.vhd.
### Working on hdlcoder_audiofiltering/Audio filter as hdlsrc/hdlcoder_audiofiltering/Audio_filter.vhd.
### Generating package file hdlsrc/hdlcoder_audiofiltering/Audio_filter_pkg.vhd.
### Creating HDL Code Generation Check Report file:///tmp/BR2017bd_694400_188374/publish_examples1/tpaab8996d/hdlsrc/hdlcoder_audiofiltering/Audio_filter_report.html
### HDL check for 'hdlcoder_audiofiltering' complete with 0 errors, 0 warnings, and 1 messages.
### HDL code generation complete.

Opportunities Across Hierarchies

Since 'SharingFactor' is a subsystem-level parameter, different subsystems at different levels of the hierarchy can specify different sharing values. In the audio filter example, the filters on each channel use 3 gain blocks respectively.

open_system('hdlcoder_audiofiltering/Audio filter/Filter_left');

Sharing at Higher Level of Hierarchy

We can specify a 'SharingFactor' value of 2 at the top-level of the DUT to share the filters on the two channels. Additionally, we can specify a 'SharingFactor' value of 3 on each of the filter subsystems to enable sharing of the 3 gain blocks in each channel. When HDL code is now generated, notice first that the left and right filters have been shared and we have only one filter at the top-level of the hierarchy.

hdlset_param('hdlcoder_audiofiltering/Audio filter', 'SharingFactor', 2);
hdlset_param('hdlcoder_audiofiltering/Audio filter/Filter_left', 'SharingFactor', 3);
hdlset_param('hdlcoder_audiofiltering/Audio filter/Filter_right', 'SharingFactor', 3);
makehdl('hdlcoder_audiofiltering/Audio filter');
open_system('gm_hdlcoder_audiofiltering/Audio filter');
set_param('gm_hdlcoder_audiofiltering', 'SimulationCommand', 'update');
### Generating HDL for 'hdlcoder_audiofiltering/Audio filter'.
### Using the config set for model <a href="matlab:configset.showParameterGroup('hdlcoder_audiofiltering', { 'HDL Code Generation' } )">hdlcoder_audiofiltering</a> for HDL code generation parameters.
### Starting HDL check.
### The DUT requires an initial pipeline setup latency. Each output port experiences these additional delays.
### Output port 0: 2 cycles.
### Output port 1: 2 cycles.
### Begin VHDL Code Generation for 'hdlcoder_audiofiltering'.
### MESSAGE: The design requires 6 times faster clock with respect to the base rate = 0.00012207.
### Working on hdlcoder_audiofiltering/Audio filter/Filter_left as hdlsrc/hdlcoder_audiofiltering/Filter_left.vhd.
### Working on Audio filter_tc as hdlsrc/hdlcoder_audiofiltering/Audio_filter_tc.vhd.
### Working on hdlcoder_audiofiltering/Audio filter as hdlsrc/hdlcoder_audiofiltering/Audio_filter.vhd.
### Generating package file hdlsrc/hdlcoder_audiofiltering/Audio_filter_pkg.vhd.
### Creating HDL Code Generation Check Report file:///tmp/BR2017bd_694400_188374/publish_examples1/tpaab8996d/hdlsrc/hdlcoder_audiofiltering/Audio_filter_report.html
### HDL check for 'hdlcoder_audiofiltering' complete with 0 errors, 0 warnings, and 1 messages.
### HDL code generation complete.

Sharing at Lower Level of Hierarchy

Additionally, the 3 gain blocks in the lower of the hierarchy (within the filter subsystem) have also been shared. The net result is that we have reduced the resource usage from 6 multipliers to just one.

open_system('gm_hdlcoder_audiofiltering/Audio filter/Filter_left');
set_param('gm_hdlcoder_audiofiltering', 'SimulationCommand', 'update');

Combining Optimizations and Sharing

Resource sharing can also be combined with other optimizations like streaming.

Consider the model below: it contains a 24-element vector datapath and 3 vector gain blocks, which will map to 72 multipliers, by default. Streaming can scalarize the vector datapath while sharing can share the 3 gain blocks.

bdclose all;
load_system('hdl_areaopt1');
open_system('hdl_areaopt1/Controller');
set_param('hdl_areaopt1', 'SimulationCommand', 'update');

Streaming and Sharing Reduce the Design to Use One Multiplier

To invoke both optimizations, we set 'StreamingFactor' to 24 and 'SharingFactor' to 3. The former will reduce the number of multipliers from 72 to 3, and the latter reduces the 3 scalar multipliers to 1.

hdlset_param('hdl_areaopt1/Controller', 'StreamingFactor', 24);
hdlset_param('hdl_areaopt1/Controller', 'SharingFactor', 3);
makehdl('hdl_areaopt1/Controller');
open_system('gm_hdl_areaopt1/Controller');
set_param('gm_hdl_areaopt1', 'SimulationCommand', 'update');
### Generating HDL for 'hdl_areaopt1/Controller'.
### Using the config set for model <a href="matlab:configset.showParameterGroup('hdl_areaopt1', { 'HDL Code Generation' } )">hdl_areaopt1</a> for HDL code generation parameters.
### Starting HDL check.
### The DUT requires an initial pipeline setup latency. Each output port experiences these additional delays.
### Output port 0: 2 cycles.
### Begin VHDL Code Generation for 'hdl_areaopt1'.
### MESSAGE: The design requires 72 times faster clock with respect to the base rate = 2.
### Working on Controller_tc as hdlsrc/hdl_areaopt1/Controller_tc.vhd.
### Working on hdl_areaopt1/Controller as hdlsrc/hdl_areaopt1/Controller.vhd.
### Generating package file hdlsrc/hdl_areaopt1/Controller_pkg.vhd.
### Generating HTML files for code generation report at <a href="matlab:web('/tmp/BR2017bd_694400_188374/publish_examples1/tpaab8996d/hdlsrc/hdl_areaopt1/html/hdl_areaopt1_codegen_rpt.html');">hdl_areaopt1_codegen_rpt.html</a>
### Creating HDL Code Generation Check Report file:///tmp/BR2017bd_694400_188374/publish_examples1/tpaab8996d/hdlsrc/hdl_areaopt1/Controller_report.html
### HDL check for 'hdl_areaopt1' complete with 0 errors, 0 warnings, and 1 messages.
### HDL code generation complete.

Was this topic helpful?