Increase Throughput by Omitting Padding

This example shows how to reduce latency and save hardware resources by not adding padding pixels at the edge of each frame.

Most image filtering operations pad the image to fill in the neighborhoods for pixels at the edge of the image. Padding can help avoid border artifacts in the output image. In a hardware implementation, the padding operation uses extra resources and introduces extra latency.

Vision HDL Toolbox™ blocks that perform neighborhood processing with padding require horizontal blanking that is twice the kernel width. This behavior means that larger filter sizes result in a longer blanking requirement. Excluding the padding by setting the Padding method parameter to None enables you to use a smaller period of horizontal blanking. Without padding, the horizontal blanking requirement is independent of the image resolution and kernel size. A small number of blanking cycles are still required.

This example includes two models. The first model shows how to use this option with library blocks, and the second model demonstrates using it when constructing algorithms that use the Line Buffer block. This example also explains some design considerations when you do not use padding.

Omitting Padding with Library Blocks

This example model shows how to omit padding with a predefined algorithm from Vision HDL Toolbox libraries. This model includes an Image Filter block configured for an n-by-n blur filter and with its Padding method parameter set to None. You can change the size of the filter kernel by changing the value of n in the workspace. The model opens with n set to 15.

When using edge padding, most blocks have floor(KernelHeight/2) lines of latency and require 2*KernelWidth cycles of horizontal blanking. When you omit padding, most blocks require only 12 cycles of horizontal blanking. Because the internal line buffer latency no longer depends on the kernel size, this blanking interval accommodates any kernel size.
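The relationship between kernel size and blanking described above can be tabulated with a short script. This is a minimal sketch in Python, not toolbox code: the function names are hypothetical, and the formulas are only the ones quoted in this section (2*KernelWidth cycles and floor(KernelHeight/2) lines with padding, a fixed 12 cycles without).

```python
# Behavioral arithmetic only; these helpers are illustrative, not a toolbox API.

# Fixed blanking requirement quoted in this example for Padding method = None.
NO_PADDING_BLANKING_CYCLES = 12


def padded_blanking_cycles(kernel_width: int) -> int:
    """Horizontal blanking required with edge padding: 2*KernelWidth cycles."""
    return 2 * kernel_width


def padded_latency_lines(kernel_height: int) -> int:
    """Latency with edge padding: floor(KernelHeight/2) lines."""
    return kernel_height // 2


def blanking_table(kernel_sizes):
    """Return (n, padded cycles, unpadded cycles) for each n-by-n kernel."""
    return [
        (n, padded_blanking_cycles(n), NO_PADDING_BLANKING_CYCLES)
        for n in kernel_sizes
    ]


if __name__ == "__main__":
    for n, padded, unpadded in blanking_table([7, 15, 31]):
        print(f"{n}-by-{n} kernel: {padded} cycles with padding, "
              f"{unpadded} cycles without")
```

The padded requirement grows linearly with the kernel width, while the unpadded requirement stays constant, which is why the 12-cycle custom format in this model works for any value of n.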

To show the reduced blanking requirements of using Padding method set to None, the Frame To Pixels block is configured for a custom 240p format that uses only 12 cycles of combined front and back porch.

When you run the model, it shows these three figures.

Input Video -- Original 240p input video.
Padding None Full Frame -- Output video from the filter without padding, showing border artifacts.
Padding None ROI -- Output video from the filter without padding, with border pixels trimmed from the edges of the frame. The frame size is smaller than the size of the input video.

Border Artifacts

The Padding None Full Frame viewer shows a dark border around the edge of each frame. This border appears because, without padding pixels, the filter neighborhoods are not fully defined at the edges of the frame. Output from a filter that uses padding does not show border artifacts because the padding logic ensures that the edge neighborhoods are fully defined.

It is common to remove or mask off these border pixels from nonpadded output before further analysis, because border artifacts can decrease the accuracy of subsequent processing. For example, these artifacts can affect the statistical distribution of the overall image. Vision HDL Toolbox blocks return the border pixels for nonpadded images to maintain the input and output timing. The values of these pixels are undefined and cannot be assumed to have any particular relation to the surrounding pixels.

The ROI Selector block removes floor(KernelHeight/2) rows from the top and bottom edges and floor(KernelWidth/2) columns from the left and right edges of each frame. The Padding None ROI viewer shows the video with the border artifacts removed. The resulting frame for a 15-by-15 kernel is 226-by-306 pixels in size, reduced from 240-by-320 pixels.
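The trimming rule can be checked with a few lines of arithmetic. The following Python sketch assumes that floor(KernelHeight/2) rows are removed from both the top and bottom edges and floor(KernelWidth/2) columns from both the left and right edges. The function name and the zero-based (top, left, height, width) return convention are hypothetical and do not correspond to the ROI Selector block parameters.

```python
def roi_after_trim(frame_height, frame_width, kernel_height, kernel_width):
    """Return (top, left, height, width) of the region whose filter
    neighborhoods are fully defined without padding.

    Offsets are zero-based; this layout is an illustrative convention.
    """
    v_trim = kernel_height // 2   # rows removed at the top and at the bottom
    h_trim = kernel_width // 2    # columns removed at the left and at the right
    roi_height = frame_height - 2 * v_trim
    roi_width = frame_width - 2 * h_trim
    if roi_height <= 0 or roi_width <= 0:
        raise ValueError("kernel is too large for this frame size")
    return v_trim, h_trim, roi_height, roi_width


if __name__ == "__main__":
    top, left, height, width = roi_after_trim(240, 320, 15, 15)
    print(f"ROI starts at row {top}, column {left}: {height}-by-{width} pixels")
```

Because the trim depends on the kernel size, changing n in the workspace also changes the frame size that downstream processing receives.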

Omitting Padding with the Line Buffer Block

This model shows how to design algorithms by using the Line Buffer block with the Padding method parameter set to None. This model contains a Padding None subsystem and a Padding Symmetric subsystem.

The Frame To Pixels block connected to the Padding Symmetric subsystem uses the standard 240p format. The standard horizontal blanking (combined front and back porch) is 82 cycles. Increasing the resolution increases the blanking interval. For example, the 1080p format has 280 idle cycles between lines.

The Frame To Pixels block connected to the Padding None subsystem implements a custom 240p format that uses only 12 cycles of combined front and back porch, the same as in the Image Filter model shown earlier.
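To make the throughput difference between the two formats concrete, the sketch below compares clock cycles per line. It is a rough estimate in Python that considers horizontal timing only and assumes the same pixel clock and the same vertical blanking for both formats. The constant and function names are hypothetical.

```python
ACTIVE_PIXELS_PER_LINE = 320   # active width of a 240p frame
STANDARD_BLANKING = 82         # standard 240p horizontal blanking, in cycles
CUSTOM_BLANKING = 12           # custom format used with Padding method = None


def cycles_per_line(active_pixels: int, blanking_cycles: int) -> int:
    """Total clock cycles needed to stream one line of video."""
    return active_pixels + blanking_cycles


def line_rate_gain(active_pixels: int, blanking_before: int,
                   blanking_after: int) -> float:
    """Ratio of line rates (after / before) at a fixed pixel clock."""
    before = cycles_per_line(active_pixels, blanking_before)
    after = cycles_per_line(active_pixels, blanking_after)
    return before / after


if __name__ == "__main__":
    gain = line_rate_gain(ACTIVE_PIXELS_PER_LINE, STANDARD_BLANKING,
                          CUSTOM_BLANKING)
    print(f"standard: {cycles_per_line(320, 82)} cycles per line")
    print(f"custom:   {cycles_per_line(320, 12)} cycles per line")
    print(f"line-rate gain: {gain:.3f}x")
```

Under these assumptions the custom format needs 332 cycles per line instead of 402, so more lines fit into the same number of clock cycles.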

This model implements a 15-by-15 Gaussian filter, with a large standard deviation, by using the Line Buffer block.

When you run the model, it shows three figures:

Input Video -- Original 240p input video.
Padding None ROI -- Output video from the filter without padding, with border pixels trimmed from the edges of the frame. The frame size is smaller than the size of the input video.
Padding Symmetric -- Output video from the filter with symmetric padding. This video is full size but has no edge effects because the padding pixels define the neighborhoods around the edge pixels.

pixelcontrol Delay Balancing

When you construct algorithms that use the Line Buffer block, you must delay-balance the pixelcontrol bus to account for the kernel latency. When you use padding, the Line Buffer returns shiftEnable set to 1 for floor(KernelWidth/2) cycles before hStart and after hEnd. The delay-balancing logic uses this extended shiftEnable signal to control the delay registers for the pixelcontrol signals. You can see this logic in the Padding Symmetric/pixelctrldelay subsystem.

When you set Padding method to None, the Line Buffer returns shiftEnable set to 1 only between hStart and hEnd. The delay-balancing logic must therefore use the clock, instead of shiftEnable, to control the delay registers for hEnd, vEnd, and valid. The valid signal has two further requirements. It must respond to shiftEnable being set to 0 during a line, which can occur when interfacing with external memory, and it must be set to 1 on the last pixel of the line, to match hEnd and vEnd. To meet both requirements, the delay-balancing logic delays the valid signal by using a register enabled by shiftEnable, and uses a Unit Delay Enabled block to set the valid signal to 1 with hEnd at the end of the line. The Padding None/pixelctrldelay subsystem shows this logic.
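The reason hEnd cannot use shiftEnable-gated registers without padding can be seen in a small behavioral model. The Python sketch below is a simplified illustration, not the logic in the pixelctrldelay subsystems. It models a chain of enabled registers and assumes a hypothetical 8-pixel line, 6 blanking cycles, and a delay depth of 3 that equals the number of cycles shiftEnable stays high after hEnd when padding is used. All names are illustrative.

```python
def delay_line(samples, enables, depth):
    """Model `depth` cascaded registers that shift only when enable is 1.

    Returns the value visible at the output of the last register
    during each clock cycle.
    """
    regs = [0] * depth
    out = []
    for value, enable in zip(samples, enables):
        out.append(regs[-1])            # output seen during this cycle
        if enable:
            regs = [value] + regs[:-1]  # shift on the clock edge
    return out


def first_index_of_one(signal):
    """Cycle at which the signal first equals 1, or None if it never does."""
    return signal.index(1) if 1 in signal else None


LINE_PIXELS = 8   # hypothetical short line
BLANKING = 6      # hypothetical idle cycles after the line
DEPTH = 3         # assumed kernel latency, in cycles
CYCLES = LINE_PIXELS + BLANKING

# hEnd pulses on the last pixel of the line.
h_end = [1 if t == LINE_PIXELS - 1 else 0 for t in range(CYCLES)]

# With padding, shiftEnable stays high for DEPTH cycles after hEnd.
shift_enable_padded = [1 if t < LINE_PIXELS + DEPTH else 0 for t in range(CYCLES)]
# Without padding, shiftEnable is high only during the line.
shift_enable_none = [1 if t < LINE_PIXELS else 0 for t in range(CYCLES)]
# A register controlled by the clock alone shifts on every cycle.
always_enabled = [1] * CYCLES

h_end_padded = delay_line(h_end, shift_enable_padded, DEPTH)
h_end_stuck = delay_line(h_end, shift_enable_none, DEPTH)
h_end_clocked = delay_line(h_end, always_enabled, DEPTH)

if __name__ == "__main__":
    print("padded, gated by shiftEnable:", first_index_of_one(h_end_padded))
    print("none, gated by shiftEnable:  ", first_index_of_one(h_end_stuck))
    print("none, clocked every cycle:   ", first_index_of_one(h_end_clocked))
```

In this model, the gated delay line delivers hEnd on time only when shiftEnable extends past the end of the line. Without that extension, the pulse stays in the first register until the next line begins, so the delay must run from the clock.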

Conclusion

Excluding padding logic enables you to achieve higher throughput by using a video format with reduced horizontal blanking. This option also reduces hardware resource usage. However, your design must account for the border artifacts later in the processing chain. When you use the Line Buffer block, you must delay the pixelcontrol bus to match the kernel latency by using control logic that accounts for the modified behavior of the shiftEnable output signal. Using this example as a starting point, you can design algorithms and systems that achieve higher throughput by excluding padding logic.