This example shows how to develop vision algorithms to work with the Vision HDL Toolbox™ Support Package for Xilinx® Zynq-Based Hardware. It demonstrates how to take the Vision HDL Toolbox Edge Detection and Image Overlay example, and make it work with Zynq devices. The workflow can also be applied to other Vision HDL Toolbox examples.
Vision HDL Toolbox
Computer Vision Toolbox™
HDL Coder Support Package for Xilinx Zynq Platform
Optionally, to generate, compile, and target a Zynq ARM® software generation model:
Embedded Coder Support Package for Xilinx Zynq Platform
You can use this support package to develop algorithms that target a Zynq-based board. This example shows a complete workflow, from a frame-based simulation model to a deployed pixel-streaming algorithm running on a ZedBoard®. To demonstrate the workflow, this example uses an edge detection and image overlay algorithm throughout. This algorithm corresponds to the Vision HDL Toolbox example, Edge Detection and Image Overlay. The Support Package for Zynq-Based Hardware provides a hardware reference design that simplifies integrating your targeted algorithm into a vision system.
You can also apply this workflow to the Corner Detection and Image Overlay with Zynq-Based Hardware, Gamma Correction with Zynq-Based Hardware, and Image Sharpening with Zynq-Based Hardware examples.
If you have not yet done so, run through the guided setup wizard portion of the Zynq support package installation. You might have already completed this step when you installed this support package.
On the MATLAB Home tab, in the Environment section of the Toolstrip, click Add-Ons > Manage Add-Ons. Locate Vision HDL Toolbox Support Package for Xilinx Zynq-Based Hardware, and click Setup.
The guided setup wizard performs a number of initial setup steps, and confirms that the target can boot and that the host and target can communicate.
For more information, see Step 1. Setup Checklist.
If you plan to generate embedded ARM code to control AXI-Lite registers attached to the FPGA control logic, you must perform additional setup steps to configure the Xilinx cross-compiling tools. These steps are detailed in Step 9. Setup for ARM Targeting.
Start with a frame-based model of the Edge Detection and Image Overlay algorithm.
You can run this simulation without hardware because the video source for this example is the From Multimedia File block, which reads video data from a multimedia file. This step allows you to verify the frame-based algorithm against known, fixed video data.
Algorithms are often sensitive to the specific video input. In this step, you can verify the algorithm against real-world data coming from the camera attached to the HDMI input. To do this, right-click on the variant selection icon in the lower-left corner of the Image Source block, choose Label mode active choice, and select HW.
When using real-world data, choose a frame size that matches your camera settings. If your camera allows different sizes, you can choose smaller sizes for faster throughput. The minimum size the HDMI input supports is 480p.
All of the settings on the Video Capture block are sent to the target during simulation to configure it for capturing the camera video stream.
Now, refine the algorithm for implementation in hardware. Instead of working on full-frame images, this HDL-ready algorithm works on a pixel-streaming interface. The model includes comparison logic between the frame-based and pixel-streaming versions of the algorithm. It uses a YCbCr Resize block to resize the source video frame for better simulation performance.
You can run this simulation without hardware to determine whether the pixel-stream and frame-based versions of the algorithm are the same. If they are, you should see 'Inf' in the PSNR output.
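The Inf reading follows from the definition of PSNR: identical frames have zero mean squared error, so the ratio is unbounded. The following is a minimal MATLAB sketch of that relationship, not the computation inside the model's PSNR block. It assumes 8-bit data with a peak value of 255, and the helper name simplePsnr and the test frames are hypothetical.

```matlab
% Hypothetical helper: PSNR in dB, assuming 8-bit data (peak value 255).
simplePsnr = @(a, b) 10 * log10(255^2 / mean((double(a(:)) - double(b(:))).^2));

frameRef   = uint8(magic(4) * 10);       % stand-in for the frame-based output
frameMatch = frameRef;                   % pixel-stream output that matches exactly
frameDiff  = frameRef;
frameDiff(1, 1) = frameDiff(1, 1) + 5;   % a single pixel differs by 5

psnrMatch = simplePsnr(frameRef, frameMatch);   % zero error, so the result is Inf
psnrDiff  = simplePsnr(frameRef, frameDiff);    % any mismatch gives a finite value
fprintf('match: %g dB, mismatch: %.2f dB\n', psnrMatch, psnrDiff);
```

Any finite PSNR value therefore points to at least one pixel that differs between the two versions.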
There are two things to note about the simulation outputs:
During the first frame of simulation output, the Video Display scopes display a green image, which indicates that no image data is available. This occurs because the output of the pixel-streaming algorithm must be buffered to form a full frame before it can be displayed. The output of the frame-based algorithm is delayed by one frame to compensate.
During the second frame of simulation, the Video Display scope for the pixel-streaming output is incorrect with respect to the cAlpha parameter values that you configured in the Simulink model. The PSNR for Y scope also shows a non-Inf value for the second frame. This discrepancy occurs because the algorithm uses an initial cAlpha value of 0.0. When you use Rate Transition blocks configured in Unit Delay mode, the input value is registered at the output only after a full sample period at the input rate. In this case, the input rate corresponds to one video frame. During this initial transition period, the block outputs its initial condition (a value of 0.0). These blocks ensure a single rate for all blocks within the subsystem, which is required for HDL code generation. For more information, see Rate Transition (Simulink).
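The unit-delay behavior described above can be sketched in plain MATLAB. This is only an illustration of a one-sample delay at the frame rate, not the Simulink Rate Transition block itself. The configured value of 0.8 and the variable names are hypothetical; the initial condition of 0.0 comes from the text.

```matlab
% Plain MATLAB model of a unit delay running at the frame rate.
numFrames   = 4;
cAlphaIn    = 0.8 * ones(1, numFrames);   % hypothetical configured value, one sample per frame
initialCond = 0.0;                        % initial condition stated in the text

% out(k) = in(k-1); the first output is the initial condition.
cAlphaOut = [initialCond, cAlphaIn(1:end-1)];

for k = 1:numFrames
    fprintf('input frame %d: configured %.1f, seen by algorithm %.1f\n', ...
        k, cAlphaIn(k), cAlphaOut(k));
end
```

The algorithm sees the initial condition for exactly one input sample period and the configured value afterward, which is why only a single displayed frame is affected.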
You can run the pixel-stream model with live camera data by configuring the Image Source block accordingly. As before, right-click the variant selection icon in the lower-left corner of the Image Source block, choose Label mode active choice, and select HW.
Although the capture can support large frames, the pixel-stream algorithm simulates very slowly, so it is best to run with the smallest possible frame size. You can resize the captured frame to 240p using the YCbCr Resize block. If you change the frame size, change it in the reference design block, the Frame To Pixels for YCbCr 4:2:2 block, and the Pixels To Frame for YCbCr 4:2:2 block.
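To get a rough sense of why smaller frames simulate faster, you can compare the number of active pixels per frame, since the pixel-stream model processes one pixel per sample. The sketch below assumes the common active dimensions for each format (for example, 320-by-240 for 240p and 640-by-480 for 480p); check these against the settings in your own model.

```matlab
% Assumed active frame dimensions for each format (verify against your model).
names  = {'1080p', '720p', '480p', '240p'};
width  = [1920 1280 640 320];
height = [1080  720 480 240];

pixelsPerFrame = width .* height;
relativeCost   = pixelsPerFrame / pixelsPerFrame(end);   % work relative to 240p

for k = 1:numel(names)
    fprintf('%-6s %8d pixels per frame, %4.1fx the 240p work\n', ...
        names{k}, pixelsPerFrame(k), relativeCost(k));
end
```

Under these assumptions, a 1080p frame carries 27 times as many pixels as a 240p frame. This counts active pixels only, so actual simulation time also depends on blanking intervals and the rest of the model.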
After you are satisfied that the pixel-streaming and frame-based algorithms match, you can target the pixel-streaming algorithm to the FPGA on the Zynq board. To prepare for targeting, the frame-based algorithm and verification logic are removed from the previous model, leaving a model that serves as the input for generating the hardware user logic.
In preparation for targeting, set up the Xilinx tool chain by invoking hdlsetuptoolpath. For example:
>> hdlsetuptoolpath('ToolName','Xilinx Vivado','ToolPath','C:\Xilinx\Vivado\2020.1\bin\vivado.bat');
See hdlsetuptoolpath (HDL Coder) for more information.
Start the targeting workflow by right-clicking the Edge Overlay Algorithm subsystem and selecting HDL Code > HDL Workflow Advisor.
In Step 1.1, select the IP Core Generation workflow and the appropriate platform from the list of available platforms.
In Step 1.2, select the reference design corresponding to the pixel format used in your design. This step calculates the clock frequency needed to support the resolution you select, and adds a synthesis constraint. After Step 4.3, check the synthesis logs for your design to verify that synthesis achieved the requested clock frequency.
In Step 1.3, map the target platform interfaces to the input and output ports of your design. The options in the Target Platform Interfaces column vary depending on the pixel format. The figure shows an example of the interface table configured for YCbCr 4:2:2, along with a color-coded selection of possible interface values.
In the example model, the ports have been labeled so that they closely resemble the correct entry in the Target Platform Interface drop-down list. These names show in the Port Names column. The Target Platform Interfaces include four types of signal:
Pixel Streaming Data Signals: These signal interfaces depend on the pixel format of the design. This example uses YCbCr pixel format, so only the options Y and CbCr appear in the list. If you select the RGB reference design, then three options, R, G, and B, are available. If you select Y Only, the only data interface option is Y. These signal interfaces apply to both the input and output interfaces.
Pixel Streaming Control Buses: These bus interfaces appear for all pixel formats. For more information, see Pixel Control Bus.
External Board Interfaces: These signal interfaces correspond to physical GPIO interfaces on your target hardware platform. There are two categories: input and output. In this example, the input port pbEdgeOnly is mapped to a Push Button interface, the input port dsGrayScale is mapped to a DIP Switch interface, and the output port LED is mapped to an LED interface.
ARM User Logic Interfaces: The only interface option in this category is AXI-Lite. Choosing this interface directs HDL Coder to generate a memory-mapped register in the FPGA fabric. You can access this register from software running on the ARM processor. This interface enables you to tune parameters in real time when running a Targeted Hardware Simulation or a Software Interface model.
For this example, select the YCbCr 4:2:2 reference design to match the pixel format of the Pixel-Stream Edge Overlay Algorithm subsystem. With reference to the Interface Table diagram, map the YIn port to the Y Input [0:7] interface. Similarly, map the CbCrIn port to the Cb/Cr Input [0:7] interface. Map the pixel streaming control input and output ports to the corresponding control signal interfaces. To use the external interfaces on the target hardware platform, map the pbEdgeOnly port to push button 0, the dsGrayScale port to DIP switch 0, and the LED port to LED 0. Likewise, map cThreshold to AXI4-Lite software control.
Step 2 prepares the design for generation by doing some design checks.
Step 3 generates HDL code for the IP core.
Step 4 integrates the newly generated IP core into the larger Vision Zynq reference design.
Execute each step in sequence to experience the full workflow, or, if you are already familiar with the preparation and HDL code generation phases, right-click Step 4.1 in the table of contents on the left-hand side and select Run to selected task.
In Step 4.2, the workflow generates a targeted hardware interface model and, if the Embedded Coder Zynq support package has been installed, a Zynq software interface model. Two library models, containing the required blocks and interfaces for the configuration and control of the targeted algorithm, are also created. These models are explained later. Click the Run this task button with the default settings.
Steps 4.3 and 4.4
The rest of the workflow generates a bitstream for the FPGA, downloads it to the target, and reboots the board.
Because this process can take 20-40 minutes, you can choose to bypass this step by using a pre-generated bitstream for this example that ships with the product and was placed on the SD card during setup.
Note: This bitstream was generated with the HDMI pixel clock constrained to 148.5 MHz for a maximum resolution of 1080p HDTV at 60 frames-per-second. To run this example on Zynq hardware with a higher resolution, select the Source Video Resolution value from the drop-down list in Step 1.2.
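The 148.5 MHz figure can be reproduced from the video timing. As a sketch, assuming the standard 1080p60 totals of 2200 pixels per line and 1125 lines per frame (active video plus blanking; these totals are not stated in this example), the required pixel clock is the product of the totals and the frame rate:

```matlab
% Assumed 1080p60 timing totals (active video plus blanking intervals).
totalPixelsPerLine = 2200;   % 1920 active + 280 horizontal blanking
totalLinesPerFrame = 1125;   % 1080 active + 45 vertical blanking
framesPerSecond    = 60;

pixelClockHz = totalPixelsPerLine * totalLinesPerFrame * framesPerSecond;
fprintf('Required pixel clock: %.1f MHz\n', pixelClockHz / 1e6);
```

The same arithmetic applies to other resolutions, using the timing totals for the format in question.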
To use this pre-generated bitstream, execute the following:
>> vz = visionzynq();
>> changeFPGAImage(vz, 'visionzynq-zedboard-hdmicam-edge_overlay.bit');
Replace 'zedboard' with 'zc706', 'zc702', 'picozed' or 'zcu102' if appropriate.
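If you switch between boards often, you can build the file name from a board variable rather than editing the string by hand. This sketch assumes that the shipped bitstream names differ only in the board substring, as the replacement instruction above implies. The variable names board and bitFile are hypothetical.

```matlab
% Build the bitstream file name for a given board.
% Assumption: the file names differ only in the board substring.
board   = 'zc706';   % one of: zedboard, zc706, zc702, picozed, zcu102
bitFile = sprintf('visionzynq-%s-hdmicam-edge_overlay.bit', board);
disp(bitFile);

% With the hardware connected, pass the name on as in the commands above:
%   vz = visionzynq();
%   changeFPGAImage(vz, bitFile);
```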
Alternatively, you can continue with Steps 4.3 and 4.4.
In Step 4.3, the workflow advisor generates a bitstream for the FPGA. You can choose to execute this step in an external shell by keeping the selection Run build process externally. This selection allows you to continue using MATLAB while the FPGA is being built. The step completes in a couple of minutes, after some basic project checks, and is then marked with a green checkmark. However, you must wait until the external shell shows a successful bitstream build before moving on to the next step.
In Step 4.4, you download the completed bitstream to the target and reboot the board.
Step 4.2 generated either two or four models, depending on whether Embedded Coder is installed: a 'targeted hardware interface' model and its associated library model and, if Embedded Coder is installed, a 'software interface' model and its associated library model. The 'targeted hardware interface' model can be used to control the reference design from the Simulink model without Embedded Coder. The 'software interface' model supports full software targeting to the Zynq when Embedded Coder and the Zynq (Embedded Coder) support package are installed, enabling External mode simulation, processor-in-the-loop, and full deployment.
The library models are created so that any changes to the hardware generation model are propagated to any custom targeted hardware simulation or software interface models that exist.
In this model, you can adjust the configuration of the reference design and read or drive control ports of the hardware user logic. These configuration changes affect the design while it is running on the target. You can also display captured video from the target device.
The generated model contains the blocks that enable the targeted algorithm to be configured and controlled from Simulink. Areas of the model are labelled to highlight where further video processing algorithms, and algorithms to control the targeted hardware user logic, should be placed.
An example of how you can use the targeted hardware simulation model is provided in a saved model.
Update the frame size in the Video Capture block to match the settings on your camera.
This model includes a capture display block to allow further analysis of the user logic output in Simulink. This is an optional step. The HDMI output on the target board also shows the output. If you only require visual confirmation of the algorithm working, you can remove the capture display block.
Open the 'cThreshold' constant block and change its value. Notice that the edge detection becomes more or less aggressive.
Because the algorithm is running on hardware, you can push PB0 on the Zynq board to send only the edge detection image to the HDMI output, and toggle DS0 to switch between color and grayscale.
Change other settings to see their effects.
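To see why changing the threshold makes edge detection more or less aggressive, consider the following host-side sketch. It applies Sobel kernels to a small synthetic image and counts the pixels whose gradient magnitude exceeds each threshold. The choice of Sobel kernels, the image, and the threshold values are all assumptions for illustration; the targeted algorithm and its fixed-point threshold scaling may differ.

```matlab
% Synthetic image with a weak step (0 -> 10) and a strong step (10 -> 110).
img = [zeros(5, 3), 10 * ones(5, 3), 110 * ones(5, 3)];

% Sobel kernels (an assumption; the example's edge detector settings may differ).
kx = [1 0 -1; 2 0 -2; 1 0 -1];
ky = kx';

gx = conv2(img, kx, 'valid');     % horizontal gradient
gy = conv2(img, ky, 'valid');     % vertical gradient
gradMag = sqrt(gx.^2 + gy.^2);    % gradient magnitude

thresholds = [20 100 500];        % hypothetical threshold values
edgeCount  = zeros(size(thresholds));
for k = 1:numel(thresholds)
    % A pixel counts as an edge when its gradient magnitude exceeds the threshold.
    edgeCount(k) = nnz(gradMag > thresholds(k));
end
disp([thresholds; edgeCount]);
```

A low threshold marks both steps as edges, a middle value keeps only the strong step, and a high value suppresses all edges.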
In this model, you can run in External mode to control the configuration of the reference design, and read or drive any control ports of the hardware user logic that you connected to AXI-Lite registers. These configuration changes affect the design while it is running on the target. You can use this model to fully deploy a software design. (This model is generated only if Embedded Coder and the Zynq 7000 (Embedded Coder) support package are installed.)
The generated model contains the blocks that enable the targeted algorithm to be configured and controlled from software. An area of the model is labelled to highlight where the software algorithm to control the targeted hardware user logic should be placed.
An example of how to use the software interface model to generate software is provided in a saved model.
Before running this model, you must perform additional setup steps to configure the Xilinx cross-compiling tools. For more information, see Step 9. Setup for ARM Targeting.
This model can be run in External mode. This mode allows you to control the configuration from the Simulink model. Adjust the constant values for cAlpha. You can also bypass the hardware user logic in the data path on the Zynq device by checking the Bypass hardware user logic option in the Software Interface block properties, and clicking Apply.
You can also fully deploy the design. In the Simulink toolbar, click Deploy to Hardware.
Algorithm Configuration Options
In addition to processing the image for edge detection, this algorithm offers some configurable control parameters.
pbEdgeOnly is mapped to a Zynq board push button that, when pushed, shows only the edge detection results and none of the original image. The specific push button is specified in the Target the Algorithm section.
dsGrayScale is mapped to a Zynq board DIP switch that, when toggled, removes the CbCr pixel data and outputs only the Y data. The specific DIP switch is specified in the Target the Algorithm section.
cThreshold is a configuration option to adjust the edge detection threshold. Even after targeting, you can control it from the Simulink model through External Mode or Target Hardware execution.
cAlpha is a configuration option to adjust the blending of the original image and the edge detection image. A value of 1.0 means full original image and a value of 0.0 means full edge detection image. This option is also available for remote configuration through External Mode or Target Hardware execution.
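The cAlpha description is consistent with a linear blend of the two images. The sketch below assumes that form; the actual fixed-point implementation in the model may differ. The pixel values and the name blend are hypothetical.

```matlab
% Hypothetical 2-by-2 luma samples.
original = [200 100; 50 25];    % source frame
edges    = [255   0;  0 255];   % edge-detection output

% Assumed linear blend: cAlpha = 1.0 gives the original, 0.0 gives the edges.
blend = @(cAlpha) cAlpha * original + (1 - cAlpha) * edges;

fullOriginal = blend(1.0);
fullEdges    = blend(0.0);
halfAndHalf  = blend(0.5);
disp(halfAndHalf);
```

Intermediate cAlpha values overlay the edges on a dimmed copy of the original image.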
Note that the pbEdgeOnly and dsGrayScale control ports are pure hardware connections in the targeted design. These ports can run at any desired rate, including the pixel clock rate. The cThreshold and cAlpha values are controlled by the embedded processor (or by the host in External Mode or Target Hardware mode). Because neither the host nor the embedded CPU can update these controls at the pixel clock rate, an update rate on the order of the frame rate is appropriate.
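This example demonstrated the workflow for developing algorithms to work with Xilinx Zynq devices. You can apply this workflow to the Corner Detection and Image Overlay with Zynq-Based Hardware, Gamma Correction with Zynq-Based Hardware, and Image Sharpening with Zynq-Based Hardware examples.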
In addition, you can also adapt other Vision HDL Toolbox examples to work with Zynq devices.
The example models are based on the YCbCr 4:2:2 Frame-Based Algorithms with Zynq Simulink template and YCbCr 4:2:2 HDL Pixel-Streaming Algorithms with Zynq Simulink template, which are available from the Simulink start page.