Use MATLAB to Prototype Deep Learning on a Xilinx FPGA
FPGA-based hardware is a good fit for deep learning inferencing on embedded devices because they deliver low latency and power consumption. Early prototyping is essential to developing a deep learning network that can be efficiently deployed to an FPGA.
See how Deep Learning HDL Toolbox™ automates FPGA prototyping of deep learning networks directly from MATLAB®. With a few lines of MATLAB code, you can deploy to and run inferencing on a Xilinx® ZCU102 FPGA board. This direct connection allows you to run deep learning inferencing on the FPGA as part of your application in MATLAB, so you can converge more quickly on a network that meets your system requirements.
FPGAs are a good fit for deep learning inferencing in edge devices because they have lower latency, and use less power than CPUs or GPUs, and we’re starting to see them designed into a variety of applications.
But edge deployment brings constraints such as speed, size and power consumption, that force tradeoffs in implementing deep learning networks on FPGA-based hardware. So it becomes vital for engineers to be able to quickly iterate between network design and FPGA deployment.
With Deep Learning HDL Toolbox, you can get started running inferencing on an FPGA from MATLAB with as little as 5 lines of code added to your existing deep learning code, so you can experiment and iterate right in MATLAB.
To get started quickly, download the Xilinx support package for Deep Learning HDL Toolbox from the add-on explorer or the MathWorks hardware support page. This package includes pre-built bitstreams that program a deep learning processor and data movement functionality onto popular boards like the Xilinx ZCU102.
This deep learning processor has modules that run convolution and fully connected layers, and you can compile a variety of deep learning networks to run on them without reprogramming the FPGA. The rest of the functionality controls the layers, along with movement and storage of the parameters and activations, plus the interfaces that allow MATLAB to talk to it directly over Ethernet or JTAG.
This is a lane detection example that uses a series network that already been trained. It will overlay lane markings on the video.
The first line of code you will need is to define your target object. In this case, the target is a Xilinx board, using the Ethernet interface.
The next line defines the workflow object, which specifies to use that target object, which bitstream – and in this case it’s the one we downloaded that uses single-precision floating-point calculations, so you don’t even need to quantize to fixed-point, and of course what network we want to program onto the target.
The third line compiles the instructions that control the network and generates the parameters. As you iterate on your network design, you can just re-compile and deploy to the processor,
Which is the fourth line of code here – the deploy function. This one programs the FPGA with the bitstream if it hasn’t already been programmed. And it loads the compiled instructions that define the network, along with its parameters.
Then finally the fifth line is the one that calls on the network to run prediction on the FPGA. You’ll usually use it in your MATLAB algorithm like here.
And that’s it, you can try your network running on the FPGA in the context of your algorithm.
Here because we load one image at a time from MATLAB to the FPGA, it appears to be running slowly, but assessing the performance profile it’s not too bad. We can make adjustments to the network right from here, recompile, and re-assess performance with just a few lines of MATLAB code.
So you can get immediate feedback on how it performs on an FPGA without having to burden the hardware team, and ultimately you can generate HDL for a deep learning processor that you know can be implemented in hardware.
These 5 lines of MATLAB code are a common theme throughout our suite of examples, so you can try it with one of the examples most similar to your application.
You can also select a web site from the following list:
How to Get Best Site Performance
Select the China site (in Chinese or English) for best site performance. Other MathWorks country sites are not optimized for visits from your location.