Programmable Deep Learning Processor
The toolbox includes a deep learning processor that features generic convolution and fully-connected layers controlled by scheduling logic. This deep learning processor performs FPGA-based inferencing of networks developed using Deep Learning Toolbox™. High-bandwidth memory interfaces speed memory transfers of layer and weight data.
Compilation and Deployment
Compile your deep learning network into a set of instructions to be run by the deep learning processor. Deploy to the FPGA and run prediction while capturing actual on-device performance metrics.
Get Started with Prebuilt Bitstreams
Prototype your network without FPGA programming using available bitstreams for popular FPGA development kits.
Creating a Network for Deployment
Begin by using Deep Learning Toolbox to design, train, and analyze your deep learning network for tasks such as object detection or classification. You can also start by importing a trained network or layers from other frameworks.
Deploying Your Network to the FPGA
Once you have a trained network, use the deploy command to program the FPGA with the deep learning processor along with the Ethernet or JTAG interface. Then use the compile command to generate a set of instructions for your trained network without reprogramming the FPGA.
Running FPGA-Based Inferencing as Part of Your MATLAB Application
Run your entire application in MATLAB®, including your test bench, preprocessing and post-processing algorithms, and the FPGA-based deep learning inferencing. A single MATLAB command, predict, performs the inferencing on the FPGA and returns results to the MATLAB workspace.
Profile FPGA Inferencing
Measure layer-level latency as you run predictions on the FPGA to find performance bottlenecks.
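As a rough illustration of how layer-level latency figures can point to a bottleneck, the following Python sketch ranks layers by their share of total latency. It is not the toolbox API: the function name rank_layers, the layer names, and the latency values are all hypothetical and chosen only for illustration.

```python
def rank_layers(latencies_ms):
    """Return (name, latency_ms, share_of_total) tuples, slowest layer first."""
    total = sum(latencies_ms.values())
    ranked = sorted(latencies_ms.items(), key=lambda kv: kv[1], reverse=True)
    return [(name, ms, ms / total) for name, ms in ranked]


# Hypothetical per-layer latencies in milliseconds (made-up values,
# not measurements from any device).
layer_latency_ms = {"conv1": 2.0, "conv2": 6.0, "fc1": 1.5, "fc2": 0.5}

ranking = rank_layers(layer_latency_ms)
bottleneck = ranking[0][0]  # name of the layer with the largest latency

for name, ms, share in ranking:
    print(f"{name:6s} {ms:5.1f} ms  {share:6.1%}")
```

The layer at the top of such a ranking is usually the first candidate to revisit when tuning the network design.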
Tune the Network Design
Using the profile metrics, tune your network configuration with Deep Learning Toolbox. For example, use Deep Network Designer to add layers, remove layers, or create new connections.
Deep Learning Quantization
Reduce resource utilization by quantizing your deep learning network to a fixed-point representation. Analyze tradeoffs between accuracy and resource utilization using the Model Quantization Library support package.
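To make the accuracy tradeoff concrete, here is a minimal Python sketch of one common fixed-point scheme, symmetric 8-bit quantization with a single scale factor per tensor. It is an assumption for illustration only and does not describe the scheme the toolbox or the support package actually uses. The helper names quantize_int8 and dequantize and the sample weights are hypothetical.

```python
def quantize_int8(values):
    """Symmetric per-tensor quantization to integers in [-127, 127]."""
    max_abs = max(abs(v) for v in values)
    # One scale for the whole tensor; fall back to 1.0 for an all-zero tensor.
    scale = max_abs / 127 if max_abs > 0 else 1.0
    q = [max(-127, min(127, round(v / scale))) for v in values]
    return q, scale


def dequantize(q, scale):
    """Map quantized integers back to approximate real values."""
    return [x * scale for x in q]


# Hypothetical weights, chosen only to illustrate the round trip.
weights = [0.50, -1.27, 0.03, 0.99]

q, scale = quantize_int8(weights)
restored = dequantize(q, scale)

# The round-trip error is bounded by half a quantization step.
max_error = max(abs(w - r) for w, r in zip(weights, restored))
print(q, scale, max_error)
```

Each weight is stored in 8 bits instead of a 32-bit float, at the cost of a rounding error of at most half the scale per value. That error is the accuracy side of the tradeoff against resource utilization.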
Custom Deep Learning Processor Configuration
Specify hardware architecture options for implementing the deep learning processor, such as the number of parallel threads or maximum layer size.
Generate Synthesizable RTL
Use HDL Coder to generate synthesizable RTL from the deep learning processor for use in a variety of implementation workflows and devices. Reuse the same deep learning processor for prototype and production deployment.
Generate IP Cores for Integration
When HDL Coder generates RTL from the deep learning processor, it also generates an IP core with standard AXI interfaces for integration into your SoC reference design.