Real-Time Object Detection with YOLO v2 Using GPU Coder

Hi, I am Ram Cherukuri, product manager here at MathWorks, and in this video I am going to walk you through an example of real-time object detection using YOLO v2 in MATLAB.

What is YOLO?

YOLO stands for “You only look once” and it is a popular approach for object detection.

A common approach to object detection was to repurpose classifiers to perform detection.

So, for instance, R-CNN uses region proposal methods to first generate potential bounding boxes in an image, then runs a classifier on these proposed boxes, and then refines the predictions. As you can see, this requires multiple evaluations.

YOLO, on the other hand, frames detection as a regression problem, and unifies the separate components of object detection into a single neural network.

It divides the input image into a grid, and each grid cell predicts a certain number of bounding boxes along with confidence scores for those boxes. The scores reflect how confident the model is that a box contains an object and also how accurate it thinks the predicted box is. Each grid cell also predicts conditional class probabilities.
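To make the grid idea a little more concrete, here is a minimal Python sketch. It is not taken from the MATLAB example: the function names, the 13-by-13 grid, the 416-by-416 image size, and the scores are all illustrative assumptions. It shows which grid cell a box center falls into, and how a box confidence can be combined with a cell's conditional class probabilities to get a per-class score, as described in the original YOLO formulation.

```python
# Illustrative sketch only: the names, grid size, image size, and scores
# below are assumptions made for this example, not values from the video.

def responsible_cell(cx, cy, img_w, img_h, grid):
    """Return (row, col) of the grid cell that contains the box center."""
    col = min(int(cx / img_w * grid), grid - 1)  # clamp centers on the right edge
    row = min(int(cy / img_h * grid), grid - 1)  # clamp centers on the bottom edge
    return row, col


def class_scores(box_confidence, class_probs):
    """Multiply one box's confidence by the cell's conditional class probabilities."""
    return {name: box_confidence * p for name, p in class_probs.items()}


def best_class(box_confidence, class_probs):
    """Return the highest-scoring class name and its class-specific score."""
    scores = class_scores(box_confidence, class_probs)
    name = max(scores, key=scores.get)
    return name, scores[name]


if __name__ == "__main__":
    # A box centered at pixel (300, 120) in a 416x416 image with a 13x13 grid.
    print(responsible_cell(300, 120, 416, 416, 13))
    # A confident box in a cell that strongly favors the "vehicle" class.
    print(best_class(0.8, {"vehicle": 0.9, "person": 0.1}))
```

A real detector produces these numbers for every cell and every box in a single forward pass, and then filters overlapping boxes. This sketch only covers the bookkeeping for one box.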

That’s a lot to process and I would suggest referring to a few papers and articles to understand the nuances of this unique approach as it’s not possible to cover all that in this short video.

YOLO has become very popular and is widely considered a state-of-the-art technique for real-time object detection, since it uses a single network and is very fast.

Even if you are not familiar with it, you can get started with YOLO v2 with this published example in MATLAB® that explains how you can train a YOLO v2 object detector on your data.

Then, using GPU Coder™, you can generate optimized CUDA code to target NVIDIA® boards like the Jetson Xavier directly from MATLAB.

The hardware support package enables you to deploy the generated code to the Jetson and DRIVE platforms, as we will see in the demo that follows.

Here in MATLAB, I have taken the trained object detector from the example as my starting point and I am going to run inference on a test image.

In fact, we ran a simple test comparing the Faster R-CNN model with YOLO v2, and you can see that YOLO v2 is about 25 times faster on my local machine here.

Now, using GPU Coder, we are going to generate CUDA code from this function and compile it using nvcc into a MEX file so we can verify the generated code on my desktop machine.

You can see that the generated MEX runs at about 80 frames per second on this video file on my desktop, which has a Titan Xp GPU.

Please note that these are not official benchmarking numbers, as I have some other programs also running in the background, but this should give you an idea of the performance of the YOLO v2 network.
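For a rough sense of how a frames-per-second figure like this can be measured, here is a small Python sketch. It is not the MATLAB code used in the demo: `measure_fps`, `ms_per_frame`, and the do-nothing frame function are hypothetical stand-ins for a real detector call on real video frames.

```python
import time


def measure_fps(process_frame, frames):
    """Run process_frame on each frame and return the average frames per second."""
    start = time.perf_counter()
    count = 0
    for frame in frames:
        process_frame(frame)
        count += 1
    elapsed = time.perf_counter() - start
    # Guard against a zero reading on very coarse timers.
    return count / elapsed if elapsed > 0 else float("inf")


def ms_per_frame(fps):
    """Convert a frame rate into the average time spent per frame, in milliseconds."""
    return 1000.0 / fps


if __name__ == "__main__":
    # Placeholder workload: a real test would call the detector on each video frame.
    fps = measure_fps(lambda frame: None, range(1000))
    print(f"{fps:.1f} frames per second, {ms_per_frame(fps):.4f} ms per frame")
```

At about 80 frames per second, for instance, the average budget works out to 12.5 milliseconds per frame. As noted above, anything else running on the machine during the timing loop will affect the result.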

Now, using the hardware support package for NVIDIA GPUs, I can get live data from the camera connected to my Jetson Xavier board and we can run inference using the same generated MEX file.

Here I am using the live data from outside my window overlooking the traffic on Route 9.

Finally, I can generate code from our algorithm here that takes the input from the webcam, uses YOLO v2 for object detection, and displays the output.

The NVIDIA hardware support package supports code generation for these interfaces, and once the code is generated and built, we can run the executable as a standalone application on the Jetson Xavier board.

So, we have real-time object detection using YOLO v2 running standalone on the Jetson Xavier here, taking live input from the webcam connected to it.

A few takeaways from this example are summarized here. You can target NVIDIA boards like the Jetson Xavier and DRIVE PX with simple APIs directly from MATLAB without needing to write any CUDA code.

Please refer to the links below the video to learn more about the hardware support package and to find more object detection examples in MATLAB.