Detect objects using ACF object detector configured for monocular camera
[___] = detect(___,Name,Value)
specifies options using one or more
Name,Value pair arguments. For example,
detect(detector,I,'WindowStride',2) sets the stride of the
sliding window used to detect objects to 2.
Configure an ACF object detector for use with a monocular camera mounted on an ego vehicle. Use this detector to detect vehicles within video frames captured by the camera.
Load an acfObjectDetector object pretrained to detect vehicles.
detector = vehicleDetectorACF;
Model a monocular camera sensor by creating a
monoCamera object. This object contains the camera intrinsics and the location of the camera on the ego vehicle.
focalLength = [309.4362 344.2161];    % [fx fy]
principalPoint = [318.9034 257.5352]; % [cx cy]
imageSize = [480 640];                % [mrows ncols]
height = 2.1798;                      % height of camera above ground, in meters
pitch = 14;                           % pitch of camera, in degrees
intrinsics = cameraIntrinsics(focalLength,principalPoint,imageSize);
monCam = monoCamera(intrinsics,height,'Pitch',pitch);
Configure the detector for use with the camera. Limit the width of detected objects to a typical range for vehicle widths: 1.5–2.5 meters. The result is an ACF object detector configured for the monocular camera.
vehicleWidth = [1.5 2.5];
detectorMonoCam = configureDetectorMonoCamera(detector,monCam,vehicleWidth);
Load a video captured from the camera, and create a video reader and player.
videoFile = fullfile(toolboxdir('driving'),'drivingdata','caltech_washington1.avi');
reader = vision.VideoFileReader(videoFile,'VideoOutputDataType','uint8');
videoPlayer = vision.VideoPlayer('Position',[29 597 643 386]);
Run the detector in a loop over the video. Annotate the video with the bounding boxes for the detections and the detection confidence scores.
cont = ~isDone(reader);
while cont
    I = reader();

    % Run the detector.
    [bboxes,scores] = detect(detectorMonoCam,I);
    if ~isempty(bboxes)
        I = insertObjectAnnotation(I, ...
            'rectangle',bboxes, ...
            scores, ...
            'Color','g');
    end
    videoPlayer(I)

    % Exit the loop if the video player figure is closed.
    cont = ~isDone(reader) && isOpen(videoPlayer);
end
detector— ACF object detector configured for monocular camera
I— Input image
Input image, specified as a real, nonsparse, grayscale or RGB image.
roi— Search region of interest
Search region of interest, specified as an [x y width height] vector. The vector specifies the upper left corner and size of a region in pixels.
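To make the [x y width height] convention concrete, the following sketch crops the corresponding region out of an image with plain indexing. The image and the roi values are made up for illustration and are not taken from the example video.

```matlab
% Hypothetical 480-by-640 RGB frame standing in for a video frame.
I = zeros(480,640,3,'uint8');

% roi = [x y width height]; (x,y) is the upper-left corner, in pixels.
roi = [101 201 320 240];   % illustrative values only

% x indexes columns and y indexes rows, so rows come from y and height.
rows = roi(2):(roi(2) + roi(4) - 1);
cols = roi(1):(roi(1) + roi(3) - 1);
patch = I(rows,cols,:);

disp(size(patch))   % height-by-width-by-channels
```

Note that the vector lists x before y, while MATLAB indexing lists rows before columns, which is an easy place to swap the two by accident.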
Specify optional comma-separated pairs of
Name,Value arguments. Name is
the argument name and
Value is the corresponding value.
Name must appear inside quotes. You can specify several name and value
pair arguments in any order as Name1,Value1,...,NameN,ValueN.
'NumScaleLevels'— Number of scale levels per octave
8 (default) | positive integer
Number of scale levels per octave, specified as the comma-separated pair consisting of
'NumScaleLevels' and a positive integer. Each octave is a
power-of-two downscaling of the image. To detect objects at finer scale
increments, increase this number. Recommended values are in the range [4,
'WindowStride'— Stride for sliding window
4 (default) | positive integer
Stride for the sliding window, specified as the comma-separated pair consisting of
'WindowStride' and a positive integer. This
value indicates the distance for the function to move the window in both the
x and y directions. The sliding window
scans the images for object detection.
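As a rough illustration of why the stride affects computation time, the sketch below counts how many window placements fit in an image at a single scale. The window size is hypothetical, and the stride is treated as a plain pixel step, which is an assumption; the detector's internal feature processing may differ.

```matlab
imageSize  = [480 640];   % [rows cols], matching the example camera
windowSize = [64 64];     % hypothetical detection window, in pixels

for stride = [2 4 8]
    % Number of window placements along each dimension at this stride,
    % counting only windows that fit entirely inside the image.
    nPos = floor((imageSize - windowSize) / stride) + 1;
    fprintf('stride %d: %d x %d = %d windows\n', ...
        stride, nPos(1), nPos(2), prod(nPos));
end
```

Doubling the stride cuts the number of placements by roughly a factor of four, at the cost of coarser localization.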
'SelectStrongest'— Select strongest bounding box for each object
Select the strongest bounding box for each detected object, specified as the comma-separated
pair consisting of
'SelectStrongest' and either true or false.
true — Return the strongest bounding box per
object. To select these boxes,
detect uses nonmaximal suppression to eliminate overlapping
bounding boxes based on their confidence scores.
false — Return all detected bounding boxes. You can
then create your own custom operation to eliminate overlapping bounding
boxes.
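When 'SelectStrongest' is false, overlapping boxes have to be handled by custom code. The following is a minimal sketch of one such operation, a greedy nonmaximal suppression based on intersection over union. It is not the toolbox implementation; the boxes, scores, and the overlapThreshold cutoff are all hypothetical.

```matlab
% Candidate boxes as [x y width height] rows with confidence scores.
bboxes = [100 100 50 40; 104 102 50 40; 300 200 60 60];   % illustrative
scores = [0.9; 0.7; 0.8];
overlapThreshold = 0.5;   % hypothetical IoU cutoff

[~,order] = sort(scores,'descend');
keep = false(size(scores));
while ~isempty(order)
    % Keep the highest-scoring box that has not been suppressed yet.
    best = order(1);
    keep(best) = true;
    rest = order(2:end);
    if isempty(rest)
        break
    end

    % Intersection of the best box with each remaining box.
    x1 = max(bboxes(best,1), bboxes(rest,1));
    y1 = max(bboxes(best,2), bboxes(rest,2));
    x2 = min(bboxes(best,1)+bboxes(best,3), bboxes(rest,1)+bboxes(rest,3));
    y2 = min(bboxes(best,2)+bboxes(best,4), bboxes(rest,2)+bboxes(rest,4));
    inter = max(0,x2-x1) .* max(0,y2-y1);

    % Intersection over union; drop boxes that overlap the best box too much.
    areaBest = bboxes(best,3) * bboxes(best,4);
    areaRest = bboxes(rest,3) .* bboxes(rest,4);
    iou = inter ./ (areaBest + areaRest - inter);
    order = rest(iou <= overlapThreshold);
end

selectedBboxes = bboxes(keep,:);
selectedScores = scores(keep);
```

In this example the second box overlaps the first heavily and has a lower score, so it is suppressed, while the third box is far away and is kept.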
'MinSize'— Minimum region size
Minimum region size that contains a detected object, specified as the comma-separated pair consisting of
'MinSize' and a [height width] vector. Units are in pixels.
By default, MinSize is the smallest object that the trained
detector can detect.
'MaxSize'— Maximum region size
size(I) (default) | [height width] vector
Maximum region size that contains a detected object, specified as the comma-separated pair consisting of
'MaxSize' and a [height width] vector. Units are in pixels.
To reduce computation time, set this value to the known maximum region size for the objects
being detected in the image. By default,
'MaxSize' is set to
the height and width of the input image, I.
'Threshold'— Classification accuracy threshold
–1 (default) | numeric scalar
Classification accuracy threshold, specified as the comma-separated pair consisting of
'Threshold' and a numeric scalar. Recommended values are
in the range [–1, 1]. During multiscale object detection, the threshold value
controls the accuracy and speed for classifying image subregions as either
objects or nonobjects. To speed up the performance at the risk of missing true
detections, increase this threshold.
bboxes— Location of objects detected within image
Location of objects detected within the input image, returned as an M-by-4
matrix, where M is the number of bounding boxes. Each row of
bboxes contains a four-element vector of the form
[x y width height]. This vector specifies the upper left corner and size
of that corresponding bounding box in pixels.
scores— Detection confidence scores
Detection confidence scores, returned as an M-by-1 vector, where M is the number of bounding boxes. A higher score indicates higher confidence in the detection.
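The sketch below shows one way the two outputs might be post-processed together, since row k of bboxes corresponds to element k of scores. The detections and the minScore cutoff are made up for illustration; in practice the arrays would come from a call to detect.

```matlab
% Illustrative detections standing in for the outputs of detect.
bboxes = [120 200 80 60; 400 220 40 30];   % [x y width height]
scores = [35.2; 12.7];                     % made-up confidence scores

% Sort detections from most to least confident, keeping rows aligned.
[scores,idx] = sort(scores,'descend');
bboxes = bboxes(idx,:);

% Box centers in pixel coordinates.
centers = [bboxes(:,1) + bboxes(:,3)/2, bboxes(:,2) + bboxes(:,4)/2];

% Keep only detections at or above a hypothetical score cutoff.
minScore = 20;
strong = scores >= minScore;
strongBboxes = bboxes(strong,:);
```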