
vision.PeopleDetector System object

Detect upright people using HOG features

Description

The people detector object detects people in an input image using Histogram of Oriented Gradient (HOG) features and a trained Support Vector Machine (SVM) classifier. The object detects unoccluded people in an upright position.

Note

Starting in R2016b, instead of using the step method to perform the operation defined by the System object™, you can call the object with arguments, as if it were a function. For example, y = step(obj,x) and y = obj(x) perform equivalent operations.
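For example, the following calls are equivalent (a brief sketch; I is assumed to be a grayscale or truecolor image already in the workspace):

peopleDetector = vision.PeopleDetector;
bboxes = step(peopleDetector,I);   % step syntax
bboxes = peopleDetector(I);        % function-call syntax (R2016b and later)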

Construction

peopleDetector = vision.PeopleDetector returns a System object, peopleDetector, that detects upright, unoccluded people in an image.

peopleDetector = vision.PeopleDetector(MODEL) creates a peopleDetector System object and sets the ClassificationModel property to MODEL. The input MODEL can be either 'UprightPeople_128x64' or 'UprightPeople_96x48'.

peopleDetector = vision.PeopleDetector(Name,Value) creates a people detector System object and configures its properties, specified as one or more name-value pair arguments. Unspecified properties have default values.
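For example, the following calls illustrate the three constructor forms (the property values in the last call are illustrative):

peopleDetector = vision.PeopleDetector;                          % default model
peopleDetector = vision.PeopleDetector('UprightPeople_96x48');   % alternate model
peopleDetector = vision.PeopleDetector('ClassificationThreshold',2, ...
    'MergeDetections',false);                                    % name-value pairs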

To detect people:

  1. Define and set up your people detector object using the constructor.

  2. Call the step method with the input image, I, and the people detector object, peopleDetector. See the syntax below for using the step method.

BBOXES = step(peopleDetector,I) performs multiscale object detection on the input image, I. The method returns an M-by-4 matrix defining M bounding boxes, where M represents the number of detected people. Each row of the output matrix, BBOXES, contains a four-element vector, [x y width height]. This vector specifies, in pixels, the upper-left corner and size of a bounding box. When no people are detected, the step method returns an empty vector. The input image, I, must be a grayscale or truecolor (RGB) image.

[BBOXES,SCORES] = step(peopleDetector,I) returns a confidence value for the detections. The M-by-1 vector, SCORES, contains positive values for each bounding box in BBOXES. Larger score values indicate a higher confidence in the detection. The SCORES value depends on how you set the MergeDetections property. When you set the property to true, the people detector algorithm evaluates classification results to produce the SCORES value. When you set the property to false, the detector returns the unaltered classification SCORES.
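For example, you can use SCORES to keep only high-confidence detections (a minimal sketch; the cutoff value is illustrative and should be tuned for your data):

[bboxes,scores] = step(peopleDetector,I);
keep = scores > 1;            % illustrative confidence cutoff
bboxes = bboxes(keep,:);
scores = scores(keep);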

[___] = step(peopleDetector,I,roi) detects people within the rectangular search region specified by roi. You must specify roi as a 4-element vector, [x y width height], that defines a rectangular region of interest within image I. Set the 'UseROI' property to true to use this syntax.
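For example, to search only part of the image (a minimal sketch; the ROI coordinates are illustrative):

peopleDetector = vision.PeopleDetector('UseROI',true);
roi = [50 50 200 300];                    % [x y width height], in pixels
[bboxes,scores] = step(peopleDetector,I,roi);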

Properties


ClassificationModel

Name of classification model, specified as a comma-separated pair consisting of 'ClassificationModel' and the character vector 'UprightPeople_128x64' or 'UprightPeople_96x48'. The pixel dimensions indicate the image size used for training.

The images used to train the models include background pixels around the person. Therefore, the actual size of a detected person is smaller than the training image size.

ClassificationThreshold

People classification threshold, specified as a comma-separated pair consisting of 'ClassificationThreshold' and a nonnegative scalar value. Use this threshold to control the classification of individual image subregions during multiscale detection. The threshold controls whether a subregion gets classified as a person. You can increase this value when there are many false detections. The higher the threshold value, the more stringent the requirements are for the classification. Vary the threshold over a range of values to find the optimum value for your data set. Typical values range from 0 to 4. This property is tunable.
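For example, because the property is tunable, you can tighten it between calls to step (a minimal sketch; the value 2 is illustrative):

peopleDetector = vision.PeopleDetector;
peopleDetector.ClassificationThreshold = 2;   % stricter than the default
bboxes = step(peopleDetector,I);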

MinSize

Smallest region containing a person, specified as a comma-separated pair consisting of 'MinSize' and a two-element [height width] vector. Set this property in pixels for the minimum size region containing a person. When you know the minimum person size to detect, you can reduce computation time. To do so, set this property to a value larger than the image size used to train the classification model. When you do not specify this property, the detector sets it to the image size used to train the classification model. This property is tunable.

MaxSize

Largest region that contains a person, specified as a comma-separated pair consisting of 'MaxSize' and a two-element [height width] vector. Set this property in pixels for the largest region containing a person. When you know the maximum person size to detect, you can reduce computation time. To do so, set this property to a value smaller than the size of the input image. When you do not specify this property, the detector sets it to the input image size. This property is tunable.
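For example, to restrict detection to a known range of person sizes (a minimal sketch; the sizes are illustrative, and MinSize must be at least the training image size):

peopleDetector = vision.PeopleDetector('UprightPeople_128x64', ...
    'MinSize',[160 80], ...    % ignore people smaller than 160-by-80 pixels
    'MaxSize',[480 240]);      % ignore people larger than 480-by-240 pixels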

ScaleFactor

Multiscale object detection scaling, specified as a comma-separated pair consisting of 'ScaleFactor' and a value greater than 1.0001. The scale factor incrementally scales the detection resolution between MinSize and MaxSize. You can set the scale factor to an ideal value using:

size(I)/(size(I)-0.5)

The object calculates the detection resolution at each increment.

round(TrainingSize*(ScaleFactor^N))

In this case, the TrainingSize is [128 64] for the 'UprightPeople_128x64' model and [96 48] for the 'UprightPeople_96x48' model. N is the increment. Decreasing the scale factor can increase the detection accuracy. However, doing so increases the computation time. This property is tunable.
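For example, this sketch lists the first few detection resolutions for the 'UprightPeople_128x64' model (the scale factor value is illustrative):

TrainingSize = [128 64];        % 'UprightPeople_128x64' model
ScaleFactor  = 1.05;            % illustrative value
for N = 0:3                     % first few increments
    disp(round(TrainingSize*(ScaleFactor^N)))
end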

WindowStride

Detection window stride in pixels, specified as a comma-separated pair consisting of 'WindowStride' and a scalar or a two-element [x y] vector. The object uses the window stride to slide the detection window across the image. When you specify this value as a vector, the first and second elements are the stride size in the x and y directions. When you specify this value as a scalar, the stride is the same for both x and y. Decreasing the window stride can increase the detection accuracy. However, doing so increases computation time. Increasing the window stride beyond [8 8] can lead to a greater number of missed detections. This property is tunable.
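For example, the following calls show the vector and scalar forms (the stride values are illustrative):

peopleDetector = vision.PeopleDetector('WindowStride',[4 4]);  % finer stride in x and y
peopleDetector = vision.PeopleDetector('WindowStride',8);      % scalar form, same stride in x and y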

MergeDetections

Merge detection control, specified as a comma-separated pair consisting of 'MergeDetections' and a logical scalar. This property controls whether similar detections are merged. Set this property to true to merge bounding boxes using a mean-shift based algorithm. Set this property to false to output the unmerged bounding boxes.

For more flexibility and control of merging parameters, you can use the selectStrongestBbox function in place of the MergeDetections algorithm. To do this, set the MergeDetections property to false. See the Tracking Pedestrians from a Moving Car example, which shows the use of the people detector and the selectStrongestBbox function.
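For example, a minimal sketch of custom merging with selectStrongestBbox (the overlap threshold value is illustrative):

peopleDetector = vision.PeopleDetector('MergeDetections',false);
[bboxes,scores] = step(peopleDetector,I);
[bboxes,scores] = selectStrongestBbox(bboxes,scores, ...
    'OverlapThreshold',0.5);    % illustrative overlap threshold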

UseROI

Use region of interest, specified as a comma-separated pair consisting of 'UseROI' and a logical scalar. Set this property to true to detect objects within a rectangular region of interest within the input image.

Methods

step

Detect upright people using HOG features
Common to All System Objects
clone

Create System object with same property values

getNumInputs

Expected number of inputs to a System object

getNumOutputs

Expected number of outputs of a System object

isLocked

Check locked states of a System object (logical)

release

Allow System object property value changes

Examples


Create a people detector and load the input image.

peopleDetector = vision.PeopleDetector;
I = imread('visionteam1.jpg');

Detect people using the people detector object.

[bboxes,scores] = step(peopleDetector,I);

Annotate detected people.

I = insertObjectAnnotation(I,'rectangle',bboxes,scores);
figure, imshow(I)
title('Detected people and detection scores');

References

Dalal, N., and B. Triggs. “Histograms of Oriented Gradients for Human Detection.” Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, June 2005, pp. 886–893.

Introduced in R2012b