Train cascade object detector model
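trainCascadeObjectDetector(outputXMLFilename,positiveInstances,negativeImages)
writes a trained cascade detector XML file with the name outputXMLFilename.
The name specified by the outputXMLFilename input
must have an XML extension. For a more detailed explanation of how
this function works, refer to Train a Cascade Object Detector.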
trainCascadeObjectDetector(outputXMLFilename,'resume') resumes
an interrupted training session. The outputXMLFilename input
must match the output file name from the interrupted session. All
arguments saved from the earlier session are reused automatically.
This example shows you the steps involved in training a cascade object detector. It trains a 5-stage detector from a very small training set. In reality, an accurate detector requires many more stages and thousands of positive samples and negative images.
Load the positive samples data from a .mat file. The file names and bounding boxes are contained in an array of structures named 'data'.
Add the images location to the MATLAB path.
imDir = fullfile(matlabroot,'toolbox','vision','visiondata','stopSignImages');
addpath(imDir);
Specify the folder for negative images.
negativeFolder = fullfile(matlabroot,'toolbox','vision','visiondata','nonStopSigns');
Train a cascade object detector called 'stopSignDetector.xml' using HOG features. The following command may take several minutes to run:
Automatically setting ObjectTrainingSize to [ 35, 32 ]
Using at most 42 of 42 positive samples per stage
Using at most 84 negative samples per stage
Training stage 1 of 5
[........................................................................]
Used 42 positive and 84 negative samples
Time to train stage 1: 1 seconds
Training stage 2 of 5
[........................................................................]
Used 42 positive and 84 negative samples
Time to train stage 2: 0 seconds
Training stage 3 of 5
[........................................................................]
Used 42 positive and 84 negative samples
Time to train stage 3: 1 seconds
Training stage 4 of 5
[........................................................................]
Used 42 positive and 84 negative samples
Time to train stage 4: 4 seconds
Training stage 5 of 5
[....................................................
Very low false alarm rate 0.000308187 reached in stage.
Training will halt and return cascade detector with 4 stages
Time to train stage 5: 12 seconds
Training complete
Use the newly trained classifier to detect a stop sign in an image.
detector = vision.CascadeObjectDetector('stopSignDetector.xml');
Read the test image.
img = imread('stopSignTest.jpg');
Detect a stop sign.
bbox = step(detector,img);
Insert bounding boxes and return marked image.
detectedImg = insertObjectAnnotation(img,'rectangle',bbox,'stop sign');
Display the detected stop sign.
figure; imshow(detectedImg);
Remove the image directory from the path.
rmpath(imDir);
positiveInstances— Positive samples
array of structs
Positive samples, specified as an array of structs containing
string image file names and an M-by-4 matrix of
bounding boxes specifying object locations in the images. You can
use the Training Image Labeler app
to label objects of interest with bounding boxes. The app outputs
an array of structs to use for positiveInstances. The
struct fields are defined as follows:
imageFilename | A string that specifies the image name. The image can be true color, grayscale, or indexed, in any of the formats supported by imread.
objectBoundingBoxes | An M-by-4 matrix of M bounding boxes. Each bounding box is in the format [x y width height] and specifies an object location in the corresponding image.
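As a rough illustration of the layout described above, the following sketch builds a two-element struct array by hand. The field names imageFilename and objectBoundingBoxes follow the function's documented struct layout; the file names and box coordinates are hypothetical and stand in for output you would normally get from the Training Image Labeler app.

```matlab
% Hypothetical file names and box coordinates, for illustration only.
% Each box is [x y width height]; the second image contains two objects.
positiveInstances = struct( ...
    'imageFilename', {'sign01.jpg', 'sign02.jpg'}, ...
    'objectBoundingBoxes', {[10 20 35 32], [5 8 40 36; 100 50 38 34]});

% Check that every element holds an M-by-4 matrix of boxes.
for k = 1:numel(positiveInstances)
    boxes = positiveInstances(k).objectBoundingBoxes;
    assert(size(boxes, 2) == 4, 'Boxes must be M-by-4.');
    fprintf('%s: %d box(es)\n', positiveInstances(k).imageFilename, size(boxes, 1));
end
```

Passing cell arrays to struct creates one struct element per cell, which is why the result is a 1-by-2 array rather than a single struct.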
The function automatically determines the number of positive samples to use at each of the cascade stages. This value is based on the number of stages and the true positive rate. The true positive rate specifies how many positive samples can be misclassified.
negativeImages— Negative images
cell array | string
Negative images, specified as either a path to a folder containing images or as a cell array of image file names. Because the images are used to generate negative samples, they must not contain any objects of interest. Instead, they should contain backgrounds associated with the object.
outputXMLFilename— Trained cascade detector file name
string
Trained cascade detector file name, specified as a string with an XML extension.
Specify optional comma-separated pairs of Name,Value arguments.
Name is the argument name and
Value is the corresponding value.
Name must appear
inside single quotes (' ').
You can specify several name and value pair
arguments in any order as Name1,Value1,...,NameN,ValueN.
Example: 'FeatureType','Haar' specifies Haar for the type of features to use.
'ObjectTrainingSize'— Object size for training
'Auto'(default) | 2-element vector
Training object size, specified as the comma-separated pair
consisting of 'ObjectTrainingSize' and either
a 2-element vector [height, width]
or the string
'Auto'. Before training, the function
resizes the positive and negative samples to ObjectTrainingSize
pixels. If you select
'Auto', the function determines
the size automatically based on the median width-to-height ratio of
the positive instances. For optimal detection accuracy, specify an
object training size close to the expected size of the object in the
image. However, for faster training and detection, set the object
training size to be smaller than the expected size of the object in
the image.
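The statistic that the 'Auto' setting is based on can be sketched as follows. This only computes the median width-to-height ratio of a set of bounding boxes; how the function converts that ratio into an actual [height, width] size is not described here, so that step is left out. The box values are hypothetical.

```matlab
% Hypothetical [x y width height] boxes collected from all positive samples.
boxes = [10 20 35 32; 5 8 40 36; 100 50 38 34];

% Width-to-height ratio of each box, then the median across boxes.
aspectRatios = boxes(:, 3) ./ boxes(:, 4);
medianRatio = median(aspectRatios);
fprintf('Median width-to-height ratio: %.3f\n', medianRatio);
```

A ratio far from the one implied by a manually chosen ObjectTrainingSize suggests that resizing will distort the samples.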
'NegativeSamplesFactor'— Negative sample factor
2(default) | real-valued scalar
Negative sample factor, specified as the comma-separated pair
consisting of 'NegativeSamplesFactor' and a real-valued
scalar. The number of negative samples to use at each stage is equal
to NegativeSamplesFactor x [the
number of positive samples used at each stage].
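As a quick check of this relationship, the sketch below reproduces the per-stage numbers from the example training log (42 positive samples, factor left at its default of 2). The variable names are illustrative, and any rounding the function applies to non-integer products is not covered here.

```matlab
negativeSamplesFactor = 2;    % default value
numPositivePerStage = 42;     % taken from the example training log

% Upper limit on negative samples drawn at each stage.
numNegativePerStage = negativeSamplesFactor * numPositivePerStage;
fprintf('At most %d negative samples per stage\n', numNegativePerStage);
```

The result matches the "Using at most 84 negative samples per stage" line in the log.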
'NumCascadeStages'— Number of cascade stages
20(default) | positive integer
Number of cascade stages to train, specified as the comma-separated
pair consisting of 'NumCascadeStages' and a positive
integer. Increasing the number of stages may result in a more accurate
detector but also increases training time. More stages may require
more training images, because at each stage, some number of positive
and negative samples may be eliminated. This value depends on the
FalseAlarmRate and the
TruePositiveRate. More stages may also allow
you to increase the
FalseAlarmRate. See the Train a Cascade Object Detector tutorial
for more details.
'FalseAlarmRate'— Acceptable false alarm rate
0.5(default) | value in the range (0 1]
Acceptable false alarm rate at each stage, specified as the
comma-separated pair consisting of 'FalseAlarmRate'
and a value in the range (0 1]. The false alarm rate is the fraction
of negative training samples incorrectly classified as positive samples.
The overall false alarm rate is calculated using the
FalseAlarmRate per
stage and the number of cascade stages,
NumCascadeStages:
FalseAlarmRate^NumCascadeStages
Lower values for
FalseAlarmRate increase complexity
of each stage. Increased complexity can achieve fewer false detections
but may result in longer training and detection times. Higher values
for
FalseAlarmRate may require a greater number
of cascade stages to achieve reasonable detection accuracy.
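To get a feel for how the per-stage rate and the stage count interact, the sketch below assumes the per-stage rates compound multiplicatively across stages, as in the boosted cascade of Viola and Jones. It uses the default per-stage rate of 0.5 and the default of 20 stages; the variable names are illustrative.

```matlab
falseAlarmRate = 0.5;      % per-stage default
numCascadeStages = 20;     % default number of stages

% Overall rate under the assumption that stage rates multiply.
overallFalseAlarmRate = falseAlarmRate ^ numCascadeStages;
fprintf('Overall false alarm rate: %.3g\n', overallFalseAlarmRate);
```

With these values, the overall rate is below one in a million, which is why a fairly permissive per-stage rate can still yield a selective detector when enough stages are trained.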
'TruePositiveRate'— Minimum true positive rate
0.995(default) | value in the range (0 1]
Minimum true positive rate required at each stage, specified
as the comma-separated pair consisting of 'TruePositiveRate'
and a value in the range (0 1]. The true positive rate is the fraction
of correctly classified positive training samples.
The overall resulting target positive rate is calculated using
TruePositiveRate per stage and the number
of cascade stages,
NumCascadeStages:
TruePositiveRate^NumCascadeStages
Higher values for
TruePositiveRate increase complexity
of each stage. Increased complexity can achieve a greater number of
correct detections but may result in longer training and detection
times.
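The effect of the stage count on the overall target rate can be sketched in the same way, again assuming that per-stage rates multiply across stages. The per-stage rate of 0.995 and the list of stage counts are illustrative choices for the comparison.

```matlab
truePositiveRate = 0.995;     % assumed per-stage rate
stageCounts = [5 10 20];      % hypothetical cascade lengths to compare

% Overall target rate for each cascade length.
overallRates = truePositiveRate .^ stageCounts;
for k = 1:numel(stageCounts)
    fprintf('%2d stages -> overall true positive rate %.4f\n', ...
        stageCounts(k), overallRates(k));
end
```

Because the overall rate shrinks with every added stage, a long cascade needs a per-stage rate very close to 1 to keep most true detections.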
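'FeatureType'— Feature type
'HOG'(default) | 'LBP' | 'Haar'
Feature type, specified as the comma-separated pair consisting
of 'FeatureType' and one of three strings. The
possible feature types are:
'Haar': Haar-like features [1]
'LBP': Local Binary Patterns [2]
'HOG': Histogram of Oriented Gradients [3]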
The function allocates a large amount of memory, especially for Haar features. To avoid running out of memory, use this function on a 64-bit operating system with a sufficient amount of RAM.
Training a good detector requires thousands of training samples. Processing time for a large amount of data varies. It is likely to take on the order of hours or even days. During training, the function displays the time it took to train each stage in the MATLAB® command window.
[1] Viola, P., and M. J. Jones, "Rapid Object Detection using a Boosted Cascade of Simple Features". Proceedings of the 2001 IEEE Computer Society Conference. Volume 1, 15 April 2001, pp. I-511–I-518.
[2] Ojala, T., M. Pietikainen, and T. Maenpaa, "Multiresolution Gray-scale and Rotation Invariant Texture Classification With Local Binary Patterns". IEEE Transactions on Pattern Analysis and Machine Intelligence. Volume 24, No. 7, July 2002, pp. 971–987.
[3] Dalal, N., and B. Triggs, "Histograms of Oriented Gradients for Human Detection". IEEE Computer Society Conference on Computer Vision and Pattern Recognition. Volume 1, 2005, pp. 886–893.