trainCascadeObjectDetector

Train cascade object detector model

Syntax

  • trainCascadeObjectDetector(outputXMLFilename,positiveInstances,negativeImages)
  • trainCascadeObjectDetector(outputXMLFilename,'resume')
  • trainCascadeObjectDetector(___, Name,Value)

Description

trainCascadeObjectDetector(outputXMLFilename,positiveInstances,negativeImages) trains a cascade detector and writes it to the XML file named by outputXMLFilename. The file name must have an XML extension. For a more detailed explanation of how this function works, refer to Train a Cascade Object Detector.

trainCascadeObjectDetector(outputXMLFilename,'resume') resumes an interrupted training session. The outputXMLFilename input must match the output file name from the interrupted session. All arguments saved from the earlier session are reused automatically.

trainCascadeObjectDetector(___, Name,Value) uses additional options specified by one or more Name,Value pair arguments.

Examples

Train a Stop Sign Detector

This example shows you the steps involved in training a cascade object detector. It trains a 5-stage detector from a very small training set. In reality, an accurate detector requires many more stages and thousands of positive samples and negative images.

Load the positive samples data from a .mat file. The file names and bounding boxes are contained in an array of structures named 'data'.

     load('stopSigns.mat');

Add the image directory to the MATLAB path.

     imDir = fullfile(matlabroot, 'toolbox', 'vision', 'visiondemos','stopSignImages');
     addpath(imDir);

Specify the folder for negative images.

     negativeFolder = fullfile(matlabroot, 'toolbox', 'vision','visiondemos', 'non_stop_signs');

Train a cascade object detector called 'stopSignDetector.xml' using HOG features. The following command may take several minutes to run:

     trainCascadeObjectDetector('stopSignDetector.xml', data, negativeFolder, 'FalseAlarmRate', 0.2, 'NumCascadeStages', 5);

Automatically setting ObjectTrainingSize to [ 33, 32 ]
Using at most 86 of 86 positive samples per stage
Using at most 172 negative samples per stage

Training stage 1 of 5
[........................................................................]
Used 86 positive and 172 negative samples
Time to train stage 1: 1 seconds

Training stage 2 of 5
[........................................................................]
Used 86 positive and 172 negative samples
Time to train stage 2: 0 seconds

Training stage 3 of 5
[........................................................................]
Used 86 positive and 172 negative samples
Time to train stage 3: 1 seconds

Training stage 4 of 5
[........................................................................]
Used 86 positive and 172 negative samples
Time to train stage 4: 3 seconds

Training stage 5 of 5
[........................................................................]
Used 86 positive and 172 negative samples
Time to train stage 5: 11 seconds

Training complete

Use the newly trained classifier to detect a stop sign in an image.

    detector = vision.CascadeObjectDetector('stopSignDetector.xml');

Read the test image.

    img = imread('stopSignTest.jpg');

Detect a stop sign.

    bbox = step(detector, img);

Insert bounding boxes and return marked image.

    detectedImg = insertObjectAnnotation(img, 'rectangle', bbox, 'stop sign');

Display the detected stop sign.

    figure; imshow(detectedImg);

Remove the image directory from the path.

    rmpath(imDir);

Input Arguments

positiveInstances — Positive samples
array of structs

Positive samples, specified as an array of structs. Each struct contains an image file name, given as a string, and an M-by-4 matrix of bounding boxes specifying object locations in that image. You can use the trainingImageLabeler app to label objects of interest with bounding boxes. The app outputs an array of structs to use for positiveInstances.

The struct fields are defined as follows:

imageFilename: A string that specifies the image name. The image can be true color, grayscale, or indexed, in any of the formats supported by imread.

objectBoundingBoxes: An M-by-4 matrix of M bounding boxes. Each bounding box is in the format [x y width height] and specifies an object location in the corresponding image.

The function automatically determines the number of positive samples to use at each of the cascade stages. This value is based on the number of stages and the true positive rate. The true positive rate specifies how many positive samples can be misclassified.
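If you build positiveInstances by hand rather than with the trainingImageLabeler app, the struct layout described above can be sketched as follows. The file names and box coordinates are hypothetical placeholders, not files shipped with the toolbox; only the two field names come from this page.

```matlab
% Minimal sketch: a 1-by-2 struct array with the two documented fields.
% File names and coordinates are made up for illustration.
positiveInstances = struct( ...
    'imageFilename',       {'stopSign001.jpg', 'stopSign002.jpg'}, ...
    'objectBoundingBoxes', {[120 85 40 40], [30 60 35 36; 210 75 28 28]});

% Sanity check: every entry needs an M-by-4 matrix of [x y width height] rows.
for k = 1:numel(positiveInstances)
    boxes = positiveInstances(k).objectBoundingBoxes;
    assert(size(boxes, 2) == 4, 'Entry %d is not M-by-4.', k);
end

% Total number of labeled objects across all images.
numObjects = sum(arrayfun(@(s) size(s.objectBoundingBoxes, 1), positiveInstances));
```

The second entry holds two rows, so one image can contribute several positive samples.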

Data Types: struct

negativeImages — Negative images
cell array | string

Negative images, specified as either a path to a folder containing images or as a cell array of image file names. Because the images are used to generate negative samples, they must not contain any objects of interest. Instead, they should contain backgrounds associated with the object.
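The two accepted forms might look like the sketch below. The folder is the one used in the stop sign example on this page. Treating every non-directory entry in that folder as an image is an assumption of this sketch, so filter the listing if the folder holds other files.

```matlab
% Form 1: a path to a folder of negative images (path taken from the
% stop sign example on this page).
negativeFolder = fullfile(matlabroot, 'toolbox', 'vision', ...
    'visiondemos', 'non_stop_signs');

% Form 2: a cell array of image file names, here built from the same folder.
% Assumption: every regular file in the folder is an image.
listing = dir(negativeFolder);
listing = listing(~[listing.isdir]);          % drop '.' and '..' and subfolders
negativeFiles = cellfun(@(name) fullfile(negativeFolder, name), ...
    {listing.name}, 'UniformOutput', false);
```

Either negativeFolder or negativeFiles can then be passed as the negativeImages argument.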

Data Types: char | cell

outputXMLFilename — Trained cascade detector file name
string

Trained cascade detector file name, specified as a string with an XML extension.

Data Types: char

Name-Value Pair Arguments

Specify optional comma-separated pairs of Name,Value arguments. Name is the argument name and Value is the corresponding value. Name must appear inside single quotes (' '). You can specify several name and value pair arguments in any order as Name1,Value1,...,NameN,ValueN.

Example: 'FeatureType','Haar' specifies Haar for the type of features to use.

'ObjectTrainingSize' — Object size for training
'Auto' (default) | 2-element vector

Training object size, specified as the comma-separated pair consisting of 'ObjectTrainingSize' and either a 2-element vector [height, width] or the string 'Auto'. Before training, the function resizes the positive and negative samples to ObjectTrainingSize in pixels. If you select 'Auto', the function determines the size automatically based on the median width-to-height ratio of the positive instances. For optimal detection accuracy, specify an object training size close to the expected size of the object in the image. However, for faster training and detection, set the object training size to be smaller than the expected size of the object in the image.

Data Types: char | single | double | int8 | int16 | int32 | int64 | uint8 | uint16 | uint32 | uint64

'NegativeSamplesFactor' — Negative sample factor
2 (default) | real-valued scalar

Negative sample factor, specified as the comma-separated pair consisting of 'NegativeSamplesFactor' and a real-valued scalar. The number of negative samples to use at each stage is equal to

NegativeSamplesFactor × [the number of positive samples used at each stage].
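As a quick check of this relationship, the sketch below reproduces the counts printed in the stop sign example, which uses the default factor of 2 with 86 positive samples per stage. The variable names are illustrative only, and the sketch does not cover how the function handles non-integer products.

```matlab
% Values taken from the stop sign example output on this page.
numPositivePerStage   = 86;   % "Using at most 86 of 86 positive samples per stage"
negativeSamplesFactor = 2;    % documented default

% Negative samples per stage = factor x positive samples per stage.
numNegativePerStage = negativeSamplesFactor * numPositivePerStage;

fprintf('Using at most %d negative samples per stage\n', numNegativePerStage);
```

The result, 172, matches the "Using at most 172 negative samples per stage" line in the training log.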

Data Types: single | double | int8 | int16 | int32 | int64 | uint8 | uint16 | uint32 | uint64

'NumCascadeStages' — Number of cascade stages
20 (default) | positive integer

Number of cascade stages to train, specified as the comma-separated pair consisting of 'NumCascadeStages' and a positive integer. Increasing the number of stages may result in a more accurate detector but also increases training time. More stages may require more training images, because at each stage some number of positive and negative samples may be eliminated. This number depends on the FalseAlarmRate and the TruePositiveRate. More stages may also allow you to increase the FalseAlarmRate. See the Train a Cascade Object Detector tutorial for more details.

Data Types: single | double | int8 | int16 | int32 | int64 | uint8 | uint16 | uint32 | uint64

'FalseAlarmRate' — Acceptable false alarm rate
0.5 (default) | value in the range (0 1]

Acceptable false alarm rate at each stage, specified as the comma-separated pair consisting of 'FalseAlarmRate' and a value in the range (0 1]. The false alarm rate is the fraction of negative training samples incorrectly classified as positive samples.

The overall false alarm rate is calculated using the FalseAlarmRate per stage and the number of cascade stages, NumCascadeStages:

FalseAlarmRate^NumCascadeStages

Lower values for FalseAlarmRate increase complexity of each stage. Increased complexity can achieve fewer false detections but may result in longer training and detection times. Higher values for FalseAlarmRate may require a greater number of cascade stages to achieve reasonable detection accuracy.
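To make the effect of the exponent concrete, the following sketch evaluates the overall false alarm rate for the settings used in the stop sign example (0.2 per stage, 5 stages) and for the documented defaults (0.5 per stage, 20 stages). The variable names are illustrative and are not part of the function's interface.

```matlab
% Settings from the stop sign example on this page.
falseAlarmRate   = 0.2;
numCascadeStages = 5;
overallExample   = falseAlarmRate ^ numCascadeStages;   % 0.2^5 = 3.2e-4

% Documented defaults: 0.5 per stage, 20 stages.
overallDefault = 0.5 ^ 20;                               % about 9.5e-7

fprintf('Example overall false alarm rate: %g\n', overallExample);
fprintf('Default overall false alarm rate: %g\n', overallDefault);
```

Even with a lenient per-stage rate of 0.5, twenty stages push the overall target below one in a million, which is why a higher FalseAlarmRate is usually paired with more stages.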

Data Types: single | double

'TruePositiveRate' — Minimum true positive rate
0.995 (default) | value in the range (0 1]

Minimum true positive rate required at each stage, specified as the comma-separated pair consisting of 'TruePositiveRate' and a value in the range (0 1]. The true positive rate is the fraction of correctly classified positive training samples.

The overall target true positive rate is calculated using the TruePositiveRate per stage and the number of cascade stages, NumCascadeStages:

TruePositiveRate^NumCascadeStages

Higher values for TruePositiveRate increase complexity of each stage. Increased complexity can achieve a greater number of correct detections but may result in longer training and detection times.
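The same compounding applies to the true positive rate. This sketch evaluates the overall target for the documented defaults (0.995 per stage, 20 stages). The variable names are illustrative only.

```matlab
% Documented defaults.
truePositiveRate = 0.995;
numCascadeStages = 20;

% Overall target true positive rate across the whole cascade.
overallTruePositiveRate = truePositiveRate ^ numCascadeStages;   % about 0.905

% Fraction of positive training samples the cascade may miss overall.
overallMissRate = 1 - overallTruePositiveRate;                   % about 0.095

fprintf('Overall true positive rate: %.4f\n', overallTruePositiveRate);
```

A per-stage rate that looks strict (0.995) still lets roughly 9.5% of positive samples be rejected over 20 stages, so deeper cascades generally call for a per-stage rate close to 1.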

Data Types: single | double

'FeatureType' — Feature type
'HOG' (default) | 'LBP' | 'Haar'

Feature type, specified as the comma-separated pair consisting of 'FeatureType' and one of three strings. The possible feature types are:

'Haar': Haar-like features [1]
'LBP': Local Binary Patterns [2]
'HOG': Histogram of Oriented Gradients [3]

The function allocates a large amount of memory, especially for Haar features. To avoid running out of memory, use this function on a 64-bit operating system with a sufficient amount of RAM.

Data Types: char

More About

Tips

Training a good detector requires thousands of training samples. Processing that much data can take hours or even days. During training, the function displays the time it took to train each stage in the MATLAB® command window.

References

[1] Viola, P., and M. J. Jones, "Rapid Object Detection using a Boosted Cascade of Simple Features". Proceedings of the 2001 IEEE Computer Society Conference. Volume 1, 15 April 2001, pp. I-511–I-518.

[2] Ojala, T., M. Pietikainen, and T. Maenpaa, "Multiresolution Gray-scale and Rotation Invariant Texture Classification With Local Binary Patterns". IEEE Transactions on Pattern Analysis and Machine Intelligence. Volume 24, No. 7, July 2002, pp. 971–987.

[3] Dalal, N., and B. Triggs, "Histograms of Oriented Gradients for Human Detection". IEEE Computer Society Conference on Computer Vision and Pattern Recognition. Volume 1, 2005, pp. 886–893.
