# trainFasterRCNNObjectDetector

Train a Faster R-CNN deep learning object detector

## Syntax

``trainedDetector = trainFasterRCNNObjectDetector(trainingData,network,options)``
``trainedDetector = trainFasterRCNNObjectDetector(trainingData,checkpoint,options)``
``trainedDetector = trainFasterRCNNObjectDetector(trainingData,detector,options)``
``trainedDetector = trainFasterRCNNObjectDetector(___,Name,Value)``

## Description

`trainedDetector = trainFasterRCNNObjectDetector(trainingData,network,options)` trains a Faster R-CNN (regions with convolutional neural networks) object detector using deep learning. You can train a Faster R-CNN detector to detect multiple object classes. Specify your ground truth training data, your network, and training options. The network can be a pretrained series network such as `alexnet` or `vgg16` for training using transfer learning, or you can train a network from scratch using an array of `Layer` objects with uninitialized weights.

This function requires Neural Network Toolbox™. Parallel Computing Toolbox™ is also recommended, for use with a CUDA®-enabled NVIDIA® GPU with compute capability 3.0 or higher.

`trainedDetector = trainFasterRCNNObjectDetector(trainingData,checkpoint,options)` resumes training from a detector checkpoint.

`trainedDetector = trainFasterRCNNObjectDetector(trainingData,detector,options)` continues training a detector with additional training data or performs more training iterations to improve detector accuracy.

`trainedDetector = trainFasterRCNNObjectDetector(___,Name,Value)` uses additional options specified by one or more `Name,Value` pair arguments and any of the previous inputs.

## Examples

Load training data.

```
data = load('fasterRCNNVehicleTrainingData.mat');
trainingData = data.vehicleTrainingData;

trainingData.imageFilename = fullfile(toolboxdir('vision'),'visiondata', ...
    trainingData.imageFilename);
```

Set up network layers.

```
layers = data.layers
```

```
layers = 
  11x1 Layer array with layers:

     1   'imageinput'    Image Input             32x32x3 images with 'zerocenter' normalization
     2   'conv_1'        Convolution             32 3x3x3 convolutions with stride [1 1] and padding [1 1]
     3   'relu_1'        ReLU                    ReLU
     4   'conv_2'        Convolution             32 3x3x32 convolutions with stride [1 1] and padding [1 1]
     5   'relu_2'        ReLU                    ReLU
     6   'maxpool'       Max Pooling             3x3 max pooling with stride [2 2] and padding [0 0]
     7   'fc_1'          Fully Connected         64 fully connected layer
     8   'relu_3'        ReLU                    ReLU
     9   'fc_2'          Fully Connected         2 fully connected layer
    10   'softmax'       Softmax                 softmax
    11   'classoutput'   Classification Output   crossentropyex with classes 'vehicle' and 'Background'
```

Configure training options.

• Lower the `InitialLearnRate` to reduce the rate at which network parameters are changed.

• Set the CheckpointPath to save detector checkpoints to a temporary directory. Change this to another location if required.

• Set MaxEpochs to 1 to reduce example training time. Increase this to 10 for proper training.

```
options = trainingOptions('sgdm', ...
    'InitialLearnRate', 1e-6, ...
    'MaxEpochs', 1, ...
    'CheckpointPath', tempdir);
```

Train detector. Training will take a few minutes.

```
detector = trainFasterRCNNObjectDetector(trainingData, layers, options)
```

```
*************************************************************************
Training a Faster R-CNN Object Detector for the following object classes:

* vehicle

Step 1 of 4: Training a Region Proposal Network (RPN).
|=========================================================================================|
|     Epoch    |   Iteration  | Time Elapsed |  Mini-batch  |  Mini-batch  | Base Learning|
|              |              |  (seconds)   |     Loss     |   Accuracy   |     Rate     |
|=========================================================================================|
|            1 |            1 |         0.18 |       0.7672 |       37.50% |       0.0000 |
|            1 |           50 |         8.18 |       0.7609 |      100.00% |       0.0000 |
|            1 |          100 |        16.75 |       0.6738 |      100.00% |       0.0000 |
|            1 |          150 |        25.25 |       0.4133 |      100.00% |       0.0000 |
|            1 |          200 |        33.82 |       1.2629 |       50.00% |       0.0000 |
|            1 |          250 |        43.45 |       0.6916 |       50.00% |       0.0000 |
|            1 |          295 |        52.21 |       0.2594 |      100.00% |       0.0000 |
|=========================================================================================|

Step 2 of 4: Training a Fast R-CNN Network using the RPN from step 1.
*******************************************************************
Training a Fast R-CNN Object Detector for the following object classes:

* vehicle

--> Extracting region proposals from 295 training images...done.

|=========================================================================================|
|     Epoch    |   Iteration  | Time Elapsed |  Mini-batch  |  Mini-batch  | Base Learning|
|              |              |  (seconds)   |     Loss     |   Accuracy   |     Rate     |
|=========================================================================================|
|            1 |            1 |         0.09 |       0.7288 |       85.71% |       0.0000 |
|            1 |           50 |         3.82 |       0.4308 |       86.96% |       0.0000 |
|            1 |          100 |         7.62 |       0.7009 |       69.57% |       0.0000 |
|            1 |          150 |        11.39 |       0.1261 |       95.24% |       0.0000 |
|            1 |          200 |        15.40 |       0.1939 |      100.00% |       0.0000 |
|            1 |          215 |        16.41 |       0.5769 |       76.19% |       0.0000 |
|=========================================================================================|

Step 3 of 4: Re-training RPN using weight sharing with Fast R-CNN.
|=========================================================================================|
|     Epoch    |   Iteration  | Time Elapsed |  Mini-batch  |  Mini-batch  | Base Learning|
|              |              |  (seconds)   |     Loss     |   Accuracy   |     Rate     |
|=========================================================================================|
|            1 |            1 |         0.17 |       0.5654 |      100.00% |       0.0000 |
|            1 |           50 |         7.56 |       0.3988 |      100.00% |       0.0000 |
|            1 |          100 |        16.09 |       0.2646 |      100.00% |       0.0000 |
|            1 |          150 |        23.77 |       0.8722 |       50.78% |       0.0000 |
|            1 |          200 |        31.83 |       0.3936 |      100.00% |       0.0000 |
|            1 |          250 |        40.20 |       0.4404 |       92.91% |       0.0000 |
|            1 |          295 |        49.23 |       0.7402 |       50.00% |       0.0000 |
|=========================================================================================|

Step 4 of 4: Re-training Fast R-CNN using updated RPN.
*******************************************************************
Training a Fast R-CNN Object Detector for the following object classes:

* vehicle

--> Extracting region proposals from 295 training images...done.

|=========================================================================================|
|     Epoch    |   Iteration  | Time Elapsed |  Mini-batch  |  Mini-batch  | Base Learning|
|              |              |  (seconds)   |     Loss     |   Accuracy   |     Rate     |
|=========================================================================================|
|            1 |            1 |         0.09 |       0.5196 |       72.22% |       0.0000 |
|            1 |           50 |         4.18 |       0.3972 |       72.73% |       0.0000 |
|            1 |          100 |         9.31 |       0.4054 |       87.18% |       0.0000 |
|            1 |          150 |        14.96 |       0.1904 |       96.43% |       0.0000 |
|            1 |          200 |        18.36 |       0.1835 |       96.43% |       0.0000 |
|            1 |          212 |        19.18 |       0.2212 |       95.24% |       0.0000 |
|=========================================================================================|

Finished training Faster R-CNN object detector.

detector = 
  fasterRCNNObjectDetector with properties:

                ModelName: 'vehicle'
                  Network: [1×1 vision.cnn.FastRCNN]
    RegionProposalNetwork: [1×1 vision.cnn.RegionProposalNetwork]
              MinBoxSizes: [16 21]
          BoxPyramidScale: 2
      NumBoxPyramidLevels: 5
               ClassNames: {'vehicle' 'Background'}
            MinObjectSize: [15 15]
```

Test the Faster R-CNN detector on a test image.

```img = imread('highway.png'); ```

Run detector.

```[bbox, score, label] = detect(detector, img); ```

Display detection results.

```
detectedImg = insertShape(img, 'Rectangle', bbox);
figure
imshow(detectedImg)
```
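The `score` output of `detect` can be used to discard weak detections before drawing them. The following is a minimal sketch, not part of the original example: the `bbox` and `score` values are hypothetical stand-ins for real detector output, and the threshold of 0.5 is an arbitrary illustrative choice.

```matlab
% Hypothetical detection results standing in for the output of detect().
bbox  = [50 60 40 30; 120 80 45 35; 200 90 20 15];   % one [x y width height] row per box
score = [0.92; 0.35; 0.71];                          % one confidence score per box

% Keep only detections whose score meets an illustrative threshold.
scoreThreshold = 0.5;              % assumed value; tune it for your data
keep = score >= scoreThreshold;    % logical index of confident detections
bboxKept  = bbox(keep,:);
scoreKept = score(keep);
```

`bboxKept` can then be passed to `insertShape` in place of `bbox`.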

## Input Arguments

`trainingData`: Labeled ground truth images, specified as a table with two or more columns. The first column must contain paths and file names to grayscale or truecolor (RGB) images. The remaining columns must contain bounding boxes related to the corresponding image. Each column represents a single object class, such as a car, dog, flower, or stop sign.

Each bounding box must be in the format [x y width height]. The format specifies the upper-left corner location and size of the object in the corresponding image. The table variable name defines the object class name. To create the ground truth table, use the Training Image Labeler app.
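If you build the table by hand rather than with the app, it might look like the sketch below. The file names and box coordinates are hypothetical, and storing each image's boxes as an M-by-4 matrix inside a cell is an assumption about the layout that is consistent with the description above.

```matlab
% Hypothetical image files and boxes, for illustration only.
imageFilename = {'images/car001.png'; 'images/car002.png'};

% One cell per image; each row of the M-by-4 matrix is [x y width height].
% The second image contains two vehicles.
vehicle = {[10 20 50 40]; [30 35 60 45; 100 80 55 42]};

% The table variable name 'vehicle' becomes the object class name.
trainingData = table(imageFilename, vehicle);
```

To add another object class, add another table variable with one cell of boxes per image.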

`network`: Network, specified as a `SeriesNetwork` object or as an array of `Layer` objects. For example:

```
layers = [imageInputLayer([28 28 3])
        convolution2dLayer([5 5],10)
        reluLayer()
        fullyConnectedLayer(10)
        softmaxLayer()
        classificationLayer()];
```

The network is trained to classify the object classes defined in the `trainingData` table.

When the network is a `SeriesNetwork` object, the function adjusts the network layers to support the number of object classes defined within the specified `trainingData`. The background is added as an additional class.

When the network is an array of `Layer` objects, the network must have a classification layer that supports the number of object classes, plus a background class. Use this input type to customize the learning rates of each layer.

The function replaces the last `averagePooling2dLayer` or `maxPooling2dLayer` with an ROI pooling layer.

### Note

`trainFasterRCNNObjectDetector` does not support DAG networks, such as ResNet-50, Inception-v3, or GoogLeNet. Additionally, you cannot pass a Layers array from a DAG network to the training function, because the Layers property from a DAG network does not contain the connection information.

`options`: Training parameters of the neural network, specified using the `trainingOptions` function. When you specify a single set of training options, the function uses those options for all four training stages. When you specify an array of four options, each stage uses its own set of options. To create an array of four options, assign the `trainingOptions` function output to each element.

```
options(1) = trainingOptions('sgdm');
options(2) = trainingOptions('sgdm');
options(3) = trainingOptions('sgdm');
options(4) = trainingOptions('sgdm');
```
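An array of four options lets each training stage use different settings. The sketch below is illustrative only: the learning rates and epoch count are assumed values, not recommendations from this page, and the stage comments follow the four steps printed in the example training output.

```matlab
% Illustrative settings only; requires Neural Network Toolbox.
% Step 1 of 4: train the region proposal network (RPN).
options(1) = trainingOptions('sgdm', 'InitialLearnRate', 1e-5, 'MaxEpochs', 10);
% Step 2 of 4: train Fast R-CNN using the RPN from step 1.
options(2) = trainingOptions('sgdm', 'InitialLearnRate', 1e-5, 'MaxEpochs', 10);
% Step 3 of 4: re-train the RPN using weight sharing with Fast R-CNN.
options(3) = trainingOptions('sgdm', 'InitialLearnRate', 1e-6, 'MaxEpochs', 10);
% Step 4 of 4: re-train Fast R-CNN using the updated RPN.
options(4) = trainingOptions('sgdm', 'InitialLearnRate', 1e-6, 'MaxEpochs', 10);
```

Lowering the rate for the two re-training stages is one possible design choice; whether it helps depends on your data.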

To fine-tune a pretrained network for detection, lower the initial learning rate to avoid changing the model parameters too rapidly. For example:

```
options = trainingOptions('sgdm', ...
    'InitialLearnRate',1e-6, ...
    'CheckpointPath',tempdir);

detector = trainFasterRCNNObjectDetector(trainingData,network,options);
```

To save the detector after every epoch, set the `'CheckpointPath'` property when using the `trainingOptions` function. Saving a checkpoint after every epoch is recommended because network training can take a few hours.

### Note

`trainFasterRCNNObjectDetector` does not support these training options:

• The `ExecutionEnvironment` values: `'multi-gpu'` or `'parallel'`

• The `Plots` value: `'training-progress'`

• The `ValidationData`, `ValidationFrequency`, or `ValidationPatience` options

`checkpoint`: Saved detector checkpoint, specified as a `fasterRCNNObjectDetector` object. To save the detector after every epoch, set the `'CheckpointPath'` property when using the `trainingOptions` function. Saving a checkpoint after every epoch is recommended because network training can take a few hours.

To load a checkpoint for a previously trained detector, load the MAT-file from the checkpoint path. For example, if the `'CheckpointPath'` property of `options` is `'/tmp'`, load a checkpoint MAT-file using:

`data = load('/tmp/faster_rcnn_checkpoint__105__2016_11_18__14_25_08.mat');`

The name of the MAT-file includes the iteration number and timestamp of when the detector checkpoint was saved. The detector is saved in the `detector` variable of the file. Pass the loaded detector back into the `trainFasterRCNNObjectDetector` function:

```
frcnn = trainFasterRCNNObjectDetector(stopSigns,...
                                      data.detector,options);
```
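Because checkpoint file names change with every save, it can be convenient to pick the most recent one programmatically. This is a sketch under two assumptions: all checkpoint files share the `faster_rcnn_checkpoint__` prefix seen in the example file name above, and `checkpointPath` (a name introduced here) matches the `'CheckpointPath'` training option.

```matlab
% Assumed to match the 'CheckpointPath' value used in trainingOptions.
checkpointPath = tempdir;

% List checkpoint MAT-files, assuming they share the example's name prefix.
files = dir(fullfile(checkpointPath, 'faster_rcnn_checkpoint__*.mat'));

if isempty(files)
    disp('No detector checkpoints found.');
else
    % Pick the most recently modified file and load it.
    [~, newest] = max([files.datenum]);
    data = load(fullfile(checkpointPath, files(newest).name));
    % data.detector can now be passed to trainFasterRCNNObjectDetector.
end
```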

`detector`: Previously trained Faster R-CNN object detector, specified as a `fasterRCNNObjectDetector` object.

### Name-Value Pair Arguments

Specify optional comma-separated pairs of `Name,Value` arguments. `Name` is the argument name and `Value` is the corresponding value. `Name` must appear inside single quotes (`' '`). You can specify several name and value pair arguments in any order as `Name1,Value1,...,NameN,ValueN`.

Example: `'PositiveOverlapRange',[0.75 1]`

Bounding box overlap ratios for positive training samples, specified as the comma-separated pair consisting of `'PositiveOverlapRange'` and a two-element vector. The vector contains values in the range [0,1]. Region proposals that overlap with ground truth bounding boxes within the specified range are used as positive training samples.

The overlap ratio used for both the `PositiveOverlapRange` and `NegativeOverlapRange` is defined as:

`$\frac{area\left(A\cap B\right)}{area\left(A\cup B\right)}$`

A and B are bounding boxes.

Bounding box overlap ratios for negative training samples, specified as the comma-separated pair consisting of `'NegativeOverlapRange'` and a two-element vector. The vector contains values in the range [0,1]. Region proposals that overlap with the ground truth bounding boxes within the specified range are used as negative training samples.

The overlap ratio used for both the `PositiveOverlapRange` and `NegativeOverlapRange` is defined as:

`$\frac{area\left(A\cap B\right)}{area\left(A\cup B\right)}$`

A and B are bounding boxes.
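The overlap ratio can be worked through by hand for two axis-aligned boxes. The sketch below uses hypothetical boxes in [x y width height] format. The two ranges are illustrative values rather than the function defaults, and treating the range endpoints as inclusive is an assumption.

```matlab
% Two hypothetical boxes in [x y width height] format.
A = [0 0 10 10];
B = [5 5 10 10];

% Width and height of the intersection rectangle (zero if the boxes are disjoint).
overlapW = max(0, min(A(1)+A(3), B(1)+B(3)) - max(A(1), B(1)));
overlapH = max(0, min(A(2)+A(4), B(2)+B(4)) - max(A(2), B(2)));

areaIntersection = overlapW * overlapH;                        % 5 * 5 = 25
areaUnion = A(3)*A(4) + B(3)*B(4) - areaIntersection;          % 100 + 100 - 25 = 175
overlapRatio = areaIntersection / areaUnion;                   % 25/175

% Illustrative ranges (not necessarily the defaults); endpoints assumed inclusive.
positiveRange = [0.75 1];
negativeRange = [0 0.3];
isPositive = overlapRatio >= positiveRange(1) && overlapRatio <= positiveRange(2);
isNegative = overlapRatio >= negativeRange(1) && overlapRatio <= negativeRange(2);
```

With these values the proposal `B` would count as a negative sample relative to ground truth `A`. Computer Vision System Toolbox also provides `bboxOverlapRatio` for computing this ratio on sets of boxes.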

Maximum number of strongest region proposals to use for generating training samples, specified as the comma-separated pair consisting of `'NumStrongestRegions'` and a positive integer. Reduce this value to speed up processing time at the cost of training accuracy. To use all region proposals, set this value to `Inf`.

Length of smallest image dimension, either width or height, specified as the comma-separated pair consisting of `'SmallestImageDimension'` and a positive integer. Training images are resized such that the length of the shortest dimension is equal to the specified integer. By default, training images are not resized. Resizing training images helps reduce computational costs and memory used when training images are large. Typical values range from 400 to 600 pixels.
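The resizing arithmetic can be sketched as follows. The image size and bounding box are hypothetical, and the rounding step and the plain multiplication of box coordinates are assumptions about how a resize could be done manually, not a description of the function's internals.

```matlab
imageSize = [1200 1600];           % hypothetical [height width] of a training image
smallestImageDimension = 600;      % illustrative target for the shortest side

% Scale factor that maps the shortest side onto the target length.
scale = smallestImageDimension / min(imageSize);   % 600/1200 = 0.5
newSize = round(imageSize * scale);                % [600 800]

% Ground truth boxes must be scaled by the same factor (approximate,
% ignoring sub-pixel coordinate conventions).
bbox = [400 300 200 100];          % hypothetical [x y width height]
bboxResized = bbox * scale;        % [200 150 100 50]
```

The same factor applies if you resize the images and boxes yourself before training.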

Minimum anchor box sizes used to build the anchor box pyramid of the region proposal network (RPN), specified as the comma-separated pair consisting of `'MinBoxSizes'` and an m-by-2 matrix. Each row defines the [height width] of an anchor box.

The default `'auto'` setting uses the minimum size and the median aspect ratio from the bounding boxes for each class in the ground truth data. To remove redundant box sizes, the function keeps boxes that have an intersection-over-union that is less than or equal to 0.5. This behavior ensures that the minimum number of anchor boxes are used to cover all the object sizes and aspect ratios.

Anchor box pyramid scale factor used to successively upscale anchor box sizes, specified as the comma-separated pair consisting of `'BoxPyramidScale'` and a scalar. Recommended values are from 1 through 2.

Number of levels in an anchor box pyramid, specified as the comma-separated pair consisting of `'NumBoxPyramidLevels'` and a scalar. Select a value that ensures that the multiscale anchor boxes are comparable in size to the size of objects in the ground truth data.

The default setting, `'auto'`, selects the number of levels based on the size of objects within the ground truth data. The number of levels is selected such that it covers the range of object sizes.
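To see how `'MinBoxSizes'`, `'BoxPyramidScale'`, and `'NumBoxPyramidLevels'` interact, the sketch below lists the anchor sizes for the values reported by the example detector. It assumes each level multiplies the minimum size by one more factor of the scale; the exact construction and any rounding inside the function are not stated on this page.

```matlab
minBoxSizes = [16 21];        % [height width], as reported by the example detector
boxPyramidScale = 2;
numBoxPyramidLevels = 5;

% Assumed construction: level k uses minBoxSizes * scale^(k-1).
levels = (0:numBoxPyramidLevels-1)';                       % 5-by-1 exponents
anchorSizes = (boxPyramidScale .^ levels) * minBoxSizes;   % (5-by-1) * (1-by-2) = 5-by-2

% anchorSizes is
%     16    21
%     32    42
%     64    84
%    128   168
%    256   336
```

Comparing the largest row with the largest objects in the ground truth data gives a quick check on whether the number of levels is sufficient. This sketch handles a single row of `minBoxSizes`; for an m-by-2 matrix, repeat it per row.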

## Output Arguments

`trainedDetector`: Trained Faster R-CNN object detector, returned as a `fasterRCNNObjectDetector` object.

## Tips

• To accelerate data preprocessing for training, `trainFasterRCNNObjectDetector` automatically creates and uses a parallel pool based on your parallel preference settings. This requires Parallel Computing Toolbox.

• If you have a large network (such as VGG-16) or large images, you may encounter an "Out of Memory" error. To avoid this error for large images, set the `'SmallestImageDimension'` parameter to `600` or smaller, which automatically resizes the images during training. Alternatively, manually resize the images along with the bounding box ground truth data before calling `trainFasterRCNNObjectDetector`.

• Transfer learning is supported for series networks such as AlexNet and VGG-16. Passing a network or an array of Layers to the training function preserves the weights of the pretrained network. You can perform transfer learning using code such as this.

```net = alexnet; detector = trainFasterRCNNObjectDetector(trainingData,net,options);```

For more information, see Get Started with Transfer Learning (Neural Network Toolbox) and Transfer Learning Using AlexNet (Neural Network Toolbox).

• Use the `trainingOptions` function to enable or disable verbose printing.

## Algorithms

The `trainFasterRCNNObjectDetector` function uses the alternating training method [1].

## References

[1] Ren, Shaoqing, Kaiming He, Ross Girshick, and Jian Sun. "Faster R-CNN: Towards Real-Time Object Detection with Region Proposal Networks." Advances in Neural Information Processing Systems. Vol. 28, 2015.