roialign

Non-quantized ROI pooling of dlarray data

Since R2021b

Syntax

dlY = roialign(dlX,boxes,outputSize)

dlY = roialign(dlX,boxes,outputSize,Name=Value)

Description

The ROI align operation pools a rectangular ROI into fixed-size bins without quantizing the grid points to the nearest pixel. The function uses bilinear interpolation to infer the value at each grid point.

Given input data of size [H W C N], where C is the number of channels and N is the number of observations, the pooled deep learning data has size [h w C sum(M)], where h and w are the height and width specified by outputSize. M is a vector of length N, and M(i) is the number of ROIs associated with the i-th observation.
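As a quick check of the size rule above, the following sketch computes the expected pooled size with plain arithmetic. The sizes and the ROI counts in M are hypothetical values chosen for illustration; the code does not call roialign.

```matlab
% Hypothetical sizes, chosen only to illustrate the size rule.
inputSize  = [10 10 3 2];   % [H W C N]: 10-by-10 maps, 3 channels, 2 observations
outputSize = [3 3];         % [h w]
M = [2 1];                  % M(i) = number of ROIs for the i-th observation

% M must have one entry per observation.
assert(numel(M) == inputSize(4))

% Expected pooled size: [h w C sum(M)].
pooledSize = [outputSize inputSize(3) sum(M)];
disp(pooledSize)
```

The last dimension counts ROIs rather than observations, so the pooled data can have more (or fewer) pages along that dimension than the input has observations.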

Note

To perform ROI pooling within a Layer (Deep Learning Toolbox) array, use roiAlignLayer.

This function requires Deep Learning Toolbox™.

dlY = roialign(dlX,boxes,outputSize) performs a pooling operation along the spatial dimensions of the input dlX for each bounding box in boxes. The spatial dimensions of the output dlY have the size specified by outputSize.

dlY = roialign(dlX,boxes,outputSize,Name=Value) specifies additional name-value arguments.

Examples

Perform ROI Pooling

Create a 4-D formatted dlarray object that simulates a batch of two RGB images.

X = dlarray(rand(10,10,3,2),"SSCB");

Specify the position and batch index of one bounding box.

startXY = [2 2];
endXY = [4 4];
batchIdx = 1;
rois = [startXY endXY batchIdx]';

Perform ROI pooling with an output size of 3-by-3.

Y = roialign(X,rois,[3 3])

Y = 
  3(S) × 3(S) × 3(C) × 1(B) single dlarray


(:,:,1) =

    0.7464    0.3069    0.1780
    0.9212    0.8491    0.4677
    0.7303    0.9057    0.3840


(:,:,2) =

    0.3024    0.6428    0.6594
    0.1542    0.0046    0.1228
    0.6295    0.5182    0.3304


(:,:,3) =

    0.4915    0.7590    0.5035
    0.4574    0.4302    0.5453
    0.2960    0.2666    0.5389

Input Arguments

`dlX` — Deep learning data to pool
4-D formatted `dlarray` object

Deep learning data to pool, specified as a 4-D formatted dlarray (Deep Learning Toolbox) object with a data format of "SSCB".

`boxes` — Bounding boxes
5-by-N numeric matrix

Bounding boxes, specified as a 5-by-N numeric matrix, where N is the number of bounding boxes. Each bounding box is formatted as a column vector of the form [x_start; y_start; x_end; y_end; batchIdx], where:

x_start and y_start specify the (x,y) coordinates of the upper-left corner of the rectangle.
x_end and y_end specify the (x,y) coordinates of the bottom-right corner of the rectangle.
batchIdx specifies the index of the observation corresponding to the rectangle.

By default, boxes are in the same coordinate space and scale as the input deep learning data dlX.
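The layout described above can be sketched as follows. The coordinates are hypothetical, and the variable names roi1, roi2, roi3, and M are introduced here only for illustration. Each ROI is one column, and the fifth row holds the index of the observation that the ROI belongs to.

```matlab
% Three hypothetical ROIs: two for observation 1, one for observation 2.
% Each column is [x_start; y_start; x_end; y_end; batchIdx].
roi1 = [2; 2; 4;  4; 1];
roi2 = [1; 1; 8;  6; 1];
roi3 = [3; 5; 9; 10; 2];

boxes = [roi1 roi2 roi3];          % 5-by-3 matrix, one column per ROI

% Count the ROIs per observation from the batch index row.
M = accumarray(boxes(5,:)', 1)';   % M(i) = number of ROIs for observation i
disp(size(boxes))
disp(M)
```

If your boxes are stored one per row, transpose the matrix before passing it to roialign, as the example on this page does.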

`outputSize` — Pooled output size
vector of two positive integers

Pooled output size, specified as a vector of two positive integers [h w], where h is the height and w is the width.

Name-Value Arguments

Specify optional pairs of arguments as Name1=Value1,...,NameN=ValueN, where Name is the argument name and Value is the corresponding value. Name-value arguments must appear after other arguments, but the order of the pairs does not matter.

Example: dlY = roialign(dlX,boxes,outputSize,ROIScale=2) scales the input ROIs by a factor of 2.

`ROIScale` — Ratio of scale of input feature map to ROI coordinates
`1` (default) | numeric scalar

Ratio of the scale of the input feature map to that of the ROI coordinates, specified as a numeric scalar. The function uses this ratio as the factor that scales the input ROIs to the input feature map size.
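A common use of this argument is pooling from a downsampled feature map with ROIs given in image coordinates. The sketch below assumes a hypothetical 256-by-256 image whose feature map is 16 times smaller in each spatial dimension, so the ratio is 1/16. The sizes, box coordinates, and variable names are illustrative assumptions, not values from this page.

```matlab
% Hypothetical setup: 256-by-256 images, 16-by-16 feature maps with 8 channels.
imageSize   = [256 256];
featureSize = [16 16];
dlX = dlarray(rand(16,16,8,2,"single"),"SSCB");

% One ROI per observation, given in image coordinates.
boxes = [33 65; 33 65; 128 192; 128 192; 1 2];   % columns: [x1; y1; x2; y2; batchIdx]

% Ratio of the feature map scale to the scale of the ROI coordinates.
roiScale = featureSize(1) / imageSize(1);        % 1/16

dlY = roialign(dlX,boxes,[7 7],ROIScale=roiScale);

% One 7-by-7 pooled map per channel and per ROI.
disp([size(dlY,1) size(dlY,2) size(dlY,3) size(dlY,4)])
```

Passing the ratio avoids rescaling the box coordinates by hand before the call.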

`SamplingRatio` — Number of samples in each pooled bin
`"auto"` (default) | row vector of two positive integers

Number of samples in each pooled bin, specified as "auto" or a row vector of two positive integers. The two elements are the number of vertical and horizontal samples, respectively.

If you do not specify the sampling ratio, then the number of vertical samples has the default value ceil(roiHeight/outputHeight). Likewise, the number of horizontal samples has the default value ceil(roiWidth/outputWidth).

Data Types: double | char
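The default described above can be reproduced with plain arithmetic. In this sketch, roiHeight and roiWidth are hypothetical ROI dimensions measured on the feature map; how roialign derives them internally from the box corners is not shown here.

```matlab
% Hypothetical ROI dimensions on the feature map, and the pooled output size.
roiHeight  = 20;
roiWidth   = 13;
outputSize = [7 7];     % [h w]

% Default number of samples per bin: [vertical horizontal].
samplingRatio = [ceil(roiHeight/outputSize(1)) ceil(roiWidth/outputSize(2))];
disp(samplingRatio)
```

To use a fixed number of samples instead, pass a row vector such as SamplingRatio=[2 2].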

Output Arguments

`dlY` — Pooled deep learning data
4-D formatted `dlarray` object

Pooled deep learning data, returned as a 4-D formatted dlarray (Deep Learning Toolbox) object with a data format of "SSCB".

More About

ROI Align

An ROI align operation returns fixed-size feature maps for every rectangular ROI within an input dlarray. The function first partitions each ROI into a grid of outputSize bins without quantizing the grid points. It then samples each bin at the number of locations specified by SamplingRatio and infers the value at each sampled point using bilinear interpolation. The output value of each pooled bin is the average of its sampled values.
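The steps above can be sketched for a single channel and a single ROI in plain MATLAB. This is a simplified illustration, not the toolbox implementation: the feature map F, the ROI coordinates, and the choice of evenly spaced sample points inside each bin are assumptions made for the example.

```matlab
% Simplified single-channel sketch of ROI align (not the toolbox implementation).
[X,Y] = meshgrid(1:8);
F = X + 10*Y;                 % hypothetical 8-by-8 feature map, linear in x and y

roi = [2 2 6 6];              % [x_start y_start x_end y_end], continuous coordinates
outputSize = [2 2];           % [h w]
samplingRatio = [2 2];        % [vertical horizontal] samples per bin

% Bin sizes are kept as real numbers; nothing is rounded to the pixel grid.
binH = (roi(4) - roi(2)) / outputSize(1);
binW = (roi(3) - roi(1)) / outputSize(2);

% Evenly spaced sample offsets, as fractions of the bin size.
fy = ((1:samplingRatio(1)) - 0.5) / samplingRatio(1);
fx = ((1:samplingRatio(2)) - 0.5) / samplingRatio(2);

pooled = zeros(outputSize);
for i = 1:outputSize(1)
    for j = 1:outputSize(2)
        ys = roi(2) + (i - 1 + fy) * binH;     % sample rows inside bin (i,j)
        xs = roi(1) + (j - 1 + fx) * binW;     % sample columns inside bin (i,j)
        [xq,yq] = meshgrid(xs,ys);
        vals = interp2(F,xq,yq,"linear");      % bilinear interpolation
        pooled(i,j) = mean(vals,"all");        % average of the samples
    end
end
disp(pooled)
```

Because F is linear here, bilinear interpolation is exact and each pooled value equals F evaluated at the center of its bin, which makes the result easy to check by hand.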

Extended Capabilities

C/C++ Code Generation
Generate C and C++ code using MATLAB® Coder™.

When you specify a custom range of values for the SamplingRatio argument, you will observe a slight numerical discrepancy between the MATLAB® simulation results and the generated code.

GPU Code Generation
Generate CUDA® code for NVIDIA® GPUs using GPU Coder™.

Refer to the usage notes and limitations in the C/C++ Code Generation section. The same usage notes and limitations apply to GPU code generation.

GPU Arrays
Accelerate code by running on a graphics processing unit (GPU) using Parallel Computing Toolbox™.

This function fully supports GPU arrays. For more information, see Run MATLAB Functions on a GPU (Parallel Computing Toolbox).
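A minimal sketch of the GPU workflow, assuming Parallel Computing Toolbox is installed: the data moves to the GPU only when canUseGPU reports a supported device, so the same code also runs on the CPU. The array sizes and box coordinates are hypothetical.

```matlab
% Hypothetical input: two 10-by-10 observations with 3 channels.
X = rand(10,10,3,2,"single");

% Move the data to the GPU only if a supported device is available.
if canUseGPU
    X = gpuArray(X);
end

dlX = dlarray(X,"SSCB");
boxes = [2; 2; 4; 4; 1];          % [x_start; y_start; x_end; y_end; batchIdx]

dlY = roialign(dlX,boxes,[3 3]);

% Bring the pooled values back to host memory as a numeric array.
Y = gather(extractdata(dlY));
disp(size(Y,1:3))
```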

Version History

Introduced in R2021b
