augmentedImageSource

(To be removed) Generate batches of augmented image data

augmentedImageSource will be removed in a future release. Create an augmented image datastore using the augmentedImageDatastore function instead. For more information, see Compatibility Considerations.

Syntax

auimds = augmentedImageSource(outputSize,imds)
auimds = augmentedImageSource(outputSize,X,Y)
auimds = augmentedImageSource(outputSize,tbl)
auimds = augmentedImageSource(outputSize,tbl,responseName)
auimds = augmentedImageSource(___,Name,Value)

Description

auimds = augmentedImageSource(outputSize,imds) creates an augmented image datastore, auimds, for classification problems using images from image datastore imds, with output image size outputSize.
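As a minimal sketch of the imds syntax, assuming a hypothetical folder digitImages whose subfolder names serve as class labels:

```matlab
% Hypothetical folder of labeled images; subfolder names become labels.
imds = imageDatastore('digitImages', ...
    'IncludeSubfolders',true,'LabelSource','foldernames');

% Resize every image to 64-by-64 as it is read.
auimds = augmentedImageSource([64 64],imds);
```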

auimds = augmentedImageSource(outputSize,X,Y) creates an augmented image datastore for classification and regression problems. The array X contains the predictor variables and the array Y contains the categorical labels or numeric responses.

auimds = augmentedImageSource(outputSize,tbl) creates an augmented image datastore for classification and regression problems. The table, tbl, contains predictors and responses.

auimds = augmentedImageSource(outputSize,tbl,responseName) creates an augmented image datastore for classification and regression problems. The table, tbl, contains predictors and responses. The responseName argument specifies the response variable in tbl.
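A sketch of the table syntax with responseName, using hypothetical file names and illustrative response values:

```matlab
% Hypothetical table: image file paths in the first column,
% numeric regression responses in the columns that follow.
tbl = table({'img1.png';'img2.png'},[0.1;0.7],[1.2;0.4], ...
    'VariableNames',{'Image','Angle','Scale'});

% Name the response variables explicitly with responseName.
auimds = augmentedImageSource([28 28],tbl,{'Angle','Scale'});
```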


auimds = augmentedImageSource(___,Name,Value) creates an augmented image datastore, using name-value pairs to configure the image preprocessing done by the augmented image datastore. You can specify multiple name-value pairs.

Examples

Train Network with Rotational Invariance Using augmentedImageSource

Preprocess images using random rotation so that the trained convolutional neural network has rotational invariance. This example uses the augmentedImageSource function to create an augmented image datastore object. For an example of the recommended workflow that uses the augmentedImageDatastore function to create an augmented image datastore object, see Train Network with Augmented Images.

Load the sample data, which consists of synthetic images of handwritten numbers.

[XTrain,YTrain] = digitTrain4DArrayData;

digitTrain4DArrayData loads the digit training set as 4-D array data. XTrain is a 28-by-28-by-1-by-5000 array, where:

  • 28 is the height and width of the images.

  • 1 is the number of channels.

  • 5000 is the number of synthetic images of handwritten digits.

YTrain is a categorical vector containing the labels for each observation.
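You can confirm the dimensions described above directly in the command window:

```matlab
% Confirm the dimensions of the loaded training data.
size(XTrain)   % returns [28 28 1 5000]
class(YTrain)  % returns 'categorical'
numel(YTrain)  % 5000 labels, one per image
```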

Create an image augmenter that rotates images during training. This image augmenter rotates each image by a random angle.

imageAugmenter = imageDataAugmenter('RandRotation',[-180 180])
imageAugmenter = 
  imageDataAugmenter with properties:

           FillValue: 0
     RandXReflection: 0
     RandYReflection: 0
        RandRotation: [-180 180]
           RandScale: [1 1]
          RandXScale: [1 1]
          RandYScale: [1 1]
          RandXShear: [0 0]
          RandYShear: [0 0]
    RandXTranslation: [0 0]
    RandYTranslation: [0 0]

Use the augmentedImageSource function to create an augmented image datastore. Specify the size of augmented images, the training data, and the image augmenter.

imageSize = [28 28 1];
auimds = augmentedImageSource(imageSize,XTrain,YTrain,'DataAugmentation',imageAugmenter)
auimds = 
  augmentedImageDatastore with properties:

         NumObservations: 5000
           MiniBatchSize: 128
        DataAugmentation: [1x1 imageDataAugmenter]
      ColorPreprocessing: 'none'
              OutputSize: [28 28]
          OutputSizeMode: 'resize'
    DispatchInBackground: 0

Specify the convolutional neural network architecture.

layers = [
    imageInputLayer([28 28 1])
    
    convolution2dLayer(3,16,'Padding',1)
    batchNormalizationLayer
    reluLayer
    
    maxPooling2dLayer(2,'Stride',2)
       
    convolution2dLayer(3,32,'Padding',1)
    batchNormalizationLayer
    reluLayer
    
    maxPooling2dLayer(2,'Stride',2)
       
    convolution2dLayer(3,64,'Padding',1)
    batchNormalizationLayer
    reluLayer
        
    fullyConnectedLayer(10)
    softmaxLayer
    classificationLayer];

Set the training options for stochastic gradient descent with momentum.

opts = trainingOptions('sgdm', ...
    'MaxEpochs',10, ...
    'Shuffle','every-epoch', ...
    'InitialLearnRate',1e-3);

Train the network.

net = trainNetwork(auimds,layers,opts);
Training on single CPU.
Initializing image normalization.
|========================================================================================|
|  Epoch  |  Iteration  |  Time Elapsed  |  Mini-batch  |  Mini-batch  |  Base Learning  |
|         |             |   (hh:mm:ss)   |   Accuracy   |     Loss     |      Rate       |
|========================================================================================|
|       1 |           1 |       00:00:01 |        7.81% |       2.4151 |          0.0010 |
|       2 |          50 |       00:00:23 |       52.34% |       1.4930 |          0.0010 |
|       3 |         100 |       00:00:44 |       74.22% |       1.0148 |          0.0010 |
|       4 |         150 |       00:01:05 |       78.13% |       0.8153 |          0.0010 |
|       6 |         200 |       00:01:26 |       76.56% |       0.6903 |          0.0010 |
|       7 |         250 |       00:01:45 |       87.50% |       0.4891 |          0.0010 |
|       8 |         300 |       00:02:06 |       87.50% |       0.4874 |          0.0010 |
|       9 |         350 |       00:02:30 |       87.50% |       0.4866 |          0.0010 |
|      10 |         390 |       00:02:46 |       89.06% |       0.4021 |          0.0010 |
|========================================================================================|

Input Arguments


outputSize

Size of output images, specified as a vector of two positive integers. The first element specifies the number of rows in the output images, and the second element specifies the number of columns. This value sets the OutputSize property of the returned augmented image datastore, auimds.

imds

Images with labels, specified as an ImageDatastore object with categorical labels. An ImageDatastore stores data for image classification networks only.

ImageDatastore allows batch reading of JPG or PNG image files using prefetching. If you use a custom function for reading the images, then ImageDatastore does not prefetch.

Tip

Use augmentedImageDatastore for efficient preprocessing of images for deep learning, including image resizing.

Do not use the ReadFcn option of imageDatastore, because this option is usually significantly slower.

X

Images, specified as a 4-D numeric array. The first three dimensions are the height, width, and channels, and the last dimension indexes the individual images.

If the array contains NaNs, then they are propagated through the training. However, in most cases, the training fails to converge.

Data Types: single | double | uint8 | int8 | uint16 | int16 | uint32 | int32

Y

Responses for classification or regression, specified as one of the following:

  • For a classification problem, Y is a categorical vector containing the image labels.

  • For a regression problem, Y can be one of the following:

    • An n-by-r numeric matrix, where n is the number of observations and r is the number of responses.

    • An h-by-w-by-c-by-n numeric array, where h-by-w-by-c is the size of a single response and n is the number of observations.

Responses must not contain NaNs.

Data Types: categorical | double
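For instance, a regression response in the n-by-r form could be paired with the training images from the example above (the response values here are illustrative only):

```matlab
% 5000 observations with 2 numeric responses each (n-by-r form).
Y = rand(5000,2);

% XTrain is the 28-by-28-by-1-by-5000 array from digitTrain4DArrayData.
auimds = augmentedImageSource([28 28],XTrain,Y);
```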

tbl

Input data, specified as a table. tbl must contain the predictors in the first column as either absolute or relative image paths or images. The type and location of the responses depend on the problem:

  • For a classification problem, the response must be a categorical variable containing labels for the images. If the name of the response variable is not specified in the call to augmentedImageSource, the responses must be in the second column. If the responses are in a different column of tbl, then you must specify the response variable name using the responseName positional argument.

  • For a regression problem, the responses must be numerical values in the column or columns after the first one. The responses can be either in multiple columns as scalars or in a single column as numeric vectors or cell arrays containing numeric 3-D arrays. When you do not specify the name of the response variable or variables, augmentedImageSource accepts the remaining columns of tbl as the response variables. You can specify the response variable names using the responseName positional argument.

Responses must not contain NaNs. If the predictor data contains NaNs, then they are propagated through the training. However, in most cases, the training fails to converge.

Data Types: table

responseName

Names of the response variables in the input table, specified as a character vector or cell array of character vectors. For problems with one response, responseName is the corresponding variable name in tbl. For regression problems with multiple response variables, responseName is a cell array of the corresponding variable names in tbl.

Data Types: char | cell

Name-Value Pair Arguments

Specify optional comma-separated pairs of Name,Value arguments. Name is the argument name and Value is the corresponding value. Name must appear inside quotes. You can specify several name and value pair arguments in any order as Name1,Value1,...,NameN,ValueN.

Example: augmentedImageSource([28,28],myTable,'OutputSizeMode','centercrop') creates an augmented image datastore that sets the OutputSizeMode property to crop images from the center.

'ColorPreprocessing'

Preprocessing operations performed on color channels of input images, specified as the comma-separated pair consisting of 'ColorPreprocessing' and 'none', 'gray2rgb', or 'rgb2gray'. This argument sets the ColorPreprocessing property of the returned augmented image datastore, auimds. The ColorPreprocessing property ensures that all output images from the augmented image datastore have the number of color channels required by imageInputLayer.

Data Types: char | string

'DataAugmentation'

Preprocessing applied to input images, specified as the comma-separated pair consisting of 'DataAugmentation' and an imageDataAugmenter object or 'none'. This argument sets the DataAugmentation property of the returned augmented image datastore, auimds. When DataAugmentation is 'none', no preprocessing is applied to input images.

'OutputSizeMode'

Method used to resize output images, specified as the comma-separated pair consisting of 'OutputSizeMode' and one of the following. This argument sets the OutputSizeMode property of the returned augmented image datastore, auimds.

  • 'resize' — Scale the image to fit the output size. For more information, see imresize.

  • 'centercrop' — Take a crop from the center of the training image. The crop has the same size as the output size.

  • 'randcrop' — Take a random crop from the training image. The random crop has the same size as the output size.

Data Types: char | string
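For example, center cropping can be sketched with the training data from the example above, producing 24-by-24 crops from the 28-by-28 images:

```matlab
% Crop 24-by-24 patches from the center of each 28-by-28 training image.
auimds = augmentedImageSource([24 24],XTrain,YTrain, ...
    'OutputSizeMode','centercrop');
```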

'BackgroundExecution'

Perform augmentation in parallel, specified as the comma-separated pair consisting of 'BackgroundExecution' and false or true. This argument sets the DispatchInBackground property of the returned augmented image datastore, auimds. If 'BackgroundExecution' is true, and you have Parallel Computing Toolbox™ software installed, then the augmented image datastore auimds performs image augmentation in parallel.
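A sketch of background dispatch, reusing the image augmenter and training data from the example above:

```matlab
% Dispatch augmentation in parallel (requires Parallel Computing Toolbox).
auimds = augmentedImageSource([28 28],XTrain,YTrain, ...
    'DataAugmentation',imageAugmenter, ...
    'BackgroundExecution',true);
```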

Output Arguments


auimds

Augmented image datastore, returned as an augmentedImageDatastore object.

Compatibility Considerations


Not recommended starting in R2018a

augmentedImageSource will be removed in a future release. Create an augmented image datastore using the augmentedImageDatastore function instead.

Introduced in R2017b