Training a network and making predictions on new data require images that match the input
size of the network. Depending on the format of your data, you can use
augmentedImageDatastore to resize images to the required size.
You can apply affine geometric transformations to images to augment training, validation, test, and prediction data sets. Augmenting training images helps to prevent the network from overfitting and memorizing the exact details of the training images.
For more advanced preprocessing, you can start with a built-in datastore that performs
specific image preprocessing operations suitable for common applications. You can also
preprocess images according to your own pipeline by using the transform and combine
functions. For more information, see Datastores for Deep Learning.
You can store image data as a numeric array, an ImageDatastore, or a table. An
ImageDatastore enables you to import data from image collections that are too large to
fit in memory. This datastore is designed to read batches of images for faster processing
in machine learning and computer vision applications. You can use an augmented image
datastore or a resized 4-D array for training, prediction, and classification. You can
use a resized 3-D array for prediction and classification only.
The method to resize images depends on the image data type.
|Data Type||Resizing Function||Sample Code|
|3-D array representing a single color image, a single multispectral image, or a stack of grayscale images||imresize||im = imresize(im3d,inputSize);|
|4-D array representing a stack of images||imresize||im = imresize(im4d,inputSize);|
|4-D array representing a stack of images||augmentedImageDatastore||auimds = augmentedImageDatastore(inputSize,im4d);|
|ImageDatastore||augmentedImageDatastore||auimds = augmentedImageDatastore(inputSize,imds);|
|table||augmentedImageDatastore||auimds = augmentedImageDatastore(inputSize,tbl);|
For a more complete example of rescaling images in an image datastore, see Train Deep Learning Network to Classify New Images.
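The two resizing routes for a 4-D array can be compared side by side. The following is a minimal sketch, not a definitive recipe: the random image stack and the 32-by-32 target size are made-up values for illustration, and inputSize is assumed to hold only the height and width.

```matlab
% Hypothetical data: a stack of ten 28-by-28 grayscale images.
im4d = rand(28,28,1,10);
inputSize = [32 32];            % assumed target [height width]

% Route 1: resize the whole stack in memory.
% imresize changes only the first two dimensions of an N-D array.
imResized = imresize(im4d,inputSize);
disp(size(imResized))

% Route 2: let an augmented image datastore resize each mini-batch as it is read.
auimds = augmentedImageDatastore(inputSize,im4d);
batch = read(auimds);           % table; the images are in the "input" variable
disp(size(batch.input{1}))
```

The in-memory route keeps a second copy of the data, whereas the datastore route resizes images only when a batch is requested.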
augmentedImageDatastore rescales images to the desired
size. If instead you want to crop images from the center or from random positions in the
image, you can use the 'OutputSizeMode' name-value pair argument. For example, this
code shows how to crop images in image datastore imds from the center:
auimds = augmentedImageDatastore(inputSize,imds,'OutputSizeMode','centercrop');
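As a rough illustration of the available output size modes, the sketch below reads the same hypothetical image stack with 'resize', 'centercrop', and 'randcrop'. The array X and the 32-by-32 output size are assumptions chosen for the example.

```matlab
% Hypothetical data: four 64-by-64 RGB images.
X = rand(64,64,3,4,'single');
outputSize = [32 32];           % assumed target [height width]

modes = {'resize','centercrop','randcrop'};
firstImages = cell(size(modes));
for k = 1:numel(modes)
    auimds = augmentedImageDatastore(outputSize,X,'OutputSizeMode',modes{k});
    batch = read(auimds);
    firstImages{k} = batch.input{1};
    % Every mode yields images of the requested size; only the content differs.
    fprintf('%-10s -> %s\n',modes{k},mat2str(size(firstImages{k})));
end
```

Rescaling keeps the whole scene at a lower resolution, while the two crop modes keep the original resolution but discard part of the image.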
In addition to resizing images, an
augmentedImageDatastore enables you to preprocess images with a
combination of rotation, reflection, shear, and translation transformations. The diagram
shows how trainNetwork uses an augmented image
datastore to transform training data for each epoch. For an example of the workflow, see
Train Network with Augmented Images.
Specify your training images.
Configure image transformation options, such as the range of rotation angles and whether to apply reflection at random, by creating an imageDataAugmenter. To preview the transformations applied to sample images, use the augment function.
Create an augmentedImageDatastore. Specify the training images, the size of output images, and the imageDataAugmenter. The size of output images must be compatible with the size of the imageInputLayer of the network.
Train the network, specifying the augmented image datastore as the data source for trainNetwork. For each iteration of training, the augmented image datastore applies a random combination of transformations to images in the mini-batch of training data.
When you use an augmented image datastore as a source of training images, the datastore randomly perturbs the training data for each epoch, so that each epoch uses a slightly different data set. The actual number of training images at each epoch does not change. The transformed images are not stored in memory.
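The workflow above can be sketched end to end as follows. This is a minimal illustration under stated assumptions: the random images, the two-class labels, the augmentation ranges, and the tiny network are all placeholders, so the trained network is not meaningful.

```matlab
% Step 1: specify training images (hypothetical random data with two classes).
X = rand(28,28,1,100,'single');
Y = categorical(repmat([1;2],50,1));

% Step 2: configure the random transformations (ranges are assumed values).
augmenter = imageDataAugmenter( ...
    'RandRotation',[-20 20], ...
    'RandXReflection',true, ...
    'RandXTranslation',[-3 3]);
previewImage = augment(augmenter,X(:,:,:,1));   % preview on one sample image

% Step 3: create the datastore; the output size matches the imageInputLayer.
auimds = augmentedImageDatastore([28 28],X,Y,'DataAugmentation',augmenter);

% Step 4: train, using the augmented image datastore as the data source.
layers = [
    imageInputLayer([28 28 1])
    convolution2dLayer(3,8,'Padding','same')
    reluLayer
    fullyConnectedLayer(2)
    softmaxLayer
    classificationLayer];
options = trainingOptions('sgdm','MaxEpochs',1,'MiniBatchSize',20,'Verbose',false);
net = trainNetwork(auimds,layers,options);
```

Because the transformations are drawn again each time a mini-batch is read, running more epochs exposes the network to differently perturbed versions of the same 100 images.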
Some datastores perform specific image preprocessing operations when they read a batch
of data. These application-specific datastores are listed in the table. You can use
these datastores as a source of training, validation, and test data sets for deep
learning applications that use Deep Learning Toolbox™. All of these datastores return
data in a format supported by trainNetwork.
|Datastore||Description|
|augmentedImageDatastore||Apply random affine geometric transformations, including resizing, rotation, reflection, shear, and translation, for training deep neural networks. For an example, see Transfer Learning Using AlexNet.|
|pixelLabelImageDatastore||Apply identical affine geometric transformations to images and corresponding ground truth labels for training semantic segmentation networks (requires Computer Vision Toolbox™). For an example, see Semantic Segmentation Using Deep Learning.|
|randomPatchExtractionDatastore||Extract multiple pairs of random patches from images or pixel label images (requires Image Processing Toolbox™). You optionally can apply identical random affine geometric transformations to the pairs of patches. For an example, see Single Image Super-Resolution Using Deep Learning.|
|denoisingImageDatastore||Apply randomly generated Gaussian noise for training denoising networks (requires Image Processing Toolbox).|
To perform more general and complex image preprocessing operations than offered by the
application-specific datastores, you can use the transform and combine functions.
The transform function creates an altered form of an existing datastore, called the
underlying datastore, by transforming the data read by the underlying datastore
according to a transformation function that you define.
The combine function concatenates the data read from multiple datastores into the
two-column table or two-column cell array format required by trainNetwork. The combine
function maintains parity between the underlying datastores.
|Function||Description|
|transform||Transform batches of read data from an underlying datastore according to your own preprocessing pipeline.|
|combine||Horizontally concatenate the data read from two or more underlying datastores.|
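One way these two functions might fit together is an image-to-image setup in which a degraded image is the network input and the clean image is the response. The sketch below is illustrative only: the temporary folder, the generated PNG files, the noise level, and the anonymous functions are all assumptions made for the example.

```matlab
% Create a small hypothetical image collection in a temporary folder.
folder = tempname;
mkdir(folder);
for k = 1:4
    imwrite(rand(40,50),fullfile(folder,sprintf('img%d.png',k)));
end
imds = imageDatastore(folder);
inputSize = [32 32];            % assumed network input [height width]

% transform: apply a custom preprocessing function to each read.
toTarget = @(im) im2single(imresize(im,inputSize));
toInput  = @(im) toTarget(im) + 0.05*randn(inputSize,'single');  % add noise
targetDs = transform(imds,toTarget);
inputDs  = transform(imds,toInput);

% combine: pair the reads, input in column 1 and response in column 2.
cds = combine(inputDs,targetDs);
pair = read(cds);               % 1-by-2 cell array for a single observation
disp(cellfun(@(x) mat2str(size(x)),pair,'UniformOutput',false))
```

Both transformed datastores read the same files in the same order, which is what keeps each noisy input aligned with its clean counterpart.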
The custom transformation function must accept data in the format returned by the
read function of the underlying datastore. For image data, the format depends on the
ReadSize property of the underlying ImageDatastore.
If ReadSize is 1, the transformation function must accept an integer array. The size of
the array is consistent with the type of images in the ImageDatastore. For example, a
grayscale image has dimensions m-by-n, a truecolor image has dimensions m-by-n-by-3, and
a multispectral image with c channels has dimensions m-by-n-by-c.
If ReadSize is greater than 1, the transformation function must accept a cell array of
image data corresponding to each image in the batch.
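The difference between the two ReadSize cases can be seen in a short sketch. The generated image files and the 32-by-32 target size are assumptions for illustration; the point is only that the transformation function receives a numeric array in one case and a cell array in the other.

```matlab
% Create a small hypothetical image collection in a temporary folder.
folder = tempname;
mkdir(folder);
for k = 1:4
    imwrite(rand(40,50),fullfile(folder,sprintf('img%d.png',k)));
end
inputSize = [32 32];            % assumed target [height width]

% ReadSize is 1 (the default): the function receives one numeric array.
imdsSingle = imageDatastore(folder);
resizeOne = @(im) imresize(im,inputSize);
oneImage = read(transform(imdsSingle,resizeOne));

% ReadSize is greater than 1: the function receives a cell array of images.
imdsBatch = imageDatastore(folder,'ReadSize',2);
resizeCell = @(batch) cellfun(resizeOne,batch,'UniformOutput',false);
batchOut = read(transform(imdsBatch,resizeCell));

disp(class(oneImage))           % numeric class of the stored images
disp(class(batchOut))           % cell
```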
The transform function must return data that matches the input size of the network. The
transform function does not support one-to-many observation mappings.
The transform function supports prefetching when the underlying ImageDatastore reads a
batch of JPG or PNG image files. For these image types, do not use the ReadFcn argument
of ImageDatastore to apply image preprocessing, as this option is usually significantly
slower. If you use a custom read function, then ImageDatastore does not prefetch.
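To make the two alternatives concrete, the sketch below applies the same resizing step once through a custom read function and once through transform. The generated files and the target size are assumptions, and the example does not measure speed; it only shows that both routes produce the same image.

```matlab
% Create a small hypothetical PNG collection in a temporary folder.
folder = tempname;
mkdir(folder);
for k = 1:4
    imwrite(rand(40,50),fullfile(folder,sprintf('img%d.png',k)));
end
inputSize = [32 32];            % assumed target [height width]

% Pattern the text advises against for JPG and PNG files: preprocessing
% inside a custom read function, which disables prefetching.
imdsCustom = imageDatastore(folder, ...
    'ReadFcn',@(filename) imresize(imread(filename),inputSize));

% Alternative: keep the default read function and preprocess with transform.
imdsPlain = imageDatastore(folder);
tds = transform(imdsPlain,@(im) imresize(im,inputSize));

fromReadFcn   = read(imdsCustom);
fromTransform = read(tds);
disp(isequal(fromReadFcn,fromTransform))
```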