Preprocess Images for Deep Learning

Training a network and making predictions on new data require that images match the input size of the network. To resize images to match the input size of the network, you can use imresize or augmentedImageDatastore.

In addition to resizing images, you can perform additional preprocessing to augment training, validation, test, and prediction data sets. Augmenting training images helps to prevent the network from overfitting and memorizing the exact details of the training images.

Resize Images

To find the image input size of the network, get the first two elements of the InputSize property of the imageInputLayer of the network. For example, to get the image input size for the AlexNet pretrained network:

net = alexnet;
inputSize = net.Layers(1).InputSize(1:2)
inputSize =

   227   227

The method to resize images depends on how the image data is stored.

  • To rescale a 3-D array representing a single color image, a single multispectral image, or a stack of grayscale images, use imresize. For example, to resize images in the 3-D array im3d:

    im = imresize(im3d,inputSize);
  • To rescale a 4-D array representing a stack of images, you can use imresize. For example, to rescale images in the 4-D array im4d:

    im = imresize(im4d,inputSize);

    Alternatively, you can rescale or crop images in the 4-D array to the desired size by using augmentedImageDatastore. By default, augmentedImageDatastore rescales images to the desired size. If instead you want to crop images from the center or from random positions in the image, you can use the 'OutputSizeMode' name-value pair argument. For example, to crop images in the 4-D array im4d from the center of each image:

    auimds = augmentedImageDatastore(inputSize,im4d,'OutputSizeMode','centercrop');
    

  • To rescale or crop images in an ImageDatastore or table, use augmentedImageDatastore. For example, to rescale images in the image datastore imds:

    auimds = augmentedImageDatastore(inputSize,imds);
    
    For a more complete example, see Train Deep Learning Network to Classify New Images.

You can use an augmented image datastore or a resized 4-D array for training, prediction, and classification. You can use a resized 3-D array for prediction and classification only.
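The two resizing routes described above can be tried on synthetic data. This is a minimal sketch, not a definitive recipe: the array im4d is random placeholder data, the 300-by-400 source size is an arbitrary assumption, and the table variable name input is what read is expected to return for an augmented image datastore built from a numeric array.

```matlab
% Placeholder data: a stack of 10 RGB images, 300-by-400 pixels each.
im4d = rand(300,400,3,10,'single');
inputSize = [227 227];

% Route 1: imresize rescales the first two dimensions of the whole stack.
imResized = imresize(im4d,inputSize);
disp(size(imResized))

% Route 2: an augmented image datastore crops (or rescales) on the fly
% each time a batch is read, without storing a resized copy of the array.
auimds = augmentedImageDatastore(inputSize,im4d,'OutputSizeMode','centercrop');
batch = read(auimds);              % table with one row per image
firstImage = batch.input{1};       % assumed variable name: input
disp(size(firstImage))
```

Rescaling with imresize changes the aspect ratio when the source images are not square, whereas 'centercrop' preserves the aspect ratio but discards the image borders.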

Augment Images for Training

In addition to resizing images, an augmentedImageDatastore enables you to augment images with a combination of rotation, reflection, shear, and translation transformations. The following steps describe how trainNetwork uses an augmented image datastore to transform training data for each epoch.

  1. Define your training images. You can store the images as an ImageDatastore, a 4-D numeric array, or a table. An ImageDatastore enables you to import data from image collections that are too large to fit in memory. It is designed to read batches of images for faster processing in machine learning and computer vision applications.

  2. Configure image transformation options, such as the range of rotation angles and whether to apply reflection at random, by creating an imageDataAugmenter.

    Tip

    To preview the transformations applied to sample images, use the augment function.

  3. Create an augmentedImageDatastore, specifying the training images, the size of output images, and the imageDataAugmenter. The size of output images must be compatible with the size of the imageInputLayer of the network.

  4. Train the network, specifying the augmented image datastore as the data source for trainNetwork. For each iteration of training, the augmented image datastore applies a random combination of transformations to the mini-batch of training data.

    Note

    When you use an augmented image datastore as a source of training images, the datastore randomly perturbs the training data for each epoch, so that each epoch uses a slightly different data set. The actual number of training images at each epoch does not change. The transformed images are not stored in memory.

For an example of the workflow, see Train Network with Augmented Images.
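The four steps above can be sketched end to end as follows. Everything specific here is an assumption chosen for illustration: the training data is random noise with four made-up classes, the augmentation ranges are arbitrary, and the tiny network exists only so that trainNetwork has something to train.

```matlab
% Step 1: placeholder training images (100 grayscale 28-by-28 images)
% and labels for four hypothetical classes.
XTrain = rand(28,28,1,100,'single');
YTrain = categorical(repmat((1:4)',25,1));

% Step 2: configure random rotation, reflection, and translation.
augmenter = imageDataAugmenter( ...
    'RandRotation',[-20 20], ...
    'RandXReflection',true, ...
    'RandXTranslation',[-3 3], ...
    'RandYTranslation',[-3 3]);

% Preview the effect on one sample image (see the Tip in step 2).
previewImage = augment(augmenter,XTrain(:,:,:,1));

% Step 3: the output size must match the image input layer below.
auimds = augmentedImageDatastore([28 28],XTrain,YTrain, ...
    'DataAugmentation',augmenter);

% Step 4: train a deliberately small network on the augmented data.
layers = [
    imageInputLayer([28 28 1])
    convolution2dLayer(3,8,'Padding','same')
    reluLayer
    fullyConnectedLayer(4)
    softmaxLayer
    classificationLayer];
options = trainingOptions('sgdm','MaxEpochs',2, ...
    'MiniBatchSize',32,'Verbose',false);
net = trainNetwork(auimds,layers,options);
```

Because the transformations are drawn at random each time a mini-batch is read, running the preview line repeatedly on the same image generally gives a different result each time.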

Advanced Image Preprocessing

If you want to perform image preprocessing beyond the transformations offered by augmentedImageDatastore, then you can use a mini-batch datastore to perform data augmentation. A mini-batch datastore refers to any built-in or custom datastore that offers support for reading data in batches. You can use a mini-batch datastore as a source of training, validation, and test data sets for deep learning applications that use Deep Learning Toolbox™.

These built-in mini-batch datastores perform specific image preprocessing operations when they read a batch of data:

Type of Mini-Batch Datastore | Description
---------------------------- | -----------
augmentedImageDatastore | Apply random affine geometric transformations, including resizing, rotation, reflection, shear, and translation, for training deep neural networks. For an example, see Transfer Learning Using AlexNet.
pixelLabelImageDatastore | Apply identical affine geometric transformations to images and corresponding ground truth labels for training semantic segmentation networks (requires Computer Vision System Toolbox™). For an example, see Semantic Segmentation Using Deep Learning.
randomPatchExtractionDatastore | Extract pairs of random patches from images or pixel label images (requires Image Processing Toolbox™). You optionally can apply identical random affine geometric transformations to the pairs of patches. For an example, see Single Image Super-Resolution Using Deep Learning.
denoisingImageDatastore | Apply randomly generated Gaussian noise for training denoising networks (requires Image Processing Toolbox).

To preprocess images using your own image processing pipeline, you can implement a custom mini-batch datastore. For more information, see Develop Custom Mini-Batch Datastore. For an example, see Define Custom Mini-Batch Datastore For Super-Resolution Networks.

Tip

When you define how your custom mini-batch datastore reads data, you can augment data with random affine geometric transformations. Specify transformation options by using an imageDataAugmenter object, then transform data by using the augment function. The augment function can apply identical transformations to input and response image pairs.
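As a rough illustration of the tip above, the following sketch passes an input and response pair to augment as a cell array, which is how the function is understood to apply one shared random transformation to several images. The image contents and augmentation ranges are placeholders; in a real custom datastore this call would sit inside the method that reads a batch.

```matlab
% Hypothetical transformation options for paired images.
augmenter = imageDataAugmenter( ...
    'RandRotation',[-90 90], ...
    'RandXReflection',true);

% Placeholder input image and a response derived from it
% (for example, a ground truth image aligned with the input).
inputImage = rand(64,64,3,'single');
responseImage = inputImage;

% Passing both images in one cell array asks augment to use the same
% randomly selected transformation for each of them.
out = augment(augmenter,{inputImage,responseImage});
augmentedInput = out{1};
augmentedResponse = out{2};
```

Calling augment separately on each image would instead draw two independent transformations, which would misalign the input with its response.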
