
JPEG Image Deblocking Using Deep Learning

This example shows how to train a denoising convolutional neural network (DnCNN), then use the network to reduce JPEG compression artifacts in an image.

The example shows how to train a DnCNN network and also provides a pretrained DnCNN network. If you choose to train the DnCNN network, use of a CUDA-capable NVIDIA™ GPU with compute capability 3.0 or higher is highly recommended (requires Parallel Computing Toolbox™).

If you do not want to download the training data set or train the network, then you can load the pretrained DnCNN network by typing load('pretrainedJPEGDnCNN.mat') at the command line. Then, go directly to the Perform JPEG Deblocking Using DnCNN Network section in this example.

Introduction

Image compression is used to reduce the memory footprint of an image. One popular and powerful compression method is employed by the JPEG image format, which uses a quality factor to specify the amount of compression. Reducing the quality value results in higher compression and a smaller memory footprint, at the expense of visual quality of the image.
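The effect of the quality factor on file size can be illustrated by writing the same image at several quality values. This short sketch (not part of the original example) uses the peppers.png sample image that ships with MATLAB:

I = imread('peppers.png');
for quality = [10 50 90]
    fname = fullfile(tempdir,sprintf('peppers_q%d.jpg',quality));
    imwrite(I,fname,'Quality',quality);
    s = dir(fname);
    % Lower quality values produce smaller files
    fprintf('Quality %2d: %d bytes\n',quality,s.bytes);
end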

JPEG compression is lossy, meaning that the compression process causes the image to lose information. For JPEG images, this information loss appears as blocking artifacts in the image. As shown in the figure, more compression results in more information loss and stronger artifacts. Textured regions with high-frequency content, such as the grass and clouds, look blurry. Sharp edges, such as the roof of the house and the guardrails atop the lighthouse, exhibit ringing.

JPEG deblocking is the process of reducing the effects of compression artifacts in JPEG images. Several JPEG deblocking methods exist, including more effective methods that use deep learning. This example implements one such deep learning-based method that attempts to minimize the effect of JPEG compression artifacts.

The DnCNN Network

This example uses a built-in deep feed-forward convolutional neural network, called DnCNN. The network was primarily designed to remove noise from images. However, the DnCNN architecture can also be trained to remove JPEG compression artifacts or increase image resolution.

The reference paper [1] employs a residual learning strategy, meaning that the DnCNN network learns to estimate the residual image. A residual image is the difference between a pristine image and a distorted copy of the image. The residual image contains information about the image distortion. For this example, distortion appears as JPEG blocking artifacts.

The DnCNN network is trained to detect the residual image from the luminance of a color image. The luminance channel of an image, Y, represents the brightness of each pixel through a linear combination of the red, green, and blue pixel values. In contrast, the two chrominance channels of an image, Cb and Cr, are different linear combinations of the red, green, and blue pixel values that represent color-difference information. DnCNN is trained using only the luminance channel because human perception is more sensitive to changes in brightness than changes in color.

If YOriginal is the luminance of the pristine image and YCompressed is the luminance of the image containing JPEG compression artifacts, then the input to the DnCNN network is YCompressed and the network learns to predict YResidual = YOriginal - YCompressed from the training data.

Once the DnCNN network learns how to estimate a residual image, it can reconstruct an undistorted version of a compressed JPEG image by adding the residual image to the compressed luminance channel, then converting the image back to RGB colorspace.
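The reconstruction step can be sketched as follows, where Iycbcr is assumed to be a compressed image already converted to the YCbCr color space and residual is assumed to be the network's estimated residual image (illustrative variable names, not part of the original example):

% Add the estimated residual to the compressed luminance channel
deblockedY = Iycbcr(:,:,1) + residual;
% Recombine with the unmodified chrominance channels
deblockedYcbcr = cat(3,deblockedY,Iycbcr(:,:,2:3));
% Convert back to the RGB color space
deblockedRGB = ycbcr2rgb(deblockedYcbcr);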

Download Training Data

Download the IAPR TC-12 Benchmark, which consists of 20,000 still natural images [2]. The data set includes photos of people, animals, cities and more. You can use the helper function, downloadIAPRTC12Data, to download the data. The size of the data file is ~1.8 GB.

imagesDir = tempdir;
url = "http://www-i6.informatik.rwth-aachen.de/imageclef/resources/iaprtc12.tgz";
downloadIAPRTC12Data(url,imagesDir);

This example trains the network with a small subset of the IAPR TC-12 Benchmark data. Load the ImageCLEF training data. All images are 32-bit JPEG color images.

trainImagesDir = fullfile(imagesDir,'iaprtc12','images','00');
exts = {'.jpg','.bmp','.png'};
trainImages = imageDatastore(trainImagesDir,'FileExtensions',exts);

List the number of training images.

numel(trainImages.Files)
ans = 251

Prepare Training Data

To create a training data set, read in pristine images and write out images in the JPEG file format with various levels of compression.

Create the folder structure to properly organize the training data.

originalFileLocation = fullfile(imagesDir,'iaprtc12','images','00');

% Make a folder structure for the training data
if ~exist(fullfile(imagesDir,'iaprtc12','JPEGDeblockingData','Original'),'dir')
    mkdir(fullfile(imagesDir,'iaprtc12','JPEGDeblockingData','Original'));
end
if ~exist(fullfile(imagesDir,'iaprtc12','JPEGDeblockingData','Compressed'),'dir')
    mkdir(fullfile(imagesDir,'iaprtc12','JPEGDeblockingData','Compressed'));
end

uncompressedFileLocation = fullfile(imagesDir,'iaprtc12','JPEGDeblockingData','Original');
compressedFileLocation = fullfile(imagesDir,'iaprtc12','JPEGDeblockingData','Compressed');

Specify the JPEG image quality values used to render image compression artifacts. Quality values must be in the range [0, 100]. Small quality values result in more compression and stronger compression artifacts. Use a denser sampling of small quality values so the training data has a broad range of compression artifacts.

JPEGQuality = [5:5:40 50 60 70 80];

Write pristine and compressed training images from the original data.

files = dir([originalFileLocation filesep '*.jpg']);
imNumber = 1;
for fileIndex = 1:size(files,1)
    fname = [originalFileLocation filesep files(fileIndex).name];
    im = imread(fname);
    if size(im,3) == 3
        im = rgb2gray(im);
    end
    for index = 1:length(JPEGQuality)
        imwrite(im,[uncompressedFileLocation filesep num2str(imNumber) '.jpg'],'JPEG','Quality',100)
        imwrite(im,[compressedFileLocation filesep num2str(imNumber) '.jpg'],'JPEG','Quality',JPEGQuality(index))
        imNumber = imNumber + 1;
    end
end

Define Mini-Batch Datastore for Training

A mini-batch datastore is used to feed the training data to the network. This example defines a custom implementation of a mini-batch datastore, called JPEGimagePatchDatastore, as a convenient way to generate augmented image patches for training a JPEG deblocking network.

The JPEGimagePatchDatastore extracts patches from the distorted input images and computes the target residuals from the corresponding patches in the pristine images. The distorted image patches act as the network input. The residual patches are the desired network output. Each mini-batch contains 128 patches of size 50-by-50 pixels. Only one mini-batch is extracted from each image during training, and all patches are extracted from random positions in the images.
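The patch-extraction step performed by the datastore can be sketched as follows for a single patch. This is illustrative code, not the actual JPEGimagePatchDatastore implementation; it builds a pristine/compressed pair from the lighthouse.png sample image that ships with Image Processing Toolbox:

% Create a corresponding pristine/compressed image pair
imPristine = rgb2gray(imread('lighthouse.png'));
imwrite(imPristine,fullfile(tempdir,'patchDemo_q20.jpg'),'Quality',20);
imCompressed = imread(fullfile(tempdir,'patchDemo_q20.jpg'));

patchSize = 50;
% Choose a random top-left corner that keeps the patch inside the image
r = randi(size(imCompressed,1) - patchSize + 1);
c = randi(size(imCompressed,2) - patchSize + 1);
jpegPatch = imCompressed(r:r+patchSize-1,c:c+patchSize-1);
% The target residual is the difference between the pristine and compressed patches
residualPatch = im2single(imPristine(r:r+patchSize-1,c:c+patchSize-1)) - im2single(jpegPatch);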

batchSize = 128;
patchSize = 50;
batchesPerImage = 1;

exts = {'.jpg'};
imdsUncompressed = imageDatastore(uncompressedFileLocation,'FileExtensions',exts);
imdsCompressed = imageDatastore(compressedFileLocation,'FileExtensions',exts);

ds = JPEGimagePatchDatastore(imdsUncompressed,imdsCompressed,...
    'MiniBatchSize',batchSize,...
    'PatchSize',patchSize,...
    'BatchesPerImage',batchesPerImage);

Perform a read operation on the mini-batch datastore to explore the data.

inputBatch = read(ds);
summary(inputBatch)
Variables:

    jpegPatches: 128×1 cell

    residuals: 128×1 cell

Set up DnCNN Layers

Create the layers of the built-in DnCNN network by using the dnCNNLayers function. By default, the network depth (the number of convolution layers) is 20.

layers = dnCNNLayers()
layers = 
  1x59 Layer array with layers:

     1   'InputLayer'             Image Input           50x50x1 images
     2   'Conv1'                  Convolution           64 3x3x1 convolutions with stride [1  1] and padding [1  1  1  1]
     3   'ReLU1'                  ReLU                  ReLU
     4   'Conv2'                  Convolution           64 3x3x64 convolutions with stride [1  1] and padding [1  1  1  1]
     5   'BNorm2'                 Batch Normalization   Batch normalization with 64 channels
     6   'ReLU2'                  ReLU                  ReLU
     7   'Conv3'                  Convolution           64 3x3x64 convolutions with stride [1  1] and padding [1  1  1  1]
     8   'BNorm3'                 Batch Normalization   Batch normalization with 64 channels
     9   'ReLU3'                  ReLU                  ReLU
    10   'Conv4'                  Convolution           64 3x3x64 convolutions with stride [1  1] and padding [1  1  1  1]
    11   'BNorm4'                 Batch Normalization   Batch normalization with 64 channels
    12   'ReLU4'                  ReLU                  ReLU
    13   'Conv5'                  Convolution           64 3x3x64 convolutions with stride [1  1] and padding [1  1  1  1]
    14   'BNorm5'                 Batch Normalization   Batch normalization with 64 channels
    15   'ReLU5'                  ReLU                  ReLU
    16   'Conv6'                  Convolution           64 3x3x64 convolutions with stride [1  1] and padding [1  1  1  1]
    17   'BNorm6'                 Batch Normalization   Batch normalization with 64 channels
    18   'ReLU6'                  ReLU                  ReLU
    19   'Conv7'                  Convolution           64 3x3x64 convolutions with stride [1  1] and padding [1  1  1  1]
    20   'BNorm7'                 Batch Normalization   Batch normalization with 64 channels
    21   'ReLU7'                  ReLU                  ReLU
    22   'Conv8'                  Convolution           64 3x3x64 convolutions with stride [1  1] and padding [1  1  1  1]
    23   'BNorm8'                 Batch Normalization   Batch normalization with 64 channels
    24   'ReLU8'                  ReLU                  ReLU
    25   'Conv9'                  Convolution           64 3x3x64 convolutions with stride [1  1] and padding [1  1  1  1]
    26   'BNorm9'                 Batch Normalization   Batch normalization with 64 channels
    27   'ReLU9'                  ReLU                  ReLU
    28   'Conv10'                 Convolution           64 3x3x64 convolutions with stride [1  1] and padding [1  1  1  1]
    29   'BNorm10'                Batch Normalization   Batch normalization with 64 channels
    30   'ReLU10'                 ReLU                  ReLU
    31   'Conv11'                 Convolution           64 3x3x64 convolutions with stride [1  1] and padding [1  1  1  1]
    32   'BNorm11'                Batch Normalization   Batch normalization with 64 channels
    33   'ReLU11'                 ReLU                  ReLU
    34   'Conv12'                 Convolution           64 3x3x64 convolutions with stride [1  1] and padding [1  1  1  1]
    35   'BNorm12'                Batch Normalization   Batch normalization with 64 channels
    36   'ReLU12'                 ReLU                  ReLU
    37   'Conv13'                 Convolution           64 3x3x64 convolutions with stride [1  1] and padding [1  1  1  1]
    38   'BNorm13'                Batch Normalization   Batch normalization with 64 channels
    39   'ReLU13'                 ReLU                  ReLU
    40   'Conv14'                 Convolution           64 3x3x64 convolutions with stride [1  1] and padding [1  1  1  1]
    41   'BNorm14'                Batch Normalization   Batch normalization with 64 channels
    42   'ReLU14'                 ReLU                  ReLU
    43   'Conv15'                 Convolution           64 3x3x64 convolutions with stride [1  1] and padding [1  1  1  1]
    44   'BNorm15'                Batch Normalization   Batch normalization with 64 channels
    45   'ReLU15'                 ReLU                  ReLU
    46   'Conv16'                 Convolution           64 3x3x64 convolutions with stride [1  1] and padding [1  1  1  1]
    47   'BNorm16'                Batch Normalization   Batch normalization with 64 channels
    48   'ReLU16'                 ReLU                  ReLU
    49   'Conv17'                 Convolution           64 3x3x64 convolutions with stride [1  1] and padding [1  1  1  1]
    50   'BNorm17'                Batch Normalization   Batch normalization with 64 channels
    51   'ReLU17'                 ReLU                  ReLU
    52   'Conv18'                 Convolution           64 3x3x64 convolutions with stride [1  1] and padding [1  1  1  1]
    53   'BNorm18'                Batch Normalization   Batch normalization with 64 channels
    54   'ReLU18'                 ReLU                  ReLU
    55   'Conv19'                 Convolution           64 3x3x64 convolutions with stride [1  1] and padding [1  1  1  1]
    56   'BNorm19'                Batch Normalization   Batch normalization with 64 channels
    57   'ReLU19'                 ReLU                  ReLU
    58   'Conv20'                 Convolution           1 3x3x64 convolutions with stride [1  1] and padding [1  1  1  1]
    59   'FinalRegressionLayer'   Regression Output     mean-squared-error

Select Training Options

Train the network using stochastic gradient descent with momentum (SGDM) optimization. Specify the hyperparameter settings for SGDM by using the trainingOptions function.

Training a deep network is time-consuming. Accelerate the training by specifying a high learning rate. However, a high learning rate can cause the gradients of the network to explode or grow uncontrollably, preventing the network from training successfully. To keep the gradients in a meaningful range, enable gradient clipping by setting 'GradientThreshold' to 0.005, and specify 'GradientThresholdMethod' to use the absolute value of the gradients.

maxEpochs = 30;
initLearningRate = 0.1;
l2reg = 0.0001;
batchSize = 128;

options = trainingOptions('sgdm',...
    'Momentum',0.9,...
    'InitialLearnRate',initLearningRate,...
    'LearnRateSchedule','piecewise',...
    'GradientThresholdMethod','absolute-value',...
    'GradientThreshold',0.005,...
    'L2Regularization',l2reg,...
    'MiniBatchSize',batchSize,...
    'MaxEpochs',maxEpochs,...
    'Plots','training-progress');

Train the Network

After configuring the training options and the mini-batch datastore, train the DnCNN network using the trainNetwork function. To train the network, set the doTraining parameter in the following code to true. A CUDA-capable NVIDIA™ GPU with compute capability 3.0 or higher is highly recommended for training.

If you keep the doTraining parameter in the following code as false, then the example returns a pretrained DnCNN network.

Note: Training takes about 40 hours on an NVIDIA™ Titan X and can take even longer depending on your GPU hardware.

% Training runs when doTraining is true
doTraining = false; 
if doTraining     
    [net, info] = trainNetwork(ds,layers,options); 
else 
    load('pretrainedJPEGDnCNN.mat'); 
end

You can now use the DnCNN network to remove JPEG compression artifacts from new images.

Perform JPEG Deblocking Using DnCNN Network

To perform JPEG deblocking using DnCNN, follow the remaining steps of this example, which show how to:

  • Create sample test images with JPEG compression artifacts at three different quality levels.

  • Remove the compression artifacts using the DnCNN network.

  • Visually compare the images before and after deblocking.

  • Evaluate the quality of the compressed and deblocked images by quantifying their similarity to the undistorted reference image.

Create Sample Images with Blocking Artifacts

Create sample images to evaluate the result of JPEG image deblocking using the DnCNN network. The test data set, testImages, contains 21 undistorted images shipped in Image Processing Toolbox™. Load the images into an imageDatastore.

exts = {'.jpg','.png'};
fileNames = {'sherlock.jpg','car2.jpg','fabric.png','greens.jpg','hands1.jpg','kobi.png',...
    'lighthouse.png','micromarket.jpg','office_4.jpg','onion.png','pears.png','yellowlily.jpg',...
    'indiancorn.jpg','flamingos.jpg','sevilla.jpg','llama.jpg','parkavenue.jpg',...
    'peacock.jpg','car1.jpg','strawberries.jpg','wagon.jpg'};
filePath = [fullfile(matlabroot,'toolbox','images','imdata') filesep];
filePathNames = strcat(filePath,fileNames);
testImages = imageDatastore(filePathNames,'FileExtensions',exts);

Display the testing images as a montage.

montage(testImages)

Select one of the images to use as the reference image for JPEG deblocking. You can optionally use your own uncompressed image as the reference image.

indx = 7; % Index of image to read from the test image datastore
Ireference = readimage(testImages,indx);
imshow(Ireference)
title('Uncompressed Reference Image')

Create three compressed test images with the JPEG Quality values of 10, 20, and 50.

imwrite(Ireference,fullfile(tempdir,'testQuality10.jpg'),'Quality',10);
imwrite(Ireference,fullfile(tempdir,'testQuality20.jpg'),'Quality',20);
imwrite(Ireference,fullfile(tempdir,'testQuality50.jpg'),'Quality',50);

Preprocess Compressed Images

Read the compressed versions of the image into the workspace.

I10 = imread(fullfile(tempdir,'testQuality10.jpg'));
I20 = imread(fullfile(tempdir,'testQuality20.jpg'));
I50 = imread(fullfile(tempdir,'testQuality50.jpg'));

Display the compressed images as a montage.

montage({I50,I20,I10},'Size',[1 3])
title('JPEG-Compressed Images with Quality Factor: 50, 20 and 10 (left to right)')

Recall that DnCNN is trained using only the luminance channel of an image because human perception is more sensitive to changes in brightness than changes in color. Convert the JPEG-compressed images from the RGB colorspace to the YCbCr colorspace using the rgb2ycbcr function.

I10ycbcr = rgb2ycbcr(I10);
I20ycbcr = rgb2ycbcr(I20);
I50ycbcr = rgb2ycbcr(I50);

Apply the DnCNN Network

Use the denoiseImage function to perform the forward pass of the network. JPEG compression artifacts can be treated as a type of image noise, so the same function that removes noise from an image also removes the artifacts.

I10y_predicted = denoiseImage(I10ycbcr(:,:,1),net);
I20y_predicted = denoiseImage(I20ycbcr(:,:,1),net);
I50y_predicted = denoiseImage(I50ycbcr(:,:,1),net);

The chrominance channels do not need processing. Concatenate the deblocked luminance channel with the original chrominance channels to obtain the deblocked image in the YCbCr color space.

I10ycbcr_predicted = cat(3,I10y_predicted,I10ycbcr(:,:,2:3));
I20ycbcr_predicted = cat(3,I20y_predicted,I20ycbcr(:,:,2:3));
I50ycbcr_predicted = cat(3,I50y_predicted,I50ycbcr(:,:,2:3));

Convert the deblocked YCbCr image to the RGB color space by using the ycbcr2rgb function.

I10_predicted = ycbcr2rgb(I10ycbcr_predicted);
I20_predicted = ycbcr2rgb(I20ycbcr_predicted);
I50_predicted = ycbcr2rgb(I50ycbcr_predicted);

Display the deblocked images as a montage.

montage({I50_predicted,I20_predicted,I10_predicted},'Size',[1 3])
title('Deblocked Images with Quality Factor: 50, 20 and 10 (left to right)')

To get a better visual understanding of the improvements, examine a smaller region inside each image. Specify a region of interest (ROI) using the vector roi in the format [x y width height]. The elements define the x- and y-coordinates of the top left corner, and the width and height of the ROI.

roi = [30 440 100 80];

Crop the compressed images to this ROI, and display the result as a montage.

i10 = imcrop(I10,roi);
i20 = imcrop(I20,roi);
i50 = imcrop(I50,roi);
montage({i50 i20 i10},'Size',[1 3])
title('Patches from JPEG-Compressed Images with Quality Factor: 50, 20 and 10 (left to right)')

Crop the deblocked images to this ROI, and display the result as a montage.

i10predicted = imcrop(I10_predicted,roi);
i20predicted = imcrop(I20_predicted,roi);
i50predicted = imcrop(I50_predicted,roi);
montage({i50predicted,i20predicted,i10predicted},'Size',[1 3])
title('Patches from Deblocked Images with Quality Factor: 50, 20 and 10 (left to right)')

Quantitative Comparison

Quantify the quality of the deblocked images through four metrics. You can use the displayJPEGResults helper function to compute these metrics for compressed and deblocked images at the quality factors 10, 20, and 50.

  • Structural Similarity Index (SSIM). SSIM assesses the visual impact of three characteristics of an image: luminance, contrast and structure, against a reference image. The closer the SSIM value is to 1, the better the test image agrees with the reference image. Here, the reference image is the undistorted original image, Ireference, before JPEG compression. See ssim for more information about this metric.

  • Peak signal-to-noise ratio (PSNR). The larger the PSNR value, the stronger the signal compared to the distortion. See psnr for more information about this metric.

  • Naturalness Image Quality Evaluator (NIQE). NIQE measures perceptual image quality using a model trained from natural scenes. Smaller NIQE scores indicate better perceptual quality. See niqe for more information about this metric.

  • Blind/Referenceless Image Spatial Quality Evaluator (BRISQUE). BRISQUE measures perceptual image quality using a model trained from natural scenes with image distortion. Smaller BRISQUE scores indicate better perceptual quality. See brisque for more information about this metric.
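The displayJPEGResults helper function is not listed here, but the individual metrics can be computed directly with the documented metric functions. For example, for the quality-10 pair (a sketch, not the helper's actual implementation):

% Compare the compressed and deblocked images against the pristine reference
fprintf('SSIM:    %.4f -> %.4f\n',ssim(I10,Ireference),ssim(I10_predicted,Ireference));
fprintf('PSNR:    %.4f -> %.4f\n',psnr(I10,Ireference),psnr(I10_predicted,Ireference));
fprintf('NIQE:    %.4f -> %.4f\n',niqe(I10),niqe(I10_predicted));
fprintf('BRISQUE: %.4f -> %.4f\n',brisque(I10),brisque(I10_predicted));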

displayJPEGResults(Ireference,I10,I20,I50,I10_predicted,I20_predicted,I50_predicted)
------------------------------------------
SSIM Comparison
===============
I10: 0.90624    I10_predicted: 0.91286
I20: 0.94904    I20_predicted: 0.95444
I50: 0.97238    I50_predicted: 0.97482
------------------------------------------
PSNR Comparison
===============
I10: 26.6046    I10_predicted: 27.0793
I20: 28.8015    I20_predicted: 29.3378
I50: 31.4512    I50_predicted: 31.8584
------------------------------------------
NIQE Comparison
===============
I10: 7.0989    I10_predicted: 3.9334
I20: 4.5065    I20_predicted: 3.0699
I50: 2.8866    I50_predicted: 2.4109
NOTE: Smaller NIQE score signifies better perceptual quality
------------------------------------------
BRISQUE Comparison
==================
I10: 52.2731    I10_predicted: 38.9688
I20: 45.5237    I20_predicted: 30.9583
I50: 27.7386    I50_predicted: 24.3889
NOTE: Smaller BRISQUE score signifies better perceptual quality

Summary

This example shows how to create and train a DnCNN network, then use the network to reduce JPEG compression artifacts in images. These were the steps to train the network:

  • Download the training data.

  • Create training images by writing pristine images in the JPEG file format with various levels of compression.

  • Define a custom mini-batch datastore, called a JPEGimagePatchDatastore, to extract patches from the input compressed image and compute the target residuals from the corresponding patches in the pristine images. This datastore was used to feed training data to the network.

  • Create the layers of the DnCNN network by using the dnCNNLayers function.

  • Specify training options.

  • Train the network using the trainNetwork function.

After training the DnCNN network or loading a pretrained DnCNN network, the example compresses a test image at three quality values, then uses the network to remove the compression artifacts.

References

[1] Zhang, K., W. Zuo, Y. Chen, D. Meng, and L. Zhang, "Beyond a Gaussian Denoiser: Residual Learning of Deep CNN for Image Denoising." IEEE® Transactions on Image Processing. Feb 2017.

[2] Grubinger, M., P. Clough, H. Müller, and T. Deselaers. "The IAPR TC-12 Benchmark: A New Evaluation Resource for Visual Information Systems." Proceedings of the OntoImage 2006 Language Resources For Content-Based Image Retrieval. Genoa, Italy. Vol. 5, May 2006, p. 10.
