Color-Based Image Retrieval - Query by Example


Theodoros Giannakopoulos
E-mail: tyiannak@di.uoa.gr
Website: www.di.uoa.gr/~tyiannak

Introduction

Content-based image retrieval is the task of searching for images in a database by analyzing their contents. This demo presents a simple image retrieval method based on the color distribution of the images. The user provides an "example" image and the search is based on that example (query by image example). This first version of the demo does not use relevance feedback.

Method Description

(A) Training

Almost 1000 images were used to populate the database. For each image, a 3D histogram of its HSV values is computed. At the end of the training stage, all 3D HSV histograms are stored in a single .mat file.
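As an illustration of the training step, the following is a minimal sketch of how a 3D HSV histogram could be computed. It is not the code of the provided getImageHists.m: the function name hsvHist3D, the parameter nBins (bins per channel) and the normalization to unit sum are assumptions made for this example.

```matlab
% Hypothetical helper (NOT the provided getImageHists.m).
% rgbImage : M-by-N-by-3 RGB image (uint8 or double in [0, 1])
% nBins    : assumed number of bins per channel
function H = hsvHist3D(rgbImage, nBins)
    hsv = rgb2hsv(rgbImage);               % H, S and V values, each in [0, 1]
    px  = reshape(hsv, [], 3);             % one row per pixel: [h s v]
    % Map every value in [0, 1] to a bin index in 1..nBins.
    idx = min(floor(px * nBins) + 1, nBins);
    % Count the pixels that fall into each (h, s, v) bin.
    H = accumarray(idx, 1, [nBins nBins nBins]);
    % ASSUMPTION: normalize so that images of different sizes are comparable.
    H = H / sum(H(:));
end
```

Under these assumptions, the training stage would call such a function once per database image and write the resulting histograms to one .mat file with save.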

(B) Query

In order to retrieve M (user-defined) query results, the following steps are executed:

  1. The 3D (HSV) histogram of the query image is computed. Then, the number of bins along each dimension (H, S and V) is doubled by means of interpolation.
  2. For each image i in the database, a similarity value S(i) between the query histogram and the histogram of image i is computed.
  3. The similarity vector is sorted and the user is presented with the images that have the M smallest S values.
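The query steps can be sketched roughly as follows. This is not the code of searchImageHist.m. The names rankByHistogram and doubleBins are hypothetical, the L1 distance is only a stand-in because the actual similarity measure is not given in this description, and the database histograms are assumed to be stored on the same doubled grid as the interpolated query histogram.

```matlab
% Hypothetical sketch of the query stage (NOT the provided searchImageHist.m).
% queryHist : n-by-n-by-n HSV histogram of the query image (n >= 2)
% dbHists   : cell array of database histograms, each 2n-by-2n-by-2n (assumed)
% M         : number of results to return
function [bestIdx, S] = rankByHistogram(queryHist, dbHists, M)
    q = doubleBins(queryHist);
    S = zeros(numel(dbHists), 1);
    for i = 1:numel(dbHists)
        % ASSUMPTION: L1 distance, used here only as a placeholder measure.
        S(i) = sum(abs(q(:) - dbHists{i}(:)));
    end
    [~, order] = sort(S, 'ascend');        % smaller S means closer image
    bestIdx = order(1:min(M, numel(order)));
end

function H2 = doubleBins(H)
    % Double the number of bins along each dimension by linear interpolation.
    sz = size(H);
    [a, b, c] = ndgrid(linspace(1, sz(1), 2 * sz(1)), ...
                       linspace(1, sz(2), 2 * sz(2)), ...
                       linspace(1, sz(3), 2 * sz(3)));
    H2 = interpn(H, a, b, c, 'linear');
    H2 = H2 / sum(H2(:));                  % renormalize to unit sum
end
```

Here bestIdx holds the database indices of the M closest images, ordered from the most to the least similar.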

Provided Matlab files

getImageHists.m: Computes the (3D) HSV histogram of an image.

searchImageHist.m: This is the main m-file. It computes the histogram of the given image and then returns the most similar images based on the training data.

model1Hist.mat: This is the .mat file that contains the training data, i.e., the histograms of the almost 1000 image samples.

In addition, the thumbnails of the training images are stored in the folder \images2. Finally, 8 test query images are provided in the root folder.

Execution Example

Suppose that we want to execute a query based on the image 'redflower.jpg', and that we want 11 images to be returned:

>> searchImageHist('redflower.jpg', 'model1Hist', 11);

The execution contains two basic steps (as described above):

(a) First, the 3D histogram of the query image is calculated. This may take about 0.5 seconds for an 800x600 color image.

(b) Once the histogram has been calculated, the search algorithm described above is executed. During the search, for user interface reasons, some images (NOT all images of the database) are selected to be plotted, based on a simple thresholding criterion: only images whose similarity measure is smaller than a predefined threshold are presented (see Figure 1). When the search is complete, the 11 closest images are presented (Figure 2).
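The difference between the images previewed during the search and the final result can be illustrated with a short sketch. The function name selectForDisplay and the threshold parameter T are hypothetical; the actual threshold value used by the demo is not stated here.

```matlab
% Hypothetical sketch (NOT part of the provided files).
% S : vector of similarity values, one per database image (smaller = closer)
% T : assumed display threshold
% M : number of final results
function [preview, final] = selectForDisplay(S, T, M)
    S = S(:);
    preview = find(S < T);                  % images plotted while searching
    [~, order] = sort(S, 'ascend');
    final = order(1:min(M, numel(order)));  % the M closest images, shown at the end
end
```

Note that the two sets need not coincide: the preview depends only on the threshold, while the final list always contains (at most) M images.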

Figure 1: While the search is running, some similar images (selected using a predefined threshold) are presented.

Figure 2: When the process is completed, the query image is presented along with the closest images (here, 11).