Path: news.mathworks.com!not-for-mail
From: <HIDDEN>
Newsgroups: comp.soft-sys.matlab
Subject: Re: How to find similar images
Date: Thu, 9 Aug 2007 09:00:54 +0000 (UTC)
Organization: STFC Rutherford Appleton Laboratory
Lines: 59
Message-ID: <f9el46$9o4$1@fred.mathworks.com>
References: <f9dggq$lu8$1@fred.mathworks.com>
Reply-To: <HIDDEN>
NNTP-Posting-Host: webapp-01-blr.mathworks.com
Content-Type: text/plain; charset="ISO-8859-1"
Content-Transfer-Encoding: 8bit
X-Trace: fred.mathworks.com 1186650054 9988 172.30.248.36 (9 Aug 2007 09:00:54 GMT)
X-Complaints-To: news@mathworks.com
NNTP-Posting-Date: Thu, 9 Aug 2007 09:00:54 +0000 (UTC)
X-Newsreader: MATLAB Central Newsreader 968489
Xref: news.mathworks.com comp.soft-sys.matlab:423126



"Yair Altman" <altmanyDEL@gmailDEL.comDEL> wrote in message 
<f9dggq$lu8$1@fred.mathworks.com>...
> I have a large set of color images. I wish to find a
> characteristic number for each of them, so closer numbers
> would represent "similar" images (in very loose terms). I'd
> like the characteristic to be tolerant to cropping &
> resizing and if possible also to rotation, flipping &
> saturation. If possible, I'd also like it to be computed
> quickly (my set is very large).
> 
> The ultimate aim is to winnow down the large set to a much
> smaller subset of potential matching images, that would then
> be inspected manually.
> 
> Some simple functions that I thought of were the grayscale
> rms/mode/std/kurtosis/skewness. Could anyone please comment
> on which of these functions (or any other) would be best for
> my needs?
> 
> (I'm not an imaging expert nor have the Image Processing
> Toolbox...)
> 
> Thanks in advance,
> Yair

This is how I achieved what you are looking for:

1) Split the colour image into an intensity plane plus 
colour planes (e.g. HSI, or amplitude and normalized RGB) 
and histogrammed each plane.
2) Converted each histogram into a probability function by 
dividing by the number of pixels in the image (the sum of 
the histogram), which made it independent of image size.
3) Bucketed each histogram into a 16-element feature vector.
4) Ran a texture analysis on each plane using the 'Laws' 
convolution method (have a Google, there are plenty of 
descriptions), and applied the identical histogram 
operation to the texture images.
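
The four steps above can be sketched roughly as follows. This is 
an illustration in plain Python (standard library only), not the 
code I actually used. Assumptions: the image is a flat, row-major 
list of (r, g, b) tuples with values 0-255; the colour planes are 
normalized r and g (b is redundant because r + g + b = 1); only 
three Laws masks are used, picked arbitrarily; and the texture 
response is scaled into [0, 1] by its theoretical maximum. All 
function names are hypothetical.

```python
# Laws 1-D vectors; each 5x5 mask is the outer product of a column and a row vector.
L5 = [1, 4, 6, 4, 1]     # level
E5 = [-1, -2, 0, 2, 1]   # edge
S5 = [-1, 0, 2, 0, -1]   # spot
MASKS = [(E5, L5), (L5, E5), (S5, S5)]  # illustrative subset, not a recommendation


def split_planes(pixels):
    """Step 1: intensity plus normalized r and g planes, all scaled to [0, 1]."""
    intensity, r_norm, g_norm = [], [], []
    for r, g, b in pixels:
        total = r + g + b
        intensity.append(total / (3 * 255))
        # Black pixels have no defined chromaticity; use the neutral value 1/3.
        r_norm.append(r / total if total else 1 / 3)
        g_norm.append(g / total if total else 1 / 3)
    return intensity, r_norm, g_norm


def histogram16(values, bins=16):
    """Steps 2 and 3: 16-bin histogram of values in [0, 1], divided by the count."""
    counts = [0] * bins
    for v in values:
        counts[min(int(v * bins), bins - 1)] += 1
    n = max(len(values), 1)  # avoid dividing by zero on an empty plane
    return [c / n for c in counts]


def laws_texture(plane, width, height, col_vec, row_vec):
    """Step 4: absolute mask response over the valid region, scaled into [0, 1]."""
    mask = [[c * r for r in row_vec] for c in col_vec]
    weights = [w for row in mask for w in row]
    # Largest possible |response| for inputs in [0, 1].
    scale = max(sum(w for w in weights if w > 0),
                -sum(w for w in weights if w < 0))
    out = []
    for y in range(height - 4):
        for x in range(width - 4):
            acc = sum(mask[j][i] * plane[(y + j) * width + x + i]
                      for j in range(5) for i in range(5))
            out.append(abs(acc) / scale)
    return out


def image_features(pixels, width, height):
    """Concatenate colour and texture histograms for every plane."""
    feats = []
    for plane in split_planes(pixels):
        feats.extend(histogram16(plane))
        for col_vec, row_vec in MASKS:
            texture = laws_texture(plane, width, height, col_vec, row_vec)
            feats.extend(histogram16(texture))
    return feats
```

With 3 planes and 3 masks this gives 3 x (1 + 3) x 16 = 192 
numbers per image. Because every histogram is divided by its 
pixel count, the colour part is unchanged when an image is 
enlarged by simple pixel replication.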

I concatenated all of these 16-bin probability vectors 
into a single feature vector, and compared images for 
similarity using a simple Euclidean distance. 
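
As a minimal sketch of that comparison in plain Python: the 
helper names and the idea of holding the library as a dict of 
name -> feature vector are assumptions made for illustration.

```python
import math


def euclidean(a, b):
    """Euclidean distance between two feature vectors of equal length."""
    if len(a) != len(b):
        raise ValueError("feature vectors must have the same length")
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def closest(query, library, k=3):
    """Names of the k library entries nearest to the query vector."""
    ranked = sorted(library, key=lambda name: euclidean(query, library[name]))
    return ranked[:k]
```

Ranking the whole library this way is one pass per query, which 
is usually fast enough to shortlist candidates for manual 
inspection.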

As I added images I was able to calculate the mean and 
standard deviation of each feature vector component, which 
allowed a full statistical comparison to select the group 
that the latest image was closest to.
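
One way to keep those statistics up to date as images arrive is 
sketched below. The exact form of the statistical comparison is 
not spelled out above, so the per-component standardized 
distance here is an assumption, as are the class name and the 
small eps that guards against zero variance. The update itself 
is Welford's online algorithm.

```python
import math


class RunningStats:
    """Per-component running mean and sample standard deviation for one group."""

    def __init__(self, size):
        self.n = 0
        self.mean = [0.0] * size
        self.m2 = [0.0] * size  # running sum of squared deviations

    def add(self, vec):
        """Fold one feature vector into the statistics (Welford update)."""
        self.n += 1
        for i, x in enumerate(vec):
            delta = x - self.mean[i]
            self.mean[i] += delta / self.n
            self.m2[i] += delta * (x - self.mean[i])

    def std(self):
        """Sample standard deviation per component (zeros until two vectors are seen)."""
        if self.n < 2:
            return [0.0] * len(self.mean)
        return [math.sqrt(m / (self.n - 1)) for m in self.m2]

    def distance(self, vec, eps=1e-6):
        """Distance from vec to the group mean, in units of standard deviation."""
        return math.sqrt(sum(((x - m) / (s + eps)) ** 2
                             for x, m, s in zip(vec, self.mean, self.std())))
```

A new image would then be assigned to whichever group reports 
the smallest distance, and added to that group's statistics.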

I did experiment with adding image entropy to my feature 
vector, which improved its selectivity noticeably, though 
not dramatically.
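
For illustration, entropy can be computed directly from one of 
the normalized histograms. This sketch assumes Shannon entropy 
in bits; which plane or bin count to use is left open, and the 
function name is hypothetical.

```python
import math


def entropy_bits(probabilities):
    """Shannon entropy, in bits, of a histogram that sums to 1."""
    # Empty bins contribute nothing (p * log p tends to 0 as p tends to 0).
    return -sum(p * math.log2(p) for p in probabilities if p > 0)
```

A flat 16-bin histogram gives the maximum of 4 bits, while an 
image whose pixels all fall in one bin gives 0.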

Hope that helps

Dave Robinson