Correlation between two datasets
Hello, I have two data sets and would like some measure of the similarity between them. The two plots are http://i.imgur.com/Oa0BNWv.png and http://i.imgur.com/Ti3LcVm.png
I have two sets of [x,y] points, and the two sets are not the same size. Admittedly, I don't have much of a clue what to do. The two plots look visually similar, and I'd like a way to quantify that.
1 Comment
DIPTI MISHRA
on 1 Jul 2020
Can anyone suggest how to plot the distribution for a complete dataset?
Actually, I want to check the similarity between two image datasets.
Answers (3)
Ahmet Cecen
on 12 May 2016
Edited: Ahmet Cecen
on 12 May 2016
0 votes
There are 3 ways I am aware of that you can do this:
- You find a way to make them have the same number of points and use the Pearson product-moment correlation coefficient.
- You find a way to fit some distribution to them, maybe to the x and y components independently, and compare the fit parameters, maybe with a t-test for statistical significance.
- You find a way to grid your data, which might involve rounding etc. You need, say, a 1000x1000 matrix with 1s wherever there is a data point, so that plotting it with imagesc gives an image similar to the plots you shared. You can then use a convolution to obtain some measure of similarity.
Without knowing more about your data, this is all I can suggest.
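One possible way to combine the first and third suggestions is sketched below: both point sets are binned onto the same grid, which gives two vectors of equal length regardless of how many points each set has, and the Pearson coefficient is then computed over the grid cells. This is only an illustration under stated assumptions. The function names (`common_bounds`, `grid_counts`, `pearson`, `grid_similarity`) and the default of 50 bins are made up for this example, and Python is used only to keep the sketch self-contained.

```python
import math

def common_bounds(a, b):
    """Bounding box (xmin, xmax, ymin, ymax) covering both point sets."""
    pts = list(a) + list(b)
    xs = [p[0] for p in pts]
    ys = [p[1] for p in pts]
    return min(xs), max(xs), min(ys), max(ys)

def grid_counts(points, bounds, n_bins):
    """Bin (x, y) points into an n_bins x n_bins occupancy grid (1 = cell holds a point)."""
    xmin, xmax, ymin, ymax = bounds
    grid = [[0] * n_bins for _ in range(n_bins)]
    for x, y in points:
        # Scale each coordinate to a bin index; the upper edge is clamped into the last bin.
        i = min(int((y - ymin) / (ymax - ymin) * n_bins), n_bins - 1)
        j = min(int((x - xmin) / (xmax - xmin) * n_bins), n_bins - 1)
        grid[i][j] = 1
    return grid

def pearson(u, v):
    """Pearson product-moment correlation of two equal-length sequences."""
    n = len(u)
    mu, mv = sum(u) / n, sum(v) / n
    cov = sum((p - mu) * (q - mv) for p, q in zip(u, v))
    su = math.sqrt(sum((p - mu) ** 2 for p in u))
    sv = math.sqrt(sum((q - mv) ** 2 for q in v))
    return cov / (su * sv)

def grid_similarity(a, b, n_bins=50):
    """Pearson correlation between the occupancy grids of two point sets."""
    bounds = common_bounds(a, b)
    flat_a = [c for row in grid_counts(a, bounds, n_bins) for c in row]
    flat_b = [c for row in grid_counts(b, bounds, n_bins) for c in row]
    return pearson(flat_a, flat_b)
```

The sketch assumes the points span a non-zero range in both x and y and that neither grid ends up completely empty or completely full (otherwise the divisions fail). The result also depends on the bin size: a coarse grid is forgiving of small offsets between the two patterns, while a fine grid only rewards points that land in exactly the same cell.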
6 Comments
Image Analyst
on 12 May 2016
For #3, I think you really might mean normalized cross-correlation, done by normxcorr2(). Demo is attached. I'm not sure how well it would work, but it is easy to try and see. Just correlate the two gridded patterns, find the peak value, and see if it goes down for dissimilar point patterns.
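The "correlate and find the peak" idea can be sketched on small gridded patterns as follows. This is not normxcorr2() itself: it is a simplified brute-force variant that skips mean subtraction and normalizes by the total energy of both grids rather than locally per window. The function name `ncc_peak` and the `max_shift` parameter are hypothetical, chosen for this example.

```python
import math

def ncc_peak(a, b, max_shift):
    """Return (peak, (dy, dx)) for two equal-size grids.

    For every integer shift within +/- max_shift, grid b is compared with
    grid a displaced by (dy, dx), and the overlap is divided by
    sqrt(sum(a^2) * sum(b^2)). Cells shifted outside the grid count as 0.
    """
    rows, cols = len(a), len(a[0])
    energy_a = sum(v * v for row in a for v in row)
    energy_b = sum(v * v for row in b for v in row)
    norm = math.sqrt(energy_a * energy_b)
    if norm == 0:
        return 0.0, (0, 0)  # at least one grid is empty
    best, best_shift = -1.0, (0, 0)
    for dy in range(-max_shift, max_shift + 1):
        for dx in range(-max_shift, max_shift + 1):
            total = 0
            for i in range(rows):
                ii = i + dy
                if not 0 <= ii < rows:
                    continue
                for j in range(cols):
                    jj = j + dx
                    if 0 <= jj < cols:
                        total += a[i][j] * b[ii][jj]
            score = total / norm
            if score > best:
                best, best_shift = score, (dy, dx)
    return best, best_shift
```

With this normalization, a pattern and a translated copy of it that stays inside the grid give a peak of 1 at the matching shift, and the peak drops as fewer points can be lined up. The nested loops are only practical for small grids; for something like 1000x1000 an FFT-based routine would be the sensible choice.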
Ahmet Cecen
on 12 May 2016
Normalized cross-correlation is only one convolution-based similarity metric, and it has the virtue of not requiring further processing. However, there are many other spectrum-based methods that can be calculated with convolutions, or n-point correlation functions etc. Admittedly, anything beyond normxcorr2() might be overkill for a problem like this.
Image Analyst
on 12 May 2016
And correlation and convolution require a regularly spaced array. Do you know of any that can work on an N-by-2 list of (x,y) coordinates that are not necessarily on a regularly spaced grid?
Ahmet Cecen
on 12 May 2016
That is why the sentence starts with: "You find a way to grid your data,"
All things considered, for a dataset like this that shouldn't introduce too much error. His PNGs are 500x500 and each data point seems to resolve fairly well in the image. He can easily use a 1000x1000 or even 2000x2000 grid to be more accurate, since 2D convolutions at that size need very little memory and are very fast to calculate (barring extreme cases).
The resource you mentioned seems to use distribution-based similarity measures (based on a 3-minute skim), in a more elaborate way than my somewhat simple suggestion in #2. This too should work in most cases, except where two datasets have almost identical distributions but look fairly different (a case that is easy to construct if you are trying to beat the system, but that in my experience rarely happens in a real dataset).
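As a rough illustration of a distribution-based comparison and of the caveat above, the sketch below compares the x and y marginals of two point sets with the two-sample Kolmogorov-Smirnov statistic. Note that this is a swapped-in technique chosen because it works on sets of different sizes without any fitting; it is neither the fit-and-t-test idea from #2 nor the method in the linked resource. The names `ks_statistic` and `marginal_ks` are hypothetical.

```python
def ks_statistic(u, v):
    """Two-sample Kolmogorov-Smirnov statistic: the largest gap between the two empirical CDFs."""
    u, v = sorted(u), sorted(v)
    i = j = 0
    d = 0.0
    while i < len(u) and j < len(v):
        t = min(u[i], v[j])
        # Advance both samples past every value <= t, then compare the CDFs at t.
        while i < len(u) and u[i] <= t:
            i += 1
        while j < len(v) and v[j] <= t:
            j += 1
        d = max(d, abs(i / len(u) - j / len(v)))
    return d

def marginal_ks(a, b):
    """KS statistics of the x marginals and the y marginals of two (x, y) point sets."""
    dx = ks_statistic([p[0] for p in a], [p[0] for p in b])
    dy = ks_statistic([p[1] for p in a], [p[1] for p in b])
    return dx, dy
```

A value of 0 means the two empirical distributions coincide and 1 means they do not overlap at all. The failure case is easy to reproduce: the points (0,0), (1,1), (2,2) on a diagonal and the points (0,2), (1,1), (2,0) on the anti-diagonal have identical x and y marginals, so `marginal_ks` returns (0.0, 0.0) even though the two patterns clearly differ.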
I would suggest choosing whichever option is closest to your background and appears workable.
Side note: I have tried a few non-uniform grid versions in my research before, although, given a fine enough grid, the ones I tried became effectively equivalent to rounding to the nearest grid point in most cases. Grid size becomes a real concern when the convolution needs to happen in 3D, since a 2000x2000x2000 grid requires considerable resources.
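To put rough numbers on the 2D versus 3D difference, here is a back-of-the-envelope calculation. It assumes a dense array with 8 bytes per cell (double precision, MATLAB's default numeric type); the helper name `grid_bytes` is made up for this example.

```python
def grid_bytes(shape, bytes_per_cell=8):
    """Memory needed to store one dense grid of the given shape, in bytes."""
    cells = 1
    for side in shape:
        cells *= side
    return cells * bytes_per_cell

print(grid_bytes((2000, 2000)) / 1e6, "MB")        # one 2D grid of doubles
print(grid_bytes((2000, 2000, 2000)) / 1e9, "GB")  # one 3D grid of doubles
print(grid_bytes((2000, 2000, 2000), 1) / 1e9, "GB")  # same 3D grid at 1 byte per cell
```

A 2000x2000 grid is 4 million cells, or 32 MB as doubles, while a 2000x2000x2000 grid is 8 billion cells, or 64 GB as doubles and still 8 GB at one byte per cell. That is for a single copy of one grid; a convolution needs at least two inputs and an output, and FFT-based implementations typically need additional working arrays on top of that.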
Austin Gonzalez
on 13 May 2016
Ahmet Cecen
on 13 May 2016
If you upload a pair of example datasets, I can show you. Otherwise it will take too long to explain.
Image Analyst
on 12 May 2016
0 votes
You might take a look at "the bible" for analyzing spatial point patterns - the book by Adrian Baddeley of CSIRO: http://umaine.edu/computingcoursesonline/files/2011/07/SpatstatmodelingWorkshop.pdf This is the authoritative reference on the topic.
Image Analyst
on 14 May 2016
0 votes
You will find lots of algorithms if you search on "point matching algorithms":