Thread Subject: Finding closest numbers

Subject: Finding closest numbers

From: Kian

Date: 27 Aug, 2009 22:36:02

Message: 1 of 14

Hi,

I'm doing some reverse correlation analysis using MATLAB. I have a whole bunch of timestamps for the pages I show on the screen and a whole bunch of timestamps for the spikes that were fired. I'd like to find the screen timestamp that is closest to each spike timestamp.

e.g.:

screents = [100 200 300 400 500 600 700 800 900 1200 2400 5100];
spikets = [124 254 365 561 887 901 4600];

Currently, I have a for loop that goes through spikests, subtracts that from the screents, and finds the smallest non-positive number. That'd be the index to my screen timestamp.

Note: the screents has to be smaller than the spikets. For example, for spikets 887, even though it's closer to 900 the number I'm looking for is 800.

I tried multiplying the 2 matrices together:

1./spikests .* screents and then find the largest value that's less than 1. This method is fast, except that 1. I don't know how to find those values (largest that's less than one) in a vectorized fashion, and that matrix size gets so big that I get an out of memory error.

I work with vector sizes over 60k elements (for both), so my current method (for-loop) takes forever.

Any help will be greatly appreciated.

Use the following codes to generate two sample data points:

screents = sort(unique(round(rand(1,20)*1000)));
spikests = sort(unique(round(rand(10,1)*1000)));

Subject: Finding closest numbers

From: Darren Rowland

Date: 28 Aug, 2009 01:39:10

Message: 2 of 14

"Kian " <kian.torab@utah.edu> wrote in message <h771oi$g76$1@fred.mathworks.com>...
> Hi,
>
> I'm doing some reverse correlation analysis using MATLAB. I have a whole bunch of timestamps for the pages I show on the screen and a whole bunch of timestamps for the spikes that were fired. I'd like to find the screen timestamp that is closest to each spike timestamp.
>
> e.g.:
>
> screents = [100 200 300 400 500 600 700 800 900 1200 2400 5100];
> spikets = [124 254 365 561 887 901 4600];
>
> Currently, I have a for loop that goes through spikests, subtracts that from the screents, and finds the smallest non-positive number. That'd be the index to my screen timestamp.
>
> Note: the screents has to be smaller than the spikets. For example, for spikets 887, even though it's closer to 900 the number I'm looking for is 800.
>
> I tried multiplying the 2 matrices together:
>
> 1./spikests .* screents and then find the largest value that's less than 1. This method is fast, except that 1. I don't know how to find those values (largest that's less than one) in a vectorized fashion, and that matrix size gets so big that I get an out of memory error.
>
> I work with vector sizes over 60k elements (for both), so my current method (for-loop) takes forever.
>
> Any help will be greatly appreciated.
>
> Use the following codes to generate two sample data points:
>
> screents = sort(unique(round(rand(1,20)*1000)));
> spikests = sort(unique(round(rand(10,1)*1000)));

Hi Kian,
I've written some code, which is untested but should do what you're after. It relies on the fact that the vectors are sorted. The largest value in screents less than each spikets value is stored in c

jj = 1;
for ii = 1:numel(spikets)
    while spikets < screents(jj)
        jj = jj + 1;
    end
    c(ii) = screents(jj-1);
end

Hope that does the job,
Darren

Subject: Finding closest numbers

From: Darren Rowland

Date: 28 Aug, 2009 01:53:10

Message: 3 of 14

As a follow up to the code above, it is possible that the index into the screents vector (jj) could run off the end. This will happen if the largest value in spikets is greater than the largest value in screents.
A quick and dirty solution would be to append a large value to screents before entering the loop so
screents(end+1) = Inf;
would do. Otherwise add in a check for reaching the end of screents in which case you can short-circuit, since all remaining spikets values must share this screents value.

Subject: Finding closest numbers

From: Matt Fig

Date: 28 Aug, 2009 02:00:31

Message: 4 of 14

"Kian " <kian.torab@utah.edu> wrote in message <h771oi$g76$1@fred.mathworks.com>...
> Hi,
>
> I'm doing some reverse correlation analysis using MATLAB. I have a whole bunch of timestamps for the pages I show on the screen and a whole bunch of timestamps for the spikes that were fired. I'd like to find the screen timestamp that is closest to each spike timestamp.
>
> e.g.:
>
> screents = [100 200 300 400 500 600 700 800 900 1200 2400 5100];
> spikets = [124 254 365 561 887 901 4600];
>
> Currently, I have a for loop that goes through spikests, subtracts that from the screents, and finds the smallest non-positive number. That'd be the index to my screen timestamp.
>
> Note: the screents has to be smaller than the spikets. For example, for spikets 887, even though it's closer to 900 the number I'm looking for is 800.
>
> I tried multiplying the 2 matrices together:
>
> 1./spikests .* screents and then find the largest value that's less than 1. This method is fast, except that 1. I don't know how to find those values (largest that's less than one) in a vectorized fashion, and that matrix size gets so big that I get an out of memory error.
>
> I work with vector sizes over 60k elements (for both), so my current method (for-loop) takes forever.
>
> Any help will be greatly appreciated.
>
> Use the following codes to generate two sample data points:
>
> screents = sort(unique(round(rand(1,20)*1000)));
> spikests = sort(unique(round(rand(10,1)*1000)));




But those random vectors will not always fulfill your stated requirements. It seems like you would need to ensure:

min(spikets)>min(screents)
max(spikets)<max(screents)

Is that the case? If not, what to do in the case where we cannot find a value in screents which is less than the value in spikets?

Subject: Finding closest numbers

From: Jos

Date: 28 Aug, 2009 06:14:02

Message: 5 of 14

"Kian " <kian.torab@utah.edu> wrote in message <h771oi$g76$1@fred.mathworks.com>...
> Hi,
>
> I'm doing some reverse correlation analysis using MATLAB. I have a whole bunch of timestamps for the pages I show on the screen and a whole bunch of timestamps for the spikes that were fired. I'd like to find the screen timestamp that is closest to each spike timestamp.
>
> e.g.:
>
> screents = [100 200 300 400 500 600 700 800 900 1200 2400 5100];
> spikets = [124 254 365 561 887 901 4600];
>
> Currently, I have a for loop that goes through spikests, subtracts that from the screents, and finds the smallest non-positive number. That'd be the index to my screen timestamp.
>
> Note: the screents has to be smaller than the spikets. For example, for spikets 887, even though it's closer to 900 the number I'm looking for is 800.
>
> I tried multiplying the 2 matrices together:
>
> 1./spikests .* screents and then find the largest value that's less than 1. This method is fast, except that 1. I don't know how to find those values (largest that's less than one) in a vectorized fashion, and that matrix size gets so big that I get an out of memory error.
>
> I work with vector sizes over 60k elements (for both), so my current method (for-loop) takes forever.
>
> Any help will be greatly appreciated.
>
> Use the following codes to generate two sample data points:
>
> screents = sort(unique(round(rand(1,20)*1000)));
> spikests = sort(unique(round(rand(10,1)*1000)));

Take a look at NEARESTPOINT:
http://www.mathworks.com/matlabcentral/fileexchange/8939

A = rand(60000,1) ;
B = rand(60000,1) ;
tic ; idx = nearestpoint(A,B) ; toc ;
% Elapsed time is 0.079319 seconds.
% Check a random entry
n = ceil(numel(A) * rand) ;
m1 = min(abs(A(n)-B))
m2 = abs(A(n) - B(idx(n)))
isequal(m1, m2)
% yep!

hth
Jos

Subject: Finding closest numbers

From: Kian

Date: 28 Aug, 2009 18:35:44

Message: 6 of 14

Hi everyone,

Thanks for your help.

Darren, thanks for putting the time, but unfortunately, for-loop is not a method I was looking for. Sorry, I forgot to mention that. I have huge arrays, about 50k for spikes and another 50k for screens. That's only for 1 unit on one channel. Given that I have 96 channels and up to 5 units it will take me years to run a for-loop. To answer your question, yes, there may be times when there's a frame, but no spike afterwards. The code should check for that, of course.

Jos, I'd have to look into your method. I need to download that function. That's exactly what I'm looking for.

I actually found a method to do it which is really fast (have to compare with FINDNEAREST...). Here it is:

mainMat = 1./double(spikeTS) * screenTS;
mainMat(mainMat>1) = 0;
drivMat = diff(mainMat')';
spikeIndices = find(drivMat < 0);
[i, j] = ind2sub(size(drivMat), spikeIndices);

It's fast and vectorized!

Thanks,
Kian

Subject: Finding closest numbers

From: Bruno Luong

Date: 28 Aug, 2009 19:17:19

Message: 7 of 14

For sorted array A; If you prefer using a stock function try

idx = interp1(A, 1:length(A), B, 'nearest')

INTERP1 is a slow function. For faster, use HISTC

[trash idx] = histc(A, B);

Beside Jos's, there are few other alternative on FEX too.

Bruno

Subject: Finding closest numbers

From: Jan Simon

Date: 28 Aug, 2009 21:00:20

Message: 8 of 14

Dear Bruno!

> For sorted array A; If you prefer using a stock function try
> idx = interp1(A, 1:length(A), B, 'nearest')

The OP does not want the nearest, but the next smaller number:
> Note: the screents has to be smaller than the spikets. For example, for spikets 887, even though it's closer to 900 the number I'm looking for is 800.

If speed matters, a MEX script whould be a good idea.

Good luck, Jan

Subject: Finding closest numbers

From: Bruno Luong

Date: 28 Aug, 2009 21:32:20

Message: 9 of 14

"Jan Simon" <matlab.THIS_YEAR@nMINUSsimon.de> wrote in message <h79gh4$ras$1@fred.mathworks.com>...
> Dear Bruno!
>
> > For sorted array A; If you prefer using a stock function try
> > idx = interp1(A, 1:length(A), B, 'nearest')
>
> The OP does not want the nearest, but the next smaller number:

Oh, thank you Jan; I'm confused with the "timestamp" description. If it's the case; this should do.

idx = ceil(interp1(A, 1:length(A), B)

> > Note: the screents has to be smaller than the spikets. For example, for spikets 887, even though it's closer to 900 the number I'm looking for is 800.
>
> If speed matters, a MEX script whould be a good idea.
>

Yes, I have submitted something called FIND_IDX in FEX just for the speed purpose.

Bruno

Subject: Finding closest numbers

From: Darren Rowland

Date: 1 Sep, 2009 10:55:17

Message: 10 of 14

Kian,
I was a bit puzzled that you would dismiss the code that I wrote, simply because it has a for loop. I decided to compare the methods of Jos, Bruno and yourself. Unfortunately I could not get yours to run (this line generates a matrix dimension error mainMat = 1./spikets * screents;).

The function I used to make the comparison is below the timing results. You will notice that the screen vector has been modified to address the issues that Matt raised above.

Anyway here are the comparisons for 1 million elements (3 times)

>> speedtest(1000000)
Darren Method
Elapsed time is 0.078000 seconds.
Jos Method
Elapsed time is 0.859000 seconds.
Bruno Method
Elapsed time is 1.110000 seconds.

>> speedtest(1000000)
Jos Method
Elapsed time is 1.141000 seconds.
Darren Method
Elapsed time is 0.078000 seconds.
Bruno Method
Elapsed time is 1.172000 seconds.

>> speedtest(1000000)
Bruno Method
Elapsed time is 1.172000 seconds.
Jos Method
Elapsed time is 1.172000 seconds.
Darren Method
Elapsed time is 0.078000 seconds.

Fairly consistent across the board. Now for 10 million elements.

>> speedtest(10000000)
Bruno Method
Elapsed time is 11.860000 seconds.
Jos Method
Elapsed time is 10.828000 seconds.
Darren Method
Elapsed time is 0.859000 seconds.

The for loop method is 10 times faster in both cases. I would be interested if the values were similar on your machine.
Darren.

% Function
function speedtest(k)

screents = sort(round(100000*rand(k,1)));
screents(1) = -Inf;
screents(end) = Inf;
spikets = sort(round(100000*rand(k,1)));
A = unique(screents);
B = spikets;

disp('Bruno Method')
tic; idxBruno = ceil(interp1(A, 1:length(A), B)); toc;

% Jos's Method (finds nearest NOT nearest and smaller)
disp('Jos Method')
tic; idx = nearestpoint(A,B) ; toc;

% disp('Kian Method')
% tic;
% mainMat = 1./spikets * screents; % error on this line
% mainMat(mainMat>1) = 0;
% drivMat = diff(mainMat')';
% spikeIndices = find(drivMat < 0);
% [i, j] = ind2sub(size(drivMat), spikeIndices);
% toc;
% clear mainMat drivMat spikeIndices

disp('Darren Method')
tic;
lspike = length(spikets);
idxDarren = zeros(lspike,1);
jj = 1;
for ii = 1:lspike
    while spikets(ii) > screents(jj)
        jj = jj + 1;
    end
    idxDarren(ii) = jj-1;
end
toc;

Subject: Finding closest numbers

From: Jos

Date: 1 Sep, 2009 11:08:03

Message: 11 of 14

"Darren Rowland" <darrenjremovethisrowland@hotmail.com> wrote in message

> % Jos's Method (finds nearest NOT nearest and smaller)

... but it does. This part of the help explains how:

    NEARESTPOINT(X, Y, M) specifies the operation mode M:
    'nearest' : default, same as above
    'previous': find the points in Y that are closest, but preceeds a point in X
                NEARESTPOINT([0 4 3 12],[0 3],'previous') -> [NaN 2 1 2]
    'next' : find the points in Y that are closets, but follow a point in X
                NEARESTPOINT([1 4 3 12],[0 3],'next') -> [2 NaN 2 NaN]

Note that my code was written before JIT acceleration worked properly.

Jos

Subject: Finding closest numbers

From: Bruno Luong

Date: 1 Sep, 2009 11:24:01

Message: 12 of 14

"Darren Rowland" <darrenjremovethisrowland@hotmail.com> wrote in message <h7iuil$fmv$1@fred.mathworks.com>...
> Kian,
> I was a bit puzzled that you would dismiss the code that I wrote, simply because it has a for loop. I decided to compare the methods of Jos, Bruno and yourself. Unfortunately I could not get yours to run (this line generates a matrix dimension error mainMat = 1./spikets * screents;).
>
> The function I used to make the comparison is below the timing results. You will notice that the screen vector has been modified to address the issues that Matt raised above.

Darren, for the complexity point of view your method is the best and it clearly shows especially for large array, no double. IIRC there is a FEX that merges two sorted arrays that works on the same principle. For longtime I want to write such algorithm in the MEX form. Such code is surely missing is built-in stock functions for array manipulation.

Bruno

Subject: Finding closest numbers

From: Kian

Date: 1 Sep, 2009 20:58:04

Message: 13 of 14

Hi Darren,

I apologies for dismissing your method. It does it really really fast. Weird how for loops are much faster than the vectorized methods. I'm glad mine crashed on you. It'd be very embarresing if it did work. It turns out that it's a very slow method and crashes on a Core i7 with 8 GB @ 1600. Ha!

Thanks all. When I start my own company I'm hiring all of you :).

K

Subject: Finding closest numbers

From: Darren Rowland

Date: 2 Sep, 2009 03:32:09

Message: 14 of 14

@ Jos
My mistake. I actually read the help but didn't connect the fact that when the arrays are sorted the values returned from 'previous' will satisfy Kian's problem statement.

@ Kian
Glad to help anyway, I already have a job ;)

Tags for this Thread

Everyone's Tags:

Add a New Tag:

Separated by commas
Ex.: root locus, bode

What are tags?

A tag is like a keyword or category label associated with each thread. Tags make it easier for you to find threads of interest.

Anyone can tag a thread. Tags are public and visible to everyone.

Tag Activity for This Thread
Tag Applied By Date/Time
forloopking Matt Fig 1 Sep, 2009 08:14:14
reverse correla... Kian 27 Aug, 2009 18:39:06
rssFeed for this Thread

Contact us at files@mathworks.com