Discover MakerZone

MATLAB and Simulink resources for Arduino, LEGO, and Raspberry Pi

Learn more

Discover what MATLAB® can do for your career.

Opportunities for recent engineering grads.

Apply Today

Thread Subject:
select top n from a matrix

Subject: select top n from a matrix

From: Iain

Date: 17 Sep, 2010 14:01:06

Message: 1 of 12

Hey guys,

I'm still fairly new to Matlab and have come across something that I just can't do without resorting to a loop. I was wondering if somebody here might be able to nudge me in the right direction. If I have an array like this:

sample = [1 24; 1 15; 1 21; 2 12; 2 5; 2 19; 2 21; 3 4; 3 23; 3 19]

sample =

     1 24
     1 15
     1 21
     2 12
     2 5
     2 19
     2 21
     3 4
     3 23
     3 19

For each value in the first column I want to select the highest two values in the second column. So I am looking for this result:

results =

     1 24
     1 21
     2 19
     2 21
     3 23
     3 19

I can do it in a loop and I can do it without a loop in about 50 steps which is probably less efficient than using the loop. I cannae do it efficiently though.

Obviously this needs to scale up to fairly massive matrices - hence the focus on efficiency.

Any ideas?

Cheers
Iain

Subject: select top n from a matrix

From: someone

Date: 17 Sep, 2010 14:11:04

Message: 2 of 12

"Iain " <itrotter@hometrack.co.uk> wrote in message <i6vsb2$a67$1@fred.mathworks.com>...
> Hey guys,
>
> I'm still fairly new to Matlab and have come across something that I just can't do without resorting to a loop. I was wondering if somebody here might be able to nudge me in the right direction. If I have an array like this:
>
> sample = [1 24; 1 15; 1 21; 2 12; 2 5; 2 19; 2 21; 3 4; 3 23; 3 19]
>
> sample =
>
> 1 24
> 1 15
> 1 21
> 2 12
> 2 5
> 2 19
> 2 21
> 3 4
> 3 23
> 3 19
>
> For each value in the first column I want to select the highest two values in the second column. So I am looking for this result:
>
> results =
>
> 1 24
> 1 21
> 2 19
> 2 21
> 3 23
> 3 19
>
> I can do it in a loop and I can do it without a loop in about 50 steps which is probably less efficient than using the loop. I cannae do it efficiently though.
>
> Obviously this needs to scale up to fairly massive matrices - hence the focus on efficiency.
>
> Any ideas?
>
> Cheers
> Iain

If the matrix is indeed massive, then a for loop MIGHT be the most efficient method.
In depends on how you coded the for loop which you haven't shown.

Subject: select top n from a matrix

From: Sean

Date: 17 Sep, 2010 14:29:21

Message: 3 of 12

"Iain " <itrotter@hometrack.co.uk> wrote in message <i6vsb2$a67$1@fred.mathworks.com>...
> Hey guys,
>
> I'm still fairly new to Matlab and have come across something that I just can't do without resorting to a loop. I was wondering if somebody here might be able to nudge me in the right direction. If I have an array like this:
>
> sample = [1 24; 1 15; 1 21; 2 12; 2 5; 2 19; 2 21; 3 4; 3 23; 3 19]
>
> sample =
>
> 1 24
> 1 15
> 1 21
> 2 12
> 2 5
> 2 19
> 2 21
> 3 4
> 3 23
> 3 19
>
> For each value in the first column I want to select the highest two values in the second column. So I am looking for this result:
>
> results =
>
> 1 24
> 1 21
> 2 19
> 2 21
> 3 23
> 3 19
>
> I can do it in a loop and I can do it without a loop in about 50 steps which is probably less efficient than using the loop. I cannae do it efficiently though.
>
> Obviously this needs to scale up to fairly massive matrices - hence the focus on efficiency.
>
> Any ideas?
>
> Cheers
> Iain

Here's one way:
%%
sample = [1 24; 1 15; 1 21; 2 12; 2 5; 2 19; 2 21; 3 4; 3 23; 3 19];

M = sort(accumarray(sample,sample(:,2),[],@(x)x),2,'descend');

results = [kron(unique(sample(:,1)),[1;1]), M(1:size(M,1)*2).']
%%

The results won't be in the original order (i.e. they'll be sorted (this can be easily changed)) and the results will be [n, val;n, 0] if there is only one value for the first column.

Subject: select top n from a matrix

From: Sean

Date: 17 Sep, 2010 14:39:04

Message: 4 of 12

"Sean " <sean.dewolski@nospamplease.umit.maine.edu> wrote in message <i6vu01$222$1@fred.mathworks.com>...
> "Iain " <itrotter@hometrack.co.uk> wrote in message <i6vsb2$a67$1@fred.mathworks.com>...
> > Hey guys,
> >
> > I'm still fairly new to Matlab and have come across something that I just can't do without resorting to a loop. I was wondering if somebody here might be able to nudge me in the right direction. If I have an array like this:
> >
> > sample = [1 24; 1 15; 1 21; 2 12; 2 5; 2 19; 2 21; 3 4; 3 23; 3 19]
> >
> > sample =
> >
> > 1 24
> > 1 15
> > 1 21
> > 2 12
> > 2 5
> > 2 19
> > 2 21
> > 3 4
> > 3 23
> > 3 19
> >
> > For each value in the first column I want to select the highest two values in the second column. So I am looking for this result:
> >
> > results =
> >
> > 1 24
> > 1 21
> > 2 19
> > 2 21
> > 3 23
> > 3 19
> >
> > I can do it in a loop and I can do it without a loop in about 50 steps which is probably less efficient than using the loop. I cannae do it efficiently though.
> >
> > Obviously this needs to scale up to fairly massive matrices - hence the focus on efficiency.
> >
> > Any ideas?
> >
> > Cheers
> > Iain
>
> Here's one way:
> %%
> sample = [1 24; 1 15; 1 21; 2 12; 2 5; 2 19; 2 21; 3 4; 3 23; 3 19];
>
> M = sort(accumarray(sample,sample(:,2),[],@(x)x),2,'descend');
>

The previous was wrong: it should read like this:
results = sortrows([repmat([1:size(M,1)].',2,1), M(1:size(M,1)*2).'])

>
> The results won't be in the original order (i.e. they'll be sorted (this can be easily changed)) and the results will be [n, val;n, 0] if there is only one value for the first column.

Subject: select top n from a matrix

From: Sean

Date: 17 Sep, 2010 14:43:06

Message: 5 of 12

"Sean " <sean.dewolski@nospamplease.umit.maine.edu> wrote in message <i6vui8$9ag$1@fred.mathworks.com>...
> "Sean " <sean.dewolski@nospamplease.umit.maine.edu> wrote in message <i6vu01$222$1@fred.mathworks.com>...
> > "Iain " <itrotter@hometrack.co.uk> wrote in message <i6vsb2$a67$1@fred.mathworks.com>...
> > > Hey guys,
> > >
> > > I'm still fairly new to Matlab and have come across something that I just can't do without resorting to a loop. I was wondering if somebody here might be able to nudge me in the right direction. If I have an array like this:
> > >
> > > sample = [1 24; 1 15; 1 21; 2 12; 2 5; 2 19; 2 21; 3 4; 3 23; 3 19]
> > >
> > > sample =
> > >
> > > 1 24
> > > 1 15
> > > 1 21
> > > 2 12
> > > 2 5
> > > 2 19
> > > 2 21
> > > 3 4
> > > 3 23
> > > 3 19
> > >
> > > For each value in the first column I want to select the highest two values in the second column. So I am looking for this result:
> > >
> > > results =
> > >
> > > 1 24
> > > 1 21
> > > 2 19
> > > 2 21
> > > 3 23
> > > 3 19
> > >
> > > I can do it in a loop and I can do it without a loop in about 50 steps which is probably less efficient than using the loop. I cannae do it efficiently though.
> > >
> > > Obviously this needs to scale up to fairly massive matrices - hence the focus on efficiency.
> > >
> > > Any ideas?
> > >
> > > Cheers
> > > Iain
> >
> > Here's one way:
> > %%
> > sample = [1 24; 1 15; 1 21; 2 12; 2 5; 2 19; 2 21; 3 4; 3 23; 3 19];
> >
> > M = sort(accumarray(sample,sample(:,2),[],@(x)x),2,'descend');
> >
>
> The previous was wrong: it should read like this:
> results = sortrows([repmat([1:size(M,1)].',2,1), M(1:size(M,1)*2).'])
>
> >
> > The results won't be in the original order (i.e. they'll be sorted (this can be easily changed)) and the results will be [n, val;n, 0] if there is only one value for the first column.

Also; if you have really big numbers you could create the intermediate matrix as sparse then convert it back:

M = sort(accumarray(sample,sample(:,2),[],@(x)x,[],true),2,'descend');
results = full(sortrows([repmat([1:size(M,1)].',2,1), M(1:size(M,1)*2).']))

Subject: select top n from a matrix

From: Iain

Date: 17 Sep, 2010 14:53:04

Message: 6 of 12

Superb. Thanks Sean. Just what I was after.

Looking at it piece by piece, it actually does the same as I did in the 50 steps which I mentioned in my first post. It just does it much more efficiently using crazy syntax which it's going to take me a while to get my head around.

Thanks for the help. Really appreciate it!

Iain

Subject: select top n from a matrix

From: Jos (10584)

Date: 17 Sep, 2010 15:20:05

Message: 7 of 12

"Iain " <itrotter@hometrack.co.uk> wrote in message <i6vvcg$3ra$1@fred.mathworks.com>...
> Superb. Thanks Sean. Just what I was after.
>
> Looking at it piece by piece, it actually does the same as I did in the 50 steps which I mentioned in my first post. It just does it much more efficiently using crazy syntax which it's going to take me a while to get my head around.
>
> Thanks for the help. Really appreciate it!
>
> Iain


Here is a very fast and easy to follow solution, taking advantage of ML logical indexing:

% DATA
sample = [1 24; 1 15; 1 21; 2 12; 2 5; 2 19; 2 21; 3 4; 3 23; 3 19 ; 4 20 ; 5 1 ; 5 3 ; 5 2]
% ENGINE
s = sortrows(sample,[1 -2])
q = ~[0 ; ~diff(s(:,1))] % first row
q2 = q | [true ; q(1:end-1)] % add 2nd row, if present
Result = s(q2,:) % select

hth
Jos

Subject: select top n from a matrix

From: Iain

Date: 17 Sep, 2010 16:05:05

Message: 8 of 12

Thanks for that Jos. An interesting way to tackle it.

It's tough to expand that if I wanted to look at the top 3 or 5 or 20 instead of 2 though. (at least, with my limited brain capacity I can't see a way to expand it to top N instead of top 2).

Do you have something clever up your sleeve?

Thanks for the help!
I

Subject: select top n from a matrix

From: Jan Simon

Date: 17 Sep, 2010 16:12:06

Message: 9 of 12

Dear Jos, dear Sean

Let me set this side by side (I've inserted a comma in Jos' "[1 -2]" to reduce confusions):

1.
M = sort(accumarray(sample,sample(:,2),[],@(x)x,[],true),2,'descend');
results = full(sortrows([repmat([1:size(M,1)].',2,1), M(1:size(M,1)*2).']))

2.
s = sortrows(sample, [1, -2]);
q = ~[0 ; ~diff(s(:,1))];
q2 = q | [true; q(1:end-1)];
Result = s(q2,:)

I have two different associations, when seeing these two excellent solutions:
A) Tigerclaw
B) Rolling an angry armadillo over the keyboard.

Jan :-)

Subject: select top n from a matrix

From: Oleg Komarov

Date: 17 Sep, 2010 16:49:05

Message: 10 of 12

Another approach...flexible and fast:

A = [1 24; 1 15; 1 21; 2 12; 2 5; 2 19; 2 21; 3 4; 3 23; 3 19];

times = zeros(100,2);
% My approach
for t = 1:100
    tic
    % row number the sequence partitioning by first column
    rn = histc(A(:,1),[-inf; A(:,1)]);
    rn = cumsum(ones(numel(A(:,1)),1) - rn(1:end-1));
    % Sort rows in descending order
    A2 = sortrows(A,[1,-2]);
    % Take first two columns on the base of the row numbering
    results1 = A2(ismembc(rn, 1:2),:);
    times(t,1) = toc;
end

pause(0.1)
% Sean's approach
for t = 1:100
    tic
    M = sort(accumarray(A,A(:,2),[],@(x)x),2,'descend');
    results2 = sortrows([repmat([1:size(M,1)].',2,1), M(1:size(M,1)*2).']);
    times(t,2) = toc;
end

mean(times)

Oleg

Subject: select top n from a matrix

From: Sean

Date: 17 Sep, 2010 18:13:04

Message: 11 of 12

"Oleg Komarov" <oleg.komarovRemove.this@hotmail.it> wrote in message <i70660$3mn$1@fred.mathworks.com>...
> Another approach...flexible and fast:
>
> A = [1 24; 1 15; 1 21; 2 12; 2 5; 2 19; 2 21; 3 4; 3 23; 3 19];
>
> times = zeros(100,2);
> % My approach
> for t = 1:100
> tic
> % row number the sequence partitioning by first column
> rn = histc(A(:,1),[-inf; A(:,1)]);
> rn = cumsum(ones(numel(A(:,1)),1) - rn(1:end-1));
> % Sort rows in descending order
> A2 = sortrows(A,[1,-2]);
> % Take first two columns on the base of the row numbering
> results1 = A2(ismembc(rn, 1:2),:);
> times(t,1) = toc;
> end
>
> pause(0.1)
> % Sean's approach
> for t = 1:100
> tic
> M = sort(accumarray(A,A(:,2),[],@(x)x),2,'descend');
> results2 = sortrows([repmat([1:size(M,1)].',2,1), M(1:size(M,1)*2).']);
> times(t,2) = toc;
> end
>
> mean(times)
>
> Oleg



Oleg, two things:
-What were the times for those of us without ismembc? I'm sure it's much faster.
-[Scolding] You overwrote the build it function times, which I'm currently using inside of bsxfun(). It erred and caused a few seconds of frustration.

Subject: select top n from a matrix

From: Oleg Komarov

Date: 17 Sep, 2010 18:23:05

Message: 12 of 12


> Oleg, two things:
> -What were the times for those of us without ismembc? I'm sure it's much faster.
MATLAB 2008b Win32 Vista

w/ ismember (mine vs yours):
ans =
  1.0e-003 *
    0.1162 0.1627

w/ ismembc (mine vs yours):
ans =
  1.0e-003 *
    0.0837 0.1638

w/ rn ==1 | rn == 2 (but not flexible...)
ans =
  1.0e-003 *
    0.0828 0.1624

> -[Scolding] You overwrote the build it function times, which I'm currently using inside of bsxfun(). It erred and caused a few seconds of frustration.

Sabotage!!!! :D

Oleg

Tags for this Thread

What are tags?

A tag is like a keyword or category label associated with each thread. Tags make it easier for you to find threads of interest.

Anyone can tag a thread. Tags are public and visible to everyone.

Contact us