Path: news.mathworks.com!not-for-mail
From: "Jan Simon" <matlab.THIS_YEAR@nMINUSsimon.de>
Newsgroups: comp.soft-sys.matlab
Subject: Re: Getting indexes of rows of matrix with more than n repetitions
Date: Wed, 6 Jan 2010 21:07:02 +0000 (UTC)
Organization: Universit&#228;t Heidelberg
Lines: 31
Message-ID: <hi2u1m$pg5$1@fred.mathworks.com>
References: <hhudae$4p$1@fred.mathworks.com> <hhuths$led$1@fred.mathworks.com> <hi023t$mtq$1@fred.mathworks.com> <hi1f7i$plv$1@fred.mathworks.com> <hi1g5n$f7i$1@fred.mathworks.com> <hi1ua1$d7e$1@fred.mathworks.com> <hi2300$6dk$1@fred.mathworks.com> <hi25nk$5lo$1@fred.mathworks.com> <hi2maq$8ba$1@fred.mathworks.com>
Reply-To: "Jan Simon" <matlab.THIS_YEAR@nMINUSsimon.de>
NNTP-Posting-Host: webapp-02-blr.mathworks.com
Content-Type: text/plain; charset=UTF-8; format=flowed
Content-Transfer-Encoding: 8bit
X-Trace: fred.mathworks.com 1262812022 26117 172.30.248.37 (6 Jan 2010 21:07:02 GMT)
X-Complaints-To: news@mathworks.com
NNTP-Posting-Date: Wed, 6 Jan 2010 21:07:02 +0000 (UTC)
X-Newsreader: MATLAB Central Newsreader 869888
Xref: news.mathworks.com comp.soft-sys.matlab:596985

Dear Bruno!

> Jan, how much do you estimate an inplace sorting (e.g., on large double array) would save time?

It is clear, that the creation of the sort index is the demanding part of SORT, while the copy of the input in sorted order is secondary - usually. Unfortuantely one of the 2 DIMM ports of my computer is damaged and I have to live with 512MB RAM. But saving temporarily used memory is always useful, even on a 16 GB machine. Nevertheless, timing matters for 8MB array already:

For the estimation of the speed gain:
  x = rand(1e6, 1);
  tic; y = sort(x); toc  ==> 0.26 sec
  tic; [y, s] = sort(x); toc  ==> 0.40 sec
  tic; y = x(s); toc ==> 0.14 sec
  
So I assume I could save 35% computing time.
I assume, that SORT sorts the values inplace and the sorting index is created simultaneously, if 2 outputs are used.
Sorting a INT16 array is much faster:
  x = uint16(x * 32000);
  tic; y = sort(x); toc  ==> 0.16 sec
  tic; [y, s] = sort(x); toc  ==> 0.31 sec
Creating the sorting index needs additional 0.15 sec as in the DOUBLE case above.

If the replied index vector could be an UINT32 array, this would be even nicer:
  ints = uint32(s);
  tic; y = x(ints); toc  ==> 0.09 sec  (instead of 0.14 sec)

> I'm quite happy with the performance of matlab SORT, except I wish to have a sorting routine where the comparison operator can be customized - but I guess such feature is very inefficient in Matlab due to overhead. 

SORT is really fine, you are right! But this is not a reason to avoid improving it.
If you would create a MEX, which replies the sorting index (perhaps in a chosable type), which uses the standard comparison or optionally a user-defined operator, which can compete in speed with single-output SORT -- *I* would download it! Promised.
I assume, calling a user-defined operator through mexCallMATLAB would be a great brake.

Kind regards, Jan