Path: news.mathworks.com!not-for-mail
From: "us " <us@neurol.unizh.ch>
Newsgroups: comp.soft-sys.matlab
Subject: Re: Optimize this!
Date: Fri, 8 Feb 2008 22:46:02 +0000 (UTC)
Organization: UniversitätsSpital Zürich
Lines: 76
Message-ID: <foim3a$o76$1@fred.mathworks.com>
References: <5b74607b-c527-4c9b-833a-b08d4be5d48a@s13g2000prd.googlegroups.com>
Reply-To: "us " <us@neurol.unizh.ch>
NNTP-Posting-Host: webapp-05-blr.mathworks.com
Content-Type: text/plain; charset="ISO-8859-1"
Content-Transfer-Encoding: 8bit
X-Trace: fred.mathworks.com 1202510762 24806 172.30.248.35 (8 Feb 2008 22:46:02 GMT)
X-Complaints-To: news@mathworks.com
NNTP-Posting-Date: Fri, 8 Feb 2008 22:46:02 +0000 (UTC)
X-Newsreader: MATLAB Central Newsreader 11
Xref: news.mathworks.com comp.soft-sys.matlab:450267


Pietro:
<SNIP fast re-indexing...

well, well...
since the order does not matter, one might look a bit 
further and come up with an even faster (2-line) solution...
the code snippet below shows its performance AND uncovers 
that the <tc> solution may get very slow and is sometimes 
incorrect(!?)

the code (copy/pasted) from previous posts

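function t=foo
% compare three re-indexing solutions (jd, tc, us) on random integer vectors
% t columns: 1=numel(a) 2=numel(unique) 3:5=time jd/tc/us 6:7=isequal flags
     fmt={
          '%5d %5d = %3.0f'
          '%11.3f %11.3f  '
          'isequal: %-1d %-1d'
     };
     nt=10;
     t=zeros(nt,7);
     for  k=1:nt
          a=ceil(rand(10^4,1)*10^k).'; % 10^4 integers in 1..10^k (row)
% ***** jd *****
          tic;
          [b,J,J]=unique(a);  % J: index of each a(i) into b
          t(k,3)=toc;
          try % look out for memory problems
% ***** tc *****
               vec=a;
               tic;
               % table has max-min+1 slots but is indexed by the raw values
               x = zeros(max(vec)-min(vec)+1,1) + nan;
               x(vec) = 1;
               nu = sum(~isnan(x));
               x(~isnan(x)) = 1:nu;
               y = x(vec);
               t(k,4)=toc;
          catch
               % tc failed: no timing; y=J makes the tc check pass trivially
               t(k,4)=nan;
               y=J.';
          end
% ***** us *****
          tic;
          [as,ax]=sort(a);
          as(ax)=cumsum([1,diff(as)]~=0);  % rank of each distinct value
          t(k,5)=toc;
% run-time result
          t(k,1:2)=[numel(a),numel(b)];
          t(k,6:7)=[isequal(J,y.'),isequal(J,as)];
          t(k,3:5)=100*t(k,3:5)./t(k,3);  % timings in percent of jd
          disp(sprintf([fmt{:}],t(k,:)));
     end
end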
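
to make the <us> two-liner easier to follow, here is a minimal sketch that runs it on the small input a=[3 7 2 2 3 100 7 7 56] and compares it with the third output of unique. it is only an illustration, not part of the benchmark. the [~,~,J] syntax and the (:) reshaping are assumptions for newer MATLAB releases, where unique returns its index vectors as columns; on r2007b the [b,J,J] form from the code above applies.

```matlab
a = [3 7 2 2 3 100 7 7 56];          % small example input (row vector)

% the <us> solution: sort, rank the distinct values, scatter back
[as, ax] = sort(a);                  % as = [2 2 3 3 7 7 7 56 100]
as(ax) = cumsum([1, diff(as)] ~= 0); % ranks 1 1 2 2 3 3 3 4 5 put back in input order

% reference: third output of unique (assumes a release that accepts ~)
[~, ~, J] = unique(a);

disp(as);                            % 2 3 1 1 2 5 3 3 4
disp(isequal(as(:), J(:)));          % compare as columns, whatever the orientation of J
```

the trick is that diff(as)~=0 marks every position where a new value starts in the sorted vector, so the running sum is the rank of that value, and as(ax)=... writes each rank back to the position its value came from.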

output on a wintel system c2.2*2.4/2gb/vista/r2007b (look 
at it in the original version for proper formatting...)

1) using the ~original input
     a=[3 7 2 2 3 100 7 7 56];
columns: veclen uniquelen = jd(=100%) tc(% of jd) us(% of jd) isequal(jd,tc) isequal(jd,us)
    9     5 = 100     46.076      13.029  isequal: 1 1

2) using the loop, however

10000    10 = 100     43.648      96.747  isequal: 1 1
10000   100 = 100     30.893      76.001  isequal: 1 1
10000  1000 = 100     38.042      77.902  isequal: 1 1
10000  6366 = 100    108.570      70.773  isequal: 1 1
10000  9521 = 100   1261.356      68.274  isequal: 0 1 !
10000  9950 = 100  12899.902      66.274  isequal: 0 1 !
10000  9997 = 100 125642.159      64.475  isequal: 0 1 !
10000 10000 = 100        NaN      75.425  isequal: 1 1
10000 10000 = 100        NaN      69.354  isequal: 1 1
10000 10000 = 100        NaN      73.868  isequal: 1 1
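
one plausible reading of the "!" rows (an assumption, not verified on the original r2007b system): the <tc> table is allocated with max-min+1 slots but indexed with the raw values, so whenever min(vec)>1 the assignment x(vec)=1 grows x and pads it with zeros. those zeros are not NaN, so they get counted and numbered as well. the sketch below uses a hypothetical 4-element input to show the effect, plus one possible fix (nan(max(vec),1)), which is my suggestion and not part of the original <tc> code.

```matlab
vec = [5 7 5 9];                         % hypothetical input with min(vec) > 1

% <tc> as posted: 5 NaN slots, then indexed up to 9
x = zeros(max(vec)-min(vec)+1,1) + nan;
x(vec) = 1;                              % x grows to 9 slots, padding is 0 (not NaN)
nu = sum(~isnan(x));                     % counts the zero padding too
x(~isnan(x)) = 1:nu;
y_tc = x(vec).';                         % [1 3 1 5]

% possible fix (assumption): one slot per value in 1..max(vec)
x = nan(max(vec),1);
x(vec) = 1;
nu = sum(~isnan(x));
x(~isnan(x)) = 1:nu;
y_fix = x(vec).';                        % [1 2 1 3]

% <us> solution as reference
[as, ax] = sort(vec);
as(ax) = cumsum([1, diff(as)] ~= 0);     % [1 2 1 3]

disp([isequal(y_tc, as), isequal(y_fix, as)]);  % tc: 0, fix: 1
```

even with the fix, the table still needs max(vec) slots, so the slowdown and the memory failures for large value ranges would remain. note also that the NaN rows report isequal 1 for <tc> only because the catch branch sets y=J.'.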

just a few thoughts...
us