No BSD License  


File Size: 2.47 KB, File ID: #10745

fastmode

by Harold Bien

 

13 Apr 2006 (Updated 14 Apr 2006)

Optimized high-speed version of the statistical mode, i.e. the element(s) occurring at the greatest frequency


File Information
Description

fastmode is an improvement on 'mode.m' by Michael Robbins (File ID 5266). It ignores NaN values and properly returns multiple results if more than one element occurs at equal frequency.

The M-file profiler showed that extracting unique values took 50% of the time spent in the function. Since a list of unique values was not required, a workaround is used that makes this version even faster than the first submission: 25 seconds to process a 100,000-element vector 1,000 times, versus 27 seconds for the original and 69 seconds for Robbins' hist-based version.
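The fastmode source is not reproduced on this page, so the following is only a minimal sketch of one way to find the mode without calling unique: sort the data and measure the length of each run of equal values. The variable names and the example input are assumptions for illustration, and this may not be the workaround fastmode actually uses.

```matlab
% Sketch only: NOT the fastmode source.
% Finds every mode of a vector by sorting and measuring run lengths,
% without calling unique(). Assumes at least one non-NaN element.
x = [3 1 2 3 NaN 2 5];                  % example input (assumed)

v = x(:);                               % work on a column vector
v = sort(v(~isnan(v)));                 % drop NaNs, then sort

runStart = [true; diff(v) ~= 0];        % true where a new value begins
startIdx = find(runStart);              % first index of each run
runLen   = diff([startIdx; numel(v)+1]);% number of elements in each run

count = max(runLen);                    % highest frequency found
modes = v(startIdx(runLen == count));   % every value reaching that frequency
```

For this input, the values 2 and 3 each occur twice, so modes holds both and count is 2. Like the submission described above, the sketch does no input checking and fails if no non-NaN elements remain.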

The function handles even single-element vectors, but it generates an error if called with an empty vector. Since it is designed for speed, no error checking is performed on the input.

Acknowledgements

The author wishes to acknowledge the following in the creation of this submission:
mode
This submission has inspired the following:
mode calculator

MATLAB release MATLAB 7.0.4 (R14SP2)
Comments and Ratings (6)
13 Dec 2006 Mukhtar Ullah

Have you checked mode.m (file id 5826), which I submitted in 2004? It does exactly what is claimed here. Also check the speed.

15 Dec 2006 Harold Bien

No, I did not see mode.m (5826) before. However, I have just checked the speed using the following command line:

dataset=round(rand(10000,1).*10);
% Time first routine
tic
for i=1:1000
    m=fastmode(dataset);
end
et.fastmode=toc;
% Time second routine
tic
for i=1:1000
    m=custmode(dataset);
end
et.custmode=toc;
% Time internal routine
tic
for i=1:1000
    m=mode(dataset);
end
et.mode=toc;

Run under MATLAB R2006a on a 2.4 GHz Pentium Core Duo machine, this gave the following results:

et =

    fastmode: 1.3159
    custmode: 2.1718
        mode: 2.8320

'fastmode' is designed to be called from a loop like the one above: it is optimized for speed, not ease of use.

15 Dec 2006 Harold Bien

I forgot to mention that in my previous comment, "custmode" is file 5826. Also, as expected, the results are the same for all 3 trials. What differs, however, is the following:

By design, MATLAB's mode returns only the smallest value if there are multiple entries with equal frequency. Both fastmode and custmode return all values (custmode had an issue with column vectors).

Also by design, MATLAB handles empty input whereas fastmode does not (see instructions for fastmode). custmode also fails to handle empty vectors.
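The two differences described above can be checked against the built-in function alone. This is a small sketch using only MATLAB's own mode; the example vector is made up, and the empty-input result shown is what the MATLAB documentation describes for a 0-by-0 input, so it may differ in other releases.

```matlab
% Built-in mode only; fastmode and custmode are not needed here.
x = [1 1 2 2 3];            % 1 and 2 are tied at two occurrences each

m = mode(x);                % single output: the smallest tied value
[m2, f, c] = mode(x);       % f is the frequency, c is a cell array
allModes = c{1};            % every value that shares the top frequency

emptyResult = mode([]);     % documented to return NaN rather than error
```

Here m is 1, f is 2, and allModes contains both 1 and 2, which is the set that fastmode and custmode are described as returning directly.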

17 Dec 2006 Mukhtar Ullah

Well, in my experience the method you use for speed comparison is biased towards the code entered first; that's why I never use this method. I swapped the for loops for mode and fastmode (I am excluding custmode because it does not exist in the ML list) and I had the following results:
et =
    fastmode: 2.9829
        mode: 1.5077
My CPU is a 1.2 GHz Pentium Core Duo.

17 Dec 2006 Mukhtar Ullah

Sorry, I now realise that what you call custmode is the file I submitted, and that I had just swapped the for loops without changing the et.mode ... lines. You are right; fastmode is really fast. Here are the results now:

et =

    custmode: 2.4004
    fastmode: 1.8349
        mode: 2.3672

19 Dec 2006 Harold Bien

You are correct that the method I used can, sometimes, bias towards the first if there is some sort of caching going on. If you review the comments I gave to the other mode.m file (5266) you will note I ran the same test swapping the order several different ways without any real changes. I was lazy this time and neglected to do so, mostly because I didn't see any potential caching of results.
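One way to guard against the order bias discussed in this thread is to give each routine an untimed warm-up call and to run the candidates in both orders. The sketch below is an assumption about how that could be done, not the script used for the numbers above. It always times the built-in mode and only adds fastmode when a file of that name is found on the path; nRep and the layout of et are illustrative choices.

```matlab
dataset = round(rand(10000,1).*10);

% Candidates to time. Only the built-in mode is assumed to exist.
names = {'mode'};
funcs = {@() mode(dataset)};
if exist('fastmode', 'file') == 2        % add fastmode only if present
    names{end+1} = 'fastmode';
    funcs{end+1} = @() fastmode(dataset);
end

nRep   = 200;                            % repetitions per measurement
n      = numel(funcs);
et     = zeros(2, n);                    % row 1: forward order, row 2: reversed
orders = {1:n, n:-1:1};

for pass = 1:2
    for k = orders{pass}
        funcs{k}();                      % warm-up call, not timed
        t0 = tic;
        for i = 1:nRep
            m = funcs{k}();
        end
        et(pass, k) = toc(t0);
    end
end

for k = 1:n
    fprintf('%-10s forward %.4f s, reversed %.4f s\n', ...
            names{k}, et(1,k), et(2,k));
end
```

If the forward and reversed timings for a routine agree closely, ordering is unlikely to explain a difference between routines. On releases that provide it, timeit performs its own warm-up and repetition and could replace the inner loop.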

Tag Activity for this File
Tag Applied By Date/Time
statistics Harold Bien 22 Oct 2008 08:22:18
probability Harold Bien 22 Oct 2008 08:22:18
statistical Harold Bien 22 Oct 2008 08:22:18
mode Harold Bien 22 Oct 2008 08:22:18
frequency Harold Bien 22 Oct 2008 08:22:18
fast Harold Bien 22 Oct 2008 08:22:18
