Statistical mode.

Updated 05 Apr 2006

MODE finds the mode of a sample. The mode is the observation with the greatest frequency.

For example, in the sample x=[0, 1, 0, 0, 0, 3, 0, 1, 3, 1, 2, 2, 0, 1] the mode is the most frequent item, 0.

joe joe

Harold Bien

Timing a single loop is not accurate. To time performance, you need to run several iterations on a large dataset. Using a (slightly) modified version of your code, I timed finding the mode of 10,000 random elements:
>> data=rand(10000,1); y=mode(data);
This yields the following timing results (I surrounded both methods with a for i=1:1000 loop):

Hist method: 9.49, 9.61, and 9.53 secs
Non-hist method: 3.23, 3.32, and 3.20 secs

I swapped the order (both loops ran in the same function) and the timing remained the same.

Command line I used and results:
>> m=mode(data);
Elapsed time is 9.489020 seconds.
Elapsed time is 3.234487 seconds.
>> m=mode(data);
Elapsed time is 9.606668 seconds.
Elapsed time is 3.318920 seconds.
>> m=mode(data); % Order swapped (non-hist first, hist second)
Elapsed time is 3.197718 seconds.
Elapsed time is 9.529675 seconds.
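
The repeat-and-average approach described above can be sketched as follows. This is only a minimal illustration, not the code that produced the numbers in this comment: the repeat count nReps and the accumarray-based loop body are assumptions chosen for the example, and any mode routine could be dropped into the loop instead.

```matlab
data  = rand(10000, 1);   % same data size as in the comment above
nReps = 200;              % hypothetical repeat count; raise it for steadier numbers

tStart = tic;             % keep the timer handle so nested tic calls cannot interfere
for r = 1:nReps
    [vals, ~, grp] = unique(data);    % distinct values, group index per element
    counts = accumarray(grp, 1);      % occurrences of each distinct value
    [~, k] = max(counts);             % first maximum wins on ties
    m = vals(k);
end
perCall = toc(tStart) / nReps;        % mean seconds per call

fprintf('mean time per call: %.6f s over %d repetitions\n', perCall, nReps);
```

Dividing the total by the repeat count gives a per-call figure that is easier to compare across runs than a single tic/toc pair around one call.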

mode.m:
--------------------
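% Non-hist method: counts taken from gaps between unique() indices
tic;
for i=1:1000
    sorted=sort(x(:));          % sort so equal values are adjacent
    [d1, i1]=unique(sorted);    % d1: distinct values, i1: one index per value
    h=diff(i1);                 % gaps between consecutive indices, used as counts
    [d2, i2]=max(h);            % position of the largest gap
    m=d1(i2);
end
toc

% Hist method: histogram of the group indices from unique()
tic;
for i=1:1000
    [b,i,j] = unique(x);        % b: distinct values, j: index into b per element
    h = hist(j,length(b));      % one bin per distinct value
    m=max(h);                   % highest count
    y=b(h==m);                  % every value that reaches it
end
toc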

Kuncup Iswandy

The mode function seems to return only one solution when there is more than one, e.g. for X = [1 1 0 2 3 3] it returns only 1. It should return 1 and 3.

Here is a small modification:

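function y = modestat(x)
% Return every value of x that attains the highest frequency.

[b,i,j] = unique(x);               % b: distinct values, j: index into b per element
[m,k] = sort(hist(j,length(b)));   % m: counts in ascending order, k: positions in b
id = find(max(m) == m);            % all positions tied for the highest count
y = b(k(id));                      % map positions back to values (k(id) alone is only indices into b)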
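
As a cross-check of the multiple-mode case raised above, here is a small sketch that returns every tied value for the example X. It swaps hist for accumarray to do the counting, which is a substitution made for this illustration and not part of the posted function; the variable names are likewise assumptions.

```matlab
X = [1 1 0 2 3 3];                    % example from the comment above

[vals, ~, grp] = unique(X(:));        % distinct values (sorted) and group index per element
counts = accumarray(grp, 1);          % occurrences of each distinct value
y = vals(counts == max(counts)).';    % every value tied for the highest count, as a row

disp(y);
```

For this X the distinct values are 0, 1, 2, 3 with counts 1, 2, 1, 2, so both 1 and 3 are returned.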

Michael Robbins

Thanks guys, but I tested it and your way was 50% slower.

Elapsed time is 0.010098 seconds.
Elapsed time is 0.015903 seconds.

anonymous a.

FYI, here's a corrected version of Harold's suggestion above:

sorted=sort(data(:));
[d1, i1]=unique(sorted);
[d2, i2]=max([i1(1); diff(i1)]);
m=d1(i2(1));
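
The corrected version above relies on unique returning the index of the last occurrence of each value. As far as I know that is the older default, and newer MATLAB releases return the first occurrence unless a legacy option is requested, so here is a hedged sketch of the same index-gap idea that does not depend on that convention. It assumes non-empty input, and the names starts and runLen are made up for this example.

```matlab
data = [0, 1, 0, 0, 0, 3, 0, 1, 3, 1, 2, 2, 0, 1];   % sample from the description

sorted = sort(data(:));                        % equal values become adjacent runs
starts = [1; find(diff(sorted) ~= 0) + 1];     % index where each run begins
runLen = diff([starts; numel(sorted) + 1]);    % length of each run
[~, k] = max(runLen);                          % first maximum wins on ties
m = sorted(starts(k));                         % value of the longest run

disp(m);
```

Because the run boundaries come from diff on the sorted data itself, the result is the same whichever occurrence unique would have reported.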

Asaf Tsoar

Harold Bien

Instead of using 'hist', what about using the distance between the indices returned from unique (this only works for sorted data), a la:

sorted=sort(data);
[dummy, idx]=unique(sorted);
[dummy, idx]=max(diff(idx));
mode=sorted(idx);

Is this faster?

