The histogram of a histogram (HoH) is a useful measure of how random data is distributed, with applications in data science, statistics, information theory, and other fields.
In this problem, given an n-by-m array x of integers drawn from {1,2,...,S}, return the HoH along every column of x: f = HoH(x). An example for n = 5, m = 4, and S = 6 follows.
Input
x = [1 2 2 3
     2 3 3 6
     1 3 1 1
     6 3 2 5
     2 2 4 2]
Histogram
h = [2 0 1 1
     2 2 2 1
     0 3 1 1
     0 0 1 0
     0 0 0 1
     1 0 0 1]
where the r-th row of h (r = 1,...,S) holds the number of times r occurs in each column of x.
HoH
f = [1 0 3 5
     2 1 1 0
     0 1 0 0]
where f is a max(h(:))-by-m matrix whose p-th row holds the number of times the count p occurs in each column of h.
Hint: A straightforward reference scheme to obtain f is:
h = histc(x,1:max(x(:)),1); f = histc(h,1:max(h(:)),1);
This is simple but inefficient in both run time and memory (it will crash on the last test case). Note that the ultimate goal is to find f (HoH), so it is not necessary to compute h exactly as described above. Try to make your code as fast and memory-efficient as you can. Your score will be based on the speed of your code.
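One possible way to avoid building the S-by-m matrix h is sketched below. It is an illustration only, not the reference solution and not necessarily the fastest approach. The idea is that the nonzero entries of h in a column are exactly the lengths of the runs of equal values in that column once it is sorted, and zero counts never contribute to f. The function name hoh_sketch is hypothetical, and the sketch assumes x is a non-empty n-by-m numeric array.

```matlab
function f = hoh_sketch(x)
% HOH_SKETCH  Hypothetical helper (not the reference solution): HoH along
% the columns of x, computed from run lengths instead of the S-by-m matrix h.
% Assumes x is a non-empty n-by-m numeric array.
[n, m] = size(x);
xs = sort(x, 1);                          % equal values become adjacent within each column
isEnd = [xs(1:end-1, :) ~= xs(2:end, :);  % true where the next entry in the column differs
         true(1, m)];                     % the last row always closes a run
idx = find(isEnd(:));                     % linear indices of run ends, column by column
runLen = diff([0; idx]);                  % every column ends with a run end, so gaps are run lengths
col = ceil(idx / n);                      % column that each run belongs to
f = accumarray([runLen, col], 1, [max(runLen), m]);  % tally runs by (length, column)
end
```

For the example above, hoh_sketch([1 2 2 3; 2 3 3 6; 1 3 1 1; 6 3 2 5; 2 2 4 2]) returns the 3-by-4 matrix f shown in the HoH section. The temporary arrays are all of size n-by-m or smaller, so memory use does not depend on S.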
Hi Alfonso, I noticed that a solution that is faster on my own computer is often beaten by a slower one when run on Cody. Could you share what you know, or your best guess, about this issue? Thanks.
Solution 773068
This is nearly 30% slower than Solution 772002 (size 35) on my PC (Windows 10 Pro x64, i7-4700MQ).