
## N-dimensional histogram

version 1.3.0.0 (2.85 KB) by Bruno Luong
Compute n-dimensional histogram

Updated 26 Aug 2011

Like histc, but for n dimensions.

Unlike the other n-d histogram (http://www.mathworks.com/matlabcentral/fileexchange/3957), this function not only counts but also returns the bin location of each point. Coded entirely in MATLAB (no MEX required). It is slightly slower, but still quite fast.

The user can supply data and/or a function to meet a specific accumulation need on each patch.

### Cite As

Bruno Luong (2021). N-dimensional histogram (https://www.mathworks.com/matlabcentral/fileexchange/23897-n-dimensional-histogram), MATLAB Central File Exchange. Retrieved .


fast!

Toby Dewhurst

Diogo Gonçalves

Jose Rueda

Dear Bruno,
Thank you for this function. I am wondering whether there is a bug with the edges when histc is called. If I call this function with a scalar, asking for 'm' bins, I get 'm' midpoints, but the counts vector has m+1 components, the same as the edges vector. However, if I pass the edges to the routine, the counts and mid vectors have the same dimensions and edges is one longer (as I think it should be). Here is an example to show what I mean:
----input
y=randn(3000,1);
[c.counts,c.edges,c.mid,c.loc] = histcn(y);
c
[c.counts,c.edges,c.mid,c.loc] = histcn(y,[-3:0.05:3]);
c
----output
c =

struct with fields:

counts: [33×1 double]
edges: {[1×33 double]}
mid: {[1×32 double]}
loc: [3000×1 double]

c =

struct with fields:

counts: [120×1 double]
edges: {[1×121 double]}
mid: {[1×120 double]}
loc: [3000×1 double]

Thanks a lot,
José

Andrey Revyakin

Been using this for almost 10 years now for processing of super-resolution microscopy localizations. Very fast as a clustering algorithm for millions of XY float points, to detect clusters and eliminate outliers. For my needs, this produces more predictable and intuitive results than k-means and DBscan (and, obviously, much faster).
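The density-based filtering described above can be sketched with built-in functions only. This is a minimal illustration, not the commenter's actual pipeline: the data, `binWidth`, and `minCount` are hypothetical, and `accumarray` stands in for the counting step.

```matlab
% Hypothetical XY points: a tight cluster plus two isolated points
xy = [1.1 1.2; 1.3 1.4; 1.2 1.1; 1.4 1.3; 7.5 8.5; 4.5 0.5];
binWidth = 1;   % assumed bin size
minCount = 3;   % assumed density threshold

% 1-based bin subscripts of each point on a regular grid starting at 0
sub = floor(xy / binWidth) + 1;

% 2-D histogram: number of points falling in each grid cell
count = accumarray(sub, 1);

% Look up the count of the cell each point belongs to
idx = sub2ind(size(count), sub(:, 1), sub(:, 2));
keep = count(idx) >= minCount;

% Points in sparsely populated cells are treated as outliers
clustered = xy(keep, :);
```

Because every point carries its own bin assignment, the filter is a single indexed lookup rather than a pairwise distance search.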

Petr Skonnikov

Stefan Dukic

Worked for what I needed. Thank you.

Pavel Kolesnichenko

Appeared to be very useful for my needs!

Dima

ben lambert

Hugh Nolan

Yifan Gu

swelltime

thanks for the tool!
changing the edge read-in from separate vectors to a cell and changing the read-in to

edges = varargin{:};

will let you analyze arbitrary dimensional data.
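As a rough sketch of why a cell array of edges scales to any number of dimensions, the following uses only `histc` and `accumarray`. It is not the histcn source; the data and edge values are made up, and it assumes every point lies strictly inside the edge range.

```matlab
% Hypothetical data: 4 points in 3 dimensions, one edge vector per dimension
X = [0.10 0.20 0.30;
     0.60 0.70 0.80;
     0.90 0.10 0.40;
     0.15 0.25 0.35];
edges = {[0 0.5 1], [0 0.5 1], [0 0.25 0.5 0.75 1]};

nd  = size(X, 2);
loc = zeros(size(X));   % bin subscript of each point, per dimension
sz  = zeros(1, nd);     % number of bins per dimension

for d = 1:nd
    % Second output of histc is the bin index along dimension d
    [~, loc(:, d)] = histc(X(:, d), edges{d});
    sz(d) = numel(edges{d}) - 1;
end

% Accumulate ones at the n-d subscripts to get the n-d count array.
% Assumption: no point is outside the edges (index 0) or exactly on
% the last edge (index numel(edges{d})); either case would error here.
count = accumarray(loc, 1, sz);
```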

David Shin

I found this script useful as a quick n-dim histogrammer. However, as mentioned below a couple of times, the function behaves undesirably for values at the max bin edge. Users must be aware of this.
Thank you Bruno.

Shengnan Liu

hi, does anyone have an efficient implementation for the cdf calculation based on this result for 3D data?
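One possible approach, sketched here under the assumption that `count` is the n-d array of bin counts: take a cumulative sum along each dimension in turn and normalize by the total. The `count` values below are placeholders, and `cdfGrid` is an illustrative name.

```matlab
% Placeholder 3-D count array (stands in for a histogram result)
count = reshape(1:8, [2 2 2]);

% Cumulative sum along every dimension gives, at each cell, the number
% of points whose bin index is <= that cell's index in all dimensions
cdfGrid = count;
for d = 1:ndims(count)
    cdfGrid = cumsum(cdfGrid, d);
end

% Normalize so the last cell equals 1
cdfGrid = cdfGrid / sum(count(:));
```

The result is the empirical joint CDF evaluated at the upper corner of each bin; the cost is one pass over the array per dimension.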

Itzik Ben Shabat

Hi, thanks. It's great.
I recommend changing line 108 to

[~,~, loc(:,d)] = histcounts (Xd, ed);

and removing the clear in line 114.
cheers
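For context, a small sketch of how the built-in `histcounts` behaves on its own (the values are illustrative). Its third output is the bin index, and its last bin is closed on the right, so a value equal to the final edge does not create an extra bin.

```matlab
edges = 0:4;
x = [0 0.5 1 2.2 3.9 4];

% n has numel(edges)-1 entries; bin holds the bin index of each value
[n, ~, bin] = histcounts(x, edges);

% The value 4 sits exactly on the last edge and is counted in bin 4,
% because histcounts uses [e_k, e_{k+1}) for all bins except the last,
% which is [e_{m-1}, e_m]
disp(n);
disp(bin);
```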

Ewan Hughes McInnes

Erik S.

Why are the outputs in the cell "mid" one element shorter than the size of the "count" variable?

Massimo Zanetti

Very useful and efficient routine, thanks a lot for submitting! COUNT output is exactly what I wanted!

Brian Little

arnold and J:

I noticed the same issue with the flipping of dimensions. I fixed this by transposing the count output parameter (count') before using it in scatter, pcolor, or any other plotting function. Hope this helps :)
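A minimal sketch of why the transpose helps, assuming the count array indexes the first variable along rows and the second along columns. The data and edges are made up, and the counting is done with `histc` and `accumarray` rather than histcn.

```matlab
% Hypothetical data concentrated at large x and small y
x = [0.90 0.80 0.85 0.95 0.10];
y = [0.10 0.20 0.15 0.05 0.90];
edges = [0 0.5 1];
mid   = [0.25 0.75];

[~, ix] = histc(x, edges);
[~, iy] = histc(y, edges);

% count(i, j): i is the x bin (rows), j is the y bin (columns)
count = accumarray([ix(:) iy(:)], 1, [2 2]);

% imagesc draws rows along the vertical axis, so transpose to put
% x on the horizontal axis, then make y increase upward
imagesc(mid, mid, count');
set(gca, 'YDir', 'normal');
```

Without the transpose, the dense cell appears mirrored about the diagonal, which matches the "reflection" described in the comments.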

Emmanuel Farhi

An excellent and fast hist extension for ND. Thanks.

arnold

hist3 also gives the counts with flipped dimensions... strange

arnold

Very nice work, but I see the same problem as J: the dimensions seem to be switched. I have some non-random data. A scatter plot clearly shows the highest data density in the upper left; histcn gives the same distribution, yet the highest counts are in the lower right. Yes, I set everything to axis xy.

It must be something simple, a hint please :)

Bruno Luong

To Dave, see my comment from May 2009. This is how it is supposed to work.

DaveD

This is a very useful piece of code, but it returns extra dimensions if any of your points fall along an outer edge.

Wok

mahmut

J

If I run the example usage with 2-D data and fewer random data points (say 200), then use imagesc on the result and plot() to draw the 2-D scatter, something seems just a *little* bit off. I'm guessing the code is working properly but that the usage example is perhaps plotting incorrectly and so does not match up with a scatter plot? It's not a simple axis xy shift or flipud/fliplr on a single dimension... the plots look similar, but almost a reflection about the symmetry axis is required? Maybe I'm just missing something...

Sanya

Thanks!

Michal Ficek

Excellent work, it really helped me a lot. Runs fast!

Matthias Fripp

Jaclyn, if you just want bars plotted for the pairwise densities of two variables (i.e., a plot with x and y axes corresponding to values of your variables, and vertical bars on the x,y grid, corresponding to the counts for each 2D bin), you can get that with hist3().

Will, if you want to plot weights for 3 variables as varying sizes of bubbles on an x,y,z grid, you can do something like this:

% get the histogram
[count, edges, mid, loc] = histcn(vals);

% make a grid for plotting
[X, Y, Z] = ndgrid(edges{1}, edges{2}, edges{3});
X=X(:); Y=Y(:); Z=Z(:);

% calculate sizes so the most dense cell gets a value of 100
% also convert from volume to "area" (as if drawing a sphere with
% the right volume and cross-sectional area s)
s_scale = 100/(max(count(:))^(2/3));
s = count(:).^(2/3) * s_scale;
% convert any zeros to small numbers for scatter3
s(s==0)=realmin;

% plot the densities
fh=figure();
set(fh, 'Renderer', 'OpenGL'); % faster drawing
scatter3(X, Y, Z, s, 'filled');

Soravit

I really appreciate it. Great Work!!

Andrey

Great. 100,000 XY float points in a 512x512 bin matrix takes ~40 ms, and I still get the list of bin assignment!

Jac Billington

Thank you!

Bruno Luong

Jaclyn, to make the y-axis increase from bottom to top, call

>> set(gca, 'Ydir', 'normal')

This question is generic and would be better posted in the newsgroup. Thanks for the comment.

Jac Billington

This is great, doing just what I wanted. Just one question: I'm plotting a 3-D histogram of x, y, and count. However, it always plots the x-axis with values increasing from the origin and the y-axis with values decreasing. Reversing the y-axis doesn't work in m-code, only when I open the plot editor (a bit tedious).

Could you possibly give me a tip for getting both axes to increase from the origin? I've tried playing with the code, but I'm a bit of a MATLAB novice.

Many thanks. Jaclyn

Bruno Luong

Will, you had better post the plotting problem on the newsgroup. I suggest taking a look at the BAR3 function for the moment.

Will Huang

Is there any way to represent the 3-dimensional histogram?

I have a 3-dimensional matrix with values in it. Is there any way I can change the size of the dot (bubble) to represent the values in a 3-D plot in MATLAB?

Will

Daniel Golden

Works great! I'm using it to make a 2-dimensional histogram with an additional third dimension of "weights."
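A weighted 2-D histogram of this kind can be sketched with built-ins alone. This is an illustration under assumed data, not the commenter's code: `accumarray` sums (or averages) a weight vector `w` over the 2-D bins found by `histc`.

```matlab
% Hypothetical points with one weight each
x = [0.1 0.2 0.7 0.8 0.9];
y = [0.1 0.6 0.2 0.7 0.8];
w = [1 2 3 4 5];
edges = [0 0.5 1];

% Bin index of each point along x and y
[~, ix] = histc(x, edges);
[~, iy] = histc(y, edges);
subs = [ix(:) iy(:)];

% Sum of weights per bin (default accumulator is @sum)
wsum = accumarray(subs, w(:), [2 2]);

% A different accumulator, here the mean weight per bin
wmean = accumarray(subs, w(:), [2 2], @mean);
```

Swapping the function handle is the same idea as supplying a custom accumulator function for each patch.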

Bruno Luong

To Bin Zhang:

HISTCN is designed so that if the user supplies m edges [e1, e2, ..., e_m], then there are m-1 bins:

bin 1: e1 <= X < e2
bin 2: e2 <= X < e3
...
bin m-1: e_m-1 <= X < e_m

Bin #m does not exist unless a point hits the right border:

X == e_m

You can indeed force the number of output bins to m by changing line #66 to

sz(d) = length(ed)

Thanks,

Bruno
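The bin rule above mirrors what the built-in `histc` does, which can be checked with a small sketch (the values are illustrative):

```matlab
% Five edges define four half-open bins [e_k, e_{k+1}) plus a final
% "bin" that only counts values exactly equal to the last edge
edges = 0:4;
x = [0 0.5 1 2.2 3.9 4];

[n, bin] = histc(x, edges);

disp(n);    % one count per edge, so numel(n) == numel(edges)
disp(bin);  % the value 4 hits the right border and lands in bin 5
```

If no value equals the last edge, the final entry of `n` is simply zero, which is why the extra bin only matters for points on the right border.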

bin zhang

Hi, Bruno:

Your coding is really elegant and excellent. ;-)
I have one question about line 66, " sz(d) = length(ed)-1; ". Why is sz(d) not length(ed) instead?

Because of this, when I provide, for example, 3 edge vectors of size mx1 for a 3-D matrix, I get *count* as an (m-1)^3 rather than an m^3 matrix. Am I misunderstanding something here, or is there a special reason for this?

Thanks,
Bin

##### MATLAB Release Compatibility
Created with R2009a
Compatible with any release
##### Platform Compatibility
Windows macOS Linux