Like histc, but for n-dimensional data.

Unlike the other n-d histogram (http://www.mathworks.com/matlabcentral/fileexchange/3957), this function returns not only the counts but also the bin location of each point. Coded entirely in MATLAB (no MEX required). Slightly slower, but still quite decent.

Users can supply data and/or a function to apply a custom accumulator to each bin.
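As a rough illustration of the per-bin accumulator idea, the sketch below uses MATLAB's built-in accumarray directly rather than histcn itself (this page does not show histcn's option names). The variables loc, val and sz are made-up stand-ins for the bin indices of each point, the data attached to each point, and the grid size.

```matlab
% Hypothetical inputs: bin subscripts of four 2-D points and one value per point
loc = [1 1; 1 1; 2 1; 2 2];   % row k = [bin along dim 1, bin along dim 2] of point k
val = [10; 20; 5; 7];         % data to accumulate for each point
sz  = [2 2];                  % number of bins along each dimension

% Apply a custom accumulator (here the mean) to the values in each bin.
% Bins that receive no point are filled with 0 by default.
meanPerBin = accumarray(loc, val, sz, @mean);

disp(meanPerBin)   % [15 0; 5 7]
```

Replacing @mean with @max, @sum or any function handle that maps a column vector to a scalar gives a different accumulator with the same layout.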

Bruno Luong (2021). N-dimensional histogram (https://www.mathworks.com/matlabcentral/fileexchange/23897-n-dimensional-histogram), MATLAB Central File Exchange. Retrieved .

Created with
R2009a

Compatible with any release

fast!

Toby Dewhurst

Diogo Gonçalves

Jose Rueda

Dear Bruno,

Thank you for this function. I am wondering if there is a bug with the edges when histc is called. If I call this function with a scalar, asking for 'm' bins, I get 'm' midpoints, but the counts vector has m+1 components, the same as the edges vector. However, if I pass the edges to the routine, the counts and mid vectors have the same dimensions and edges is one larger (as I think it should be). Here is an example to show what I mean:

----input

y=randn(3000,1);

[c.counts,c.edges,c.mid,c.loc] = histcn(y);

c

[c.counts,c.edges,c.mid,c.loc] = histcn(y,[-3:0.05:3]);

c

----output

c =

struct with fields:

counts: [33×1 double]

edges: {[1×33 double]}

mid: {[1×32 double]}

loc: [3000×1 double]

c =

struct with fields:

counts: [120×1 double]

edges: {[1×121 double]}

mid: {[1×120 double]}

loc: [3000×1 double]

Thanks a lot,

José

Andrey Revyakin

Been using this for almost 10 years now for processing of super-resolution microscopy localizations. Very fast as a clustering algorithm for millions of XY float points, to detect clusters and eliminate outliers. For my needs, this produces more predictable and intuitive results than k-means and DBSCAN (and, obviously, much faster).

Petr Skonnikov

Stefan Dukic

Worked for what I needed. Thank you.

Pavel Kolesnichenko

Appeared to be very useful for my needs!

Dima

ben lambert

Hugh Nolan

Yifan Gu

swelltime

thanks for the tool!

Changing the edge input from separate vectors to a cell, and changing the read-in to

edges = varargin{:};

will let you analyze data of arbitrary dimension.

David Shin

I found this script useful as a quick n-dim histogrammer. However, as mentioned below a couple of times, the function behaves undesirably for values at the max bin edge. Users must be aware of this.

Thank you Bruno.

Shengnan Liu

Hi, does anyone have an efficient implementation of the CDF calculation based on this result for 3D data?

Itzik Ben Shabat

Hi, thanks. It's great.

I recommend changing line 108 to

[~,~, loc(:,d)] = histcounts (Xd, ed);

and removing the clear in line 114.

cheers
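For context on the histcounts suggestion above, here is a small sketch (not taken from histcn's source) of one difference between the two built-ins that matters for the max-edge behaviour raised in other comments: histc places a value exactly equal to the last edge in an extra bin of its own, while histcounts closes the last bin on the right. The edges and data are made up for illustration.

```matlab
edges = [0 1 2 3];
x     = [0.5 1.5 2.5 3];   % the last value sits exactly on the final edge

% histc: a value equal to edges(end) gets bin index numel(edges)
[~, binHistc] = histc(x, edges);

% histcounts: the last bin is [2, 3], so the same value joins bin 3
[~, ~, binHistcounts] = histcounts(x, edges);

disp(binHistc)        % 1 2 3 4
disp(binHistcounts)   % 1 2 3 3
```

So swapping histc for histcounts is not purely a speed or style change: points on the outer edge end up in a different bin.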

Ewan Hughes McInnes

Erik S.

Why are the outputs in the cell "mid" one element shorter than the size of the "count" variable?

Massimo Zanetti

Very useful and efficient routine, thanks a lot for submitting! COUNT output is exactly what I wanted!

Brian Little

arnold and J:

I noticed the same issue with the flipping of dimensions. I fixed this by transposing the count output parameter (count') before using it in scatter, pcolor or any other plotting function. Hope this helps :)
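A minimal sketch of why the transpose helps, assuming (as the comments suggest) that the first dimension of count follows the first data column. It builds a tiny 2-D count with built-ins only (discretize needs R2015a or later); the data, edges and variable names are made up. Image-style plotting functions put rows on the vertical axis, so an array indexed as count(xbin, ybin) has to be transposed to line up with a scatter plot of (x, y).

```matlab
x = [0.1 0.2 0.3 0.9];     % first coordinate: three points on the left
y = [0.9 0.8 0.7 0.1];     % second coordinate: those three points are high up
edges = [0 0.5 1];
mid   = [0.25 0.75];       % bin centres, same for both axes

ix = discretize(x, edges); % bin index along x
iy = discretize(y, edges); % bin index along y

% count(i, j): i indexes the x bin, j indexes the y bin
count = accumarray([ix(:) iy(:)], 1, [2 2]);
disp(count)                % [0 3; 1 0]

% imagesc draws rows along the vertical axis, so transpose first,
% and make the y axis increase upward to match a scatter plot
imagesc(mid, mid, count.');
set(gca, 'YDir', 'normal');
```

Without the transpose, the dense cell would appear at the lower right instead of the upper left, which matches the symptom described in the comments.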

Emmanuel Farhi

An excellent and fast hist extension for ND. Thanks.

arnold

hist3 also gives the counts with flipped dimensions... strange

arnold

Very nice work, but I see the same problem as J: the dimensions seem to be switched. I have some non-random data. A scatter plot clearly shows the highest data density in the upper left; histcn gives the same distribution, yet the highest counts are in the lower right. Yes, I set everything to axis xy.

It must be something simple, a hint please :)

Bruno Luong

To Dave: see my comment from May 2009. This is how it is supposed to work.

DaveD

This is a very useful piece of code, but it returns extra dimensions if any of your points fall along an outer edge.

Wok

mahmut

J

If I do the example usage but for 2-D data and with fewer random data points (say 200), then when I imagesc it and use plot() to plot the 2-D scatter, it seems like something is just a *little* bit off. I'm guessing the code is working properly but that the usage example perhaps plots incorrectly, so it doesn't match up with a scatter plot? It's not a simple axis xy shift or flipud/fliplr on a single dimension... the plots look similar, but almost as if a reflection about the symmetry axis is required? Maybe I'm just missing something...

Sanya

Thanks!

Michal Ficek

Excellent work, it really helped me a lot. Runs fast!

Matthias Fripp

About plotting the results:

Jaclyn, if you just want bars plotted for the pairwise densities of two variables (i.e., a plot with x and y axes corresponding to values of your variables, and vertical bars on the x,y grid, corresponding to the counts for each 2D bin), you can get that with hist3().

Will, if you want to plot weights for 3 variables as varying sizes of bubbles on an x,y,z grid, you can do something like this:

% get the histogram

[count edges mid loc] = histcn(vals);

% make a grid for plotting

[X Y Z]=ndgrid(edges{1}, edges{2}, edges{3});

X=X(:); Y=Y(:); Z=Z(:);

% calculate sizes so the most dense cell gets a value of 100

% also convert from volume to "area" (as if drawing a sphere with

% the right volume and cross-sectional area s)

s_scale = 100/(max(count(:))^(2/3));

s = count(:).^(2/3) * s_scale;

% convert any zeros to small numbers for scatter3

s(s==0)=realmin;

% plot the densities

fh=figure();

set(fh, 'Renderer', 'OpenGL'); % faster drawing

scatter3(X, Y, Z, s, 'filled');

Soravit

I really appreciate it. Great Work!!

Andrey

Great. 100,000 XY float points in a 512x512 bin matrix takes ~40 ms, and I still get the list of bin assignments!

Jac Billington

Thank you!

Bruno Luong

Jaclyn, to make the y-axis increase from bottom to top, call

>> set(gca, 'Ydir', 'normal')

This question is generic and better posted in the newsgroup. Thanks for the comment.

Jac Billington

This is great, doing just what I wanted. Just one question: I'm plotting a 3-D histogram of x, y and count. However, it always plots the x axis with values increasing from the origin and the y axis with values decreasing. Reversing the y axis doesn't work in m-code, only when I open the plot editor (a bit tedious).

Could you possibly give me a tip for getting both axes to increase from the origin? I've tried playing with the code, but I'm a bit of a MATLAB novice.

Many thanks. Jaclyn

Bruno Luong

Will, you had better post the plotting problem on the newsgroup. I suggest taking a look at the BAR3 function for the moment.

Will Huang

Is there any way to represent a 3-dimensional histogram?

I have a 3-dimensional matrix with values in it. Is there any way I can change the size of the dot (bubble) to represent the values in a 3-D plot in MATLAB?

Will

Daniel Golden

Works great! I'm using it to make a 2-dimensional histogram with an additional third dimension of "weights."

Bruno Luong

To Bin Zhang:

The way HISTCN is designed: if the user supplies m edges [e1, e2, ..., e_m], then there are m-1 bins:

bin 1: e1 <= X < e2

bin 2: e2 <= X < e3

...

bin m-1: e_{m-1} <= X < e_m

Bin #m does not exist unless a point hits the right border:

X == e_m

You can indeed force the number of output bins to m by changing line #66 to

sz(d) = length(ed)

Thanks,

Bruno
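The edge convention described above can be checked in 1-D with the built-in histc. This is only a sketch: it assumes that counts are accumulated from histc bin indices, which the comments suggest but this page does not show, and the edges and data are made up.

```matlab
edges = [0 1 2 3];                 % m = 4 edges, so m-1 = 3 regular bins

xInside = [0.5 1.5 2.5];           % no point on the right border
xOnEdge = [0.5 1.5 2.5 3];         % one point with X == e_m

[~, locInside] = histc(xInside, edges);
[~, locOnEdge] = histc(xOnEdge, edges);

% Accumulating without a fixed size gives as many bins as the largest index
countInside = accumarray(locInside(:), 1);   % 3 bins
countOnEdge = accumarray(locOnEdge(:), 1);   % 4 bins: bin #m now exists

disp(numel(countInside))   % 3
disp(numel(countOnEdge))   % 4
```

This is the same effect other commenters describe as an extra bin appearing when a point falls on the outer edge.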

bin zhang

Hi, Bruno:

Your coding is really elegant and excellent. ;-)

I have one question about line 66, " sz(d) = length(ed)-1; ". Why is sz(d) not length(ed) instead?

Because of this, when I provide, e.g., 3 edge vectors of size mx1 for a 3-D matrix, I get *count* as an (m-1)^3 rather than an m^3 matrix. Am I misunderstanding something here, or is there some special reason for this?

Thanks,

Bin