What is the logic behind contradicting bins in the function histc?

I am trying to understand a certain behavior of MATLAB's function histc. The bin edges should be monotonically non-decreasing. However, when repeated edges are present, MATLAB does not throw an error. Instead, the following happens:
binranges = [-Inf 3 4 4 5 Inf];
a = [4 3];
histc(a,binranges)
ans = [0 1 0 1 0 0]
According to my understanding, MATLAB creates the following six bins:
-Inf <= x1 < 3
3 <= x2 < 4
4 <= x3 < 4 ??
4 <= x4 < 5
5 <= x5 < Inf
Inf = x6
In the bin 4 <= x3 < 4, MATLAB counts 0, which makes sense, since a number cannot be both equal to and strictly less than a specific value at the same time. Is there a rationale behind this contradictory bin, or a reason why it exists?
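The bin rule listed above can be checked by hand. The following is only a sketch of the counting rule (edges(k) <= x < edges(k+1), with the last bin matching edges(end) exactly), not histc's actual implementation; the variable name counts is an assumption for illustration.

```matlab
binranges = [-Inf 3 4 4 5 Inf];
a = [4 3];

% One count per edge, as histc returns
counts = zeros(size(binranges));

% Bin k holds values with binranges(k) <= x < binranges(k+1)
for k = 1:numel(binranges)-1
    counts(k) = sum(a >= binranges(k) & a < binranges(k+1));
end

% The last bin only counts values exactly equal to the last edge
counts(end) = sum(a == binranges(end));

disp(counts)
```

The zero-width bin 4 <= x < 4 can never be satisfied, so the value 4 falls through to the next bin, 4 <= x < 5. This reproduces the histc result shown above.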

Accepted Answer

dpb on 14 Feb 2015
It only "exists" when you inadvertently create a zero-width bin, which is, imo, a user input error. The author(s) of histc apparently didn't think that anybody would knowingly do so, so the input error checking doesn't catch the problem.
It would seem a reasonable enhancement request, to add a little more robustness to the code. OTOMH I can think of no reason why this facility would be advantageous.
One can put the "defensive programming" paradigm to work in one's own code, however, by using unique on the input vector before calling histc to ensure there is no such error in the binning edges vector.
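A minimal sketch of that defensive step, using the edges from the question. The names hasZeroWidth and cleanEdges are made up for illustration; the diff check is an optional extra for detecting the problem, not something histc does itself.

```matlab
binranges = [-Inf 3 4 4 5 Inf];
a = [4 3];

% Detect zero-width bins: adjacent edges that are equal
hasZeroWidth = any(diff(binranges) == 0);

% unique returns the edges sorted with repeats removed
cleanEdges = unique(binranges);

% Count with the cleaned edges; there is now one bin fewer
counts = histc(a, cleanEdges);
disp(counts)
```

Note that the output has one element per remaining edge, so any code that indexes into the counts by the original edge positions would need adjusting.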
  2 Comments
Caroline on 14 Feb 2015
Thank you very much for your answer. My confusion stems from the fact that the MATLAB function kstest2 seems to depend on this quirky feature. In that function, repeated values are usually present in the input, and the developers have chosen to bin the input with histc, taking advantage of this seemingly weird behavior. Have they just taken advantage of a bug?
dpb on 14 Feb 2015
Edited: dpb on 15 Feb 2015
Well, it's not a bug unless it violates the documentation. Hence it's a "feature"... :)
Don't have a klew as to whether the latter use influenced the behavior or not. Since the toolboxes sorta' followed the base product, my guess is that histc was simply implemented as described(+) from the start, and it just so happens there's a case where allowing repeated edges saves a step by not needing unique to filter the elements before calling it.
(+) Just looked back at the earliest release I have installed (R11) and histc has the same behavior there. I presume it was that way from the get-go.
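To illustrate why repeated edges can be harmless in that kind of use, here is a sketch of a two-sample empirical CDF comparison that bins on the pooled, sorted data. It is written in the spirit of what the comments describe and is not claimed to match the kstest2 source; the sample values and all variable names are hypothetical.

```matlab
% Hypothetical samples with tied values
x1 = [1; 2; 2; 3];
x2 = [2; 2; 4];

% Pooled, sorted data as bin edges; ties produce repeated edges
binEdges = [-Inf; sort([x1; x2]); Inf];

counts1 = histc(x1, binEdges, 1);
counts2 = histc(x2, binEdges, 1);

% Zero-width bins contribute 0, so the running totals are unaffected
cdf1 = cumsum(counts1) ./ sum(counts1);
cdf2 = cumsum(counts2) ./ sum(counts2);

% Largest gap between the two empirical CDFs
ksStat = max(abs(cdf1(1:end-1) - cdf2(1:end-1)));

% Same statistic after removing the repeated edges with unique
uEdges = unique(binEdges);
u1 = cumsum(histc(x1, uEdges, 1)) ./ numel(x1);
u2 = cumsum(histc(x2, uEdges, 1)) ./ numel(x2);
ksStatUnique = max(abs(u1(1:end-1) - u2(1:end-1)));
```

Both routes give the same maximum gap for these samples, because the empty bins only repeat values already present in the cumulative sums. Skipping unique simply saves a step.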
