How to bin vector values into a fixed set of evenly distributed groups (most similar numerosity)

I would like to bin contiguous values into a fixed number of evenly DISTRIBUTED groups (most similar numerosity).
However, with 'histcounts' or 'discretize' I am only able to group the data into evenly SPACED bins (bins of equal width).
Assume the following scenario:
N1 = ones([89 1]); N2 = repmat(2,[64 1]); N3 = repmat(3,[34 1]); N4 = repmat(4,[31 1]); N5 = repmat(5,[2 1]);
N6 = repmat(6,[37 1]); N7 = repmat(7,[19 1]); N8 = repmat(8,[1 1]); N9 = repmat(9,[2 1]);
VEC = [N1 ; N2 ; N3 ; N4 ; N5 ; N6 ; N7 ; N8 ; N9];
I would like to bin the values in VEC into THREE groups whose sizes (numerosity) are as similar as possible.
In other words, I want the solution that minimizes the sum of the absolute differences in size between each pair of groups.
Also, groups must respect the order of the values: a group composed of values 3, 4, and 5 is valid, but a group composed of 3, 4, and 8 is not, because 4 and 8 are not contiguous.
In this instance, the solution I aim to obtain is the following:
- group 1: values 1 (N = 89)
- group 2: values 2 and 3 (N = 98)
- group 3: values 4, 5, 6, 7, 8, and 9 (N = 92).
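For reference, the target grouping above corresponds to unevenly spaced bin edges, which discretize accepts when the edges are passed explicitly. The sketch below only checks the group sizes for that one choice of edges; it does not find the edges. The names reps, edges, labels, and sizes are illustrative and not part of the original code, and repelem is assumed to be available (R2015a or later).

```matlab
% Rebuild the example data from the question (same content as VEC).
reps = [89 64 34 31 2 37 19 1 2];     % occurrences of each value 1..9
VEC  = repelem((1:9).', reps(:));     % 279-by-1 column vector

% Unevenly spaced edges: [1,2) -> group 1, [2,4) -> group 2, [4,9] -> group 3.
% discretize closes the last bin on the right, so the value 9 is included.
edges  = [1 2 4 9];
labels = discretize(VEC, edges);

% Number of elements that landed in each group.
sizes = accumarray(labels, 1).';      % [89 98 92]
```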
I have tried prctile, histcounts, and discretize, but I cannot get this solution.
I have settled on hard-coding every allowed combination of values for the three groups (N = 28), like this:
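C = accumarray(VEC,1).'; % occurrences of each value 1..9 (N1..N9 above are vectors, so they cannot be added directly)
X = NaN([28 3]);
X(1,:)=[C(1) C(2) sum(C(3:9))];
X(2,:)=[C(1) sum(C(2:3)) sum(C(4:9))];
X(3,:)=[C(1) sum(C(2:4)) sum(C(5:9))];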
...
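The 28 rows could probably be generated rather than typed out. With 9 distinct values there are 8 places to cut between neighbouring values, and choosing 2 of them gives nchoosek(8,2) = 28 contiguous three-way splits. A minimal sketch, assuming VEC contains only positive integers so that accumarray can count them directly; reps, counts, cs, nVals, and cuts are illustrative names:

```matlab
% Rebuild the example data from the question (same content as VEC).
reps = [89 64 34 31 2 37 19 1 2];     % occurrences of each value 1..9
VEC  = repelem((1:9).', reps(:));

% Count how often each value occurs, then take running totals.
counts = accumarray(VEC, 1);          % 9-by-1
cs     = cumsum(counts);              % cs(p) = number of elements <= p
nVals  = numel(counts);

% Each row holds two cut positions; a cut at p means "a group ends after value p".
cuts = nchoosek(1:nVals-1, 2);        % 28-by-2, each row in ascending order

% Group sizes are differences of the running totals at the cut positions.
X = [cs(cuts(:,1)), ...
     cs(cuts(:,2)) - cs(cuts(:,1)), ...
     cs(end)       - cs(cuts(:,2))];  % 28-by-3
```

The rows come out in the same order as the hand-written ones, so row 2 is the split {1}, {2,3}, {4..9}. For k groups the same idea would use nchoosek(1:nVals-1, k-1), although the number of rows grows quickly with the number of distinct values.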
I then pick the best solution (minimum sum of differences) as follows:
SCORE = NaN([28 1]);
for i = 1:size(X,1)
SCORE(i) = ((abs(X(i,1)-X(i,2)))+(abs(X(i,1)-X(i,3)))+(abs(X(i,2)-X(i,3))))/3; % (I guess that '/3' is irrelevant)
end
[~,IDXmin]=min(SCORE);
GROUPS = X(IDXmin,:);
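The loop can also be written without iteration by operating on whole columns of X. The sketch below uses only the first three rows of X, written out as group sizes, to keep it short. It drops the '/3', since dividing every score by the same positive constant does not change which row is the minimum.

```matlab
% First three rows of X from the question, as group sizes.
X = [89  64 126;
     89  98  92;
     89 129  61];

% Sum of absolute pairwise differences for every row at once.
SCORE = abs(X(:,1) - X(:,2)) + abs(X(:,1) - X(:,3)) + abs(X(:,2) - X(:,3));

% Row with the smallest score.
[~, IDXmin] = min(SCORE);
GROUPS = X(IDXmin, :);    % [89 98 92] for these three rows
```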
Although this solution works, I am sure that there is a much simpler and more elegant solution out there.
Can anyone help me with this please?

