# Clustering of 1D data

joy on 2 Mar 2015
Answered: MS on 11 Sep 2019
So let's say I have an array like this:
[1,1,2,3,10,11,13,67,71]
Is there a convenient way to partition the array into something like this?
[[1,1,2,3],[10,11,13],[67,71]]
I searched on this topic, and it seems that k-means is not well suited to 1D data. Jenks Natural Breaks Optimization or Kernel Density Estimation could be options, but which of them is suitable for a MATLAB implementation? Is there any other way to do this in MATLAB?

MS on 11 Sep 2019
Yes, you can apply Jenks Natural Breaks iteratively to split the array into several classes based on the similarity of the elements. I wrote a function that applies this method to a one-dimensional array and splits it into two classes. To get more classes, call it several times, updating the data array after each split.
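For reference, the Goodness of Variance Fit (GVF) that the example maximizes is usually defined as follows for a two-class split after index i of a sorted array x_1, ..., x_n. This is the textbook Jenks definition; it is an assumption that the function used here computes exactly this.

```latex
\mathrm{SDAM} = \sum_{j=1}^{n} (x_j - \bar{x})^2
\qquad
\mathrm{SDCM}(i) = \sum_{j=1}^{i} \left(x_j - \bar{x}_{1:i}\right)^2
                 + \sum_{j=i+1}^{n} \left(x_j - \bar{x}_{i+1:n}\right)^2
\qquad
\mathrm{GVF}(i) = \frac{\mathrm{SDAM} - \mathrm{SDCM}(i)}{\mathrm{SDAM}}
```

Here \bar{x} is the mean of the whole array, and \bar{x}_{1:i} and \bar{x}_{i+1:n} are the means of the two candidate classes. A GVF close to 1 means the two classes explain almost all of the variance.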
Example:
data = [1,1,2,3,10,11,13,67,71];   % assumed sorted in ascending order
% Split the initial array into two classes based on Jenks Natural Breaks
[SDCM_All, GF] = get_jenks_interface(data);
% Get the first interface: index of the maximum Goodness of Variance Fit
[~, I1] = max(GF);
% Extract sub_array_3 (everything after the first interface)
sub_array_3 = data(I1+1:end);
% Get the remaining elements
remaining_elements = data(1:I1);
% Split the remaining elements into two classes based on Jenks Natural Breaks
[SDCM_All, GF] = get_jenks_interface(remaining_elements);
% Get the second interface: index of the maximum Goodness of Variance Fit
[~, I2] = max(GF);
% Extract sub_array_2 (index into remaining_elements, not the original data)
sub_array_2 = remaining_elements(I2+1:end);
% Extract sub_array_1
sub_array_1 = remaining_elements(1:I2);
disp(sub_array_1);
disp(sub_array_2);
disp(sub_array_3);
Output:
>> main
1 1 2 3
10 11 13
67 71
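The repeated splitting can also be written as a loop. The code for get_jenks_interface is not listed in this thread, so the sketch below uses a hypothetical stand-in, gvf_per_split, which assumes the standard two-class Goodness of Variance Fit; num_classes and the helper names ssd and sdcm_at are also assumptions made for illustration. It follows the same strategy as the example: peel off the upper class, then keep splitting the lower remainder.

```matlab
% Sketch only: gvf_per_split is a hypothetical stand-in for the answer's
% get_jenks_interface, which is not listed in the thread.
ssd = @(v) sum((v - mean(v)).^2);                 % sum of squared deviations from the mean
sdcm_at = @(x, i) ssd(x(1:i)) + ssd(x(i+1:end));  % two-class SDCM for a split after index i
% GVF for every possible split index 1..n-1
gvf_per_split = @(x) arrayfun(@(i) (ssd(x) - sdcm_at(x, i)) / ssd(x), 1:numel(x)-1);

data = sort([1,1,2,3,10,11,13,67,71]);   % index-based splitting needs sorted data
num_classes = 3;                         % assumed target number of classes

classes = {};          % finished classes, filled from the right
remaining = data;
for k = 1:num_classes-1
    GF = gvf_per_split(remaining);
    [~, I] = max(GF);                           % best two-class split point
    classes = [{remaining(I+1:end)}, classes];  % peel off the upper class
    remaining = remaining(1:I);                 % keep splitting the lower part
end
classes = [{remaining}, classes];

for k = 1:numel(classes)
    disp(classes{k});
end
```

This always splits the lower remainder, which fits this data set; for other data it may be better to split whichever class gives the largest gain in fit. The sketch also does not guard against a remainder with a single element or with all elements equal, where the GVF is undefined.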