Thread Subject: Clustering a curve of 3d points

Subject: Clustering a curve of 3d points

From: Sven

Date: 16 Jun, 2009 20:57:02

Message: 1 of 3

Hi all,

I'm trying to group a set of points distributed in 3d space. The points come from a set of points that contain both true and false positives, and I'm trying to weed out the false positives.
The true positives should essentially describe a long curve through my space.

Can anyone help me describe the 'type' of problem I'm facing here, so I can search for the right keywords? Even better, some hints about how to work towards a solution would be great.

I've included the basic data below. Just copy the (n by 4) matrix below into a variable called 'found_pts', and then run the simple code snippet below it. You'll see a plot of my points, and at about mid-level (in Z), you'll see the long curve of true positives I'm trying to extract (as automatically as possible).
The first 3 columns simply give X, Y, and Z locations. The 4th column can be ignored for the moment.

Thanks,
Sven.

found_pts = []; % Copy from below

290 143 -700 0.565
290 142 -695 0.499
169 180 -690 0.429
255 174 -685 0.553
261 176 -680 0.538
216 150 -675 0.587
239 149 -670 0.582
289 149 -665 0.569
169 151 -660 0.576
180 183 -655 0.547
178 183 -650 0.618
156 146 -645 0.58
154 146 -640 0.638
151 146 -635 0.625
147 146 -630 0.661
144 144 -625 0.642
262 130 -620 0.678
263 129 -615 0.774
264 128 -610 0.647
264 127 -605 0.573
266 125 -600 0.658
267 125 -595 0.736
268 123 -590 0.763
268 122 -585 0.752
268 122 -580 0.729
269 120 -575 0.699
269 120 -570 0.708
270 121 -565 0.688
271 122 -560 0.635
201 117 -555 0.646
201 117 -550 0.654
201 116 -545 0.656
201 115 -540 0.66
200 114 -535 0.664
199 115 -530 0.639
198 115 -525 0.659
198 116 -520 0.637
226 173 -515 0.605
206 114 -510 0.592
203 112 -505 0.663
203 109 -500 0.657
248 162 -495 0.706
274 164 -490 0.709
274 165 -485 0.716
274 166 -480 0.73
198 108 -475 0.654
198 107 -470 0.67
198 106 -465 0.659
199 104 -460 0.665
200 103 -455 0.662
201 102 -450 0.626
200 104 -445 0.619
255 188 -440 0.595
244 123 -435 0.565
200 102 -430 0.563
195 166 -425 0.572
195 167 -420 0.618
198 97.2 -415 0.619
122 142 -410 0.655
264 177 -405 0.649
245 115 -400 0.664
248 115 -395 0.688
195 95.8 -390 0.666
195 94.3 -385 0.668
195 92.1 -380 0.639
193 91.4 -375 0.673
145 123 -370 0.677
192 87.8 -365 0.641
194 87.1 -360 0.652
193 87.8 -355 0.633
195 86.4 -350 0.642
195 84.9 -345 0.63
195 83.5 -340 0.673
194 82 -335 0.656
193 81.3 -330 0.665
193 81.3 -325 0.682
193 81.3 -320 0.734
193 82 -315 0.71
231 156 -310 0.512
241 89.2 -305 0.505
237 89.2 -300 0.604
233 88.5 -295 0.615
230 89.2 -290 0.617
227 89.2 -285 0.573
224 88.5 -280 0.534
216 90 -275 0.467
265 86.4 -270 0.555
190 92.9 -265 0.601
155 93.6 -260 0.565
262 87.1 -255 0.577
262 87.1 -250 0.607
263 87.1 -245 0.553
147 165 -240 0.522
156 95 -235 0.557
163 96.5 -230 0.592
214 92.9 -228 0.559
213 93.6 -226 0.54
213 93.6 -225 0.516
247 91.4 -224 0.546
245 91.4 -223 0.528
237 92.1 -221 0.556

numcols = 15;
cols = jet(numcols);
hist_edges = linspace(min(found_pts(:,4)), max(found_pts(:,4)), numcols);
[N, n_bin] = histc(found_pts(:,4), hist_edges);
bin_size = mean(diff(X))/2;
figure, hold on
for i=1:numcols-1
    idxs = n_bin==i;
    plot3(found_pts(idxs,1), found_pts(idxs,2), found_pts(idxs,3),'.','Color',cols(i,:))
end
axis tight, view(3)
colorbar, caxis(hist_edges([1 end]))

Subject: Clustering a curve of 3d points

From: Tom Lane

Date: 17 Jun, 2009 14:32:53

Message: 2 of 3

> I'm trying to group a set of points distributed in 3d space. The points
> come from a set of points that contain both true and false positives, and
> I'm trying to weed out the false positives.
> The true positives should essentially describe a long curve through my
> space.

Sven, there may well be a better way, but I do have one idea. Cluster
analysis seeks to group points into clusters based on the distance between
them. Single-linkage clustering seeks to avoid gaps at the expense of
perhaps having a long cluster. So it might be able to find curves like the
one you describe.

Try this on your data.

% Try single-linkage clustering in attempt to keep nearby points together
figure(1)
Y = pdist(found_pts(:,1:3));
Z = linkage(Y,'single');
dendrogram(Z,101,'colorthreshold',30);
t = cluster(Z,'cutoff',30,'criterion','distance')

% Plot points color-coded by cluster assignment
figure(2), hold on
cols = jet(max(t));
for i=1:max(t)
    idxs = t==i;
    plot3(found_pts(idxs,1), found_pts(idxs,2),
found_pts(idxs,3),'.','Color',cols(i,:))
end

-- Tom

Subject: Clustering a curve of 3d points

From: Sven

Date: 17 Jun, 2009 15:48:01

Message: 3 of 3

> > I'm trying to group a set of points distributed in 3d space. The points
> > come from a set of points that contain both true and false positives, and
> > I'm trying to weed out the false positives.
> > The true positives should essentially describe a long curve through my
> > space.
>
> Sven, there may well be a better way, but I do have one idea. Cluster
> analysis seeks to group points into clusters based on the distance between
> them. Single-linkage clustering seeks to avoid gaps at the expense of
> perhaps having a long cluster. So it might be able to find curves like the
> one you describe.
>
> Try this on your data.
>
> % Try single-linkage clustering in attempt to keep nearby points together
> figure(1)
> Y = pdist(found_pts(:,1:3));
> Z = linkage(Y,'single');
> dendrogram(Z,101,'colorthreshold',30);
> t = cluster(Z,'cutoff',30,'criterion','distance')
>
> % Plot points color-coded by cluster assignment
> figure(2), hold on
> cols = jet(max(t));
> for i=1:max(t)
> idxs = t==i;
> plot3(found_pts(idxs,1), found_pts(idxs,2),
> found_pts(idxs,3),'.','Color',cols(i,:))
> end

Thanks for your help there Tom. Unfortunately I've only got the image processing toolbox at my disposal right now.
It looks like the stats toolbox could be very handy though.
Any ideas that don't rely on the stats toolbox?

For example, since my points are distributed along Z (with no points having the same Z location), the following code effectively weeds out any single outliers (ie, points which differ significantly in XY location from the previous point). Any further ideas?

idxs = abs(prod(diff(found_pts(:,1:2)),2)) < 20^2;
found_pts = found_pts(idxs,:);

Cheers,
Sven.

Tags for this Thread

Everyone's Tags:

Add a New Tag:

Separated by commas
Ex.: root locus, bode

What are tags?

A tag is like a keyword or category label associated with each thread. Tags make it easier for you to find threads of interest.

Anyone can tag a thread. Tags are public and visible to everyone.

Tag Activity for This Thread
Tag Applied By Date/Time
clustering Sven 16 Jun, 2009 16:59:08
rssFeed for this Thread
 

MATLAB Central Terms of Use

NOTICE: Any content you submit to MATLAB Central, including personal information, is not subject to the protections which may be afforded information collected under other sections of The MathWorks, Inc. Web site. You are entirely responsible for all content that you upload, post, e-mail, transmit or otherwise make available via MATLAB Central. The MathWorks does not control the content posted by visitors to MATLAB Central and, does not guarantee the accuracy, integrity, or quality of such content. Under no circumstances will The MathWorks be liable in any way for any content not authored by The MathWorks, or any loss or damage of any kind incurred as a result of the use of any content posted, e-mailed, transmitted or otherwise made available via MATLAB Central. Read the complete Terms prior to use.

Contact us at files@mathworks.com