Thread Subject: HELP!!!

Subject: HELP!!!

From: jenya polyakova

Date: 12 Nov, 2007 19:13:46

Message: 1 of 6

I have three matrices a, b and c, each of size 10x195.
I need to perform some classification on the first row of
these matrices. For example, k-means clustering would work
just fine. That is, I would map the first row of
each matrix into 3-D space and perform the classification.

For each cluster I then need to calculate the mean by
averaging over the values of the 11th row of matrices
a, b and c.
Note that I do not have kmeans in my MATLAB. Any
suggestions? THANKS

Subject: Re: HELP!!!

From: Randy Poe

Date: 12 Nov, 2007 19:20:24

Message: 2 of 6

On Nov 12, 2:13 pm, "jenya polyakova" <jeny...@yahoo.com> wrote:
> I have three matrices a, b and c, each of size 10x195.
> I need to perform some classification on the first row of
> these matrices. For example, k-means clustering would work
> just fine. That is, I would map the first row of
> each matrix into 3-D space and perform the classification.
>
> For each cluster I then need to calculate the mean by
> averaging over the values of the 11th row of matrices
> a, b and c.
> Note that I do not have kmeans in my MATLAB. Any
> suggestions? THANKS

Look up "k-means clustering" and implement it? It's
a pretty simple algorithm. Lots of links are available
online, some with code.

Actually I'd be surprised if somebody hasn't developed
MATLAB code to share...

Yes. Doing a Google search for "k-means clustering
Matlab" turns up numerous links, such as this one:

http://people.revoledu.com/kardi/tutorial/kMean/matlab_kMeans.htm

and this "fuzzy k-means" at the File Exchange:
http://www.mathworks.com/matlabcentral/fileexchange/loadFile.do?objectId=4353&objectType=File

and this one which looks like ordinary k-means:
http://www.mathworks.com/matlabcentral/fileexchange/loadFile.do?objectId=5324&objectType=file

If you do a search at the File Exchange for "k means" or
for "clustering" you may find other useful codes as well.
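
For readers without the kmeans function, the basic algorithm (often called Lloyd's algorithm) can be sketched in a few lines. This is a minimal illustration, not the code behind any of the links above. The random placeholder data, the number of clusters k, the iteration cap maxIter and the row index r are all assumptions; the thread does not specify them, and r in particular must be a row that exists in the matrices. Reporting one mean per matrix per cluster is one reading of the original request.

```matlab
% Placeholder data standing in for the poster's a, b and c (assumption).
a = rand(10, 195);  b = rand(10, 195);  c = rand(10, 195);

k       = 3;     % number of clusters (assumed; not given in the thread)
maxIter = 100;   % iteration cap (assumed)

% One 3-D point per column: coordinates are the first rows of a, b and c.
X = [a(1, :); b(1, :); c(1, :)]';        % 195x3
n = size(X, 1);

% Initialise the centroids with k distinct data points chosen at random.
p = randperm(n);
C = X(p(1:k), :);                        % kx3
labels = zeros(n, 1);

for iter = 1:maxIter
    % Squared Euclidean distance from every point to every centroid.
    D = zeros(n, k);
    for j = 1:k
        d = X - repmat(C(j, :), n, 1);
        D(:, j) = sum(d .^ 2, 2);
    end

    % Assignment step: each point joins its nearest centroid.
    [dmin, newLabels] = min(D, [], 2);
    if isequal(newLabels, labels)
        break;                           % assignments unchanged: converged
    end
    labels = newLabels;

    % Update step: each centroid becomes the mean of its members.
    for j = 1:k
        members = (labels == j);
        if any(members)                  % an empty cluster keeps its old centroid
            C(j, :) = mean(X(members, :), 1);
        end
    end
end

% Per-cluster means of another row of a, b and c.
r = 2;                                   % hypothetical row index (assumption)
clusterMeans = zeros(k, 3);              % columns correspond to a, b, c
for j = 1:k
    members = (labels == j);
    clusterMeans(j, :) = [mean(a(r, members)), ...
                          mean(b(r, members)), ...
                          mean(c(r, members))];
end
```

Because the starting centroids are random, repeated runs can give different clusterings; running several times and keeping the result with the smallest total within-cluster distance is a common precaution.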

           - Randy

Subject: Re: HELP!!!

From: Greg Heath

Date: 12 Nov, 2007 20:34:32

Message: 3 of 6

On Nov 12, 2:13 pm, "jenya polyakova" <jeny...@yahoo.com> wrote:
> I have three matrices a, b and c, each of size 10x195.
> I need to perform some classification on the first row of
> these matrices. For example, k-means clustering would work
> just fine. That is, I would map the first row of
> each matrix into 3-D space and perform the classification.
>
> For each cluster I then need to calculate the mean by
> averaging over the values of the 11th row of matrices
> a, b and c.
> Note that I do not have kmeans in my MATLAB. Any
> suggestions? THANKS

In general, unsupervised clustering of a mixture of multiclass
data yields a suboptimal cluster-based classifier. However,
cluster-based classification can be improved significantly
if supervised clustering, which uses class labels, is used.

Effective versions of classifiers designed via supervised
clustering can be found by searching for the acronyms
ART, LVQ and RCE. However, I'm not sure whether the
corresponding MATLAB code is readily available.

A simple alternative is just to cluster each class
separately and compare classification results with
classifiers created from clustering the multiclass mixture.
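
The first half of that comparison, clustering each class separately and classifying by nearest prototype, might look roughly like the sketch below. It is only an illustration under stated assumptions: the synthetic two-class data, the number of prototypes per class kPer and the iteration cap maxIter are all made up for the example, and the accuracy is measured on the training data itself.

```matlab
% Synthetic labelled data (assumption): two classes in 2-D, 100 points each.
X = [randn(100, 2); randn(100, 2) + 4];
y = [ones(100, 1); 2 * ones(100, 1)];

kPer    = 2;     % prototypes per class (assumed)
maxIter = 50;    % iteration cap (assumed)
classes = unique(y);

P  = [];         % prototype coordinates, one row per prototype
pc = [];         % class label attached to each prototype

for ci = 1:numel(classes)
    Xc = X(y == classes(ci), :);         % points of this class only
    nc = size(Xc, 1);
    p  = randperm(nc);
    C  = Xc(p(1:kPer), :);               % initial centroids from this class

    for iter = 1:maxIter
        % Assign each point of the class to its nearest centroid.
        D = zeros(nc, kPer);
        for j = 1:kPer
            d = Xc - repmat(C(j, :), nc, 1);
            D(:, j) = sum(d .^ 2, 2);
        end
        [dmin, lab] = min(D, [], 2);

        % Move each centroid to the mean of its members.
        Cold = C;
        for j = 1:kPer
            if any(lab == j)
                C(j, :) = mean(Xc(lab == j, :), 1);
            end
        end
        if isequal(C, Cold)
            break;                       % centroids unchanged: converged
        end
    end

    P  = [P; C];
    pc = [pc; repmat(classes(ci), kPer, 1)];
end

% Nearest-prototype classification: a point takes the class of the
% closest prototype.
n  = size(X, 1);
nP = size(P, 1);
D  = zeros(n, nP);
for j = 1:nP
    d = X - repmat(P(j, :), n, 1);
    D(:, j) = sum(d .^ 2, 2);
end
[dmin, nearest] = min(D, [], 2);
yhat     = pc(nearest);
accuracy = mean(yhat == y);              % fraction classified correctly
```

For the other half of the comparison, the same clustering loop would be run once on all of X without labels, each resulting cluster would be given the majority class of its members, and the two accuracies would be compared, ideally on held-out data.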

Hope this helps.

Greg

Subject: Re: HELP!!!

From: roberson@ibd.nrc-cnrc.gc.ca (Walter Roberson)

Date: 12 Nov, 2007 21:05:09

Message: 4 of 6

In article <1194899672.262350.297810@22g2000hsm.googlegroups.com>,
Greg Heath <heath@alumni.brown.edu> wrote:
>In general, unsupervised clustering of a mixture of multiclass
>data yields a suboptimal cluster-based classifier. However,
>cluster-based classification can be improved significantly
>if supervised clustering, which uses class labels, is used.

*If*, that is, the class labels are correct. Which turns
out to be a problem in practice. It is unfortunately not "rare"
for us to receive datasets in which samples have been misclassified.

The "Gold Standard" is classification by a trained, experienced human
expert, but even experts make mistakes or are misled by the data
subset that they examine to classify by (e.g., the visual shape of a
cell). We have found that, for some datasets, our unsupervised
classification methods have an accuracy significantly exceeding the
"Gold Standard".

A related issue that we deal with a lot is that when the datasets
contain large amounts of data (e.g., most any of the modern medical
"scanners" such as CT, MRS, MRI, infra-red), humans have a lot of
difficulty in perceiving the abstract multidimensional patterns
needed in order to create class labels in the first place. Spectral
noise certainly doesn't help!

Supervised classification is great if you already know exactly
what you are looking for, but it is not very good at figuring out
new relationships. If you have your eye on peaks in the oxygen
flow, you are likely to completely miss the much better correlation
with (say) the calcium concentration information...
--
   "Beware of bugs in the above code; I have only proved it correct,
   not tried it." -- Donald Knuth

Subject: Re: HELP!!!

From: Greg Heath

Date: 12 Nov, 2007 21:06:31

Message: 5 of 6

On Nov 12, 3:34 pm, Greg Heath <he...@alumni.brown.edu> wrote:
> On Nov 12, 2:13 pm, "jenya polyakova" <jeny...@yahoo.com> wrote:
>
> > I have three matrices a, b and c, each of size 10x195.
> > I need to perform some classification on the first row of
> > these matrices. For example, k-means clustering would work
> > just fine. That is, I would map the first row of
> > each matrix into 3-D space and perform the classification.
>
> > For each cluster I then need to calculate the mean by
> > averaging over the values of the 11th row of matrices
> > a, b and c.
> > Note that I do not have kmeans in my MATLAB. Any
> > suggestions? THANKS
>
> In general, unsupervised clustering of a mixture of multiclass
> data yields a suboptimal cluster-based classifier. However,
> cluster-based classification can be improved significantly
> if supervised clustering, which uses class labels, is used.
>
> Effective versions of classifiers designed via supervised
> clustering can be found by searching for the acronyms
> ART, LVQ and RCE. However, I'm not sure whether the
> corresponding MATLAB code is readily available.


> A simple alternative is just to cluster each class
> separately and compare classification results with
> classifiers created from clustering the multiclass mixture.

If you wish to search for MATLAB codes, the following
information may help:

Adaptive Resonance Theory, Grossberg
Learning Vector Quantization, Kohonen
Reduced Coulomb Energy, Cooper
AKA
Restricted Coulomb Energy, Cooper

Hope this helps.

Greg


Subject: Re: HELP!!!

From: Greg Heath

Date: 12 Nov, 2007 21:13:41

Message: 6 of 6

On Nov 12, 4:05 pm, rober...@ibd.nrc-cnrc.gc.ca (Walter Roberson)
wrote:
> In article <1194899672.262350.297...@22g2000hsm.googlegroups.com>,
> Greg Heath <he...@alumni.brown.edu> wrote:
>
> >In general, unsupervised clustering of a mixture of multiclass
> >data yields a suboptimal cluster-based classifier. However,
> >cluster-based classification can be improved significantly
> >if supervised clustering, which uses class labels, is used.
>
> *If*, that is, the class labels are correct. Which turns
> out to be a problem in practice. It is unfortunately not "rare"
> for us to receive datasets in which samples have been misclassified.
>
> The "Gold Standard" is classification by a trained, experienced human
> expert, but even experts make mistakes or are misled by the data
> subset that they examine to classify by (e.g., the visual shape of a
> cell). We have found that, for some datasets, our unsupervised
> classification methods have an accuracy significantly exceeding the
> "Gold Standard".
>
> A related issue that we deal with a lot is that when the datasets
> contain large amounts of data (e.g., most any of the modern medical
> "scanners" such as CT, MRS, MRI, infra-red), humans have a lot of
> difficulty in perceiving the abstract multidimensional patterns
> needed in order to create class labels in the first place. Spectral
> noise certainly doesn't help!
>
> Supervised classification is great if you already know exactly
> what you are looking for, but it is not very good at figuring out
> new relationships. If you have your eye on peaks in the oxygen
> flow, you are likely to completely miss the much better correlation
> with (say) the calcium concentration information...
> --

That is why I have always recommended (search on greg-heath
pretraining advice) that unsupervised methods such as unsupervised
clustering and principal component analysis be used, before
supervised learning, in order to torture the data until they confess.
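
As one concrete form of that unsupervised first look, principal component analysis can be done with a singular value decomposition and needs no toolbox. The sketch below is only an illustration: the random placeholder data and the choice to keep two components are assumptions, not something stated in the thread.

```matlab
% Placeholder data (assumption): 195 observations of 3 variables.
X = rand(195, 3);
n = size(X, 1);

% Centre each variable; PCA is defined on mean-removed data.
mu = mean(X, 1);
Xc = X - repmat(mu, n, 1);

% Economy-size SVD: the columns of V are the principal directions,
% ordered by decreasing singular value.
[U, S, V] = svd(Xc, 0);

scores    = Xc * V;                      % data in principal-component coordinates
variances = diag(S) .^ 2 / (n - 1);      % variance along each component
explained = variances / sum(variances);  % fraction of the total variance

% Keep the leading components (two here, as an example) as inputs to
% later clustering or supervised learning.
Z = scores(:, 1:2);
```

Inspecting explained shows how much of the variation each component carries, which helps decide how many components are worth keeping before any class labels are brought in.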

Hope this helps.

Greg
