I have three matrices a, b, and c, each of size 10x195.
I need to perform some classification on the first row of
these matrices. For example, k-means clustering would work
just fine. That is, I would map the first row of
each matrix into 3-D space and perform the classification.
For each cluster I then need to calculate the mean by
averaging over the values of the 11th row of matrices
a, b, and c.
Note: I do not have kmeans in my MATLAB. Any
suggestions? Thanks.
On Nov 12, 2:13 pm, "jenya polyakova" <jeny...@yahoo.com> wrote:
> Note: I do not have kmeans in my MATLAB. Any
> suggestions? Thanks.
Look up "k-means clustering" and implement it? It's
a pretty simple algorithm. Lots of links are available
online, some with code.
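For what it's worth, the basic algorithm (Lloyd's iteration) fits in a few lines and needs no toolbox. What follows is a minimal sketch under stated assumptions, not a tested production routine. The assumptions are mine, not the original poster's: k = 3 clusters, a deterministic start from evenly spaced points, random placeholder data in place of the real a, b and c, valRow set to the last row as a stand-in for the row to be averaged, and one mean per matrix per cluster.

```matlab
% Placeholder data standing in for the real 10x195 matrices a, b, c.
% Three artificial groups are built in so the demo has something to find.
offset = repmat([zeros(1, 65), 3 * ones(1, 65), 6 * ones(1, 65)], 10, 1);
a = rand(10, 195) + offset;
b = rand(10, 195) + offset;
c = rand(10, 195) + offset;

featRow = 1;            % row used as the clustering feature (from the post)
valRow  = size(a, 1);   % row to average per cluster: placeholder, set as needed
k       = 3;            % number of clusters (an assumption)
maxIter = 100;

% Each column j becomes one point in 3-D: (a(1,j), b(1,j), c(1,j)).
X = [a(featRow, :).', b(featRow, :).', c(featRow, :).'];   % 195x3
N = size(X, 1);

% Deterministic start: k data points spread evenly through the set.
C   = X(round(linspace(1, N, k)), :);
idx = zeros(N, 1);

for iter = 1:maxIter
    % Assignment step: squared Euclidean distance to each centroid.
    D = zeros(N, k);
    for j = 1:k
        d = X - repmat(C(j, :), N, 1);
        D(:, j) = sum(d.^2, 2);
    end
    [dmin, newIdx] = min(D, [], 2);   % dmin is not used further
    if isequal(newIdx, idx)
        break                         % assignments unchanged: converged
    end
    idx = newIdx;
    % Update step: move each centroid to the mean of its points.
    for j = 1:k
        if any(idx == j)              % leave an empty cluster's centroid alone
            C(j, :) = mean(X(idx == j, :), 1);
        end
    end
end

% Per-cluster means of the chosen row, one column per matrix (a, b, c).
clusterMeans = zeros(k, 3);
for j = 1:k
    m = (idx == j);
    clusterMeans(j, :) = [mean(a(valRow, m)), mean(b(valRow, m)), mean(c(valRow, m))];
end
disp(clusterMeans)
```

With real data, delete the placeholder lines and set valRow to the row you need. Since k-means only finds a local optimum, it is probably worth rerunning from several different starting points and keeping the solution with the smallest total within-cluster distance.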
Actually I'd be surprised if somebody hasn't developed
Matlab code to share...
Yes. A Google search for "k-means clustering
Matlab" turns up numerous links, such as this one:
On Nov 12, 2:13 pm, "jenya polyakova" <jeny...@yahoo.com> wrote:
> I have three matrices a, b, and c, each of size 10x195.
> I need to perform some classification on the first row of
> these matrices. For example, k-means clustering would work
> just fine. That is, I would map the first row of
> each matrix into 3-D space and perform the classification.
In general, applying unsupervised clustering to a mixture
of multiclass data yields a suboptimal cluster-based
classifier. However, cluster-based classification
can be improved significantly if supervised clustering,
which uses the class labels, is applied instead.
Effective versions of classifiers designed via supervised
clustering can be found by searching the acronyms
of ART, LVQ and RCE. However, I'm not sure if the
corresponding MATLAB code is readily available.
A simple alternative is just to cluster each class
separately and compare classification results with
classifiers created from clustering the multiclass mixture.
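A rough sketch of that comparison, in case it helps. Everything specific here is an assumption of mine and not something stated above: the two-class placeholder data, two prototypes per class, a nearest-prototype decision rule, majority-vote labelling of the mixture clusters, and error measured on the training data itself.

```matlab
% Hypothetical labelled data: two classes in 2-D (placeholders only).
X = [rand(60, 2); rand(60, 2) + 3];   % class 1 near the origin, class 2 shifted
y = [ones(60, 1); 2 * ones(60, 1)];
kPerClass = 2;                         % prototypes per class (an assumption)
classes   = unique(y);
nC        = numel(classes);

% Subsets to cluster: each class on its own, then the whole mixture.
subsets = cell(nC + 1, 1);
for s = 1:nC
    subsets{s} = find(y == classes(s));
end
subsets{nC + 1} = (1:size(X, 1)).';
kList = [kPerClass * ones(1, nC), kPerClass * nC];

P = cell(nC + 1, 1);   % prototypes (centroids) found in each subset
L = cell(nC + 1, 1);   % class label attached to each prototype
for s = 1:nC + 1
    Xs = X(subsets{s}, :);
    ys = y(subsets{s});
    n  = size(Xs, 1);
    k  = kList(s);
    C  = Xs(round(linspace(1, n, k)), :);   % deterministic start
    idx = zeros(n, 1);
    for iter = 1:100                         % plain k-means iterations
        D = zeros(n, k);
        for j = 1:k
            d = Xs - repmat(C(j, :), n, 1);
            D(:, j) = sum(d.^2, 2);
        end
        [dmin, newIdx] = min(D, [], 2);
        if isequal(newIdx, idx), break, end
        idx = newIdx;
        for j = 1:k
            if any(idx == j), C(j, :) = mean(Xs(idx == j, :), 1); end
        end
    end
    lab = zeros(k, 1);
    for j = 1:k
        if any(idx == j)
            lab(j) = mode(ys(idx == j));     % majority label of the members
        end
    end
    keep = lab > 0;                          % drop any empty cluster
    P{s} = C(keep, :);
    L{s} = lab(keep);
end

% Nearest-prototype classification with each prototype set.
protoSets = {vertcat(P{1:nC}), P{nC + 1}};
labelSets = {vertcat(L{1:nC}), L{nC + 1}};
errRate = zeros(1, 2);
for r = 1:2
    Pr = protoSets{r};
    Lr = labelSets{r};
    D = zeros(size(X, 1), size(Pr, 1));
    for j = 1:size(Pr, 1)
        d = X - repmat(Pr(j, :), size(X, 1), 1);
        D(:, j) = sum(d.^2, 2);
    end
    [dmin, nearest] = min(D, [], 2);
    errRate(r) = mean(Lr(nearest) ~= y);     % fraction misclassified
end
fprintf('per-class clustering: %.3f, mixture clustering: %.3f\n', errRate);
```

The placeholder classes are well separated, so the two error rates are unlikely to differ here. On overlapping real data they generally will, and the error should then be estimated on held-out samples and not on the training set.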
In article <1194899672.262350.297810@22g2000hsm.googlegroups.com>,
Greg Heath <heath@alumni.brown.edu> wrote:
>In general, applying unsupervised clustering to a mixture
>of multiclass data yields a suboptimal cluster-based
>classifier. However, cluster-based classification
>can be improved significantly if supervised clustering,
>which uses the class labels, is applied instead.
*If*, that is, the class labels are correct, which turns
out to be a problem in practice. It is unfortunately not rare
for us to receive datasets in which samples have been misclassified.
The "Gold Standard" is classification by a trained, experienced human
expert, but even experts make mistakes or are misled by the data
subset that they examine in order to classify (e.g., the visual shape
of a cell). We have found that, for some datasets, our unsupervised
classification methods have an accuracy significantly exceeding the
"Gold Standard".
A related issue that we deal with a lot is that when the datasets
contain large amounts of data (e.g., output from almost any of the
modern medical "scanners" such as CT, MRS, MRI, infra-red), humans
have a lot of difficulty in perceiving the abstract multidimensional
patterns needed in order to create class labels in the first place.
Spectral noise certainly doesn't help!
Supervised classification is great if you already know exactly
what you are looking for, but it is not very good at figuring out
new relationships. If you have your eye on peaks in the oxygen
flow, you are likely to completely miss the much better correlation
with (say) the calcium concentration information...
On Nov 12, 3:34 pm, Greg Heath <he...@alumni.brown.edu> wrote:
> Effective versions of classifiers designed via supervised
> clustering can be found by searching the acronyms
> of ART, LVQ and RCE. However, I'm not sure if the
> corresponding MATLAB code is readily available.
If you wish to search for MATLAB code, the following
information may help:
Adaptive Resonance Theory (ART), Grossberg
Learning Vector Quantization (LVQ), Kohonen
Reduced Coulomb Energy (RCE), Cooper
AKA
Restricted Coulomb Energy, Cooper
On Nov 12, 4:05 pm, rober...@ibd.nrc-cnrc.gc.ca (Walter Roberson)
wrote:
> Supervised classification is great if you already know exactly
> what you are looking for, but it is not very good at figuring out
> new relationships. If you have your eye on peaks in the oxygen
> flow, you are likely to completely miss the much better correlation
> with (say) the calcium concentration information...
That is why I have always recommended (search on greg-heath
pretraining advice) that unsupervised methods such as unsupervised
clustering and principal component analysis be used, before
supervised learning, in order to torture the data until they confess.
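As an illustration of the principal component step, here is a minimal sketch using nothing but svd, so no toolbox is needed. The data matrix is a made-up placeholder (rows are samples, columns are variables), and keeping two components is an arbitrary choice for the example.

```matlab
% Hypothetical data: 195 samples of 3 correlated variables (placeholder only).
X = rand(195, 3) * [2 0 0; 1 1 0; 0 0.5 0.1];

% Center each column, then take the economy-size SVD of the centered data.
Xc = X - repmat(mean(X, 1), size(X, 1), 1);
[U, S, V] = svd(Xc, 0);

% Columns of V are the principal directions; the squared singular values,
% scaled by (n - 1), are the variances along those directions.
latent    = diag(S).^2 / (size(X, 1) - 1);
explained = latent / sum(latent);      % fraction of total variance per component

% Project the data onto the first two principal components.
scores = Xc * V(:, 1:2);
disp(explained.')
```

The scores can then be plotted or passed to a clustering step before any supervised training. If the variables are on very different scales, it may be better to divide each centered column by its standard deviation first.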