I'm using the Statistics Toolbox. I want to classify my data using LDA and QDA.
However, my training data is very sparse, 40 instances with 100 features, so LDA does not like it.

[class,err,POSTERIOR,logp,coeff] = classify(X,X,label);
??? Error using ==> classify at 233
The pooled covariance matrix of TRAINING must be positive definite.

How should I pre-process my training data to make it work? It works very well when I use SVM.
On Jul 1, 11:08 am, Rob <AJAX...@gmail.com> wrote:
> I'm using the statistics toolbox. I want to classify my data using LDA
> and QDA.
> However, my training data is very sparse, 40 instances with 100
> features. So LDA does not like it.
>
> [class,err,POSTERIOR,logp,coeff] = classify(X,X,label);
> ??? Error using ==> classify at 233
> The pooled covariance matrix of TRAINING must be positive definite.
>
> How should I pre-process my training data to make it work? It works
> very well when I use SVM.
40 instances in general position define a 39-dimensional space. Project
your 100-component feature data into that space using the eigenvectors
of Cov corresponding to nonzero eigenvalues.
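The projection can be sketched roughly as follows. This is only an illustration under stated assumptions: the data are random stand-ins, Cov is taken to be the pooled within-class covariance named in the error message, and the eigenvalue tolerance is an arbitrary choice, not something given in this thread.

```matlab
% Hypothetical stand-in data: 40 instances, 100 features, 2 classes.
n = 40; p = 100;
label = [ones(n/2,1); 2*ones(n/2,1)];
X = randn(n,p);
X(label==2,1:5) = X(label==2,1:5) + 2;      % shift class 2 in a few features

% Pooled within-class covariance: center each class on its own mean.
classes = unique(label);
Xc = X;
for k = 1:numel(classes)
    idx = (label == classes(k));
    Xc(idx,:) = X(idx,:) - repmat(mean(X(idx,:),1), sum(idx), 1);
end
Sw = (Xc' * Xc) / (n - numel(classes));

% Keep only eigenvectors whose eigenvalues are numerically nonzero.
[V,D] = eig((Sw + Sw')/2);                  % symmetrize against round-off
d = diag(D);
keep = d > max(d) * 1e-10;                  % assumed tolerance
W = V(:,keep);                              % p-by-r projection matrix

% Classify in the reduced space, where the pooled covariance has full rank.
Z = X * W;
[class,err] = classify(Z, Z, label);
```

As in the original call, the training data is reused as the sample, so err is a resubstitution error and will look optimistic. New data would have to be projected with the same W before being passed to classify.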
Alternatively, do not use classify. Instead, create the discriminant
directly using pinv(Cov).
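A minimal sketch of building the linear discriminant by hand with pinv is shown below. It assumes Cov is the pooled within-class covariance, that class priors are estimated from the training proportions, and it uses random stand-in data. The variable names are illustrative only.

```matlab
% Hypothetical stand-in data: 40 instances, 100 features, 2 classes.
n = 40; p = 100;
label = [ones(n/2,1); 2*ones(n/2,1)];
X = randn(n,p);
X(label==2,1:5) = X(label==2,1:5) + 2;

% Class means, priors, and class-centered data.
classes = unique(label);
K = numel(classes);
mu = zeros(K,p);
prior = zeros(K,1);
Xc = X;
for k = 1:K
    idx = (label == classes(k));
    mu(k,:) = mean(X(idx,:),1);
    prior(k) = sum(idx) / n;
    Xc(idx,:) = X(idx,:) - repmat(mu(k,:), sum(idx), 1);
end

% Pooled covariance and its pseudoinverse (the ordinary inverse does not exist).
Sw = (Xc' * Xc) / (n - K);
Si = pinv(Sw);

% Linear score for class k: x*Si*mu_k' - 0.5*mu_k*Si*mu_k' + log(prior_k).
A = Si * mu';                                   % p-by-K coefficients
b = -0.5 * sum(mu' .* A, 1) + log(prior');      % 1-by-K offsets
scores = X * A + repmat(b, n, 1);               % n-by-K class scores

% Assign each instance to the class with the largest score.
[maxScore, best] = max(scores, [], 2);
pred = classes(best);
```

Note that pinv ignores every direction in which the pooled covariance is zero, so this discriminant only uses the part of the mean difference that lies in the range of Sw. Whether that is adequate for a given data set is something to check on held-out data.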
Terminology:
Your data is sparse. The corresponding Cov is not necessarily sparse.
The ratio of the number of zero elements to the number of nonzero
elements determines the degree of sparsity.
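To make the distinction concrete, the zero-to-nonzero ratio and the rank of Cov can be checked separately. The sketch below uses made-up data in which about 70% of the entries are zero. Both the data and the 70% figure are assumptions for illustration.

```matlab
% Hypothetical data: 40 instances, 100 features, roughly 70% zeros.
X = randn(40,100);
X(rand(40,100) < 0.7) = 0;

% Degree of sparsity of the data: zeros relative to nonzeros.
zeroRatioX = (numel(X) - nnz(X)) / nnz(X);

% The covariance of sparse data is generally dense ...
C = cov(X);
zeroRatioC = (numel(C) - nnz(C)) / nnz(C);

% ... but it is rank deficient, because 40 instances give rank at most 39.
r = rank(C);
```

The rank deficiency, not the number of zeros, is what makes a covariance matrix fail the positive definiteness check.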
Rob wrote:
> I'm using the statistics toolbox. I want to classify my data using LDA
> and QDA.
> However, my training data is very sparse, 40 instances with 100
> features. So LDA does not like it.
If you are using R2008a, you might look at SEQUENTIALFS, which will allow you to
select a subset of features.
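A rough sketch of how SEQUENTIALFS might be combined with classify follows. The stand-in data, the 5-fold partition, and the cap of 10 features are assumptions chosen for illustration. The criterion function counts misclassified test observations, which is one possible choice.

```matlab
% Hypothetical stand-in data: 40 instances, 100 features, 2 classes.
n = 40; p = 100;
label = [ones(n/2,1); 2*ones(n/2,1)];
X = randn(n,p);
X(label==2,1:5) = X(label==2,1:5) + 2;

% Criterion: number of misclassified test observations for a linear fit.
critfun = @(Xtr,ytr,Xte,yte) sum(yte ~= classify(Xte, Xtr, ytr, 'linear'));

% Stratified 5-fold partition, and a cap on the number of features so the
% pooled covariance of the selected columns stays positive definite.
c = cvpartition(label, 'kfold', 5);
maxFeat = 10;                               % assumed cap, well below n
[inmodel, history] = sequentialfs(critfun, X, label, ...
    'cv', c, 'nfeatures', maxFeat);

% Refit on the selected features only.
Xsel = X(:, inmodel);
[class, err] = classify(Xsel, Xsel, label);
```

Because the features are chosen using the same 40 instances, the final err is optimistic. An honest estimate would need the selection step repeated inside an outer cross-validation loop.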
On Jul 7, 10:24 am, Peter Perkins
<Peter.PerkinsRemoveT...@mathworks.com> wrote:
> Rob wrote:
> > I'm using the statistics toolbox. I want to classify my data using LDA
> > and QDA.
> > However, my training data is very sparse, 40 instances with 100
> > features. So LDA does not like it.
>
> If you are using R2008a, you might look at SEQUENTIALFS, which will allow you to
> select a subset of features.
Otherwise, you can use STEPWISEFIT (or STEPWISE) on the augmented
predictor matrix, which includes the cross-products and squares of the
original variables.
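One way this might look is sketched below, using x2fx to build the augmented matrix. STEPWISEFIT is a regression routine, so treating a 0/1 class label as the numeric response is an assumption about how the suggestion would be applied. The data and the small feature count are made up to keep the example short.

```matlab
% Hypothetical stand-in data: 40 instances, 6 features, 0/1 class labels.
n = 40; p = 6;
X = randn(n,p);
y = double(X(:,1) + X(:,2).*X(:,3) > 0);

% Augmented predictors: constant, linear, cross-product, and squared terms.
D = x2fx(X, 'quadratic');
D = D(:, 2:end);                % drop the constant; stepwisefit adds its own

% Stepwise regression of the label on the augmented predictors.
[b, se, pval, inmodel] = stepwisefit(D, y, 'display', 'off');

% Column indices of D that ended up in the model.
selected = find(inmodel);
```

With the original 100 features the augmented matrix would have 100 + 4950 + 100 = 5150 columns, so the search is far larger than in this toy case.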
Hope this helps.
Greg