labeling target in neural network in matlab

I have a training set of 200 rows and 600 columns, where each group of 5 rows contains feature vectors from one subject, meaning I have 40 subjects. How do I label the targets for the neural network: should the 5 rows of each subject share the same label (giving 40 labels in total), or should each of the 200 feature vectors get its own label? In MATLAB the neural network only accepts 0 or 1 as a target label, but I want either 40 labels (1,2,3,...,40) or 200 labels (1,2,3,...,200).

Accepted Answer

Greg Heath
Greg Heath on 12 Mar 2013
You have 40 subjects. Each subject is a class. You have 5 I0-dimensional input vectors (I0 = 600) for each class.
1. If the N = 200 vectors are linearly independent, they span an N-dimensional subspace. Linear dependencies will reduce the dimensionality of the subspace.
2. If the initial dimensionality of the input matrix is
[ I0 N ] = size(x0) % [ 600 200 ]
the true dimensionality of the subspace is
Itrue = rank(x0) % Itrue <= min(I0,N) <= 200
3. Obviously, you should reduce the size of the input matrix from [ I0 N ] to [ I N ] with I <= Itrue.
4. If this were a regression problem, you could reduce the dimensionality using singular value decomposition (help/doc SVD) or principal component analysis using one or more of the functions in the list produced by the command
>> lookfor 'principal component'
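A minimal sketch of that SVD-based reduction, assuming x0 is the 600-by-200 input matrix from step 2 (the names z and Itrue are illustrative, not from the toolbox):

```matlab
% Project the 600 x 200 input matrix x0 onto its true subspace.
[U, S, V] = svd(x0, 'econ');   % economy-size SVD: U is 600 x 200
Itrue = rank(x0);              % true dimensionality, <= min(600,200)
z = U(:, 1:Itrue)' * x0;       % Itrue x 200 reduced input matrix
```

For classification, though, the next point applies: variance-ranked components are not necessarily the most class-separating ones.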
5. However, this is a classification problem. Therefore, instead of choosing variables with the most variance, you should try to find variables that provide the most separation between the c = 40 classes.
6. Choose the target for each input vector to be a column of the c-dimensional unit matrix eye(c), with the row containing the "1" indicating the class index.
7. With batch processing, the ordering of the vectors is irrelevant. Therefore, you could make life easy by ordering the input vectors and constructing the targets with either
target = repmat(eye(40),1,5); % 40 x 200; classes cycle 1,2,...,40 five times
class = vec2ind(target) % Remove semicolon to verify screen printout
target = ind2vec(class);
or
class = reshape(repmat(1:40,5,1),1,200) % 1 1 1 1 1 2 2 ... grouped by subject
target = ind2vec(class);
8. Dimensionality reduction for classification is tricky business. I usually start with standardized data and polynomials that are linear models in the coefficients. They can be solved with BACKSLASH as well as MATLAB's collection of variable-selection algorithms, including STEPWISE, STEPWISEFIT, SEQUENTIALFS and PLSREGRESS. Simple models tend to make good (but suboptimal) choices for variable selection. In a multistep process, this is usually a good first step.
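A hedged sketch of that linear first step using STEPWISEFIT, assuming xr is the 200-by-600 data matrix with rows as observations; y here is an illustrative 0/1 indicator for one class (in practice you might screen variables against indicators for several classes):

```matlab
% Rank original variables with a standardized linear model first.
xs = zscore(xr);                        % standardize each column
y  = double(cls == 1);                  % illustrative class-1 indicator
[b, se, pval, inmodel] = stepwisefit(xs, y);
selected = find(inmodel)                % indices of retained variables
```

The retained variables can then be fed to a neural classifier, as suggested in the comments below.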
Hope this helps.
Thank you for formally accepting my answer
Greg

More Answers (2)

Jack Sparrow
Jack Sparrow on 12 Mar 2013
Thank you Greg. This is more complicated than I thought. What if I set the labels of the 5 samples of the first subject to 1 and those of the other 39 users to 0, then train the neural network and test with my 5 other samples from all users? This step would be repeated 39 more times to cover all users, which means I would obtain a classification accuracy for each user. Is this kind of classification logical and valid for face recognition?

Greg Heath
Greg Heath on 12 Mar 2013
Your basic problem is input variable subset selection. Say from 600 to 39. However, if you design 40 classifiers, you have 40 different variable subset selection problems.
Believe me, NOT the way to go.
You don't need the optimal selection, just a good one. Again, standardizing and using a linear coefficient model tends to work well. Learn how to choose a good subset for a linear classifier. Then use them in a neural classifier.
If you are familiar with PCA, you could try that first. However, it ranks orthogonal PCA variables based on their spread without taking into account the relative positions of the target classes.
LDA would be better: it ranks orthogonal LDA variables (generalized eigenvectors of the between-class and within-class covariance matrices) by their between-class to within-class spread ratios.
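A minimal sketch of that LDA ranking, assuming x is an I-by-N matrix of input vectors that has already been reduced (so the within-class scatter Sw is nonsingular) and class is a 1-by-N label vector; all names are illustrative:

```matlab
% Rank directions by between-class / within-class spread ratio.
[I, N] = size(x);  c = max(class);
m  = mean(x, 2);
Sw = zeros(I);  Sb = zeros(I);
for k = 1:c
    xk = x(:, class == k);
    nk = size(xk, 2);
    dk = xk - repmat(mean(xk,2), 1, nk);
    Sw = Sw + dk * dk';                     % within-class scatter
    Sb = Sb + nk * (mean(xk,2) - m) * (mean(xk,2) - m)';  % between-class
end
[V, D] = eig(Sb, Sw);                       % generalized eigenproblem
[~, order] = sort(real(diag(D)), 'descend');
W = V(:, order(1:c-1));                     % at most c-1 discriminant axes
```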
PLS is also better than PCA because it depends on both input and target matrices. Not sure how it compares with LDA.
If you want to choose original variables and not orthogonal transformed ones, then look at STEPWISE and STEPWISEFIT.
I tend to use the latter and sometimes include 2nd order variable terms.
However, I don't remember trying to deal with 600 original variables. Maybe you could try 6 subsets of 100 variables, select subsets from each of them and then combine them for a final subset selection. I don't know at what point your computer will choke on I input variables.
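The chunked idea above can be sketched as follows, again assuming a standardized 200-by-600 matrix xs (rows as observations) and an illustrative class-indicator vector y; block size and count are just the 6-by-100 split suggested here:

```matlab
% Select variables block-by-block, then re-select from the survivors.
keep = [];
for blk = 1:6
    cols = (blk-1)*100 + (1:100);                % variables in this block
    [~, ~, ~, inmodel] = stepwisefit(xs(:, cols), y);
    keep = [keep, cols(inmodel)];                % collect block winners
end
[~, ~, ~, infinal] = stepwisefit(xs(:, keep), y); % final pass on survivors
selected = keep(infinal)
```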
Hope this helps.
Greg
