From: <HIDDEN>
Newsgroups: comp.soft-sys.matlab
Subject: Re: MLP Optimization Problem ( generalization problem ) -- on OCR
Date: Tue, 10 Feb 2009 11:33:01 +0000 (UTC)
Organization: The MathWorks, Inc.
Message-ID: <gmrold$1qq$1@fred.mathworks.com>
References: <ggpan0$s2d$1@fred.mathworks.com> <e9dc5bcb-cf3d-4a6f-9fdd-d4ea0b8ee6a9@d23g2000yqc.googlegroups.com>


Greg Heath <heath@alumni.brown.edu> wrote in message <e9dc5bcb-cf3d-4a6f-9fdd-d4ea0b8ee6a9@d23g2000yqc.googlegroups.com>...
> On Nov 28, 12:45 pm, "zaheer ahmad" <ahmad.zah...@yah00000.com> wrote:
> > Dear All
> >
> > I am developing an OCR system (Urdu) but keep running into a 'goal not met' problem.
> >
> > my network is
> >
> > Input = 400 (also reduced to 100 and 144 and checked)
> > Output = 54
> > Hidden layer = 20, but also checked 30, 40, 50, 60, 70, 80, 90, 100, up to 250
> >
> > The sample sizes I tried for training the net are
> > 5400   (i.e. 100*54=5400), but I also checked
> > 540    (i.e. 10*54=540) and
> > 1080   (i.e. 20*54=1080) and
> > 1350   (i.e. 25*54=1350) and
> > 2700   (i.e. 50*54=2700)
> > where 54 is the number of characters and 100, 10, 20, 25 and 50 are the samples per character.
> 
> So you have
> 
> size(p) = [400 Ntrn] for a character with 20*20 = 400 pixels
> size(t)  = [54 Ntrn] for 54 letters, integers and special characters?
> 
> > I tried traingdx, trainlm and trainscg (because of an out-of-memory error) with both mse and sse.
> 
> Forget sse
> 
> > I don't know why it doesn't reach the goal: goal = 0.1 for traingdx (or goal = 0.009 for trainlm).
> 
> Why the difference? How were the goals determined?
> 
> > Sometimes it reaches the goal but doesn't recognise the test data.
> 
> How similar are testing and training sets?
> 
> Clustering and visualizing the data should help.
> 
> > the code is given as below:
> >
> > clear;clc;
> >
> > % SET CHARACTERS:
> > Alphabet =Alpha4Train;%Alphabet =Alphabet(:,1:100);
> > Target=TargetSet;%Target=Target(1:100);
> 
> ??
> 5400 not 100
> 
> > [S1,Qa] = size(Alphabet);
> 
> [400 5400]
> 
> > [S2,Q] =size(Target);
> 
> [54 5400]
> 
> if Q ~= Qa, error, end
> 
> 
> > % DEFINING THE NETWORK
> > % ====================
> > H1 =120 ;%115=10  120=0 with mc=0.5   120=2...80....200=met for 10 char ...150     120  for 10 alphas
> 
> I have no idea what the comment is supposed to mean.
> 
> Nw = (400+1)*120+(120+1)*54 = 54+(1+400+54)*120 = 54,654
> Neq = 5400*54 = 291,600 ~ 5.3*Nw
> 
> Would have preferred a higher ratio.
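
A quick sketch of the count above, using only base MATLAB. The sizes come from the quoted post; rmin = 10 is just an assumed example of a "higher ratio", not a value anyone in the thread recommended.

```matlab
% Weight and equation counts for an I-H-O MLP with bias terms.
I    = 400;    % inputs (20x20 pixels)
H    = 120;    % hidden nodes
O    = 54;     % outputs (one per character class)
Ntrn = 5400;   % training samples (100 per class)

Nw  = (I+1)*H + (H+1)*O;   % unknown weights, biases included
Neq = Ntrn*O;              % training equations
r   = Neq/Nw;              % equations per unknown weight

% Largest H that still satisfies Neq >= rmin*Nw.
% Uses Nw = O + (I+O+1)*H. rmin is an assumed target ratio.
rmin = 10;
Hmax = floor((Neq/rmin - O)/(I + O + 1));

fprintf('Nw = %d, Neq = %d, Neq/Nw = %.2f, Hmax for r >= %d: %d\n', ...
        Nw, Neq, r, rmin, Hmax);
```

With the numbers from the post, 120 hidden nodes give a ratio of about 5.3, so a smaller hidden layer (or more data) is the way to raise it.
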
> 
> > net =newff(minmax(Alphabet),[H1 S2],{'logsig' 'logsig'},'traingdx');%trainrp  trainscg
> 
> Why not standardize inputs and use tansig hidden nodes??
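
For what standardizing the inputs could look like without any toolbox calls, here is a minimal sketch. The data matrix is a random stand-in (hypothetical), and the handling of constant pixels is my own assumption about what is sensible for mostly blank image borders.

```matlab
% Standardize each input row (pixel) to zero mean and unit variance.
P = rand(400, 50);          % hypothetical stand-in: 400 pixels x 50 samples
P(1,:) = 0;                 % a pixel that is blank in every sample

mu    = mean(P, 2);         % per-pixel mean over the samples
sigma = std(P, 0, 2);       % per-pixel standard deviation
sigma(sigma == 0) = 1;      % constant pixels: avoid dividing by zero

Pn = bsxfun(@rdivide, bsxfun(@minus, P, mu), sigma);

% Test data must be transformed with the SAME mu and sigma:
% Ptestn = bsxfun(@rdivide, bsxfun(@minus, Ptest, mu), sigma);
```

Standardized inputs are roughly symmetric about zero, which pairs naturally with a zero-centred hidden transfer function such as tansig.
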
> 
> > %%%%traingdx  traingdm  trainlm traincgf,   net =newff(minmax(alphabet),[S1 S2],{'logsig' 'logsig'},'traingdx');
> >
> > net.performFcn = 'sse'; % sse   Sum-Squared Error performance function
> 
> Why not use mse??
> 
> > net.trainParam.goal =0.10;% mean(var(Target))/100; %0.10;% 0.009;% Sum-squared error goal.
> 
> ??
> 
> c = 54
> mean(Target) = [ 1 + (c-1)*0]/c = 1/c = 1/54 = 1.85e-2
> mean(var(Target)) = [(1-1/c)^2 + (c-1)*(0-1/c)^2]/(c-1) = 1/c
> 
> net.trainParam.goal  = 1.85e-4     % MSE
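
The goal arithmetic above can be checked numerically. This is only a sketch: the label vector is a hypothetical stand-in for however the 5400 targets are actually ordered.

```matlab
% Reproduce goal = mean(var(Target))/100 for one-hot targets.
c      = 54;                        % number of classes
Nper   = 100;                       % samples per class
labels = repmat(1:c, 1, Nper);      % hypothetical class index of each sample
N      = numel(labels);

Target = zeros(c, N);
Target(sub2ind([c N], labels, 1:N)) = 1;   % one 1 per column, zeros elsewhere

% var works down each column (N-1 normalization), so every column gives 1/c.
mseRef = mean(var(Target));
goal   = mseRef/100;                % 1% of the average target variance

fprintf('mean(var(Target)) = %.4g, goal = %.4g\n', mseRef, goal);
```

This gives about 1.85e-2 and 1.85e-4, matching the hand calculation, and it applies to mse rather than sse.
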
> 
> > net.trainParam.show = 10; % Frequency of progress displays (in epochs).
> > net.trainParam.epochs = 95000; %5000 Maximum number of epochs to train.
> > %  net.trainParam.mc = 0.95;%0.65;% % Momentum constant.  mc=0.65 and s1=100 good memorization
> 
> H = S1?
> 
> > %  net.trainParam.mem_reduc =99999;
> > % net.trainParam.lr=0.01;%Learning rate
> > % net.trainParam.lr_inc=1.9;
> > % net.trainParam.lr_dec = 0.5;
> 
> I use trainlm or trainscg and only specify goal,
> show and (rarely) epochs.
> So, I can't comment on the other settings.
> 
> > % TRAINING THE NETWORK
> > % ====================
> >
> > P = [Alphabet,Alphabet,Alphabet,Alphabet,Alphabet,Alphabet,Alphabet];
> > T = [Target,Target,Target,Target,Target,Target,Target];
> 
> This doesn't make sense.
> 
> > [net,tr] = train(net,P,T);
> >
> > % TRAINING THE NETWORK WITH NOISE...GET DIRTY FOR GOOD RESULTS AT THE END
> 
> This is called Jittering. Go to Google groups and search on
> 
> greg-heath jittering
> 
> > % =======================================================================
> > netn = net;
> > netn.trainParam.goal =0.01;% mean(var(Target))/100; %0.009;%mean(var(Target))/100; % Mean-squared error goal.
> 
> Revisit this.
> 
> > netn.trainParam.epochs = 85000;%500
> > netn.trainParam.show = 10; %%% Frequency of progress displays (in epochs).
> >
> > T = [Target,Target,Target,Target,Target,Target,Target];
> > P = [(Alphabet + randn(S1,Qa)*0.2), Alphabet + randn(S1,Qa)*0.3, Alphabet + randn(S1,Qa)*0.3, Alphabet, Alphabet, ...
> > (Alphabet + randn(S1,Qa)*0.2), Alphabet + randn(S1,Qa)*0.3];
> > [netn,trn] = train(netn,Alphabet,Target);
> 
> Since Neq/Nw ~ 5, you probably don't need to increase Ntrn
> by more than a factor of 2 to 4.
> 
> Use only one noise level and scale it to the
> standard deviation of Alphabet in order to get
> a specified SNR.
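
A minimal sketch of jittering with a single noise level tied to an SNR, base MATLAB only. The data are random stand-ins, and both SNRdB = 20 and K = 2 are assumed example values (K = 2 triples Ntrn, inside the suggested factor of 2 to 4).

```matlab
% Jittering: append K noisy copies of the training set at one noise level.
Alphabet = rand(400, 216);            % hypothetical stand-in inputs
Target   = repmat(eye(54), 1, 4);     % hypothetical one-hot targets, 54 x 216
[S1, Qa] = size(Alphabet);

SNRdB = 20;                           % assumed signal-to-noise ratio in dB
K     = 2;                            % assumed number of jittered copies

sigP = std(Alphabet(:));              % overall std of the input data
sigN = sigP / 10^(SNRdB/20);          % noise std that yields the requested SNR

P = Alphabet;                         % keep the clean data as the first block
for k = 1:K
    P = [P, Alphabet + sigN*randn(S1, Qa)];   % append one noisy copy
end
T = repmat(Target, 1, K+1);           % targets repeated in the same order
```

Note that the quoted code builds P and T but then calls train(netn,Alphabet,Target), so the noisy copies are never used. Passing P and T to train is what actually applies the jitter.
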
> 
> > %   load netxxx1010; ImProc(netn,net);
> >
> > save netgdx2115;
> >
> > %%%%%%%%%%
> > I have only 100 samples for each character, so I have used
> >
> > P = [Alphabet,Alphabet,Alphabet,Alphabet,Alphabet,Alphabet,Alphabet];
> >
> > to get the inequality
> >
> > Neq >~ r*Nw       where (~2 < r < ~64),    as described by Greg Heath in his posts.
> > So, needless to say, I have tried to follow Greg Heath's rule for choosing the hidden layer size.
> > I even tried to overrule it sometimes, but all in vain.
> >
> > The comments in the code show the values I have tested, so I have left them in for your
> > reference even though they make the code a bit harder to read; I hope no one minds.
> 
> You should overlay plots of misclassified characters
> with plots of means of the correct and assigned classes.
> Perhaps the classes are not defined well enough and
> you may need to use clustering to create well defined
> subclasses.
> 
> You can also replace forced classification (always
> make a classification) with conditional classification
> (only make a classification if the posterior estimate
> is larger than a threshold). To do this, overlay the
> color coded histograms of the output for the classes
> that get the most confused.
> 
> Go to Google Groups and search on
> 
> greg-heath forced-classification
> greg-heath conditional-classification
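
To make the forced vs. conditional distinction concrete, here is a small sketch. The output matrix Y is made-up illustration data for three classes, the threshold of 0.5 is an assumption (in practice it would be read off the output histograms), and using 0 as the "rejected" label is my own convention.

```matlab
% Conditional classification: assign a class only if the largest output
% exceeds a threshold; otherwise reject the sample.
Y = [0.90 0.40 0.10;      % made-up network outputs, one column per sample
     0.05 0.35 0.20;
     0.05 0.25 0.70];
trueClass = [1 2 3];      % made-up true class of each sample
thresh    = 0.5;          % assumed rejection threshold

[ymax, forced] = max(Y, [], 1);   % forced classification: always pick the max
assigned = forced;
assigned(ymax < thresh) = 0;      % conditional: 0 marks a rejected sample

accepted   = assigned > 0;
rejectRate = mean(~accepted);                                   % fraction rejected
errRate    = mean(assigned(accepted) ~= trueClass(accepted));   % errors among accepted
```

In this toy case the forced classifier mislabels the second sample as class 1, while the conditional one rejects it instead. Raising the threshold trades more rejections for fewer errors.
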
> 
> Hope this helps.
> 
> Greg