From: Greg Heath <heath@alumni.brown.edu>
Newsgroups: comp.soft-sys.matlab
Subject: Re: MLP Optimization Problem ( generalization problem ) -- on OCR data--Urdu
Date: Tue, 2 Dec 2008 14:57:13 -0800 (PST)


On Nov 28, 12:45 pm, "zaheer ahmad" <ahmad.zah...@yah00000.com> wrote:
> Dear All
>
> I am developing an OCR (Urdu) system but I am having a 'goal not met' problem.
>
> my network is
>
> Inputs = 400 (also reduced to 100 and 144 and checked)
> Outputs = 54
> Hidden layer = 20 nodes, but also checked with 30,40,50,60,70,80,90,100 up to 250
>
> The sample sizes I tried for training the net are
> 5400   (i.e. 100*54=5400) but also checked on
> 540    (i.e. 10*54=540) and
> 1080   (i.e. 20*54=1080) and
> 1350   (i.e. 25*54=1350) and
> 2700   (i.e. 50*54=2700)
> where 54 is the number of characters and 100, 10, 20, 25 and 50 are the numbers of samples of each character

So you have

size(p) = [400 Ntrn] for a character with 20*20 = 400 pixels
size(t)  = [54 Ntrn] for 54 letters, integers and special characters?

> I tried traingdx, trainlm and trainscg (because of an out-of-memory error) with both mse and sse.

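Forget sse. It grows with the number of training cases and outputs,
so a goal tuned for one Ntrn does not carry over to another.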

> I don't know why it doesn't reach the goal: goal = 0.1 for traingdx (or goal = 0.009 for trainlm)

Why the difference? How were the goals determined?

> Sometimes it reaches the goal but doesn't recognise the test data.

How similar are testing and training sets?

Clustering and visualizing the data should help.

> the code is given as below:
>
> clear;clc;
>
> % SET CHARACTERS:
> Alphabet =Alpha4Train;%Alphabet =Alphabet(:,1:100);
> Target=TargetSet;%Target=Target(1:100);

??
The full set has 5400 columns, not 100.

> [S1,Qa] = size(Alphabet);

[400 5400]

> [S2,Q] =size(Target);

[54 5400]

if Q ~= Qa, error('Alphabet and Target must have the same number of columns.'), end


> % DEFINING THE NETWORK
> % ====================
> H1 =120 ;%115=10  120=0 with mc=0.5   120=2...80....200=met for 10 char ...150     120  for 10 alphas

I have no idea what the comment is supposed to mean.

Nw = (400+1)*120+(120+1)*54 = 54+(1+400+54)*120 = 54,654
Neq = 5400*54 = 291,600 ~ 5.3*Nw

I would have preferred a higher ratio.
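
The count above is easy to redo for other values of H. A minimal
sketch in base MATLAB, using the sizes from the post; the ratio
r = 10 and the name Hmax are only illustrative assumptions, not
values taken from the original code.

```matlab
% Sizes from the post: I inputs, H hidden nodes, O outputs, Ntrn cases.
I = 400;  H = 120;  O = 54;  Ntrn = 5400;

Nw    = (I+1)*H + (H+1)*O;   % weights and biases to estimate
Neq   = Ntrn*O;              % number of training equations
ratio = Neq/Nw;              % about 5.3 for H = 120

fprintf('Nw = %d, Neq = %d, Neq/Nw = %.2f\n', Nw, Neq, ratio);

% Largest H that still satisfies Neq >= r*Nw for a chosen r.
% r = 10 is an illustrative assumption.
r    = 10;
Hmax = floor((Neq/r - O)/(I + O + 1));
fprintf('For r = %d, H should not exceed %d\n', r, Hmax);
```

The second part just solves Nw = O + (I+O+1)*H <= Neq/r for H.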

> net =newff(minmax(Alphabet),[H1 S2],{'logsig' 'logsig'},'traingdx');%trainrp  trainscg

Why not standardize inputs and use tansig hidden nodes??
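
A minimal sketch of what I mean by standardizing, in base MATLAB.
The random matrices are stand-ins for Alphabet and a test set, and
the handling of constant pixels (dividing by 1) is my assumption
about a reasonable default. The toolbox has its own preprocessing
functions; this only shows the arithmetic.

```matlab
% Stand-in data: 400 pixel rows, 20 training columns (replace with Alphabet).
p = rand(400, 20);
p(1, :) = 0;                       % a pixel that never changes

mu    = mean(p, 2);                % per-pixel mean over the training cases
sigma = std(p, 0, 2);              % per-pixel standard deviation
sigma(sigma == 0) = 1;             % constant pixels: avoid dividing by zero

% Zero mean and unit variance for every non-constant pixel.
pn = bsxfun(@rdivide, bsxfun(@minus, p, mu), sigma);

% Test data must be transformed with the TRAINING mu and sigma.
ptest  = rand(400, 5);             % stand-in test set
ptestn = bsxfun(@rdivide, bsxfun(@minus, ptest, mu), sigma);
```

With inputs centered on zero, hidden nodes whose output is also
centered on zero (tansig, range -1 to 1) are the natural match;
in the newff call that is {'tansig' 'logsig'} instead of
{'logsig' 'logsig'}.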

> %%%%traingdx  traingdm  trainlm traincgf,   net =newff(minmax(alphabet),[S1 S2],{'logsig' 'logsig'},'traingdx');
>
> net.performFcn = 'sse'; % sse   Sum-Squared Error performance function

Why not use mse??

> net.trainParam.goal =0.10;% mean(var(Target))/100; %0.10;% 0.009;% Sum-squared error goal.

??

c = 54
mean(Target) = [ 1 + (c-1)*0]/c = 1/c = 1/54 = 1.85e-2
mean(var(Target)) = [(1-1/c)^2 + (c-1)*(0-1/c)^2]/(c-1) = 1/c

net.trainParam.goal  = 1.85e-4     % MSE goal = mean(var(Target))/100
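
The arithmetic can be checked numerically. A minimal sketch,
assuming one-hot target columns; repmat(eye(c),1,n) is my stand-in
for TargetSet, and the names MSE00 and goal are only illustrative.

```matlab
c = 54;  n = 100;                   % classes, samples per class
Target = repmat(eye(c), 1, n);      % 54 x 5400, one 1 per column

% var works down each column, so every entry of var(Target) is the
% variance of a single one-hot vector, which equals 1/c.
MSE00 = mean(var(Target));
goal  = MSE00/100;                  % one hundredth of the target variance

fprintf('mean(var(Target)) = %.4g, goal = %.3g\n', MSE00, goal);
```

Set net.performFcn = 'mse' before using this value as
net.trainParam.goal; it is not an sse goal.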

> net.trainParam.show = 10; % Frequency of progress displays (in epochs).
> net.trainParam.epochs = 95000; %5000 Maximum number of epochs to train.
> %  net.trainParam.mc = 0.95;%0.65;% % Momentum constant.  mc=0.65 and s1=100 good memorization

H = S1?

> %  net.trainParam.mem_reduc =99999;
> % net.trainParam.lr=0.01;%Learning rate
> % net.trainParam.lr_inc=1.9;
> % net.trainParam.lr_dec = 0.5;

I use trainlm or trainscg and only specify goal,
show and (rarely) epochs.
So, I can't comment on the other settings.

> % TRAINING THE NETWORK
> % ====================
>
> P = [Alphabet,Alphabet,Alphabet,Alphabet,Alphabet,Alphabet,Alphabet];
> T = [Target,Target,Target,Target,Target,Target,Target];

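This doesn't make sense. Replicating identical columns adds no new
information: the number of independent training equations is still
Neq = 5400*54.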

> [net,tr] = train(net,P,T);
>
> % TRAINING THE NETWORK WITH NOISE...GET DIRTY FOR GOOD RESULTS AT THE END

This is called jittering. Go to Google Groups and search on

greg-heath jittering

> % =======================================================================
> netn = net;
> netn.trainParam.goal =0.01;% mean(var(Target))/100; %0.009;%mean(var(Target))/100; % Mean-squared error goal.

Revisit this.

> netn.trainParam.epochs = 85000;%500
> netn.trainParam.show = 10; %%% Frequency of progress displays (in epochs).
>
> T = [Target,Target,Target,Target,Target,Target,Target];
> P = [(Alphabet + randn(S1,Qa)*0.2), Alphabet + randn(S1,Qa)*0.3, Alphabet + randn(S1,Qa)*0.3, Alphabet, Alphabet, ...
>      (Alphabet + randn(S1,Qa)*0.2), Alphabet + randn(S1,Qa)*0.3];
> [netn,trn] = train(netn,Alphabet,Target);

Since Neq/Nw ~ 5, you probably don't need to increase Ntrn
by more than a factor of 2 to 4.

Use only one noise level and scale it to the
standard deviation of Alphabet in order to get
a specified SNR.
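
A minimal sketch of jittering at a single noise level, in base
MATLAB. The stand-in data, the 20 dB figure, the number of noisy
copies k, and the definition SNR = 20*log10(sigP/sigN) are all
assumptions for illustration; pick values that suit your data.

```matlab
% Stand-in data: 400 pixels, 50 cases, 54 classes (replace with your own).
Alphabet = rand(400, 50);
labels   = mod(0:49, 54) + 1;
Target   = zeros(54, 50);
Target(sub2ind(size(Target), labels, 1:50)) = 1;

SNRdB = 20;                          % assumed signal-to-noise ratio in dB
sigP  = std(Alphabet(:));            % overall standard deviation of the inputs
sigN  = sigP / 10^(SNRdB/20);        % noise std that gives the requested SNR

k = 2;                               % noisy copies: Ntrn grows by a factor k+1
[S1, Qa] = size(Alphabet);
P = Alphabet;                        % keep the clean originals once
T = Target;
for i = 1:k
    P = [P, Alphabet + sigN*randn(S1, Qa)];   % one noise level only
    T = [T, Target];                          % targets are unchanged
end
```

Then train on P and T, not on the original Alphabet and Target.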

> %   load netxxx1010; ImProc(netn,net);
>
> save netgdx2115;
>
> %%%%%%%%%%
> I have only 100 samples for each character, so I have used
>
> P = [Alphabet,Alphabet,Alphabet,Alphabet,Alphabet,Alphabet,Alphabet];
>
> to get the inequality
>
> Neq >~ r* Nw       where (~2 < r < ~ 64).    as described by Greg Heath in his posts.
> So it goes without saying that I have tried to follow Greg Heath's rule for choosing the hidden layer size.
> I even tried to overrule it sometimes, but all in vain.
>
> The comments in the code show the values I have tested, so I have not removed them for your
> reading, even though they make the code a bit harder to read. I hope no one will mind.

You should overlay plots of misclassified characters
with plots of means of the correct and assigned classes.
Perhaps the classes are not defined well enough and
you may need to use clustering to create well defined
subclasses.
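
A minimal sketch of the bookkeeping for those plots, in base MATLAB.
The random p and y are stand-ins for the inputs and for the network
output on them, and the variable names are only illustrative. The
three 20x20 arrays at the end are what you would display side by
side, for example with imagesc.

```matlab
% Stand-in data: 3 classes, 400 pixels, 30 cases.
c = 3;  S1 = 400;  Q = 30;
labels = repmat(1:c, 1, Q/c);          % true class index of each column
p = rand(S1, Q);                       % stand-in inputs
y = rand(c, Q);                        % stand-in for the network output

[~, assigned] = max(y, [], 1);         % forced classification
wrong = find(assigned ~= labels);      % indices of misclassified cases

% Mean image of every class.
classMean = zeros(S1, c);
for k = 1:c
    classMean(:, k) = mean(p(:, labels == k), 2);
end

if ~isempty(wrong)
    j = wrong(1);                                        % first misclassified case
    img      = reshape(p(:, j), 20, 20);                 % the character itself
    meanTrue = reshape(classMean(:, labels(j)), 20, 20); % mean of the correct class
    meanAsgn = reshape(classMean(:, assigned(j)), 20, 20); % mean of the assigned class
    dTrue = norm(p(:, j) - classMean(:, labels(j)));     % distance to correct mean
    dAsgn = norm(p(:, j) - classMean(:, assigned(j)));   % distance to assigned mean
    fprintf('case %d: dTrue = %.3f, dAsgn = %.3f\n', j, dTrue, dAsgn);
end
```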

You can also replace forced classification (always
make a classification) with conditional classification
(only make a classification if the posterior estimate
is larger than a threshold). To do this, overlay the
color-coded histograms of the output for the classes
that get the most confused.
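
A minimal sketch of the difference, in base MATLAB. The output
matrix y and the threshold 0.5 are made-up values for illustration;
in practice the threshold comes from looking at the histograms.

```matlab
% Stand-in network outputs: 3 classes (rows), 3 cases (columns).
y = [0.90 0.40 0.10;
     0.05 0.35 0.20;
     0.05 0.25 0.15];

thresh = 0.5;                      % assumed rejection threshold

[ymax, forced] = max(y, [], 1);    % forced: always take the largest output

cls = forced;                      % conditional: reject weak decisions
cls(ymax < thresh) = 0;            % 0 means "no classification made"

rejectRate = mean(cls == 0);       % fraction of cases left unclassified
```

Here only the first case is classified; the other two are rejected
because their largest output is below the threshold.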

Go to Google Groups and search on

greg-heath forced-classification
greg-heath conditional-classification

Hope this helps.

Greg