From: "Rafael " <rafael.fritz@physik.uni-marburg.de>
Newsgroups: comp.soft-sys.matlab
Subject: Re: Parallel configuration validation in SGE env
Date: Wed, 4 Nov 2009 12:39:02 +0000 (UTC)
Organization: Universität Marburg
Message-ID: <hcrsl6$8d9$1@fred.mathworks.com>
References: <hcrjih$ep5$1@fred.mathworks.com> <ytwvdhqfude.fsf@uk-eellis-deb5-64.mathworks.co.uk>


Edric M Ellis <eellis@mathworks.com> wrote in message <ytwvdhqfude.fsf@uk-eellis-deb5-64.mathworks.co.uk>...
> "Rafael " <rafael.fritz@physik.uni-marburg.de> writes:
> 
> > I've been able to find the right configurations for the parallel Matlab
> > computing on our linux cluster, using the sun grid engine SGE scheduler.  Now,
> > if I try to validate these configurations the findResource part passes and
> > also the parallel part and the matlabpool.  But there always occurs a failure
> > with distributed jobs!  Why?  Thanks very much for your suggestions!
> 
> Very strange - usually if parallel and matlabpool jobs are working, that's the
> hardest part. Is there any output from the validation that you could post here?
> Or perhaps you could try something like this:
> 
> s = findResource( .... ); % get your scheduler
> j = s.createJob;
> j.createTask( @matlabroot, 1 );
> j.createTask( @matlabroot, 1 );
> j.submit;
> j.wait(); s.getDebugLog( j )
> 
> and post the output.

I tried this code. The tasks were submitted and MATLAB started on the worker nodes, but the workers showed no load for the whole run time I had configured (20 min).
I then interrupted with Ctrl+C and looked for the log file, but could not get one.
Here is some output:

"Submitting task 1
Job output will be written to: /home/fritzra/matlab/hello_test_files/Job16_Task1.out
QSUB output: Your job 1858617 ("Job16.1") has been submitted
Submitting task 2
Job output will be written to: /home/fritzra/matlab/hello_test_files/Job16_Task2.out
QSUB output: Your job 1858618 ("Job16.2") has been submitted
??? Error using ==> distcomp.abstractjob.wait at 45 
>> s.getDebugLog( j )
??? No appropriate method, property, or field getDebugLog for class
distcomp.genericscheduler."
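
The submission messages above do name one output file per task, so those may be the closest thing to a log here. A minimal sketch for printing them from the MATLAB client, assuming the files are on a filesystem the client can read and that the SGE jobs have written them by then:

```matlab
% Sketch only: print the per-task output files named in the submission
% messages above. The paths are copied from that output; whether the files
% exist yet, and are readable from this machine, is an assumption.
outFiles = { ...
    '/home/fritzra/matlab/hello_test_files/Job16_Task1.out', ...
    '/home/fritzra/matlab/hello_test_files/Job16_Task2.out' };

for k = 1:numel(outFiles)
    f = outFiles{k};
    fprintf('--- %s ---\n', f);
    if exist(f, 'file') == 2
        % fileread returns the whole file as a single character row vector
        fprintf('%s\n', fileread(f));
    else
        fprintf('(not found; the task may not have written any output yet)\n');
    end
end
```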

I don't know what is happening.
Why do MATLAB worker sessions start without actually doing any work?
How do I get the debug log?
Thanks, Rafael