Troubleshooting and Debugging

Object Data Size Limitations

Data transfers among the parallel computing objects are limited in size by the Java Virtual Machine (JVM) memory allocation. This limit applies to single transfers of data between client and workers in any job using a job manager as a scheduler, or in any parfor-loop. The approximate limit depends on your system architecture:

System Architecture    Maximum Data Size Per Transfer (approx.)
64-bit                 2.0 GB
32-bit                 600 MB
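
As a rough pre-flight check, the in-memory size of a variable can be compared against the approximate limits listed above before it is sent to the workers. This is only a sketch under stated assumptions: the variable data is a hypothetical payload, the architecture is taken from the client machine, 1 GB is treated as 2^30 bytes, and the size that whos reports may differ from the size of the data as actually transferred.

```matlab
% Hypothetical payload; replace with the variable you intend to transfer.
data = rand(1000, 1000);

% Approximate per-transfer limits from this section, in bytes.
% Assumption: the client's architecture is the one that matters.
if ~isempty(strfind(computer, '64'))
    limitBytes = 2.0 * 2^30;    % 64-bit: about 2.0 GB
else
    limitBytes = 600 * 2^20;    % 32-bit: about 600 MB
end

% whos reports the in-memory size, which only approximates the transfer size.
info = whos('data');
if info.bytes > limitBytes
    warning('Transfer:tooLarge', ...
        'data is about %.0f MB and may exceed the per-transfer limit.', ...
        info.bytes / 2^20);
end
```

If the check warns, one option is to split the data into smaller pieces so that no single transfer approaches the limit.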

File Access and Permissions

Ensuring That Workers on Windows Operating Systems Can Access Files

By default, a worker on a Windows operating system is installed as a service running as LocalSystem, so it does not have access to mapped network drives.

Often a network is configured to not allow services running as LocalSystem to access UNC or mapped network shares. In this case, you must run the mdce service under a different user with rights to log on as a service. See the section Setting the User in the MATLAB Distributed Computing Server System Administrator's Guide.

Task Function Is Unavailable

If a worker cannot find the task function, it returns the error message

Error using ==> feval
      Undefined command/function 'function_name'.

The worker that ran the task did not have access to the function function_name. One solution is to make sure the location of the function's file, function_name.m, is included in the job's PathDependencies property. Another solution is to transfer the function file to the worker by adding function_name.m to the FileDependencies property of the job.
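
The two solutions might look like the following sketch. It assumes a job manager can be found from the client; the directory /shared/project/functions is a hypothetical location that the workers can read, and the task is assumed to return one output and take no inputs.

```matlab
% Assumes at least one job manager is reachable from this client.
jm  = findResource('scheduler', 'type', 'jobmanager');
job = createJob(jm(1));

% Solution 1: function_name.m is in a directory the workers can already read.
% The path is hypothetical; use your own shared location.
set(job, 'PathDependencies', {'/shared/project/functions'});

% Solution 2: transfer the file from the client to the workers with the job.
set(job, 'FileDependencies', {'function_name.m'});

% One output argument and no input arguments are assumed here.
createTask(job, @function_name, 1, {});
submit(job);
```

Normally only one of the two properties is needed. PathDependencies avoids copying files when the workers share a file system with the client, while FileDependencies works when they do not.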

Load and Save Errors

If a worker cannot save or load a file, you might see the error messages

??? Error using ==> save
Unable to write file myfile.mat: permission denied.
??? Error using ==> load
Unable to read file myfile.mat: No such file or directory.

In determining the cause of this error, consider the following questions:

Tasks or Jobs Remain in Queued State

A job or task might get stuck in the queued state. To investigate the cause of this problem, look for the scheduler's logs:

Possible causes of the problem are

No Results or Failed Job

Task Errors

If your job returned no results (i.e., getAllOutputArguments(job) returns an empty cell array), it is probable that the job failed and some of its tasks have their ErrorMessage and ErrorIdentifier properties set.

You can use the following code to identify tasks with error messages:

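% Collect the ErrorMessage property of every task in the job.
errmsgs = get(yourjob.Tasks, {'ErrorMessage'});
% Flag the tasks whose error message is not empty.
nonempty = ~cellfun(@isempty, errmsgs);
% Display only the nonempty messages.
celldisp(errmsgs(nonempty));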

This code displays the nonempty error messages of the tasks found in the job object yourjob.

Debug Logs

If you are using a supported third-party scheduler, you can use the getDebugLog function to read the debug log from the scheduler for a particular job or task.

For example, find the failed job on your LSF scheduler, and read its debug log.

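% Find the LSF scheduler, then the jobs on it that are in the failed state.
sched = findResource('scheduler', 'type', 'lsf')
failedjob = findJob(sched, 'State', 'failed');
% Read the debug log of the first failed job.
message = getDebugLog(sched, failedjob(1))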

Connection Problems Between the Client and Job Manager

For testing connectivity between the client machine and the machines of your compute cluster, you can use Admin Center. For more information about Admin Center, including how to start it and how to test connectivity, see Admin Center in the MATLAB Distributed Computing Server documentation.

Detailed instructions for other methods of diagnosing connection problems between the client and job manager can be found in some of the Bug Reports listed on the MathWorks Web site.

The following sections can help you identify the general nature of some connection problems.

Client Cannot See the Job Manager

If you cannot locate your job manager with

findResource('scheduler','type','jobmanager')

the most likely reasons for this failure are

Job Manager Cannot See the Client

If findResource displays a warning message that the job manager cannot open a TCP connection to the client computer, the most likely reasons for this are

  


© 1984-2009 The MathWorks, Inc.