How can I work around a race condition on a Parallel Computing job storage location?
When multiple parallel computing jobs are submitted simultaneously from different MATLAB instances, a race condition can occur at the shared job storage location. This is especially likely on MATLAB Production Server workers. The symptom is a crash or hang when a second pool is opened at the same time as the first, accompanied by a variety of errors such as the following:
Error using parallel.Job/preSubmit (line 592)
Unable to read MAT-file /nfs/pathname/username/.matlab/local_cluster_jobs/R2019a/Job1.in.mat.
File might be corrupt.
Error using parpool (line 113)
Failed to convert value stored in Settings for property JobStorageLocation to a
datalocation.
Error using parallel.internal.pool.InteractiveClient>iThrowWithCause (line 678)
Failed to start pool.
Error using parallel.Job/createTask (line 320)
Only one task may be created on a concurrent Job.
Error using parpool (line 113)
Invalid default value for property 'ParallelNode' in class 'parallel.internal.settings.ParallelSettingsTree':
No value is set for setting 'PCTVersionNumber' at any level.
How can I work around this issue?
Accepted Answer
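One commonly used workaround is to give each MATLAB instance its own job storage location, so that concurrent instances never read and write the same job files. The following is a minimal sketch, assuming the default cluster profile is a local one (the `local_cluster_jobs` path in the errors above suggests this) and that a folder under the system temporary directory is acceptable. The variable names `c`, `storageDir`, and `pool` are illustrative.

```matlab
% Create a cluster object from the default profile.
c = parcluster();

% Generate a unique folder name for this MATLAB instance and create it.
% Assumption: the system temporary directory is a suitable place for job data.
storageDir = tempname();
mkdir(storageDir);

% Point this cluster object at the private folder.
% Only this in-memory cluster object is changed, not the saved profile.
c.JobStorageLocation = storageDir;

% Open the pool through the modified cluster object.
pool = parpool(c);

% ... run parfor / parfeval work here ...

% Shut down the pool and remove the private folder when finished.
delete(pool);
rmdir(storageDir, 's');
```

Because each instance writes to a folder that no other instance knows about, the job files should no longer collide. The cost is that jobs created this way are not visible from other MATLAB sessions using the default storage location.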