Parallel calculations for Deep Learning Toolbox

Question

0 votes

Greetings!

I have a problem with parallel computation when training a YOLO detector based on the ResNet-50 network.

For training, I use a virtual machine with 32 cores and no GPU attached. In the Parallel preferences, I set the number of workers to 8.

After running the code, I get the following error:

Error using nnet.internal.cnn.DistributedDispatcher (line 79)

'nnet.internal.cnn.GeneralDatastoreDispatcher' does not support order-preserving distribution.

Error in nnet.internal.cnn.DataDispatcherFactory>iCreateDistributedDispatcherIfRequired (line 204)

dispatcher = nnet.internal.cnn.DistributedDispatcher( dispatcher, executionSettings.workerLoad, retainDataOrder );

Error in nnet.internal.cnn.DataDispatcherFactory.createDataDispatcherMIMO (line 176)

dispatcher = iCreateDistributedDispatcherIfRequired(...

Error in vision.internal.cnn.trainNetwork>iCreateTrainingDataDispatcher (line 180)

dispatcher = nnet.internal.cnn.DataDispatcherFactory.createDataDispatcherMIMO( ...

Error in vision.internal.cnn.trainNetwork (line 34)

trainingDispatcher = iCreateTrainingDataDispatcher(ds, mapping, trainedNet,...

Error in trainYOLOv2ObjectDetector>iTrainYOLOv2 (line 391)

[yolov2Net, info] = vision.internal.cnn.trainNetwork(...

Error in trainYOLOv2ObjectDetector (line 187)

[net, info] = iTrainYOLOv2(ds, lgraph, params, mapping, options, checkpointSaver);

Error in YOLO_Multi_ver (line 83)

[detector,info] = trainYOLOv2ObjectDetector(preprocessedTrainingData,lgraph,options);

My training options are:

trainingOptions('sgdm','MiniBatchSize', 16, 'InitialLearnRate',1e-3, 'MaxEpochs',20, 'CheckpointPath', tempdir, 'Shuffle','never','ExecutionEnvironment', 'parallel');

Answer 1

Joss Knight on 19 Apr 2020

1 vote

Sorry about this unhelpful error message, which should be fixed in the current release. What it means is that 'Shuffle', 'never' is not supported for your input data when training in parallel, because when the data is distributed across your workers there is no way to ensure that it is divided in such a way that the exact same sequence of observations is read. To fix it, change to 'Shuffle', 'once'.
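
As a minimal sketch of that change, here are the training options from the question with only the 'Shuffle' value altered. The variables preprocessedTrainingData and lgraph are assumed to be defined as in the asker's script; nothing else is changed or verified here.

```matlab
% Options from the question, with 'Shuffle' changed from 'never' to 'once'
% so that the data can be distributed across the parallel workers.
options = trainingOptions('sgdm', ...
    'MiniBatchSize', 16, ...
    'InitialLearnRate', 1e-3, ...
    'MaxEpochs', 20, ...
    'CheckpointPath', tempdir, ...
    'Shuffle', 'once', ...
    'ExecutionEnvironment', 'parallel');

% Training call as in the question (line 83 of YOLO_Multi_ver).
% preprocessedTrainingData and lgraph are assumed to exist already.
[detector, info] = trainYOLOv2ObjectDetector(preprocessedTrainingData, lgraph, options);
```

'Shuffle', 'every-epoch' is the other shuffling value that trainingOptions accepts; whether it suits this data set is a separate choice.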
