Why do I get the error "CUDNN_STATUS_EXECUTION_FAILED" when training a neural network on a GPU on a server?
When training a neural network on a GPU on a server, it usually fails after some time with the following error message:
Error using trainNetwork (line 154)
Unexpected error calling cuDNN: CUDNN_STATUS_EXECUTION_FAILED.
Caused by:
Error using nnet.internal.cnngpu.lstmForwardTrain
Unexpected error calling cuDNN: CUDNN_STATUS_EXECUTION_FAILED.
This generally happens when another user launches a program on the same GPU.
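To confirm that the GPU is being shared, it can help to inspect it from MATLAB before starting training. The following is only a diagnostic sketch, not a fix. It assumes Parallel Computing Toolbox is installed and that `nvidia-smi` is on the server's system path. The variable names (`numGpus`, `freeBytes`, `bestIdx`) are illustrative.

```matlab
% Diagnostic sketch: inspect the GPUs visible to MATLAB before training.
% Assumes Parallel Computing Toolbox. Run this BEFORE creating any gpuArray
% data, because gpuDevice(idx) selects the device and resets it.
numGpus = gpuDeviceCount;
if numGpus == 0
    error('No supported GPU is visible to MATLAB.');
end

freeBytes = zeros(1, numGpus);
for idx = 1:numGpus
    g = gpuDevice(idx);                     % select (and reset) device idx
    freeBytes(idx) = g.AvailableMemory;     % free memory in bytes
    fprintf('GPU %d (%s): %.1f of %.1f GB free\n', idx, g.Name, ...
        g.AvailableMemory / 1e9, g.TotalMemory / 1e9);
end

% Select the GPU with the most free memory. This is a heuristic only: it
% does not prevent another user from starting a job on that GPU later.
[~, bestIdx] = max(freeBytes);
gpuDevice(bestIdx);

% Assumption: nvidia-smi is on the system path. Its process table shows
% whether other programs are currently running on each GPU.
[status, out] = system('nvidia-smi');
if status == 0
    disp(out);
end
```

If the process table lists programs belonging to other users on the selected GPU, that is consistent with the failure pattern described above.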
Accepted Answer