Why do I receive the "CUDA_ERROR_LAUNCH_TIMEOUT" error when trying to run GPU code with Parallel Computing Toolbox?

Question

MathWorks Support Team on 18 Apr 2012

Edited: MathWorks Support Team on 30 Jan 2024

I am trying to run my computation on the GPU. When I execute my program I receive the following error message:

ERROR: Warning: An unexpected error occurred during CUDA execution. The CUDA error
was: CUDA_ERROR_LAUNCH_TIMEOUT.
Error using arrayfun
The kernel execution failed because the CUDA driver timeout was encountered.

Answer 1

MathWorks Support Team on 30 Jan 2024

Edited: MathWorks Support Team on 30 Jan 2024

This is a limitation imposed on the Parallel Computing Toolbox by the underlying operating system.

This error occurs when a gpuArray operation or a CUDA kernel runs for a long time on a GPU that is used for both graphics rendering and CUDA computations. The error is triggered by the operating system, which limits the time that the GPU can dedicate to computations instead of updating the display. Computations that exceed this time limit trigger the Timeout Detection and Recovery (TDR) mechanism.

There are several ways to avoid this error:

1. Use different GPUs for graphics and computation

The time limit only applies to GPUs that are used for graphics. It is therefore recommended that you run gpuArray operations and CUDA kernels on a GPU that is not providing output for a display.

On Windows, to ensure that your GPU is never used for display, set your GPU to use the Tesla Compute Cluster (TCC) driver model. To see which driver model your GPU device is using, inspect the DriverModel property returned by the gpuDevice function in MATLAB. Not all NVIDIA GPUs support the TCC driver model.
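As a minimal sketch of that check, the snippet below prints the DriverModel property of the currently selected GPU, then looks for a device that is not subject to the time limit. It assumes the KernelExecutionTimeout property of the GPU device object, which reports whether the operating system timeout applies to that device. Note that calling gpuDevice(idx) selects the device and clears any existing gpuArray data, so run this before starting a computation.

```matlab
% Inspect the currently selected GPU.
gpu = gpuDevice;
fprintf('Name: %s\n', gpu.Name);
fprintf('Driver model: %s\n', gpu.DriverModel);
fprintf('Timeout applies: %d\n', gpu.KernelExecutionTimeout);

% Look for a GPU that is not subject to the operating system time limit.
% gpuDevice(idx) selects device idx and resets any existing gpuArray data.
foundDevice = false;
for idx = 1:gpuDeviceCount
    candidate = gpuDevice(idx);
    if ~candidate.KernelExecutionTimeout
        foundDevice = true;
        fprintf('Using GPU %d (%s) for computation.\n', idx, candidate.Name);
        break
    end
end

if ~foundDevice
    % The last device inspected stays selected in this case.
    disp('Every GPU on this system is subject to the time limit.');
end
```

If no device qualifies, the other two approaches in this answer still apply.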

2. Segment your computation into smaller chunks

If possible, split a large computation into several smaller ones. Each smaller computation finishes sooner, so it is less likely to exceed the time limit and trigger the TDR mechanism.
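One way to do this is sketched below for an element-wise arrayfun call, like the one in the error message. The function f, the problem size N, and chunkSize are illustrative assumptions, not values from the original question. The right chunk size depends on your GPU and on how long each element takes to compute.

```matlab
% Hypothetical per-element function standing in for a long computation.
f = @(x) x.*x + 1;

N = 1e7;           % total number of elements (illustrative)
chunkSize = 1e6;   % assumed chunk size; tune for your GPU and workload
result = zeros(N, 1);

for first = 1:chunkSize:N
    last = min(first + chunkSize - 1, N);
    xChunk = gpuArray((first:last)');      % send one chunk to the GPU
    yChunk = arrayfun(f, xChunk);          % one shorter kernel per chunk
    result(first:last) = gather(yChunk);   % gather waits for the kernel to finish
end
```

Because gather blocks until the chunk is complete, each kernel finishes before the next one starts, so no single launch runs for the duration of the whole computation.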

3. Modify the TDR mechanism (Windows)

If your system has GPUs without a display attached, you can manually modify the TDR mechanism. Increase the time limit to allow GPU computations to run for extended periods of time without triggering an error.

For information on TDR and how to modify the timeout on Windows, see https://learn.microsoft.com/en-us/windows-hardware/drivers/display/tdr-registry-keys.
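To see whether the timeout has already been changed on a machine, you can read the registry from MATLAB. The sketch below assumes the TdrDelay value under the GraphicsDrivers key described on the Microsoft page, which lists the default as 2 seconds. It only reads the value. Changing it requires administrator rights and a restart, and is best done by following the Microsoft instructions.

```matlab
% Windows only: read the TdrDelay value (in seconds) if it has been set.
subkey = 'SYSTEM\CurrentControlSet\Control\GraphicsDrivers';
tdrDelay = NaN;   % NaN marks "not set in the registry"

if ispc
    try
        tdrDelay = double(winqueryreg('HKEY_LOCAL_MACHINE', subkey, 'TdrDelay'));
    catch
        % winqueryreg throws an error when the value does not exist;
        % in that case the operating system default applies.
    end
end

if isnan(tdrDelay)
    disp('TdrDelay is not set; the operating system default applies.');
else
    fprintf('TdrDelay is set to %d seconds.\n', tdrDelay);
end
```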
