## Code Generation by Using the GPU Coder App

The easiest way to create CUDA® kernels is to place the `coder.gpu.kernelfun` pragma into your primary MATLAB® function. The primary function is also known as the top-level or entry-point function. When GPU Coder™ encounters the `kernelfun` pragma, it attempts to parallelize all the computation within this function and then maps it to the GPU. For more information about GPU kernels, see GPU Programming Paradigm.

### Learning Objectives

In this tutorial, you learn how to:

• Prepare your MATLAB code for CUDA code generation by using the `kernelfun` pragma.

• Create and set up a GPU Coder project.

• Define function input properties.

• Check for code generation readiness and run-time issues.

• Specify code generation properties.

• Generate CUDA code by using the GPU Coder app.

### Tutorial Prerequisites

This tutorial requires the following products:

• MATLAB

• MATLAB Coder™

• GPU Coder

• C++ compiler

• NVIDIA® GPU enabled for CUDA

• CUDA Toolkit and driver

• Environment variables for the compilers and libraries. For more information, see Environment Variables.
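Before starting, it can help to confirm that the GPU, the CUDA toolkit, and the environment variables are visible to GPU Coder. The following is a minimal sketch, assuming that `coder.gpuEnvConfig` and `coder.checkGpuInstall` are available in your release. The property choices are illustrative, not required settings.

```matlab
% Sketch: check the host GPU code generation environment.
envCfg = coder.gpuEnvConfig('host');   % environment check for the host machine
envCfg.BasicCodegen = 1;               % also attempt a basic code generation test
envCfg.Quiet = 1;                      % suppress command-line progress output
results = coder.checkGpuInstall(envCfg);
disp(results)                          % structure summarizing each check
```

If a check fails, revisit the environment variables listed above before continuing.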

### Example: The Mandelbrot Set

#### Description

The Mandelbrot set is the region in the complex plane consisting of the values z0 for which the trajectories defined by this equation remain bounded as k→∞.

`$z_{k+1} = z_k^2 + z_0, \quad k = 0, 1, \dots$`

The overall geometry of the Mandelbrot set is shown in the figure. This view does not have the resolution to show the richly detailed structure of the fringe just outside the boundary of the set. At increasing magnifications, the Mandelbrot set exhibits an elaborate boundary that reveals progressively finer recursive detail.
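To make the definition concrete, the following sketch iterates the equation for a few individual starting values and records whether each trajectory leaves the disk of radius 2. The starting values and the iteration limit are illustrative choices for this sketch and are not part of the tutorial files. The radius of 2 matches the threshold used in the tutorial code.

```matlab
% Sketch: iterate z_{k+1} = z_k^2 + z_0 for individual starting values and
% record whether the trajectory leaves the disk of radius 2.
points = [0, -1, 1];          % illustrative starting values z0
maxIterations = 100;          % illustrative iteration limit
escaped = false(size(points));
for p = 1:numel(points)
    z0 = points(p);
    z = z0;
    for k = 1:maxIterations
        z = z*z + z0;
        if abs(z) > 2
            escaped(p) = true;   % trajectory is unbounded for this z0
            break
        end
    end
end
disp(escaped)
```

The values 0 and -1 stay bounded (the second cycles between -1 and 0), while 1 escapes after two updates.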

#### Algorithm

For this tutorial, pick a set of limits that specify a highly zoomed part of the Mandelbrot set in the valley between the main cardioid and the p/q bulb to its left. A 1000-by-1000 grid of real parts (x) and imaginary parts (y) is created between these two limits. The Mandelbrot algorithm is then iterated at each grid location. An iteration number of 500 renders the image in full resolution.

```
maxIterations = 500;
gridSize = 1000;
xlim = [-0.748766713922161,-0.748766707771757];
ylim = [0.123640844894862,0.123640851045266];
```

This tutorial uses an implementation of the Mandelbrot set by using standard MATLAB commands running on the CPU. This calculation is vectorized such that every location is updated simultaneously.

### Tutorial Files

Create a MATLAB function called `mandelbrot_count.m` with the following lines of code. This code is a baseline vectorized MATLAB implementation of the Mandelbrot set. For every point `(xGrid,yGrid)` in the grid, it calculates the iteration index `count` at which the trajectory defined by the equation reaches a distance of `2` from the origin. It then returns the natural logarithm of `count`, which is used to generate the color-coded plot of the Mandelbrot set. Later in this tutorial, you modify this file to make it suitable for code generation.

```
function count = mandelbrot_count(maxIterations,xGrid,yGrid)
% mandelbrot computation

z0 = complex(xGrid,yGrid);
count = ones(size(z0));

z = z0;
for n = 0:maxIterations
    z = z.*z + z0;
    inside = abs(z)<=2;
    count = count + inside;
end
count = log(count);
```

Create a MATLAB script called `mandelbrot_test.m` with the following lines of code. The script generates a 1000-by-1000 grid of real parts (x) and imaginary parts (y) between the limits specified by `xlim` and `ylim`. It also calls the `mandelbrot_count` function and plots the resulting Mandelbrot set.

```
maxIterations = 500;
gridSize = 1000;
xlim = [-0.748766713922161,-0.748766707771757];
ylim = [0.123640844894862,0.123640851045266];

x = linspace(xlim(1),xlim(2),gridSize);
y = linspace(ylim(1),ylim(2),gridSize);
[xGrid,yGrid] = meshgrid(x,y);

%% Mandelbrot computation in MATLAB
count = mandelbrot_count(maxIterations,xGrid,yGrid);

% Show
figure(1)
imagesc(x,y,count);
colormap([jet();flipud(jet());0 0 0]);
axis off
title('Mandelbrot set with MATLAB');
```

### Run the Original MATLAB Code

#### Run the Mandelbrot Example

Before making the MATLAB version of the Mandelbrot set algorithm suitable for code generation, you can test the functionality of the original code.

1. Change the current MATLAB working folder to the location that contains `mandelbrot_count.m` and `mandelbrot_test.m`. GPU Coder places generated code in this folder. Change your current working folder if you do not have full access to this folder.

2. Run the `mandelbrot_test` script.

The test script runs and shows the geometry of the Mandelbrot set within the boundary set by the variables `xlim` and `ylim`.

### Prepare MATLAB Code for Code Generation

Before you generate code with GPU Coder, check for coding issues in the original MATLAB code.

#### Check for Issues at Design Time

There are two tools that help you detect code generation issues at design time:

• Code Analyzer tool

The Code Analyzer is a tool incorporated into the MATLAB Editor that continuously checks your code as you enter it. The Code Analyzer reports issues and recommends modifications to maximize performance and maintainability of your code. To identify the warnings and errors specific to code generation from your MATLAB code, add the `%#codegen` directive to your MATLAB file. For more information, see Code Analyzer preferences.

Note

The Code Analyzer does not detect all code generation issues. After eliminating the errors or warnings that the Code Analyzer detects, compile your code with GPU Coder to determine if the code has other compliance issues.

• Code generation readiness tool

The code generation readiness tool screens the MATLAB code for features and functions that are not supported for code generation. This tool provides a report that lists issues and recommendations for making the MATLAB code suitable for code generation. You can access the code generation readiness tool in these ways:

• In the current folder browser — right-click the MATLAB file that contains the entry-point function.

• At the command line — by using the `coder.screener` function with the `-gpu` flag.

• In the GPU Coder app — after specifying the entry-point files, the app runs the Code Analyzer and the code generation readiness tool.
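For the command-line route listed above, a minimal sketch looks like the following. It assumes that `mandelbrot_count.m` is in the current folder or on the MATLAB path.

```matlab
% Sketch: run the code generation readiness check for GPU Coder.
% Assumes mandelbrot_count.m is in the current folder or on the path.
coder.screener('-gpu','mandelbrot_count');
```

The readiness report lists any unsupported functions or constructs that the tool finds in the function and the files it calls.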

#### Check for Issues at Code Generation Time

You can use GPU Coder to check for issues at code generation time. When GPU Coder detects errors or warnings, it generates an error report that describes the issues and provides links to the problematic MATLAB code. For more information, see Code Generation Reports.

### Make the MATLAB Code Suitable for Code Generation

To begin the process of making your MATLAB code suitable for code generation, use the file `mandelbrot_count.m`.

1. Set your MATLAB current folder to the work folder that contains your files for this tutorial.

2. In the MATLAB Editor, open `mandelbrot_count.m`. The Code Analyzer message indicator at the top right corner of the MATLAB Editor is green. The analyzer did not detect errors, warnings, or opportunities for improvement in the code.

3. After the function declaration, add the `%#codegen` directive to turn on the error checking that is specific to code generation.

```
function count = mandelbrot_count(maxIterations,xGrid,yGrid) %#codegen
```

The Code Analyzer message indicator remains green, indicating that it has not detected any code generation issues.

4. To map the `mandelbrot_count` function to a CUDA kernel, modify the original MATLAB code by placing the `coder.gpu.kernelfun` pragma in the body of the function.

```
function count = mandelbrot_count(maxIterations,xGrid,yGrid) %#codegen

% Add kernelfun pragma to trigger kernel creation
coder.gpu.kernelfun;

% mandelbrot computation
z0 = complex(xGrid,yGrid);
count = ones(size(z0));

z = z0;
for n = 0:maxIterations
    z = z.*z + z0;
    inside = abs(z)<=2;
    count = count + inside;
end
count = log(count);
```

If you use the `coder.gpu.kernelfun` pragma, GPU Coder attempts to map the computations in the function `mandelbrot_count` to the GPU.

5. Save the file. You are now ready to compile your code by using the GPU Coder app.

### Generate Code by Using the GPU Coder App

#### Open the GPU Coder App

On the MATLAB toolstrip Apps tab, under Code Generation, click the GPU Coder app icon. You can also open the app by typing `gpucoder` in the MATLAB Command Window. The app opens the Select source files page.

#### Select Source Files

1. On the Select source files page, enter or select the name of the primary function, `mandelbrot_count`. The primary function is also known as the top-level or entry-point function. The app creates a project with the default name `mandelbrot_count.prj` in the current folder.

2. Click Next to go to the Define Input Types step. The app analyzes the function for coding issues and code generation readiness. If the app identifies issues, it opens the Review Code Generation Readiness page where you can review and fix issues. In this example, because the app does not detect issues, it opens the Define Input Types page.

#### Define Input Types

The code generator must determine the data types of all the variables in the MATLAB files at compile time. Therefore, you must specify the data types of all the input variables. You can specify the input data types in one of these two ways:

• Provide a test file that calls the project entry-point functions. The GPU Coder app can infer the input argument types by running the test file.

• Enter the input types directly.

In this example, to define the properties of the inputs `maxIterations`, `xGrid`, and `yGrid`, specify the test file `mandelbrot_test.m`:

1. Enter or select the test file `mandelbrot_test.m`.

2. Click Autodefine Input Types.

The test file `mandelbrot_test.m` calls the entry-point function, `mandelbrot_count.m`, with the expected input types. The app infers that the input `maxIterations` is `double(1x1)` and the inputs `xGrid` and `yGrid` are `double(1000x1000)`.

3. Click Next to go to the Check for Run-Time Issues step.
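Outside the app, the same input types can be described programmatically. The following sketch uses `coder.typeof` from MATLAB Coder to build type objects that match what the app infers from the test file. The variable names are hypothetical and chosen only for this illustration.

```matlab
% Sketch: describe the entry-point inputs without running a test file.
maxIterationsType = coder.typeof(0);       % double scalar (1x1)
gridType = coder.typeof(0,[1000 1000]);    % fixed-size 1000x1000 double
inputTypes = {maxIterationsType,gridType,gridType};  % order matches the function signature
```

A cell array of this form can be supplied to the `-args` option of the `codegen` command.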

#### Check for Run-Time Issues

The Check for Run-Time Issues step generates a MEX file from your entry-point functions, runs the MEX function, and reports issues. This step is optional. However, it is a best practice to perform this step. Using this step, you can detect and fix defects that are harder to diagnose in the generated GPU code.

GPU Coder provides the option to perform GPU-specific checks at this point. When you select this option, GPU Coder generates CUDA code and a MEX file from your entry-point functions, runs the MEX function, and reports issues. Some of the GPU-specific run-time checks include:

• Checks for register spills.

• Stack size conformance checks.

Note

There may be certain MATLAB constructs in your code that cause the Check for Run-Time Issues to fail CPU-specific checks but pass the GPU-specific checks.

1. To open the Check for Run-Time Issues dialog box, click the Check for Issues arrow.

2. In the Check for Run-Time Issues dialog box, specify a test file or enter code that calls the entry-point function with example inputs. For this example, use the test file `mandelbrot_test.m` that you used to define the input types.

3. To enable GPU-specific checks, select the GPU option button. Click Check for Issues.

The app generates a MEX function. It runs the test script `mandelbrot_test` replacing calls to `mandelbrot_count` with calls to the generated MEX. If the app detects issues during the MEX function generation or execution, it provides warning and error messages. You can click these messages to navigate to the problematic code and fix the issue. In this example, the app does not detect issues. The MEX function has the same functionality as the original `mandelbrot_count` function.

4. Click Next to go to the Generate Code step.
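The run-time check described in this section can be approximated at the command line. The following is a sketch, not the app's internal procedure: it builds a GPU MEX function with `codegen` and then uses `coder.runTest` from MATLAB Coder to run the test script against it. It assumes a working GPU Coder installation and that both tutorial files are in the current folder. The variable name `ARGS` is hypothetical.

```matlab
% Sketch: build a GPU MEX function and run the test script against it.
cfg = coder.gpuConfig('mex');
gridType = coder.typeof(0,[1000 1000]);
ARGS = {coder.typeof(0),gridType,gridType};   % maxIterations, xGrid, yGrid
codegen -config cfg -args ARGS mandelbrot_count

% Run mandelbrot_test, replacing calls to mandelbrot_count with calls to
% the generated mandelbrot_count_mex.
coder.runTest('mandelbrot_test','mandelbrot_count');
```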

#### Generate CUDA Code

1. To open the Generate dialog box, click the Generate arrow.

2. In the Generate dialog box, you can select the type of build that you want GPU Coder to perform. The available options are listed in this table.

| Build Type | Description |
| --- | --- |
| `Source code` | CUDA source code to integrate with an external project. |
| `MEX` | Compiled code to run inside MATLAB. |
| `Static Library` | Binary library for static linking with an external project. |
| `Dynamic Library` | Binary library for dynamic linking with an external project. |
| `Executable` | Standalone program (requires a custom CUDA main file). |

For this tutorial, set Build type to `MEX(.mex)`. By generating a MEX output, you can check the correctness of the generated CUDA code from within MATLAB. The MEX build type does not require additional settings like Toolchain and Hardware Board. It also does not provide the option to generate only the source code. GPU Coder can automatically select an available CUDA toolchain as long as the Environment Variables are set properly.

To view advanced options, select More Settings > GPU Code. In the Compiler Flags option, add `--fmad=false`. When passed to `nvcc`, this flag instructs the compiler to disable floating-point multiply-add (FMAD) optimization. This option is set to prevent numerical mismatch in the generated code caused by architectural differences between the CPU and the GPU. For more information, see Numerical Differences Between CPU and GPU.

3. Click Generate.

GPU Coder generates the MEX executable `mandelbrot_count_mex` in your working folder. The `<pwd>\codegen\mex\mandelbrot_count` folder contains all the other generated files, including the CUDA source (*.cu) and header files. The GPU Coder app indicates that the code generation succeeded. It displays the source MATLAB files and generated output files on the left side of the page. On the Variables tab, it displays information about the MATLAB source variables. On the Target Build Log tab, it displays the build log, including compiler warnings and errors. By default, in the code window, the app displays the CUDA source file `mandelbrot_count.cu`. To view a different file, in the Source Code or Output Files pane, click the file name.

4. To view the code generation report, click View Report. The report provides links to your MATLAB code and the generated CUDA (*.cu) files. It also provides compile-time information for the variables and expressions in your MATLAB code. This information helps you to find sources of error and warnings. It also helps you to debug code generation issues in your code. For more information, see Code Generation Reports.

The GPU Kernels section on the Generated Code tab provides a list of kernels created during GPU code generation. The items in this list link to the relevant source code. For example, when you click mandelbrot_count_kernel1, the code section for this kernel is shown in the code browser window.

After you review the report, you can close the Code Generation Report window. To view the report later, open `report.mldatx` in the `<pwd>\codegen\mex\mandelbrot_count\html` folder.

5. The `<pwd>\codegen\mex\mandelbrot_count` folder contains the `gpu_codegen_info.mat` MAT-file, which holds the statistics for the generated GPU code. This MAT-file contains the `cuda_Kernel` variable that has information about the thread and block sizes, shared and constant memory usage, and input and output arguments of each kernel. The `cudaMalloc` and `cudaMemcpy` variables contain information about the size of all the GPU variables and the number of `memcpy` calls between the host and the device.

6. In the GPU Coder app, click Next to open the Finish Workflow page.
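The statistics MAT-file described in step 5 can be inspected from the MATLAB command line. The following sketch assumes that the MEX build has completed and that the current folder is the working folder used for code generation. The variable names `infoFile` and `info` are hypothetical.

```matlab
% Sketch: inspect the GPU code generation statistics after a MEX build.
infoFile = fullfile(pwd,'codegen','mex','mandelbrot_count','gpu_codegen_info.mat');
info = load(infoFile);     % structure with one field per saved variable
disp(fieldnames(info))     % per the text above: cuda_Kernel, cudaMalloc, cudaMemcpy
```

Loading into a structure keeps the statistics variables out of the base workspace, which avoids clashing with variables from the test script.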

#### Review the Finish Workflow Page

The Finish Workflow page indicates that the code generation succeeded. It provides a project summary and links to the MATLAB source files, the code generation report, and the generated output binaries. You can save the configuration parameters of the current GPU Coder project as a MATLAB script. See Convert MATLAB Coder Project to MATLAB Script.
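For orientation, a command-line equivalent of the settings chosen in this tutorial might look like the following sketch. It is not the script that the app exports, which can differ in structure and detail. It assumes the MEX build type, the `--fmad=false` compiler flag, and the input types that the app inferred from `mandelbrot_test.m`. The variable name `ARGS` is hypothetical.

```matlab
% Sketch of a command-line equivalent of the app workflow in this tutorial.
cfg = coder.gpuConfig('mex');                   % MEX build type
cfg.GpuConfig.CompilerFlags = '--fmad=false';   % flag passed to nvcc
cfg.GenerateReport = true;                      % produce the code generation report

% Input types as inferred by the app from mandelbrot_test.m.
ARGS = {coder.typeof(0),coder.typeof(0,[1000 1000]),coder.typeof(0,[1000 1000])};
codegen -config cfg -args ARGS mandelbrot_count
```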

#### Verify Correctness of the Generated Code

To verify the correctness of the generated MEX file, see Verify Correctness of the Generated Code.
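As a quick check, you can compare the output of the original MATLAB function with the output of the generated MEX function on the same inputs. This is a sketch, assuming that `mandelbrot_count_mex` is in the current folder. The variable names `countRef`, `countGpu`, and `maxAbsDiff` are hypothetical, and the acceptable tolerance is left to the reader.

```matlab
% Sketch: compare the MATLAB function with the generated MEX function.
maxIterations = 500;
gridSize = 1000;
xlim = [-0.748766713922161,-0.748766707771757];
ylim = [0.123640844894862,0.123640851045266];
x = linspace(xlim(1),xlim(2),gridSize);
y = linspace(ylim(1),ylim(2),gridSize);
[xGrid,yGrid] = meshgrid(x,y);

countRef = mandelbrot_count(maxIterations,xGrid,yGrid);      % runs in MATLAB on the CPU
countGpu = mandelbrot_count_mex(maxIterations,xGrid,yGrid);  % runs the generated CUDA MEX

maxAbsDiff = max(abs(countRef(:) - countGpu(:)));
fprintf('Maximum absolute difference: %g\n',maxAbsDiff);
```

Small differences can still arise from architectural differences between the CPU and the GPU, which is why the tutorial disables FMAD optimization.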