<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0">
  <channel>
    <link>http://www.mathworks.com/matlabcentral/newsreader/view_thread/261866</link>
    <title>MATLAB Central Newsreader - Compiling CUDA C/C++ mex code under linux</title>
    <description>Feed for thread: Compiling CUDA C/C++ mex code under linux</description>
    <language>en-us</language>
    <copyright>&amp;copy;1994-2012 by MathWorks, Inc.</copyright>
    <webmaster>webmaster@mathworks.com</webmaster>
    <generator>MATLAB Central Newsreader</generator>
    <docs>http://blogs.law.harvard.edu/tech/rss</docs>
    <ttl>60</ttl>
    <image>
      <title>MathWorks</title>
      <url>http://www.mathworks.com/images/membrane_icon.gif</url>
    </image>
    <item>
      <pubDate>Tue, 29 Sep 2009 08:53:01 -0400</pubDate>
      <title>Compiling CUDA C/C++ mex code under linux</title>
      <link>http://www.mathworks.com/matlabcentral/newsreader/view_thread/261866#683328</link>
      <author>Oliver Woodford</author>
      <description>Hi all&lt;br&gt;
&lt;br&gt;
There are several methods available on the File Exchange for compiling CUDA C/C++ code into mex files under Windows, but none that I've come across work for Linux. However, I've found a nice, easy way to do it, which I'll share with you, though I must confess I haven't tested it extensively.&lt;br&gt;
&lt;br&gt;
The idea is to use NVIDIA's nvcc compiler to convert CUDA C/C++ code into standard C++ code, then compile that with mex. The first stage looks something like:&lt;br&gt;
&lt;br&gt;
system(sprintf('nvcc -I&quot;%s/extern/include&quot; --cuda &quot;mexfun.cu&quot; --output-file &quot;mexfun.cpp&quot;', matlabroot));&lt;br&gt;
&lt;br&gt;
Then the second stage is roughly:&lt;br&gt;
&lt;br&gt;
mex -I/opt/cuda/include -L/opt/cuda/lib -lcudart mexfun.cpp&lt;br&gt;
&lt;br&gt;
Obviously you need to set the various paths and file/function names to suit your needs.&lt;br&gt;
&lt;br&gt;
HTH,&lt;br&gt;
Oliver&lt;br&gt;
&lt;br&gt;
PS Does anyone think this approach could reduce the efficiency of the resulting machine code? I wonder whether it limits the level of optimization that can be applied.</description>
    </item>
    <item>
      <pubDate>Mon, 07 Dec 2009 12:11:04 -0500</pubDate>
      <title>Re: Compiling CUDA C/C++ mex code under linux</title>
      <link>http://www.mathworks.com/matlabcentral/newsreader/view_thread/261866#700375</link>
      <author>Thomas Clark</author>
      <description>Oliver!&lt;br&gt;
&lt;br&gt;
Five stars, thanks very much. I was SO close to this solution, but things were still going wrong. Your commands just sorted me right out :)&lt;br&gt;
&lt;br&gt;
Re. performance, I don't think it'll have a detrimental effect. The nvcc compiler will use its standard options for the device code, and gcc (or whatever C/C++ compiler you use) will use the options that mex passes it to compile the part of the code which runs on the host.&lt;br&gt;
&lt;br&gt;
If you are concerned about the host code performance, you can always adjust the mexopts.sh file (in matlabroot/bin/) or the call to mex to pass various performance flags (-O3 etc.) to the compiler.&lt;br&gt;
&lt;br&gt;
Cheers again!&lt;br&gt;
&lt;br&gt;
Tom Clark</description>
    </item>
    <item>
      <pubDate>Fri, 18 Nov 2011 11:21:10 -0500</pubDate>
      <title>Re: Compiling CUDA C/C++ mex code under linux</title>
      <link>http://www.mathworks.com/matlabcentral/newsreader/view_thread/261866#858856</link>
      <author>Andrei </author>
      <description>Hi all,&lt;br&gt;
&lt;br&gt;
I'm trying to use your method to call CUDA code as a function in MATLAB. I managed to build the mex file, but I get the following error:&lt;br&gt;
&lt;br&gt;
&lt;br&gt;
Mex file entry point is missing.  Please check the (case-sensitive) &lt;br&gt;
spelling of mexFunction (for C MEX-files), or the (case-insensitive) &lt;br&gt;
spelling of MEXFUNCTION (for FORTRAN MEX-files).&lt;br&gt;
Invalid MEX-file&lt;br&gt;
&lt;br&gt;
&lt;br&gt;
&lt;br&gt;
Can you please help me?&lt;br&gt;
&lt;br&gt;
Thank you</description>
    </item>
    <item>
      <pubDate>Fri, 18 Nov 2011 13:13:08 -0500</pubDate>
      <title>Re: Compiling CUDA C/C++ mex code under linux</title>
      <link>http://www.mathworks.com/matlabcentral/newsreader/view_thread/261866#858863</link>
      <author>Oliver Woodford</author>
      <description>&quot;Andrei&quot; wrote:&lt;br&gt;
&amp;gt; hi all, &lt;br&gt;
&amp;gt; &lt;br&gt;
&amp;gt;      I'm trying to use your method to call CUDA code as function in MATLAB. I succeeded to obtain the mex file, but I have some problems:&lt;br&gt;
&amp;gt; &lt;br&gt;
&amp;gt; &lt;br&gt;
&amp;gt; Mex file entry point is missing.  Please check the (case-sensitive) &lt;br&gt;
&amp;gt; spelling of mexFunction (for C MEX-files), or the (case-insensitive) &lt;br&gt;
&amp;gt; spelling of MEXFUNCTION (for FORTRAN MEX-files).&lt;br&gt;
&amp;gt; Invalid MEX-file&lt;br&gt;
&amp;gt; &lt;br&gt;
&amp;gt; &lt;br&gt;
&amp;gt; &lt;br&gt;
&amp;gt; can you please help me?&lt;br&gt;
&amp;gt; &lt;br&gt;
&amp;gt; Thank you&lt;br&gt;
&lt;br&gt;
The error message is quite helpful. Does the source code for your mex file define a mexFunction function?</description>
    </item>
    <item>
      <pubDate>Mon, 21 Nov 2011 09:53:13 -0500</pubDate>
      <title>Re: Compiling CUDA C/C++ mex code under linux</title>
      <link>http://www.mathworks.com/matlabcentral/newsreader/view_thread/261866#859049</link>
      <author>Andrei </author>
      <description>&quot;Oliver Woodford&quot; wrote in message &amp;lt;ja5ll3$s5c$1@newscl01ah.mathworks.com&amp;gt;...&lt;br&gt;
&amp;gt; The error is quite helpful. Do you have a mexFunction function in your mex file?&lt;br&gt;
&lt;br&gt;
&lt;br&gt;
I have a CUDA file, and I generated the mex file from it using your commands.&lt;br&gt;
&lt;br&gt;
Do I have to write a mex file myself? I thought that the commands above generate the mex file.&lt;br&gt;
&lt;br&gt;
Could you post some example files?&lt;br&gt;
&lt;br&gt;
Thanks </description>
    </item>
    <item>
      <pubDate>Tue, 22 Nov 2011 16:19:08 -0500</pubDate>
      <title>Re: Compiling CUDA C/C++ mex code under linux</title>
      <link>http://www.mathworks.com/matlabcentral/newsreader/view_thread/261866#859203</link>
      <author>Oliver Woodford</author>
      <description>&quot;Andrei&quot; wrote:&lt;br&gt;
&amp;gt; &lt;br&gt;
&amp;gt; I have a CUDA file, using your commands I generated the mex file.&lt;br&gt;
&amp;gt; &lt;br&gt;
&amp;gt;  I have to write a mex file? I thought that the commands above generates the mex file.&lt;br&gt;
&lt;br&gt;
Andrei,&lt;br&gt;
&lt;br&gt;
The commands I gave turn a .cu file into a .cpp file and then compile that into a mex file. They do not create a mexFunction host function (the gateway function between MATLAB and any C/C++/CUDA code) for you. You need to define mexFunction yourself, either in the .cu file or in another .c/.cpp file that you compile along with the converted .cu file in the last step.&lt;br&gt;
&lt;br&gt;
I have included below a simple example of a .cu file containing such a function. It generates an array of 256 random numbers and squares them on the GPU, then checks these against the result computed on the CPU.&lt;br&gt;
&lt;br&gt;
This topic is a little outside the scope of this thread, so if you need further help then I suggest you look at MATLAB's documentation on mex files, and/or post a new question to this newsgroup or on MATLAB Answers.&lt;br&gt;
&lt;br&gt;
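A caveat about the check in the example below: it compares the GPU output with the host's own products using !=, i.e. exact equality. That can be too strict if the host compiler keeps intermediate results at higher precision, and in general once a kernel does more than a single multiply. A minimal sketch of a tolerance-based comparison in plain C++, assuming a relative tolerance suits your data (the helper name nearly_equal and the tolerance value are hypothetical, not part of the example or of any library):&lt;br&gt;
&lt;br&gt;
```cpp
// Hypothetical helper, not part of the example in this post or of any library.
// Returns true when a and b agree to within rel_tol, measured relative to
// whichever of the two has the larger magnitude. Returns false if either is NaN.
static bool nearly_equal(float a, float b, float rel_tol)
{
	const float diff = (a >= b) ? (a - b) : (b - a);
	const float abs_a = (a >= 0.0f) ? a : -a;
	const float abs_b = (b >= 0.0f) ? b : -b;
	const float scale = (abs_a >= abs_b) ? abs_a : abs_b;
	return rel_tol * scale >= diff;
}
```
&lt;br&gt;
In the check loop this would replace the != test, e.g. if (!nearly_equal(h_buffer2[a], h_buffer[a] * h_buffer[a], 1e-6f)).&lt;br&gt;
&lt;br&gt;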
HTH,&lt;br&gt;
Oliver&lt;br&gt;
&lt;br&gt;
===================================================&lt;br&gt;
&lt;br&gt;
#include &quot;mex.h&quot;&lt;br&gt;
#include &amp;lt;cstdlib&amp;gt; // for rand() and RAND_MAX&lt;br&gt;
&lt;br&gt;
// CUDA kernel which squares a float&lt;br&gt;
__global__ void sq(float *d_buffer)&lt;br&gt;
{&lt;br&gt;
	d_buffer[threadIdx.x] *= d_buffer[threadIdx.x];&lt;br&gt;
}&lt;br&gt;
&lt;br&gt;
// Function which interfaces with MATLAB&lt;br&gt;
void mexFunction(int nlhs, mxArray *plhs[], int nrhs, const mxArray *prhs[])&lt;br&gt;
{    &lt;br&gt;
&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;if (nrhs != 0)&lt;br&gt;
&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;mexErrMsgTxt(&quot;No inputs expected.&quot;);&lt;br&gt;
&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;if (nlhs != 0)&lt;br&gt;
&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;mexErrMsgTxt(&quot;No outputs expected.&quot;);&lt;br&gt;
&lt;br&gt;
	// Generate the data&lt;br&gt;
	float h_buffer[256];&lt;br&gt;
	for (int a = 0; a &amp;lt; 256; ++a)&lt;br&gt;
		h_buffer[a] = float(rand()) / RAND_MAX;&lt;br&gt;
&lt;br&gt;
	// Copy to the gpu&lt;br&gt;
	float *d_buffer;&lt;br&gt;
	cudaMalloc((void **)&amp;d_buffer, 256 * sizeof(*d_buffer));&lt;br&gt;
	cudaMemcpy(d_buffer, h_buffer, 256 * sizeof(*d_buffer), cudaMemcpyHostToDevice);&lt;br&gt;
&lt;br&gt;
	// Call the CUDA kernel&lt;br&gt;
	sq&amp;lt;&amp;lt;&amp;lt;1, 256&amp;gt;&amp;gt;&amp;gt;(d_buffer);&lt;br&gt;
&lt;br&gt;
	// Copy from gpu, then free the device memory&lt;br&gt;
	float h_buffer2[256];&lt;br&gt;
	cudaMemcpy(h_buffer2, d_buffer, 256 * sizeof(*d_buffer), cudaMemcpyDeviceToHost);&lt;br&gt;
	cudaFree(d_buffer);&lt;br&gt;
&lt;br&gt;
	// Check result&lt;br&gt;
	for (int a = 0; a &amp;lt; 256; ++a) {&lt;br&gt;
		if (h_buffer2[a] != h_buffer[a] * h_buffer[a])&lt;br&gt;
			mexErrMsgTxt(&quot;Error in calculation!&quot;);&lt;br&gt;
	}&lt;br&gt;
	mexPrintf(&quot;Test passed!\n&quot;);&lt;br&gt;
}</description>
    </item>
    <item>
      <pubDate>Wed, 23 Nov 2011 09:11:09 -0500</pubDate>
      <title>Re: Compiling CUDA C/C++ mex code under linux</title>
      <link>http://www.mathworks.com/matlabcentral/newsreader/view_thread/261866#859286</link>
      <author>Andrei </author>
      <description>Thank you very much.</description>
    </item>
  </channel>
</rss>

