I have a custom function that takes an m-by-2 matrix (two columns) and operates on it. The function is fairly complicated: it performs several matrix multiplications, stepping sequentially through one of the column vectors in a for loop and, depending on the corresponding value in the other column, choosing which matrix to multiply by. In effect it is a cumulative matrix product driven by the elements of one column, conditional on the values in the other column.
col1   col2
0      0.03
0      0.04
1      0.02
0      0.1
1      0.004
If the value in col1 is 0, one matrix is chosen for the multiplication; if it is 1, a different one is chosen. Then a cumulative matrix product is taken, i.e.:

    Values = diag(Valuesmat);
    cumulMatProduct = ini;

    for ix = 1:length(col2)
        % choose the matrix according to the flag in col1
        if col1(ix) == 0
            matrixToMultiply = matrix1;
        elseif col1(ix) == 1
            matrixToMultiply = matrix2;
        end

        % scale by the current col2 value and accumulate from the left
        anotherMatrixtoMultiply = diag( exp(Values).*col2(ix) );
        cumulMatProduct = matrixToMultiply*anotherMatrixtoMultiply*cumulMatProduct;
    end

Basically that's what the function does.
Now, I have a large number of such column data sets, so I would like to know whether I could use GPU computation for this. (I have access to MATLAB R2013a with PCT and a Tesla S2050.)
I would like to do something like:
    DataMatrix1 = [col1; col1; col1];
    DataMatrix2 = [col2; col2; col2];

    gpuDat1 = gpuArray(DataMatrix1);
    gpuDat2 = gpuArray(DataMatrix2);

    % ValueMat and ini are not sliced; each processor will have its own copy
    [resultVect] = myFuncCall(gpuDat1, gpuDat2, ValueMat, ini);

i.e., slice the matrices by column across the GPU processors and have each processor run my function to produce the cumulative matrix product for its input columns of data (more like the independent parallelization one would do across CPU nodes/workers, but on GPUs). Or, more generally, what is the best way to do this in parallel, even with just CPUs/workers? Is matlabpool the best option?
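For the CPU-only variant asked about here, a minimal sketch with `parfor` might look like the following. It assumes each data set occupies one column of two hypothetical matrices, `flagMat` and `scaleMat` (these names, the 3-by-3 example matrices, and the example values are illustrative and not from the original post). On older releases such as R2013a a worker pool would first have to be opened with `matlabpool open`; without a pool the loop simply runs serially.

```matlab
% Illustrative inputs (assumed, not from the original post)
n       = 3;
matrix1 = eye(n) + 0.1*ones(n);
matrix2 = eye(n) - 0.1*ones(n);
Values  = [0.1; 0.2; 0.3];            % stands in for diag(Valuesmat)
ini     = eye(n);
col1    = [0; 0; 1; 0; 1];
col2    = [0.03; 0.04; 0.02; 0.1; 0.004];

% One data set per column (hypothetical layout)
flagMat  = [col1, col1, col1];
scaleMat = [col2, col2, col2];

nSets   = size(flagMat, 2);
results = cell(1, nSets);

parfor k = 1:nSets
    flags  = flagMat(:, k);           % sliced: each worker gets its own column
    scales = scaleMat(:, k);
    P = ini;                          % ini, Values, matrix1, matrix2 are broadcast
    for ix = 1:numel(flags)
        if flags(ix) == 0
            M = matrix1;
        else
            M = matrix2;
        end
        P = M * diag(exp(Values) .* scales(ix)) * P;
    end
    results{k} = P;                   % cumulative product for data set k
end
```

Each iteration of the outer loop is independent, which is what makes it a candidate for `parfor`; the inner loop stays sequential because the order of the matrix products matters.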
It looks like you could set up all the data in one pass. You might try organizing your data such that the matrix to use for col1(ix) was stored in matrixToMultiply(:,:,ix) and the matrix corresponding to col2(ix) was stored in anotherMatrixToMultiply(:,:,ix). You haven't mentioned the size of your data, so this may very well cause you to run out of memory on your GPU. However, if these variables can fit on your GPU as gpuArrays then you can use
pagefun(@mtimes, matrixToMultiply, anotherMatrixToMultiply)
to perform all of the matrix multiplications at one time in an efficient way.
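A minimal sketch of this idea, under stated assumptions: the 3-by-3 matrices and the example values below are made up for illustration, and a supported GPU plus a Parallel Computing Toolbox release that provides `pagefun` is required (as far as I know it first appeared in R2013b, so R2013a may not have it). Note that `pagefun` only batches the per-step products; the cumulative product across steps is still accumulated in order afterwards, since matrix multiplication is not commutative.

```matlab
% Illustrative inputs (assumed, not from the original post)
n       = 3;
matrix1 = eye(n) + 0.1*ones(n);
matrix2 = eye(n) - 0.1*ones(n);
Values  = [0.1; 0.2; 0.3];            % stands in for diag(Valuesmat)
ini     = eye(n);
col1    = [0; 0; 1; 0; 1];
col2    = [0.03; 0.04; 0.02; 0.1; 0.004];
m       = numel(col1);

% Build one n-by-n page per row of the data
matrixToMultiply        = zeros(n, n, m);
anotherMatrixToMultiply = zeros(n, n, m);
for ix = 1:m
    if col1(ix) == 0
        matrixToMultiply(:, :, ix) = matrix1;
    else
        matrixToMultiply(:, :, ix) = matrix2;
    end
    anotherMatrixToMultiply(:, :, ix) = diag(exp(Values) .* col2(ix));
end

% All per-step products in one batched call on the GPU
stepProducts = pagefun(@mtimes, gpuArray(matrixToMultiply), ...
                                gpuArray(anotherMatrixToMultiply));

% The accumulation itself remains sequential
cumulMatProduct = gpuArray(ini);
for ix = 1:m
    cumulMatProduct = stepProducts(:, :, ix) * cumulMatProduct;
end
result = gather(cumulMatProduct);     % bring the final matrix back to the CPU
```

With many independent data sets, the pages for all of them could be stacked along the third dimension in the same way, which is where the batched call is likely to pay off most, memory permitting.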