How to adapt pmtm for a gpuArray?

2 views (last 30 days)
This would really speed up some signal processing that I am doing at work. Does anyone know how to easily adapt the function pmtm for use with a gpuArray? I know that the fastest way would be to use mex functions with cuda and not use either toolbox, but this is a bit beyond me at the moment.

Accepted Answer

Daniel Harvey
Daniel Harvey on 24 May 2018
I finally got around to readdressing this challenge. The following is a function that will evaluate a power spectral density using a thomson multitaper method on a 2D input matrix with signal in each column. It uses the czt method to compute the DFT, and the eigen method for estimating the average of dpss windows. It takes about 1/15 of the time as using pmtm() in a parfor loop (on my machine). This is a rather specific implementation, but hopefully it will be useful to someone.
function psd = mypmtm(xin,fs,bins_per_hz)
[m, n] = size(xin);%m=length of signal, n=# of signals
k=fs*bins_per_hz;
nfft = 2^nextpow2(m+k-1);%- Length for power-of-two fft.
[E,V] = dpss(m,4);
g=gpuDevice;
s=(m+nfft)*32*length(V);%how many bytes will be needed for ea. signal
ne=floor(g.AvailableMemory/s);%number of signals that can be processed at once with available memory
indx=[0:ne:n,n];%number of iterations that will be necessary
psd=zeros(k/2,n);%initialize output
for i=1:length(indx)-1
x=gpuArray(xin(:,1+indx(i):indx(i+1)));
w = exp(-1i .* 2 .* pi ./ k);
x=x.*permute(E,[1 3 2]); %apply dpss windows
%------- Premultiply data.
kk = ( (-m+1):max(k-1,m-1) )';
kk = (kk .^ 2) ./ 2;
ww = w .^ (kk); % <----- Chirp filter is 1./ww
nn = (0:(m-1))';
x = x .* ww(m+nn);
%------- Fast convolution via FFT.
x = fft( x, nfft );
fv = fft( 1 ./ ww(1:(k-1+m)), nfft ); % <----- Chirp filter.
x = x .* fv;
x = ifft( x );
%------- Final multiply.
x = x( m:(m+k-1), : , :) .* ww( m:(m+k-1) );
x = abs(x).^2;
x=x.*permute(V,[2 3 1])/length(V);%'eigen' method of estimating average
x=sum(x,3);
x=x(2:end/2+1,:)./fs;
psd(:,1+indx(i):indx(i+1))=gather(x);
end
end
  1 Comment
ForTex
ForTex on 4 Sep 2019
Thank you for this!
Testing this out made me realize, that since the dpss function is the most computationally intensive here or in native pmtm, I wonder if that could be sped up with a gpu. If calculating a bunch of multitaper PSDs with the same length m, you could precalculate the dpss, and change the code to use E as input rather than calculating it every time you call the function.
I do however wonder, why there is a discrepancy of about 1/pi between mypmtm and pmtm?

Sign in to comment.

More Answers (0)

Categories

Find more on Fourier Analysis and Filtering in Help Center and File Exchange

Community Treasure Hunt

Find the treasures in MATLAB Central and discover how the community can help you!

Start Hunting!