|
Hi,
I was trying to implement the Decimation in Frequency FFT algorithm.
I took the following two approaches -
1. In-place computation of the radix-2 FFT.
2. Call to custom written function that computes the radix-2 FFT
butterfly .
In both the cases bit-reversal is performed using a recursive
function.
However, for a sample of size 2^18 both approaches take ~ 80 seconds
for computation (strange, considering the number of function calls in
approach 2)
I would like to improve the performance of my code (especially
approach 1). Could I vectorize the computations?
Also, is there a way to force use of 'single' precision and would it
improve the performance?
Profiler results:
Time(s) Calls Line
4.66 262143 7 for lvar3 = indx:(indx+(len/2)-1)
3.42 2359296 8 l1 = lvar3; l2 = lvar3 + len/2;
76.38 2359296 9 y(l1) = y(l1)+ y(l2);
9.12 2359296 10 y(l2) = (y(l1) - 2*y(l2))*exp(-
i*2*pi*(lvar3-indx)/len);
9.89 2359296 11 end
0.62 262143 12 indx=indx+ len;
0.54 262143 13 end
thx.
|