I've been going through a lot of my tools, trying to make things faster and reduce memory use. I know that using double-precision floating point as my default working datatype is part of the memory problem, but I had also expected single precision to be faster.
Simple tests seem to indicate that it is (times are averages over many runs):
R=imfilter(bg,fs);
R=flipdim(bg,2);
R=bg.*fg;
R=fg.^2 + 2*bg.*fg.*(1-bg);
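For reference, by "simple tests" I mean timeit comparisons along these lines; the array sizes and contents here are just placeholders, not my actual data:
bg_d = rand(2000);               % placeholder double test images
fg_d = rand(2000);
bg_s = single(bg_d);             % same data converted to single
fg_s = single(fg_d);
t_double = timeit(@() fg_d.^2 + 2*bg_d.*fg_d.*(1-bg_d));
t_single = timeit(@() fg_s.^2 + 2*bg_s.*fg_s.*(1-bg_s));
fprintf('double: %.4g s, single: %.4g s\n', t_double, t_single);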
Operations involving masking via multiplication, however, were significantly slower in single:
hi=I>0.5;
R=(1-2*(1-I).*(1-M)).*hi + (2*M.*I).*~hi;
Explicitly casting the logical mask to numeric and handling it without the NOT operator does speed things up a bit, but either way, using numeric masks is still slower than using double with logical masks.
hi=single(I>0.5);
R=(1-2*(1-I).*(1-M)).*hi + (2*M.*I).*(1-hi);
You might ask why I'm masking via multiplication in the first place: why not just use logical indexing? I used to do everything that way, but apparently overcalculating (computing both branches for every pixel and blending with the mask) is faster than a bunch of logical indexing:
hi=I>0.5; lo=~hi;
R=zeros(size(I),'single');
R(lo)=2*I(lo).*M(lo);
R(hi)=1-2*(1-M(hi)).*(1-I(hi));
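To make the comparison concrete, here is a rough sketch of one way to time the three masking approaches side by side; I and M are placeholder random single images, and the local functions just wrap the snippets above (running this as a script with local functions needs a reasonably recent release):
I = rand(2000,'single');  M = rand(2000,'single');   % placeholder images
t_mult_logical = timeit(@() blend_mult_logical(I,M));
t_mult_numeric = timeit(@() blend_mult_numeric(I,M));
t_indexed      = timeit(@() blend_indexed(I,M));
fprintf('mult/logical: %.4g s  mult/numeric: %.4g s  indexed: %.4g s\n', ...
    t_mult_logical, t_mult_numeric, t_indexed);

function R = blend_mult_logical(I,M)
    % masking via multiplication with a logical mask
    hi = I>0.5;
    R = (1-2*(1-I).*(1-M)).*hi + (2*M.*I).*~hi;
end
function R = blend_mult_numeric(I,M)
    % masking via multiplication with a numeric (single) mask
    hi = single(I>0.5);
    R = (1-2*(1-I).*(1-M)).*hi + (2*M.*I).*(1-hi);
end
function R = blend_indexed(I,M)
    % masking via logical indexing
    hi = I>0.5; lo = ~hi;
    R = zeros(size(I),'single');
    R(lo) = 2*I(lo).*M(lo);
    R(hi) = 1-2*(1-M(hi)).*(1-I(hi));
end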
Am I misguided to expect reliable speed gains from single precision for a wide range of operations across different machines (this code will be used by others)? Comments like this make me think so. And if I were to pursue this flexibility for memory savings alone, is there a better approach to masked operations than what I've described?