I've been going through a lot of my tools and trying to make things faster and reduce memory use. I know that using double-precision FP as my default working datatype is part of the memory problem, but I had expected that using single precision would be faster as well.
Simple tests seem to indicate that it would be (these times are all averages of many tests):
R=fg.^2 + 2*bg.*fg.*(1-bg);
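For reference, a minimal sketch of how I timed this kind of expression, using timeit; the array size and random contents here are assumptions, standing in for the real image data:

% assumed test data; fg and bg stand in for the actual images
fg = rand(2000); bg = rand(2000);      % double by default
fgs = single(fg); bgs = single(bg);
% time the same expression in each class
td = timeit(@() fg.^2 + 2*bg.*fg.*(1-bg));
ts = timeit(@() fgs.^2 + 2*bgs.*fgs.*(1-bgs));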
but operations involving masking via multiplication were significantly slower in single:
R=(1-2*(1-I).*(1-M)).*hi + (2*M.*I).*~hi;
Explicitly casting the logical mask to numeric and avoiding the NOT operator does speed things up a bit, but either numeric-mask variant is still slower than using double with logical masks.
R=(1-2*(1-I).*(1-M)).*hi + (2*M.*I).*(1-hi);
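To make the comparison concrete, this is roughly how I'd time the three variants side by side (double with a logical mask, single with a logical mask, single with a numeric mask); the mask construction and array sizes here are assumptions:

% assumed test data and mask
I = rand(2000); M = rand(2000);
hi = I > 0.5;                          % logical mask (assumed)
Is = single(I); Ms = single(M); his = single(hi);
t1 = timeit(@() (1-2*(1-I).*(1-M)).*hi + (2*M.*I).*~hi);          % double, logical mask
t2 = timeit(@() (1-2*(1-Is).*(1-Ms)).*hi + (2*Ms.*Is).*~hi);      % single, logical mask
t3 = timeit(@() (1-2*(1-Is).*(1-Ms)).*his + (2*Ms.*Is).*(1-his)); % single, numeric mask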
You might ask why I'm masking via multiplication in the first place. Why not just use logical indexing? I used to do everything that way, but apparently overcalculation is often faster than a bunch of logical indexing.
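For comparison, the logical-indexing version of the same blend that I moved away from would look something like this:

% logical-indexing alternative: compute each branch only where its
% mask applies, instead of overcalculating everywhere
R = zeros(size(I), 'like', I);         % preallocate, matching class
R(hi)  = 1 - 2*(1-I(hi)).*(1-M(hi));   % where hi is true
R(~hi) = 2*M(~hi).*I(~hi);             % where hi is false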
Am I misguided to expect reliable speed gains from single precision across a wide range of operations and machines (this code will be used by others)? Comments like this make me think so.
Also, if I were to pursue this flexibility for the memory savings alone, is there a better approach to masked operations than what I've described?