Thread Subject: Speed is everything - 25mill NaNs

From: Mike Mann

Date: 19 Nov, 2009 16:38:19

Message: 1 of 12

I have 6 images, each 2520 by 7040, and I need to set to NaN every cell that is NaN in any of the images.

image 1
2 3 4
NaN 3 6
1 4 5

image 2
2 3 4
2 NaN 6
1 4 5

The output for image 1 must be
2 3 4
NaN NaN 6
1 4 5
and likewise for image 2.


I have created a logical index called CheckNAN2 that is 0 wherever any of the images has a NaN and 1 otherwise, and reshaped it into a single column vector.

I need help figuring out the fastest way to replace all 0s in CheckNAN2 with NaNs while leaving the 1s as they are, because then I can just reshape it to the original dimensions and multiply each image by CheckNAN2.

I am currently trying

index = CheckNAN2==0;
CheckNAN2(index)=NaN;

on a Linux server, and it is taking forever. Is there a faster way? Thanks for your help!
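
For reference, the multiplier described above can be sketched in a few lines. This is only an illustration under stated assumptions: the 3-by-3 arrays are hypothetical stand-ins for the real images, mult is a made-up name, and CheckNAN2 is assumed to be 1 where no image has a NaN and 0 otherwise. The multiplier is built as a separate double array, so it works whether CheckNAN2 is stored as logical or as double.

```matlab
% Hypothetical 3-by-3 stand-ins for the real 2520-by-7040 images.
im1 = [2 3 4; NaN 3 6; 1 4 5];
im2 = [2 3 4; 2 NaN 6; 1 4 5];

% CheckNAN2 as described: 1 (true) where no image has a NaN, 0 otherwise.
CheckNAN2 = ~(isnan(im1) | isnan(im2));

% Multiplier (hypothetical name): 1 where the cell is valid, NaN elsewhere.
mult = ones(size(CheckNAN2));
mult(~CheckNAN2) = NaN;

% x .* 1 is x and x .* NaN is NaN, so this masks each image.
out1 = im1 .* mult;
out2 = im2 .* mult;
```

Both out1 and out2 end up with NaN at the same two cells and keep their original values everywhere else.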

From: Matt Fig

Date: 19 Nov, 2009 16:47:03

Message: 2 of 12

Just to be clear, is this what you are doing?

% Data
im1 = [1 2 3;nan 5 6;7 8 9]
im2 = [1 2 nan;4 5 6;7 8 9]
im3 = [1 2 3;4 5 6;7 nan 9]

% Engine
idx = isnan(im1) | isnan(im2) | isnan(im3);
im1(idx) = nan
im2(idx) = nan
im3(idx) = nan

From: Mike Mann

Date: 19 Nov, 2009 17:43:04

Message: 3 of 12

Yes, more or less

 % Data
 im1 = isnan( [1 2 3;nan 5 6;7 8 9])==0 % 1 = nonNAN 0=NAN
 im2 = isnan([1 2 nan;4 5 6;7 8 9])==0
 im3 = isnan([1 2 3;4 5 6;7 nan 9])==0
 
 % Engine
 idx = isnan(im1) .* isnan(im2) .* isnan(im3); % 1 = nonNAN 0=NAN

index2= idx==0
idx(index2)=NaN

im1=im1.*idx
im2=im2.*idx
im3=im3.*idx

Thanks ahead of time for any suggestions.


From: Matt Fig

Date: 19 Nov, 2009 18:14:22

Message: 4 of 12

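Your example sets every element of im1, im2 and im3 to NaN. After the ==0 step the three arrays are logical, so isnan of them is false everywhere, idx is all zeros, and multiplying by the resulting all-NaN idx wipes out everything.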

What I was trying to get at is: given the types of inputs I give, do you want to arrive at the outputs I give? If so, does it matter exactly how it is done, or do you just care about arriving at the same output I show?

From: Mike Mann

Date: 19 Nov, 2009 18:45:37

Message: 5 of 12

I think this might clarify it

If image1(i,j) is NaN, then make image2(i,j) and image3(i,j) NaN;
likewise, if image2(i,j) is NaN, then make image1(i,j) and image3(i,j) NaN.

In other words, all NaN locations should end up the same across all images. It really doesn't matter how it gets done, but faster is better.

So far I am shocked at how long this is taking:

index = CheckNAN2==0;
CheckNAN2(index)=NaN; % this is taking hours and hours on a Linux server

From: Matt Fig

Date: 19 Nov, 2009 18:57:21

Message: 6 of 12

"Mike Mann" <mllmouse100@yahoo.com> wrote in message <he43oh$pb7$1@fred.mathworks.com>...
> I think this might clarify it
>
> For any image1(i,j)=NaN then make image2(i,j)=NaN & image3(i,j)=NaN
> or image2(i,j)=NaN then make image1(i,j)=NaN & image3(i,j)=NaN
>
> In other words all NaN locations are the same across all images. It really doesn't matter how it gets done, but faster is better.
>
> So far I am shocked how long it is taking to do :
>
> index = CheckNAN2==0;
> CheckNAN2(index)=NaN; % this is taking hours and hours on a linux server



If it doesn't matter how it gets done, then just use this:

idx = isnan(im1) | isnan(im2) | isnan(im3) | isnan(im4) | isnan(im5) | isnan(im6);
im1(idx) = nan;
im2(idx) = nan;
im3(idx) = nan;
im4(idx) = nan;
im5(idx) = nan;
im6(idx) = nan;

This way you don't have to worry about setting things to zeros, ones, equality checking, CheckNAN2, reshape, multiply, etc. I think this will be about as fast as you can get.
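
If the number of images varies, the same idea can be written as a loop. This is a minimal sketch, assuming the images are kept in a cell array (the name ims and the small 3-by-3 values are hypothetical) and that all images have the same size:

```matlab
% Hypothetical small images standing in for im1..im6.
ims = {[1 2 3; NaN 5 6; 7 8 9], ...
       [1 2 NaN; 4 5 6; 7 8 9], ...
       [1 2 3; 4 5 6; 7 NaN 9]};

% Start with an all-false mask and OR in each image's NaN locations.
idx = false(size(ims{1}));
for k = 1:numel(ims)
    idx = idx | isnan(ims{k});
end

% Apply the shared mask to every image.
for k = 1:numel(ims)
    ims{k}(idx) = NaN;
end
```

This does the same work as the explicit six-term version; it just avoids typing out one isnan call and one assignment per image.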

From: Nathan

Date: 19 Nov, 2009 19:16:49

Message: 7 of 12

On Nov 19, 10:45 am, "Mike Mann" <mllmouse...@yahoo.com> wrote:
> I think this might clarify it
>
> For any  image1(i,j)=NaN  then make  image2(i,j)=NaN  & image3(i,j)=NaN
> or          image2(i,j)=NaN  then make  image1(i,j)=NaN & image3(i,j)=NaN
>
> In other words all NaN locations are the same across all images.  It really doesn't matter how it gets done, but faster is better.
>
> So far I am shocked how long it is taking to do :
>
> index = CheckNAN2==0;  
> CheckNAN2(index)=NaN;   % this is taking hours and hours on a linux server
>
> "Matt Fig" <spama...@yahoo.com> wrote in message <he41tu$2o...@fred.mathworks.com>...
> > Your example produces im1==im2==im3==NaN.
>
> > What I was trying to get at is:  given the types of inputs I give, do you want to arrive at the outputs I give?  If so, does it matter exactly how it is done, or do you just care about arriving at the same output I show?
>
>

Just to clarify: did Matt Fig's method not work quickly enough for you?
Was his output what you were expecting, rather than your erroneous
one?

-Nathan

From: Matt

Date: 19 Nov, 2009 19:24:06

Message: 8 of 12

"Mike Mann" <mllmouse100@yahoo.com> wrote in message <he3s9r$848$1@fred.mathworks.com>...
> I have a 6 images 2520 by 7040 and I need to replace all cells that have an NaN in any image.
=====

BTW, if speed is everything (as the title of your post suggests), then you might consider not using NaNs. Processing NaNs is generally slower than processing Infs, for example.

From: Matt

Date: 19 Nov, 2009 19:33:19

Message: 9 of 12

"Mike Mann" <mllmouse100@yahoo.com> wrote in message <he3s9r$848$1@fred.mathworks.com>...
> I have a 6 images 2520 by 7040 and I need to replace all cells that have an NaN in any image.
>
> image 1
> 2 3 4
> Nan 3 6
> 1 4 5
>
> image 2
> 2 3 4
> 2 NaN 6
> 1 4 5
>
> output of image 1 must be
> 2 3 4
> Nan Nan 6
> 1 4 5
> likewise with image 2
>
>
> I have created a logical index where NaN=0 in any of the images and 1 otherwise called CheckNAN2 and reshaped it to a 1 column vector.
===================

Since the proportion of NaNs is fairly high for you (about 25%), it may be cheaper not to search for the NaNs at all and just average all the images together.

result=(Image1+Image2+Image3+Image4+Image5+Image6)/6;

Any cell that contains a NaN in one image will force the average at that cell to be NaN as well.

Again, though, this would be a lot faster if you used Infs instead.
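
If the individual images still need to be kept, the same NaN-propagation idea can also be used to build the shared mask. This is a sketch only, with hypothetical small images, and it assumes no cell holds Infs of opposite sign across the images (Inf + (-Inf) is itself NaN, which would flag a cell that has no NaN in it):

```matlab
% Hypothetical small images standing in for the real ones.
im1 = [1 2 3; NaN 5 6; 7 8 9];
im2 = [1 2 NaN; 4 5 6; 7 8 9];
im3 = [1 2 3; 4 5 6; 7 NaN 9];

% NaN propagates through addition, so the sum is NaN wherever any input is.
s = im1 + im2 + im3;
idx = isnan(s);

% Apply the shared mask to each image.
im1(idx) = NaN;
im2(idx) = NaN;
im3(idx) = NaN;
```

This needs a single isnan call on one array instead of one per image, at the cost of one temporary array for the sum.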

From: Mike Mann

Date: 19 Nov, 2009 19:56:01

Message: 10 of 12

I think that this will just have to do it. Maybe I will try inf next time. Thanks for all the feedback guys!


idx = isnan(im1) | isnan(im2) | isnan(im3) | isnan(im4) | isnan(im5) | isnan(im6);
im1(idx) = nan;
im2(idx) = nan;
im3(idx) = nan;
etc

From: Mike Mann

Date: 19 Nov, 2009 21:20:08

Message: 11 of 12

One final and very important note:

MATLAB can't do everything well. I used a Unix script to replace the 0s with NaN.
MATLAB had been working on it for a day and a half, and Unix completed it in... no joke... about 1 minute!

So if you're interested, here is the code:

sed 's/whateveryouwantofind/whateveryouwanttoreplace/g' filein.txt > fileout.txt

ENJOY!!! I know I did!


From: James Tursa

Date: 19 Nov, 2009 21:33:26

Message: 12 of 12

"Mike Mann" <mllmouse100@yahoo.com> wrote in message <he47sg$ckv$1@fred.mathworks.com>...
> I think that this will just have to do it. Maybe I will try inf next time. Thanks for all the feedback guys!
>
>
> idx = isnan(im1) | isnan(im2) | isnan(im3) | isnan(im4) | isnan(im5) | isnan(im6);
> im1(idx) = nan;
> im2(idx) = nan;
> im3(idx) = nan ;
> etc

If you are really desperate for speed you might consider a mex routine to do this operation in-place. A mex routine would avoid all of the large intermediate variables and multiple accesses of the data. e.g., a bare bones routine (no argument checking) to do the above for an arbitrary number of inputs (up to 100):

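#include "mex.h"

/* Bare-bones in-place NaN propagation. No argument checking: assumes
   1 to 100 real double inputs that all have the same number of elements. */
void mexFunction(int nlhs, mxArray *plhs[], int nrhs, const mxArray *prhs[])
{
    double *pr[100];   /* pr[j] walks through the data of input j */
    mwSize i, j, k, n;
    for(j=0; j<nrhs; j++) {
        pr[j] = mxGetPr(prhs[j]);
    }
    n = mxGetNumberOfElements(prhs[0]);
    for(i=0; i<n; i++) {
        /* If any input is NaN at element i, write NaN into all of them. */
        for(j=0; j<nrhs; j++) {
            if( mxIsNaN(*pr[j]) ) {
                for(k=0; k<nrhs; k++) {
                    *pr[k] = mxGetNaN( );
                }
                break;
            }
        }
        /* Advance every data pointer to the next element. */
        for(j=0; j<nrhs; j++) {
            ++pr[j];
        }
    }
}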

I will stress again that the above code does the operation in-place, meaning it alters the input variables directly. This violates the mex rules and will cause problems if any of the input variables are sharing data with other variables, but it does work and can save you if you are really in a bind for speed and/or memory.

James Tursa
