Path: news.mathworks.com!not-for-mail
From: "Adrian " <ajr@med.unc.edu>
Newsgroups: comp.soft-sys.matlab
Subject: Re: Need algorithm help: molecule length from an image
Date: Thu, 15 May 2008 16:27:02 +0000 (UTC)
Organization: University of North Carolina
Lines: 253
Message-ID: <g0ho8m$1f7$1@fred.mathworks.com>
References: <g023qb$j0m$1@fred.mathworks.com> <g0couj$np$1@fred.mathworks.com> <g0crs0$6al$1@canopus.cc.umanitoba.ca> <g0d7jc$mtc$1@fred.mathworks.com>
Reply-To: "Adrian " <ajr@med.unc.edu>
NNTP-Posting-Host: webapp-05-blr.mathworks.com
Content-Type: text/plain; charset="ISO-8859-1"
Content-Transfer-Encoding: 8bit
X-Trace: fred.mathworks.com 1210868822 1511 172.30.248.35 (15 May 2008 16:27:02 GMT)
X-Complaints-To: news@mathworks.com
NNTP-Posting-Date: Thu, 15 May 2008 16:27:02 +0000 (UTC)
X-Newsreader: MATLAB Central Newsreader 1371482
Xref: news.mathworks.com comp.soft-sys.matlab:468625


"Ken Campbell" <campbeks@gmail.com> wrote in message
<g0d7jc$mtc$1@fred.mathworks.com>...
> roberson@ibd.nrc-cnrc.gc.ca (Walter Roberson) wrote in 
> message <g0crs0$6al$1@canopus.cc.umanitoba.ca>...
> > In article <g0couj$np$1@fred.mathworks.com>, Adrian  
> <ajr@med.unc.edu> wrote:
> > 
> > >4) Scan the image from left to right and then "walk" 
> along
> > >areas that are labeled as "1". If the program finds a 
> pixel
> > >that is labeled "1", look toward the 8 surrounding 
> pixels to
> > >find if there is another pixel labeled "1". If so, 
> continue
> > >to look for pixels labeled "1" (e.g., keep "walking" 
> along
> > >the molecule), and if possible, continue "walking" in the
> > >same direction as previously (to prevent the "walker" 
> from
> > >going back to where it came from). Then, after walking 
> along
> > >the molecule, label all of the points it walked on 
> as "0" so
> > >the scanner won't walk there again. 
> > >5) Output the length measurement as pixel and convert to 
> nm
> > >by looking at the scale.
> > 
> > Suppose you have
> > 
> > 1011
> > 0100
> > 0010
> > 
> > And you start at the top-left corner. You walk left to 
> right so you
> > encounter the top-left 1. Presumably you increment your 
> count to 1.
> > You examine the surrounding pixels and find
> > the 1 diagonally down and right. You try to continue left 
> to right, but
> > there is no 1 there so you take the path that is open by 
> going diagonally
> > down and right. However, as you were not able to continue 
> in the same
> > direction as you had been going, you label the points you 
> walked in
> > that stretch with a 0 so the scanner won't walk there 
> again, leading to
> > 
> > 0011
> > 0100
> > 0010
> > 
> > You are at the second pixel, presumably you increment 
> your count to 2.
> > Now you are headed diagonally down and right. You examine 
> the
> > surrounding pixels and find the 1 diagonally up and right 
> and
> > the 1 diagonally down and right. As per your algorithm, 
> you prefer
> > to keep going in the same direction, so you take the 
> diagonal
> > down and right. As you -were- able to continue in the 
> same direction,
> > you are on a run and so do -not- mark the pixel you came 
> from with a 0.
> > 
> > You are at the third pixel, presumably you increment your 
> count to 3.
> > You look around from that bottom pixel and find the only 
> path open
> > to be the one up and left. However, as you were not able 
> to continue
> > in the same direction as you had been going, you label 
> the points
> > you walked in the run with a 0 won't walk there again, 
> leading to
> > 
> > 0011
> > 0000
> > 0000
> > 
> > There is now an ambiguity in your algorithm: you had an 
> available
> > path up-left a moment ago, but that pixel is now 0'd, and 
> there
> > are now no pixels left in the neighbourhood that are non-
> 0. Do you
> > stop at this point with a pixel count of 3? My 
> interpretation of
> > your algorithm is that you do -not- stop, because your 
> instruction
> > was to do the search before the 0'ing.
> > 
> > So you are now back at 2nd row 2nd column. Your algorithm 
> didn't
> > say anything about not counting on backtracks, so 
> presumably you
> > increment the count to 4. You look around and find the 
> open path
> > to the up-right. That's a change of direction, so you 0 
> out the pixels
> > you had in that run, which makes no change because they 
> are already 0.
> > 
> > You are now at the 1st row 3rd column, trying to head up 
> and right.
> > Presumably you increment your counter to 5. You look 
> around and
> > find the open path to the right. That's a change of 
> direction
> > so you 0 out the pixels you had in that run, which makes 
> the matrix
> > 
> > 0001
> > 0000
> > 0000
> > 
> > You are now at the 1st row 4th column, trying to head 
> right.
> > Presumably you increment your counter to 6. You look 
> around and find
> > no open paths. Presumably you stop there.
> > 
> > Your recorded molecule length is now 6, but the longest 
> connected
> > chain was 4 -- (1,1) -> (2,2) -> (1,3) -> (1,4), so 
> arguably the
> > molecule length is 4 instead. Or, since there were two 
> diagonal
> > moves and then 1 lateral move, 2*sqrt(2)+1, which is 
> about 3.8.
> > Add 1 if you are counting from the edges instead of from 
> the centers.
> > 
> > 
> > Now, suppose you had started with
> > 
> > 101111
> > 010000
> > 001000
> > 000100
> > 
> > then when you are examining (2,2) after coming from (1,1),
> > because your algorithm always continues in the same 
> direction if it
> > can, it would continue on the downward diagonal to the 
> edge, find
> > the upward-left step from the bottom as the only 
> connection to the
> > bottom pixel, and would 0 out the entire diagonal, 
> leaving you with
> > 
> > 001111
> > 000000
> > 000000
> > 000000
> > 
> > with you positioned at (3,3). There are no connections 
> from there
> > so the algorithm would stop, unable to reach that chain 
> at the top
> > right corner. The algorithm has thus clearly not found 
> the proper length.
> > 
> > 
> > Generally speaking, the algorithm's preference for 
> continuing in
> > the same direction as possible is going to create 
> problems for you,
> > as it will be unable to recognize where the molecule 
> changes
> > direction even just due to a standard convex curve. And 
> as I showed
> > above, counting pixels visited is not an accurate length 
> measure
> > if any branches exist. Thirdly, you are not talking into 
> account
> > that diagonal moves are further than vertical or 
> horizontal moves.
> > -- 
> >   "Prevention is the daughter of intelligence."
> >                                               -- Sir 
> Walter Raleigh
> 
> 
> Adrian,
> 
> This is very roughly coded but it gives 'an' answer. You 
> might find it  helpful.
> 
> Ken
> 
> % Read in image and compress to 2D
> ifs='c:\temp\mol.png';
> im=imread(ifs);
> im=sum(im,3);
> 
> % Show original image
> figure(1);
> imagesc(im);
> 
> % Blur and display
> filt=fspecial('disk',2);
> blurred_im=imfilter(im,filt,'replicate');
> 
> figure(2);
> imagesc(blurred_im);
> colorbar;
> 
> % Threshold the blurred version
> threshold_im=zeros(size(im));
> threshold_im(blurred_im>230)=1;
> 
> figure(3);
> imagesc(threshold_im);
> 
> % And keep only the largest 'island' which is the molecule
> molecule_im=bwareaopen(threshold_im,600);
> 
> figure(4);
> clf;
> imagesc(molecule_im);
> hold on;
> 
> % Now get the perimeter
> mol_properties=regionprops(bwlabel(molecule_im), ...
>     'perimeter');
> molecule_length=(mol_properties.Perimeter)/2
> 
> % And draw it to make sure we have the right thing
> [p,l]=bwboundaries(molecule_im,4,'noholes');
> boundary=p{1};
> plot(boundary(:,2),boundary(:,1),'y');
> 

Ken,

Is there anyway to modify your code to outline more than one
molecule? It works great for one molecule, but when I start
looking at more than one, the program breaks. 

I'm planning on making a GUI for the program so the user can
change the thresholds as needed for an image. After trying
out your algorithm with various DNA images I have, the
shading is different between each one (because of the
inconsistent heavy-metal shadowing, I assume). 

Thanks,
Adrian