On 13 Jul, 15:44, "Arthur Zheng" <hzhe...@gatech.edu> wrote:
> I've got a file that looks as follows. How do I match the first pattern = \d+\.{2}\d+
> after a given string, say, "AAur_0002 dnaN", using a regular expression?
> For AAur_0002 dnaN, the right match should be 1995..3119. thanks.
>
> The file looks like:
> AAur_0001 dnaA 1 1..1419 1419
> AAur_0002 dnaN 1995 1995..3119 1125
> AAur_0003 3224 3224..4108 885
> AAur_0004 recF 4132 4132..5331 1200
> AAur_0005 5318 5318..5872 555
> AAur_0006 gyrB 6210 6210..8288 2079
> AAur_0007 gyrA 8351 8351..11020 2670
> AAur_0008 11017 11017..11736 720
> AAur_0009 11850 11850..11923 74
> AAur_0010 11965 11965..12090 126
> AAur_0011 12161 12161..12236 76
> AAur_0012 12349 12349..13290 942
The real problem is to screen the lines for whatever code
marks a line as the one you want. I *assume* the string
'dnaN' is the flag you are looking for, so I would wrap
your regex in a test for that flag, applied to one line s
of the file at a time:
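% s holds one line of the file, e.g. 'AAur_0002 dnaN 1995 1995..3119 1125'
rexp1 = '\s+dnaN\s+';                 % flag that identifies the wanted line
if ~isempty(regexpi(s,rexp1))         % note: regexpi ignores case
    rexp = '\d+\.{2}\d+';             % digits, two literal dots, digits
    [i1,i2] = regexp(s,rexp,'once');  % start/end of the first match only
    s(i1:i2)
end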
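
For completeness, here is a rough sketch of how the same test could be run over several lines, taking only the text that follows the key string. The sample lines are copied from the question. The names fileLines, key, range and result are just placeholders of mine, and I assume the key is written as 'AAur_0002' followed by whitespace and 'dnaN'. In practice the lines would come from the file, for instance via fgetl in a loop.

```matlab
% Sample lines copied from the question (stand-in for reading the file).
fileLines = {'AAur_0001 dnaA 1 1..1419 1419', ...
             'AAur_0002 dnaN 1995 1995..3119 1125', ...
             'AAur_0003 3224 3224..4108 885'};

key   = 'AAur_0002\s+dnaN';   % assumed form of the string to search after
range = '\d+\.{2}\d+';        % digits, two literal dots, digits

result = '';                  % stays empty if the key is never found
for k = 1:numel(fileLines)
    s = fileLines{k};
    % End index of the key on this line (empty if the key is absent).
    [~, kEnd] = regexp(s, key, 'once');
    if ~isempty(kEnd)
        % First range pattern in the text that follows the key.
        result = regexp(s(kEnd+1:end), range, 'match', 'once');
        break
    end
end
disp(result)
```

Searching only s(kEnd+1:end) is what makes this "the first pattern after the given string"; a match that happened to sit before the key on the same line would be skipped.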
Rune