Thread Subject: regular expression

Subject: regular expression

From: Arthur Zheng

Date: 13 Jul, 2009 13:44:03

Message: 1 of 2

I've got a file that looks like the one below. How can I match the first occurrence of the pattern \d+\.{2}\d+
after a given string, say "AAur_0002 dnaN", using a regular expression?
For AAur_0002 dnaN, the right match would be 1995..3119. Thanks.

The file looks like:
AAur_0001 dnaA 1 1..1419 1419
AAur_0002 dnaN 1995 1995..3119 1125
AAur_0003 3224 3224..4108 885
AAur_0004 recF 4132 4132..5331 1200
AAur_0005 5318 5318..5872 555
AAur_0006 gyrB 6210 6210..8288 2079
AAur_0007 gyrA 8351 8351..11020 2670
AAur_0008 11017 11017..11736 720
AAur_0009 11850 11850..11923 74
AAur_0010 11965 11965..12090 126
AAur_0011 12161 12161..12236 76
AAur_0012 12349 12349..13290 942

Subject: regular expression

From: Rune Allnor

Date: 13 Jul, 2009 14:02:17

Message: 2 of 2

On 13 Jul, 15:44, "Arthur Zheng" <hzhe...@gatech.edu> wrote:
> I've got a file that looks like the one below. How can I match the first occurrence of the pattern \d+\.{2}\d+
> after a given string, say "AAur_0002 dnaN", using a regular expression?
> For AAur_0002 dnaN, the right match would be 1995..3119. Thanks.
>
> The file looks like:
> AAur_0001       dnaA    1       1..1419 1419    
> AAur_0002       dnaN    1995    1995..3119      1125    
> AAur_0003               3224    3224..4108      885    
> AAur_0004       recF    4132    4132..5331      1200    
> AAur_0005               5318    5318..5872      555    
> AAur_0006       gyrB    6210    6210..8288      2079    
> AAur_0007       gyrA    8351    8351..11020     2670    
> AAur_0008               11017   11017..11736    720    
> AAur_0009               11850   11850..11923    74      
> AAur_0010               11965   11965..12090    126    
> AAur_0011               12161   12161..12236    76      
> AAur_0012               12349   12349..13290    942

The problem is to screen each line for whatever flag
marks it as the one you want. I *assume* the string
'dnaN' is the flag you are looking for, so I would wrap
your regex above in a test for that flag (s holds one
line of the file):

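s = 'AAur_0002 dnaN 1995 1995..3119 1125';  % one line of the file
flagExp = '\s+dnaN\s+';                     % flag that marks the wanted line
if ~isempty(regexpi(s, flagExp))            % case-insensitive test for the flag
    rangeExp = '\d+\.{2}\d+';               % digits, two dots, digits
    [i1, i2] = regexp(s, rangeExp, 'once'); % start/end index of the first match
    s(i1:i2)                                % displays 1995..3119
end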
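
If you would rather get the range in a single call, one possible variation is to put the flag and the range into one expression and capture the range as a token. This is only a sketch under a few assumptions: the variable names (txt, id, gene, pat, result) are made up for illustration, the sample text stands in for the real file contents, and [^\n]*? is used so that the search for the range cannot run on to the next line.

```matlab
% Hypothetical sample data standing in for the file contents.
txt = sprintf('%s\n', ...
    'AAur_0001 dnaA 1 1..1419 1419', ...
    'AAur_0002 dnaN 1995 1995..3119 1125', ...
    'AAur_0003 3224 3224..4108 885');

id   = 'AAur_0002';   % first column of the wanted line (assumed name)
gene = 'dnaN';        % second column of the wanted line (assumed name)

% id, whitespace, gene, then the shortest run of non-newline characters
% that is followed by digits..digits; the parentheses capture the range.
pat = [id '\s+' gene '[^\n]*?(\d+\.{2}\d+)'];

tok = regexp(txt, pat, 'tokens', 'once');   % cell array holding the token
if isempty(tok)
    result = '';      % the flag or the range was not found
else
    result = tok{1};
end
disp(result)
```

With the sample text above, result should come out as 1995..3119. For the real file you would fill txt from the file instead of using sprintf.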

Rune
