Having regular expression stop at the first instance of a string, and not subsequent ones

Asked by Cheng-Ming on 18 Jan 2013

Hello!

I'm trying to find a solution to my problem, and I am not sure if there is one.

I have some Elmer code that I am trying to parse using regular expressions in MATLAB. It looks something like this:

...

Body 2 Target Bodies(1) = 2 Name = "Body 2" Equation = 1 Material = 3 End

Body 3 Target Bodies(1) = 3 Name = "Body 3" Equation = 2 Material = 1 Body Force = 1 End ...

My expression is: Body\s+N.*?(Body\sForce\s*=\s*)(\d+?)\s.*?End

where N is 2 or 3.

I am trying to pull the Body Force from the Body if there is one, and have the token return an empty string if there isn't one.

When I apply this with N = 3, it works fine. However, when N = 2, the regexp gives back all of the text, covering both Body 2 and Body 3. Is there a way to specifically tell it to stop at the first End it sees? Thank you very much!

Cheng
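
A minimal sketch that reproduces the behaviour described above, assuming both Body sections sit in one character vector (the variable names are illustrative, not from the original post). The lazy `.*?` only keeps each expansion as short as possible; it does not forbid crossing an `End`, so with N = 2 the match appears to keep growing until it reaches the `Body Force` entry that belongs to Body 3.

```matlab
% Sample text from the question, joined into a single character vector.
str = ['Body 2 Target Bodies(1) = 2 Name = "Body 2" Equation = 1 Material = 3 End ' ...
       'Body 3 Target Bodies(1) = 3 Name = "Body 3" Equation = 2 Material = 1 Body Force = 1 End'];

% The expression from the question with N = 2.
pat = 'Body\s+2.*?(Body\sForce\s*=\s*)(\d+?)\s.*?End';

% 'match' with 'once' returns the text of the first match as a char vector.
m = regexp(str, pat, 'match', 'once');

% The match starts at "Body 2" and runs through the End of Body 3.
spansBoth = ~isempty(strfind(m, 'Body 3'));
```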

0 Comments

2 Answers

Answer by Walter Roberson on 18 Jan 2013

Are both lines in the same string, or is it one line at a time? If it is one line at a time, then use regexp with '(?<=Body\sForce\s+=\s+)(\d+)'
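
As a rough sketch of the one-line-at-a-time case, assuming each Body section has already been placed in its own cell (the `lines` and `forces` names are hypothetical), the lookbehind expression can be applied to every section separately. A section without a Body Force then yields an empty result.

```matlab
% Each Body section held as a separate char vector (an assumption of this sketch).
lines = { 'Body 2 Target Bodies(1) = 2 Name = "Body 2" Equation = 1 Material = 3 End', ...
          'Body 3 Target Bodies(1) = 3 Name = "Body 3" Equation = 2 Material = 1 Body Force = 1 End' };

% Lookbehind: match digits that are preceded by "Body Force = ".
pat = '(?<=Body\sForce\s+=\s+)(\d+)';

% Apply the expression to every section; 'once' returns a char vector,
% which is empty when the section has no Body Force entry.
forces = cellfun(@(s) regexp(s, pat, 'match', 'once'), lines, ...
                 'UniformOutput', false);
```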

3 Comments

Cheng-Ming on 19 Jan 2013

They are in the same string. I could split them and then parse again, which would not be difficult, but I was wondering if there was a more elegant solution.

Walter Roberson on 19 Jan 2013

Use the lazy quantifier .*? instead of .*, which is the greedy quantifier.
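
To illustrate the difference between the two quantifiers, here is a small sketch on the sample text from the question (variable names are illustrative). When `End` is the next required piece of the expression, the greedy form runs to the last `End` in the string, while the lazy form stops at the first one.

```matlab
% First Body section, kept separately so the result can be compared with it.
block2 = 'Body 2 Target Bodies(1) = 2 Name = "Body 2" Equation = 1 Material = 3 End';
block3 = 'Body 3 Target Bodies(1) = 3 Name = "Body 3" Equation = 2 Material = 1 Body Force = 1 End';
str = [block2 ' ' block3];

% Greedy: .* consumes as much as possible, so the match ends at the last End.
greedyMatch = regexp(str, 'Body\s+2.*End', 'match', 'once');

% Lazy: .*? consumes as little as possible, so the match ends at the first End.
lazyMatch = regexp(str, 'Body\s+2.*?End', 'match', 'once');
```

Note that this holds only while nothing else is required between `Body 2` and `End`.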

Walter Roberson on 20 Jan 2013

Also, you can set the 'dotexceptnewline' regexp() option so that the .* will not cross linefeeds. In general when you start matching within individual lines you often end up also wanting the 'lineanchors' regexp() option, so that you can use ^ and $ to match the beginning and end of individual lines.
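
A hedged sketch of how these two options might be combined, assuming the Body sections are separated by a linefeed (that layout, and the variable names, are assumptions of this example). With 'dotexceptnewline' the `.*?` cannot leave the line it started on, and with 'lineanchors' the `^` and `$` anchor to individual lines.

```matlab
% Two Body sections separated by a linefeed (assumed layout).
line1 = 'Body 2 Target Bodies(1) = 2 Name = "Body 2" Equation = 1 Material = 3 End';
line2 = 'Body 3 Target Bodies(1) = 3 Name = "Body 3" Equation = 2 Material = 1 Body Force = 1 End';
txt = sprintf('%s\n%s', line1, line2);

% Expression template; %d is replaced by the body number.
fmt = '^Body\\s+%d\\s.*?Body\\sForce\\s*=\\s*(\\d+).*?End$';

% Body 2 has no Body Force on its own line, so nothing matches.
tok2 = regexp(txt, sprintf(fmt, 2), 'tokens', 'once', ...
              'dotexceptnewline', 'lineanchors');

% Body 3 does, so the token holds its value.
tok3 = regexp(txt, sprintf(fmt, 3), 'tokens', 'once', ...
              'dotexceptnewline', 'lineanchors');
```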

Answer by Cedric Wannaz on 19 Jan 2013
Edited by Cedric Wannaz on 20 Jan 2013
 >> doc regexp

Under Command options

 'once'  : Return only the first match found.
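
A short sketch of what 'once' changes, using the sample text from the question (variable names are illustrative). It limits how many matches are returned and switches the output from a cell array to a char vector; it does not shorten the text covered by an individual match.

```matlab
str = ['Body 2 Target Bodies(1) = 2 Name = "Body 2" Equation = 1 Material = 3 End ' ...
       'Body 3 Target Bodies(1) = 3 Name = "Body 3" Equation = 2 Material = 1 Body Force = 1 End'];

% Without 'once': a cell array holding every match, including the
% "Body 2" and "Body 3" that appear inside the quoted names.
allMatches = regexp(str, 'Body\s+\d+', 'match');

% With 'once': only the first match, returned as a char vector.
firstMatch = regexp(str, 'Body\s+\d+', 'match', 'once');

% 'once' does not limit the extent of a greedy match.
greedyOnce = regexp(str, 'Body\s+2.*End', 'match', 'once');
```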

EDIT (after Walter's comment):

 >> str = 'Body 2 Target Bodies(1) = 2 Name = "Body 2" Equation = 1 Material = 3 End Body 3 Target Bodies(1) = 3 Name = "Body 3" Equation = 2 Material = 1 Body Force = 1 End Body 44 Target Bodies(1) = 2 Name = "Body 44" Equation = 1 Material = 3 End Body 345 Target Bodies(12) = 36 Name = "Body 345" Equation = 22 Material = 123456 Body Force = 987 End' ; 
 >> fmt = '(?<=Body %d(((?!End).)+)e \\= )\\d*' ;
 >> regexp(str, sprintf( fmt, 28), 'match', 'once' )
 ans = ''
 >> regexp(str, sprintf( fmt, 2), 'match', 'once' )
 ans = ''
 >> regexp(str, sprintf( fmt, 3), 'match', 'once' )
 ans = '1'
 >> regexp(str, sprintf( fmt, 44), 'match', 'once' )
 ans = ''
 >> regexp(str, sprintf( fmt, 345), 'match', 'once' )
 ans = '987'

I'll get some aspirin now ;)

Cedric
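
For comparison, a possible token-based variant of the same idea is sketched below. It is not the expression from the answer above: it assumes that a `\s` after the body number is enough to keep `Body 3` from also matching `Body 345`, and the `ids`, `forces`, and `fmt` names are hypothetical. The `(?:(?!End).)*?` part consumes one character at a time but refuses to step onto an `End`, so the search for `Body Force` stays inside a single section.

```matlab
str = ['Body 2 Target Bodies(1) = 2 Name = "Body 2" Equation = 1 Material = 3 End ' ...
       'Body 3 Target Bodies(1) = 3 Name = "Body 3" Equation = 2 Material = 1 Body Force = 1 End ' ...
       'Body 44 Target Bodies(1) = 2 Name = "Body 44" Equation = 1 Material = 3 End ' ...
       'Body 345 Target Bodies(12) = 36 Name = "Body 345" Equation = 22 Material = 123456 Body Force = 987 End'];

% Expression template; %d is replaced by the body number.
fmt = 'Body\\s+%d\\s(?:(?!End).)*?Body\\sForce\\s*=\\s*(\\d+)';

ids = [28 2 3 44 345];
forces = cell(size(ids));
for k = 1:numel(ids)
    tok = regexp(str, sprintf(fmt, ids(k)), 'tokens', 'once');
    if isempty(tok)
        forces{k} = '';        % no such body, or no Body Force before its End
    else
        forces{k} = tok{1};    % digits captured after "Body Force ="
    end
end
```

With the sample string, `forces` should be empty for bodies 28, 2, and 44 and hold the Body Force value for bodies 3 and 345.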

2 Comments

Walter Roberson on 19 Jan 2013

That will not solve the problem here: regular expressions are "greedy" by default, so the .* will go as far as possible.

Cedric Wannaz on 20 Jan 2013

Ah yes, I'll update my answer.
