Split a long word into parts with different lengths

problem solved
  3 Comments
Cedric on 6 Jun 2014
I assume the string does not always start with the same characters; otherwise you would not need to parse it. We need more examples, as well as a clear description of the criteria that define the sub-parts.
dpb on 6 Jun 2014
Specifically, what is the rule for breaking the first set of six letters into two groups of three if that is not a consistent string? If it is consistent, then plain substring referencing with 1:3 and 4:6 takes care of it.
The rest looks like a matter of finding each capital that follows a lowercase letter.
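A minimal sketch of the rule described above, assuming the first field is always six letters that split as 1:3 and 4:6, and that the fields are separated by dots. The variable names (s, fields, head, middle, parts) are only illustrative, and the sample string is the one used in the accepted answer in this thread:

```matlab
s = 'NCCINS.AttitudePositionINS1.Pitch';   % sample string from this thread

% Split on the literal dots: {'NCCINS', 'AttitudePositionINS1', 'Pitch'}
fields = strsplit(s, '.');

% Fixed-width split of the first six letters (assumes this layout never changes).
head = {fields{1}(1:3), fields{1}(4:6)};

% Insert a space wherever a lowercase letter is followed by a capital,
% then split on that space.
middle = strsplit(regexprep(fields{2}, '([a-z])([A-Z])', '$1 $2'), ' ');

% Combine everything into one 1-by-6 cell array:
% {'NCC', 'INS', 'Attitude', 'Position', 'INS1', 'Pitch'}
parts = [head, middle, fields(3)];
```

Note that this would not separate names such as 'GPSTime', where two words meet without a lowercase-to-capital transition, so it still depends on the rule the original poster confirms.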


Accepted Answer

Cedric on 6 Jun 2014
Edited: Cedric on 6 Jun 2014
Here is a solution based on the comments above. We will still need you to answer those comments so that we can refine the approach. Assuming
data = {'NCCINS.AttitudePositionINS1.Pitch', 'NCCINS.AttitudePositionINS1.Roll'} ;
we perform a single call to REGEXP on a comma-separated merger of the original strings, as follows:
pattern = '(\w{3})(\w{3}).(.[^A-Z]+)(.[^A-Z]+)([^\.]+).([^,]+)' ;
tokens = regexp( sprintf('%s,', data{:}), pattern, 'tokens' ) ;
With that, we get
>> tokens{1}
ans =
'NCC' 'INS' 'Attitude' 'Position' 'INS1' 'Pitch'
>> tokens{2}
ans =
'NCC' 'INS' 'Attitude' 'Position' 'INS1' 'Roll'
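If a rectangular layout is more convenient, the per-string token lists can be stacked into a single cell array. This is only a sketch, assuming every string matches the pattern (a string that does not match is silently skipped by REGEXP, so the rows would no longer line up with data); the name parts is illustrative:

```matlab
data = {'NCCINS.AttitudePositionINS1.Pitch', 'NCCINS.AttitudePositionINS1.Roll'};
pattern = '(\w{3})(\w{3}).(.[^A-Z]+)(.[^A-Z]+)([^\.]+).([^,]+)';

% One REGEXP call on the comma-separated merger, as in the answer above.
tokens = regexp(sprintf('%s,', data{:}), pattern, 'tokens');

% Each tokens{k} is a 1-by-6 cell, so stacking them gives one row per string.
parts = vertcat(tokens{:});    % 2-by-6 cell array

% Column 6 then holds the last sub-part of every string.
lastPart = parts(:, 6);        % {'Pitch'; 'Roll'}
```

With this layout, parts(k, :) gives all sub-parts of the k-th string and parts(:, j) gives the j-th sub-part across all strings.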
  1 Comment
Cedric on 6 Jun 2014
Edited: Cedric on 6 Jun 2014
Can you attach a few examples, or one package, or even all 70? The more I have, the more I can refine the approach. The approach above is based on regular expressions and pattern matching. I cannot explain it in general terms because there would be far too much to cover. However, once we have a pattern that suits your case specifically, I can explain it to you.

