No BSD License  

4.0 | 3 ratings | 9 Downloads (last 30 days) | File Size: 1.84 KB | File ID: #15226

rsplit

by Gerald Dalley

 

06 Jun 2007 (Updated 07 Jun 2007)

Splits a delimited string into a cell array using a regular expression.


File Information
Description

%L=RSPLIT(R,S)
% Splits a string S using the regular expression R. Meant to work like
% the PERL split function. Returns a cell array of strings. Requires
% REGEXP.
%
% This function acts a bit like a dual to REGEXP: it returns all of the
% strings that don't match the regular expression instead of those that
% do match. STRREAD is similar to RSPLIT, but it uses a fixed delimiter
% set (whitespace) when used with '%s'.
%
%Examples:
% >> rsplit('[_/]+', 'this_is___a_/_string/_//')
% ans =
% 'this' 'is' 'a' 'string'
%
% >> rsplit(',', '$GPGGA,012911.00,7111.04510,N,15841.80861,W')
% ans =
% '$GPGGA' '012911.00' '7111.04510' 'N' '15841.80861' 'W'
%
% >> rsplit(',', '')
% ans =
% {}
%
% >> rsplit(',', ',')
% ans =
% {}
%
% >> rsplit('\s+',' 47.590516667 N 122.341633333 W 1 4 ')
% ans =
% '' '47.590516667' 'N' '122.341633333' 'W' '1' '4'
%
%Implementation inspired by
% Val Schmidt, Center for Coastal and Ocean Mapping
% University of New Hampshire
%
%by Gerald Dalley (dalleyg@mit.edu), 2007
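
The behaviour documented above can be approximated in a few lines. What follows is a minimal sketch, not the submission's actual code (which is not reproduced on this page). The function name rsplit_sketch is hypothetical, and the sketch assumes a MATLAB release in which REGEXP accepts the 'split' option, which may not be available in the R2006a release this file targets. Trailing empty fields are dropped and a leading empty field is kept, as in the examples in the description.

```matlab
function parts = rsplit_sketch(pattern, str)
%RSPLIT_SKETCH  Hypothetical illustration of a Perl-like regexp split.
%   PARTS = RSPLIT_SKETCH(PATTERN, STR) returns the pieces of STR that lie
%   between matches of the regular expression PATTERN.
%   Assumption: REGEXP supports the 'split' option in this release.

% Pieces between the matches, including any empty leading/trailing pieces.
parts = regexp(str, pattern, 'split');

% Index of the last non-empty piece, if there is one.
lastNonEmpty = find(~cellfun('isempty', parts), 1, 'last');

if isempty(lastNonEmpty)
    % Empty input, or input consisting only of delimiters.
    parts = {};
else
    % Drop trailing empty pieces; a leading empty piece is kept.
    parts = parts(1:lastNonEmpty);
end
end
```

Under those assumptions, a call such as
  rsplit_sketch('[_/]+', 'this_is___a_/_string/_//')
should give the same four pieces as the first example in the description.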

Acknowledgements

The author wishes to acknowledge the following in the creation of this submission:
str2strs.m, explode_implode, Split delimiter separated strings into a matrix, split, String to Cells

MATLAB release MATLAB 7.2 (R2006a)
Comments and Ratings (7)
07 Jun 2007 Jos x@y.z

This loopy code can be replaced by the following one-liner:

strread(regexprep(S,R,' '),'%s').'

07 Jun 2007 Gerald Dalley

As it turns out,
  strread(regexprep(S,R,' '),'%s')
doesn't actually work correctly. Consider
  rsplit(',', 'abc,def ghi')
the proposed regexprep solution returns
  'abc' 'def' 'ghi'
but the correct solution is
  'abc' 'def ghi'

If we use
  tic; l = rsplit(' ', char(rand(1,1000000)*255)); toc
to compare the speed of the existing and proposed solutions, the existing solution is within 10-25% of the speed of the proposed one on MATLAB R2006a and R2007a.

As a minor issue, the proposed alternative solution eats leading delimiters unlike Perl's split (see the last example in the description).
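
The difference described in this comment can be reproduced with a short script. This is a sketch for illustration only: the variable names viaStrread and viaRegexp are hypothetical, STRREAD is used exactly as in the quoted one-liner, and the second call assumes a MATLAB release in which REGEXP accepts the 'split' option.

```matlab
% Input from the comment above: one comma delimiter, one embedded space.
S = 'abc,def ghi';
R = ',';

% Proposed one-liner: every match of R becomes a space, then STRREAD
% splits on whitespace, so the space inside 'def ghi' also splits.
viaStrread = strread(regexprep(S, R, ' '), '%s').';

% Splitting on the regular expression itself leaves 'def ghi' intact.
% Assumption: REGEXP supports the 'split' option in this release.
viaRegexp = regexp(S, R, 'split');

fprintf('one-liner pieces:    %d\n', numel(viaStrread));
fprintf('regexp split pieces: %d\n', numel(viaRegexp));
```

The one-liner only agrees with a true regular-expression split when none of the fields contain whitespace themselves.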

07 Jun 2007 Urs (us) Schwarz

the utility STR2CELL, which does the same, has been on the FEX for 4 years...
in addition, it takes numeric inputs and splits them using numeric delimiters...

http://www.mathworks.com/matlabcentral/fileexchange/loadFile.do?objectId=4247&objectType=FILE

us

08 Jun 2007 Gerald Dalley

US: this function is different than str2cell (and the acknowledged functions). Instead of taking a string of delimiter characters, it takes a regular expression.

29 May 2009 Ged Ridgway

Excellent, gravely underrated submission. Perl programmers will know how useful its split function is, and this seems to be a consistent MATLAB implementation.

Those that think strread or str2cell or similar is equivalent should try reproducing this:
  rsplit('\s+(and|&)\s+', 'one and two & three')
which returns {'one', 'two', 'three'}.

29 May 2009 Jos (10584)

@Ged Ridgway

As suggested:

R = '\s+(and|&)\s+'
S = 'one and two & three'
Answer = strread(regexprep(S,R,' '),'%s').'

does the job ...

10 Nov 2011 Anton Guimera  
