File Exchange

rsplit

version 1.0.0.0 (942 Bytes) by Gerald Dalley
Splits a delimited string into a cell array using a regular expression.

0 Downloads

Updated 07 Jun 2007

No License

%L=RSPLIT(R,S)
% Splits a string S using the regular expression R. Meant to work like
% the PERL split function. Returns a cell array of strings. Requires
% REGEXP.
%
% This function acts a bit like a dual to REGEXP: it returns all of the
% strings that don't match the regular expression instead of those that
% do match. STRREAD is similar to RSPLIT, but it uses a fixed delimiter
% set (whitespace) when used with '%s'.
%
%Examples:
% >> rsplit('[_/]+', 'this_is___a_/_string/_//')
% ans =
% 'this' 'is' 'a' 'string'
%
% >> rsplit(',', '$GPGGA,012911.00,7111.04510,N,15841.80861,W')
% ans =
% '$GPGGA' '012911.00' '7111.04510' 'N' '15841.80861' 'W'
%
% >> rsplit(',', '')
% ans =
% {}
%
% >> rsplit(',', ',')
% ans =
% {}
%
% >> rsplit('\s+',' 47.590516667 N 122.341633333 W 1 4 ')
% ans =
% '' '47.590516667' 'N' '122.341633333' 'W' '1' '4'
%
%Implementation inspired by
% Val Schmidt, Center for Coastal and Ocean Mapping
% University of New Hampshire
%
%by Gerald Dalley (dalleyg@mit.edu), 2007
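
The submission's source is not reproduced on this page, so the following is only a minimal sketch of how the behavior documented above could be obtained. It is not the author's implementation. The name rsplit_sketch is hypothetical, and the sketch assumes a MATLAB release whose REGEXP supports the 'split' option, which is newer than the R2006a release this submission was created with.

```matlab
function L = rsplit_sketch(R, S)
%RSPLIT_SKETCH Hypothetical re-creation of the documented RSPLIT behavior.
%   L = RSPLIT_SKETCH(R, S) splits the string S at every match of the
%   regular expression R and returns the pieces as a cell array.
%   Assumption: REGEXP accepts the 'split' option.

% Collect the substrings that lie between the matches of R.
parts = regexp(S, R, 'split');

% Like Perl's split, keep a leading empty field but drop trailing ones.
last = find(~cellfun('isempty', parts), 1, 'last');
if isempty(last)
    L = {};              % empty input, or nothing but delimiters
else
    L = parts(1:last);
end
end
```

Under these assumptions, rsplit_sketch('\s+(and|&)\s+', 'one and two & three') should give 'one' 'two' 'three', and rsplit_sketch(',', 'abc,def ghi') should keep 'def ghi' as a single field, because only matches of R are treated as delimiters.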

Comments and Ratings (7)

Anton Guimera

Jos (10584)

@Ged Ridgway

As suggested:

R = '\s+(and|&)\s+'
S= 'one and two & three'
Answer = strread(regexprep(S,R,' '),'%s').'

does the job ...

Ged Ridgway

Excellent, gravely underrated submission. Perl programmers will know how useful its split function is, and this seems to be a consistent MATLAB implementation.

Those that think strread or str2cell or similar is equivalent should try reproducing this:
rsplit('\s+(and|&)\s+', 'one and two & three')
which returns {'one', 'two', 'three'}.

Gerald Dalley

US: this function is different than str2cell (and the acknowledged functions). Instead of taking a string of delimiter characters, it takes a regular expression.

Urs (us) Schwarz

the utility STR2CELL, which does the same, has been on the FEX for 4 years...
in addition, it takes numeric inputs and splits them using numeric delimiters...

http://www.mathworks.com/matlabcentral/fileexchange/loadFile.do?objectId=4247&objectType=FILE

us

Gerald Dalley

As it turns out,
strread(regexprep(S,R,' '),'%s')
doesn't actually work correctly. Consider
rsplit(',', 'abc,def ghi')
the proposed regexprep solution returns
'abc' 'def' 'ghi'
but the correct solution is
'abc' 'def ghi'

If we use
tic; l = rsplit(' ', char(rand(1,1000000)*255)); toc
to compare the speed of the existing and the proposed solution, the existing solution is within 10-25% of the speed of the proposed one on Matlab 2006a and 2007a.

As a minor issue, the proposed alternative solution eats leading delimiters unlike Perl's split (see the last example in the description).

Jos x@y.z

This loopy code can be replaced by the following one-liner:

strread(regexprep(S,R,' '),'%s').'

MATLAB Release Compatibility
Created with R2006a
Compatible with any release
Platform Compatibility
Windows macOS Linux
