http://www.mathworks.com/matlabcentral/newsreader/view_thread/255491
MATLAB Central Newsreader - manipulating strings
© 1994-2015 by MathWorks, Inc.

Tue, 07 Jul 2009 11:29:50 +0000
manipulating strings
http://www.mathworks.com/matlabcentral/newsreader/view_thread/255491#663184
arun
Hi,<br>
<br>
suppose I have a string A whose size is 1*10^7. I would now like to<br>
remove certain characters from the string. I tried strfind and regexprep<br>
as follows<br>
<br>
A(strfind(A, ',')) = ''; % delete every comma<br>
and then I repeat this for each digit from 0 to 9 and for the space character.<br>
<br>
A more efficient alternative, I hoped, would be<br>
A = regexprep(A, '[0-9, ]', '');<br>
but the first approach takes forever because the vector is long, and the<br>
second one strangely gives me an "out of memory" error...<br>
<br>
<br>
Are there any ways to speed this up?<br>
<br>
thank you very much,<br>
arun.

Tue, 07 Jul 2009 12:28:01 +0000
Re: manipulating strings
http://www.mathworks.com/matlabcentral/newsreader/view_thread/255491#663200
nor ki
Hi Arun,<br>
Since you are only looking for single characters, you could build a logical lookup table, indexed by character code, which contains true for each character you want to keep and false for each character that should be removed.<br>
Call this one just lut.<br>
<br>
Then you make a logical array marking the positions of your desired characters:<br>
<br>
idx = lut(A);<br>
<br>
and use it to keep only those characters in A:<br>
<br>
A = A(idx);<br>
<br>
or in short:<br>
<br>
A = A(lut(A));<br>
<br>
hth<br>
kinor

Tue, 07 Jul 2009 17:30:14 +0000
Re: manipulating strings
http://www.mathworks.com/matlabcentral/newsreader/view_thread/255491#663307
arun
Hi Kinor,<br>
<br>
Thank you for the suggestion. I just have some trouble understanding<br>
how to construct this lut. Is it like a map? Because I have to know<br>
which character gets true and which character gets false...<br>
<br>
Suppose A = '1,1600,A,G,G,G,A,A,A,G,A,A,G';<br>
<br>
and here I don't need the commas or the numbers 1 and 1600; that is,<br>
the desired string is A = 'AGGGAAAGAAG'.<br>
If I don't have a map, then my lookup table should consist of values<br>
for all entries, right? I don't think that is what you suggested... I<br>
mean,<br>
<br>
lut = [0,0,0,0,0,0,0,1,0,1,0,1...] and then use A = lut(A)...<br>
Is this what you suggested?<br>
thank you very much,<br>
arun.

Tue, 07 Jul 2009 17:48:57 +0000
Re: manipulating strings
http://www.mathworks.com/matlabcentral/newsreader/view_thread/255491#663314
arun
Hi,<br>
<br>
I tried it like this...<br>
<br>
lut = 'AGCT';<br>
% str is a 1-by-100-million string.<br>
<br>
str = str(ismember(str,lut));<br>
<br>
This seems to work pretty fast for 10^7 but not for 10^8 or 10^9, as<br>
it gives an out of memory error. But I guess this should be pretty fast<br>
when parsing in a for loop and taking 10^7 entries at a time...<br>
<br>
Thank you... I would appreciate it if someone could let me know of<br>
better methods available.<br>
<br>
thanks,<br>
arun.

Tue, 07 Jul 2009 18:27:54 +0000
Re: manipulating strings
http://www.mathworks.com/matlabcentral/newsreader/view_thread/255491#663329
Rune Allnor
On 7 Jul, 13:29, arun <aragorn1...@gmail.com> wrote:<br>
> Hi,<br>
><br>
> suppose I have a string A whose size is 1*10^7. I would now like to<br>
> remove certain characters in the string. I tried strfind and regexprep<br>
> as follows<br>
><br>
> A(strfind(A, ',')) = ''; %replace entries with a comma with nothing<br>
> and then i repeat this for all numbers from 0 to 9 and for "space".<br>
><br>
> Alternative efficient way i hoped would be,<br>
> A = regexprep(A, '[0-9, ]', '');<br>
> but the first expression takes for ever as the vector is long<br>
<br>
Every time this finds a match, the vector is shortened by one<br>
character, meaning you need to allocate space and reshuffle the<br>
contents every time one finds a hit. This would probably be<br>
better:<br>
<br>
rexp = '[^0-9, ]'; % match every character that should be kept<br>
idx = regexp(A,rexp); % start index of each match<br>
A = A(idx);<br>
<br>
Rune
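
The same idea (find the characters to keep in one pass, then index once) can also be sketched without regular expressions, using plain comparisons. This is only an illustration: the short sample string is the one from arun's earlier post, and the names keep and B are assumptions, not something from the thread.

```matlab
% Short sample; a real input would be far longer.
A = '1,1600,A,G,G,G,A,A,A,G,A,A,G';

% One pass: mark every character that is NOT a comma, a space, or a digit.
keep = ~(A == ',' | A == ' ' | (A >= '0' & A <= '9'));

% One indexing operation builds the cleaned string.
B = A(keep);
```

The mask is a logical array of the same length as A, so it costs one byte per character, whereas a list of match positions is stored as doubles.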

Wed, 08 Jul 2009 07:07:01 +0000
Re: manipulating strings
http://www.mathworks.com/matlabcentral/newsreader/view_thread/255491#663458
nor ki
Hi Arun,<br>
<br>
str = str(ismember(str,lut)); <br>
applies ismember to the whole variable str. <br>
<br>
Instead, apply it once to the 256 possible character codes and index the result with str:<br>
<br>
lut = ~ismember(1:256,removechars);<br>
str = str(lut(str));<br>
<br>
where removechars contains the characters you want removed.<br>
<br>
For 10^8 you really have to use a loop; maybe Rune's idea works better then.<br>
<br>
kinor
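
A small worked sketch of the lookup-table idea, using the sample string from arun's question. It assumes every character code in the input lies between 1 and 256 (a NUL character, code 0, would be an invalid index); the variable name clean is hypothetical.

```matlab
% Characters to drop: digits, comma, and space (as listed in the thread).
removechars = ['0':'9', ', '];

% lut(k) is true when the character with code k should be kept.
lut = true(1, 256);
lut(double(removechars)) = false;

str = '1,1600,A,G,G,G,A,A,A,G,A,A,G';

% Each character code indexes the table, giving a logical mask
% the same size as str; the mask then selects the kept characters.
clean = str(lut(double(str)));
```

The table has only 256 entries however long str is, so building it costs almost nothing; the work per character is a single table lookup.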

Wed, 08 Jul 2009 07:49:01 +0000
Re: manipulating strings
http://www.mathworks.com/matlabcentral/newsreader/view_thread/255491#663472
us
one of the solutions<br>
- use ISMEMBC rather than ISMEMBER<br>
<br>
clear ix v; % <- save old stuff<br>
tmpl='0':'z';<br>
v=repmat(tmpl(randperm(numel(tmpl))),1,600000);<br>
size(v,2)<br>
% ans = 45,000,000<br>
tmpl=sort(['0':'9',',']); % <- must be SORTed!<br>
tic;<br>
ix=ismembc(v,tmpl);<br>
toc<br>
%{<br>
Elapsed time is 0.558719 seconds.<br>
% wintel system: ic2/2*2.6ghz/2mb/winxp.sp3/r2009a<br>
%}<br>
<br>
us

Wed, 08 Jul 2009 09:06:01 +0000
Re: manipulating strings
http://www.mathworks.com/matlabcentral/newsreader/view_thread/255491#663490
nor ki
Hi Us,<br>
<br>
where did you find ismembc? is there a place to find undocumented functions?<br>
<br>
kinor<br>
<br>
tmpl='0':'z';<br>
strvar=repmat(tmpl(randperm(numel(tmpl))),1,1e6);<br>
removechars=sort(['0':'9',',']); % <- must be SORTed!<br>
<br>
tic<br>
lut1 = ~ismember(1:256,removechars);<br>
res1 = strvar(lut1(strvar));<br>
toc<br>
<br>
tic<br>
lut2 = ~ismembc(strvar, removechars);<br>
res2 = strvar(lut2);<br>
toc<br>
<br>
isequal(res1, res2)<br>
<br>
Elapsed time is 1.523525 seconds.<br>
Elapsed time is 1.862163 seconds.<br>
<br>
ans =<br>
<br>
1

Wed, 08 Jul 2009 09:21:01 +0000
Re: manipulating strings
http://www.mathworks.com/matlabcentral/newsreader/view_thread/255491#663499
us
"nor ki"<br>
> where did you find ismembc? is there a place to find undocumented functions?<br>
<br>
it is not an undocumented function...<br>
rather, look at ISMEMBER<br>
<br>
edit ismember;<br>
% and you'll find this at line #121 - 127<br>
%{<br>
% Two C-Helper Functions are used in the code below:<br>
<br>
% ISMEMBC - S must be sorted - Returns logical vector indicating which <br>
% elements of A occur in S<br>
% ISMEMBC2 - S must be sorted - Returns a vector of the locations of <br>
% the elements of A occurring in S. If multiple instances occur,<br>
% the last occurrence is returned <br>
%}<br>
% then, being an investigative person, you'll immediately do this<br>
which ismembc;<br>
% MLROOT\toolbox\matlab\ops\ismembc.mexw32 % <- a MEX...<br>
% and play with it in the command window (timing and so on)<br>
<br>
it's often worthwhile to look at ML stock functions to<br>
- see how TMW does things (not always optimized...)<br>
- look for hidden gems...<br>
<br>
us

Wed, 08 Jul 2009 09:29:02 +0000
Re: manipulating strings
http://www.mathworks.com/matlabcentral/newsreader/view_thread/255491#663503
nor ki
Hi US,<br>
<br>
thank you for the hint<br>
<br>
kinor

Wed, 08 Jul 2009 10:38:09 +0000
Re: manipulating strings
http://www.mathworks.com/matlabcentral/newsreader/view_thread/255491#663517
arun
nor ki,<br>
<br>
thank you for your suggestions. They work very well. Now my next<br>
formidable task is to reshape this vector to a 192*240605 matrix. (My<br>
actual task is to parse a 270 MB file line by line and<br>
do these operations. But I found another topic in which UWE has shown<br>
the fastest way to read a whole file into a variable using fread, and<br>
now I am trying to remove the unwanted entries and then shape them<br>
into the desired matrix. The old line-by-line method takes about 30-45<br>
mins on this old computer. So far, before the reshape step, it takes<br>
1.5 mins without an out of memory error. Let me see!)<br>
<br>
Uwe, yours also works like a charm. I personally don't see a difference<br>
between ismember and ismembc, at least on this machine! :)<br>
<br>
Rune, regexp and regexprep both give the "out of memory" error when<br>
used on such long strings on my slowwww computer... (at my work).<br>
I guess it will be faster and able to run on my new laptop...<br>
still waiting ........<br>
<br>
<br>
thanks again guys,<br>
best,<br>
arun.

Wed, 08 Jul 2009 11:01:04 +0000
Re: manipulating strings
http://www.mathworks.com/matlabcentral/newsreader/view_thread/255491#663523
Rune Allnor
On 8 Jul, 12:38, arun <aragorn1...@gmail.com> wrote:<br>
<br>
> Rune, regexp and regexprep both give the "out of memory" error when<br>
> used on such long strings on my slowwww computer... (at my work).<br>
> I guess, it will be faster and able to be run on my new laptop...<br>
> still waiting ........<br>
<br>
Why do you need to process the whole string at once?<br>
Just do like everybody else and split it up in many<br>
manageable parts.<br>
<br>
Rune
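
Splitting the work into parts might look like the sketch below. It is a minimal illustration under stated assumptions: the repeated sample string stands in for the real data, the chunk size of 4096 is an arbitrary hypothetical value, and the names orig, chunk, and pos are not from the thread. The kept characters are written back into the front of str, so no second full-size array is needed.

```matlab
% Stand-in for the real data (15 characters repeated 1000 times).
str  = repmat('1,1600,A,G,C,T ', 1, 1000);
orig = str;               % copy kept only to check the result afterwards
lut  = 'AGCT';            % characters to keep, as in arun's post
chunk = 4096;             % hypothetical chunk size
n   = numel(str);
pos = 0;                  % number of kept characters written so far

for first = 1:chunk:n
    last = min(first + chunk - 1, n);
    part = str(first:last);               % read one chunk
    part = part(ismember(part, lut));     % filter only this chunk
    % Write position never passes the read position, so this is safe.
    str(pos+1 : pos+numel(part)) = part;
    pos = pos + numel(part);
end

str = str(1:pos);         % drop the unused tail
```

The temporaries created by ismember are then bounded by the chunk size rather than by the length of the whole string.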

Wed, 08 Jul 2009 14:16:08 +0000
Re: manipulating strings
http://www.mathworks.com/matlabcentral/newsreader/view_thread/255491#663587
arun
On Jul 8, 1:01 pm, Rune Allnor <all...@tele.ntnu.no> wrote:<br>
> Why do you need to process the whole string at once?<br>
> Just do like everybody else and split it up in many<br>
> manageable parts.<br>
><br>
> Rune<br>
<br>
Yes, that is an alternative I have been using for quite a while because of<br>
my system limitations. I wanted to know about other indexing methods<br>
that could accomplish this task even in my case, like the logical<br>
indexing with ismember shown by some of the members.<br>
<br>
thank you very much,<br>
best, arun.

Wed, 08 Jul 2009 17:50:34 +0000
Re: manipulating strings
http://www.mathworks.com/matlabcentral/newsreader/view_thread/255491#663665
Loren Shure
In article <d96393976d064e7589fc<br>
b6e988b1f16d@h2g2000yqg.googlegroups.com>, aragorn168b@gmail.com says...<br>
> nor ki,<br>
> <br>
> thank you for your suggestions. They work very well. Now my next<br>
> formidable task is to reshape this vector to a 192*240605 matrix.<br>
> [snip]<br>
<br>
Help reshape.<br>
<br>
 <br>
Loren<br>
<a href="http://blogs.mathworks.com/loren">http://blogs.mathworks.com/loren</a>
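
For readers who have not used reshape, a minimal sketch of how it arranges a cleaned character vector. The 12-character sample and the 4-by-3 size are stand-ins for the 192*240605 case, and the names v, M, and R are assumptions for illustration.

```matlab
v = 'AGGGAAAGAAGC';        % 12 characters, stand-in for the cleaned data

% reshape fills column by column: each column holds 4 consecutive characters.
M = reshape(v, 4, 3);

% Passing [] lets reshape work out the number of columns; transposing
% afterwards puts each 4-character record in its own row.
R = reshape(v, 4, []).';
```

reshape requires that the number of elements matches the requested size exactly, so the vector must contain 192*240605 characters after the unwanted ones are removed.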

Wed, 08 Jul 2009 19:53:01 +0000
Re: manipulating strings
http://www.mathworks.com/matlabcentral/newsreader/view_thread/255491#663707
Yair Altman
A few remarks:<br>
<br>
> > > where did you find ismembc? is there a place to find undocumented functions?<br>
<br>
One place to look is my blog: <a href="http://UndocumentedMatlab.com">http://UndocumentedMatlab.com</a><br>
Another place is this forum.<br>
Yet another is Matlab's own files, as noted by Us.<br>
<br>
> > it is not an undocumented function...<br>
> > rather, look at ISMEMBER...[snip]<br>
<br>
Actually, ismembc is an example of an internal helper function that is neither fully documented nor supported by The MathWorks. <br>
<br>
> > it's often worthwhile to look at ML stock functions to<br>
> >  see how TMW does things (not always optimized...)<br>
> >  look for hidden gems...<br>
> ><br>
> > us<br>
<br>
This is true: most of what I've ever found about undocumented stuff in Matlab comes from Matlab's own source code, which is part of the official installation. Note that this does *NOT* imply official support by MathWorks. The rule of thumb is that only something that appears in the online documentation (or the doc command) is officially supported.<br>
<br>
> Uwe, yours also works like a charm. I personally dont see a difference<br>
> between ismember and ismembc, at least on this machine! :)<br>
<br>
The difference is quite evident within large loops and/or large arrays. See here: <a href="http://undocumentedmatlab.com/blog/ismembc-undocumented-helper-function/">http://undocumentedmatlab.com/blog/ismembc-undocumented-helper-function/</a><br>
<br>
Yair Altman<br>
<a href="http://UndocumentedMatlab.com">http://UndocumentedMatlab.com</a> <br>

Thu, 09 Jul 2009 10:11:21 +0000
Re: manipulating strings
http://www.mathworks.com/matlabcentral/newsreader/view_thread/255491#663879
arun
On Jul 8, 7:50 pm, Loren Shure <lo...@mathworks.com> wrote:<br>
> Help reshape.<br>
<br>
When I said I have to "reshape" the arrays, I literally meant using<br>
the "reshape" function. :)<br>
thank you,<br>
Arun.