Thread Subject:
read the un-structured data by textread

Subject: read the un-structured data by textread

From: edward kabanyas

Date: 31 Jan, 2012 06:32:10

Message: 1 of 3

Hi All,

I would like to read the following data in a more effective way:

MRR 120101000000 UTC+07 AVE 60 STP 150 ASL 865 SMP 125e3 SVS 6.0.0.2 DVS 6.00 DSN 99030001 CC 5477180 MDQ 100 TYP AVE
H 150 300 450 600 750 900 1050 1200 1350 1500 1650 1800 1950 2100 2250 2400 2550 2700 2850 3000 3150 3300 3450 3600 3750 3900 4050 4200 4350 4500 4650
TF 0.0149 0.0515 0.1171 0.2042 0.3018 0.4032 0.5067 0.6055 0.6980 0.7743 0.8427 0.8914 0.9285 0.9588 0.9855 0.9981 1.0000 0.9899 0.9817 0.9781 0.9593 0.9391 0.9140 0.8827 0.8528 0.8227 0.7970 0.7645 0.7290 0.6368 0.4398
F00 -67.04 -77.02 -75.32 -60.88
F01 -69.27 -77.79 -76.28 -63.22
F02 -76.39 -95.13 -95.35 -96.18 -84.30 -86.71 -90.38 -81.63 -81.08 -69.78
F03 -88.74 -92.98 -92.90 -94.99 -97.68 -98.63 -92.56 -85.04 -87.23 -94.19 -81.08 -76.06
F04 -92.28 -97.11 -94.35 -99.44 -85.00 -87.29 -84.38 -81.09 -78.10
MRR 120101000100 UTC+07 AVE 60 STP 150 ASL 865 SMP 125e3 SVS 6.0.0.2 DVS 6.00 DSN 99030001 CC 5477180 MDQ 100 TYP AVE
H 150 300 450 600 750 900 1050 1200 1350 1500 1650 1800 1950 2100 2250 2400 2550 2700 2850 3000 3150 3300 3450 3600 3750 3900 4050 4200 4350 4500 4650
TF 0.0149 0.0515 0.1171 0.2042 0.3018 0.4032 0.5067 0.6055 0.6980 0.7743 0.8427 0.8914 0.9285 0.9588 0.9855 0.9981 1.0000 0.9899 0.9817 0.9781 0.9593 0.9391 0.9140 0.8827 0.8528 0.8227 0.7970 0.7645 0.7290 0.6368 0.4398
F00 -66.97 -92.20 -75.97 -73.48 -60.43
F01 -69.22 -78.07 -75.10 -62.80
F02 -76.29 -92.57 -90.66 -84.95 -82.30 -79.59 -69.91
F03 -88.27-101.45 -99.00 -94.58 -90.69 -98.91 -91.23 -85.12 -90.68 -88.75 -83.04 -81.81 -90.00 -76.63
F04 -101.44 -92.11 -93.18 -109.18 -87.99 -87.17 -89.44 -98.74 -94.04 -89.20 -85.90 -96.34 -93.96 -84.02 -84.92 -83.02 -77.30

The data above are only an example. It is hourly data with a lot of white space. I am currently using textread to read the data; however, because the file is large (> 70 MB), reading every parameter causes an "out of memory" error.

[t1 t2 t3 t4 t5 t6 t7 t8 t9 t10 t11 t12 t13 t14 t15 t16 t17 t18 t19 t20 ...
 t21 t22 t23 t24 t25 t26 t27 t28 t29 t30 t31 t32]= textread('test.txt', ...
'%s %d %s %s %d %s %s %s %s %s %s %s %s %s %s %s %s %s %s %s %s %s %s %f %f %f %f %f %f %f %f %f');

Question: do you have another option for reading the data? Can we use grep, Perl, or a shell script from within MATLAB?

Thanks for your help.

Edward

Subject: read the un-structured data by textread

From: dpb

Date: 31 Jan, 2012 15:22:32

Message: 2 of 3

On 1/31/2012 12:32 AM, edward kabanyas wrote:
> Hi All,
>
> I would like to read the following data in more effective way:
>
> MRR 120101000000 UTC+07 AVE 60 STP 150 ASL 865 SMP 125e3 SVS 6.0.0.2 DVS
...

> The above data are only example. It is hourly data with many white
> space. Now I am using textread to read the data, however, because the
> data size is large enough (> 70 mB), reading each parameter will cause a
> problem of "out of memory".
>
> [t1 t2 t3 t4 t5 t6 t7 t8 t9 t10 t11 t12 t13 t14 t15 t16 t17 t18 t19 t20 ...
> t21 t22 t23 t24 t25 t26 t27 t28 t29 t30 t31 t32]= textread('test.txt', ...
> '%s %d %s %s %d %s %s %s %s %s %s %s %s %s %s %s %s %s %s %s %s %s %s %f
> %f %f %f %f %f %f %f %f');
>
> Question, do you have another option to read the data ? Can we use grep
> of perl or shell script in Matlab ?
...

textread() is deprecated; use textscan() instead--it's more flexible.

Your read above wouldn't work on the datafile given anyway; the format
would be applied repetitively over each record in the file and the first
one w/ the shorter record will abort on a mismatch.

Surely you don't need all the character strings in Matlab and you won't
be able to hold them in anything except cell arrays unless you can
figure out how to regularize the data, anyway.

70 MB isn't _that_ large but I would think you need to evaluate what
your end need is first, then figure out how to get the data into a form
useful for that purpose.

Note, btw, that you can use the '%*' prefix on a field format specifier
to skip an unwanted field. Also, since textscan() uses the file handle
returned from an fopen() call, it can be called in a loop or
sequentially w/ different format strings to handle the nonuniformity of
the input file that textread() can't deal with.
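
A rough sketch of that idea follows. It is not code from this thread: the function name readMrrSketch and its output names are hypothetical, and it assumes every line starts with a tag (MRR, H, TF, F00, ...) as in the sample data. It swaps in fgetl() and sscanf() for the per-line numeric reads, and uses textscan() on the header text only to show the '%*' skip prefix.

```matlab
function [stamps, heights, tags, spectra] = readMrrSketch(filename)
% READMRRSKETCH  Hypothetical helper: read a tagged file one line at a time.
% Assumes each line begins with a tag such as MRR, H, TF or F00.
fid = fopen(filename, 'r');
if fid < 0
    error('Could not open %s', filename);
end
closer = onCleanup(@() fclose(fid));   % close the file even if an error occurs

stamps  = {};    % timestamp string from each MRR header line
heights = [];    % numbers from the most recent H line
tags    = {};    % tag of each F.. line, e.g. 'F00'
spectra = {};    % numeric row vector for each F.. line

txt = fgetl(fid);
while ischar(txt)                      % fgetl returns -1 at end of file
    [tag, rest] = strtok(txt);         % split off the leading tag
    if strcmp(tag, 'MRR')
        % '%*s' skips a field: keep the timestamp and the number after AVE,
        % skip the time zone and the 'AVE' label, ignore the rest of the line
        hdr = textscan(rest, '%s %*s %*s %f', 1);
        stamps{end+1} = hdr{1}{1};
    elseif strcmp(tag, 'H')
        heights = sscanf(rest, '%f').';
    elseif strncmp(tag, 'F', 1)
        % variable-length record: take however many numbers are present
        tags{end+1}    = tag;
        spectra{end+1} = sscanf(rest, '%f').';
    end
    % TF lines and empty lines are ignored in this sketch
    txt = fgetl(fid);
end
end
```

Called as [stamps, heights, tags, spectra] = readMrrSketch('test.txt'), this keeps only numeric vectors and one short string per header, which should need far less memory than 32 cell arrays of strings. Growing the cell arrays inside the loop is slow for a big file, so preallocating them would be a sensible next step.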

One could always use any other tool one is more familiar with to clean
up the file externally before bringing it into Matlab, of course.
regexp() is implemented in later versions of Matlab but it doesn't look
to me as though it would be the best tool for the job here (altho I'm
about the world's worst in advising in that toolset).

--

Subject: read the un-structured data by textread

From: edward kabanyas

Date: 13 Feb, 2012 06:10:09

Message: 3 of 3

Hi dpb;

Thanks for your nice suggestion; it seems that textscan is better suited to my task. Thanks again,

Edward
