Thread Subject:
Importing poorly formatted data

Subject: Importing poorly formatted data

From: Nate Jensen

Date: 31 Mar, 2011 15:13:04

Message: 1 of 10

So, I have a Fortran code that I can't change, and unfortunately its output is not easy to import.

The format basically looks something like this:

a = 10 23
b = 20 34

c = 59 d = 64
e = 13 f = 89

The format is the same in every file; it just looks weird. I am trying to extract the numerical values from each file.

Is there a good way to go about importing this data?

Subject: Importing poorly formatted data

From: Florin Neacsu

Date: 31 Mar, 2011 17:35:21

Message: 2 of 10

"Nate Jensen" wrote in message <in25m0$742$1@fred.mathworks.com>...
> So, I have a fortran code that I can't change and unfortunately it outputs some not easy to import data.
>
> The format basically looks something like this:
>
> a = 10 23
> b = 20 34
>
> c = 59 d = 64
> e = 13 f = 89
>
> The format is the same between each file, it just looks weird. I am trying to extract the numerical values from each file.
>
> Is there a good way to go about importing this data?

Hi,

If that is the exact format, you could try this :

fid=fopen('foo.txt','r');

    [a1]=fscanf(fid,'a = %d %d\n',2);
    [b1]=fscanf(fid,'b = %d %d\n\n',2);
    [c1]=fscanf(fid,'c = %d d = %d\n',2);
    [e1]=fscanf(fid,'e = %d f = %d\n',2);
 
fclose(fid);

Hope it helps,
Florin

Subject: Importing poorly formatted data

From: dpb

Date: 31 Mar, 2011 18:17:16

Message: 3 of 10

On 3/31/2011 10:13 AM, Nate Jensen wrote:
> So, I have a fortran code that I can't change and unfortunately it
> outputs some not easy to import data.
>
> The format basically looks something like this:
>
> a = 10 23
> b = 20 34
>
> c = 59 d = 64
> e = 13 f = 89
>
> The format is the same between each file, it just looks weird. I am
> trying to extract the numerical values from each file.
>
> Is there a good way to go about importing this data?

One could probably rig up a regular expressions pattern, but my version
of ML precedes its introduction so I'll not guess about it...
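
For readers on a MATLAB release that does have regexp, the regular-expression route mentioned above might look roughly like the sketch below. It is only a sketch: the sample text is copied from the original post, the variable names (txt, tok, vals) are made up for illustration, and it assumes the labels never contain digits and the values are plain integers.

```matlab
% Sample text copied from the thread; for a real file one would
% use txt = fileread('foo.txt') instead (file name assumed).
txt = sprintf('a = 10 23\nb = 20 34\n\nc = 59 d = 64\ne = 13 f = 89\n');

% Pull out every optionally signed run of digits as a string ...
tok = regexp(txt, '-?\d+', 'match');

% ... and convert the cell array of strings to a numeric row vector.
vals = str2double(tok);
```

The result is a flat row vector in file order, so any grouping into pairs has to be done afterwards, for example with reshape.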

I'd preprocess the files...something like

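fidi=fopen(filename,'rt');
fido=fopen(fileproc,'wt');   % output file must be opened for writing

while ~feof(fidi)
   s = fgetl(fidi);
   if ischar(s) && ~isempty(s)
     % drop the letters, then the '=' signs, and keep the result
     s = strrep(s(~isletter(s)),'=','');
     fprintf(fido,'%s\n',s);
   end
end
fidi=fclose(fidi);
fido=fclose(fido);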

Then you can open the preprocessed file and read it into an array w/
textscan as space-delimited (or, of course, you could fscanf the munged
string and concatenate values in the above loop).
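
A minimal sketch of the textscan step, assuming every preprocessed line holds exactly two numbers as in the example. To keep it self-contained, the snippet first writes a stand-in for the preprocessed file to a temporary location; the name fileproc follows the script above, while data and C are illustrative.

```matlab
% Stand-in for the preprocessed file (letters and '=' already removed).
fileproc = tempname;
fid = fopen(fileproc, 'wt');
fprintf(fid, '  10 23\n  20 34\n  59   64\n  13   89\n');
fclose(fid);

% Read two whitespace-delimited numeric columns; textscan grows the
% output itself, so the number of lines need not be known in advance.
fid = fopen(fileproc, 'rt');
C = textscan(fid, '%f %f');
fclose(fid);

data = [C{:}];      % one row per line of the file, two columns
delete(fileproc);   % remove the temporary stand-in file
```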

If you choose the latter, you'll want to preallocate some size and index
into that if the files are of any size at all; if the example is the
full size, that's easy. If they're variable or of unknown length, the
former makes those issues go away, basically by letting textscan do the
automagic allocation.
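
The in-loop alternative with preallocation could be sketched as follows. The cell array rawLines stands in for the strings fgetl would return, and the sketch assumes exactly two values per non-empty line; a line with fewer values would make the indexed assignment fail.

```matlab
% Stand-in for the lines fgetl would return (taken from the example).
rawLines = {'a = 10 23', 'b = 20 34', '', 'c = 59 d = 64', 'e = 13 f = 89'};

vals = zeros(2, numel(rawLines));   % preallocate: at most one column per line
n = 0;                              % number of columns actually filled
for k = 1:numel(rawLines)
    s = rawLines{k};
    if isempty(s), continue; end            % skip blank lines
    s = strrep(s(~isletter(s)), '=', '');   % strip letters, then '='
    n = n + 1;
    vals(:, n) = sscanf(s, '%d', 2);        % the two integers on this line
end
vals = vals(:, 1:n).';              % trim unused columns, one row per line
```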

--

Subject: Importing poorly formatted data

From: Nate Jensen

Date: 31 Mar, 2011 18:30:21

Message: 4 of 10

> Hi,
>
> If that is the exact format, you could try this :
>
> fid=fopen('foo.txt','r');
>
> [a1]=fscanf(fid,'a = %d %d\n',2);
> [b1]=fscanf(fid,'b = %d %d\n\n',2);
> [c1]=fscanf(fid,'c = %d d = %d\n',2);
> [e1]=fscanf(fid,'e = %d f = %d\n',2);
>
> fclose(fid);
>
> Hope it helps,
> Florin

Solid, thank you

Subject: Importing poorly formatted data

From: dpb

Date: 31 Mar, 2011 18:35:55

Message: 5 of 10

On 3/31/2011 1:30 PM, Nate Jensen wrote:
>> Hi,
>>
>> If that is the exact format, you could try this :
>>
>> fid=fopen('foo.txt','r');
>>
>> [a1]=fscanf(fid,'a = %d %d\n',2);
>> [b1]=fscanf(fid,'b = %d %d\n\n',2);
>> [c1]=fscanf(fid,'c = %d d = %d\n',2);
>> [e1]=fscanf(fid,'e = %d f = %d\n',2);
>>
>> fclose(fid);
>>
>> Hope it helps,
>> Florin
>
> Solid, thank you

And if it isn't exact, I just posted a cheap but cheery preprocessing
script.

--

Subject: Importing poorly formatted data

From: Florin Neacsu

Date: 31 Mar, 2011 18:43:05

Message: 6 of 10

dpb <none@non.net> wrote in message <in2hib$h9a$2@dont-email.me>...
> On 3/31/2011 1:30 PM, Nate Jensen wrote:
> >> Hi,
> >>
> >> If that is the exact format, you could try this :
> >>
> >> fid=fopen('foo.txt','r');
> >>
> >> [a1]=fscanf(fid,'a = %d %d\n',2);
> >> [b1]=fscanf(fid,'b = %d %d\n\n',2);
> >> [c1]=fscanf(fid,'c = %d d = %d\n',2);
> >> [e1]=fscanf(fid,'e = %d f = %d\n',2);
> >>
> >> fclose(fid);
> >>
> >> Hope it helps,
> >> Florin
> >
> > Solid, thank you
>
> And if it isn't exact, I just posted a cheep but cheery preprocessing
> script.
>
> --


Well, it depends on whether the OP uses a Linux-based OS or not. He can always use 'sed' to delete all the unwanted data.

But it might be beyond his needs.

Florin

Subject: Importing poorly formatted data

From: Florin Neacsu

Date: 31 Mar, 2011 19:14:05

Message: 7 of 10

"Florin Neacsu" <fneacsu2@gmail.com> wrote in message <in2hvp$181$1@fred.mathworks.com>...
> dpb <none@non.net> wrote in message <in2hib$h9a$2@dont-email.me>...
> > On 3/31/2011 1:30 PM, Nate Jensen wrote:
> > >> Hi,
> > >>
> > >> If that is the exact format, you could try this :
> > >>
> > >> fid=fopen('foo.txt','r');
> > >>
> > >> [a1]=fscanf(fid,'a = %d %d\n',2);
> > >> [b1]=fscanf(fid,'b = %d %d\n\n',2);
> > >> [c1]=fscanf(fid,'c = %d d = %d\n',2);
> > >> [e1]=fscanf(fid,'e = %d f = %d\n',2);
> > >>
> > >> fclose(fid);
> > >>
> > >> Hope it helps,
> > >> Florin
> > >
> > > Solid, thank you
> >
> > And if it isn't exact, I just posted a cheep but cheery preprocessing
> > script.
> >
> > --
>
>
> Well , it depends if OP uses a linux based OS or not. He can always use 'sed' and delete all the unwanted data.
>
> But it might be beyond his needs.
>
> Florin

@dpb :

Sorry, I just saw the first lines of your post

"One could probably rig up a regular expressions pattern, but my version
of ML precedes its introduction so I'll not guess about it..."

That makes my suggestion completely unnecessary.

Regards,
Florin

P.S. I am not that good with sed but OP could try this :

sed 's/[a-z].=/''/g' foo.txt > temp.txt

Subject: Importing poorly formatted data

From: Nate Jensen

Date: 31 Mar, 2011 19:15:21

Message: 8 of 10

> > And if it isn't exact, I just posted a cheep but cheery preprocessing
> > script.
> >
> > --
>
>
> Well , it depends if OP uses a linux based OS or not. He can always use 'sed' and delete all the unwanted data.
>
> But it might be beyond his needs.
>
> Florin

I have a Windows OS actually, so I do not believe that 'sed' will help me out. That's only for Unix systems, correct?

Thank you though for your input,

Nate

Subject: Importing poorly formatted data

From: dpb

Date: 31 Mar, 2011 19:19:58

Message: 9 of 10

On 3/31/2011 2:14 PM, Florin Neacsu wrote:
...

> @dpb :
>
> Sorry, I just saw the first lines of your post

No problem... :)

> "One could probably rig up a regular expressions pattern, but my version
> of ML precedes its introduction so I'll not guess about it..."
>
> That makes my suggestion completely unnecessary.
> Regards,
> Florin
>
> P.S. I am not that good with sed but OP could try this :
>
> sed 's/[a-z].=/''/g' foo.txt > temp.txt

I've never really mastered them from the days of the Brief programmers'
editor where I first came across the idea 30 yrs ago now...of course, it
implemented a unique version that didn't match anything else very much
and I've never recovered.

Of course, those were the days of "type in a random set of gibberish to a
TECO edit session and see what happens", too... :)

--

Subject: Importing poorly formatted data

From: dpb

Date: 31 Mar, 2011 22:03:31

Message: 10 of 10

On 3/31/2011 2:15 PM, Nate Jensen wrote:
...

> I have a windows OS actually, so I do not believe that 'sed' will help
> me out. That's only for unix systems, correct?
...

There are various ports to Windows but I can't speak to any of them.
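
Where no sed port is at hand, roughly the same substitution as the sed one-liner quoted earlier in the thread can be done inside MATLAB with regexprep. This is a sketch under assumptions: the pattern is a slight variation ('[a-z]\s*=' instead of '[a-z].='), the sample text is copied from the original post, and the names txt, clean and vals are illustrative.

```matlab
% Sample text from the thread; txt = fileread('foo.txt') would load a
% real file instead (file name assumed).
txt = sprintf('a = 10 23\nb = 20 34\n\nc = 59 d = 64\ne = 13 f = 89\n');

% Delete each lowercase label together with its '=' sign.
clean = regexprep(txt, '[a-z]\s*=', '');

% What remains is whitespace-separated integers; read them all at once.
vals = sscanf(clean, '%d').';
```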

--
