Thread Subject:
skipping lines quickly in large files

Subject: skipping lines quickly in large files

From: Mark Abramson

Date: 8 Jan, 2009 01:06:20

Message: 1 of 5

I have a MATLAB function that parses large ASCII data files using textscan and fscanf to search for text labels and then read in data that follows. However, in the applications that use this file, I typically only need data from one or two sections, rather than all 16. So I would like to condition on nargout so I can skip sections I don't need. The problem is that when I skip certain labels in this file, it actually takes MUCH longer to skip the lines than to read them. So my question is, is there something faster than fscanf (or textscan) that I can use to skip unneeded lines if I know in advance exactly how many lines I need to skip before I need to start looking for the next label?

Subject: skipping lines quickly in large files

From: Paul

Date: 8 Jan, 2009 02:55:03

Message: 2 of 5

"Mark Abramson" <mark.a.abramson@boeing.com> wrote in message <gk3jic$jrl$1@fred.mathworks.com>...
> I have a MATLAB function that parses large ASCII data files using textscan and fscanf to search for text labels and then read in data that follows. However, in the applications that use this file, I typically only need data from one or two sections, rather than all 16. So I would like to condition on nargout so I can skip sections I don't need. The problem is that when I skip certain labels in this file, it actually takes MUCH longer to skip the lines than to read them. So my question is, is there something faster than fscanf (or textscan) that I can use to skip unneeded lines if I know in advance exactly how many lines I need to skip before I need to start looking for the next label?

What about fgetl? It would seem that fscanf has to apply a format to the data, while fgetl just loads the line into a character buffer and therefore should be faster. But this is just my guess.
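A minimal sketch of the fgetl idea, assuming the number of lines to skip is known in advance. The temporary file and the names `nSkip` and `nextLine` are made up for illustration; whether this beats fscanf on the real files would need to be timed.

```matlab
% Build a small throwaway file so the sketch is self-contained.
fname = [tempname '.txt'];
fid = fopen(fname, 'w');
fprintf(fid, 'row%d\n', 1:10);   % writes row1 ... row10, one per line
fclose(fid);

nSkip = 7;                       % hypothetical: lines to discard
fid = fopen(fname, 'r');
for k = 1:nSkip
    fgetl(fid);                  % read one line and drop it, no format parsing
end
nextLine = fgetl(fid);           % first line after the skipped block
fclose(fid);
delete(fname);
```

After the loop the file position sits at the start of line `nSkip + 1`, so any later fscanf or textscan call continues from there.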

Subject: skipping lines quickly in large files

From: Vadim Teverovsky

Date: 8 Jan, 2009 13:26:37

Message: 3 of 5

Having some sample code would help.

For example, how are you using textscan to skip? Are you actually reading
in the lines you don't want, or using the HeaderLines parameter?
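For reference, skipping with the HeaderLines parameter might look like the sketch below. The file contents and the variable names are assumptions for illustration only, not the original poster's code.

```matlab
% Build a small throwaway file so the sketch is self-contained.
fname = [tempname '.txt'];
fid = fopen(fname, 'w');
fprintf(fid, 'row%d\n', 1:10);   % writes row1 ... row10, one per line
fclose(fid);

nSkip = 7;                       % hypothetical: lines to discard
fid = fopen(fname, 'r');
% HeaderLines makes textscan pass over nSkip lines without converting them,
% then a single string field is read from the following line.
C = textscan(fid, '%s', 1, 'Delimiter', '\n', 'HeaderLines', nSkip);
fclose(fid);
delete(fname);
nextLine = C{1}{1};
```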

"Mark Abramson" <mark.a.abramson@boeing.com> wrote in message
news:gk3jic$jrl$1@fred.mathworks.com...
>I have a MATLAB function that parses large ASCII data files using textscan
>and fscanf to search for text labels and then read in data that follows.
>However, in the applications that use this file, I typically only need data
>from one or two sections, rather than all 16. So I would like to condition
>on nargout so I can skip sections I don't need. The problem is that when I
>skip certain labels in this file, it actually takes MUCH longer to skip the
>lines than to read them. So my question is, is there something faster than
>fscanf (or textscan) that I can use to skip unneeded lines if I know in
>advance exactly how many lines I need to skip before I need to start
>looking for the next label?
>

Subject: skipping lines quickly in large files

From: Peter Boettcher

Date: 8 Jan, 2009 14:14:28

Message: 4 of 5

"Mark Abramson" <mark.a.abramson@boeing.com> writes:

> I have a MATLAB function that parses large ASCII data files using
> textscan and fscanf to search for text labels and then read in data
> that follows. However, in the applications that use this file, I
> typically only need data from one or two sections, rather than all 16.
> So I would like to condition on nargout so I can skip sections I don't
> need. The problem is that when I skip certain labels in this file, it
> actually takes MUCH longer to skip the lines than to read them. So my
> question is, is there something faster than fscanf (or textscan) that
> I can use to skip unneeded lines if I know in advance exactly how many
> lines I need to skip before I need to start looking for the next
> label?

There's no way around reading every byte up to the line you want,
because you have to look at every byte to see if it is a newline
character in order to count lines. If fgetl isn't fast enough, you can
fread large blocks of data, then do something like

num_newlines = sum(datablock == sprintf('\n'));

But this then adds a fair bit of bookkeeping, as you need to "put back"
extra lines, or redo your parser to reach directly into a random spot in
your block of data, and handle overflows into the next block, etc.
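The bookkeeping described above can be sketched roughly as follows: read fixed-size blocks, count newline bytes, and when the last unwanted line ends inside a block, use fseek to "put back" the rest. The block size, file contents, and variable names are assumptions, and the sketch assumes LF (byte 10) line endings.

```matlab
% Build a throwaway file that spans more than one block.
fname = [tempname '.txt'];
fid = fopen(fname, 'w');
fprintf(fid, 'row%d\n', 1:1000);
fclose(fid);

nSkip = 750;                     % hypothetical: lines to discard
blockSize = 4096;                % bytes per fread call (arbitrary choice)
fid = fopen(fname, 'r');
remaining = nSkip;
while remaining > 0
    blockStart = ftell(fid);                    % byte offset of this block
    datablock = fread(fid, blockSize, '*uint8');
    if isempty(datablock)
        break;                                  % end of file reached early
    end
    nl = find(datablock == 10);                 % newline positions in block
    if numel(nl) >= remaining
        % The last line to skip ends in this block: move the file
        % position to the byte just after that newline.
        fseek(fid, blockStart + nl(remaining), 'bof');
        remaining = 0;
    else
        remaining = remaining - numel(nl);      % keep counting in next block
    end
end
nextLine = fgetl(fid);           % first line after the skipped block
fclose(fid);
delete(fname);
```

Repositioning with fseek avoids keeping the leftover bytes in memory, at the cost of re-reading part of a block afterwards.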

Can you just parse the file once into a binary format? Binary random
access is MUCH easier.
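As a rough illustration of why binary random access is easier, assume one section has already been parsed into a numeric matrix with a fixed number of columns (the data and names below are invented). Each row then occupies a known number of bytes, so fseek can jump straight to it:

```matlab
% Hypothetical parsed section: 1000 rows, 3 columns.
data = (1:1000)' * [1 10 100];
nCols = size(data, 2);

% One-time conversion: write row by row as 8-byte doubles.
binName = [tempname '.bin'];
fid = fopen(binName, 'w');
fwrite(fid, data', 'double');    % transpose so each row is contiguous
fclose(fid);

% Later: jump directly to one row without reading anything before it.
rowWanted = 750;
bytesPerRow = nCols * 8;         % a double is 8 bytes
fid = fopen(binName, 'r');
fseek(fid, (rowWanted - 1) * bytesPerRow, 'bof');
row = fread(fid, [1 nCols], 'double');
fclose(fid);
delete(binName);
```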


-Peter

Subject: skipping lines quickly in large files

From: Yuri Geshelin

Date: 8 Jan, 2009 14:20:18

Message: 5 of 5

"Mark Abramson" <mark.a.abramson@boeing.com> wrote in message <gk3jic$jrl$1@fred.mathworks.com>...
> I have a MATLAB function that parses large ASCII data files using textscan and fscanf to search for text labels and then read in data that follows. However, in the applications that use this file, I typically only need data from one or two sections, rather than all 16. So I would like to condition on nargout so I can skip sections I don't need. The problem is that when I skip certain labels in this file, it actually takes MUCH longer to skip the lines than to read them. So my question is, is there something faster than fscanf (or textscan) that I can use to skip unneeded lines if I know in advance exactly how many lines I need to skip before I need to start looking for the next label?

Hi,

If you use fscanf and have control over the structure of your input file, consider fseek.

textscan is different, though. Jumping with fseek and parsing with fscanf are two different methods: fseek only repositions the file pointer, while fscanf reads and converts the data.
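A minimal sketch of the fseek approach, assuming you control the file layout and write every line with the same width. The 8-character field and the names below are assumptions for illustration.

```matlab
% Write fixed-width lines: 8 characters plus a newline = 9 bytes each.
fname = [tempname '.txt'];
fid = fopen(fname, 'w');
fprintf(fid, '%8d\n', 1:1000);
fclose(fid);

nSkip = 750;                     % hypothetical: lines to discard
bytesPerLine = 9;                % valid only because every line has equal width
fid = fopen(fname, 'r');
fseek(fid, nSkip * bytesPerLine, 'bof');   % jump without reading any bytes
value = fscanf(fid, '%d', 1);    % parse the first number after the jump
fclose(fid);
delete(fname);
```

This only works if the line length really is constant, including the line terminator (a file written with CR+LF endings would need 10 bytes per line here).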

Yuri
