Trouble with textscan and large .dat files

I am trying to import specific values from a very large .dat file (call it dummy.dat).
The values are in a single, extremely long column (700000 rows). I want to pick out specific values from this column and then move on without importing the whole column.
When I use
A = importdata('dummy.dat')
I get a nice [700000 x 1] array in my workspace, so that works, but again, I don't want to take the time to import the whole thing.
When I use
fid=fopen('dummy.dat');
A = textscan(fid,'%f','delimiter','')
I get a 1 x 1 cell containing a [700000 x 1] double, so that works, but I am still importing the whole thing.
Say I want to pick out the number that is in the 5th row, and only that number. I am trying:
fid=fopen('dummy.dat');
A = textscan(fid,'%f',1,'delimiter','','headerlines',4)
For some reason, when I do this, the single column in the .dat file is treated as 4 columns, so instead of reading
1
2
3
4
5
6...
I get
1 2 3 4
5 6 ...
This throws off my row and headerlines counts, so I end up reading the wrong values.
Does anyone know what's going on here?
Thanks.
  1 Comment
Walter Roberson on 8 May 2013
What is your intention in setting the delimiter to ''? Why not just leave the delimiter unspecified?


Accepted Answer

Walter Roberson on 8 May 2013
If you are importing the same file multiple times, I suggest reading it once and writing a binary version of it. Then, each time you want to read, knowing which position you want to start at, you can fseek() to byte offset (position - 1) * (the size in bytes of a single entry) from the start of the file and fread() from there.
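A minimal sketch of that approach, assuming the values are stored as 8-byte doubles. The file names sample.dat and sample.bin are hypothetical, and the small sample file is created here only so the example runs on its own:

```matlab
% Hypothetical stand-in for the large single-column text file.
fid = fopen('sample.dat', 'w');
fprintf(fid, '%d\n', 1:10);
fclose(fid);

% One-time conversion: read the text file once, write it as raw doubles.
fid = fopen('sample.dat', 'r');
C = textscan(fid, '%f');          % default whitespace delimiter
fclose(fid);

fid = fopen('sample.bin', 'w');
fwrite(fid, C{1}, 'double');      % 8 bytes per value
fclose(fid);

% Later: fetch only the value in row 5 without reading anything else.
row = 5;
bytesPerValue = 8;                % size of one 'double' entry
fid = fopen('sample.bin', 'r');
fseek(fid, (row - 1) * bytesPerValue, 'bof');
value = fread(fid, 1, 'double');
fclose(fid);
```

The conversion still costs one full read of the text file, so this only pays off when the same file is queried more than once.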
  2 Comments
kschau on 8 May 2013
I would, but unfortunately I need to extract a few data points from one .dat file and then move on to another, many many times.
kschau on 11 Jun 2013
The trick was to compile ALL the files into one long binary string and then remember the byte offsets, so I can jump quickly between what were separate .dat files. Thanks for the advice!
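For anyone reading later, here is a rough sketch of how that could look. It is not necessarily the exact code used; the file names, the startByte variable, and the 8-byte double storage are all assumptions made for illustration:

```matlab
% Hypothetical setup: three small text files standing in for separate .dat files.
names = {'a.dat', 'b.dat', 'c.dat'};
for k = 1:numel(names)
    fid = fopen(names{k}, 'w');
    fprintf(fid, '%d\n', (1:10) + 100*k);
    fclose(fid);
end

% Append every file to one binary file and record the byte where each starts.
bytesPerValue = 8;                        % values stored as 'double'
startByte = zeros(1, numel(names));
out = fopen('all.bin', 'w');
for k = 1:numel(names)
    startByte(k) = ftell(out);            % offset of file k inside all.bin
    in = fopen(names{k}, 'r');
    C = textscan(in, '%f');
    fclose(in);
    fwrite(out, C{1}, 'double');
end
fclose(out);

% Jump straight to row 5 of what used to be the second file.
fileIdx = 2;
row = 5;
fid = fopen('all.bin', 'r');
fseek(fid, startByte(fileIdx) + (row - 1) * bytesPerValue, 'bof');
value = fread(fid, 1, 'double');
fclose(fid);
```

Saving startByte alongside all.bin (for example in a .mat file) would let later sessions reuse the offsets without rebuilding them.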


More Answers (1)

Gabriel on 11 Jun 2013
If you don't care about speed at all, the easiest way is to use fgetl to read each line, then textscan on each line to grab what you want. Slow but easy.
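Something along these lines should work, as a sketch only. The file name sample.dat and the variable names are hypothetical, and the sample file is written first just so the snippet runs by itself:

```matlab
% Hypothetical stand-in for the single-column text file.
fid = fopen('sample.dat', 'w');
fprintf(fid, '%d\n', 1:10);
fclose(fid);

targetRow = 5;
fid = fopen('sample.dat', 'r');
for k = 1:targetRow - 1
    fgetl(fid);                   % read and discard the lines before the target
end
txt = fgetl(fid);                 % the line we actually want
fclose(fid);

if ~ischar(txt)
    error('File has fewer than %d lines.', targetRow);
end
C = textscan(txt, '%f');          % parse the number out of that one line
value = C{1};
```

Every line before the target still has to be read, so the cost grows with the row number.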
