How to load portions of .dat file

Hello everyone,
I'm trying to load a large .dat file. It contains a 3x15000000 matrix, so I want to load only a portion of it at a time, for example 3x30000, analyze that portion, and then load the next 3x30000 portion. Can someone help me? Thank you

6 Comments

What is the format of your data file? Attach some sample data if possible.
It is machine format with big-endian byte ordering
int8, int16, int32, int64, or unsigned? Or single or double floating point?
unsigned, uint32
doc fopen
doc fread
doc fseek
Thank you so much, fseek was the function I was searching for
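To illustrate how fseek can jump straight to a given block, here is a minimal sketch. It assumes the layout described in the comments above (3 rows of big-endian uint32, stored column by column). The small temporary file and the names fileName, blockCols, bytesPerCol and k are hypothetical and exist only to make the example self-contained; with the real file, blockCols would be 30000.

```matlab
% Hypothetical small test file: a 3x10 matrix holding 1..30 as big-endian uint32.
fileName = [tempname '.dat'];
fid = fopen(fileName, 'w', 'b');
fwrite(fid, uint32(1:30), 'uint32');
fclose(fid);

blockCols   = 4;        % columns per block (30000 in the question)
bytesPerCol = 3 * 4;    % 3 rows x 4 bytes per uint32
k           = 2;        % 1-based index of the block to load

[fid, msg] = fopen(fileName, 'r', 'b');
assert(fid > 0, msg);
% Skip the first k-1 blocks, counting bytes from the beginning of the file.
status = fseek(fid, (k - 1) * blockCols * bytesPerCol, 'bof');
assert(status == 0, 'fseek failed');
block = fread(fid, [3, blockCols], 'uint32');   % columns 5..8 of the matrix
fclose(fid);
delete(fileName);
```

fseek is only needed for jumping around in the file. When blocks are read strictly in order, each fread continues where the previous one stopped.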


Answers (1)

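[fid, msg] = fopen(FileName, 'r', 'b');   % 'b' selects big-endian byte ordering
assert(fid > 0, msg);                     % fopen returns -1 on failure
while true
    % Each fread continues where the previous one stopped
    data = fread(fid, [3, 30000], 'uint32');
    if numel(data) < 3*30000
        break;   % end of file: the last block is missing or incomplete
    end
    % ... process the block of data here
end
fclose(fid);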
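The loop above stops as soon as a block comes back short, so a final incomplete block is never processed. If that last block matters, one possible variant (a sketch, not the answerer's code) stops only when fread returns nothing. The temporary file and the names fileName, blockCols and colsSeen are hypothetical and only serve to make the example runnable.

```matlab
% Hypothetical small test file: a 3x10 matrix holding 1..30 as big-endian uint32.
fileName = [tempname '.dat'];
fid = fopen(fileName, 'w', 'b');
fwrite(fid, uint32(1:30), 'uint32');
fclose(fid);

blockCols = 4;      % columns per block (30000 in the question)
colsSeen  = [];     % number of columns in each block that was read

[fid, msg] = fopen(fileName, 'r', 'b');
assert(fid > 0, msg);
while true
    data = fread(fid, [3, blockCols], 'uint32');
    if isempty(data)
        break;      % nothing left to read
    end
    % Process the block here; the last one may have fewer columns.
    colsSeen(end + 1) = size(data, 2);
end
fclose(fid);
delete(fileName);
```

With 10 columns and blocks of 4, the loop sees two full blocks and then one block of 2 columns.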

7 Comments

Sorry, but only the first portion of the file is loaded. What about loading the second portion, for example from 30000 to 60000?
The while true loop ensures that after processing one block of data, execution goes back and tries to read the next block.
It doesn't work; it only loads a 3x1296 block of data.
If you ask to fread() with [3, 30000] size request and you only get [3, 1296] back, then your file does not contain 90000 elements of the requested size starting from that position.
Try the following test:
At the point where you are just about to do the fread() do
current_position = ftell(fid)
and after the end of loop do
fseek(fid, 0, 'eof');
last_position = ftell(fid)
At this point, last_position should be exactly the file size in bytes, and current_position would be the number of bytes from the beginning of file that the file was positioned at when it was just about to do the fread()
The maximum number of columns of 3 uint32 that can be read from the file would be (last_position / 4)/3 -- there just isn't any more data in the file.
One thing I wonder is whether the .dat file might hold compressed data, for example something equivalent to zip. If so, the [3, 30000] block may be represented in the file, but in a form that requires decoding.
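The position check described in this comment can be sketched as follows. The small temporary file is hypothetical and stands in for the real data; the variable names current_position and last_position follow the comment, while remainingCols is an illustrative addition.

```matlab
% Hypothetical small test file: a 3x10 matrix holding 1..30 as big-endian uint32.
fileName = [tempname '.dat'];
fid = fopen(fileName, 'w', 'b');
fwrite(fid, uint32(1:30), 'uint32');
fclose(fid);

fid = fopen(fileName, 'r', 'b');
fread(fid, [3, 4], 'uint32');         % pretend one 3x4 block was already read
current_position = ftell(fid);        % bytes consumed so far
fseek(fid, 0, 'eof');                 % jump to the end of the file
last_position = ftell(fid);           % file size in bytes
fclose(fid);
delete(fileName);

% Columns of 3 uint32 values that were still unread at current_position
remainingCols = (last_position - current_position) / 4 / 3;
```

If remainingCols is smaller than the requested block width, fread can only return a short block from that position.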
The result from last_position is 65175552
65175552/4/3
ans = 5431296
So your file has room for about 181 times as much data as would be needed for [3, 30000], and the data = fread(fid, [3, 30000], 'uint32'); call should have succeeded unless you had already read a portion of the file (or had used fseek to move near the end).
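The block arithmetic for the reported file size can be worked through directly. The variable names below are illustrative only.

```matlab
fileBytes    = 65175552;                      % last_position reported above
totalCols    = fileBytes / 4 / 3;             % columns of 3 uint32 values
blockCols    = 30000;                         % requested block width
fullBlocks   = floor(totalCols / blockCols);  % complete 3x30000 blocks
leftoverCols = mod(totalCols, blockCols);     % columns in the final short block
```

This gives 181 full blocks plus 1296 leftover columns. A 3x1296 result would therefore be consistent with the loop having run through the whole file, leaving the final short block in data when it exits. That reading is an inference from the numbers, not something confirmed in the thread.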
If you have a structured binary file, consider using memmapfile()
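A minimal sketch of the memmapfile approach, assuming the same layout of 3 rows of big-endian uint32. memmapfile reads values in the machine's native byte order, so on a little-endian machine the bytes have to be swapped with swapbytes. The temporary file and the names fileName, blockCols, k, firstIdx and lastIdx are hypothetical.

```matlab
% Hypothetical small test file: a 3x10 matrix holding 1..30 as big-endian uint32.
fileName = [tempname '.dat'];
fid = fopen(fileName, 'w', 'b');
fwrite(fid, uint32(1:30), 'uint32');
fclose(fid);

m = memmapfile(fileName, 'Format', 'uint32');   % m.Data is one long uint32 vector

blockCols = 4;      % columns per block (30000 in the question)
k         = 2;      % 1-based index of the block to load
firstIdx  = (k - 1) * 3 * blockCols + 1;
lastIdx   = min(k * 3 * blockCols, numel(m.Data));
block     = reshape(m.Data(firstIdx:lastIdx), 3, []);

% The file is big-endian; swap bytes if this machine is little-endian.
[~, ~, endian] = computer;
if endian == 'L'
    block = swapbytes(block);
end

clear m             % release the mapping before deleting the file
delete(fileName);
```

Only the indexed part of the file is copied into memory, so each block can be pulled out by index without managing a file position. Unlike fread with 'uint32', which returns double by default, the block here stays uint32.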


Asked: 5 Oct 2022

Commented: 6 Oct 2022
