How to load portions of .dat file
Hello everyone,
I'm trying to load a large .dat file that contains a 3x15000000 matrix. I want to load only a portion of it at a time, for example 3x30000, analyze that portion, and then load the next 3x30000 portion. Can someone help me? Thank you
6 Comments
Chunru
on 5 Oct 2022
What is the format of your data file? Attach some sample data if possible.
Raffaella Assogna
on 5 Oct 2022
Jan
on 5 Oct 2022
INT8, INT16, INT32, INT64 or unsigned, or single or double floating point?
Raffaella Assogna
on 5 Oct 2022
Chunru
on 6 Oct 2022
doc fopen
doc fread
doc fseek
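The three functions above can be combined to jump straight to a given block without reading the blocks before it. The following is a minimal sketch, assuming the file holds raw uint32 values stored column by column with no header. The file name, blockCols, and k are hypothetical, and a small temporary file is written first so the snippet runs on its own.

```matlab
% Write a small demo file: a 3-by-10 uint32 matrix (stand-in for the real .dat file)
fileName = [tempname, '.dat'];          % hypothetical file name
A = reshape(uint32(1:30), 3, 10);
fid = fopen(fileName, 'w');
fwrite(fid, A, 'uint32');
fclose(fid);

% Read block number k (1-based) of blockCols columns
blockCols = 4;                          % assumed block width; 30000 in the question
k = 2;                                  % block to load
bytesPerValue = 4;                      % size of one uint32
offset = (k - 1) * 3 * blockCols * bytesPerValue;

[fid, msg] = fopen(fileName, 'r');
assert(fid > 0, msg);
status = fseek(fid, offset, 'bof');     % move to the first byte of block k
assert(status == 0, 'fseek failed');
block = fread(fid, [3, blockCols], 'uint32=>uint32');
fclose(fid);
delete(fileName);

disp(block)                             % columns 5 to 8 of A
```

Because the offset is computed from the block index, blocks can be visited in any order, which a plain sequential fread loop does not allow.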
Raffaella Assogna
on 6 Oct 2022
Answers (1)
Jan
on 5 Oct 2022
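[fid, msg] = fopen(FileName, 'r', 'b');   % 'b' = big-endian byte order; use 'l' for little-endian files
assert(fid > 0, msg);
while true
    % Read the next 3-by-30000 block of uint32 values (returned as double)
    data = fread(fid, [3, 30000], 'uint32');
    if numel(data) < 3*30000
        break;   % fewer values than a full block: end of file reached
    end
    % ... process the block of data here
end
fclose(fid);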
7 Comments
Raffaella Assogna
on 5 Oct 2022
Walter Roberson
on 5 Oct 2022
The while true ensures that after processing the block of data, it will go back and try to read the next block.
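The loop in the answer stops as soon as a read returns fewer values than a full block, so a shorter final block is never processed. One possible variant, sketched below, keeps going until fread returns nothing at all. It assumes raw uint32 data with no header; the temporary file, blockCols, and the summing step are placeholders for illustration.

```matlab
% Demo file: 3-by-10 uint32 matrix, so blocks of 4 columns leave a partial block of 2
fileName = [tempname, '.dat'];          % hypothetical file name
A = reshape(uint32(1:30), 3, 10);
fid = fopen(fileName, 'w');
fwrite(fid, A, 'uint32');
fclose(fid);

blockCols = 4;                          % assumed block width; 30000 in the question
nBlocks = 0;
total = 0;
[fid, msg] = fopen(fileName, 'r');
assert(fid > 0, msg);
while true
    data = fread(fid, [3, blockCols], 'uint32');
    if isempty(data)
        break;                          % nothing left to read
    end
    nBlocks = nBlocks + 1;
    total = total + sum(data(:));       % placeholder for the real analysis
end
fclose(fid);
delete(fileName);
fprintf('%d blocks read, sum = %d\n', nBlocks, total);
```

With this variant the last block may have fewer than blockCols columns, so the processing step should use size(data, 2) rather than a fixed width.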
Raffaella Assogna
on 5 Oct 2022
Walter Roberson
on 6 Oct 2022
If you call fread() with a [3, 30000] size request and you only get [3, 1296] back, then your file does not contain 90000 elements of the requested size starting from that position.
Try the following test:
At the point where you are just about to do the fread() do
current_position = ftell(fid)
and after the end of loop do
fseek(fid, 0, 'eof');
last_position = ftell(fid)
At this point, last_position should be exactly the file size in bytes, and current_position would be the number of bytes from the beginning of the file at which the file was positioned when it was just about to do the fread().
The maximum number of columns of 3 uint32 that can be read from the file would be (last_position / 4)/3 -- there just isn't any more data in the file.
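The position check described above can be tried out on a small file. This is only a sketch: it assumes raw uint32 data in 3 rows with no header, and the temporary file and the 4-column block are made up for the demonstration.

```matlab
% Demo file with 3-by-10 uint32 values (120 bytes)
fileName = [tempname, '.dat'];          % hypothetical file name
fid = fopen(fileName, 'w');
fwrite(fid, reshape(uint32(1:30), 3, 10), 'uint32');
fclose(fid);

[fid, msg] = fopen(fileName, 'r');
assert(fid > 0, msg);
fread(fid, [3, 4], 'uint32');           % pretend one block has already been read
current_position = ftell(fid);          % bytes consumed so far
fseek(fid, 0, 'eof');
last_position = ftell(fid);             % file size in bytes
fseek(fid, current_position, 'bof');    % return to where reading left off
fclose(fid);
delete(fileName);

totalColumns = (last_position / 4) / 3;
remainingColumns = ((last_position - current_position) / 4) / 3;
fprintf('%d columns in file, %d still unread\n', totalColumns, remainingColumns);
```

If remainingColumns is smaller than the requested block width, fread can only return a shorter block.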
One thing I would wonder is whether possibly the dat file is compressed data, such as equivalent to zip. If so then there is the possibility that the [3, 30000] is represented in the file, but in a form that requires decoding.
Raffaella Assogna
on 6 Oct 2022
65175552/4/3 = 5431296 columns, so your file has room for about 181 times as much data as would be needed for [3, 30000], and the data = fread(fid, [3, 30000], 'uint32'); should have succeeded unless you had already read a portion of the file (or had used fseek to get near the end).
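The division can be worked through in full, under the assumption that the file is raw uint32 data in 3 rows with no header. The variable names are chosen for illustration; the file size and block width are the ones quoted in this thread.

```matlab
fileBytes = 65175552;                   % file size in bytes, as quoted in the thread
blockCols = 30000;                      % requested block width
totalColumns = fileBytes / 4 / 3;       % 4 bytes per uint32, 3 rows per column
fullBlocks = floor(totalColumns / blockCols);
leftoverColumns = mod(totalColumns, blockCols);
fprintf('%d columns = %d full blocks + %d leftover columns\n', ...
    totalColumns, fullBlocks, leftoverColumns);
```

The leftover of 1296 columns matches the [3, 1296] result mentioned earlier in the thread, which is what a sequential loop would get on its final read after 181 full blocks.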
Walter Roberson
on 6 Oct 2022
If you have a structured binary file, consider using memmapfile()
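A minimal sketch of the memmapfile() approach follows, assuming the file is raw uint32 data in 3 rows with no header and in the machine's native byte order. The temporary file, the field name x, and the block position are hypothetical.

```matlab
% Demo file: 3-by-10 uint32 matrix written in native byte order
fileName = [tempname, '.dat'];          % hypothetical file name
A = reshape(uint32(1:30), 3, 10);
fid = fopen(fileName, 'w');
fwrite(fid, A, 'uint32');
fclose(fid);

% Work out the number of columns from the file size
info = dir(fileName);
nCols = info.bytes / 4 / 3;

% Map the file as one 3-by-nCols uint32 array; the field name x is arbitrary
m = memmapfile(fileName, 'Format', {'uint32', [3, nCols], 'x'});

blockCols = 4;                          % assumed block width; 30000 in the question
firstCol = 5;
block = m.Data.x(:, firstCol:firstCol + blockCols - 1);   % copy one block into memory
% For a big-endian file on a little-endian machine: block = swapbytes(block);

clear m                                 % release the mapping before deleting the file
delete(fileName);
disp(block)
```

Indexing into m.Data.x looks like ordinary array indexing, so the block loop becomes a loop over column ranges. Note that memmapfile does not take a byte-order argument, so a file written in the other byte order needs swapbytes as indicated in the comment.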