smartread.m

Indexed read function for large numeric .csv files
226 Downloads
Updated 29 May 2011

This function operates on numeric CSV files. It first checks whether an index is available for the file; if there is none, it builds one and saves it under the name of the file being read with the letter "i" appended. It then uses the line indices to quickly return the requested range of data. Once the index has been built, the function should return the requested range effectively "instantly": I was able to get data blocks out of a 1 GB CSV file with ~1E5 rows and 1300 columns in about 0.15 seconds, compared to around 50 seconds for dlmread.

The performance benefit depends on both file size and data "shape". The index only records the position of the first element in each row, so files with only a few rows but comparatively many columns will not see as much of a performance gain.
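The indexing idea described above can be sketched as follows. This is a minimal illustration in Python, not the smartread.m source: the function names (`build_index`, `load_index`, `read_range`) and the on-disk index format (a flat run of little-endian 64-bit offsets) are assumptions made for the example. Only the "filename plus i" naming and the row-start indexing come from the description.

```python
import os
import struct


def build_index(csv_path):
    """Record the byte offset where each row starts; save it to csv_path + 'i'."""
    offsets = []
    pos = 0
    with open(csv_path, "rb") as f:
        for line in f:               # one full pass over the file, done once
            offsets.append(pos)
            pos += len(line)
    # Assumed index format: consecutive little-endian unsigned 64-bit integers.
    with open(csv_path + "i", "wb") as f:
        f.write(struct.pack("<%dQ" % len(offsets), *offsets))
    return offsets


def load_index(csv_path):
    """Load the row offsets, building the index first if it does not exist yet."""
    index_path = csv_path + "i"
    if not os.path.exists(index_path):
        return build_index(csv_path)
    with open(index_path, "rb") as f:
        raw = f.read()
    return list(struct.unpack("<%dQ" % (len(raw) // 8), raw))


def read_range(csv_path, r1, c1, r2, c2):
    """Return rows r1..r2 and columns c1..c2 (zero-based, inclusive) as floats."""
    offsets = load_index(csv_path)
    block = []
    with open(csv_path, "rb") as f:
        f.seek(offsets[r1])          # jump straight to the first requested row
        for _ in range(r2 - r1 + 1):
            fields = f.readline().decode("ascii").strip().split(",")
            block.append([float(x) for x in fields[c1:c2 + 1]])
    return block
```

Because only row starts are indexed, each requested row is still read and split in full before the columns are sliced, which is why very wide files with few rows benefit less.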

The row/column range uses the same format as dlmread, which is zero-based: what MATLAB calls column 1 is column 0 for this function, and the same goes for rows.
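To make the off-by-one concrete, here is a small sketch of the conversion from 1-based, inclusive indices (as used when indexing a MATLAB matrix) to a zero-based range in dlmread's [R1 C1 R2 C2] order. The helper name `to_zero_based` is hypothetical and is not part of smartread.m.

```python
def to_zero_based(row_first, col_first, row_last, col_last):
    """Convert 1-based inclusive indices to a zero-based [R1, C1, R2, C2] range."""
    if row_first < 1 or col_first < 1:
        raise ValueError("1-based indices must start at 1")
    if row_last < row_first or col_last < col_first:
        raise ValueError("range end must not precede range start")
    # Every bound shifts down by one; the range stays inclusive at both ends.
    return [row_first - 1, col_first - 1, row_last - 1, col_last - 1]


# Rows 1..10 and columns 1..3 in MATLAB terms:
print(to_zero_based(1, 1, 10, 3))
```

So a request for the first ten rows and first three columns is written as the range [0 0 9 2].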

Thanks to Walter Roberson for suggesting a means by which to index the files.

Cite As

Josh Warren (2024). smartread.m (https://www.mathworks.com/matlabcentral/fileexchange/31573-smartread-m), MATLAB Central File Exchange. Retrieved .

MATLAB Release Compatibility
Created with R2010a
Compatible with any release
Platform Compatibility
Windows macOS Linux
Categories
Large Files and Big Data

Version  Published  Release Notes
1.1.0.0             Edited file description.
1.0.0.0