MATLAB Answers

Import data tool speed

Brad_EE on 11 Dec 2020
Commented: Yair Altman on 13 Dec 2020
Platform: Windows 10 64-bit, 4 cores, 500 GB SSD, 32 GB RAM
File to import description: 80MB, contains comma delimited numbers and short character arrays. Overall format 27 columns x 400K lines.
Clicking the import data tool icon prompts for a file name, shows a message "Opening a large text file ..." and then displays the file contents in a table format on the GUI after ~5 seconds. The table displayed matches the file contents.
On the same GUI I set the output type to cell array and the range to A2:AA395201 (the entire file minus the header line). Clicking the Import Selection button displays the message "Importing Data..." and a status bar that stays gray for 35 minutes before it suddenly disappears. At that point the import is complete and the variable name appears in the workspace.
Why does the initial "Opening a large text file" step finish in 5 seconds while the import takes 35 minutes? For that first step to complete and display the data in the GUI table, it seems to have essentially imported the data already, but roughly 400X faster!

Accepted Answer

Yair Altman on 13 Dec 2020
The Import Tool GUI only shows you a preview of the data, based on the top N lines in the file. It does not read and process the entire file. Only when you click the <Import Selection> button is the entire file processed, using the range that you specified and the file format detected by the preview. This naturally takes much longer than the preview processing.
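To see how long full-file processing takes outside the GUI, one option is to time a programmatic import. The sketch below is only an illustration under stated assumptions, not a description of what the Import Tool does internally: the file name, column names, and row count are hypothetical stand-ins for the real 27-column file, and detectImportOptions/readtable are used as one common way to read delimited text.

```matlab
% Write a small stand-in CSV so the sketch runs on its own.
% (Hypothetical data: 3 columns x 1000 rows, not the real 27 x 400K file.)
fname = [tempname '.csv'];
fid = fopen(fname, 'w');
fprintf(fid, 'id,label,value\n');
for k = 1:1000
    fprintf(fid, '%d,item%d,%.3f\n', k, k, k/7);
end
fclose(fid);

% Detect the delimiter, header line, and column types from the file.
opts = detectImportOptions(fname);

% Read every data row in one call and time it.
tic;
T = readtable(fname, opts);
elapsedSeconds = toc;

% Convert to a cell array only if downstream code really needs one;
% a cell array stores each value separately, so it is costly for large files.
C = table2cell(T);

fprintf('Read %d rows x %d columns in %.3f s\n', height(T), width(T), elapsedSeconds);
delete(fname);
```

Running the same timing on the real file would show whether the 35 minutes comes from processing the file itself or from the GUI import path and the cell-array output.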

  4 Comments

Yair Altman on 13 Dec 2020
Each time that you scroll, MATLAB only needs to read and process a small number of lines. What you see is called "perceived performance": it does not mean that the entire file is in fact loaded at once. The very fact that there is a small lag of a few seconds each time that you scroll tells you that this part of the file is being read and processed at that moment (otherwise it would have been displayed immediately).
Brad_EE on 13 Dec 2020
Perhaps it is reading 35 lines at a time based on the relative position of the scroll bar. For instance, if the scroll bar is pulled down 3/4 of its length, only the 35 lines at that location in the file are actually read and displayed.
Yair Altman on 13 Dec 2020
Yes, this is exactly what I meant.
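The windowed reading described in these comments can be sketched roughly as follows. This is a guess at the general idea, not the Import Tool's actual implementation: the 35-line window and the 3/4 scroll position come from the comment above, while the stand-in file, the variable names, and the use of the DataLines import option are assumptions made for illustration.

```matlab
% Write a stand-in CSV with a header line and 4000 data rows (hypothetical data).
nRows = 4000;
fname = [tempname '.csv'];
fid = fopen(fname, 'w');
fprintf(fid, 'id,label,value\n');
for k = 1:nRows
    fprintf(fid, '%d,item%d,%.3f\n', k, k, k/7);
end
fclose(fid);

% Map a scroll-bar position to the first data row of the visible window.
windowSize     = 35;    % lines shown at once, as guessed in the comment above
scrollFraction = 0.75;  % scroll bar pulled down 3/4 of its length
firstRow = floor(scrollFraction * (nRows - windowSize)) + 1;

% Data row r sits on file line r + 1 because of the header line,
% so the window covers file lines firstRow+1 through firstRow+windowSize.
opts = detectImportOptions(fname);
opts.DataLines = [firstRow + 1, firstRow + windowSize];

% Only the 35 requested lines are parsed into the table.
W = readtable(fname, opts);

fprintf('Window shows rows %d to %d\n', W.id(1), W.id(end));
delete(fname);
```

Parsing 35 lines per scroll is cheap compared with converting all 400K lines, which fits the short lag seen while scrolling versus the long wait on the full import.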
