MATLAB Answers

Problem importing large text file (.txt) of data

Asked by Marcus HS on 13 Jun 2016
Latest activity Answered by Dr. Oscar Gaete on 16 Sep 2018 at 21:05

Hey everybody,

I have a space-delimited text file with about 30.000 rows that looks like this:

Ping: 37.639 ms
Download: 34.35 Mbit/s
Upload: 5.59 Mbit/s
Start: Tue May 31 08:18:01 2016
 10.|-- zrh04s08-in-x07.1e102.net  0.0%    10   23.5  26.6  22.9  42.8   5.9
  9.|-- fra02s15-in-x0d.1e102.net 10.0%    10   91.4 105.4  27.3 184.9  66.2
Ping: 56.94 ms
Download: 28.81 Mbit/s
Upload: 5.66 Mbit/s
Start: Tue May 31 08:20:01 2016
  9.|-- zrh04s08-in-x07.1e102.net  0.0%    10   25.1  23.4  19.3  32.2   3.6
  9.|-- fra02s15-in-x0d.1e102.net  0.0%    10   20.9  21.9  19.7  24.2   1.1
.
.
.

With the "Import Data" option in MATLAB it is easy to extract the data I want. The problem is that as soon as I import more than 10.000 cells at once, the time column (format HH:mm:ss), for example, changes from this (<10.000 cells):

'NaT'
'08:16:01'
'NaT'
'NaT'
'NaT'
'NaT'
'NaT'
'08:18:01'
'NaT'
'NaT'
'NaT'
'NaT'
'NaT'
'08:20:01'
'NaT'
'NaT'
'NaT'

to this (>10.000 cells):

'NaT'
'13-Jun-2016'
'NaT'
'NaT'
'NaT'
'NaT'
'NaT'
'13-Jun-2016'
'NaT'
'NaT'
'NaT'
'NaT'
'NaT'
'13-Jun-2016'
'NaT'
'NaT'
'NaT'

There is no error message whatsoever. I know I could just import the data in blocks, but since I have more data to analyze, that would be a lot of unnecessary work, and I would really like to understand what the problem is rather than work around it. Thanks a lot! (By the way, I am pretty new to MATLAB.)

  3 Comments

What data are you trying to read other than the time? And is the date of no interest, just the time?

Since the file looks regular in its formatting, I'd write a specific format string and use textscan to return precisely what is wanted and nothing else, but to do that I need to know what that is...
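As a rough sketch of that idea, and assuming the fields of interest are the three numeric measurements (the question does not say so), the lines can be read whole with textscan and the numbers pulled out with a line-specific sscanf format. Note that this uses textscan only to split the file into lines rather than one format string for the entire record; the temporary file and the helper name getNum are made up for illustration.

```matlab
% Sample lines copied from the question, written to a temporary file so the
% sketch is self-contained; in practice fname would be the real log file.
fname = [tempname '.txt'];
fid = fopen(fname, 'w');
fprintf(fid, '%s\n', ...
    'Ping: 37.639 ms', ...
    'Download: 34.35 Mbit/s', ...
    'Upload: 5.59 Mbit/s', ...
    'Start: Tue May 31 08:18:01 2016', ...
    'Ping: 56.94 ms', ...
    'Download: 28.81 Mbit/s', ...
    'Upload: 5.66 Mbit/s', ...
    'Start: Tue May 31 08:20:01 2016');
fclose(fid);

% Read every line as one string (no splitting on spaces).
fid = fopen(fname, 'r');
C = textscan(fid, '%s', 'Delimiter', '\n', 'Whitespace', '');
fclose(fid);
delete(fname);
lines = C{1};

% Hypothetical helper: select lines by prefix, then read the number that
% follows the prefix with a format written for that kind of line.
getNum = @(prefix) cellfun(@(s) sscanf(s, [prefix ' %f']), ...
    lines(strncmp(lines, prefix, numel(prefix))));

ping     = getNum('Ping:');       % ms
download = getNum('Download:');   % Mbit/s
upload   = getNum('Upload:');     % Mbit/s
```

Each result is a numeric column vector with one entry per measurement block, so the three vectors line up row by row.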

Hi Marcus,

As mentioned in the following documentation, when importing dates and times MATLAB interprets them as text strings unless you specify that they should be interpreted as date and time information:

http://www.mathworks.com/help/matlab/import_export/import-formatted-dates-and-times.html

Hence, 'textscan' would be a good choice for reading formatted data from a text file, since it gives better control over the formats. If you choose to generate a script from the 'Import' dialog, you will see that it also uses 'textscan' to read the formatted data from the file.

Refer to the following documentation for more details on 'textscan':

http://www.mathworks.com/help/matlab/ref/textscan.html

Since you also mentioned reading in blocks, you could refer to the following documentation for that:

http://www.mathworks.com/help/matlab/import_export/import-large-text-files.html

However, it is strange that you saw one format for up to 10,000 rows and a different format beyond that. Once you imported the data from the file using 'Import Data', did you try changing the format of the columns to check whether you see the same behavior?

-Ritesh
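To illustrate what specifying the date and time interpretation explicitly can look like, here is a minimal sketch that parses the "Start:" lines from the question with a fixed input format. It does not explain the Import Tool behavior; it only shows a way to avoid relying on automatic format detection. The variable names are illustrative, and pinning the locale to en_US is an assumption that the day and month abbreviations in the file are English.

```matlab
% Two "Start:" lines copied from the question.
startLines = {'Start: Tue May 31 08:18:01 2016'; ...
              'Start: Tue May 31 08:20:01 2016'};

% Drop the "Start: " prefix, leaving only the timestamp text.
stamps = strrep(startLines, 'Start: ', '');

% Parse with an explicit input format; the locale is pinned so that the
% English day and month abbreviations are recognized on any system.
t = datetime(stamps, 'InputFormat', 'eee MMM dd HH:mm:ss yyyy', ...
             'Locale', 'en_US');

% Display only the time of day; the full date is still stored in t.
t.Format = 'HH:mm:ss';
disp(t)
```

Because the full timestamp is kept and only the display format is changed, the date part remains available if it turns out to be needed later.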

  • '13-Jun-2016' is the date of your test. I guess it doesn't come from the text file.
  • Have you inspected the end of the file in an editor? I guess the "time-strings" differ after row 10.000.

2 Answers

Answer by Shameer Parmar on 16 Jun 2016

Hello Marcus,

Try reading the file with fopen() and fgetl() instead of importdata():

   % Read the file line by line into a cell array of character vectors.
   fid = fopen('ascii_file.txt', 'r');
   if fid == -1
       error('Could not open ascii_file.txt');
   end
   data = {};
   tline = fgetl(fid);
   while ischar(tline)            % fgetl returns -1 (numeric) at end of file
       data{end+1,1} = tline;     % empty lines are stored as ''
       tline = fgetl(fid);
   end
   fclose(fid);

Then use the cell array 'data' for further processing.
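As one possible next step, the hop lines can be picked out of 'data' and split into fields with a regular expression. This is only a sketch: the 'data' array below is a stand-in built from lines in the question, the names pat, hosts and nums are illustrative, and the meaning of the numeric columns (assumed here to be hop number, loss %, packets sent, then last, average, best, worst and standard deviation) is not stated in the thread.

```matlab
% Stand-in for the cell array 'data' built by the loop above (lines copied
% from the question).
data = {'Ping: 37.639 ms'; ...
        ' 10.|-- zrh04s08-in-x07.1e102.net  0.0%    10   23.5  26.6  22.9  42.8   5.9'; ...
        '  9.|-- fra02s15-in-x0d.1e102.net 10.0%    10   91.4 105.4  27.3 184.9  66.2'};

% One token for the hop number, one for the host name, one for the
% percentage, and six more for the remaining numeric columns.
pat = ['^\s*(\d+)\.\|--\s+(\S+)\s+([\d.]+)%' ...
       repmat('\s+([\d.]+)', 1, 6) '\s*$'];

tok = regexp(data, pat, 'tokens', 'once');   % empty for non-matching lines
tok = tok(~cellfun(@isempty, tok));          % keep only the hop lines

% Host names as a cell array, numeric columns as one row per hop line.
hosts = cellfun(@(c) c{2}, tok, 'UniformOutput', false);
nums  = cell2mat(cellfun(@(c) str2double(c([1 3:9])), tok, ...
                         'UniformOutput', false));
```

Lines that do not match the pattern (such as the "Ping:" line) are simply skipped, so the same call can be run on the complete 'data' array.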



Answer by Dr. Oscar Gaete on 16 Sep 2018 at 21:05

Same problem here. Using the Import Data GUI, the datetime values get corrupted when importing more than 10.000 cells. Solution: in the GUI, instead of importing the data directly, generate a function, then call that function from the command window or from a script. That worked for me in version R2016b. Cheers
