Tall array error when reading a CSV file

Hi, why am I encountering this error while training a machine learning model on my tall array data?
Error using matlab.io.datastore.TabularTextDatastore/readData (line 77)
Mismatch between file and format character vector.
Trouble reading 'Numeric' field from file (row number 6738, field number 4) ==> I/O Timeout,I/O Timeout,I/O
Timeout,I/O Timeout,I/O Timeout,I/O Timeout,I/O Timeout,I/O Timeout,I/O Timeout,I/O Timeout,I/O Timeout,I/O
Timeout,I/O Timeout,I/O Timeout,I/O Timeout,I/O Timeout,I/O Time...
Learn more about errors encountered during GATHER.
FYI, my CSV file is 200000x344. I tried removing row number 6738, but the error still occurs.
Please advise.

5 Comments

dpb on 20 Nov 2019
We'd have to be able to see the first part of the file up to and including some past the offending record to tell for sure, but it looks like that record contained a bunch of text fields, not numeric. Perhaps the following record did, too, after you removed the first one? But, w/ nothing to look at, it's just a guess.
Hi, thanks for your reply. How would I remove the earlier record data if it is stored in my previous datastore? I don't think my file contains text fields; if it did, why would the error point only at line 6738 and not at line 1? Here is part of my data:
Annotation 2019-11-20 091838.png
dpb on 20 Nov 2019
That would all depend upon what created the file and how. You also didn't show us the code line used to do the read so we can't know what function was used and thereby know what it uses for parsing...some of the higher-level routines have quite a bit of internal data content analysis built into them.
Open the input file in the editor and save 7000 or so records to a new file and attach it...trying to diagnose w/o data is just guessing at what might be...
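For a file too large to open comfortably in the editor, the same sample can be cut programmatically. The following is a minimal sketch, assuming plain line-oriented text; `srcFile`, `dstFile` and `nKeep` are hypothetical names, and the small demo file only exists so the snippet runs on its own.

```matlab
% Hypothetical demo input so the sketch is self-contained; in practice
% point srcFile at the real CSV instead of writing this file.
srcFile = [tempname '.csv'];
fid = fopen(srcFile, 'w');
for k = 1:10
    fprintf(fid, '%d,%d\n', k, 2*k);
end
fclose(fid);

nKeep   = 4;                          % e.g. 7000 for the file in this thread
dstFile = [tempname '_sample.csv'];   % hypothetical output name

fin  = fopen(srcFile, 'r');
fout = fopen(dstFile, 'w');
nCopied = 0;
while nCopied < nKeep
    txt = fgetl(fin);                 % one line, newline stripped
    if ~ischar(txt)                   % fgetl returns -1 at end of file
        break
    end
    fprintf(fout, '%s\n', txt);       % copy the line unchanged
    nCopied = nCopied + 1;
end
fclose(fin);
fclose(fout);
```

Copying raw lines rather than parsing them keeps any malformed record intact, which is the point of the sample.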
Hi,
I already found the problem. My file contains the wrong data in row 56892, but the error stated 6738, maybe because the tall array had already chunked my data.
Annotation 2019-11-20 091838.png
BTW, thanks for your reply.
dpb on 20 Nov 2019
"the wrong data in row 56892, but the error stated 6738"
I've not used the tall array stuff, but apparently the error message line count is coming from the segment in use rather than being referenced back to the beginning of the file. That's probably worth a "Quality of Implementation" bug report to TMW to make debugging easier.
The clue to what was wrong is that it did echo the offending record content, so a search for that string in the file would locate it...
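A search like that can be sketched in a few lines of MATLAB. This is only an illustration under stated assumptions: `csvFile`, `needle` and `badLines` are hypothetical names, the small demo file stands in for the real CSV, and the reported numbers are absolute line numbers in the file (a header line, if present, counts as line 1), not the chunk-relative row from the error message.

```matlab
% Hypothetical demo file so the sketch runs on its own; in practice set
% csvFile to the path of the real CSV.
csvFile = [tempname '.csv'];
fid = fopen(csvFile, 'w');
fprintf(fid, 'a,b,c\n1,2,3\n4,I/O Timeout,6\n7,8,9\n');
fclose(fid);

needle   = 'I/O Timeout';   % text echoed in the error message
badLines = [];              % absolute line numbers that contain needle

fid = fopen(csvFile, 'r');
lineNo = 0;
while true
    txt = fgetl(fid);       % one line, newline stripped
    if ~ischar(txt)         % fgetl returns -1 at end of file
        break
    end
    lineNo = lineNo + 1;
    if contains(txt, needle)
        badLines(end+1) = lineNo; %#ok<AGROW>
    end
end
fclose(fid);

disp(badLines)
```

Reading line by line keeps memory use flat, which matters for a file of the size mentioned in the question.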

Answers (0)

Release: R2019b
Asked: 19 Nov 2019
Commented: dpb on 20 Nov 2019
