Read data from a CSV file and process it quickly

I have data that is read from a CSV file, like this:
10;20;2;3;45;56;87;56;988;434;10;20;2;3;45;56;87;56;988;434
I have 9lakhs of data like this. What is the best way to split each row at ";" and form a matrix?
My current approach consumes too much time.
Thanks in advance.

12 Comments

What does "9lakhs" mean? What exactly is time consuming in your program?
9lakhs row of data
eg:
10;20;2;3;45;56;87;56;988;434;10;20;2;3;45;56;87;56;988;434
10;20;2;3;45;56;87;56;988;434;10;20;2;3;45;56;87;56;988;434
10;20;2;3;45;56;87;56;988;434;10;20;2;3;45;56;87;56;988;434
.........
............
What command are you using to read your data in? You could just use the uiimport wizard, or the importdata command with ";" specified as the delimiter.
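A minimal sketch of the importdata suggestion, assuming a single header line. The temporary file and its three-column contents are made up for illustration and only stand in for the real CSV file:

```matlab
% Write a small sample file (hypothetical content, for illustration only)
fname = [tempname, '.csv'];
fid = fopen(fname, 'w');
fprintf(fid, 'a;b;c\n1;2;3.5\n4;5;6\n');
fclose(fid);

% importdata(filename, delimiterIn, headerlinesIn):
% use ';' as the delimiter and skip 1 header line
A = importdata(fname, ';', 1);
M = A.data;        % numeric matrix, one row per data line
hdr = A.textdata;  % header text as returned by importdata

delete(fname);     % remove the sample file again
```

importdata is convenient, but it is not necessarily the fastest option for a file with several hundred thousand rows.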
@shaz: "9lakhs row of data" is not helpful for me, because I cannot imagine what a "9lakhs" is. I do not even know of any term which starts with a number. It would be helpful if you used English words only.
I ask again: What is the time-consuming part you want to get fixed?
The file has headers in the first row, followed by the data.
I am using textscan to read the data from the file.
Calling strtok on each element (row-wise) and placing the results in a matrix consumes a lot of time, because the for loop needs to run lakhs of times.
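A rough sketch of how textscan can split the fields itself, so that no strtok loop is needed. It assumes the number of columns (ncol) is known in advance and that there is exactly one header line; the sample file is made up for illustration:

```matlab
% Write a small sample file (hypothetical content, for illustration only)
fname = [tempname, '.csv'];
fid = fopen(fname, 'w');
fprintf(fid, 'a;b;c\n10;20;4.5\n2;3;45\n');
fclose(fid);

ncol = 3;  % assumed to be known; the rows shown in this thread have 20 values
fid = fopen(fname, 'r');
% One '%f' per column; textscan splits at ';' and skips the header line.
% 'CollectOutput' merges the numeric columns into a single matrix.
C = textscan(fid, repmat('%f', 1, ncol), ...
    'Delimiter', ';', 'HeaderLines', 1, 'CollectOutput', true);
fclose(fid);
M = C{1};  % numeric matrix, one row per data line

delete(fname);  % remove the sample file again
```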
Jan on 7 Nov 2012
Edited: Jan on 7 Nov 2012
Dear shaz, I'm not sure if you are kidding. While I do not know "9lakhs", I do not know the unnumbered "lakhs" either. Please, could somebody enlighten me what this term means? Perhaps "lots"?
Please show us the lines of code which consume more time than you expect. A text description of your program does not allow us to improve your code.
per isakson on 7 Nov 2012
Edited: per isakson on 7 Nov 2012
@Jan: "A lakh is a unit in the South Asian numbering system equal to one hundred thousand. [...] In Indian English the word is used both as an attributive and non-attributive noun, and with either a marked ("-s") or unmarked plural" - wiki informed me.
@shaz: is the file free from decimal separators?
No. Here is an example:
a;b;c;d;f;g;i;p;t;r;l;e;r;y;u % Headers
10;20;2;3;4.5;56;87;56;988;434;10;20;2;3;45;56;87;56;988;434
10;20;2;3;4.5;56;87;56;988;434;10;20;2;3;45;56;87;56;988;434
10;20;2;3;45;56;87;56;988;434;10;20;2;3;45;56;87;56;988;434
Jan on 8 Nov 2012
Edited: Jan on 8 Nov 2012
@per: Thanks! I'm surprised that shaz does not want to explain this himself. Perhaps he or she is not interested in my answers.
@shaz: Please post comments on the already suggested solutions, so that we can know whether the problem is solved or how the suggestion could be improved.
a;b;c;d;f;g;i;p;t;r;l;e;r;y;u % Headers
10;20;2;3;4.5;56;87;56;988;434;10;20;2;3;45;56;87;56;988;434
10;20;2;3;4.5;56;87;56;988;434;10;20;2;3;45;56;87;56;988;434
10;20;2;3;45;56;87;56;988;434;10;20;2;3;45;56;87;56;988;434
....................% data
.....................% data
Likewise, we have 9 lakh rows (900000 rows) of data and one header row.
Now, how can I get the headers (1st row) as a cell array and the rest of the data as a matrix?
@shaz: Although it looks obvious that you neither read the comments nor care about them: Please stop using the term "lakh" without any further explanation, because it is not part of the English language and might confuse the readers. It is very friendly that Per explained this, but this was actually your turn.
Can you imagine the effects when you do not care about questions for clarification?


 Accepted Answer

Jan on 7 Nov 2012
Edited: Jan on 8 Nov 2012
fid = fopen(FileName, 'r');
if fid == -1, error('Cannot open file!'); end
header = fgetl(fid); % Read header line
pos = ftell(fid); % Store position of the first data line [EDITED: fseek -> ftell]
line1 = fgetl(fid); % First data line
data = sscanf(line1, '%g;', Inf); % Parse it once to count the columns
ncol = length(data);
fmt = repmat('%g; ', 1, ncol); % One '%g; ' per column
fseek(fid, pos, 'bof'); % Restore file position
data = fscanf(fid, fmt, Inf); % Read all values into one column vector
data = transpose(reshape(data, ncol, [])); % One matrix row per file line
fclose(fid);
Please check whether the reshape and transpose satisfy your needs.
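To check the reshape and transpose step on a small case, here is a sketch with made-up values. The vector stands in for what fscanf returns for a file with the two rows "1;2;3" and "4;5;6". Splitting the header line into a cell array with regexp is an addition that is not part of the code above:

```matlab
% fscanf returns the values in file order as one column vector
data = (1:6).';   % stands in for the rows "1;2;3" and "4;5;6"
ncol = 3;

% reshape fills column by column, so each column of the intermediate
% 3x2 array holds one file row; the transpose turns those into matrix rows
M = transpose(reshape(data, ncol, []));

% Split a header line (as read by fgetl) at ';' into a cell array of strings
header = 'a;b;c';   % hypothetical header line
names = regexp(header, ';', 'split');
```

Here M has one matrix row per file row, and names is a 1-by-3 cell array with one entry per column name.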

3 Comments

a;b;c;d;f;g;i;p;t;r;l;e;r;y;u % Headers
10;20;2;3;4.5;56;87;56;988;434;10;20;2;3;45;56;87;56;988;434
10;20;2;3;4.5;56;87;56;988;434;10;20;2;3;45;56;87;56;988;434
10;20;2;3;45;56;87;56;988;434;10;20;2;3;45;56;87;56;988;434
....................% data
.....................% data
Likewise, we have 9 lakh rows (900000 rows) of data and one header row.
Now, how can I get the headers (1st row) as a cell array and the rest of the data as a matrix?
Does this comment contain a question?
@simon: thanks for the support...great

Asked on 7 Nov 2012
