Read CSV: convert char array to numeric matrix

Per Sundqvist on 2 Jun 2020
Edited: Stephen23 on 3 Jun 2020
Hi!
I have a CSV file that looks like the sample below (the lines may wrap here; two data rows are shown).
Each row has the structure: %d, date (or %s), %f, %f, %f, %f, ...
Nch=26;
strcmp1=['%f ${yyyy-MM-dd hh:mm:ss:ms}D '];
for j=1:Nch
strcmp1=[strcmp1,'%f '];
end
---
Headerlines...
1,2020-06-02 09:30:43:262,24.54,0,24.963,0,25.077,0,25.349,0,25.46,0,-54.779,0,25.439,0,25.492,0,25.516,0,25.627,0,25.656,0,-9.9E+37,0,-38.479,0,-26.866,0,-14.383,0,24.201,0,24.506,0,24.709,0,-9.9E+37,0,24.989,0,25.2,0,25.413,0,26.149,0,26.168,0,25.565,0,25.622,0
2,2020-06-02 09:30:53:246,24.527,0,24.963,0,25.072,0,25.333,0,25.468,0,-22.303,0,25.453,0,25.487,0,25.508,0,25.627,0,25.619,0,-9.9E+37,0,-15.862,0,-49.783,0,-43.28,0,24.185,0,24.534,0,24.723,0,-9.9E+37,0,24.982,0,25.197,0,25.411,0,26.148,0,26.145,0,25.553,0,25.62,0
---
I tried the following, but it does not work:
fid = fopen(file, 'rt');
Data = textscan(fid, strcmp1, 'headerLines', 35,'Delimiter',',');
fclose(fid)
---
Data =
1×26 cell array
Columns 1 through 5
{0×1 double} {0×1 double}
---
I also tried readtable (the file is a .csv):
X= readtable(file);
X2=X.Variables;
X3=X2(1:end-1,3:end) % bad data in last row; the first 2 columns are %d and the date, which I don't want to see
str1=X3{1,1} %check 1 element, ok
X4=cellfun(@(str)strrep(str,' ',''), X3)
str2=convertCharsToStrings(str1)
strrep(str2,' ','')
--- (Part of output below)
>> X3
X3 =
51×52 cell array
Columns 1 through 5
{' 2 4 . 5 4 ' } {' 0 '} {' 2 4 . 9 6 3 '} {' 0 '} {' 2 5 . 0 7 7 '}
{' 2 4 . 5 2 7 '} {' 0 '} {' 2 4 . 9 6 3 '} {' 0 '} {' 2 5 . 0 7 2 '}
{' 2 4 . 5 3 5 '} {' 0 '} {' 2 4 . 9 4 2 '} {' 0 '} {' 2 5 . 0 7 2 '}
---
Question: I cannot figure out how to get rid of the spaces " " in the text above and convert it to a float (%f). How do I do that? It seems that ' 2 4 . 5 4 ' is a char array, and that strrep needs a string. I tried converting it to a string and then calling strrep, but that does not work either. I'm so close... I want the numerical matrix.
BR,
Per
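One way to see why strrep(str,' ','') has no effect is to look at the character codes. The following is a minimal sketch, assuming the gaps are NUL characters (code 0) left over from a two-byte text encoding rather than real spaces (code 32). The sample value is constructed for illustration and is not read from the actual file:

```matlab
% Hypothetical sample: '24.54' with a NUL character (code 0) after each
% character, which the Command Window may display as blanks.
raw = reshape([double('24.54'); zeros(1,5)], 1, []);
str = char(raw);

codes = double(str);          % inspect the codes: 0 is NUL, 32 would be a real space
clean = str(codes ~= 0);      % drop the NUL characters
val   = str2double(clean);    % convert the cleaned text to a number

% The same clean-up applied to a whole cell array of such char vectors:
C = {str, str};
M = cellfun(@(s) str2double(s(double(s) ~= 0)), C);
```

If double(str) shows 32 instead of 0, the gaps are ordinary spaces and strrep on the char vector should work directly.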
Per Sundqvist on 3 Jun 2020
Hi again, I am now trying to read the file line by line:
file='Y:\04 Provning\BypassVUCLprov\VUCLN BP 1600A 2020 with Temp\Agilent Tempdata\Data 0x2007 0x0957 6_2_2020 19_12_52.csv';%
fid = fopen(file);
for j=1:36
tline = fgetl(fid);%headerlines
end
while ischar(tline)
%disp(tline)
tline = fgetl(fid)
tline2=strrep(tline,',',';')
ix=strfind(tline2,';');
tline3=tline2(ix(2)+1:end) %skip 2 first
t = sscanf(tline3, '%f')
Dline = textscan(tline3,strcmp1)
end
fclose(fid);
However, I think the data type is strange: it is not a string, MATLAB says it is char. Below is a comment from another thread that might give a clue. Is what I read "ASCII" rather than a string? It looks like normal text to me when I open it in Excel. MATLAB also prints it on screen as something that looks like text, but maybe it is something else?
---
"When you read a line of text it is returned as an array of characters (which us computer geeks refer to as a 'string'). When you asked for t(1) you were fooled into thinking it was numbers because your first value only contained a single character.
However, it's not. And you discovered this when you tried to get t(2). The ASCII value for the character '1' is 49, so that happy coincidence is not so happy after all.
What you need to do is process the string to extract the numbers. Here, they are represented in base-10 form, which is how humans like to write numbers. But that's not how computers like their numbers, so it takes some code to translate."
---
BR/ Per
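A char array is MATLAB's ordinary text type, so the data type itself is not the obstacle. Below is a sketch of parsing one delimited line into numbers, using a shortened copy of the first data row from the question and assuming the line has already been read as ordinary single-byte or UTF-8 text. Note that sscanf with a plain '%f' stops at the first delimiter, so the delimiter is included in the format here:

```matlab
% Shortened copy of the first data row (values taken from the sample in the question).
tline = '1,2020-06-02 09:30:43:262,24.54,0,24.963,0,-9.9E+37,0';

ix     = strfind(tline, ',');       % positions of all commas
numtxt = tline(ix(2)+1:end);        % skip the index and timestamp fields

vals = sscanf(numtxt, '%f,').';     % '%f,' is reused until the text is exhausted

d = double('1');                    % character code of the digit '1', not the number 1
```

Here d is 49, which is the point made in the quoted comment: indexing into text returns characters, and the numbers have to be parsed out with sscanf, str2double, or textscan.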


Answers (1)

Stephen23 on 3 Jun 2020
Edited: Stephen23 on 3 Jun 2020
Because MATLAB does not handle UCS-2 files, I first converted your file to UTF-8 (attached).
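The answer does not say how the conversion was done. One possible way to do it inside MATLAB is sketched below, assuming the file is little-endian UTF-16 (of which UCS-2 is a subset). The file names are temporary placeholders, and the content is a one-line stand-in written by the example itself:

```matlab
% Create a small UTF-16LE stand-in file (hypothetical content and file names).
src = [tempname, '.csv'];
dst = [tempname, '.csv'];
fid = fopen(src, 'w');
fwrite(fid, unicode2native('1,2020-06-02 09:30:43:262,24.54,0', 'UTF-16LE'), 'uint8');
fclose(fid);

% Read the raw bytes and decode them as UTF-16LE text.
fid = fopen(src, 'r');
bytes = fread(fid, Inf, 'uint8=>uint8').';
fclose(fid);
if numel(bytes) >= 2 && isequal(bytes(1:2), uint8([255 254]))
    bytes = bytes(3:end);           % skip a byte-order mark if present
end
txt = native2unicode(bytes, 'UTF-16LE');

% Write the text back out as UTF-8.
fid = fopen(dst, 'w');
fwrite(fid, unicode2native(txt, 'UTF-8'), 'uint8');
fclose(fid);

check = fileread(dst);              % ASCII content reads back unchanged
```

A text editor that can save with a different encoding would do the same job; the byte order is an assumption and should be checked against the real file.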
This imports all of the main matrix data:
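opt = {'Delimiter',',','CollectOutput',true};
fnm = 'Data 0x2007 0x0957 6_2_2020 09_30_43.csv';
[fid,msg] = fopen(fnm,'rt');
assert(fid>=3,msg) % stop with the system message if the file could not be opened
% Skip lines until the one that starts with 'SCAN,':
hdr = '';
while ~strncmpi(hdr,'SCAN,',5)
    hdr = fgetl(fid);
end
hdr = regexp(hdr,',','split'); % split that line into its fields
nmc = numel(hdr); % number of columns
fmt = ['%f%s',repmat('%f',1,nmc-2)]; % or use DATETIME format instead of %s
out = textscan(fid,fmt,opt{:}); % read the remaining lines as data
fclose(fid);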
Giving all 51 rows of data:
>> out
out =
[51x1 double] {51x1 cell} [51x52 double]
Checking the first few rows of data:
>> out{2}(1:4)
ans =
'2020-06-02 09:30:43:262'
'2020-06-02 09:30:53:246'
'2020-06-02 09:31:03:246'
'2020-06-02 09:31:13:246'
>> out{3}(1:4,:)
ans =
Columns 1 through 8
24.54 0 24.963 0 25.077 0 25.349 0
24.527 0 24.963 0 25.072 0 25.333 0
24.535 0 24.942 0 25.072 0 25.33 0
24.519 0 24.958 0 25.075 0 25.336 0
Columns 9 through 16
25.46 0 -54.779 0 25.439 0 25.492 0
25.468 0 -22.303 0 25.453 0 25.487 0
25.487 0 9.019 0 25.442 0 25.508 0
25.492 0 -67.765 0 25.437 0 25.489 0
Columns 17 through 24
25.516 0 25.627 0 25.656 0 -9.9e+37 0
25.508 0 25.627 0 25.619 0 -9.9e+37 0
25.508 0 25.632 0 25.627 0 -59.366 0
25.516 0 25.637 0 25.635 0 19.199 0
Columns 25 through 32
-38.479 0 -26.866 0 -14.383 0 24.201 0
-15.862 0 -49.783 0 -43.28 0 24.185 0
-39.344 0 -8.778 0 -56.062 0 24.211 0
-23.072 0 -46.635 0 -20.713 0 24.235 0
Columns 33 through 40
24.506 0 24.709 0 -9.9e+37 0 24.989 0
24.534 0 24.723 0 -9.9e+37 0 24.982 0
24.523 0 24.722 0 -196.5 0 24.98 0
24.522 0 24.736 0 -9.9e+37 0 25.025 0
Columns 41 through 48
25.2 0 25.413 0 26.149 0 26.168 0
25.197 0 25.411 0 26.148 0 26.145 0
25.16 0 25.417 0 26.157 0 26.143 0
25.155 0 25.407 0 26.15 0 26.139 0
Columns 49 through 52
25.565 0 25.622 0
25.553 0 25.62 0
25.54 0 25.607 0
25.533 0 25.595 0
>>
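As a possible follow-up, the imported cell array can be reduced to the numeric matrix asked for in the question. This is a sketch, assuming that every second column of out{3} is the zero column seen in the output above and that -9.9e+37 is an instrument placeholder rather than a real reading (the answer states neither). The small out below is a stand-in built from a few values copied from the rows above:

```matlab
% Stand-in for the textscan result (two rows, a few values copied from above).
out = {[1;2], ...
       {'2020-06-02 09:30:43:262'; '2020-06-02 09:30:53:246'}, ...
       [24.54 0 24.963 0 -9.9e37 0; 24.527 0 24.963 0 -9.9e37 0]};

% Convert the timestamp text to datetime values.
t = datetime(out{2}, 'InputFormat', 'yyyy-MM-dd HH:mm:ss:SSS');

% Keep only the odd columns (the readings), dropping the interleaved zeros.
M = out{3}(:, 1:2:end);

% Assumption: treat the -9.9e+37 placeholder as missing data.
M(M == -9.9e37) = NaN;
```

With the full import, M would have one column per channel, which matches the 26 channels (Nch=26) mentioned in the question for the 52 columns of out{3}.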
