Textscan: how to ignore single '-' characters, while preserving '-' in negative numbers?

Question

R V on 13 Sep 2012


I use the following code to read the block below.

fid = fopen('data.csv');
C = textscan(fid,'%s%f%f%f%f%f%f%f%f','headerlines',1,'delimiter',';');
fclose(fid);

Because of the single '-' characters in data.csv this does not work yet. I want to ignore single '-' characters from the input and use NaN values there.

How can I read single '-' characters as NaN? I tried 'TreatAsEmpty', but then negative values are turned into positive ones, because negative values also contain a '-' character and 'TreatAsEmpty' removes those as well.

Block:

Headerline
01-01-2006 (00 uur);-;-1.61;-;-0.70;-;1;-;239
01-01-2006 (01 uur);-;-1.66;-;-0.70;-;-;-;1108
01-01-2006 (02 uur);-;-1.68;-;-0.75;-;1;-;1827
01-01-2006 (03 uur);-;-1.64;-;-0.77;-;-;-;-
01-01-2006 (04 uur);-;-1.62;-;-0.74;-;-;-;-
01-01-2006 (05 uur);-;-1.61;-;-0.74;-;1;-;2053
01-01-2006 (06 uur);-;-1.66;-;-0.75;-;-;-;2870
01-01-2006 (07 uur);-;-1.68;-;-0.80;-;0;-;3585
01-01-2006 (08 uur);-;-1.64;-;-0.80;-;-;-;-
01-01-2006 (09 uur);-;-1.63;-;-0.79;-;-;-;-
01-01-2006 (10 uur);-;-1.62;-;-0.77;-;-;-;-
01-01-2006 (11 uur);-;-1.62;-;-0.74;-;1;-;3967

[EDITED, Jan, code and file contents formatted]

1 Comment

Jan on 13 Sep 2012

Yes, per.


Answer 1

per isakson on 13 Sep 2012


Edited: per isakson on 13 Sep 2012


Try this

    str = fileread('cssm.txt');
    str = strrep( str, '-;', 'nan;' );
    nl  = [char(13),char(10)];
    str = regexprep( str, [';-\s*',nl], [';nan',nl] );
    C = textscan( str,'%s%f%f%f%f%f%f%f%f','headerlines',1,'delimiter',';');

where cssm.txt contains the rows of text in the question.

The approach is:

1. Read the whole file as text.
2. Replace each "-" that stands for a missing value with NaN.
3. Parse the modified string with textscan.

Note: the value of the variable, nl, must match the end of line characters in your file.
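
One possible way to sidestep the end-of-line mismatch, offered as a sketch and not as part of the original answer, is to let the regular expression accept either CR LF or LF and write back whatever it matched. The sample string is an assumption built from two rows of the question's data; in practice str would come from fileread.

```matlab
% Sample text with Windows line endings (CR LF); rows taken from the question.
str = sprintf(['Headerline\r\n', ...
    '01-01-2006 (00 uur);-;-1.61;-;-0.70;-;1;-;239\r\n', ...
    '01-01-2006 (03 uur);-;-1.64;-;-0.77;-;-;-;-\r\n']);

% Replace a lone '-' that is directly followed by the delimiter.
str = strrep(str, '-;', 'nan;');

% Replace a lone '-' at the end of a row. The token (\r?\n) captures
% either CR LF or LF, and $1 writes the same line ending back.
str = regexprep(str, ';-[ \t]*(\r?\n)', ';nan$1');

C = textscan(str, '%s%f%f%f%f%f%f%f%f', 'HeaderLines', 1, 'Delimiter', ';');
```

With this pattern the variable nl is no longer needed, so the same code should handle files saved with either line ending.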

Jan, thanks for formatting the question.

4 Comments

per isakson on 13 Sep 2012

Edited: per isakson on 13 Sep 2012


@Roel, I guess you need to change

nl = [char(13),char(10)];

to

nl = [char(10)];

according to my "Note:". You could check with

double( str(1:80) )

and look for the number "10". Is it preceded by "13" or not?
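
The same check can also be done programmatically. The following is a minimal sketch, assuming a made-up sample string in place of the file contents; hasCRLF is a hypothetical variable name.

```matlab
% Made-up sample; in practice str would come from fileread.
str = sprintf('Headerline\r\n01-01-2006 (00 uur);-;-1.61\r\n');

% True if the text contains at least one CR LF pair (13 followed by 10).
hasCRLF = ~isempty(strfind(str, [char(13), char(10)]));

% Pick the end-of-line sequence to use in the regexprep call.
if hasCRLF
    nl = [char(13), char(10)];
else
    nl = char(10);
end
```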


@Matt, your expression is better; it is shorter and more robust. It actually checks whether "-" is followed by a digit. I tried that approach myself, but made a mistake :(. So replace

    str = strrep( str, '-;', 'nan;' );
    nl  = [char(13),char(10)];
    str = regexprep( str, [';-\s*',nl], [';nan',nl] );

by

str = regexprep(str,'-(?!\d)','nan')
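
To illustrate what the negative lookahead does, here is a small sketch applied to a single row from the question. The variable names row and cleaned are assumptions for this example only.

```matlab
% One row from the question, with lone '-' fields and negative numbers.
row = '01-01-2006 (03 uur);-;-1.64;-;-0.77;-;-;-;-';

% Replace every '-' that is NOT followed by a digit. The hyphens in the
% date and the minus signs of -1.64 and -0.77 are followed by digits,
% so they are left alone.
cleaned = regexprep(row, '-(?!\d)', 'nan');
% cleaned is '01-01-2006 (03 uur);nan;-1.64;nan;-0.77;nan;nan;nan;nan'

% Parse the cleaned row; the 'nan' fields are read as NaN by %f.
C = textscan(cleaned, '%s%f%f%f%f%f%f%f%f', 'Delimiter', ';');
```

Note that the pattern relies on every negative number starting with a digit after the minus sign; a value written as "-.5" would need a different lookahead.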

R V on 14 Sep 2012

Brilliant! this works very well. Thanks a lot Per and Matt!
