fprintf is printing strange characters instead of numbers

Question

0 votes

I want to print a vector of unsigned integers to a text file, with a space between each number, but the file I get contains only strange symbols. I must be making some trivial mistake; this is not the first time it has happened, but I can't remember what the fix was. It happens on R2018b (and I remember it happening on older versions as well). Sample code:

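clear
data = zeros(1, 1615, 'uint32');     % row vector of unsigned 32-bit integers
data(1:2:50) = 1;                    % set elements 1, 3, ..., 49 to 1
output = fopen('output.txt', 'wt');  % open the file for writing in text mode
fprintf(output, '%d ', data);        % write each value followed by a space
fclose(output);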

Output I get is: ‱‰‱‰‱‰‱‰‱‰‱‰‱‰‱‰‱‰‱‰‱‰‱‰‱‰‱‰‱‰‱‰‱‰‱‰‱‰‱‰‱‰‱‰‱‰‱‰‱‰‰‰‰‰‰‰‰‰‰ ...

Output I want is: 1 0 1 0 1 0 1 0 1 0 1 0 1 0 1 ...

17 Comments

Walter Roberson on 3 Oct 2018

Yes, unfortunately if there is no byte order mark then it gets more difficult to figure out without false positives.

I have some internal versions of the detection routine that go on to detect zip, image, xls and .mat files; my intention was to continue on to detecting character encodings such as ISO-8859-1 versus ISO-8859-6 or Shift-JIS. I researched that, but once I got into the Windows code pages, the number of cases started to feel too big for me to bother. Some of the cases could not be told apart (except perhaps by statistics); others differed in only a single code point (that is, one code point was assigned a meaning in one character set but not in the other), so if you saw that code point you could rule out a particular character set, but its absence would not prove the other...
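As a rough illustration of the easy case mentioned above (a byte order mark is present), the following is only a minimal sketch and not the detection routine discussed in this thread. It checks the first bytes of a file against the standard UTF-8 and UTF-16 BOM signatures; the variable names are hypothetical, and UTF-32 BOMs are deliberately ignored. For the file from the question, which has no BOM, it can only report 'unknown'.

```matlab
% Write a small BOM-less file like the one in the question.
fid = fopen('output.txt', 'wt');
fprintf(fid, '%d ', [1 0 1 0]);
fclose(fid);

% Read up to the first four bytes as numbers.
fid = fopen('output.txt', 'r');
head = fread(fid, [1 4], 'uint8=>double');
fclose(fid);

% Compare against the standard BOM signatures (UTF-32 not handled here).
if numel(head) >= 3 && isequal(head(1:3), [239 187 191])   % EF BB BF
    enc = 'UTF-8';
elseif numel(head) >= 2 && isequal(head(1:2), [255 254])   % FF FE
    enc = 'UTF-16LE';
elseif numel(head) >= 2 && isequal(head(1:2), [254 255])   % FE FF
    enc = 'UTF-16BE';
else
    enc = 'unknown';   % no BOM: the encoding has to be guessed some other way
end
disp(enc)
```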

Rik on 3 Oct 2018

That is why I came close to giving up, even for telling UTF-8 and windows-1252 apart, which should be relatively easy, but turns out to be non-trivial. Notepad++ seems to do a better job than what I can manage with Matlab.

For the files I'm using, my FEX submission works, but I have no idea how future-proof that function is (or past-proof for that matter).

Answer 1

Guillaume on 3 Oct 2018

Edited: Guillaume on 3 Oct 2018

1 vote

The problem is caused by Notepad's broken encoding-detection algorithm. For some reason it assumes that the file is encoded in UTF-16, where the byte sequence [49 32] (the characters '1' and ' ' in ANSI and UTF-8) does indeed represent the character '‱' (likewise, [48 32], i.e. '0' and ' ', becomes '‰'). Note that simply adding a space, ' ', before your sequence of numbers completely changes the behaviour of Notepad's detection algorithm: with a space at the start, it correctly interprets the file as ANSI or UTF-8.
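To see why those particular symbols appear, here is a small sketch that reinterprets the bytes written by fprintf as 16-bit code units, which is roughly what a UTF-16 (little-endian) reader does. It assumes a little-endian machine, since typecast uses the native byte order; the variable names are only illustrative.

```matlab
% Bytes that fprintf(output, '%d ', data) produces for the values 1 and 0:
% '1' = 49, ' ' = 32, '0' = 48, ' ' = 32
bytes = uint8('1 0 ');

% Reinterpret each pair of bytes as one 16-bit code unit.
% typecast uses the machine's byte order (assumed little-endian here).
codeUnits = typecast(bytes, 'uint16');

% 49 + 256*32 = 8241 = U+2031, and 48 + 256*32 = 8240 = U+2030
misread = char(codeUnits);

fprintf('%d ', codeUnits);   % prints 8241 8240
fprintf('\n');
disp(misread)                % the two symbols seen in Notepad
```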

As Stephen said, don't use Notepad. It's broken (and very limited in functionality). A good alternative is the open-source Notepad++. You can also simply use MATLAB's editor.
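If you want to confirm that the file itself is fine without relying on any editor, one option is to read the raw bytes back in MATLAB. This is a minimal sketch using a shortened version of the data from the question; the variable names are hypothetical.

```matlab
% Write a short version of the vector from the question.
data = zeros(1, 10, 'uint32');
data(1:2:end) = 1;
fid = fopen('output.txt', 'wt');
fprintf(fid, '%d ', data);
fclose(fid);

% Read the file back as raw bytes, with no encoding guesswork involved.
fid = fopen('output.txt', 'r');
bytes = fread(fid, [1 Inf], 'uint8=>uint8');
fclose(fid);

% The bytes are plain ASCII digits (48, 49) and spaces (32).
contents = char(bytes);
disp(contents)
```

If the displayed text is the expected sequence of digits and spaces, fprintf did its job and the strange symbols come from how the viewer decodes the file.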

edit: another option would be to save your output.txt as UTF-16, which Notepad would then read correctly, particularly if you insert a BOM at the beginning. However, while MATLAB can read and write UTF-16, this is not documented, so there may be edge cases where it doesn't work properly. Additionally, Notepad is probably the only software that tends to assume UTF-16; everything else tends to assume UTF-8 by default, so the file will look odd in almost every other text editor (MATLAB's included).
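One way to produce such a file without relying on MATLAB's undocumented UTF-16 support is to write the 16-bit code units yourself. The sketch below is only an illustration under stated assumptions: it is valid only because digits and spaces are plain ASCII (one code unit each), and the file name output_utf16.txt is hypothetical.

```matlab
% Format the numbers as text first.
data = zeros(1, 10, 'uint32');
data(1:2:end) = 1;
txt = sprintf('%d ', data);

% Open in binary mode and write little-endian 16-bit code units.
fid = fopen('output_utf16.txt', 'w');
fwrite(fid, 65279, 'uint16', 0, 'ieee-le');        % BOM U+FEFF, stored as FF FE
fwrite(fid, uint16(txt), 'uint16', 0, 'ieee-le');  % each character as 2 bytes
fclose(fid);

% Read the raw bytes back to check the layout.
fid = fopen('output_utf16.txt', 'r');
raw = fread(fid, [1 Inf], 'uint8=>double');
fclose(fid);
```

The resulting file starts with the bytes FF FE, followed by each character as a byte pair such as 49 0 for '1'.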

1 Comment

Noam Greenboim on 21 May 2020

For adding BOM, see here:

https://www.mathworks.com/matlabcentral/fileexchange/75708-update-bom-for-unicode-text-files

Usage (for example):

BOM('output.txt','UTF-16_LE')
