MATLAB Answers

Why can't I read German umlauts from a .txt file?

prjctdth on 10 Jan 2017
Commented: prjctdth on 16 Jan 2017
I need to read German text from a file and convert it into ASCII codes. Of course it contains the German umlauts 'ä', 'ö', 'ü' and the character 'ß' as well. After reading the file with fscanf, the resulting string contains only a '?' wherever one of these characters was. Changing the encoding in MATLAB doesn't help. Is this an issue with my operating system, Mac OS? Here is a simple example:
fileID = fopen('text.txt','r');
string = fscanf(fileID,'%s');
and the content of my text.txt for testing:
Lorem ipösum dolor! sit am?et, consüetetur{ sadip$sciäng e&litr, sßed.
All the other special characters are read fine. The result is
string =
loremip?sumdolor!sitam?et,cons?etetur{sadip$sci?nge&litr,s?ed.
P.S. The removal of white space is fine for my application.


Accepted Answer

Vishal Neelagiri on 16 Jan 2017
I tried to reproduce the issue that you are facing, but I was able to read all the German umlauts from the .txt file successfully. In my case,
string =
Loremipösumdolor!sitam?et,consüetetur{sadip$sciänge&litr,sßed.
I am using MATLAB R2016b on a Windows 10 machine. Your issue seems to be related to your operating system. You might want to refer to this MATLAB Answers page which addresses a similar issue:
https://www.mathworks.com/matlabcentral/answers/100749-why-am-i-unable-to-visualize-umlaut-characters-a-o-u-in-uicontrol-objects-when-i-use-a-german-keyb
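Because the result differs between machines, it may help to check which encoding MATLAB actually assumes for the file on each system. The following is only a minimal sketch: it relies on the four-output form of fopen, which reports the encoding associated with an open file, and it assumes the file is called text.txt as in the question.

```matlab
% Sketch: report the encoding MATLAB associates with the opened file.
% 'text.txt' is the file name used in the question.
fname = 'text.txt';
fileID = fopen(fname, 'r');
if fileID == -1
    fprintf('Could not open %s\n', fname);
else
    % The fourth output of fopen(fileID) is the name of the encoding in use.
    [~, ~, ~, encoding] = fopen(fileID);
    fprintf('%s was opened with encoding: %s\n', fname, encoding);
    fclose(fileID);
end
```

If the reported encoding does not match the encoding the file was saved in, non-ASCII characters such as umlauts are likely to be misread.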

  1 Comment

prjctdth on 16 Jan 2017
Thanks for your effort, Vishal. Meanwhile I found a solution that works for me.
In fact, it was an issue with the encoding of my .txt file. MATLAB's encoding on startup is US-ASCII (I would like to change this in the preferences, but I couldn't find where to do so).
After changing it to UTF-8, I could read files that I had written with an external editor, e.g. TextMate, and saved as UTF-8. Curiously, .txt files that I wrote with MATLAB (with UTF-8 set) were encoded in US-ASCII or something similar. Characters like ä, ö, ü and ß were changed to squares with a ? inside (I don't know which character this is).
After opening, editing and saving the externally created files in MATLAB, their encoding was broken as well. I now think that changing the encoding in MATLAB only affects the internal file-reading functions, and that the built-in editor can only handle US-ASCII.
I hope that this workaround can help someone who faces a similar problem with MATLAB.
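One way to avoid depending on MATLAB's startup encoding at all is to pass the encoding explicitly to fopen. The sketch below is an illustration under stated assumptions, not the exact steps used above: the file name umlaut_test.txt is hypothetical, the sample text is shortened, and the umlauts are built with char(...) so the script itself contains only ASCII characters.

```matlab
% Hypothetical file name, used only for this sketch.
fname = 'umlaut_test.txt';

% Sample text: char(246) = o-umlaut, char(252) = u-umlaut,
% char(228) = a-umlaut, char(223) = sharp s.
sample = ['Lorem ip', char(246), 'sum, cons', char(252), ...
          'etetur, sci', char(228), 'ng, s', char(223), 'ed.'];

% Write the file with an explicit UTF-8 encoding ('n' = native byte order).
fid = fopen(fname, 'w', 'n', 'UTF-8');
assert(fid ~= -1, 'Could not open %s for writing.', fname);
fprintf(fid, '%s', sample);
fclose(fid);

% Read it back, again stating the encoding explicitly.
fid = fopen(fname, 'r', 'n', 'UTF-8');
assert(fid ~= -1, 'Could not open %s for reading.', fname);
txt = fscanf(fid, '%s');   % '%s' drops white space, as in the question
fclose(fid);

% Numeric character codes of the text that was read.
codes = double(txt);
```

Note that the umlauts and 'ß' are not part of ASCII, so their codes are greater than 127; double returns their Unicode code points.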
