Following Walter's hint that this is actually UTF-8, it turns out that filenames on Macs are returned in decomposed form (NFD), whereas other systems use the composed form (NFC), or perhaps whatever the user typed. The normalization forms are described in the Java documentation: http://download.oracle.com/javase/6/docs/api/java/text/Normalizer.html
I didn't find any way to handle this natively in MATLAB, but Java provides the required methods:
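To make the difference between the two forms concrete, here is a small sketch that normalizes the same name both ways and compares the results. The variable names are only illustrative, and the name is built from code points so the example does not depend on the encoding of the script file.

```matlab
% Sketch: compare the composed (NFC) and decomposed (NFD) forms of one name.
NFC = javaMethod('valueOf', 'java.text.Normalizer$Form', 'NFC');
NFD = javaMethod('valueOf', 'java.text.Normalizer$Form', 'NFD');

% 'öäå.txt' built from code points U+00F6, U+00E4, U+00E5
name = [char([246 228 229]) '.txt'];
js = java.lang.String(name);

composed   = char(java.text.Normalizer.normalize(js, NFC));
decomposed = char(java.text.Normalizer.normalize(js, NFD));

numel(composed)               % 7: one character per accented letter
numel(decomposed)             % 10: each accented letter is base letter + combining mark
double(decomposed(1:2))       % [111 776]: 'o' followed by U+0308 (combining diaeresis)
strcmp(composed, decomposed)  % 0: the strings look identical on screen but do not compare equal
```

This is why a plain strcmp against a name typed in the editor can fail for a file that is clearly there.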
%% some handy definitions
NFD = javaMethod('valueOf', 'java.text.Normalizer$Form','NFD');
NFC = javaMethod('valueOf', 'java.text.Normalizer$Form','NFC');
UTF8 = java.nio.charset.Charset.forName('UTF-8');
%% convert a name of a file from dir to a sensible matlab string:
D = dir('*.txt');
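s2 = D(1).name;   % first match only; plain D.name is a list when several files match
s = java.lang.String(uint8(s2),UTF8);   % treat the chars as raw UTF-8 bytes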
sc = java.text.Normalizer.normalize(s,NFC);
sc = char(sc);
strcmp(sc,'öäå.txt')
ans =
1
%% and the reverse, to open a file with accented characters:
filename = 'öäå.txt';
s = java.lang.String(filename);
sc = java.text.Normalizer.normalize(s,NFD);
bs = single(sc.getBytes(UTF8)');   % Java bytes are signed (-128..127)
bs(bs<0) = 256 + bs(bs<0);         % map them to unsigned 0..255
id = fopen(char(bs),'r')
id =
3
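If there are several files, the same conversion can be applied to the whole listing at once. This is only a sketch under the same assumption as above, namely that dir returns the raw UTF-8 bytes of each name as individual characters (as it did on my Mac). On a system where MATLAB already decodes the names, the uint8 step would corrupt them. The helper name toNFC is made up for this example.

```matlab
% Sketch: normalize every name returned by dir to NFC and look one up.
NFC  = javaMethod('valueOf', 'java.text.Normalizer$Form', 'NFC');
UTF8 = java.nio.charset.Charset.forName('UTF-8');

% Hypothetical helper: raw name from dir -> composed MATLAB string.
% Assumes the chars in raw are really UTF-8 bytes.
toNFC = @(raw) char(java.text.Normalizer.normalize( ...
    java.lang.String(uint8(raw), UTF8), NFC));

D = dir('*.txt');
names = cellfun(toNFC, {D.name}, 'UniformOutput', false);

% Index of the wanted file in D (empty if it is not there).
idx = find(strcmp(names, 'öäå.txt'));
```

The entries in D keep their original (decomposed) names, so D(idx).name can be used for anything that has to go back to the file system.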