Get MATLAB Engine (engEvalString) to take/return Unicode

Asked by Szabolcs
on 31 Mar 2013

The MATLAB Engine is a C interface to MATLAB. It provides a function engEvalString() which takes some MATLAB code as a C string (char *) and evaluates it. MATLAB's output comes back as a C string too, in a buffer registered beforehand with engOutputBuffer().
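For anyone who wants to reproduce this, here is a minimal sketch of the round trip. engOpen(), engOutputBuffer(), engEvalString() and engClose() are the Engine API calls; hex_dump(), roundtrip() and the HAVE_MATLAB_ENGINE build flag are names made up for this illustration. The hex dump is there so the returned bytes can be inspected without the terminal's own encoding getting in the way. This is not my actual test code, only an outline of the idea.

```c
#include <stdio.h>
#include <string.h>

#ifdef HAVE_MATLAB_ENGINE   /* hypothetical build flag, not defined by MATLAB */
#include "engine.h"
#endif

/* Write the bytes of a NUL-terminated string into dst as space-separated
   hex pairs. Returns the number of bytes dumped; stops early if dst is
   too small to hold the next pair. */
static size_t hex_dump(const char *src, char *dst, size_t dstlen)
{
    size_t n = 0;
    if (dstlen == 0) return 0;
    dst[0] = '\0';
    for (; src[n] != '\0'; ++n) {
        size_t used = strlen(dst);
        if (used + 4 > dstlen) break;   /* room for " XX" plus the NUL */
        snprintf(dst + used, dstlen - used, n ? " %02X" : "%02X",
                 (unsigned)(unsigned char)src[n]);
    }
    return n;
}

#ifdef HAVE_MATLAB_ENGINE
/* Send one UTF-8 encoded command and print the raw bytes MATLAB returns. */
static int roundtrip(void)
{
    char out[1024] = "";                /* zero-filled, so always terminated */
    char hex[4096];
    Engine *ep = engOpen(NULL);         /* NULL: start MATLAB locally */
    if (ep == NULL) return -1;

    engOutputBuffer(ep, out, (int)sizeof out - 1);
    engEvalString(ep, "feature('DefaultCharacterSet', 'UTF-8');");
    /* "\xC5\x91" is the UTF-8 encoding of U+0151 (o with double acute). */
    engEvalString(ep, "s='Paul Erd\xC5\x91s'");

    hex_dump(out, hex, sizeof hex);
    printf("%s\n", hex);
    engClose(ep);
    return 0;
}
#endif
```

Without the flag, only hex_dump() is compiled, so the helper can be tried without a MATLAB installation. With the flag set, the file has to be built against MATLAB's engine headers and libraries.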

I need to be able to pass Unicode data to MATLAB through engEvalString() and to retrieve the output as Unicode. How can I do this? I don't care about the particular encoding (UTF-8, UTF-16, etc.); any will do, and I can adapt my program.


More details:

To give a concrete example, if I send the following string, encoded as, say, UTF-8,

    s='Paul Erdős'

I would like to get back the following output, encoded again as UTF-8:

    s =
    Paul Erdős

I hoped to achieve this by sending feature('DefaultCharacterSet', 'UTF-8') (reference) before doing anything else. This worked fine with MATLAB R2012b on OS X, and it also works fine with R2013a on Ubuntu Linux.

It does not work with R2013a on OS X, though. Instead of the character "ő" in the output of engEvalString(), I get character code 26, which is supposed to mean "I don't know how to represent this". However, if I retrieve the contents of the variable "s" by other means, I see that MATLAB does store the character "ő" in the string correctly. So MATLAB interpreted the UTF-8 input correctly, and only the output failed. If I test this on Windows with R2013a, neither input nor output works correctly. (Note that the Windows and the Mac/Linux implementations of the MATLAB Engine are different.)
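To tell these cases apart automatically on each platform, the returned buffer can be scanned for character code 26 (ASCII SUB, 0x1A). The following is a rough sketch of such a check. count_sub(), classify_echo() and the echo_status values are hypothetical names introduced here, and the check assumes the output is expected to be UTF-8.

```c
#include <stddef.h>
#include <string.h>

/* Count ASCII SUB bytes (character code 26, 0x1A) in a NUL-terminated buffer. */
static size_t count_sub(const char *buf)
{
    size_t n = 0;
    for (; *buf != '\0'; ++buf)
        if ((unsigned char)*buf == 0x1A) ++n;
    return n;
}

enum echo_status {
    ECHO_OK,           /* expected UTF-8 bytes found in the output      */
    ECHO_SUBSTITUTED,  /* output contains SUB in place of a character   */
    ECHO_OTHER         /* neither: some other encoding or failure       */
};

/* Classify the text captured from the engine output buffer. */
static enum echo_status classify_echo(const char *out, const char *expected_utf8)
{
    if (strstr(out, expected_utf8) != NULL) return ECHO_OK;
    if (count_sub(out) > 0) return ECHO_SUBSTITUTED;
    return ECHO_OTHER;
}
```

With the example above, calling classify_echo(out, "Erd\xC5\x91s") on the captured output would give ECHO_OK for the behaviour I see on Linux and ECHO_SUBSTITUTED for what I see with R2013a on OS X. It says nothing about whether the input was understood, which has to be checked separately by reading the variable back.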

The question is: how can I get Unicode input/output working on all platforms (Win/Mac/Linux) with engEvalString()? I need this to work in R2013a, and preferably also in R2012b.


If people are willing to experiment, I can provide some test C code. I'm not posting it yet because it's a lot of work to distill from my code a small, usable example that makes it possible to experiment with various encodings.



I learned about feature('locale'), which returns some locale-related data. On Linux, where everything works correctly, all the encodings it returns are UTF-8. This is not the case on OS X or Windows. Is there any way I could set the various encodings returned by feature('locale')?


