Convert cell array of strings to unicode quickly

I have an array of approximately 10M strings, and I'm interested in converting each string to its unicode values. Is there a quick, one-line way to convert the whole string array into numeric values? Ideally, I'd love a solution like this:
numeric_matrix = double(string_array);
But of course double (and unicode2native) do not accept cell arrays. So my current solution is to loop through the string array:
for ii = 1:length(string_array)
    numeric_matrix(ii,:) = double(string_array{ii});
end
Unfortunately this for-loop solution is very inefficient. It can take upwards of 10 minutes for very large numbers of strings. I tried googling this but didn't see anything better. Is there a simpler, faster way to do this, ideally in one line?

Accepted Answer

Walter Roberson on 2 Feb 2016
Try
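numeric_array = cellfun(@uint16, string_array, 'UniformOutput', false);
Note that 'UniformOutput', false is needed because each string converts to a vector rather than a scalar, so the result is a cell array with one numeric vector per string. Try it on a smaller subset first, as I do not know how the timing would compare. It should have the advantage of not needing to change the internal representation.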
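A minimal sketch of this suggestion on a tiny, made-up cell array (the contents of string_array below are hypothetical sample data). It also shows a one-line alternative that should work if every string has the same length, which the loop in the question already assumes: char stacks the cell array into a character matrix, and a single double call then converts the whole matrix.

```matlab
% Hypothetical sample data; the real array has roughly 10M entries.
string_array = {'abc'; 'xyz'; 'foo'};

% Per-string conversion. Each string yields a vector, so the output
% must be collected in a cell array ('UniformOutput', false).
codes_cell = cellfun(@uint16, string_array, 'UniformOutput', false);

% One-line alternative, assuming all strings have equal length:
% char() stacks the cell into an N-by-L char matrix, double() converts it.
% (If lengths differ, char() pads shorter rows with trailing spaces.)
numeric_matrix = double(char(string_array));
```

Which of the two is faster on 10M strings is not something this sketch establishes; it is worth timing both on a subset first.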
  3 Comments
Guillaume on 2 Feb 2016
As far as I understand, MATLAB's native encoding is not Unicode but whatever your system locale uses, so converting the string to double (or uint16) may not convert it to Unicode unless your locale is also Unicode. You would have to call native2unicode on the strings to be sure.
Most likely your cellfun is slower than a loop because you're using an anonymous function to perform your extra operation. Anonymous function calls have significant overhead in MATLAB.
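One way to check the overhead claim on your own machine is to time the variants side by side. This is only a rough sketch: the test data, the sizes n and len, and all variable names are made up for illustration, and the results will depend on your MATLAB release.

```matlab
% Hypothetical test data: n random lowercase strings of equal length.
n = 1e5;
len = 10;
c = cellstr(char(randi([97 122], n, len)));

% cellfun with a direct function handle.
t_handle = timeit(@() cellfun(@double, c, 'UniformOutput', false));

% cellfun with an anonymous function wrapping the same call.
t_anon = timeit(@() cellfun(@(s) double(s), c, 'UniformOutput', false));

% Plain loop with a preallocated output matrix.
tic;
out = zeros(n, len);
for ii = 1:n
    out(ii, :) = double(c{ii});
end
t_loop = toc;

fprintf('handle: %.3f s, anonymous: %.3f s, loop: %.3f s\n', ...
        t_handle, t_anon, t_loop);
```

The loop is timed with tic/toc rather than timeit only to keep the sketch in a single script, so its figure is a coarser measurement than the other two.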
Greg on 2 Feb 2016
Thanks. I'm not interested in the unicode values per se. I just wanted a way to turn a string into a (hopefully) unique numeric value. But that's good to know about unicode.
And thanks for mentioning the anonymous function. That's probably what's happening!
