I want to get the Unicode value for a character. Could you please help me? Which encoding should I choose: UTF-8 or Unicode?

Question

0 votes

I want to convert text to speech for the Malayalam language (the native language of Kerala). First I need the Unicode value of each letter. I saved the characters in a text file. What is the command to read a text file and get the Unicode values?

Neethu K on 3 Mar 2018

I saved a Malayalam phoneme in a text file encoded as UTF-8. I want to get the Unicode value of that letter. Could you please send me the code to read the file and get the corresponding Unicode value?

Answer 1

Walter Roberson on 3 Mar 2018

0 votes

The available characters are listed at https://en.wikipedia.org/wiki/Malayalam_(Unicode_block) . The block starts at U+0D00, so the first entry is char(3328) in MATLAB.
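As a small illustration of that code point arithmetic (a sketch only, not part of the original answer): the variable names are made up for the example, and U+0D05 is picked arbitrarily as a sample letter from the block.

```matlab
% Convert between characters and Unicode code points in MATLAB.
blockStart = hex2dec('0D00');          % first code point of the Malayalam block
blockEnd   = hex2dec('0D7F');          % last code point of the Malayalam block

firstChar = char(blockStart);          % same as char(3328)
letterA   = char(hex2dec('0D05'));     % sample letter chosen for illustration

code   = double(letterA);              % numeric code point of the character
label  = sprintf('U+%04X', code);      % formatted the way Unicode charts write it
inside = code >= blockStart && code <= blockEnd;   % true for Malayalam letters

fprintf('%s has code point %d (%s)\n', letterA, code, label);
```

double() gives the code point of any character already held in a MATLAB char array, so the only remaining problem is getting the file contents decoded correctly.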

The way to read the file to get the Unicode values depends on exactly how the file was stored. Sometimes the method can be quite simple, but until you know which encoding is being used you have to be more careful. See https://www.mathworks.com/matlabcentral/answers/267176-read-and-seperate-csv-data#answer_209938 for some code of mine that figures out how a document has been encoded.
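For the simple case where the file is already known to be UTF-8, one possible approach is to read the raw bytes and decode them with native2unicode. This is a minimal sketch under that assumption, not the encoding-detection code linked above. It writes its own two-letter sample file to a temporary location so it can be run as is; the file name and the sample letters are assumptions for illustration.

```matlab
% Create a small UTF-8 sample file (stands in for the user's text file).
sample = char([3333 3374]);                 % two Malayalam letters, U+0D05 and U+0D2E
fname  = [tempname '.txt'];                 % temporary file name, illustration only
fid = fopen(fname, 'w');
fwrite(fid, unicode2native(sample, 'UTF-8'), 'uint8');
fclose(fid);

% Read the file back as raw bytes.
[fid, msg] = fopen(fname, 'r');
if fid < 0
    error('Failed to open file "%s" because "%s"', fname, msg);
end
raw = fread(fid, [1 inf], '*uint8');
fclose(fid);
delete(fname);                              % clean up the temporary file

% Drop a UTF-8 byte order mark (EF BB BF) if one is present.
if numel(raw) >= 3 && isequal(raw(1:3), uint8([239 187 191]))
    raw = raw(4:end);
end

% Decode the bytes and list the code points.
txt   = native2unicode(raw, 'UTF-8');
codes = double(txt);
fprintf('U+%04X\n', codes);
```

Each Malayalam letter takes three bytes in UTF-8, which is why the byte count and the character count differ.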

Neethu K on 8 Mar 2018

It shows an error when using fread (invalid file identifier).

Walter Roberson on 8 Mar 2018

audiodir = 'C:\Users\NeeK\Documents\MATLAB\EE403\Final Project\malayalm\wav';     % folder of per-character .wav files; adjust as appropriate
[filename, pathname] = uigetfile('*.txt', 'Choose a text file');
if ~ischar(filename)
  fprintf('Cancel!\n')
  return;    % user cancelled the dialog
end
fullname = fullfile(pathname, filename);
% Open with an explicit encoding so that fread decodes UTF-8 into characters.
[fid, msg] = fopen(fullname, 'r', 'n', 'UTF-8');
if fid < 0
  % A negative fid is what leads to "invalid file identifier" in fread.
  error('Failed to open file "%s" because "%s"', fullname, msg);
end
S = fread(fid, [1 inf], '*char');    % size argument comes before precision
fclose(fid);
if isempty(S)
  fprintf('Text file "%s" is empty!\n', fullname);
  return
end
if S(1) == 65279; S(1) = []; end    % drop a leading byte order mark (U+FEFF)
audio_data = [];
fs = 1;
for thischar = S
  % One audio file per character, named by its hex code point (for example 0d05.wav).
  basename = sprintf('%04x.wav', double(thischar));
  this_filename = fullfile(audiodir, basename);
  if ~exist(this_filename, 'file')
    fprintf('audio file "%s" not found, skipping character "%c"\n', basename, thischar);
  else
    [thissound, fs] = audioread(this_filename);
    if isempty(audio_data)
      audio_data = thissound;
    else
      % Pad whichever signal has fewer channels with zero columns so they can be stacked.
      oldchan = size(audio_data, 2);
      newchan = size(thissound, 2);
      if newchan < oldchan
        thissound(end,oldchan) = 0;
      elseif oldchan < newchan
        audio_data(end,newchan) = 0;
      end
      audio_data = [audio_data; thissound];
    end
  end
end
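
The two less obvious steps in the loop above, the file naming and the channel padding, can be tried on their own without any audio files. This is only a sketch with made-up sample values; the variable names mono and stereo are hypothetical stand-ins for thissound and audio_data.

```matlab
% File naming: the code point is written as four lowercase hex digits.
thischar = char(3333);                               % sample character, U+0D05
basename = sprintf('%04x.wav', double(thischar));    % name the loop would look for

% Channel padding: assigning a zero past the last column grows the matrix,
% and MATLAB fills the new column with zeros.
stereo = [0.5 0.6; 0.7 0.8];     % 2 samples, 2 channels (plays the role of audio_data)
mono   = [0.1; 0.2; 0.3];        % 3 samples, 1 channel (plays the role of thissound)
oldchan = size(stereo, 2);
newchan = size(mono, 2);
if newchan < oldchan
    mono(end, oldchan) = 0;      % mono becomes 3x2 with a silent second channel
elseif oldchan < newchan
    stereo(end, newchan) = 0;
end
combined = [stereo; mono];       % now both have the same number of columns
disp(basename)
disp(size(combined))
```

Once audio_data has been assembled this way, it can be passed along with fs to a playback or file-writing function of your choice.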
