Could you please give me the code to fetch the wav files and concatenate them to produce the sound corresponding to the given text?
I saved the wav files with Unicode names. I need to fetch the sounds corresponding to the given text and concatenate them. My idea is that, given a text input, the program reads the Unicode value of each character, fetches the wav file whose name is that Unicode value, concatenates the wav files, and gives the output. Is it possible? Could you please give me the code for this?
Accepted Answer
Walter Roberson
on 3 Mar 2018
Once you have converted the input text to char as I described in my answers to your previous questions, then supposing the characters are stored in the string S, and that the wav files are stored in the directory named in the variable audiodir using 4 digit hex file names ending in .wav then
audio_data = [];
fs = 1;
for thischar = S
    basename = sprintf('%04x.wav', thischar);
    this_filename = fullfile(audiodir, basename);
    if ~exist(this_filename, 'file')
        fprintf('audio file "%s" not found, skipping character "%c"\n', basename, thischar);
    else
        [thissound, fs] = audioread(this_filename);
        oldchan = size(audio_data, 2);
        newchan = size(thissound, 2);
        if newchan < oldchan
            thissound(end,oldchan) = 0;
        elseif oldchan < newchan
            audio_data(end,newchan) = 0;
        end
        audio_data = [audio_data; thissound];
    end
end
After this, audio_data will be the combined sounds, and fs will be the sampling frequency of the last file read (under the assumption that they are all the same). If the sound files did not all have the same number of channels, then the output audio_data will be filled out to the maximum number of channels that occurred.
Remember to have an audio file of some short silence corresponding to whatever spacing notation will occur in the source text so that there is a distinction between the words andeverythingdoesnotjustruntogether
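A minimal sketch of how such a silence file might be created for the space character. The directory, the sampling rate, and the 0.15 second duration are assumptions for illustration; the rate should match whatever your other .wav files use.

```matlab
% Sketch: write a short silence clip named after the space character (U+0020).
audiodir = tempdir;            % placeholder; point this at your own .wav directory
fs = 16000;                    % assumed sampling rate in Hz
pause_seconds = 0.15;          % assumed length of the gap between words

silence = zeros(round(pause_seconds * fs), 1);   % mono column of zeros

% Same naming scheme as the loop above: 4 digit hex code plus .wav
space_name = fullfile(audiodir, sprintf('%04x.wav', double(' ')));
audiowrite(space_name, silence, fs);
```

With this file in place, the loop above treats a space like any other character and appends the gap.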
"Is it possible???"
Yes -- code is above.
But remember back in https://www.mathworks.com/matlabcentral/answers/385851-can-you-please-help-me-to-do-a-text-to-speech-sythesiser-using-concatenative-synthesis-in-matlab?s_tid=prof_contriblnk#comment_541173 where I said that this would work poorly for English and not work well for many other languages? Well, the language you are interested in is one of the ones that this will work poorly for. http://languagephrases.com/malayalam/common-malayalam-pronunciation-rules/ . But I guess you will just need to run the above code to prove this for yourself.
28 Comments
Neethu K
on 4 Mar 2018
After running the code I didn't get any sound at the output, and an error is shown: undefined function or variable 'audiodir'. But I saved the required audio files in a folder named audiodir. Is any extra command required to call it? Why is the sound not produced? I also need a text box to type the required Malayalam text. Please help me.
Walter Roberson
on 4 Mar 2018
I indicated that for this code, "that the wav files are stored in the directory named in the variable audiodir" .
That means that you should define a variable named audiodir such as
audiodir = 'C:\Users\NeeK\Documents\MATLAB\EE403\Final Project\malayalm\wav';
which you should change to the actual name of the directory you stored your .wav files in.
"And also I need a text box to type the required Malayalam text box"
S = inputdlg('Enter the text', 'Malayalam Input', 1, {'നാം വാൾട്ടർ നന്ദി'});
Neethu K
on 5 Mar 2018
I added the text box you gave to my program. When I run it, an error occurs: Error using sprintf. Function is not defined for 'cell' inputs.
Error in TTS (line 7) basename = sprintf('%04x.wav', thischar); Could you please help me solve the problem?
Walter Roberson
on 5 Mar 2018
Give a sample complete name for a .wav file and indicate which character it corresponds to.
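The sprintf error about 'cell' inputs is consistent with inputdlg returning a cell array rather than a char vector. A sketch of unwrapping it before the loop follows; the variable name answer and the stand-in value are hypothetical and only there so the snippet runs without opening a dialog.

```matlab
% In the real program this would be the output of
%   answer = inputdlg('Enter the text', 'Malayalam Input', 1, {''});
% Here a stand-in cell is used so the sketch runs without a dialog.
answer = {[char(hex2dec('0d28')) char(hex2dec('0d3e'))]};

if isempty(answer)
    S = '';                % inputdlg returns an empty cell when the user cancels
else
    S = answer{1};         % take the char vector out of the cell
end

S = S(:).';                % force a row vector so "for thischar = S" visits one character at a time
```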
Neethu K
on 5 Mar 2018
Thank you sir. After this it gives the error "subscript indices must either be real positive integers or logicals". The error occurs in audio_data(end,newchan)=0;. Since audio_data appears to change size on every loop iteration, do I have to preallocate the variable before entering the loop using zeros, ones, or cell? How can we solve the problem?
Walter Roberson
on 5 Mar 2018
audio_data = [];
fs = 1;
for thischar = S
    basename = sprintf('%04x.wav', thischar);
    this_filename = fullfile(audiodir, basename);
    if ~exist(this_filename, 'file')
        fprintf('audio file "%s" not found, skipping character "%c"\n', basename, thischar);
    else
        [thissound, fs] = audioread(this_filename);
        if isempty(audio_data)
            audio_data = thissound;
        else
            oldchan = size(audio_data, 2);
            newchan = size(thissound, 2);
            if newchan < oldchan
                thissound(end,oldchan) = 0;
            elseif oldchan < newchan
                audio_data(end,newchan) = 0;
            end
            audio_data = [audio_data; thissound];
        end
    end
end
Walter Roberson
on 8 Mar 2018
That is what the above code already does. After the above code has executed, audio_data contains the concatenated sound information. If you want to write that out to a file then do so.
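Writing the result out could look roughly like the sketch below. The stand-in tone and the output file name combined.wav are assumptions so that the snippet runs on its own; in the real program audio_data and fs come from the loop above.

```matlab
% Stand-in data so the sketch is self-contained; use the loop's audio_data and fs instead.
fs = 8000;
audio_data = 0.1 * sin(2*pi*440*(0:fs-1).'/fs);   % one second of a 440 Hz tone

% Rescale only if the combined signal would clip when written
peak = max(abs(audio_data(:)));
if peak > 1
    audio_data = audio_data / peak;
end

outfile = fullfile(tempdir, 'combined.wav');      % hypothetical output name
audiowrite(outfile, audio_data, fs);

% To listen instead of (or as well as) saving:
% sound(audio_data, fs);
```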
Neethu K
on 13 Mar 2018
Sir, I want to display Malayalam text. Now it appears as square boxes. What is the code to display Malayalam in MATLAB?
Walter Roberson
on 13 Mar 2018
Which MATLAB version are you using and which operating system? Is the problem affecting the command window, or is it affecting only text() and labels and titles, or is it affecting uicontrol style edit, or is it affecting uitable, or....?
Neethu K
on 13 Mar 2018
I'm using MATLAB version R2016a on Windows 10. When I type Malayalam in MATLAB it doesn't display; it shows squares and round shapes.
Walter Roberson
on 15 Mar 2018
I had to scrub my Windows copies of MATLAB a couple of months ago for technical reasons. I have been putting my Windows systems back together but I have not gotten around to adding MATLAB back in yet.
Neethu K
on 20 Mar 2018
OK sir. After concatenating the letters, the output has some delay problems and is not pleasant to hear. Could you please give me a silence removal algorithm or a smoothing algorithm?
Walter Roberson
on 20 Mar 2018
You can search MATLAB Answers for silence removal. It is not something I have worked on myself.
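For orientation only, one very simple approach is to trim leading and trailing samples whose amplitude stays below a threshold. This is a sketch, not a tested speech-processing method; the threshold of 0.02 and the synthetic clip are assumptions, and real recordings with background noise would likely need something more careful.

```matlab
% Synthetic clip: silence, a short tone, then silence again (stand-in for audioread output)
fs = 8000;
tone = 0.5 * sin(2*pi*300*(0:799).'/fs);
clip = [zeros(400, 1); tone; zeros(400, 1)];

threshold = 0.02;                                  % assumed amplitude threshold

% Rows where any channel exceeds the threshold
loud = find(max(abs(clip), [], 2) > threshold);

if isempty(loud)
    trimmed = clip([], :);                         % nothing above threshold: keep no samples
else
    trimmed = clip(loud(1):loud(end), :);          % keep from first to last loud sample
end
```

Applying this to each clip right after audioread, before concatenating, would shorten the gaps between letters; it does nothing about the abrupt transitions between clips.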
Neethu K
on 21 Mar 2018
Thank you sir. Can you please give me code to split sentences into words and words into letters, and to fetch the corresponding sounds? I saved some of the sounds as letters and some as words, so I need to fetch the correct wav files depending on the input given. Please help me sir.
Walter Roberson
on 21 Mar 2018
"Can you please give me a code to split sentences into words, words into letters and to fetch the corresponding sounds"
I would hesitate to write code to split sentences into words even for English (which I know well); I would not make the attempt for Malayalam, which I do not know and which has a reputation for being fairly irregular.
I do not recommend splitting into letters. Any good text to speech system needs to split into phonemes instead -- groups of letters that together form a distinct sound. Even then, some groups of phonemes blend together to give you a modified sound. It is not easy in English; see http://www.auburn.edu/academic/education/reading_genie/phoncount.html . And see https://www.independent.co.uk/news/uk/home-news/an-increasing-number-of-british-people-dont-pronounce-the-word-three-properly-these-maps-explain-why-a7079976.html for how "correct" pronunciation can vary quite regionally, into forms that do not match the letters much.
The definition of "word" does not match spelling very well. How many words is "isn't" ? How many words is "ain't" ? How many words is "birthday" ? Is "birth-day" a different number of words? Is "birth day" different? The references I consulted a few months ago said that when a group of text took on a distinct meaning, then that was a "word", regardless of whether it was written with or without spaces, and with or without a hyphen. But that does not mean that a hyphen is word forming: for example in "meta-sentence" the "meta-" is a modifier that is distinct rather than combining with a single word. "meta-logic" is quite possibly two distinct words, but "metalogic" is probably one word (because in the context it would be used it would have a distinct meaning), and "metalogical" would definitely be one word.
Neethu K
on 22 Mar 2018
Actually my problem is this: I saved the wav files as phonemes and as words in Malayalam. When I give an input text, the program has to check for spaces, treat the text between them as words, and store the words in an array. Next it should check whether the first word is in the database. If so, it should speak it; otherwise it should check whether each single letter is in the database and speak the letters one by one, to get smooth, pleasant speech at the output. Is there any code or command for this?
Walter Roberson
on 22 Mar 2018
However, according to the bottom of page 12 of "Western Influence on Malayalam Language and Literature" by K. M. George https://books.google.ca/books/about/Western_Influence_On_Malayalam_Language.html?id=IP8OAAAAMAAJ&redir_esc=y, Malayalam combines words to some extent, a compromise between the influence of Sanskrit (which combines more) and English (which combines less.) Therefore you cannot rely on spaces to break sentences into words in Malayalam.
Neethu K
on 23 Mar 2018
I found the positions of the spaces and got an array like word='0d150d13' '0d050d25' '0d28', depending on the input I give. That is, the Unicode values of each separated word are given in single quotes. Now I need to check whether each whole word (the first entry in single quotes) is in the database or not. If not, I want to check whether the first letter, i.e. the first Unicode value (0d15 above), is in the database, then fetch the wav files and make the sound. Please help me.
Walter Roberson
on 23 Mar 2018
audiodir = ''; %set as appropriate
word = {'0d150d13' '0d050d25' '0d28'}; %assumed to come from previous steps
%process word list looking for files
sounds = zeros(0,2);
for word_idx = 1 : length(word)
    thisword = word{word_idx};
    prefix = thisword; suffix = '';
    while ~isempty(prefix)
        basename = [prefix '.wav'];
        this_filename = fullfile(audiodir, basename);
        if exist(this_filename, 'file')
            [data_for_sound, fs] = audioread(this_filename);
            sounds = [sounds; data_for_sound]; %#ok<AGROW>
            prefix = suffix; suffix = '';
        elseif length(prefix) <= 4
            fprintf('No sound file for %s!\n', prefix);
            prefix = suffix; suffix = '';
        else
            lastletter = prefix(end-3:end);
            prefix = prefix(1:end-4);
            suffix = [lastletter suffix]; %#ok<AGROW>
        end
    end
end
The above code automatically takes the longest prefix of a word that has a matching file, and continues to process the remainder of the word.
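The word list that the code above assumes could be built from the input text roughly as follows. This is a sketch that splits on spaces only, which (as noted earlier) is not a reliable word boundary for Malayalam; the sample text is a stand-in built from code points.

```matlab
% Stand-in input: two Malayalam code points, a space, then one more code point
S = [char(hex2dec('0d15')) char(hex2dec('0d13')) ' ' char(hex2dec('0d28'))];

parts = strsplit(strtrim(S), ' ');       % split on spaces; repeated spaces are collapsed by default

word = cell(size(parts));
for k = 1:numel(parts)
    % sprintf reuses the format for every code point, giving one 4 digit hex group per character
    word{k} = sprintf('%04x', double(parts{k}));
end
```

For this stand-in input, word holds '0d150d13' and '0d28', the same shape as the list used above.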
Neethu K
on 25 Mar 2018
Thank you so much sir. It works well, but when I give input that includes both words and single letters (phonemes), it shows the error message
Dimension of matrices being concatenated is not consistent.
How can I solve this problem??
Walter Roberson
on 25 Mar 2018
I suspect that some of your sound files are two channel and some of them are one channel. Is that the case? Also is it certain that all of the files are the same sampling frequency ?
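A quick way to check both questions is to list the channel count and sampling rate of every file with audioinfo. In this sketch the two demo files and the temporary folder are made up so the snippet runs by itself; in practice set audiodir to your real directory and drop the demo lines.

```matlab
% Demo setup (hypothetical files): one mono clip at 8 kHz, one stereo clip at 16 kHz
audiodir = tempname;
mkdir(audiodir);
audiowrite(fullfile(audiodir, '0d15.wav'), zeros(100, 1), 8000);
audiowrite(fullfile(audiodir, '0d13.wav'), zeros(100, 2), 16000);

% Report channels and sampling rate for every .wav file in the directory
files = dir(fullfile(audiodir, '*.wav'));
channels = zeros(numel(files), 1);
rates = zeros(numel(files), 1);
for k = 1:numel(files)
    info = audioinfo(fullfile(audiodir, files(k).name));
    channels(k) = info.NumChannels;
    rates(k) = info.SampleRate;
    fprintf('%s: %d channel(s), %d Hz\n', files(k).name, channels(k), rates(k));
end

if numel(unique(channels)) > 1
    disp('Files differ in number of channels.');
end
if numel(unique(rates)) > 1
    disp('Files differ in sampling rate; concatenating them will distort playback speed.');
end
```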
Walter Roberson
on 25 Mar 2018
Edited: Walter Roberson
on 25 Mar 2018
I specifically tested the code on words and on single letters, and on cases where the word as a whole was not present but could be broken down into smaller units that were present.
The one case that my algorithm fails upon is the case where you have something of the pattern ABC where A, AB, and BC are all present but ABC and C are not. My algorithm is a "greedy" algorithm which would check ABC first, find it does not exist, then would try AB and find it exists and take that, leaving C to be checked and not found. A better algorithm would "look-ahead" to see that instead it should be decomposed as A and BC.
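A look-ahead variant could be sketched as below. It works backwards from the end of the word, recording for each letter position whether the remainder can be fully covered, and still prefers the longest piece when several work. The cell array available is a stand-in for "a .wav file with this base name exists"; in real use that test would be an exist(..., 'file') check. This is an illustration, not the code posted earlier.

```matlab
available = {'0d15', '0d150d13', '0d130d28'};   % stand-in for files present: A, AB, BC
thisword = '0d150d130d28';                      % ABC

n = length(thisword) / 4;                       % number of 4 hex digit letters

% solvable(i) is true when letters i..n can be completely covered by available pieces;
% piece_len(i) is the number of letters in the piece chosen at position i.
solvable = false(1, n + 1);
solvable(n + 1) = true;                         % an empty remainder is trivially covered
piece_len = zeros(1, n + 1);

for i = n:-1:1
    for len = (n - i + 1):-1:1                  % try the longest piece first
        piece = thisword(4*(i-1)+1 : 4*(i+len-1));
        if ismember(piece, available) && solvable(i + len)
            solvable(i) = true;
            piece_len(i) = len;
            break
        end
    end
end

% Walk forward through the chosen pieces; pieces stays empty if no full cover exists
pieces = {};
i = 1;
while i <= n && solvable(i)
    len = piece_len(i);
    pieces{end+1} = thisword(4*(i-1)+1 : 4*(i+len-1)); %#ok<AGROW>
    i = i + len;
end
```

For the ABC case described above, pieces comes out as A followed by BC. When no complete cover exists, pieces is empty and falling back to the greedy loop would at least play what it can.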
But the error message about dimension mismatch is not one you would get for that issue. The dimension mismatch is what you would get if your sounds do not all have the same number of channels. My code at the beginning of this Answer already solves that problem -- the thissound and oldchan and newchan logic. Use that same logic to replace
[data_for_sound, fs] = audioread(this_filename);
sounds = [sounds; data_for_sound]; %#ok<AGROW>
Don't make me do all of the work.
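For reference, the substitution might look like the sketch below. It uses stand-in data in place of the audioread call, and it pads with (:, ...) indexing rather than (end, ...) so that it also works while sounds is still empty; that indexing choice is a variation on the original logic, not a quote of it.

```matlab
sounds = zeros(0, 2);                  % as initialised in the prefix-matching code
data_for_sound = ones(5, 1);           % stand-in for a mono result of audioread(this_filename)

oldchan = size(sounds, 2);
newchan = size(data_for_sound, 2);
if newchan < oldchan
    data_for_sound(:, newchan+1:oldchan) = 0;   % pad the new clip with silent channels
elseif oldchan < newchan
    sounds(:, oldchan+1:newchan) = 0;           % widen what has been collected so far
end
sounds = [sounds; data_for_sound];     %#ok<AGROW>
```

The padded channel is silent, so a mono clip plays only in the first channel; copying the mono data into both channels is an alternative if that matters.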