Could you please give me the code to fetch the wav files and concatenate them to produce the corresponding sound of the given text?

I saved the wav files using Unicode names. I need to fetch the sounds corresponding to a given text and concatenate them. My idea is that, given a text input, the program reads the Unicode value of each character, fetches the wav file whose name is that Unicode value, and concatenates the files to produce the output. Is this possible? Could you please give me the code for this?

Accepted Answer

Walter Roberson on 3 Mar 2018
Once you have converted the input text to char as I described in my answers to your previous questions, suppose the characters are stored in the string S, and the wav files are stored in the directory named in the variable audiodir, using 4-digit hex file names ending in .wav. Then:
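audio_data = [];   % combined samples, one column per channel
fs = 1;            % placeholder; overwritten by each file that is read
for thischar = S
    % file name is the character's code point as 4 hex digits
    basename = sprintf('%04x.wav', thischar);
    this_filename = fullfile(audiodir, basename);
    if ~exist(this_filename, 'file')
        fprintf('audio file "%s" not found, skipping character "%c"\n', basename, thischar);
    else
        [thissound, fs] = audioread(this_filename);
        oldchan = size(audio_data, 2);
        newchan = size(thissound, 2);
        if newchan < oldchan
            % zero-pad the new sound out to the existing channel count
            thissound(end,oldchan) = 0;
        elseif oldchan < newchan && ~isempty(audio_data)
            % zero-pad the accumulated data out to the new channel count;
            % skipped while audio_data is still empty, where "end" would be 0
            audio_data(end,newchan) = 0;
        end
        audio_data = [audio_data; thissound]; %#ok<AGROW>
    end
end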
After this, audio_data will be the combined sounds, and fs will be the sampling frequency (under the assumption that all the files share the same one). If the sound files did not all have the same number of channels, the output audio_data will be padded out to the maximum number of channels that occurred.
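The "all the same" assumption about the sampling frequency can be checked before concatenating. The following is only a minimal sketch: the temporary directory, the two stand-in files, and the variable names rates and rates_agree are made up for illustration so that the snippet runs on its own. With real data, only the loop and the final check would be needed.

```matlab
% Stand-in setup (assumption): two tiny files with different sample rates.
audiodir = tempname;
mkdir(audiodir);
audiowrite(fullfile(audiodir, '0061.wav'), zeros(100, 1), 8000);   % 'a'
audiowrite(fullfile(audiodir, '0062.wav'), zeros(100, 1), 16000);  % 'b'
S = 'ab';

% Collect the sample rate of every file that exists for the characters in S.
rates = [];
for thischar = S
    this_filename = fullfile(audiodir, sprintf('%04x.wav', thischar));
    if exist(this_filename, 'file')
        info = audioinfo(this_filename);
        rates(end+1) = info.SampleRate; %#ok<AGROW>
    end
end

% If more than one distinct rate shows up, plain concatenation will play
% some sounds at the wrong speed.
rates_agree = numel(unique(rates)) <= 1;
if ~rates_agree
    warning('Sample rates differ: %s', mat2str(unique(rates)));
end
```

If the rates do agree, the combined result can be played with sound(audio_data, fs) or saved with audiowrite.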
Remember to have an audio file of some short silence corresponding to whatever spacing notation will occur in the source text so that there is a distinction between the words andeverythingdoesnotjustruntogether
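One way to create such a silence file for the space character is sketched below. The directory, the sample rate, and the pause length are assumptions chosen for illustration; the rate should match whatever the other recordings use.

```matlab
audiodir = tempdir;       % assumption: stand-in for the real audio directory
fs = 44100;               % assumption: should match the other recordings
pause_seconds = 0.2;      % assumption: pick whatever gap sounds natural

% A column of zeros is mono silence of the requested duration.
silence = zeros(round(pause_seconds * fs), 1);

% Same naming scheme as the other files: the space character is code
% point 0x20, so the file name comes out as 0020.wav.
space_name = sprintf('%04x.wav', ' ');
audiowrite(fullfile(audiodir, space_name), silence, fs);
```

With this file in place, the loop picks it up for every space in S just like any other character.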
"Is it possible???"
Yes -- code is above.
But remember back in https://www.mathworks.com/matlabcentral/answers/385851-can-you-please-help-me-to-do-a-text-to-speech-sythesiser-using-concatenative-synthesis-in-matlab?s_tid=prof_contriblnk#comment_541173 where I said that this would work poorly for English and not work well for many other languages? Well, the language you are interested in is one of the ones that this will work poorly for. http://languagephrases.com/malayalam/common-malayalam-pronunciation-rules/ . But I guess you will just need to run the above code to prove this for yourself.
  28 Comments
Walter Roberson on 25 Mar 2018
Edited: Walter Roberson on 25 Mar 2018
I specifically tested the code on words and on single letters, and on cases where the word as a whole was not present but could be broken down into smaller units that were present.
The one case that my algorithm fails upon is the case where you have something of the pattern ABC where A, AB, and BC are all present but ABC and C are not. My algorithm is a "greedy" algorithm which would check ABC first, find it does not exist, then would try AB and find it exists and take that, leaving C to be checked and not found. A better algorithm would "look-ahead" to see that instead it should be decomposed as A and BC.
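A look-ahead decomposition of that kind could be sketched with dynamic programming, as below. This is not the greedy code discussed in this thread, only an illustration: the units list and the variable names best, prev, and pieces are hypothetical, and in practice the list of available units would be built from the file names in the audio directory.

```matlab
% Hypothetical inventory of units that have recordings.
units = {'A', 'AB', 'BC'};
word = 'ABC';

n = numel(word);
% best(k+1) = fewest units needed to cover word(1:k); Inf = cannot be covered.
best = inf(1, n + 1);
best(1) = 0;
% prev(k+1) = start index of the last unit in the best cover of word(1:k).
prev = zeros(1, n + 1);
for k = 1:n
    for j = 1:k
        if isfinite(best(j)) && ismember(word(j:k), units) ...
                && best(j) + 1 < best(k + 1)
            best(k + 1) = best(j) + 1;
            prev(k + 1) = j;
        end
    end
end

% Walk the back-pointers from the end to recover the pieces in order.
pieces = {};
k = n;
if isfinite(best(n + 1))
    while k > 0
        j = prev(k + 1);
        pieces = [{word(j:k)}, pieces]; %#ok<AGROW>
        k = j - 1;
    end
end
disp(pieces)
```

For the ABC example this yields the pieces A and BC, where the greedy approach would take AB and then fail on C. If no complete decomposition exists, pieces stays empty.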
But the error message about dimension mismatch is not one you would get for that issue. The dimension mismatch is what you would get if your sounds do not all have the same number of channels. My code at the beginning of this Answer already solves that problem -- the thissound and oldchan and newchan logic. Use that same logic to replace
[data_for_sound, fs] = audioread(this_filename);
sounds = [sounds; data_for_sound]; %#ok<AGROW>
Don't make me do all of the work.
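For reference, a sketch of what that replacement might look like with the variable names sounds and data_for_sound. The two stand-in arrays at the top are assumptions so that the snippet runs on its own; in the real loop data_for_sound comes from audioread(this_filename).

```matlab
% Stand-in data (assumption): mono accumulated so far, stereo arriving.
sounds = [0.1; 0.2];
data_for_sound = [0.3 0.4; 0.5 0.6];

oldchan = size(sounds, 2);
newchan = size(data_for_sound, 2);
if isempty(sounds)
    % nothing accumulated yet, so there is nothing to pad
elseif newchan < oldchan
    data_for_sound(end, oldchan) = 0;   % zero-pad the new sound
elseif oldchan < newchan
    sounds(end, newchan) = 0;           % zero-pad the accumulated data
end
sounds = [sounds; data_for_sound]; %#ok<AGROW>
```

Assigning a zero to the last row of a column beyond the current width makes MATLAB fill the new columns with zeros, so both arrays end up with the same number of channels before they are stacked.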
