MATLAB Answers

SEPARATE VOICE FROM BACKGROUND MUSIC

VEMURI MANI DEEPAK on 19 Feb 2020
Commented: Swetha Polemoni on 31 Aug 2020 at 7:36
% SEPARATION OF VOICE FROM BACKGROUND MUSIC
% TRYING TO SEPARATE VOICE AND BACKGROUND MUSIC
% TAKING AN AUDIO FILE IN .WAVFORMAT
[audio_in,audio_freq_samp1] = audioread('sound.wav');
%ALLIGNING THE VALUES TO LENGTH OF AUDIO, AND DF IS THE MINIMUM FREQUENCY RANGE
length_audio = length(audio_in);
df = audio_freq_samp1/Length_audio;
%CALCULATING FREQUENCY VALUES TO BE ASSIGNED ON THE X-AXIS OF THE GRAPH
frequency_audio = -audio_freq_samp1/2:df:audio_freq_samp1/2-df;
%BY APPLYING FOURIER TRANSFORM TO THE AUDIO FILE
FET_audio_in = fftshift(fft(audio_in)/length(fft(audio_in)));
% PLOTTING
plot(frequency_audio,abs(FFT_audio_in));
title('FFT of input Audio');
xlabel('frequency(HZ)');
ylabel('Amplitude');
%NOW LETS SEPARATE THE VARIOUS COMPONENTS BY CUTTING IT IN FREQUENCY RANGE
lower_threshold = 150;
upper_threshold = 2500;
% WHEN THE VALUES IN THE ARRAY ARE IN THE FREQUENCY RANGE THEN WE HAVE 1 AT
% THAT INDEX AND O FOR OTHERS I.E; CREATING AN BOOLEAN INDEX ARRAY
val = abs(frequency_audio)<upper_threshold & abs(requency_audio)>lower_threshold;
FFT_ins = FFT_audio_in(:,1);
FFT_voc = FFt_audio_in(:,1);
%BY THE LOGICAL ARRAY THE FOURIER IN FREQUENCY RANGE IS KEPT IN VOCALS;AND
%REST IN INSTRUMENTAL AND REST OF THE VALUES TO ZERO
FFT_ins(val) = 0;
FFT_voc(~va1) = 0;
%NOW WE PERFORM THE INVERSE FOURIER TRANSFORM TO GET BACK THE SIGNAL
FFT_a = ifftshift(FFT_audio.in);
FFT_a11 = ifftshift(FFT_ins);
FFT_a31 = ifftshift(FFT_voc);
%CREATING THE TIME DOMAIN SIGNAL
s1 = ifft(FFT_a11*length(fft(audio_in));
s3 = ifft(FFT_a31*length(fft(audio_in)));
%WRITING THE FILE
audiowrite('sound_background.wav',s1,audio_freq_samp1);
audiowrite('sound_voice.wav',s3,audio_freq_samp1);

  2 Comments

John Lynch on 29 Apr 2020
A group of students and I are taking a look at this code, as it might help us understand how to approach a polyphonic search engine (searching for a song by a given melody). There seem to be several typos throughout, as well as one logic error in the final audiowrite calls. Correct me if these weren't issues when you ran it.
If we give you credit, and can make it work within our project, would you give us permission to use your code? We might not end up using it, but I wanted to ask just in case.
-JL
Swetha Polemoni on 31 Aug 2020 at 7:36
Hi
The above code has typos which are rectified in the following code snippet.
% TRYING TO SEPARATE VOICE AND BACKGROUND MUSIC
% TAKING AN AUDIO FILE IN .WAV FORMAT
[audio_in,audio_freq_samp1] = audioread('sound.wav');
%ALIGNING THE VALUES TO THE LENGTH OF THE AUDIO; DF IS THE FREQUENCY RESOLUTION (SPACING BETWEEN FFT BINS)
length_audio = length(audio_in);
df = audio_freq_samp1/length_audio;
%CALCULATING FREQUENCY VALUES TO BE ASSIGNED ON THE X-AXIS OF THE GRAPH
frequency_audio = -audio_freq_samp1/2:df:audio_freq_samp1/2-df;
%BY APPLYING FOURIER TRANSFORM TO THE AUDIO FILE
FFT_audio_in = fftshift(fft(audio_in)/length(fft(audio_in)));
% PLOTTING
plot(frequency_audio,abs(FFT_audio_in));
title('FFT of input Audio');
xlabel('Frequency (Hz)');
ylabel('Amplitude');
%NOW LETS SEPARATE THE VARIOUS COMPONENTS BY CUTTING IT IN FREQUENCY RANGE
lower_threshold = 150;
upper_threshold = 2500;
% WHEN THE VALUES IN THE ARRAY ARE IN THE FREQUENCY RANGE THEN WE HAVE 1 AT
% THAT INDEX AND 0 FOR OTHERS, I.E. CREATING A BOOLEAN INDEX ARRAY
val = abs(frequency_audio)<upper_threshold & abs(frequency_audio)>lower_threshold;
FFT_ins = FFT_audio_in(:,1);
FFT_voc = FFT_audio_in(:,1);
%USING THE LOGICAL ARRAY, THE SPECTRUM INSIDE THE FREQUENCY RANGE IS KEPT IN
%VOCALS AND THE SPECTRUM OUTSIDE IT IS KEPT IN INSTRUMENTAL; ALL OTHER BINS IN EACH COPY ARE SET TO ZERO
FFT_ins(val) = 0;
FFT_voc(~val) = 0;
%NOW WE PERFORM THE INVERSE FOURIER TRANSFORM TO GET BACK THE SIGNAL
FFT_a = ifftshift(FFT_audio_in);
FFT_a11 = ifftshift(FFT_ins);
FFT_a31 = ifftshift(FFT_voc);
%CREATING THE TIME DOMAIN SIGNAL; real() DROPS ANY RESIDUAL IMAGINARY PART SO audiowrite RECEIVES REAL DATA
s1 = real(ifft(FFT_a11*length_audio));
s3 = real(ifft(FFT_a31*length_audio));
%WRITING THE FILE
audiowrite('sound_background.wav',s1,audio_freq_samp1);
audiowrite('sound_voice.wav',s3,audio_freq_samp1);

% TRYING TO SEPARATE VOICE AND BACKGROUND MUSIC
% TAKING AN AUDIO FILE IN .WAV FORMAT
% READ THE SPEECH AND MUSIC FILES (BOTH ARE ASSUMED TO HAVE THE SAME SAMPLE RATE)
[speech,fs]=audioread('speech.wav');
[music,fs]=audioread('strings.wav');
% MIX THE FIRST CHANNEL OF EACH, TRIMMED TO THE LENGTH OF THE MUSIC
audio_in=speech(1:length(music),1)+music(:,1);
audio_freq_samp1=fs;
%ALIGNING THE VALUES TO THE LENGTH OF THE AUDIO; DF IS THE FREQUENCY RESOLUTION (SPACING BETWEEN FFT BINS)
length_audio = length(audio_in);
df = audio_freq_samp1/length_audio;
%CALCULATING FREQUENCY VALUES TO BE ASSIGNED ON THE X-AXIS OF THE GRAPH
frequency_audio = -audio_freq_samp1/2:df:audio_freq_samp1/2-df;
%BY APPLYING FOURIER TRANSFORM TO THE AUDIO FILE
FFT_audio_in = fftshift(fft(audio_in)/length(fft(audio_in)));
music_audio_in = fftshift(fft(music)/length(fft(music)));
speech_audio_in = fftshift(fft(speech(1:length(music),1))/length(fft(speech(1:length(music),1))));
% PLOTTING
%plot(frequency_audio,abs(FFT_audio_in));
plot(frequency_audio,abs(music_audio_in));
title('FFT of input Audio');
xlabel('Frequency (Hz)');
ylabel('Amplitude');
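%NOW LET'S SEPARATE THE VARIOUS COMPONENTS BY CUTTING IT IN FREQUENCY RANGE
%NOTE: THE THRESHOLDS ARE COMPARED WITH frequency_audio, WHICH IS IN HZ, SO
%lower_threshold MUST BE BELOW audio_freq_samp1/2 FOR THE MASK TO SELECT ANY BINS
lower_threshold = 970000;
upper_threshold = 2630000;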
% WHEN THE VALUES IN THE ARRAY ARE IN THE FREQUENCY RANGE THEN WE HAVE 1 AT
% THAT INDEX AND 0 FOR OTHERS, I.E. CREATING A BOOLEAN INDEX ARRAY
val = abs(frequency_audio)<upper_threshold & abs(frequency_audio)>lower_threshold;
FFT_ins = FFT_audio_in(:,1);
FFT_voc = FFT_audio_in(:,1);
%USING THE LOGICAL ARRAY, THE SPECTRUM INSIDE THE FREQUENCY RANGE IS KEPT IN
%VOCALS AND THE SPECTRUM OUTSIDE IT IS KEPT IN INSTRUMENTAL; ALL OTHER BINS IN EACH COPY ARE SET TO ZERO
FFT_ins(val) = 0;
FFT_voc(~val) = 0;
%NOW WE PERFORM THE INVERSE FOURIER TRANSFORM TO GET BACK THE SIGNAL
FFT_a = ifftshift(FFT_audio_in);
FFT_a11 = ifftshift(FFT_ins);
FFT_a31 = ifftshift(FFT_voc);
%CREATING THE TIME DOMAIN SIGNAL; real() DROPS ANY RESIDUAL IMAGINARY PART SO audiowrite RECEIVES REAL DATA
s1 = real(ifft(FFT_a11*length_audio));
s3 = real(ifft(FFT_a31*length_audio));
%WRITING THE FILE
audiowrite('sound_background.wav',s1,audio_freq_samp1);
audiowrite('sound_voice.wav',s3,audio_freq_samp1);
This works only if the frequency range of the music is known in advance.
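To check the band-mask idea without any audio files, here is a minimal sketch on a synthetic signal. Everything in it is an assumption made for illustration and not part of the code above: the sample rate `fs`, the two test tones, and the variable names. One tone sits inside the 150 to 2500 Hz band and stands in for the voice; the other sits outside it and stands in for the music. Both fall exactly on FFT bins, so the split comes out clean here. Real voice and music overlap in frequency, so a fixed mask will not separate them this cleanly.

```matlab
% Synthetic check of the frequency-mask approach (no audio files needed).
fs = 8000;                           % assumed sample rate in Hz
N  = 8000;                           % one second of samples
t  = (0:N-1).'/fs;                   % column time vector

in_band  = sin(2*pi*440*t);          % stand-in for "voice": inside 150-2500 Hz
out_band = 0.5*sin(2*pi*3000*t);     % stand-in for "music": outside the band
mix = in_band + out_band;

% Frequency of each bin after fftshift, built from integer bin indices
% so the axis always has exactly N entries (works for even or odd N).
k = (-floor(N/2):ceil(N/2)-1).';
f = k*fs/N;

X = fftshift(fft(mix));

% Mask is symmetric in +/- frequency, so the filtered signals stay real.
band = abs(f) > 150 & abs(f) < 2500;

X_voc = X;  X_voc(~band) = 0;        % keep only the band
X_ins = X;  X_ins(band)  = 0;        % keep everything except the band

% Undo the shift, invert, and drop any rounding-level imaginary part.
voc = real(ifft(ifftshift(X_voc)));
ins = real(ifft(ifftshift(X_ins)));

% Compare each output with the component it should contain.
err_voc = max(abs(voc - in_band));
err_ins = max(abs(ins - out_band));
fprintf('max error: in-band %.2e, out-of-band %.2e\n', err_voc, err_ins);
```

Both errors should be at rounding level. Moving one of the tones across a threshold, or choosing a frequency that does not land on a bin, shows how quickly the separation degrades.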


Answers (0)
