How do I split apart an audio signal where a certain tone occurs?

Hello, I have a bunch of speech audio streams that consist of scripted and unscripted data. I am trying to code an automatic audio segmenter that takes its cue from an externally generated 1 kHz sine wave marking where the audio signal needs to be clipped. What I have done so far is window the original audio file into frames of 0.5 s and take a Fourier transform of each frame, so that the frame index tells me where in time the tone occurs (since a Fourier transform doesn't retain time information). From plotting each transformed frame, I've determined that the tone is present when the amplitude at 1 kHz on the x-axis is above 0.001. I can't figure out, though, how to iterate through each frame and concatenate frames based on the scheme below. Stars represent the frames where the tone occurs. I want to delete those frames and make new files composed of the frames between the beeps, labeled as below:
Attached is my code so far. The file to be segmented is too large to attach, but it is a 44.1 kHz recording about 318 seconds long. I would also like to eventually segment three audio files that captured the same content with different devices. This would entail lining up the files at the first beep and doing the segmentation simultaneously. However, I need to start with this first step of segmenting one file. The enframe function comes from VoiceBox. Any help is greatly appreciated. Thank you.
filename='7135459_12182015_1_subject.wav';
filePath=fullfile('C:\MATLAB\VoiceRec',filename);
[subject,fs]=audioread(filePath);
T_s = 1/fs;
Y=T_s*fft(subject);
absY=fftshift(abs(Y));
f2 = -fs/2:fs/length(Y):fs/2-(fs/length(Y));
figure(1)
plot(f2,fftshift(abs(Y)));
xlim([0 1200]);
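[f,t,w]=enframe(subject,44100/2); %windowing original signal with 44100/2 samples in each frame, 0.5 seconds long
enframeY = T_s*fft(f,[],2);
enframeabsY=fftshift(abs(enframeY),2); %shift along the frequency dimension only
nfft = size(enframeY,2);
enframef2 = -fs/2:fs/nfft:fs/2-(fs/nfft);
[~,toneBin] = min(abs(enframef2-1000)); %index of the frequency bin closest to 1 kHz
unscripted = []; %samples between the first and second beep
nBeeps = 0; %number of beeps seen so far
prevTone = false; %whether the previous frame contained the tone
for i = 1:30 %only doing first 30 frames for now
    % figure(i)
    % plot(enframef2,enframeabsY(i,:));
    % xlim([0 1200]);
    isTone = enframeabsY(i,toneBin) > 0.001;
    if isTone && ~prevTone
        nBeeps = nBeeps + 1; %a new beep starts in this frame
    elseif ~isTone && nBeeps == 1
        unscripted = [unscripted, f(i,:)]; %keep frames between the first two beeps
    end
    prevTone = isTone;
end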
  3 Comments
John BG on 10 Feb 2016
Can you attach a chunk of the audio file that includes at least a few beeps?
Without any further information, I would not start with an fft of the full track, but with a loop applying fft to a small enough time-shifting window.
John
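The windowed approach suggested here, combined with the grouping the question asks for, might look roughly like the sketch below. It is only an illustration under several assumptions: the signal is synthetic (a 300 Hz sine standing in for speech, with 1 kHz beeps placed in frames 3, 7 and 8), frames are cut with reshape rather than VoiceBox's enframe, the threshold of 0.05 applies to an amplitude-normalized spectrum rather than the 0.001 used in the question, and all variable names are made up for the example.

```matlab
fs = 44100;                 % sample rate in Hz
frameLen = fs/2;            % 0.5 s frames, as in the question
toneFreq = 1000;            % cue tone in Hz
ampThresh = 0.05;           % assumed threshold; tune it on real recordings

% Synthetic stand-in for the recording: 12 frames of a quiet 300 Hz sine,
% with a 1 kHz beep added to frames 3, 7 and 8.
nFramesSynth = 12;
t = (0:nFramesSynth*frameLen-1)'/fs;
x = 0.1*sin(2*pi*300*t);
for k = [3 7 8]
    idx = (k-1)*frameLen + (1:frameLen);
    x(idx) = x(idx) + 0.5*sin(2*pi*toneFreq*t(idx));
end

% Cut into non-overlapping frames, one per column (a trailing partial frame is dropped).
nFrames = floor(numel(x)/frameLen);
frames = reshape(x(1:nFrames*frameLen), frameLen, nFrames);

% Amplitude of the 1 kHz bin in every frame (fft works down the columns).
X = fft(frames);
bin = round(toneFreq*frameLen/fs) + 1;      % +1 because bin 1 is DC
toneAmp = 2*abs(X(bin,:))/frameLen;
isTone = toneAmp > ampThresh;

% Find runs of consecutive frames without the tone.
d = diff([0, ~isTone, 0]);
starts = find(d == 1);          % first frame of each run
stops = find(d == -1) - 1;      % last frame of each run

% Concatenate each run of frames back into one segment.
segments = cell(1, numel(starts));
for s = 1:numel(starts)
    seg = frames(:, starts(s):stops(s));
    segments{s} = seg(:);
    % audiowrite(sprintf('segment_%02d.wav', s), segments{s}, fs);
end
```

Each cell of segments then holds the samples between two beeps. With a real file, x would come from audioread, and the audiowrite line could be uncommented to save each segment.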
Star Strider on 10 Feb 2016
You’re not calculating your Fourier transform correctly. See the R2015a documentation for fft for the correct way to analyse your signals. Note particularly the code between the top two plot figures.
Please show your code using spectrogram.
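For reference, one common way to scale an fft into a single-sided amplitude spectrum is sketched below. This is an illustration on a synthetic 0.5 s frame (a 1 kHz tone of amplitude 0.5 plus a 300 Hz component of amplitude 0.1), not the exact code from the documentation page mentioned above, and the variable names are chosen for the example. The scaling differs from the T_s*fft(...) used in the question, so a threshold such as 0.001 would not carry over directly.

```matlab
fs = 44100;                 % sample rate in Hz
L = fs/2;                   % frame length in samples (0.5 s, even)
t = (0:L-1)'/fs;
frame = 0.5*sin(2*pi*1000*t) + 0.1*sin(2*pi*300*t);

Y = fft(frame);
P2 = abs(Y)/L;                       % two-sided amplitude spectrum
P1 = P2(1:L/2+1);                    % keep DC up to Nyquist
P1(2:end-1) = 2*P1(2:end-1);         % fold in the negative frequencies
f = fs*(0:L/2)'/L;                   % frequency axis in Hz

[peakAmp, peakIdx] = max(P1);
peakFreq = f(peakIdx);               % frequency of the strongest component
% plot(f, P1); xlim([0 1200]); xlabel('Frequency (Hz)'); ylabel('Amplitude');
```

With this scaling, a sine that fits a whole number of cycles in the frame shows up at its true amplitude, which makes the detection threshold easier to interpret.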

Answers (0)