LPC Analysis and Synthesis of Speech

This example shows how to implement a speech compression technique known as Linear Predictive Coding (LPC) using DSP System Toolbox™ functionality available at the MATLAB® command line.

Introduction

In this example, you implement LPC on a speech signal. This process consists of two steps: analysis and synthesis. In the analysis section, you extract the reflection coefficients from the signal and use them to compute the residual signal. In the synthesis section, you reconstruct the signal using the residual signal and the reflection coefficients. The residual signal and reflection coefficients require fewer bits to code than the original speech signal.

This diagram shows the system that you will implement in this example.

In this system, you first divide the speech signal into frames of 3200 samples with an overlap of 1600 samples, and multiply each frame by a Hamming window. You then compute the autocorrelation coefficients for lags 0 through 12 and calculate the twelfth-order reflection coefficients from them using the Levinson-Durbin algorithm. Next, you pass the original speech signal through an analysis filter, which is an all-zero lattice filter whose coefficients are the reflection coefficients from the previous step. The output of this filter is the residual signal. Finally, you pass the residual signal through a synthesis filter, which is the inverse of the analysis filter. Because the residual is not quantized in this example, the output of the synthesis filter reproduces the original signal.
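The analysis and synthesis chain described above can be sketched on a single frame with plain MATLAB and Signal Processing Toolbox functions. This is a minimal illustration under stated assumptions, not the streaming implementation used later in this example: the input is a synthetic second-order autoregressive signal standing in for real speech, and `filter` is applied with the prediction polynomial instead of the lattice System objects. All variable names are hypothetical.

```matlab
% Synthetic, strongly correlated frame: white noise shaped by a stable
% all-pole filter. (Assumption: stands in for one 3200-sample speech frame.)
rng(0);                                   % reproducible noise
N  = 3200;
x  = filter(1, [1 -1.6 0.8], randn(N, 1));

% Window the frame and estimate the autocorrelation at lags 0..12.
xw = hamming(N) .* x;
r  = xcorr(xw, 12, 'biased');             % lags -12..12
r  = r(13:end);                           % keep lags 0..12

% Levinson-Durbin: prediction polynomial a and reflection coefficients k.
[a, ~, k] = levinson(r);

% The analysis (all-zero) filter produces the residual. The synthesis
% (all-pole) filter is its inverse and restores the frame.
res  = filter(a, 1, x);
xrec = filter(1, a, res);

maxErr      = max(abs(x - xrec));         % reconstruction error (rounding only)
energyRatio = sum(res.^2) / sum(x.^2);    % residual energy relative to the input
```

For a strongly correlated input such as this one, `energyRatio` should be well below 1, which is consistent with the residual needing fewer bits than the signal itself, while `maxErr` should stay at the level of floating-point rounding.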

Initialization

Initialize the frame size and the FFT length constants, and instantiate the System objects. These objects also precompute any necessary variables or tables, which makes the processing calls inside the loop more efficient.

frameSize = 1600;
fftLen = 2048;

Create a System object™ to read from an audio file and determine the audio sampling rate.

audioReader = dsp.AudioFileReader(SamplesPerFrame=frameSize, OutputDataType='double');

fileInfo = info(audioReader);
Fs = fileInfo.SampleRate;

Create an FIR digital filter System object to use for pre-emphasis.

preEmphasisFilter = dsp.FIRFilter(Numerator=[1 -0.95]);

Create a buffer System object with a capacity of twice frameSize. Inside the processing loop, each read from this buffer returns 2*frameSize samples with an overlap of frameSize samples.

signalBuffer = dsp.AsyncBuffer(2*frameSize);
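The overlapping-frame pattern that this buffer supports can be illustrated with plain arrays. The sketch below does not use dsp.AsyncBuffer. It only emulates the idea of returning twice the frame length with one frame of overlap, and it assumes a zero history before the first frame. The sizes and variable names are hypothetical and chosen small for readability.

```matlab
% Hypothetical small sizes for illustration; the example uses frameSize = 1600.
demoFrameSize = 4;
demoInput     = (1:12).';                      % three frames of input samples
prevFrame     = zeros(demoFrameSize, 1);       % assumed zero history before frame 1
numFrames     = numel(demoInput) / demoFrameSize;
windows       = zeros(2*demoFrameSize, numFrames);

for m = 1:numFrames
    idx           = (m-1)*demoFrameSize + (1:demoFrameSize);
    newFrame      = demoInput(idx);
    windows(:, m) = [prevFrame; newFrame];     % previous frame followed by new frame
    prevFrame     = newFrame;                  % becomes the overlap for the next window
end
```

Each column of `windows` holds one analysis window. The second half of one column reappears as the first half of the next, which is the overlap of one frame described above.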

Create an FIR digital filter System object to use for analysis, and two all-pole digital IIR filter System objects to use for synthesis and de-emphasis.

analysisFilter = dsp.FIRFilter(...
                    Structure='Lattice MA',...
                    ReflectionCoefficientsSource='Input port');

synthesisFilter = dsp.AllpoleFilter(Structure='Lattice AR');

deEmphasisFilter = dsp.AllpoleFilter(Denominator=[1 -0.95]);
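The pre-emphasis and de-emphasis filters share the coefficients [1 -0.95], one as a numerator and the other as a denominator, so they undo each other. The following sketch checks this with the core `filter` function on an arbitrary random input (an assumption made here for illustration) rather than with the System objects, and evaluates the pre-emphasis gain at DC and at the Nyquist frequency.

```matlab
rng(1);
testInput = randn(1000, 1);               % arbitrary test input (assumption)
b         = [1 -0.95];                    % pre-emphasis numerator from this example

preEmph   = filter(b, 1, testInput);      % y[n] = x[n] - 0.95*x[n-1]
deEmph    = filter(1, b, preEmph);        % x[n] = y[n] + 0.95*x[n-1]

roundTripErr = max(abs(testInput - deEmph));   % rounding-level difference

% Magnitude of H(z) = 1 - 0.95*z^-1 at z = 1 (DC) and z = -1 (Nyquist).
gainDC      = abs(b(1) + b(2));           % |1 - 0.95|
gainNyquist = abs(b(1) - b(2));           % |1 + 0.95|
```

The pre-emphasis filter attenuates low frequencies and boosts high frequencies, and the de-emphasis filter applies the opposite shaping after synthesis.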

Create a System object to play the resulting audio.

audioWriter = audioDeviceWriter(SampleRate=Fs);

% Set up plots for visualization.
scope = spectrumAnalyzer(SampleRate=Fs, ...
    PlotAsTwoSidedSpectrum=false, YLimits=[-140, 0], ...
    Title='Linear Prediction of Speech', ...
    ShowLegend=true, ChannelNames={'Signal', 'LPC'});

Stream Processing Loop

Run the processing loop, which performs the LPC analysis and synthesis of the input audio signal using the System objects.

The loop stops when you reach the end of the input file, which is detected by the AudioFileReader System object.

while ~isDone(audioReader)
    % Read audio input
    sig = audioReader();                         
    
    % Analysis
    % Note that the filter coefficients are passed in as an argument to the
    % analysisFilter System object.    
    sigpreem   = preEmphasisFilter(sig);        
    write(signalBuffer,sigpreem);
    sigbuf     = read(signalBuffer,2*frameSize, frameSize);
    hammingwin = hamming(2*frameSize);
    sigwin     = hammingwin.*sigbuf;

    % Autocorrelation sequence for lags 0 through 12. xcorr returns lags
    % -12:12, so lag 0 is element 13.
    sigacf = xcorr(sigwin, 12, 'biased');
    sigacf = sigacf(13:end);
    
    % Compute the reflection coefficients from the autocorrelation function
    % using the Levinson-Durbin recursion. The function outputs both
    % polynomial coefficients and reflection coefficients. The polynomial
    % coefficients are used to compute and plot the LPC spectrum.
    [sigA, ~, sigK] = levinson(sigacf); % Levinson-Durbin
    siglpc          = analysisFilter(sigpreem, sigK);

    % Synthesis
    synthesisFilter.ReflectionCoefficients = sigK.';
    sigsyn = synthesisFilter(siglpc);          
    sigout = deEmphasisFilter(sigsyn);         
    
    % Play output audio
    audioWriter(sigout);

    % Update plots
    sigA_padded = zeros(size(sigwin), like=sigA.'); % Zero-padded to plot
    sigA_padded(1:size(sigA.',1), :) = sigA.';
    scope([sigwin, sigA_padded]);
end
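The autocorrelation indexing and the Levinson-Durbin outputs used inside the loop can be checked in isolation. The sketch below is a minimal illustration on a synthetic first-order autoregressive signal (an assumption, in place of a windowed speech frame). It confirms which element of the xcorr output holds lag 0 and relates the prediction error returned by `levinson` to the reflection coefficients. Variable names are hypothetical.

```matlab
rng(2);
testSig = filter(1, [1 -0.9], randn(512, 1));   % synthetic correlated signal (assumption)
order   = 12;

rFull = xcorr(testSig, order, 'biased');   % lags -12..12, 25 values
r     = rFull(order+1:end);                % lags 0..12, 13 values

[a, e, k] = levinson(r);                   % order defaults to numel(r)-1 = 12

lag0  = sum(testSig.^2) / numel(testSig);  % biased estimate at lag 0
ePred = r(1) * prod(1 - k.^2);             % prediction error implied by k
```

Here `r(1)` should match `lag0`, and `e` should match `ePred` up to rounding. With a biased autocorrelation estimate, every reflection coefficient has magnitude below 1, which keeps the all-pole synthesis filter stable.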

Release

Call the release method on the System objects to close any open files and devices.

release(audioReader);
pause(10*audioReader.SamplesPerFrame/audioReader.SampleRate); % Wait until audio finishes playing
release(audioWriter);
release(scope);

Conclusion

This example showed you how to implement LPC, a speech compression technique, using DSP System Toolbox functionality available at the MATLAB command line. The example required only that you call System objects with appropriate input arguments. It did not involve any error-prone manual state tracking.