Audio Sample-Rate Conversion

This example shows how to use a multistage/multirate approach to sample rate conversion between different audio sampling rates.

The example uses dsp.SampleRateConverter. This component automatically determines how many stages to use and designs the filter required for each stage in order to perform the sample rate conversion in a computationally efficient manner.

This example focuses on converting an audio signal sampled at 96 kHz (DVD quality) to an audio signal sampled at 44.1 kHz (CD quality). The comparison is done using data sampled at 96 kHz available online at http://src.infinitewave.ca/. To run this example, you must download the files at http://src.infinitewave.ca/TestSignals.zip and add them to the MATLAB path using addpath.

Reading the 96 kHz File

The website above has three sets of files at different qualities for performing the comparison. This example focuses on only one of the files: Swept_int.wav. This file contains a chirp sine wave sweeping from 0 Hz to 48 kHz over the course of 8 seconds. The file stores 32-bit integers, giving it a very high dynamic range.

% Here you create a System object to read from the audio file and
% determine the file's audio sampling rate.
frameSize = 320;
AFR96 = dsp.AudioFileReader('Swept_int.wav', ...
    'SamplesPerFrame', frameSize, ...
    'OutputDataType', 'double');

fileInfo = info(AFR96);
inFs = fileInfo.SampleRate;

Load Spectrum Analyzers

The settings for the spectrum analyzers used in this example have been saved to a MAT file. The MAT file contains four saved spectrum analyzers: two to be used for spectrograms and two to be used for power spectra. These are used to visualize the frequency content of the original signal as well as of the signals converted to 44.1 kHz.

load srcSpectrumAnalyzers

Spectrum of Original Signal Sampled at 96 kHz

The loop below plots the spectrogram and power spectrum of the original 96 kHz signal. The power spectrum is shown with max-hold, which keeps a trace of the maximum power value seen at each frequency. This helps visualize the progression of the chirp over time.

while ~isDone(AFR96)
    % Source
    sig = step(AFR96);

    % Spectrogram
    step(Spectrogram96,sig);

    % Power spectrum with max hold
    step(PowerSpectrum96,sig);
end
release(AFR96)
release(Spectrogram96)
release(PowerSpectrum96)

Setting up the Sample Rate Converter

To convert the signal, dsp.SampleRateConverter is used. A first attempt sets the bandwidth of interest to 40 kHz, that is, it covers the range [-20 kHz, 20 kHz]. This is the range generally accepted as audible to humans. The stopband attenuation of the filters that remove spectral replicas and aliased replicas is left at the default value of 80 dB.

BW40 = 40e3;
OutFs = 44.1e3;
SRC40kHz80dB = dsp.SampleRateConverter('Bandwidth',BW40,...
    'InputSampleRate',inFs,'OutputSampleRate',OutFs);

Analysis of the Filters Involved in the Conversion

Use info() to get information on the filters that are designed to perform the conversion. This reveals that the conversion is performed in two steps. The first step is a decimate-by-two filter, which converts the signal from 96 kHz to 48 kHz. The second step is an FIR rate converter that interpolates by 147 and decimates by 160, which yields the required 44.1 kHz. The freqz command can be used to visualize the combined frequency response of the two stages. Zooming in reveals that the passband extends up to 20 kHz as specified and that the passband ripple is in the milli-dB range (less than 0.003 dB).

info(SRC40kHz80dB)
[H80dB,f] = freqz(SRC40kHz80dB,0:10:25e3);
plot(f,20*log10(abs(H80dB)/norm(H80dB,inf)));
xlabel('Frequency (Hz)');
ylabel('Magnitude (dB)');
axis([0 25e3 -140 5]);
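% Read one frame and call setup so that one-time initialization is not
% included in the timing below. Release the reader afterwards so that the
% main loop starts again from the beginning of the file.
sig = step(AFR96);
setup(SRC40kHz80dB,sig);
release(AFR96);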
ans =

Overall Interpolation Factor    : 147
Overall Decimation Factor       : 320
Number of Filters               : 2
Multiplications per Input Sample: 42.334375
Number of Coefficients          : 8618
Filters:                         
   Filter 1:
   dsp.FIRDecimator     - Decimation Factor   : 2 
   Filter 2:
   dsp.FIRRateConverter - Interpolation Factor: 147
                        - Decimation Factor   : 160 
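
The factors reported by info can be checked by hand. The following is a minimal sketch that uses only base MATLAB integer arithmetic; the variable names (fsIn, fsOut, fsMid, and so on) are illustrative and are not part of the example's workspace.

```matlab
% Sketch: verify the rate-change factors reported by info using integer
% arithmetic only. All variable names here are illustrative.
fsIn  = 96000;                 % input sample rate (Hz)
fsOut = 44100;                 % output sample rate (Hz)

g = gcd(fsOut, fsIn);          % greatest common divisor of the two rates
L = fsOut / g;                 % overall interpolation factor
M = fsIn  / g;                 % overall decimation factor

% Stage 1 decimates by 2, so stage 2 must supply the remaining ratio.
fsMid = fsIn / 2;              % rate after the first stage (Hz)
g2 = gcd(fsOut, fsMid);
L2 = fsOut / g2;               % stage-2 interpolation factor
M2 = fsMid / g2;               % stage-2 decimation factor

fprintf('Overall: %d/%d, stage 2: %d/%d\n', L, M, L2, M2);
```

The overall ratio reduces to 147/320, and the ratio left after the decimate-by-two stage reduces to 147/160, in agreement with the output above.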


Create an Audio File Writer for the Converted Signal

Once the signal has been converted to 44.1 kHz, you can write it back to an audio file using dsp.AudioFileWriter.

% Here you create a System object to write the audio file
AFW44p1 = dsp.AudioFileWriter('Swept_44p1kHz.wav',...
    'FileFormat','WAV','DataType','int32',...
    'SampleRate',OutFs);

Main Processing Loop

The loop below reads a frame of 320 samples at a time, converts it to a frame of 147 samples (sampled at 44.1 kHz), and writes the result to disk. The conversion time is measured using tic/toc. On most machines, the conversion takes less than 8 seconds. Since the signal itself lasts 8 seconds, this processing is fast enough for real-time conversion of a 96 kHz signal (for instance, when converting a 96 kHz signal from a live audio input port rather than from disk).

tic
while ~isDone(AFR96)
    sig = step(AFR96);            % Read audio input
    sig = step(SRC40kHz80dB,sig); % Convert sample-rate
    step(AFW44p1, sig);           % Write output audio
end
toc
release(AFR96);
release(AFW44p1);
Elapsed time is 1.601691 seconds.
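
The real-time argument can be made concrete with a little arithmetic. The sketch below takes the elapsed time printed above as a given; that value is machine dependent, and the variable names are illustrative only.

```matlab
% Sketch: frame duration and real-time margin for the run shown above.
% The elapsed time is copied from the toc output and varies by machine.
frameSize = 320;          % input samples per frame
fsIn      = 96000;        % input sample rate (Hz)
signalDur = 8;            % duration of the chirp (s)
elapsed   = 1.601691;     % elapsed time reported by toc (s)

frameDur    = frameSize / fsIn;        % time spanned by one input frame (s)
outPerFrame = frameSize * 147 / 320;   % output samples produced per frame
rtFactor    = signalDur / elapsed;     % greater than 1: faster than real time

fprintf('Frame duration: %.3f ms\n', 1e3 * frameDur);
fprintf('Output samples per frame: %d\n', outPerFrame);
fprintf('Real-time factor: %.2f\n', rtFactor);
```

Each 320-sample frame spans about 3.33 ms of audio, so a live converter would need to finish one frame in less than that time. In the run shown above, the conversion ran roughly five times faster than real time.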

Reading the 44.1 kHz File

Here you set up an audio file reader for the 44.1 kHz signal.

AFR44p1 = dsp.AudioFileReader('Swept_44p1kHz.wav', ...
    'SamplesPerFrame', frameSize, ...
    'OutputDataType', 'double');

Visualize Spectrum of 44.1 kHz File

Now the spectrogram and power spectrum of the converted signal are plotted. The extra lines in the spectrogram correspond to spectral aliases/images left by the filters in the sample rate converter. The replicas are attenuated by more than 80 dB, as can be verified with the power spectrum plot.

while ~isDone(AFR44p1)
    % Source
    sig = step(AFR44p1);

    % Spectrogram
    step(Spectrogram44p1,sig);

    % Power spectrum with max hold
    step(PowerSpectrum44p1,sig);
end
release(AFR44p1)
release(Spectrogram44p1)
release(PowerSpectrum44p1)

A More Precise Sample Rate Converter

To improve the quality of the sample rate converter, two changes can be made. First, the bandwidth can be extended from 40 kHz to 43.5 kHz. This in turn requires filters with a sharper transition. Second, the stopband attenuation can be increased from 80 dB to 160 dB. Both changes come at the expense of more filter coefficients overall as well as more multiplications per input sample.

BW43p5 = 43.5e3;
SRC43p5kHz160dB = dsp.SampleRateConverter('Bandwidth',BW43p5,...
    'InputSampleRate',inFs,'OutputSampleRate',OutFs,...
    'StopbandAttenuation',160);
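
To get a feel for why the wider bandwidth calls for sharper filters, compare the room left for the transition band in the two designs. The sketch below assumes that the stopband begins at the output Nyquist frequency (22.05 kHz); this is a simplifying assumption made for illustration, not a statement of how dsp.SampleRateConverter places its band edges internally.

```matlab
% Sketch: approximate transition widths for the two bandwidth settings.
% ASSUMPTION: the stopband starts at the output Nyquist frequency.
fsOut = 44.1e3;                     % output sample rate (Hz)
bw    = [40e3 43.5e3];              % two-sided bandwidths of interest (Hz)

passEdge   = bw / 2;                % one-sided passband edges (Hz)
stopEdge   = fsOut / 2;             % assumed stopband edge (Hz)
transWidth = stopEdge - passEdge;   % transition widths (Hz)

fprintf('Transition width: %g Hz (40 kHz), %g Hz (43.5 kHz)\n', ...
    transWidth(1), transWidth(2));
```

Under this assumption the transition band shrinks from 2050 Hz to 300 Hz, which is why many more filter coefficients are needed.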

Analysis of the Filters Involved in the Conversion

The previous sample rate converter involved 8618 filter coefficients and a computational cost of 42.3 multiplications per input sample. By increasing the bandwidth and stopband attenuation, the cost increases substantially to 123896 filter coefficients and 440.34 multiplications per input sample. The frequency response reveals a much sharper filter transition as well as larger stopband attenuation. Moreover, the passband ripple is now in the micro-dB scale.

info(SRC43p5kHz160dB)
[H160dB,f] = freqz(SRC43p5kHz160dB,0:10:25e3);
plot(f,20*log10(abs(H160dB)/norm(H160dB,inf)));
xlabel('Frequency (Hz)');
ylabel('Magnitude (dB)');
axis([0 25e3 -250 5]);
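% Call setup ahead of the timed loop, then release the reader so that the
% main loop starts again from the beginning of the file.
sig = step(AFR96);
setup(SRC43p5kHz160dB,sig);
release(AFR96);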
ans =

Overall Interpolation Factor    : 147
Overall Decimation Factor       : 320
Number of Filters               : 2
Multiplications per Input Sample: 440.340625
Number of Coefficients          : 123896
Filters:                         
   Filter 1:
   dsp.FIRDecimator     - Decimation Factor   : 2 
   Filter 2:
   dsp.FIRRateConverter - Interpolation Factor: 147
                        - Decimation Factor   : 160 
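
The two designs can be compared directly from the numbers that info reports. The following sketch simply copies those figures and scales them by the 96 kHz input rate; the variable names are illustrative.

```matlab
% Sketch: compare the cost of the 80 dB and 160 dB designs using the
% figures reported by info. Values are copied from the outputs above.
fsIn          = 96000;                      % input sample rate (Hz)
numCoeffs     = [8618 123896];              % [80 dB design, 160 dB design]
multPerSample = [42.334375 440.340625];     % multiplications per input sample

multPerSecond = multPerSample * fsIn;       % multiplications per second
coeffRatio    = numCoeffs(2) / numCoeffs(1);
multRatio     = multPerSample(2) / multPerSample(1);

fprintf('Multiplications per second: %.0f vs %.0f\n', ...
    multPerSecond(1), multPerSecond(2));
fprintf('Coefficient ratio: %.1f, multiplication ratio: %.1f\n', ...
    coeffRatio, multRatio);
```

The more precise converter needs about 14 times as many coefficients and about 10 times as many multiplications per input sample, which works out to roughly 42.3 million multiplications per second at a 96 kHz input rate.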


Create an Audio File Writer for the Converted Signal

The converted signal will be written to a different file.

% Here you create a System object to write the audio file
AFW44p1160 = dsp.AudioFileWriter('Swept_44p1kHz160dB.wav',...
    'FileFormat','WAV','DataType','int32',...
    'SampleRate',OutFs);

Main Processing Loop

The processing is repeated with the more precise sample rate converter. On most systems, the conversion still takes less than 8 seconds despite the higher computational cost. This is possible as long as the processor can sustain roughly 42.3 million multiplications per second (440.34 multiplications per input sample at an input rate of 96 kHz), plus a comparable number of additions.

tic
while ~isDone(AFR96)
    sig = step(AFR96);               % Read audio input
    sig = step(SRC43p5kHz160dB,sig); % Convert sample-rate
    step(AFW44p1160, sig);           % Write output audio
end
toc
release(AFR96);
release(AFW44p1160);
Elapsed time is 2.317134 seconds.
AFR44p1160 = dsp.AudioFileReader('Swept_44p1kHz160dB.wav', ...
    'SamplesPerFrame', frameSize, ...
    'OutputDataType', 'double');

Visualize Spectrum of 44.1 kHz File

Once again, the spectrogram and power spectrum of the converted signal are plotted. Notice that the imaging/aliasing is attenuated enough that it is no longer visible in the spectrogram. The power spectrum shows spectral aliases attenuated by more than 160 dB.

while ~isDone(AFR44p1160)
    % Source
    sig = step(AFR44p1160);

    % Spectrogram
    step(Spectrogram44p1,sig);

    % Power spectrum with max hold
    step(PowerSpectrum44p1,sig);
end
release(AFR44p1160)
release(Spectrogram44p1)
release(PowerSpectrum44p1)

Farrow Sample Rate Converter

Sample rate converters based on Farrow filters using quadratic or cubic polynomial interpolation are appealing because of the low number of filter coefficients that need to be stored in memory as well as the low number of operations per input sample that need to be computed.

FARROWSRC = dsp.FarrowRateConverter('InputSampleRate',inFs,...
     'OutputSampleRate',OutFs);
cost(FARROWSRC)
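% Call setup ahead of the timed loop, then release the reader so that the
% main loop starts again from the beginning of the file.
sig = step(AFR96);
setup(FARROWSRC,sig);
release(AFR96);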
ans = 

                  NumCoefficients: 16
                        NumStates: 3
    MultiplicationsPerInputSample: 5.5125
          AdditionsPerInputSample: 5.0531
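
To relate these figures to the FIR designs above, the cost can be rescaled to operations per second and per output sample. The sketch below only rearranges the numbers printed by cost; the variable names are illustrative.

```matlab
% Sketch: rescale the Farrow cost figures reported by cost(FARROWSRC).
fsIn  = 96000;                     % input sample rate (Hz)
L     = 147;                       % overall interpolation factor
M     = 320;                       % overall decimation factor
multPerInput = 5.5125;             % multiplications per input sample

multPerSecond = multPerInput * fsIn;     % multiplications per second
multPerOutput = multPerInput * M / L;    % multiplications per output sample

fprintf('Multiplications per second: %.0f\n', multPerSecond);
fprintf('Multiplications per output sample: %.4g\n', multPerOutput);
```

This comes to 529,200 multiplications per second, or 12 per output sample, compared with about 4.06 million per second for the 80 dB FIR design.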

Create an Audio File Writer for the Converted Signal

% Here you create a System object to write the audio file
AFW44p1Farrow = dsp.AudioFileWriter('Swept_44p1kHzFarrow.wav',...
    'FileFormat','WAV','DataType','int32',...
    'SampleRate',OutFs);

Main Processing Loop

Once again, a loop is used to convert one frame at a time.

tic
while ~isDone(AFR96)
    sig = step(AFR96);         % Read audio input
    sig = step(FARROWSRC,sig); % Convert sample-rate
    step(AFW44p1Farrow, sig);  % Write output audio
end
toc
release(AFR96);
release(AFW44p1Farrow);
Elapsed time is 8.273520 seconds.
AFR44p1Farrow = dsp.AudioFileReader('Swept_44p1kHzFarrow.wav', ...
    'SamplesPerFrame', frameSize, ...
    'OutputDataType', 'double');

Visualize Spectrum of 44.1 kHz File

As before, the spectrogram and power spectrum are plotted to analyze the result. Note that there is a substantial amount of imaging/aliasing in the converted signal. This is the drawback of the Farrow filter: although the computational cost is low, the performance of the rate converter is also relatively poor.

while ~isDone(AFR44p1Farrow)
    % Source
    sig = step(AFR44p1Farrow);

    % Spectrogram
    step(Spectrogram44p1,sig);

    % Power spectrum with max hold
    step(PowerSpectrum44p1,sig);
end
release(AFR44p1Farrow)
release(Spectrogram44p1)
release(PowerSpectrum44p1)
