Documentation |
On this page… |
---|
The goal of spectral estimation is to describe the distribution (over frequency) of the power contained in a signal, based on a finite set of data. Estimation of power spectra is useful in a variety of applications, including the detection of signals buried in wideband noise.
The power spectral density (PSD) of a stationary random process x(n) is mathematically related to the autocorrelation sequence by the discrete-time Fourier transform. In terms of normalized frequency, this is given by
$${P}_{xx}(\omega )=\frac{1}{2\pi}{\displaystyle \sum _{m=-\infty}^{\infty}{R}_{xx}(m){e}^{-j\omega m}}.$$
This can be written as a function of physical frequency f (e.g., in hertz) by using the relation ω = 2πf / f_{s}, where f_{s} is the sampling frequency:
$${P}_{xx}(f)=\frac{1}{{f}_{s}}{\displaystyle \sum _{m=-\infty}^{\infty}{R}_{xx}(m){e}^{-j2\pi mf/{f}_{s}}}.$$
The correlation sequence can be derived from the PSD by use of the inverse discrete-time Fourier transform:
$${R}_{xx}(m)={\displaystyle \underset{-\pi}{\overset{\pi}{\int}}{P}_{xx}(\omega ){e}^{j\omega m}d\omega}={\displaystyle \underset{-{f}_{s}/2}{\overset{{f}_{s}/2}{\int}}{P}_{xx}(f){e}^{j2\pi mf}{}^{/{f}_{s}}df}.$$
The average power of the sequence x(n) over the entire Nyquist interval is represented by
$${R}_{xx}(0)={\displaystyle \underset{-\pi}{\overset{\pi}{\int}}{P}_{xx}(\omega )d\omega ={\displaystyle \underset{-{f}_{s}/2}{\overset{{f}_{s}/2}{\int}}{P}_{xx}(f)df}}.$$
The average power of a signal over a particular frequency band [ω_{1}, ω_{2}], 0 ≤ ω_{1} ≤ ω_{2} ≤ π, can be found by integrating the PSD over that band:
$${\overline{P}}_{[{\omega}_{1},{\omega}_{2}]}={\displaystyle {\int}_{{\omega}_{1}}^{{\omega}_{2}}{P}_{xx}}(\omega )d\omega ={\displaystyle {\int}_{-{\omega}_{2}}^{-{\omega}_{1}}{P}_{xx}}(\omega )d\omega .$$
You can see from the above expression that P_{xx}(ω) represents the power content of a signal in an infinitesimal frequency band, which is why it is called the power spectral density.
The units of the PSD are power (e.g., watts) per unit of frequency. In the case of P_{xx}(ω), this is watts/radian/sample or simply watts/radian. In the case of P_{xx}(f), the units are watts/hertz. Integration of the PSD with respect to frequency yields units of watts, as expected for the average power .
For real–valued signals, the PSD is symmetric about DC, and thus P_{xx}(ω) for 0 ≤ ω ≤ π is sufficient to completely characterize the PSD. However, to obtain the average power over the entire Nyquist interval, it is necessary to introduce the concept of the one-sided PSD.
The one-sided PSD is given by
$${P}_{\text{onesided}}(\omega )=\{\begin{array}{ll}0,\hfill & -\pi \le \omega <0,\hfill \\ 2{P}_{xx}(\omega ),\hfill & 0\le \omega \le \pi .\hfill \end{array}$$
The average power of a signal over the frequency band, [ω_{1},ω_{2}] with 0 ≤ ω_{1} ≤ ω_{2} ≤ π, can be computed using the one-sided PSD as
$${\overline{P}}_{[{\omega}_{1},{\omega}_{2}]}={\displaystyle {\int}_{{\omega}_{1}}^{{\omega}_{2}}{P}_{\text{onesided}}}(\omega )d\omega .$$
The various methods of spectrum estimation available in the toolbox are categorized as follows:
Nonparametric methods are those in which the PSD is estimated directly from the signal itself. The simplest such method is the periodogram. Other nonparametric techniques such as Welch's method [8], the multitaper method (MTM) reduce the variance of the periodogram.
Parametric methods are those in which the PSD is estimated from a signal that is assumed to be output of a linear system driven by white noise. Examples are the Yule-Walker autoregressive (AR) method and the Burg method. These methods estimate the PSD by first estimating the parameters (coefficients) of the linear system that hypothetically generates the signal. They tend to produce better results than classical nonparametric methods when the data length of the available signal is relatively short. Parametric methods also produce smoother estimates of the PSD than nonparametric methods, but are subject to error from model misspecification.
Subspace methods, also known as high-resolution methods or super-resolution methods, generate frequency component estimates for a signal based on an eigenanalysis or eigendecomposition of the autocorrelation matrix. Examples are the multiple signal classification (MUSIC) method or the eigenvector (EV) method. These methods are best suited for line spectra — that is, spectra of sinusoidal signals — and are effective in the detection of sinusoids buried in noise, especially when the signal to noise ratios are low. The subspace methods do not yield true PSD estimates: they do not preserve process power between the time and frequency domains, and the autocorrelation sequence cannot be recovered by taking the inverse Fourier transform of the frequency estimate.
All three categories of methods are listed in the table below with the corresponding toolbox function names. See Parametric Modeling for details about lpc and other parametric estimation functions.
Spectral Estimation Methods/Functions
Method | Description | Functions |
---|---|---|
Periodogram | Power spectral density estimate | |
Welch | Averaged periodograms of overlapped, windowed signal sections | |
Multitaper | Spectral estimate from combination of multiple orthogonal windows (or "tapers") | |
Yule-Walker AR | Autoregressive (AR) spectral estimate of a time-series from its estimated autocorrelation function | |
Burg | Autoregressive (AR) spectral estimation of a time-series by minimization of linear prediction errors | |
Autoregressive (AR) spectral estimation of a time-series by minimization of the forward prediction errors | ||
Modified Covariance | Autoregressive (AR) spectral estimation of a time-series by minimization of the forward and backward prediction errors | |
MUSIC | Multiple signal classification | |
Eigenvector | Pseudospectrum estimate |
The following sections discuss the periodogram, modified periodogram, Welch, and multitaper methods of nonparametric estimation, along with the related CPSD function, transfer function estimate, and coherence function.
In general terms, one way of estimating the PSD of a process is to simply find the discrete-time Fourier transform of the samples of the process (usually done on a grid with an FFT) and appropriately scale the magnitude squared of the result. This estimate is called the periodogram.
The periodogram estimate of the PSD of a length-L signal x_{L}(n) is
$${P}_{xx}(f)=\frac{1}{L{F}_{s}}{\left|{\displaystyle \sum _{n=0}^{L-1}{x}_{L}}(n){e}^{-j2\pi fn/{F}_{s}}\right|}^{2},$$
where F_{s} is the sampling frequency.
In practice, the actual computation of P_{xx}(f) can be performed only at a finite number of frequency points, and usually employs an FFT. Most implementations of the periodogram method compute the N-point PSD estimate at the frequencies
$${f}_{k}=\frac{k{F}_{s}}{N},\text{\hspace{1em}}k=0,1,\dots ,N-1.$$
In some cases, the computation of the periodogram via an FFT algorithm is more efficient if the number of frequencies is a power of two. Therefore it is not uncommon to pad the input signal with zeros to extend its length to a power of two.
As an example of the periodogram, consider the following 1001-element signal xn, which consists of two sinusoids plus noise:
fs = 1000; % Sampling frequency t = (0:fs)/fs; % One second worth of samples A = [1 2]; % Sinusoid amplitudes (row vector) f = [150;140]; % Sinusoid frequencies (column vector) xn = A*sin(2*pi*f*t) + 0.1*randn(size(t));
Note The three last lines illustrate a convenient and general way to express the sum of sinusoids. Together they are equivalent to xn = sin(2*pi*150*t) + 2*sin(2*pi*140*t) + 0.1*randn(size(t)); |
The periodogram estimate of the PSD can be computed using periodogram. In this case, the data vector is multiplied by a Hamming window to produce a modified periodogram.
[Pxx,F] = periodogram(xn,hamming(length(xn)),length(xn),fs); plot(F,10*log10(Pxx)) xlabel('Hz') ylabel('dB') title('Modified Periodogram Power Spectral Density Estimate')
Algorithm. Periodogram computes and scales the output of the FFT to produce the power vs. frequency plot as follows. For a detailed example, see Power Spectral Density Estimates Using FFT.
If the input signal is real-valued, the magnitude of the resulting FFT is symmetric with respect to zero frequency (DC). For an even-length FFT, only the first (1 + nfft/2) points are unique. Determine the number of unique values and keep only those unique points.
Take the squared magnitudes of the unique FFT values. Scale the squared magnitudes (except for DC) by 2/(F_{s}N), where N is the length of signal prior to any zero padding. Scale the DC value by 1/(F_{s}N).
Create a frequency vector from the number of unique points, the nfft and the sampling frequency.
Plot the resulting magnitude squared FFT against the frequency.
The following sections discuss the performance of the periodogram with regard to the issues of leakage, resolution, bias, and variance.
Spectral Leakage. Consider the PSD of a finite-length (length L) signal x_{L}(n), as discussed in the Periodogram section. It is frequently useful to interpret x_{L}(n) as the result of multiplying an infinite signal, x(n), by a finite-length rectangular window, w_{R}(n):
$${x}_{L}[n]=x[n]{w}_{R}[n].$$
Because multiplication in the time domain corresponds to convolution in the frequency domain, the expected value of the periodogram in the frequency domain is
$$E\{{\widehat{P}}_{xx}(f)\}=\frac{1}{Fs}{\displaystyle {\int}_{-Fs/2}^{Fs/2}\frac{{\mathrm{sin}}^{2}(L\pi (f-{f}^{\prime})/Fs)}{L{\mathrm{sin}}^{2}(\pi (f-{f}^{\prime})/Fs)}}{P}_{xx}({f}^{\prime})d{f}^{\prime},$$
showing that the expected value of the periodogram is the convolution of the true PSD with the square of the Dirichlet kernel.
The effect of the convolution is best understood for sinusoidal data. Suppose that x(n) is composed of a sum of M complex sinusoids:
$$x(n)={\displaystyle \sum _{k=1}^{N}{A}_{k}}{e}^{j{\omega}_{k}n}.$$
Its spectrum is
$$X(\omega )={\displaystyle \sum _{k=1}^{N}{A}_{k}}\delta (\omega -{\omega}_{k}),$$
which for a finite-length sequence becomes
$$X(\omega )={\displaystyle {\int}_{-\pi}^{\pi}\left({\displaystyle \sum _{k=1}^{N}{A}_{k}\delta (\epsilon -{\omega}_{k})}\right){W}_{R}(\omega -\epsilon )\text{\hspace{0.17em}}d\epsilon}.$$
The preceding equation is equal to:
$$X(\omega )={\displaystyle \sum _{k=1}^{N}{A}_{k}}{W}_{R}(\omega -{\omega}_{k}).$$
So in the spectrum of the finite-length signal, the Dirac deltas have been replaced by terms of the form W_{R}(ω – ω_{k}), which corresponds to the frequency response of a rectangular window centered on the frequency ω_{k}.
The frequency response of a rectangular window has the shape of a sinc signal, as shown below.
The plot displays a main lobe and several side lobes, the largest of which is approximately 13.5 dB below the mainlobe peak. These lobes account for the effect known as spectral leakage. While the infinite-length signal has its power concentrated exactly at the discrete frequency points f_{k}, the windowed (or truncated) signal has a continuum of power "leaked" around the discrete frequency points f_{k}.
Because the frequency response of a short rectangular window is a much poorer approximation to the Dirac delta function than that of a longer window, spectral leakage is especially evident when data records are short. Consider the following sequence of 100 samples:
fs = 1000; % Sampling frequency t = (0:fs/10)/fs; % One-tenth second worth of samples A = [1 2]; % Sinusoid amplitudes f = [150;140]; % Sinusoid frequencies xn = A*sin(2*pi*f*t) + 0.1*randn(size(t)); [Pxx,F] = periodogram(xn,rectwin(length(xn)),length(xn),fs); plot(F,10*log10(Pxx))
It is important to note that the effect of spectral leakage is contingent solely on the length of the data record. It is not a consequence of the fact that the periodogram is computed at a finite number of frequency samples.
Resolution. Resolution refers to the ability to discriminate spectral features, and is a key concept on the analysis of spectral estimator performance.
In order to resolve two sinusoids that are relatively close together in frequency, it is necessary for the difference between the two frequencies to be greater than the width of the mainlobe of the leaked spectra for either one of these sinusoids. The mainlobe width is defined to be the width of the mainlobe at the point where the power is half the peak mainlobe power (i.e., 3 dB width). This width is approximately equal to f_{s} / L.
In other words, for two sinusoids of frequencies f_{1} and f_{2}, the resolvability condition requires that
$${f}_{2}-{f}_{1}>\frac{Fs}{L}.$$
In the example above, where two sinusoids are separated by only 10 Hz, the data record must be greater than 100 samples to allow resolution of two distinct sinusoids by a periodogram.
Consider a case where this criterion is not met, as for the sequence of 67 samples below:
fs = 1000; % Sampling frequency t = (0:fs/15)/fs; % 67 samples A = [1 2]; % Sinusoid amplitudes f = [150;140]; % Sinusoid frequencies xn = A*sin(2*pi*f*t) + 0.1*randn(size(t)); [Pxx,F] = periodogram(xn,rectwin(length(xn)),length(xn),fs); plot(F,10*log10(Pxx))
The above discussion about resolution did not consider the effects of noise since the signal-to-noise ratio (SNR) has been relatively high thus far. When the SNR is low, true spectral features are much harder to distinguish, and noise artifacts appear in spectral estimates based on the periodogram. The example below illustrates this:
fs = 1000; % Sampling frequency t = (0:fs/10)/fs; % One-tenth second worth of samples A = [1 2]; % Sinusoid amplitudes f = [150;140]; % Sinusoid frequencies xn = A*sin(2*pi*f*t) + 2*randn(size(t)); [Pxx,F] = periodogram(xn,rectwin(length(xn)),length(xn),fs); plot(F,10*log10(Pxx))
Bias of the Periodogram. The periodogram is a biased estimator of the PSD. Its expected value was previously shown to be.
$$E\{{\widehat{P}}_{xx}(f)\}=\frac{1}{Fs}{\displaystyle {\int}_{-Fs/2}^{Fs/2}\frac{{\mathrm{sin}}^{2}(L\pi (f-{f}^{\prime})/Fs)}{L{\mathrm{sin}}^{2}(\pi (f-{f}^{\prime})/Fs)}}{P}_{xx}({f}^{\prime})d{f}^{\prime}.$$
The periodogram is asymptotically unbiased, which is evident from the earlier observation that as the data record length tends to infinity, the frequency response of the rectangular window more closely approximates the Dirac delta function. However, in some cases the periodogram is a poor estimator of the PSD even when the data record is long. This is due to the variance of the periodogram, as explained below.
Variance of the Periodogram. The variance of the periodogram can be shown to be
$$\text{Var}({\widehat{P}}_{xx}(f))=\{\begin{array}{cc}{P}_{xx}^{2}(f),& 0<f<Fs/2,\\ 2{P}_{xx}^{2}(f),& f=0\text{or}f=Fs/2,\end{array}$$
which indicates that the variance does not tend to zero as the data length L tends to infinity. In statistical terms, the periodogram is not a consistent estimator of the PSD. Nevertheless, the periodogram can be a useful tool for spectral estimation in situations where the SNR is high, and especially if the data record is long.
The modified periodogram windows the time-domain signal prior to computing the DFT in order to smooth the edges of the signal. This has the effect of reducing the height of the sidelobes or spectral leakage. This phenomenon gives rise to the interpretation of sidelobes as spurious frequencies introduced into the signal by the abrupt truncation that occurs when a rectangular window is used. For nonrectangular windows, the end points of the truncated signal are attenuated smoothly, and hence the spurious frequencies introduced are much less severe. On the other hand, nonrectangular windows also broaden the mainlobe, which results in a reduction of resolution.
periodogram allows you to compute a modified periodogram by specifying the window to be used on the data. For example, compare a default rectangular window and a Hamming window:
fs = 1000; % Sampling frequency t = (0:fs/10)/fs; % One-tenth second worth of samples A = [1 2]; % Sinusoid amplitudes f = [150;140]; % Sinusoid frequencies xn = A*sin(2*pi*f*t) + 0.1*randn(size(t)); [Pxx,F] = periodogram(xn,rectwin(length(xn)),length(xn),fs); plot(F,10*log10(Pxx))
[Pxx,F] = periodogram(xn,hamming(length(xn)),length(xn),fs); plot(F,10*log10(Pxx))
You can verify that although the sidelobes are much less evident in the Hamming-windowed periodogram, the two main peaks are wider. In fact, the 3 dB width of the mainlobe corresponding to a Hamming window is approximately twice that of a rectangular window. Hence, for a fixed data length, the PSD resolution attainable with a Hamming window is approximately half that attainable with a rectangular window. The competing interests of mainlobe width and sidelobe height can be resolved to some extent by using variable windows such as the Kaiser window.
Nonrectangular windowing affects the average power of a signal because some of the time samples are attenuated when multiplied by the window. To compensate for this, periodogram and pwelch normalize the window to have an average power of unity. This ensures that the measured average power is generally independent of window choice. If the frequency components are not well resolved by the PSD estimators, the window choice does affect the average power.
The modified periodogram estimate of the PSD is
$${\widehat{P}}_{xx}(f)=\frac{|X(f){|}^{2}}{FsLU},$$
where U is the window normalization constant:
$$U=\frac{1}{L}{\displaystyle \sum _{n=0}^{N-1}|}w(n){|}^{2}.$$
For large values of L, U tends to become independent of window length. The addition of U as a normalization constant ensures that the modified periodogram is asymptotically unbiased.
An improved estimator of the PSD is the one proposed by Welch [8]. The method consists of dividing the time series data into (possibly overlapping) segments, computing a modified periodogram of each segment, and then averaging the PSD estimates. The result is Welch's PSD estimate. The toolbox function pwelch implements Welch's method.
The averaging of modified periodograms tends to decrease the variance of the estimate relative to a single periodogram estimate of the entire data record. Although overlap between segments introduces redundant information, this effect is diminished by the use of a nonrectangular window, which reduces the importance or weight given to the end samples of segments (the samples that overlap).
However, as mentioned above, the combined use of short data records and nonrectangular windows results in reduced resolution of the estimator. In summary, there is a trade-off between variance reduction and resolution. One can manipulate the parameters in Welch's method to obtain improved estimates relative to the periodogram, especially when the SNR is low. This is illustrated in the following example.
Consider an original signal consisting of 1000 samples:
fs = 1000; % Sampling frequency t = (0:1*fs)/fs; % 301 samples A = [2 8]; % Sinusoid amplitudes (row vector) f = [150;140]; % Sinusoid frequencies (column vector) xn = A*sin(2*pi*f*t) + 5*randn(size(t)); [Pxx,F] = periodogram(xn,rectwin(length(xn)),length(xn),fs); plot(F,10*log10(Pxx))
We can obtain Welch's spectral estimate for 3 segments with 50% overlap using a Hamming window.
[Pxx,F] = pwelch(xn,hamming(150),75,150,fs); plot(F,10*log10(Pxx)) xlabel('Hz') ylabel('dB') title('Welch''s Overlapped Segment Averaging PSD Estimate')
In the periodogram above, noise and the leakage make one of the sinusoids essentially indistinguishable from the artificial peaks. In contrast, although the PSD produced by Welch's method has wider peaks, you can still distinguish the two sinusoids, which stand out from the "noise floor."
However, if we try to reduce the variance further, the loss of resolution causes one of the sinusoids to be lost altogether:
[Pxx,F] = pwelch(xn,rectwin(100),75,512,Fs); plot(F,10*log10(Pxx))
For a more detailed discussion of Welch's method of PSD estimation, see Kay [2] and Welch [8].
Welch's method yields a biased estimator of the PSD. The expected value of the PSD estimate is:
$$E\{{P}_{\text{Welch}}(f)\}=\frac{1}{FsLU}{\displaystyle {\int}_{-Fs/2}^{Fs/2}|}W(f-{f}^{\prime}){|}^{2}{P}_{xx}({f}^{\prime})d{f}^{\prime},$$
where L is the length of the data segments, U is the same normalization constant present in the definition of the modified periodogram, and W(f) is the Fourier transform of the window function. As is the case for all periodograms, Welch's estimator is asymptotically unbiased. For a fixed length data record, the bias of Welch's estimate is larger than that of the periodogram because the length of the segments is less than the length of the entire data sample.
The variance of Welch's estimator is difficult to compute because it depends on both the window used and the amount of overlap between segments. Basically, the variance is inversely proportional to the number of segments whose modified periodograms are being averaged.
The periodogram can be interpreted as filtering a length L signal, x_{L}(n), through a filter bank (a set of filters in parallel) of L FIR bandpass filters. The 3 dB bandwidth of each of these bandpass filters can be shown to be approximately equal to f_{s} / L. The magnitude response of each one of these bandpass filters resembles that of the rectangular window discussed in Spectral Leakage. The periodogram can thus be viewed as a computation of the power of each filtered signal (i.e., the output of each bandpass filter) that uses just one sample of each filtered signal and assumes that the PSD of x_{L}(n) is constant over the bandwidth of each bandpass filter.
As the length of the signal increases, the bandwidth of each bandpass filter decreases, making it a more selective filter, and improving the approximation of constant PSD over the bandwidth of the filter. This provides another interpretation of why the PSD estimate of the periodogram improves as the length of the signal increases. However, there are two factors apparent from this standpoint that compromise the accuracy of the periodogram estimate. First, the rectangular window yields a poor bandpass filter. Second, the computation of the power at the output of each bandpass filter relies on a single sample of the output signal, producing a very crude approximation.
Welch's method can be given a similar interpretation in terms of a filter bank. In Welch's implementation, several samples are used to compute the output power, resulting in reduced variance of the estimate. On the other hand, the bandwidth of each bandpass filter is larger than that corresponding to the periodogram method, which results in a loss of resolution. The filter bank model thus provides a new interpretation of the compromise between variance and resolution.
Thompson's multitaper method (MTM) builds on these results to provide an improved PSD estimate. Instead of using bandpass filters that are essentially rectangular windows (as in the periodogram method), the MTM method uses a bank of optimal bandpass filters to compute the estimate. These optimal FIR filters are derived from a set of sequences known as discrete prolate spheroidal sequences (DPSSs, also known as Slepian sequences).
In addition, the MTM method provides a time-bandwidth parameter with which to balance the variance and resolution. This parameter is given by the time-bandwidth product, NWand it is directly related to the number of tapers used to compute the spectrum. There are always 2NW – 1 tapers used to form the estimate. This means that, as NW increases, there are more estimates of the power spectrum, and the variance of the estimate decreases. However, the bandwidth of each taper is also proportional to NW, so as NW increases, each estimate exhibits more spectral leakage (i.e., wider peaks) and the overall spectral estimate is more biased. For each data set, there is usually a value for NW that allows an optimal trade-off between bias and variance.
The Signal Processing Toolbox™ function that implements the MTM method is pmtm. Use pmtm to compute the PSD of xn from the previous examples:
fs = 1000; % Sampling frequency t = (0:fs)/fs; % One second worth of samples A = [1 2]; % Sinusoid amplitudes f = [150;140]; % Sinusoid frequencies xn = A*sin(2*pi*f*t) + 0.1*randn(size(t)); [Pxx1,F1] = pmtm(xn,4,fs); plot(F1,10*log10(Pxx1))
By lowering the time-bandwidth product, you can increase the resolution at the expense of larger variance:
[Pxx2,F2] = pmtm(xn,1.5,fs); plot(F2,10*log10(Pxx2))
This method is more computationally expensive than Welch's method due to the cost of computing the discrete prolate spheroidal sequences. For long data series (10,000 points or more), it is useful to compute the DPSSs once and save them in a MAT-file. dpsssave, dpssload, dpssdir, and dpssclear are provided to keep a database of saved DPSSs in the MAT-file dpss.mat.
The PSD is a special case of the cross spectral density (CPSD) function, defined between two signals x(n) and y(n) as
$${P}_{xy}(\omega )=\frac{1}{2\pi}{\displaystyle \sum _{m=-\infty}^{\infty}{R}_{xy}(m){e}^{-j\omega m}}.$$
As is the case for the correlation and covariance sequences, the toolbox estimates the PSD and CPSD because signal lengths are finite.
To estimate the cross-spectral density of two equal length signals x and y using Welch's method, the cpsd function forms the periodogram as the product of the FFT of x and the conjugate of the FFT of y. Unlike the real-valued PSD, the CPSD is a complex function. cpsd handles the sectioning and windowing of x and y in the same way as the pwelch function:
Sxy = cpsd(x,y,nwin,noverlap,nfft,fs)
One application of Welch's method is nonparametric system identification. Assume that H is a linear, time invariant system, and x(n) and y(n) are the input to and output of H, respectively. Then the power spectrum of x(n) is related to the CPSD of x(n) and y(n) by
$${P}_{yx}(\omega )=H(\omega ){P}_{xx}(\omega ).$$
An estimate of the transfer function between x(n) and y(n) is
$$\stackrel{\u2322}{H}(\omega )=\frac{{\stackrel{\u2322}{P}}_{yx}(\omega )}{{\stackrel{\u2322}{P}}_{xx}(\omega )}.$$
This method estimates both magnitude and phase information. The tfestimate function uses Welch's method to compute the CPSD and power spectrum, and then forms their quotient for the transfer function estimate. Use tfestimate the same way that you use the cpsd function.
Filter the signal xn with an FIR filter, then plot the actual magnitude response and the estimated response:
h = ones(1,10)/10; % Moving-average filter yn = filter(h,1,xn); [HEST,f] = tfestimate(xn,yn,256,128,256,fs); H = freqz(h,1,f,fs); subplot(2,1,1) plot(f,abs(H)) title('Actual Transfer Function Magnitude') subplot(2,1,2) plot(f,abs(HEST)) title('Transfer Function Magnitude Estimate') xlabel('Frequency (Hz)')
The magnitude-squared coherence between two signals x(n) and y(n) is
$${C}_{xy}(\omega )=\frac{{\left|{P}_{xy}(\omega )\right|}^{2}}{{P}_{xx}(\omega ){P}_{yy}(\omega )}.$$
This quotient is a real number between 0 and 1 that measures the correlation between x(n) and y(n) at the frequency ω.
The mscohere function takes sequences x and y, computes their power spectra and CPSD, and returns the quotient of the magnitude squared of the CPSD and the product of the power spectra. Its options and operation are similar to the cpsd and tfestimate functions.
The coherence function of xn and the filter output yn versus frequency is
mscohere(xn,yn,256,128,256,fs)
If the input sequence length nfft, window length window, and the number of overlapping data points in a window, numoverlap, are such that mscohere operates on only a single record, the function returns all ones. This is because the coherence function for linearly dependent data is one.
Parametric methods can yield higher resolutions than nonparametric methods in cases when the signal length is short. These methods use a different approach to spectral estimation; instead of trying to estimate the PSD directly from the data, they model the data as the output of a linear system driven by white noise, and then attempt to estimate the parameters of that linear system.
The most commonly used linear system model is the all-pole model, a filter with all of its zeroes at the origin in the z-plane. The output of such a filter for white noise input is an autoregressive (AR) process. For this reason, these methods are sometimes referred to as AR methods of spectral estimation.
The AR methods tend to adequately describe spectra of data that is "peaky," that is, data whose PSD is large at certain frequencies. The data in many practical applications (such as speech) tends to have "peaky spectra" so that AR models are often useful. In addition, the AR models lead to a system of linear equations which is relatively simple to solve.
Signal Processing Toolbox AR methods for spectral estimation include:
All AR methods yield a PSD estimate given by
$$\widehat{P}(f)=\frac{1}{Fs}\frac{{\epsilon}_{p}}{{\left|1-{\displaystyle \sum _{k=1}^{p}{\widehat{a}}_{p}}(k){e}^{-j2\pi kf/Fs}\right|}^{2}}.$$
The different AR methods estimate the parameters slightly differently, yielding different PSD estimates. The following table provides a summary of the different AR methods.
AR Methods
Characteristics | Does not apply window to data | Does not apply window to data | Does not apply window to data | Applies window to data |
Minimizes the forward and backward prediction errors in the least squares sense, with the AR coefficients constrained to satisfy the L-D recursion | Minimizes the forward prediction error in the least squares sense | Minimizes the forward and backward prediction errors in the least squares sense | Minimizes the forward prediction error in the least squares sense (also called "Autocorrelation method") | |
Advantages | High resolution for short data records | Better resolution than Y-W for short data records (more accurate estimates) | High resolution for short data records | Performs as well as other methods for large data records |
Always produces a stable model | Able to extract frequencies from data consisting of p or more pure sinusoids | Able to extract frequencies from data consisting of p or more pure sinusoids | Always produces a stable model | |
Does not suffer spectral line-splitting | ||||
Disadvantages | Peak locations highly dependent on initial phase | May produce unstable models | May produce unstable models | Performs relatively poorly for short data records |
May suffer spectral line-splitting for sinusoids in noise, or when order is very large | Frequency bias for estimates of sinusoids in noise | Peak locations slightly dependent on initial phase | Frequency bias for estimates of sinusoids in noise | |
Frequency bias for estimates of sinusoids in noise | Minor frequency bias for estimates of sinusoids in noise | |||
Conditions for Nonsingularity | Order must be less than or equal to half the input frame size | Order must be less than or equal to 2/3 the input frame size | Because of the biased estimate, the autocorrelation matrix is guaranteed to positive-definite, hence nonsingular |
The Yule-Walker AR method of spectral estimation computes the AR parameters by solving the following linear system, which give the Yule-Walker equations in matrix form:
$$\left(\begin{array}{cccc}r(0)& {r}^{*}(1)& \dots & {r}^{*}(p-1)\\ r(1)& r(0)& \dots & {r}^{*}(p-2)\\ r(p-1)& r(p-2)& \dots & r(0)\end{array}\right)\left(\begin{array}{c}a(1)\\ a(2)\\ \dots \\ a(p)\end{array}\right)=\left(\begin{array}{c}r(1)\\ r(2)\\ \dots \\ r(p)\end{array}\right).$$
In practice, the biased estimate of the autocorrelation is used for the unknown true autocorrelation.The Yule-Walker AR method produces the same results as a maximum entropy estimator. For more information, see page 155 of item [2] in the Selected Bibliography.
The use of a biased estimate of the autocorrelation function ensures that the autocorrelation matrix above is positive definite. Hence, the matrix is invertible and a solution is guaranteed to exist. Moreover, the AR parameters thus computed always result in a stable all-pole model. The Yule-Walker equations can be solved efficiently via Levinson's algorithm, which takes advantage of the Hermitian Toeplitz structure of the autocorrelation matrix.
The toolbox function pyulear implements the Yule-Walker AR method. For example, compare the spectrum of a speech signal using Welch's method and the Yule-Walker AR method:
load mtlb
[Pxx,F] = pwelch(mtlb,hamming(256),128,1024,Fs);
plot(F,10*log10(Pxx))
order = 14; [Pxx,F] = pyulear(mtlb,order,1024,fs); plot(F,10*log10(Pxx))
The Yule-Walker AR spectrum is smoother than the periodogram because of the simple underlying all-pole model.
The Burg method for AR spectral estimation is based on minimizing the forward and backward prediction errors while satisfying the Levinson-Durbin recursion (see Marple [3], Chapter 7, and Proakis [6], Section 12.3.3). In contrast to other AR estimation methods, the Burg method avoids calculating the autocorrelation function, and instead estimates the reflection coefficients directly.
The primary advantages of the Burg method are resolving closely spaced sinusoids in signals with low noise levels, and estimating short data records, in which case the AR power spectral density estimates are very close to the true values. In addition, the Burg method ensures a stable AR model and is computationally efficient.
The accuracy of the Burg method is lower for high-order models, long data records, and high signal-to-noise ratios (which can cause line splitting, or the generation of extraneous peaks in the spectrum estimate). The spectral density estimate computed by the Burg method is also susceptible to frequency shifts (relative to the true frequency) resulting from the initial phase of noisy sinusoidal signals. This effect is magnified when analyzing short data sequences.
The toolbox function pburg implements the Burg method. Compare the spectrum of the speech signal generated by both the Burg method and the Yule-Walker AR method. They are very similar for large signal lengths:
load mtlb
order = 14;
[Pburg,F] = pburg(mtlb(1:512),order,1024,fs);
plot(F,10*log10(Pburg))
[Pxx,F] = pyulear(mtlb(1:512),ORDER,1024,fs); plot(F,10*log10(Pxx))
Compare the spectrum of a noisy signal computed using the Burg method and the Welch method:
fs = 1000; % Sampling frequency t = (0:fs)/fs; % One second worth of samples A = [1 2]; % Sinusoid amplitudes f = [150;140]; % Sinusoid frequencies xn = A*sin(2*pi*f*t) + 0.1*randn(size(t)); pwelch(xn,hamming(256),128,1024,fs)
pburg(xn,14,1024,fs)
Note that, as the model order for the Burg method is reduced, a frequency shift due to the initial phase of the sinusoids will become apparent.
The covariance method for AR spectral estimation is based on minimizing the forward prediction error. The modified covariance method is based on minimizing the forward and backward prediction errors. The toolbox functions pcov and pmcov implement the respective methods.
Compare the spectrum of the speech signal generated by both the covariance method and the modified covariance method. They are nearly identical, even for a short signal length:
load mtlb
pcov(mtlb(1:64),14,1024,fs)
pmcov(mtlb(1:64),14,1024,fs)
The pmusic function and peig functions provide two related spectral analysis methods:
pmusic provides the multiple signal classification (MUSIC) method developed by Schmidt
peig provides the eigenvector (EV) method developed by Johnson
Both of these methods are frequency estimator techniques based on eigenanalysis of the autocorrelation matrix. This type of spectral analysis categorizes the information in a correlation or data matrix, assigning information to either a signal subspace or a noise subspace.
Consider a number of complex sinusoids embedded in white noise. You can write the autocorrelation matrix R for this system as the sum of the signal autocorrelation matrix (S) and the noise autocorrelation matrix (W):
R = S + W. There is a close relationship between the eigenvectors of the signal autocorrelation matrix and the signal and noise subspaces. The eigenvectors v of S span the same signal subspace as the signal vectors. If the system contains M complex sinusoids and the order of the autocorrelation matrix is p, eigenvectors v_{M}_{+1} through v_{p}_{+1} span the noise subspace of the autocorrelation matrix.
Frequency Estimator Functions. To generate their frequency estimates, eigenanalysis methods calculate functions of the vectors in the signal and noise subspaces. Both the MUSIC and EV techniques choose a function that goes to infinity (denominator goes to zero) at one of the sinusoidal frequencies in the input signal. Using digital technology, the resulting estimate has sharp peaks at the frequencies of interest; this means that there might not be infinity values in the vectors.
The MUSIC estimate is given by the formula
$${\widehat{P}}_{\text{MUSIC}}(f)=\frac{1}{{\displaystyle \sum _{k=p+1}^{M}{\left|e{(f)}^{H}{v}_{k}\right|}^{2}}},$$
where the v_{k} are the eigenvectors of the noise subspace and e(f) is a vector of complex sinusoids:
$$e(f)={[\begin{array}{lllll}1\hfill & {e}^{j2\pi f}\hfill & {e}^{j4\pi f}\hfill & \dots \hfill & {e}^{j2(M-1)\pi}\hfill \end{array}]}^{T}.$$
Here v represents the eigenvectors of the input signal's correlation matrix; v_{k } is the kth eigenvector. H is the conjugate transpose operator. The eigenvectors used in the sum correspond to the smallest eigenvalues and span the noise subspace (p is the size of the signal subspace).
The expression e(f)^{H}v_{k} is equivalent to a Fourier transform (the vector e(f) consists of complex exponentials). This form is useful for numeric computation because the FFT can be computed for each v_{k} and then the squared magnitudes can be summed.
The EV method weights the summation by the eigenvalues of the correlation matrix:
$${\widehat{P}}_{\text{EV}}(f)=\frac{{\lambda}_{k}}{{\displaystyle \sum _{k=p+1}^{M}{\left|{e}^{H}(f){v}_{k}\right|}^{2}}}.$$
The pmusic and peig functions in this toolbox interpret their first input either as a signal matrix or as a correlation matrix (if the 'corr' input flag is set). In the former case, the singular value decomposition of the signal matrix is used to determine the signal and noise subspaces. In the latter case, the eigenvalue decomposition of the correlation matrix is used to determine the signal and noise subspaces.