Continuous and Discrete Wavelet Transforms

This topic describes the major differences between the continuous wavelet transform (CWT) and the discrete wavelet transform (DWT), in both its decimated and nondecimated versions. The cwt function is a discretized version of the CWT so that it can be implemented in a computational environment. This discussion focuses on the 1-D case, but is applicable to higher dimensions.

The continuous wavelet transform compares a signal with shifted and scaled (stretched or shrunk) copies of a basic wavelet. If $ψ (t)$ is a wavelet centered at t = 0 with time support on [-T/2, T/2], then $\frac{1}{s} ψ (\frac{t - u}{s})$ is centered at t = u with time support [-sT/2+u, sT/2+u]. The cwt function uses L1 normalization so that all frequency amplitudes are normalized to the same value. If 0 < s < 1, the wavelet is contracted (shrunk), and if s > 1, the wavelet is stretched. The mathematical term for this is dilation. See Continuous Wavelet Transform and Scale-Based Analysis for examples of how this operation extracts features in the signal by matching it against dilated and translated wavelets.
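The dilation and translation described above can be checked numerically. The following is a minimal sketch, not the cwt implementation: it assumes a hypothetical Haar-like wavelet with T = 1, and the function names are made up for illustration. It computes the support of the scaled, shifted copy and estimates its L1 norm, which the 1/s factor keeps the same at every scale.

```python
def psi(t):
    """Hypothetical Haar-like mother wavelet with support [-1/2, 1/2), so T = 1."""
    if -0.5 <= t < 0.0:
        return 1.0
    if 0.0 <= t < 0.5:
        return -1.0
    return 0.0


def scaled_wavelet(t, s, u):
    """L1-normalized copy (1/s) * psi((t - u) / s)."""
    return psi((t - u) / s) / s


def support(T, s, u):
    """Time support [-sT/2 + u, sT/2 + u] of the scaled, shifted wavelet."""
    return (u - s * T / 2.0, u + s * T / 2.0)


def l1_norm(s, u, lo=-20.0, hi=20.0, n=80_000):
    """Midpoint-rule estimate of the integral of |scaled_wavelet| over [lo, hi]."""
    dt = (hi - lo) / n
    return dt * sum(abs(scaled_wavelet(lo + (k + 0.5) * dt, s, u)) for k in range(n))


# Contracted (s < 1), unscaled (s = 1), and stretched (s > 1) copies centered at u = 3.
for s in (0.5, 1.0, 4.0):
    print(s, support(1.0, s, 3.0), round(l1_norm(s, 3.0), 3))
```

The support widens in proportion to s while the estimated L1 norm stays close to 1 for each scale.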

The major difference between the CWT and discrete wavelet transforms, such as the dwt and modwt, is how the scale parameter is discretized. The CWT discretizes scale more finely than the discrete wavelet transform. In the CWT, you typically fix some base which is a fractional power of two, for example, $2^{1 / v}$ where v is an integer greater than 1. The v parameter is often referred to as the number of “voices per octave”. Different scales are obtained by raising this base scale to positive integer powers, for example $2^{j / v}$, $j = 1, 2, 3, \dots$. The translation parameter in the CWT is discretized to integer values, denoted here by m. The resulting discretized wavelets for the CWT are

$\frac{1}{2^{j / v}} ψ (\frac{n - m}{2^{j / v}}) .$

The reason v is referred to as the number of voices per octave is that increasing the scale by an octave (a doubling) requires v intermediate scales. Take for example $2^{v / v} = 2$ and then increase the numerator in the exponent until you reach 4, the next octave. You move from $2^{v / v} = 2$ to $2^{2 v / v} = 4$. There are v intermediate steps. Common values for v are 10, 12, 14, 16, and 32. The larger the value of v, the finer the discretization of the scale parameter, s. However, this also increases the amount of computation required because the CWT must be computed for every scale. The difference between scales on a log₂ scale is 1/v. See CWT-Based Time-Frequency Analysis and Continuous Wavelet Analysis of Modulated Signals for examples of scale vectors with the CWT.
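To make the voices-per-octave construction concrete, here is a short sketch that builds a scale vector of the form $2^{j / v}$. The function name cwt_scales and its arguments are hypothetical, and the sketch does not reproduce how the cwt function chooses its minimum and maximum scales.

```python
import math


def cwt_scales(voices_per_octave, num_octaves):
    """Scales 2**(j / v) for j = 1, ..., v * num_octaves (unit sampling interval)."""
    v = voices_per_octave
    return [2.0 ** (j / v) for j in range(1, v * num_octaves + 1)]


scales = cwt_scales(12, 4)

# Spacing between consecutive scales on a log2 axis; each step should equal 1/v.
log2_steps = [math.log2(b) - math.log2(a) for a, b in zip(scales, scales[1:])]

print(len(scales))             # number of scales: v per octave times the number of octaves
print(scales[11], scales[-1])  # scale after one full octave and after four
print(min(log2_steps), max(log2_steps))
```

With v = 12 and four octaves, the vector has 48 entries, and every v-th entry doubles the scale.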

In the discrete wavelet transform, the scale parameter is always discretized to integer powers of 2, $2^{j}$, $j = 1, 2, 3, \dots$, so that the number of voices per octave is always 1. The difference between scales on a log₂ scale is always 1 for discrete wavelet transforms. Note that this is a much coarser sampling of the scale parameter, s, than is the case with the CWT. Further, in the decimated (downsampled) discrete wavelet transform (DWT), the translation parameter is always proportional to the scale. This means that at scale $2^{j}$, you always translate by $2^{j} m$, where m is a nonnegative integer. In nondecimated discrete wavelet transforms like modwt and swt, the scale parameter is restricted to powers of two, but the translation parameter is an integer, as in the CWT. The discretized wavelet for the DWT takes the following form

$\frac{1}{\sqrt{2^{j}}} ψ (\frac{1}{2^{j}} (n - 2^{j} m)) .$

The discretized wavelet for the nondecimated discrete wavelet transform, such as the MODWT, is

$\frac{1}{\sqrt{2^{j}}} ψ (\frac{n - m}{2^{j}}) .$
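The two translation grids can be compared directly. The sketch below is an idealization under stated assumptions: it lists the shift positions implied by the formulas above for a signal of num_samples points and ignores filter length and boundary extension, so the counts need not match what dwt or modwt return for a particular extension mode. The function names are hypothetical.

```python
def dwt_shifts(num_samples, level):
    """Decimated DWT: the wavelet at scale 2**level moves in steps of 2**level."""
    step = 2 ** level
    return list(range(0, num_samples, step))


def modwt_shifts(num_samples, level):
    """Nondecimated transform: integer shifts at every level.

    The level argument is accepted only for symmetry with dwt_shifts;
    the shift grid does not depend on it.
    """
    return list(range(num_samples))


n = 16
for level in (1, 2, 3):
    print(level, dwt_shifts(n, level), len(modwt_shifts(n, level)))
```

Each additional level halves the number of DWT positions, while the nondecimated grid keeps one position per sample.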

To summarize:

The CWT and the discrete wavelet transforms differ in how they discretize the scale parameter. The CWT typically uses exponential scales with a base smaller than 2, for example $2^{1 / 12}$. The discrete wavelet transform always uses exponential scales with the base equal to 2, so its scales are powers of 2. Keep in mind that the physical interpretation of scales for both the CWT and discrete wavelet transforms requires the inclusion of the signal’s sampling interval if it is not equal to one. For example, assume you are using the CWT and you set your base to $s_{0} = 2^{1 / 12}$. To attach physical significance to that scale, you must multiply by the sampling interval $Δ t$, so a scale vector covering approximately four octaves with the sampling interval taken into account is $s_{0}^{j} Δ t$, $j = 1, 2, \dots, 48$. Note that the sampling interval multiplies the scales; it is not in the exponent. For discrete wavelet transforms the base scale is always 2.
The decimated and nondecimated discrete wavelet transforms differ in how they discretize the translation parameter. The decimated discrete wavelet transform (DWT) always translates by an integer multiple of the scale, $2^{j} m$. The nondecimated discrete wavelet transform translates by integer shifts.
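As a worked illustration of the sampling-interval point, assume a hypothetical sampling interval of $Δ t = 0.001$ seconds (a value chosen here only for the arithmetic, not taken from any example). With $s_{0} = 2^{1 / 12}$, the scales at the end of the first, second, and fourth octaves are

$s_{0}^{12} Δ t = 2 \times 0.001 = 0.002, \quad s_{0}^{24} Δ t = 4 \times 0.001 = 0.004, \quad s_{0}^{48} Δ t = 16 \times 0.001 = 0.016 .$

The smallest scale in this vector is $s_{0}^{1} Δ t \approx 0.00106$ seconds, so under this assumption the 48 scales run from roughly 0.00106 to 0.016 seconds.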

These differences in how scale and translation are discretized result in advantages and disadvantages for the two classes of wavelet transforms. These differences also determine use cases where one wavelet transform is likely to provide superior results. Some important consequences of the discretization of the scale and translation parameter are:

The DWT provides a sparse representation for many natural signals. In other words, the important features of many natural signals are captured by a subset of DWT coefficients that is typically much smaller than the original signal. This “compresses” the signal. With the DWT, you always end up with the same number of coefficients as the original signal, but many of the coefficients may be close to zero in value. As a result, you can often throw away those coefficients and still maintain a high-quality signal approximation. With the CWT, you go from N samples for an N-length signal to an M-by-N matrix of coefficients, with M equal to the number of scales. The CWT is a highly redundant transform. There is significant overlap between wavelets at each scale and between scales. The computational resources required to compute the CWT and store the coefficients are much larger than for the DWT. The nondecimated discrete wavelet transform is also redundant, but the redundancy factor is usually significantly less than for the CWT, because the scale parameter is not discretized so finely. For the nondecimated discrete wavelet transform, you go from N samples to an (L+1)-by-N matrix of coefficients, where L is the level of the transform.
The strict discretization of scale and translation in the DWT ensures that the DWT is an orthonormal transform (when using an orthogonal wavelet). There are many benefits of orthonormal transforms in signal analysis. Many signal models consist of some deterministic signal plus white Gaussian noise. An orthonormal transform takes this kind of signal and outputs the transform applied to the signal plus white noise. In other words, an orthonormal transform takes in white Gaussian noise and outputs white Gaussian noise. The noise is uncorrelated at the input and output. This is important in many statistical signal processing settings. In the case of the DWT, the signal of interest is typically captured by a few large-magnitude DWT coefficients, while the noise results in many small DWT coefficients that you can throw away. If you have studied linear algebra, you have no doubt learned many advantages to using orthonormal bases in the analysis and representation of vectors. The wavelets in the DWT are like orthonormal vectors. Neither the CWT nor the nondecimated discrete wavelet transform is an orthonormal transform. The wavelets in the CWT and nondecimated discrete wavelet transform are technically called frames; they are linearly dependent sets.
The DWT is not shift-invariant. Because the DWT downsamples, a shift in the input signal does not manifest itself as a simple equivalent shift in the DWT coefficients at all levels. A simple shift in a signal can cause a significant realignment of signal energy in the DWT coefficients by scale. The CWT and nondecimated discrete wavelet transform are shift-invariant. There are some modifications of the DWT, such as the dual-tree complex discrete wavelet transform, that mitigate the lack of shift invariance in the DWT. See Critically Sampled and Oversampled Wavelet Filter Banks for some conceptual material on this topic and Dual-Tree Complex Wavelet Transforms for an example.
The discrete wavelet transforms are equivalent to discrete filter banks. Specifically, they are tree-structured discrete filter banks where the signal is first filtered by a lowpass and a highpass filter to yield lowpass and highpass subbands. Subsequently, the lowpass subband is iteratively filtered by the same scheme to yield narrower octave-band lowpass and highpass subbands. In the DWT, the filter outputs are downsampled at each successive stage. In the nondecimated discrete wavelet transform, the outputs are not downsampled. The filters that define the discrete wavelet transforms typically have only a small number of coefficients, so the transform can be implemented very efficiently. For both the DWT and nondecimated discrete wavelet transform, you do not actually require an expression for the wavelet. The filters are sufficient. This is not the case with the CWT. The most common implementation of the CWT requires that the wavelet be explicitly defined. Even though the nondecimated discrete wavelet transform does not downsample the signal, the filter bank implementation still allows for good computational performance, but not as good as the DWT.
The discrete wavelet transforms provide perfect reconstruction of the signal upon inversion. This means that you can take the discrete wavelet transform of a signal and then use the coefficients to synthesize an exact reproduction of the signal to within numerical precision. You can implement an inverse CWT, but it is often the case that the reconstruction is not perfect. Reconstructing a signal from the CWT coefficients is a much less stable numerical operation.
The finer sampling of scales in the CWT typically results in a higher-fidelity signal analysis. You can localize transients in your signal, or characterize oscillatory behavior better with the CWT than with the discrete wavelet transforms.
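Several of the DWT properties listed above can be seen in a one-level Haar filter bank. The sketch below is a hand-written illustration in plain Python, not the dwt function: the test signal is made up, the shift is circular, and Haar is chosen only because its two-tap filters keep the arithmetic easy to follow. It checks that the transform preserves energy (orthonormality), that the inverse reproduces the signal, and that a one-sample shift moves energy from the lowpass subband into the highpass subband.

```python
import math

SQRT2 = math.sqrt(2.0)


def haar_dwt(x):
    """One level of the decimated Haar DWT for an even-length sequence."""
    half = len(x) // 2
    approx = [(x[2 * k] + x[2 * k + 1]) / SQRT2 for k in range(half)]  # lowpass, downsampled
    detail = [(x[2 * k] - x[2 * k + 1]) / SQRT2 for k in range(half)]  # highpass, downsampled
    return approx, detail


def haar_idwt(approx, detail):
    """Inverse of haar_dwt: interleave the reconstructed sample pairs."""
    x = []
    for a, d in zip(approx, detail):
        x.append((a + d) / SQRT2)
        x.append((a - d) / SQRT2)
    return x


def energy(values):
    """Sum of squared values."""
    return sum(v * v for v in values)


# Hypothetical test signal: a two-sample pulse aligned with the downsampling grid.
x = [0.0, 0.0, 0.0, 0.0, 1.0, 1.0, 0.0, 0.0]
approx, detail = haar_dwt(x)
reconstructed = haar_idwt(approx, detail)

# Same pulse delayed by one sample (circular shift).
shifted = x[-1:] + x[:-1]
approx_s, detail_s = haar_dwt(shifted)

print("coefficients:", len(approx) + len(detail), "samples:", len(x))
print("energy in/out:", energy(x), energy(approx) + energy(detail))
print("max reconstruction error:", max(abs(p - q) for p, q in zip(reconstructed, x)))
print("detail energy before and after shift:", energy(detail), energy(detail_s))
```

For the aligned pulse, all of the energy sits in the approximation coefficients. After the one-sample shift, half of it appears in the detail coefficients, even though the signal itself has only moved.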


Guidelines for Continuous Wavelet Transform vs. Discrete Wavelet Transform

Based on the previous section, here are some basic guidelines for deciding on whether to use a discrete or continuous wavelet transform.

If your application is to obtain the sparsest possible signal representation for compression, denoising, or signal transmission, use the DWT with wavedec.
If your application requires an orthonormal transform, use the DWT with one of the orthogonal wavelet filters. The orthogonal families in the Wavelet Toolbox™ are designated as type 1 wavelets in the wavelet manager, wavemngr. Valid built-in orthogonal wavelet families are: Best-localized Daubechies ("bl"), Beylkin ("beyl"), Coiflets ("coif"), Daubechies ("db"), Fejér-Korovkin ("fk"), Haar ("haar"), Han linear-phase moments ("han"), Morris minimum-bandwidth ("mb"), Symlets ("sym"), and Vaidyanathan ("vaid"). For a list of wavelets in each family, see wfilters. For additional information, see Choose a Wavelet and waveinfo.
If your application requires a shift-invariant transform but you still need perfect reconstruction and some measure of computational efficiency, try a nondecimated discrete wavelet transform like modwt or a dual-tree transform like dualtree.
If your primary goal is a detailed time-frequency (scale) analysis or precise localization of signal transients, use cwt. For an example of time-frequency analysis with the CWT, see CWT-Based Time-Frequency Analysis.
For denoising a signal by thresholding wavelet coefficients, use the wdenoise function or the Wavelet Signal Denoiser app. wdenoise and Wavelet Signal Denoiser provide default settings that can be applied to your data, as well as a simple interface to a variety of denoising methods. With the app, you can visualize and denoise signals, and compare results. For examples of denoising a signal, see Denoise A Signal Using Default Values and Denoise a Signal with the Wavelet Signal Denoiser. For denoising images, use wdenoise2. For an example, see Denoising Signals and Images.
If your application requires that you have a solid understanding of the statistical properties of the wavelet coefficients, use a discrete wavelet transform. There is active work in understanding the statistical properties of the CWT, but currently there are many more distributional results for the discrete wavelet transforms. The success of the DWT in denoising is largely due to our understanding of its statistical properties. For an example of estimation and hypothesis testing using a nondecimated discrete wavelet transform, see Wavelet Analysis of Financial Data.
