## Joint Time-Frequency Scattering

A joint time-frequency scattering (JTFS) network enables you to extract features from a signal that are not only invariant to shifts or deformations in time, but also to shifts or deformations in frequency. This time and frequency invariance makes JTFS features robust inputs in AI classification workflows. See, for example, Acoustic Scene Classification with Wavelet Scattering and Musical Instrument Classification with Joint Time-Frequency Scattering.

Andén, Lostanlen, and Mallat developed the JTFS transform as an extension of wavelet time
scattering [1]. As the name suggests,
the wavelet time scattering transform filters data along the *time*
dimension, and then applies pointwise modulus nonlinearities. The JTFS transform
additionally filters the data along the *frequency* dimension, followed
by pointwise modulus nonlinearities.

### Wavelet Time Scattering: A Refresher

A wavelet (time) scattering network enables you to derive low-variance features from time series to use in AI applications. The features are insensitive to translations in time on a scale that you can specify.

In a wavelet scattering network, the input data is first convolved with wavelet filters. Then, a nonlinearity, in this case, pointwise modulus, is applied to the filter bank outputs. The result is the scalogram of the input data. The scalogram is then smoothed (averaged) with a lowpass (scaling) filter. The process repeats with a second wavelet filter bank, and so on. For more information, see Wavelet Scattering. This is a diagram of a wavelet scattering network with two filter banks:

where:

- *x* is the input time series.
- $${\ast}^{T}$$ denotes convolution in time.
- $${\psi}_{\lambda}^{(1)}$$ and $${\psi}_{\mu}^{(2)}$$ are the (time) wavelets in the first- and second-order filter banks, respectively. The subscripts *λ* and *μ* correspond to the wavelet center frequencies.
- $${\phi}_{T}$$ is the scaling function.

By default, `waveletScattering` creates a scattering network with two filter banks.

The first-order scalogram coefficients

$${U}_{1}x(t,\lambda )=\left|x{\ast}^{T}{\psi}_{\lambda}^{(1)}(t)\right|$$

are the magnitude (modulus) of the continuous wavelet transform (CWT) of the input data. The first-order scattering coefficients

$${S}_{1}x(t,\lambda )={U}_{1}x(t,\lambda ){\ast}^{T}{\phi}_{T}(t)$$

are the smoothed scalogram coefficients. The first-order scalogram and scattering coefficients are a subset of the coefficients returned by the `waveletScattering` object function `scatteringTransform`.
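The two first-order formulas can be tried out without the toolbox. The following is a minimal sketch in base MATLAB, assuming a Gaussian bandpass filter in place of the wavelet and a Gaussian lowpass filter in place of the scaling function. The filter shapes, the values of `lambda`, `sigmaF`, and `T`, and all variable names are assumptions made for illustration; they are not the filters that `waveletScattering` constructs.

```matlab
% Toy first-order scattering path for a single center frequency lambda.
% All filters are illustrative Gaussians, NOT the toolbox filters.
N = 1024;                          % signal length in samples
n = 0:N-1;                         % sample index
f = n/N;                           % frequency grid in cycles/sample

% Test signal: a short tone burst centered at sample 512.
x = exp(-(n-512).^2/(2*20^2)).*cos(2*pi*0.1*n);

% Analytic bandpass "wavelet" centered at lambda (assumed values).
lambda = 0.1;
sigmaF = 0.01;
psiHat = 2*exp(-(f-lambda).^2/(2*sigmaF^2));

% Lowpass "scaling" filter with unit gain at DC; T sets the averaging scale.
T = 256;
fsym = min(f,1-f);                 % distance to DC on the circular grid
phiHat = exp(-(fsym*T).^2/2);

% U1 = |x * psi| and S1 = U1 * phi, both computed with circular convolution.
U1 = abs(ifft(fft(x).*psiHat));
S1 = real(ifft(fft(U1).*phiHat));

% Repeat for a copy of the signal delayed by 8 samples.
xs = circshift(x,8,2);
U1s = abs(ifft(fft(xs).*psiHat));
S1s = real(ifft(fft(U1s).*phiHat));

fprintf("Relative change in U1 after the shift: %.3f\n",norm(U1-U1s)/norm(U1));
fprintf("Relative change in S1 after the shift: %.3f\n",norm(S1-S1s)/norm(S1));
```

In this sketch the smoothed coefficients change less under the shift than the scalogram row does, which is the sense in which the lowpass filter trades time resolution for insensitivity to translation. Increasing `T` strengthens the effect.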

### Inspired by Biology: Joint Time-Frequency Scattering

The JTFS transform is inspired by biology. Research into the primary auditory cortex
has demonstrated the existence of spectro-temporal receptive fields (STRFs) at the
cortical level. The STRF of a neuron is a function of time and frequency which describes
its post-stimulus time histogram in response to various stimuli. STRFs at the cortical
level exhibit ripple-like responses around a given point
(*t*,*λ*) in the time-frequency domain. This
ripple-like behavior can be described in terms of a temporal modulation rate,
*μ*, and a frequency modulation rate, *ℓ*. The
temporal modulation rate is in hertz and the frequency modulation rate is in cycles per
octave or *quefrency*. Accordingly, the full cortical model requires
four parameters:
(*t*,*λ*,*μ*,*ℓ*).

The wavelets $${\psi}_{\mu}(t)=\frac{1}{\mu}\psi \left(\frac{t}{\mu}\right)$$ appear in wavelet time scattering as those used to obtain the second-order time scattering coefficients:

$$\begin{array}{ccc}{S}_{2}x(t,\lambda ,\mu )& =& \left|\left|x{\ast}^{T}{\psi}_{\lambda}^{(1)}\right|{\ast}^{T}{\psi}_{\mu}^{(2)}\right|{\ast}^{T}{\phi}_{T}(t)\\ & =& {U}_{2}x(t,\lambda ,\mu ){\ast}^{T}{\phi}_{T}(t).\end{array}$$

These scattering coefficients are the smoothed second-order scalogram coefficients. Both sets of coefficients are outputs of the `waveletScattering` object function `scatteringTransform`.
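To see what the second-order path measures, consider a tone whose amplitude oscillates at a known rate. The sketch below, in base MATLAB, assumes Gaussian bandpass and lowpass filters as stand-ins for the wavelets and the scaling function. The helper `bandpass`, the bandwidths, and the two candidate rates are hypothetical choices for illustration, not toolbox behavior.

```matlab
% Toy second-order scattering path: an amplitude-modulated tone.
% Filters are illustrative Gaussians, NOT the toolbox filters.
N = 1024;
n = 0:N-1;
f = n/N;                                   % cycles/sample

lambda0 = 128/N;                           % carrier frequency
mu0 = 8/N;                                 % amplitude modulation rate
x = (1 + 0.5*cos(2*pi*mu0*n)).*cos(2*pi*lambda0*n);

% Analytic Gaussian bandpass filter with center fc and bandwidth sig.
bandpass = @(fc,sig) 2*exp(-(f-fc).^2/(2*sig^2));
% Gaussian lowpass filter with unit gain at DC.
T = 256;
phiHat = exp(-(min(f,1-f)*T).^2/2);

% First order: the scalogram row at lambda0 follows the envelope of x.
sigma1 = 0.02;
U1 = abs(ifft(fft(x).*bandpass(lambda0,sigma1)));

% Second order: filter U1 with wavelets at two candidate rates mu.
U2match = abs(ifft(fft(U1).*bandpass(mu0,mu0/4)));
U2off = abs(ifft(fft(U1).*bandpass(4*mu0,mu0)));

% Smooth with the lowpass filter to obtain S2 for each path.
S2match = real(ifft(fft(U2match).*phiHat));
S2off = real(ifft(fft(U2off).*phiHat));

fprintf("mean S2 at mu = mu0:   %.4f\n",mean(S2match));
fprintf("mean S2 at mu = 4*mu0: %.4f\n",mean(S2off));
```

The path whose second wavelet is centered on the modulation rate carries most of the energy, while the mismatched path is close to zero. In other words, the index *μ* of a second-order coefficient identifies the rate at which the first-order envelope at *λ* fluctuates.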

Joint time-frequency scattering takes wavelet time scattering one step further and
essentially creates a feature extractor that mimics the full cortical model. The
second-order scattering coefficients depend on *μ*, the center
frequencies of the second-order time wavelets, $${\psi}_{\mu}^{(2)}(t).$$ To account for the time-frequency "ripples", JTFS uses a
two-dimensional separable wavelet:

$${\Psi}_{\mu ,\ell ,s}(t,\lambda )={\psi}_{\mu}^{(2)}(t){\psi}_{\ell ,s}(s\lambda ).$$

The $${\psi}_{\ell ,s}(s\lambda )$$ wavelets are *frequential wavelets*. Recall that
*λ* denotes the center frequencies of the first-order time wavelets, and that
the superscript (2) on $${\psi}_{\mu}^{(2)}$$ marks the wavelet used in second-order time
scattering. The variable *s* is the *spin*,
which takes the values ±1. Because of the spin, the separable wavelet admits both
positive and negative frequency modulation rates, whereas the time
wavelets are typically analytic (they contain only positive frequencies).
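The structure of the separable wavelet can be checked numerically. The following base MATLAB sketch assumes Gabor profiles (Gaussian-windowed complex exponentials) for both the time and the frequential wavelet. The grids, widths, and rates are hypothetical, and the names `PsiPlus` and `PsiMinus` simply denote *s* = +1 and *s* = −1.

```matlab
% Build a toy separable 2-D wavelet Psi(t,lambda) = psiT(t)*psiF(s*lambda).
% Gabor (Gaussian-windowed complex exponential) profiles are an assumption.
t = -128:128;                    % time axis in samples
lam = -32:32;                    % log-frequency axis in filter-index units
mu = 0.05;                       % temporal modulation rate, cycles/sample
ell = 0.10;                      % frequency modulation rate, cycles/index

psiT = @(v) exp(-v.^2/(2*20^2)).*exp(1i*2*pi*mu*v);
psiF = @(v) exp(-v.^2/(2*8^2)).*exp(1i*2*pi*ell*v);

% Rows index lambda, columns index t. The .' is a non-conjugate transpose.
PsiPlus = psiF(+lam).' * psiT(t);    % spin s = +1
PsiMinus = psiF(-lam).' * psiT(t);   % spin s = -1

% Separability: each 2-D wavelet is a rank-one outer product.
svPlus = svd(PsiPlus);
fprintf("Second/first singular value: %.2e\n",svPlus(2)/svPlus(1));

% Spin: changing the sign of s mirrors the wavelet along the lambda axis.
fprintf("Mirror mismatch: %.2e\n",norm(PsiMinus-flipud(PsiPlus),'fro'));
```

Because the two spins are mirror images along the frequency axis, their phase fronts tilt in opposite directions in the time-frequency plane. Which tilt corresponds to rising pitch depends on how the frequency axis is ordered, which this sketch does not model.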

Similar to the case with time scattering, in JTFS, you apply a joint time-frequency lowpass filter $${\phi}_{T,F}$$. With this, the JTFS coefficients are defined as:

$$S(t,\lambda ,\mu ,\ell ,s)=\left|\left|x{\ast}^{T}{\psi}_{\lambda}^{(1)}\right|{\ast}^{T,F}{\Psi}_{\mu ,\ell ,s}\right|{\ast}^{T,F}{\phi}_{T,F}.$$
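The definition above can be exercised on a synthetic example. The sketch below, in base MATLAB, replaces the first-order scalogram with an artificial matrix that contains a single tilted ridge, then applies a 2-D convolution with a separable wavelet, a modulus, and a 2-D lowpass filter. The ridge model, the Gabor wavelets, the Gaussian lowpass filter, and the helper names `gab`, `jtfs`, and `total` are all assumptions for illustration and do not reproduce the `timeFrequencyScattering` filters.

```matlab
% Toy JTFS coefficients from a synthetic "scalogram" containing one ridge.
% Everything here (ridge model, Gabor wavelets, Gaussian lowpass) is assumed.
[tg,lg] = meshgrid(-64:64,-64:64);         % columns: time, rows: log-frequency index
taper = exp(-tg.^2/(2*20^2));              % fades the ridge out smoothly in time
Urise = exp(-(lg-tg).^2/(2*1.5^2)).*taper; % ridge whose index rises with time
Ufall = exp(-(lg+tg).^2/(2*1.5^2)).*taper; % ridge whose index falls with time

% Separable wavelets with rates mu = ell = 0.1 cycles/sample and spin s.
k = -18:18;
gab = @(v,rate) exp(-v.^2/(2*6^2)).*exp(1i*2*pi*rate*v);
mu = 0.1;
ell = 0.1;
PsiPlus = gab(+k,ell).' * gab(k,mu);       % s = +1
PsiMinus = gab(-k,ell).' * gab(k,mu);      % s = -1

% Joint lowpass filter phi_{T,F}: separable Gaussian normalized to unit sum.
g = exp(-k.^2/(2*6^2));
Phi = (g.'*g)/sum(g)^2;

% S = | U ** Psi | ** Phi, with ** denoting 2-D convolution.
jtfs = @(U,Psi) conv2(abs(conv2(U,Psi,'same')),Phi,'same');

% Total response of each spin to each ridge.
total = @(S) sum(S(:));
R = [total(jtfs(Urise,PsiPlus)) total(jtfs(Urise,PsiMinus));
     total(jtfs(Ufall,PsiPlus)) total(jtfs(Ufall,PsiMinus))];
disp(R)   % rows: rising, falling ridge; columns: s = +1, s = -1
```

Each ridge excites essentially one spin and leaves the other near zero, so the spin index separates the two orientations. With the row ordering used in this sketch, the rising ridge responds to *s* = −1. How these signs map to the toolbox labels spin-up and spin-down depends on the ordering of the frequency axis in the filter bank and is not modeled here.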

### Visualize JTFS Separable 2-D Wavelet

Create a JTFS network.

```
jtfn = timeFrequencyScattering;
```

Use the `filterbank` object function to obtain the second-order time wavelet filter bank and its metadata.

```
[~,psi2f,~,timemeta] = filterbank(jtfn);
centerFrequency = timemeta{2}.xi;
minCenterFrequency = min(centerFrequency(end));
whichCF = minCenterFrequency; %#ok<*NASGU>
```

Use the same function to obtain the spin-up and spin-down wavelets and their metadata. The output variable `frequencymeta` contains the metadata for both the spin-up and spin-down wavelets.

```
[psifup,psifdown,~,frequencymeta] = filterbank(jtfn, ...
    FilterBank="frequency");
centerQuefrency = frequencymeta.xi;
minCenterQuefrency = min(centerQuefrency);
whichCQ = minCenterQuefrency;
```

A JTFS separable 2-D wavelet $${\Psi}_{\mu ,\ell ,s}(t,\lambda )$$ is defined as $${\Psi}_{\mu ,\ell ,s}(t,\lambda )={\psi}_{\mu}^{(2)}(t){\psi}_{\ell ,s}(s\lambda )$$, where $${\psi}_{\mu}^{(2)}(t)$$ is a second-order time wavelet with center frequency (time modulation rate) $$\mu $$, and $${\psi}_{\ell ,s}(s\lambda )$$ is a frequential wavelet with center quefrency (frequency modulation rate) $$\ell $$ and spin $$s$$. The frequential wavelet $${\psi}_{\ell ,1}$$ is a spin-up wavelet, and $${\psi}_{\ell ,-1}$$ is a spin-down wavelet.

Select the center frequency of a time wavelet and the center quefrency of either a spin-up or spin-down wavelet. Use the helper function `helperPlotSeparableWavelet` to plot the real part of the separable 2-D wavelet associated with the time and frequential wavelets. To choose a spin-down wavelet, select a negative `whichCQ` value. You can use the same helper function to plot the imaginary part and magnitude of $${\Psi}_{\mu ,\ell ,s}$$. The source code for this helper function is in the same folder as this example file.

```
whichCF = centerFrequency(9); % time wavelet center frequency
whichCQ = centerQuefrency(10); % frequential wavelet center quefrency
plotType = "real";
helperPlotSeparableWavelet(psi2f,timemeta,psifup,psifdown,frequencymeta,whichCF,whichCQ,plotType)
```

### JTFS and Sensitivity to Time-Frequency Geometry

Load a quadratic chirp signal. Add an impulse and a sinusoid to the signal and plot the result.

```
load quadchirp
len = numel(quadchirp);
t = 2*(0:len-1)/len;
sig0 = zeros(1,len);
sig0(floor(len/2)-10:floor(len/2)+10) = 5;
sig1 = 2*cos(50*2*pi*t);
sig=quadchirp+sig0+sig1;
plot(sig)
title("Quadratic Chirp With Impulse and Sinusoid")
axis tight
```

Use the `cwt` function to plot the scalogram of the signal. The signal has a nontrivial time-frequency geometry.

```
cwt(sig)
```

One characteristic of JTFS coefficients is that different sets of coefficients are sensitive to different parts of the time-frequency geometry. To see this, use the `timeFrequencyScattering` function to create a JTFS network appropriate for the signal. Use the `scatteringTransform` function to obtain the JTFS of the signal. The five sets of JTFS coefficients are in the dictionary `outCFS`, and the metadata describing each set is in the cell array `outMETA`.

```
jtfn = timeFrequencyScattering(SignalLength=len, ...
    TimeInvarianceScale=32, ...
    TimeQualityFactors=[16 1], ...
    TimeMaxPaddingFactor=1, ...
    FrequencyInvarianceScale=1, ...
    NumFrequencyOctaves=2, ...
    FrequencyQualityFactor=1, ...
    FrequencyMaxPaddingFactor=2);
[outCFS,outMETA] = scatteringTransform(jtfn,sig);
```

Use the `scattergram` function to visualize the spin-up and spin-down coefficients. A JTFS separable 2-D wavelet is the product of a second-order time wavelet and a frequential wavelet of spin 1 or -1. Each row in the plot corresponds to a frequential wavelet of a given center quefrency. Spin-up and spin-down wavelets have positive and negative center quefrencies, respectively. Each column corresponds to a second-order time wavelet of a given center frequency.

The spin-up coefficients preferentially localize the up-chirp portion of the quadratic chirp, and the spin-down coefficients preferentially localize the down-chirp portion. Separable wavelets with lower temporal modulation rates and higher frequency modulation rates tend to localize the sinusoid, while separable wavelets with higher temporal modulation rates and lower frequency modulation rates localize the impulse.

```
jtfn.scattergram(outCFS,outMETA,PlotType="Spinned")
```

You can visualize the separable 2-D wavelet associated with a set of spinned coefficients. First, use the `filterbank` function to extract the second-order time wavelets and spinned wavelets from the JTFS network. Also obtain their metadata.

```
[~,psi2f,~,timemeta] = filterbank(jtfn);
[psifup,psifdown,~,frequencymeta] = filterbank(jtfn,FilterBank="frequency");
```

The variables `outMETA{3}` and `outMETA{4}` are tables describing the spin-up and spin-down coefficients, respectively. Each table row corresponds to a subplot in the scattergram, and each subplot corresponds to a specific separable 2-D wavelet. Inspect the spin-up metadata. The `path` table variable indicates the coefficient path. The first column is the index of the frequential wavelet, and the second column is the index of the second-order time wavelet.

```
outMETA{3}
```

```
ans=36×5 table
      type      log2dsfactor     path     spin    log2stride
    ________    ____________    ______    ____    __________
    "SpinUp"       0    1       1    3     1        0    5
    "SpinUp"       0    1       2    3     1        0    5
    "SpinUp"       0    1       3    3     1        0    5
    "SpinUp"       0    1       4    3     1        0    5
    "SpinUp"       0    2       1    4     1        0    5
    "SpinUp"       0    2       2    4     1        0    5
    "SpinUp"       0    2       3    4     1        0    5
    "SpinUp"       0    2       4    4     1        0    5
    "SpinUp"       0    3       1    5     1        0    5
    "SpinUp"       0    3       2    5     1        0    5
    "SpinUp"       0    3       3    5     1        0    5
    "SpinUp"       0    3       4    5     1        0    5
    "SpinUp"       0    4       1    6     1        0    5
    "SpinUp"       0    4       2    6     1        0    5
    "SpinUp"       0    4       3    6     1        0    5
    "SpinUp"       0    4       4    6     1        0    5
      ⋮
```

You can use the `path` variable to look up, in `frequencymeta` and `timemeta`, the metadata that describes the frequential and time wavelets. Choose a `path` value from one of the rows in `outMETA{3}`. Use that value to display the metadata describing the frequential and second-order time wavelets associated with the separable wavelet. The `xi` variable contains the center quefrency and center frequency of the frequential and time wavelets, respectively. Metadata describing the second-order time filter bank is in `timemeta{2}`.

```
pathValue = [4 9];
frequencymeta(pathValue(1),:)
```

```
ans=1×7 table
     xi     sigma     isCQT    log2dsfactor    spin    peakidx     bwidx
    ____    ______    _____    ____________    ____    _______    _______
    0.05    0.0325      0           1           1         8       2    19
```

```
timemeta{2}(pathValue(2),:)
```

```
ans=1×6 table
       xi         sigma       isCQT    log2dsfactor    peakidx     bwidx
    _________    __________    _____    ____________    _______    _______
    0.0015625    0.00062558      1           7            13       2    25
```

Use the helper function `helperPlotJTFSWaveletAndCFS` to plot the separable wavelet and the coefficients associated with it. The source code for this helper function is in the same folder as this example file.

```
figure
plotType = "real";
helperPlotJTFSWaveletAndCFS(pathValue,psifup,frequencymeta,psi2f,timemeta,outCFS{"SpinUp"},outMETA{3},plotType)
```

## References

[1] Andén, Joakim, Vincent Lostanlen, and Stéphane Mallat. “Joint Time–Frequency Scattering.” *IEEE Transactions on Signal Processing* 67, no. 14 (July 15, 2019): 3704–18. https://doi.org/10.1109/TSP.2019.2918992

[2] Lostanlen, Vincent, Christian El-Hajj, Mathias Rossignol, Grégoire Lafay, Joakim Andén, and Mathieu Lagrange. “Time–Frequency Scattering Accurately Models Auditory Similarities between Instrumental Playing Techniques.” *EURASIP Journal on Audio, Speech, and Music Processing* 2021, no. 1 (December 2021): 3. https://doi.org/10.1186/s13636-020-00187-z

[3] Mallat, Stéphane. “Group Invariant Scattering.” *Communications on Pure and Applied Mathematics* 65, no. 10 (October 2012): 1331–98. https://doi.org/10.1002/cpa.21413

## Related Examples

- Acoustic Scene Classification with Wavelet Scattering
- Musical Instrument Classification with Joint Time-Frequency Scattering