# stftLayer

Short-time Fourier transform layer

## Description

An STFT layer computes the short-time Fourier transform of the input. Use of this layer requires Deep Learning Toolbox™.

## Creation

### Description

`layer = stftLayer` creates a short-time Fourier transform (STFT) layer. The input to `stftLayer` must be a `dlarray` (Deep Learning Toolbox) object in `"CBT"` format with a size along the time dimension greater than the length of `Window`. `stftLayer` formats the output as `"SCBT"`. For more information, see Layer Output Format.

**Note**

The weights in `stftLayer` are initialized internally to be the modulated windows used as filters in the STFT. It is not recommended to initialize the weights directly.

`layer = stftLayer(Name=Value)` sets properties using one or more name-value arguments. You can specify the analysis window and the number of overlapped samples, among others.

## Properties

### STFT

`Window`

— Analysis window

`hann(128,'periodic')` (default) | vector

This property is read-only.

Analysis window used to compute the STFT, specified as a vector with two or more elements.

**Example:** `(1-cos(2*pi*(0:127)'/127))/2` and `hann(128)` both specify a Hann window of length 128.

**Data Types:** `double` | `single`
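As a quick check of the example above, the following sketch compares the explicit formula with `hann(128)` and with the default periodic window. It assumes Signal Processing Toolbox is available for `hann`; the variable names are illustrative only.

```matlab
n = (0:127)';
wFormula   = (1-cos(2*pi*n/127))/2;   % explicit symmetric Hann formula from the example
wSymmetric = hann(128);               % symmetric Hann window
wPeriodic  = hann(128,'periodic');    % default Window of stftLayer

% The formula and the symmetric window should agree to rounding error.
errSymmetric = max(abs(wFormula - wSymmetric));

% The periodic window divides by 128 instead of 127, so it differs visibly.
errPeriodic = max(abs(wFormula - wPeriodic));
```

The comparison suggests that the example expressions describe a symmetric window, which is close to, but not identical to, the periodic default.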

`OverlapLength`

— Number of overlapped samples

`96` (default) | positive integer

This property is read-only.

Number of overlapped samples, specified as a positive integer strictly smaller than the length of `Window`.

The stride between consecutive windows is the difference between the window length and the number of overlapped samples.

**Data Types:** `double` | `single`

`FFTLength`

— Number of DFT points

`128` (default) | positive integer

This property is read-only.

Number of frequency points used to compute the discrete Fourier transform, specified as a positive integer greater than or equal to the window length. If not specified, this argument defaults to the length of the window.

If the length of the input data along the time dimension is less than the number of DFT points, `stftLayer` right-pads the data and the window with zeros so they have a length equal to `FFTLength`.

**Data Types:** `double` | `single`
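The effect of right-padding with zeros can be illustrated in base MATLAB. This is a minimal sketch, not the layer's internal implementation: the segment, the window length `M`, and `nfft` are hypothetical values chosen for illustration.

```matlab
M = 64;                            % window length (hypothetical value)
nfft = 1024;                       % number of DFT points, nfft >= M
seg = cos(pi/4*(0:M-1))';          % one segment under a rectangular window

padded = [seg; zeros(nfft-M,1)];   % explicit right-padding with zeros
X1 = fft(padded);                  % DFT of the padded segment
X2 = fft(seg,nfft);                % fft appends trailing zeros when nfft > M

maxDiff = max(abs(X1 - X2));       % the two results should coincide
```

Zero-padding does not add information; it samples the same spectrum on a finer frequency grid.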

`TransformMode`

— Layer transform mode

`"mag"` (default) | `"squaremag"` | `"logmag"` | `"logsquaremag"` | `"realimag"`

Layer transform mode, specified as one of these:

- `"mag"`: STFT magnitude
- `"squaremag"`: STFT squared magnitude
- `"logmag"`: Natural logarithm of the STFT magnitude
- `"logsquaremag"`: Natural logarithm of the STFT squared magnitude
- `"realimag"`: Real and imaginary parts of the STFT, concatenated along the channel dimension

**Data Types:** `char` | `string`
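The transform modes can be sketched on a small complex array in base MATLAB. This is an illustration of the definitions listed above, not the layer's code: the array `S`, its frequency-by-channel-by-time layout, and the real-then-imaginary ordering for `"realimag"` are assumptions.

```matlab
% Toy complex STFT: 2 frequency bins, 1 channel, 2 time steps (illustrative values)
S = zeros(2,1,2);
S(:,1,1) = [3+4i; 1];
S(:,1,2) = [-2i; 1+1i];

magOut          = abs(S);                  % "mag"
squaremagOut    = abs(S).^2;               % "squaremag"
logmagOut       = log(abs(S));             % "logmag" (natural logarithm)
logsquaremagOut = log(abs(S).^2);          % "logsquaremag"
realimagOut     = cat(2,real(S),imag(S));  % "realimag": assumed order is real, then imaginary
```

In this sketch the `"realimag"` result has twice as many channels as the input, which is the main thing to account for when sizing the layers that follow.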

### Layer

`WeightLearnRateFactor`

— Multiplier for weight learning rate

`0` (default) | nonnegative scalar

Multiplier for weight learning rate, specified as a nonnegative scalar. If not specified, this property defaults to zero, resulting in weights that do not update with training. You can also set this property using the `setLearnRateFactor` (Deep Learning Toolbox) function.

**Data Types:** `double` | `single`

`Name`

— Layer name

`''` (default) | character vector | string scalar

Layer name, specified as a character vector or a string scalar. For `Layer` array input, the `trainNetwork` (Deep Learning Toolbox), `assembleNetwork` (Deep Learning Toolbox), `layerGraph` (Deep Learning Toolbox), and `dlnetwork` (Deep Learning Toolbox) functions automatically assign names to layers with the name `''`.

**Data Types:** `char` | `string`

`NumInputs`

— Number of inputs

`1` (default)

This property is read-only.

Number of inputs of the layer. This layer accepts a single input only.

**Data Types:** `double`

`InputNames`

— Input names

`{'in'}` (default)

This property is read-only.

Input names of the layer. This layer accepts a single input only.

**Data Types:** `cell`

`NumOutputs`

— Number of outputs

`1` (default)

This property is read-only.

Number of outputs of the layer. This layer has a single output only.

**Data Types:** `double`

`OutputNames`

— Output names

`{'out'}` (default)

This property is read-only.

Output names of the layer. This layer has a single output only.

**Data Types:** `cell`

## Examples

### Short-Time Fourier Transform of Chirp

Generate a signal sampled at 600 Hz for 2 seconds. The signal consists of a chirp with sinusoidally varying frequency content. Store the signal in a deep learning array with `"CTB"` format.

```
fs = 6e2;
x = vco(sin(2*pi*(0:1/fs:2)),[0.1 0.4]*fs,fs);
dlx = dlarray(x,"CTB");
```

Create a short-time Fourier transform layer with default properties. Create a `dlnetwork` object consisting of a sequence input layer and the short-time Fourier transform layer. Specify a minimum sequence length of 128 samples. Run the signal through the `predict` method of the network.

```
ftl = stftLayer;
dlnet = dlnetwork([sequenceInputLayer(1,MinLength=128) ftl]);
netout = predict(dlnet,dlx);
```

Convert the network output to a numeric array. Use the `squeeze` function to remove the length-1 channel and batch dimensions. Plot the magnitude of the STFT. The first dimension of the array corresponds to frequency and the second to time.

```
q = extractdata(netout);
waterfall(squeeze(q)')
set(gca,XDir="reverse",View=[30 45])
xlabel("Frequency")
ylabel("Time")
```

### Short-Time Fourier Transform of Sinusoid

Generate a 3 × 160 (× 1) array containing one batch of a three-channel, 160-sample sinusoidal signal. The normalized sinusoid frequencies are *π*/4 rad/sample, *π*/2 rad/sample, and 3*π*/4 rad/sample. Save the signal as a `dlarray`, specifying the dimensions in order. `dlarray` permutes the array dimensions to the `"CBT"` shape expected by a deep learning network.

```
nch = 3;
N = 160;
x = dlarray(cos(pi.*(1:nch)'/4*(0:N-1)),"CTB");
```

Create a short-time Fourier transform layer that can be used with the sinusoid. Specify a 64-sample rectangular window, 48 samples of overlap between adjoining windows, and 1024 DFT points. By default, the layer outputs the magnitude of the STFT.

```
stfl = stftLayer(Window=rectwin(64), ...
    OverlapLength=48, ...
    FFTLength=1024);
```

Create a two-layer `dlnetwork` object containing a sequence input layer and the STFT layer you just created. Treat each channel of the sinusoid as a feature. Specify the signal length as the minimum sequence length for the input layer.

```
layers = [sequenceInputLayer(nch,MinLength=N) stfl];
dlnet = dlnetwork(layers);
```

Run the sinusoid through the `forward` method of the network.

```
dataout = forward(dlnet,x);
```

Convert the network output to a numeric array. Use the `squeeze` function to collapse the size-1 batch dimension. Permute the channel and time dimensions so that each array page contains a two-dimensional spectrogram. Plot the STFT magnitude separately for each channel in a waterfall plot.

```
q = squeeze(extractdata(dataout));
q = permute(q,[1 3 2]);
for kj = 1:nch
    subplot(nch,1,kj)
    waterfall(q(:,:,kj)')
    view(30,45)
    zlabel("Ch. "+string(kj))
end
```

## More About

### Short-Time Fourier Transform

The short-time Fourier transform (STFT) is used to analyze how the frequency
content of a nonstationary signal changes over time. The magnitude squared of the STFT is
known as the *spectrogram* time-frequency representation of the signal.
For more information about the spectrogram and how to compute it using Signal Processing Toolbox™ functions, see Spectrogram Computation with Signal Processing Toolbox.

The STFT of a signal is computed by sliding an *analysis window*
*g*(*n*) of length *M* over the signal and calculating the
discrete Fourier transform (DFT) of each segment of windowed data. The window hops over the
original signal at intervals of *R* samples, equivalent to *L* = *M* –
*R* samples of overlap between adjoining segments. Most window functions taper
off at the edges to avoid spectral ringing. The DFT of each windowed segment is added to a
complex-valued matrix that contains the magnitude and phase for each point in time and
frequency. The STFT matrix has

$$k=\left\lfloor \frac{N_{x}-L}{M-L}\right\rfloor $$

columns, where $$N_{x}$$ is the length of the signal *x*(*n*) and the ⌊⌋ symbols denote the floor function. The number of rows in the matrix equals $$N_{\text{DFT}}$$, the number of DFT points, for centered and two-sided transforms and an odd number close to $$N_{\text{DFT}}/2$$ for one-sided transforms of real-valued signals.
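As a worked instance of the column formula, the following base MATLAB sketch evaluates *k* for the two signals used in the Examples section. The variable names are illustrative only.

```matlab
% Chirp example: samples at 0:1/600:2, default window length 128 and overlap 96
NxChirp = numel(0:1/600:2);
kChirp = floor((NxChirp - 96)/(128 - 96));

% Sinusoid example: 160 samples, window length 64 and overlap 48
kSine = floor((160 - 48)/(64 - 48));
```

Under these settings the chirp yields 34 columns and the sinusoid yields 7. Samples left over after the last full window do not contribute a column.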

The *m*th column of the STFT matrix $$X(f)=\left[\begin{array}{ccccc}{X}_{1}(f)& {X}_{2}(f)& {X}_{3}(f)& \cdots & {X}_{k}(f)\end{array}\right]$$ contains the DFT of the windowed data centered about time *mR*:

$$X_{m}(f)=\sum_{n=-\infty}^{\infty}x(n)\,g(n-mR)\,e^{-j2\pi fn}.$$
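The sliding-window computation can be sketched directly in base MATLAB. This is a minimal illustration under stated assumptions, not the layer's implementation: it uses a rectangular window, one of the sinusoids from the Examples section, and 1-based segment indexing. Because `fft` references phase to the start of each segment rather than to absolute time, only the magnitudes are directly comparable with the definition above.

```matlab
N = 160; M = 64; L = 48; nfft = 1024;   % signal, window, overlap, DFT lengths
x = cos(pi/4*(0:N-1))';                 % sinusoid at pi/4 rad/sample
g = ones(M,1);                          % rectangular analysis window
R = M - L;                              % hop between adjoining segments
k = floor((N - L)/R);                   % number of STFT columns

X = zeros(nfft,k);
for m = 1:k
    idx = (m-1)*R + (1:M);              % samples covered by the mth window
    X(:,m) = fft(x(idx).*g,nfft);       % zero-padded DFT of the windowed segment
end

% Locate the strongest bin among the nonnegative frequencies of the first column.
[~,peakBin] = max(abs(X(1:nfft/2+1,1)));
peakFreq = 2*pi*(peakBin-1)/nfft;       % rad/sample
```

For this input the peak lands at π/4 rad/sample, matching the frequency of the sinusoid.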

The short-time Fourier transform is invertible. The inversion process overlap-adds the windowed segments to compensate for the signal attenuation at the window edges. For more information, see Inverse Short-Time Fourier Transform.

- The `istft` function inverts the STFT of a signal.
- Under a specific set of circumstances it is possible to achieve "perfect reconstruction" of a signal. For more information, see Perfect Reconstruction.
- The `stftmag2sig` function returns an estimate of a signal reconstructed from the magnitude of its STFT.

### Layer Output Format

`stftLayer` formats the output as `"SCBT"`, a sequence of 1-D images where the image height corresponds to frequency, the second dimension corresponds to channel, the third dimension corresponds to batch, and the fourth dimension corresponds to time.

- You can feed the output of `stftLayer` unchanged to a 1-D convolutional layer when you want to convolve along the frequency (`"S"`) dimension. For more information, see `convolution1dLayer` (Deep Learning Toolbox).
- To feed the output of `stftLayer` to a 1-D convolutional layer when you want to convolve along the time (`"T"`) dimension, you must place a flatten layer after the `stftLayer`. For more information, see `flattenLayer` (Deep Learning Toolbox).
- You can feed the output of `stftLayer` unchanged to a 2-D convolutional layer when you want to convolve along the frequency (`"S"`) and time (`"T"`) dimensions. For more information, see `convolution2dLayer` (Deep Learning Toolbox).
- To use `stftLayer` as part of a recurrent neural network, you must place a flatten layer after the `stftLayer`. For more information, see `lstmLayer` (Deep Learning Toolbox) and `gruLayer` (Deep Learning Toolbox).
- To use the output of `stftLayer` with a fully connected layer as part of a classification workflow, you must reduce the time (`"T"`) dimension of the output so that it has size 1. To reduce the time dimension of the output, place a global pooling layer before the fully connected layer. For more information, see `globalAveragePooling2dLayer` (Deep Learning Toolbox) and `fullyConnectedLayer` (Deep Learning Toolbox).
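The recurrent case in the list above can be sketched as follows. This is a hedged example that assumes Deep Learning Toolbox and Signal Processing Toolbox are installed; the number of hidden units, the number of classes, and the variable names are hypothetical choices for illustration.

```matlab
layers = [
    sequenceInputLayer(1,MinLength=128)   % single-channel input signal
    stftLayer                             % output in "SCBT" format
    flattenLayer                          % flatten so the recurrent layer receives sequence data
    lstmLayer(32,OutputMode="last")       % 32 hidden units (hypothetical size)
    fullyConnectedLayer(2)                % two classes (hypothetical)
    softmaxLayer];
net = dlnetwork(layers);
```

Here the LSTM layer with `OutputMode="last"` collapses the time dimension, so no global pooling layer is placed before the fully connected layer in this sketch.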

## Version History

**Introduced in R2021b**

### R2022b: `OutputMode` property to be removed in a future release

The `OutputMode` property of `stftLayer` will be removed in a future release. Update your code and networks to make them compatible with `stftLayer` output in `"SCBT"` format. For more information, see Layer Output Format.

## See Also

### Apps

- Deep Network Designer (Deep Learning Toolbox)

### Objects

### Functions

`dlstft` | `stft` | `istft` | `stftmag2sig`

### Topics

- Learn Pre-Emphasis Filter Using Deep Learning
- List of Deep Learning Layers (Deep Learning Toolbox)
