spectrogram

Spectrogram using short-time Fourier transform

Syntax

  • s = spectrogram(x)
  • s = spectrogram(x,window)
  • s = spectrogram(x,window,noverlap)
  • s = spectrogram(x,window,noverlap,nfft)
  • [s,w,t] = spectrogram(___)
  • [s,f,t] = spectrogram(___,fs)
  • [s,w,t] = spectrogram(x,window,noverlap,w)
  • [s,f,t] = spectrogram(x,window,noverlap,f,fs)
  • [___,pxx] = spectrogram(___)
  • [___] = spectrogram(___,freqrange)
  • [___] = spectrogram(___,spectrumtype)

Description

s = spectrogram(x) returns the short-time Fourier transform of the input signal, x. Each column of s contains an estimate of the short-term, time-localized frequency content of x.

s = spectrogram(x,window) uses window to divide the signal into sections and perform windowing.

s = spectrogram(x,window,noverlap) uses noverlap samples of overlap between adjoining sections.

s = spectrogram(x,window,noverlap,nfft) uses nfft sampling points to calculate the discrete Fourier transform.

[s,w,t] = spectrogram(___) returns a vector of normalized frequencies, w, and a vector of time instants, t, at which the spectrogram is computed. This syntax can include any combination of input arguments from previous syntaxes.

[s,f,t] = spectrogram(___,fs) returns a vector of cyclical frequencies, f, expressed in terms of the sample rate, fs.

[s,w,t] = spectrogram(x,window,noverlap,w) returns the spectrogram at the normalized frequencies specified in w.

[s,f,t] = spectrogram(x,window,noverlap,f,fs) returns the spectrogram at the cyclical frequencies specified in f.

[___,pxx] = spectrogram(___) additionally returns a matrix, pxx, containing a power spectral density (PSD) estimate of each section.

[___] = spectrogram(___,freqrange) returns the PSD estimate over the frequency range specified by freqrange. Valid options for freqrange are 'onesided', 'twosided', and 'centered'.

[___] = spectrogram(___,spectrumtype) returns the PSD estimate if spectrumtype is specified as 'psd' and returns the power spectrum if spectrumtype is specified as 'power'.

spectrogram(___) with no output arguments plots the spectrogram in the current figure window.

spectrogram(___,freqloc) specifies the axis on which to plot the frequency.

Examples

Frequency Along x-Axis

Generate a quadratic chirp, x, sampled at 1 kHz for 2 seconds. The frequency of the chirp is 100 Hz initially and crosses 200 Hz at t = 1 s.

t = 0:0.001:2;
x = chirp(t,100,1,200,'quadratic');
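
As a rough check on the chirp parameters, the sketch below evaluates the instantaneous frequency by hand. It assumes the quadratic law f_i(t) = f0 + (f1 - f0)(t/t1)^2 for this upsweep; the variable names f0, f1, t1, and fi are introduced here for illustration only.

```matlab
% Assumed instantaneous-frequency law of the quadratic chirp above:
%   f_i(t) = f0 + (f1 - f0)*(t/t1)^2
f0 = 100;            % initial frequency in Hz
f1 = 200;            % frequency reached at time t1, in Hz
t1 = 1;              % reference time in seconds
t  = 0:0.001:2;      % same time vector as in the example

fi = f0 + (f1 - f0)*(t/t1).^2;

fAtStart = fi(1);    % frequency at t = 0 s
fAtOne   = fi(1001); % frequency at t = 1 s (sample 1001)
fAtEnd   = fi(end);  % frequency at t = 2 s
```

Under this assumption the chirp reaches 500 Hz at t = 2 s, which is the Nyquist frequency for a 1 kHz sample rate, so the sweep stays within the displayable band.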

Compute and display the spectrogram of x. Divide the signal into sections of length 128, windowed with a Hamming window. Specify 120 samples of overlap between adjoining sections. Evaluate the spectrum at $\lfloor128/2+1\rfloor=65$ frequencies and $\lfloor({\tt length(x)}-120)/(128-120)\rfloor=235$ time bins.

spectrogram(x,128,120,128,1e3)

Replace the Hamming window with a Blackman window. Decrease the overlap to 60 samples. Plot the time axis so that its values increase from top to bottom.

spectrogram(x,blackman(128),60,128,1e3)
ax = gca;
ax.YDir = 'reverse';

Spectrogram and Instantaneous Frequency

Use the spectrogram to measure and track the instantaneous frequency of a signal.

Generate a quadratic chirp sampled at 1 kHz for two seconds. Specify the chirp so its frequency is initially 100 Hz and increases to 200 Hz after one second.

Fs = 1000;
t = 0:1/Fs:2-1/Fs;
y = chirp(t,100,1,200,'quadratic');

Estimate the spectrum of the chirp using the short-time Fourier transform implemented in the spectrogram function. Divide the signal into sections of length 100, windowed with a Hamming window. Specify 80 samples of overlap between adjoining sections and evaluate the spectrum at $\lfloor100/2+1\rfloor=51$ frequencies. Suppress the default color bar.

spectrogram(y,100,80,100,Fs,'yaxis')
view(-77,72)
shading interp
colorbar off

Track the chirp frequency by finding the maximum of the power spectral density at each of the $\lfloor(2000-80)/(100-80)\rfloor = 96$ time points. View the spectrogram as a two-dimensional graphic. Restore the color bar.

[s,f,t,p] = spectrogram(y,100,80,100,Fs);

[q,nd] = max(10*log10(p));

hold on
plot3(t,f(nd),q,'r','linewidth',4)
hold off
colorbar
view(2)
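
The tracking step relies on max operating down each column of the PSD matrix, so the second output holds one row index per time section. A minimal sketch with a made-up 3-by-3 matrix (the values of f and p are hypothetical and chosen only to make the indexing easy to follow):

```matlab
% Toy PSD matrix: rows are frequencies, columns are time sections.
f = [0; 10; 20];         % hypothetical frequency grid in Hz
p = [1 5 2;
     9 1 3;
     2 2 8];             % hypothetical PSD values in linear units

% max searches each column separately, so nd(k) is the row index of
% the strongest frequency in section k and q(k) is its level in dB.
[q,nd] = max(10*log10(p));

ridge = f(nd);           % peak frequency for each time section
```

Indexing the frequency vector with nd is what turns the row indices into the frequency track that plot3 draws in the example.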

Power Spectral Densities of Chirps

Compute and display the PSD of each segment of a quadratic chirp that starts at 100 Hz and crosses 200 Hz at t = 1 s. Specify a sample rate of 1 kHz, a segment length of 128 samples, and an overlap of 120 samples. Use 128 DFT points and the default Hamming window.

t = 0:0.001:2;
x = chirp(t,100,1,200,'quadratic');

spectrogram(x,128,120,128,1e3,'yaxis')
title('Quadratic Chirp')

Compute and display the PSD of each segment of a linear chirp that starts at DC and crosses 150 Hz at t = 1 s. Specify a sample rate of 1 kHz, a segment length of 256 samples, and an overlap of 250 samples. Use the default Hamming window and 256 DFT points.

t = 0:0.001:2;
x = chirp(t,0,1,150);

spectrogram(x,256,250,256,1e3,'yaxis')
title('Linear Chirp')

Compute and display the PSD of each segment of a logarithmic chirp sampled at 1 kHz that starts at 20 Hz and crosses 60 Hz at t = 1 s. Specify a segment length of 256 samples and an overlap of 250 samples. Use the default Hamming window and 256 DFT points.

t = 0:0.001:2;
x = chirp(t,20,1,60,'logarithmic');

spectrogram(x,256,250,[],1e3,'yaxis')
title('Logarithmic Chirp')

Use a logarithmic scale for the frequency axis. Because the frequency of a logarithmic chirp grows exponentially with time, the ridge of the spectrogram becomes a straight line.

ax = gca;
ax.YScale = 'log';
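
The straight line on the logarithmic axis can be checked numerically. The sketch below assumes the logarithmic chirp follows f_i(t) = f0 (f1/f0)^(t/t1); the names f0, f1, t1, fi, and slope are illustrative and not part of the example above.

```matlab
% Assumed instantaneous-frequency law of the logarithmic chirp:
%   f_i(t) = f0*(f1/f0)^(t/t1)
f0 = 20;             % initial frequency in Hz
f1 = 60;             % frequency reached at time t1, in Hz
t1 = 1;              % reference time in seconds
t  = 0:0.001:2;

fi = f0*(f1/f0).^(t/t1);

% On a log axis the quantity plotted is log10(fi). Its slope with
% respect to time is constant, which is why the ridge looks straight.
slope = diff(log10(fi))./diff(t);        % decades per second
slopeSpread = max(slope) - min(slope);   % close to zero for a straight line
```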

Track Chirps in Audio Signal

Load an audio signal that contains two decreasing chirps and a wideband splatter sound. Compute the short-time Fourier transform. Divide the waveform into 400-sample segments with 300-sample overlap. Plot the spectrogram.

load splat

% To hear, type soundsc(y,Fs)

sg = 400;
ov = 300;

spectrogram(y,sg,ov,[],Fs,'yaxis')
colormap bone

Use the spectrogram function to output the power spectral density (PSD) of the signal.

[s,f,t,p] = spectrogram(y,sg,ov,[],Fs);

Track the two chirps using the medfreq function. To find the stronger, low-frequency chirp, restrict the search to frequencies above 100 Hz and to times before the start of the wideband sound.

f1 = f > 100;
t1 = t < 0.75;

m1 = medfreq(p(f1,t1),f(f1));

To find the faint high-frequency chirp, restrict the search to frequencies above 2500 Hz and to times between 0.3 seconds and 0.65 seconds.

f2 = f > 2500;
t2 = t > 0.3 & t < 0.65;

m2 = medfreq(p(f2,t2),f(f2));

Overlay the result on the spectrogram. Divide the frequency values by 1000 to express them in kHz.

hold on
plot(t(t1),m1/1000,'linewidth',4)
plot(t(t2),m2/1000,'linewidth',4)
hold off

3D Spectrogram Visualization

Generate two seconds of a signal sampled at 10 kHz. Specify the instantaneous frequency of the signal as a triangular function of time.

fs = 10e3;
t = 0:1/fs:2;
x1 = vco(sawtooth(2*pi*t,0.5),[0.1 0.4]*fs,fs);
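
To see what frequency track to expect in the plots, the following sketch rebuilds the triangular control signal with basic arithmetic and maps it onto the frequency range. It assumes that a control value of -1 corresponds to the lower limit 0.1*fs and +1 to the upper limit 0.4*fs; tri, fmin, fmax, and fi are illustrative names.

```matlab
fs = 10e3;
t  = 0:1/fs:2;

% Triangle wave in [-1, 1] with a period of 1 s, written without
% toolbox functions. It starts at -1 and peaks at +1 mid-period.
tri = 1 - 4*abs(mod(t,1) - 0.5);

% Assumed mapping from control value to instantaneous frequency.
fmin = 0.1*fs;       % 1 kHz
fmax = 0.4*fs;       % 4 kHz
fi = fmin + (tri + 1)/2*(fmax - fmin);
```

Under these assumptions the frequency rises from 1 kHz to 4 kHz in the first half second and falls back during the second half, repeating once per second.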

Compute and plot the spectrogram of the signal. Use a Kaiser window of length 256 and shape parameter $\beta=5$. Specify 220 samples of section-to-section overlap and 512 DFT points. Plot the frequency on the y-axis. Use the default colormap and view.

spectrogram(x1,kaiser(256,5),220,512,fs,'yaxis')

Change the view to display the spectrogram as a waterfall plot. Set the colormap to bone.

colormap bone
view(-45,65)

Input Arguments

x — Input signal
vector

Input signal, specified as a row or column vector.

Example: cos(pi/4*(0:159))+randn(1,160) specifies a sinusoid embedded in white Gaussian noise.

Data Types: single | double
Complex Number Support: Yes

window — Window
integer | vector | []

Window, specified as an integer or as a row or column vector. Use window to divide the signal into sections:

  • If window is an integer, then spectrogram divides x into sections of length window and windows each section with a Hamming window of that length.

  • If window is a vector, then spectrogram divides x into sections of the same length as the vector and windows each section using window.

If the length of x cannot be divided exactly into an integer number of sections with noverlap overlapping samples, then x is truncated accordingly.
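
The sectioning and truncation rule can be sketched with plain index arithmetic. The numbers below reuse the first example on this page (2001 samples, sections of 128 samples, 120 samples of overlap); hop, starts, lastUsed, and dropped are illustrative names.

```matlab
Nx = 2001;           % signal length
L = 128;             % section length (window length)
noverlap = 120;      % samples shared by adjoining sections

hop = L - noverlap;                  % samples between section starts
k = floor((Nx - noverlap)/hop);      % number of complete sections

starts = 1 + (0:k-1)*hop;            % first sample of each section
lastUsed = starts(end) + L - 1;      % last sample covered by a section
dropped = Nx - lastUsed;             % trailing samples that are not used
```

For these values the signal yields 235 sections and the final sample falls outside the last complete section.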

If you specify window as empty, then spectrogram uses a Hamming window such that x is divided into eight sections with noverlap overlapping samples.

For a list of available windows, see Windows.

Example: hann(N+1) and (1-cos(2*pi*(0:N)'/N))/2 both specify a Hann window of length N + 1.

Data Types: single | double

noverlap — Number of overlapped samples
positive integer | []

Number of overlapped samples, specified as a positive integer.

  • If window is scalar, then noverlap must be smaller than window.

  • If window is a vector, then noverlap must be smaller than the length of window.

If you specify noverlap as empty, then spectrogram uses a number that produces 50% overlap between sections. If the section length is unspecified, the function sets noverlap to ⌊Nx/4.5⌋, where Nx is the length of the input signal.

Data Types: double | single

nfft — Number of DFT points
positive integer scalar | []

Number of DFT points, specified as a positive integer scalar. If you specify nfft as empty, then spectrogram sets the parameter to max(256,2^p), where p = ⌈log2 Nx⌉ for an input signal of length Nx.

Data Types: single | double

w — Normalized frequencies
vector

Normalized frequencies, specified as a vector. w must have at least two elements. Normalized frequencies are in rad/sample.

Example: pi./[2 4]

Data Types: double | single

f — Cyclical frequencies
vector

Cyclical frequencies, specified as a vector. f must have at least two elements. The units of f are specified by the sample rate, fs.

Data Types: double | single

fs — Sample rate
1 Hz (default) | positive scalar

Sample rate, specified as a positive scalar. The sample rate is the number of samples per unit time. If the unit of time is seconds, then the sample rate is in Hz.

Data Types: double | single

freqrange — Frequency range for PSD estimate
'onesided' | 'twosided' | 'centered'

Frequency range for the PSD estimate, specified as 'onesided', 'twosided', or 'centered'. For real-valued signals, the default is 'onesided'. For complex-valued signals, the default is 'twosided'.

  • 'onesided' — returns the one-sided spectrogram of a real input signal. If nfft is even, then pxx has length nfft/2 + 1 and is computed over the interval [0, π] rad/sample. If nfft is odd, then pxx has length (nfft + 1)/2 and the interval is [0, π) rad/sample. If you specify fs, then the intervals are respectively [0, fs/2] cycles/unit time and [0, fs/2) cycles/unit time.

  • 'twosided' — returns the two-sided spectrogram of a real or complex signal. pxx has length nfft and is computed over the interval [0, 2π) rad/sample. If you specify fs, then the interval is [0, fs) cycles/unit time.

  • 'centered' — returns the centered two-sided spectrogram for a real or complex signal. pxx has length nfft. If nfft is even, then pxx is computed over the interval (–π, π] rad/sample. If nfft is odd, then pxx is computed over (–π, π) rad/sample. If you specify fs, then the intervals are respectively (–fs/2, fs/2] cycles/unit time and (–fs/2, fs/2) cycles/unit time.

Data Types: char
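
The three frequency ranges can be illustrated by building the corresponding frequency grids by hand. This is a sketch with an arbitrary small nfft, not the internal implementation; fOne, fTwo, and fCen are illustrative names.

```matlab
fs = 1e3;            % sample rate in Hz
nfft = 8;            % number of DFT points (even in this sketch)

% 'onesided': floor(nfft/2) + 1 points, from 0 up to fs/2 for even nfft
fOne = (0:floor(nfft/2))*fs/nfft;

% 'twosided': nfft points covering [0, fs)
fTwo = (0:nfft-1)*fs/nfft;

% 'centered': nfft points covering (-fs/2, fs/2] for even nfft
% and (-fs/2, fs/2) for odd nfft
fCen = (-ceil(nfft/2)+1:floor(nfft/2))*fs/nfft;
```

With nfft = 8 the one-sided grid has five points ending at 500 Hz, while the centered grid runs from -375 Hz to 500 Hz. Changing nfft to an odd value removes the point at fs/2 from both grids.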

spectrumtype — Power spectrum scaling
'psd' (default) | 'power'

Power spectrum scaling, specified as 'psd' or 'power'.

  • Omitting spectrumtype, or specifying 'psd', returns the power spectral density.

  • Specifying 'power' scales each estimate of the PSD by the equivalent noise bandwidth of the window. The result is an estimate of the power at each frequency.

Data Types: char
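
The relation between the two scalings can be sketched by computing the equivalent noise bandwidth of a window directly. The example assumes a symmetric 128-point Hamming window built from its closed-form expression and uses a made-up PSD value; enbwHz, enbwBins, psdValue, and powerValue are illustrative names.

```matlab
fs = 1e3;            % sample rate in Hz
L = 128;             % window length
n = (0:L-1)';

% Symmetric Hamming window written out explicitly.
w = 0.54 - 0.46*cos(2*pi*n/(L-1));

% Equivalent noise bandwidth in Hz and in DFT bins of width fs/L.
enbwHz = fs*sum(w.^2)/sum(w)^2;
enbwBins = enbwHz*L/fs;

% Converting one PSD value (power per Hz) to power.
psdValue = 2e-3;                 % hypothetical PSD value
powerValue = psdValue*enbwHz;    % corresponding power estimate
```

For this window the equivalent noise bandwidth is a little under 1.4 bins, so the power values are roughly 10.7 times the PSD values at a 1 kHz sample rate.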

freqloc — Frequency display axis
'xaxis' (default) | 'yaxis'

Frequency display axis, specified as 'xaxis' or 'yaxis'.

  • 'xaxis' — displays frequency on the x-axis and time on the y-axis.

  • 'yaxis' — displays frequency on the y-axis and time on the x-axis.

This argument is ignored if you call spectrogram with output arguments.

Data Types: char

Output Arguments

s — Short-time Fourier transform
matrix

Short-time Fourier transform, returned as a matrix. Time increases across the columns of s and frequency increases down the rows, starting from zero.

  • If x is a signal of length Nx, then s has k columns, where

    • k = ⌊(Nx – noverlap)/(window – noverlap)⌋ if window is a scalar

    • k = ⌊(Nx – noverlap)/(length(window) – noverlap)⌋ if window is a vector.

  • If x is real and nfft is even, then s has (nfft/2 + 1) rows.

  • If x is real and nfft is odd, then s has (nfft + 1)/2 rows.

  • If x is complex, then s has nfft rows.

Data Types: double | single
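
The size rules for s can be reproduced with a hand-written short-time Fourier transform that uses only fft. This is a simplified sketch and not the function's actual implementation: the test tone, the explicit Hamming formula, and the names hop, nrows, peakRow, and peakHz are assumptions made for the illustration.

```matlab
fs = 1e3;
x = cos(2*pi*100*(0:1/fs:2-1/fs));   % hypothetical 100 Hz tone, 2000 samples
L = 100;                             % section length
noverlap = 80;
nfft = 100;

Nx = numel(x);
hop = L - noverlap;
k = floor((Nx - noverlap)/hop);      % number of columns
nrows = floor(nfft/2) + 1;           % rows kept for a real input

w = 0.54 - 0.46*cos(2*pi*(0:L-1)'/(L-1));   % symmetric Hamming window

s = zeros(nrows,k);
for m = 1:k
    idx = (m-1)*hop + (1:L);         % samples of section m
    X = fft(w.*x(idx).',nfft);       % windowed DFT of the section
    s(:,m) = X(1:nrows);             % keep nonnegative frequencies
end

[~,peakRow] = max(abs(s(:,1)));
peakHz = (peakRow - 1)*fs/nfft;      % frequency of the strongest row
```

With these settings s has 51 rows and 96 columns, matching the counts quoted in the instantaneous-frequency example, and the strongest row of the first column sits at 100 Hz.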

w — Normalized frequencies
vector

Normalized frequencies, returned as a vector. w has a length equal to the number of rows of s.

Data Types: double | single

t — Time instants
vector

Time instants, returned as a vector. The time values in t correspond to the midpoint of each section.

Data Types: double | single
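
The midpoint rule can be sketched as follows, using the settings of the instantaneous-frequency example. The sketch assumes that the midpoint of a section lies L/2 samples after its first sample; the exact half-sample convention is an assumption, and tmid is an illustrative name.

```matlab
Fs = 1000;           % sample rate in Hz
Nx = 2000;           % signal length
L = 100;             % section length
noverlap = 80;

hop = L - noverlap;
k = floor((Nx - noverlap)/hop);      % number of sections

% Midpoint of each section, in seconds (assumed convention: L/2
% samples after the start of the section).
tmid = ((0:k-1)*hop + L/2)/Fs;
```

Under this convention the first time instant is 0.05 s, the last is 1.95 s, and consecutive instants are hop/Fs = 0.02 s apart.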

f — Cyclical frequencies
vector

Cyclical frequencies, returned as a vector. f has a length equal to the number of rows of s.

Data Types: double | single

pxx — Power spectral density
matrix

Power spectral density (PSD), returned as a matrix.

  • If x is real, then pxx contains the one-sided modified periodogram estimate of the PSD of each section.

  • If x is complex, or if you specify a vector of frequencies, then pxx contains the two-sided modified periodogram estimate of the PSD of each section.

Data Types: double | single
