Code covered by the BSD License  

Highlights from
A Numerical Tour of Signal Processing

from A Numerical Tour of Signal Processing by Gabriel Peyre
A set of Matlab experiments that illustrates advanced computational signal and image processing.

Sparse Spikes Deconvolution with Matching Pursuits

Sparse Spikes Deconvolution with Matching Pursuits

This numerical tour explores the use of Matching Pursuits to solve seismic sparse spikes deconvolution.

Contents

Sparse spikes deconvolution is one of the oldest inverse problems, that is a stilized version of recovery in seismic imaging. The ground is modeled as a 1D profile x, mostly zeros with a few spikes accounting for interfaces between layers in the ground. The observation is the convolution D*x of x against a wavelet filter, where D(:,1) is the basis wavelet function (D is the convolution operator.

The goal of sparse spike deconvolution is to recover an approximation of x given noisy measurement y=D*x+w. Since the convolution destroys many low and high frequencies, this requires some prior information to regularize the inverse problem. Since x is composed of a few spikes, the sparsity is a wonderful prior, that is exploited either by L1 minimization (basis pursuit) or by non-linear greedy processes (matching pursuits).

Installing toolboxes and setting up the path.

You need to download the general purpose toolbox and the signal toolbox.

You need to unzip these toolboxes in your working directory, so that you have toolbox_general/ and toolbox_signal/ in your directory.

For Scilab user: you must replace the Matlab comment '%' by its Scilab counterpart '//'.

Recommandation: You should create a text file named for instance numericaltour.sce (in Scilab) or numericaltour.m (in Matlab) to write all the Scilab/Matlab commands you want to execute. Then, simply run exec('numericaltour.sce'); (in Scilab) or numericaltour; (in Matlab) to run the commands.

Execute this line only if you are using Matlab.

getd = @(p)path(path,p); % scilab users must *not* execute this

Then you can add these toolboxes to the path.

% Add some directories to the path
getd('toolbox_signal/');
getd('toolbox_general/');

Seismic Wavelets and Seismic Imaging

We first see the forard measurement process in action.

set_rand_seeds(123456,21);

First we load a seismic filter, which is a second derivative of a Gaussian.

% dimension of the signal
n = 1024;
% width of the filter
s = 13;
% second derivative of Gaussian
t = -n/2:n/2-1;
h = (1-t.^2/s^2).*exp( -(t.^2)/(2*s^2) );
h = h-mean(h);
% normalize it
h = h/norm(h);
% recenter the filter for periodic boundary conditions
h1 = fftshift(h);

We display the filter in space (zoom).

k = 100;
sel = n/2-k+1:n/2+k;
clf;
plot(t(sel), h(sel));
set_graphic_sizes([], 20);
title('Filter h');
axis('tight');

We display its Fourier transform.

hf = fftshift(real(fft(h1)));
clf;
plot(t(sel), hf(sel));
set_graphic_sizes([], 20);
title('FFT(h)');
axis('tight');

The actual number of measurement of the seismic imaging is roughly the number of Fourier frequencies above the noise level sigma.

% noise level
sigma = .06*max(h);
% how much frequencies are removed
q = sum( abs(hf)>sigma );
disp(strcat(['Approximate number of measures = ' num2str(q) '.']));
Approximate number of measures = 106.

We compute the filtering matrix. To stabilize the recovery, we sub-sample by a factor of 2 the filtering.

% sub-sampling (distance between wavelets)
sub = 2;
% number of atoms in the dictionary
p = n/sub;
% the dictionary, with periodic boundary conditions
[Y,X] = meshgrid(1:sub:n,1:n);
D = reshape( h1(mod(X-Y,n)+1), [n p]);

Now we create the ideal signal x we would like to recover. We design it so that there is a deacreasing distance between the spikes. It makes the recovery harder and harder from left to right. The amplitude of the spikes and the signs are randoms

% spacing min and max between the spikes.
m = 5; M = 40;
k = floor( (p+M)*2/(M+m) )-1;
spc = linspace(M,m,k)';
% location of the spikes
sel = round( cumsum(spc) );
sel(sel>p) = [];
% randomization of the signs and values
x = zeros(p,1);
si = (-1).^(1:length(sel))'; si = si(randperm(length(si)));
% creating of the sparse spikes signal.
x(sel) = si;
x = x .* (1-rand(p,1)*.5);

Display the spikes

clf;
plot_sparse_diracs(x);
title('Signal x');

Sparsity of the signal.

disp(strcat(['Sparsity (L0 pseudo-norm) of x = ' num2str(sum(x~=0)) '.']));
Sparsity (L0 pseudo-norm) of x = 21.

Now compute the noisy obervations.

w = randn(n,1)*sigma;
y = D*x + w;
% display
clf;
plot(y);
set_graphic_sizes([], 20);
axis('tight');
title('Noisy measurements y=D*x+w');

Matching Pursuit

Matching pursuit is a greedy procedure that progressively identify the location of the spikes by looking at atoms that maximaly correlated with the current residual.

Initially, the residual is the whole obervations, and the detected signal is zero.

R = y;
xmp = zeros(p,1);

Compute and display the correlation.

C = D'*R;
clf;
plot(abs(C));
set_graphic_sizes([], 20);
axis('tight');
title('|Correlation|');

Extract the coefficient with maximal correlation

[tmp,I] = compute_max(abs(C));
% update the coefficients
xmp(I) = xmp(I) + C(I);
% update the residual
R = y-D*xmp;

Display the previous and the new residual.

clf;
subplot(2,1,1);
plot(y); axis('tight');
set_graphic_sizes([], 20);
title('Observation y');
subplot(2,1,2);
plot(R); axis('tight');
set_graphic_sizes([], 20);
title('Residual R');

Exercice 1: (the solution is exo1.m) Compute up to M steps of matching pursuit. What do you notice about the locations of the spikes that are well recovered by matching pursuit ?

exo1;

Although several locations of the spikes are well recovered, the values of the spikes are not well estimated. To better recover the values, one needs to find the best values that match the measurements. This requires the least norm solution of the over determined system D(:,sel)*x(sel) = y, where sel is the support recovered by MP.

sel = find(xmp~=0);
xproj = zeros(p,1);
xproj(sel) = D(:,sel) \ y;
% display
clf;
subplot(2,1,1);
plot_sparse_diracs(xmp);
title('Recovered by MP');
subplot(2,1,2);
plot_sparse_diracs(xproj);
title('Recovered by backprojected MP');

Exercice 2: (the solution is exo2.m) Perform 1.5*M steps of MP, and at each step compute the back-projection xproj of the MP solution xmp. Keep the solution xproj that minimize the error norm(xproj-x).

exo2;

Display of the solution.

err_mp = norm(x-xproj)/norm(x);
clf;
subplot(2,1,1);
plot_sparse_diracs(x);
title('Signal x');
subplot(2,1,2);
plot_sparse_diracs(xproj);
title(['Recovered by MP, err=' num2str(err_mp,3)]);

Orthogonal Matching Pursuit

Orthogonal matching pursuit improves over matching pursuit by back-projecting at each iteration the matching pursuit solution.

The initialization of OMP is the same as with MP.

R = y;
xomp = zeros(p,1);

The coeffcient selection is also done with maximum of correlation.

C = D'*R;
[tmp,I] = compute_max(abs(C));
% update the coefficients
xomp(I) = xomp(I) + C(I);
% perform a back projection of the coefficients
sel = find(xomp~=0);
xomp = zeros(p,1);
xomp(sel) = D(:,sel) \ y;
% update the residual
R = y-D*xomp;

Exercice 3: (the solution is exo3.m) Implement Orthogonal Matching Pursuit by modifying your implementation of Matching Pursuit. Keep the solution that gives the smallest error norm(x-xomp)

exo3;

Display of the solution.

err_omp = norm(x-xomp)/norm(x);
clf;
subplot(2,1,1);
plot_sparse_diracs(x);
title('Signal x');
subplot(2,1,2);
plot_sparse_diracs(xomp);
title(['Recovered by OMP, err=' num2str(err_omp,3)]);

Exact Recovery Condition

To study the theoritical properties of pursuits, Joel Tropp introduced a perfect recovery condition (ERC) that measure the conditioning ERC(S) of the support S=find(x~=0) of a coefficient. The condition ERC(S)<1 ensures that any xoefficients supported in S is recovered by OMP, up to the noise level (so that the procedure is stable).

First we create a signal with well separated Diracs. Such a signal is easy to recover by OMP, and we will check this with the ERC.

delta = 30;
x1 = zeros(p,1);
x1(1:delta:p+1-delta) = 1;

ERC(S) is the maximum L1 norm of the set of inner product between a dual atoms and the set of atoms outside S.

% compute the support and the complementary of the support
S  =find(x1~=0); % in
Sc =find(x1==0); % out
% compute pseudo inverse of atoms within the support
D1 = D(:,S);
D1 = (D1'*D1)^(-1) * D1';
% compute inner product between dual atoms inside the support
% and atoms outside.
C = D1 * D(:,Sc);
% compute the maximum L1 norm, which is the ERC
ERC = max( sum( abs(C), 1 ) );
% display
disp(strcat(['ERC(S)=' num2str(ERC,3)]));
ERC(S)=1.05

Exercice 4: (the solution is exo4.m) Compute ERC(i) = ERC(Si) for supports Si that are separated by an increasing value of delta. Check for the minimum delta that ensures that ERC(Si)<1.

exo4;

Contact us at files@mathworks.com