Autoencoders for Wireless Communications

This example shows how to model an end-to-end communications system with an autoencoder to reliably transmit information bits over a wireless channel.

Introduction

A traditional autoencoder is an unsupervised neural network that learns how to efficiently compress data, which is also called encoding. The autoencoder also learns how to reconstruct the data from the compressed representation such that the difference between the original data and the reconstructed data is minimal.

Traditional wireless communication systems are designed to provide reliable data transfer over a channel that impairs the transmitted signals. These systems have multiple components, such as channel coding, modulation, equalization, and synchronization. Each component is optimized independently, based on mathematical models that are simplified to arrive at closed-form expressions. In contrast, an autoencoder jointly optimizes the transmitter and the receiver as a whole. This joint optimization has the potential to provide better performance than traditional systems [1], [2].

Traditional autoencoders are usually used to compress images, that is, to remove redundancies in an image and reduce its dimension. A wireless communication system, on the other hand, uses channel coding and modulation techniques to add redundancy to the information bits. With this added redundancy, the system can recover the information bits that are impaired by the wireless channel. So, a wireless autoencoder actually adds redundancy and tries to minimize the number of errors in the received information for a given channel, while learning to apply both channel coding and modulation in an unsupervised way.

Basic Autoencoder System

The following is the block diagram of a wireless autoencoder system. The encoder (transmitter) first maps $k$ information bits into a message $s$ such that $s\in \left\{1,\dots ,M\right\}$, where $M={2}^{k}$. Then message $s$ is mapped to $n$ real numbers to create $\text{x}=f\left(s\right)\in {\mathbb{R}}^{n}$. The last layer of the encoder imposes constraints on $\text{x}$ to further restrict the encoded symbols. The following constraints are possible and are implemented using the normalization layer:

• Energy constraint: $\|\text{x}\|_{2}^{2}\le n$

• Average power constraint: $\mathbb{E}\left[|{x}_{i}{|}^{2}\right]\le 1,\forall i$

Define the communication rate of this system as $R=k/n$ [bits/channel use], where (n,k) means that the system sends one of $M={2}^{k}$ messages using n channel uses. The channel impairs encoded (i.e. transmitted) symbols to generate $\text{y}\in {\mathbb{R}}^{n}$. The decoder (i.e. receiver) produces an estimate, $\hat{s}$, of the transmitted message, $s$.
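As a rough illustration of the energy and average power constraints listed above, the following sketch scales a batch of encoder outputs so that each constraint holds with equality. It is a minimal sketch under stated assumptions and not the contents of helperAEWNormalizationLayer.m, whose exact scaling may differ. The variable names and the random batch are hypothetical.

```matlab
% Hypothetical batch of encoder outputs: one row per message, n columns.
n = 7;
x = randn(16,n);

% Energy constraint: scale every row so that ||x||_2^2 = n.
rowEnergy = sum(x.^2,2);                 % 16x1 energy of each row
xEnergy = x .* sqrt(n ./ rowEnergy);     % implicit expansion (R2016b or later)

% Average power constraint: scale every column so that the average of
% |x_i|^2 over the batch equals 1.
colPower = mean(x.^2,1);                 % 1xn average power per channel use
xPower = x ./ sqrt(colPower);
```

The energy version fixes the energy of every individual message, while the average power version only fixes the mean over many messages, so individual symbols can have different amplitudes.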

The input message is defined as a one-hot vector ${\text{1}}_{s}\in {\mathbb{R}}^{M}$, which is defined as a vector whose elements are all zeros except the ${\mathit{s}}^{\mathrm{th}}$ one. The channel is additive white Gaussian noise (AWGN) that adds noise to achieve a given energy per bit to noise power density ratio, ${E}_{b}/{N}_{o}$.
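The one-hot input and the noise level described above can be sketched in a few lines. This is only an illustration: the noise scaling assumes that two real values form one complex symbol with unit power, which may differ in detail from what helperAEWAWGNLayer.m does, and the placeholder symbols in `x` are made up.

```matlab
k = 4;  M = 2^k;                 % 16 possible messages
s = [1; 5; 16];                  % example messages, each in 1..M
numMsgs = numel(s);

% One-hot rows: row i is all zeros except column s(i).
oneHot = zeros(numMsgs,M);
oneHot(sub2ind([numMsgs M],(1:numMsgs)',s)) = 1;

% AWGN for a given Eb/No (assumption: unit-power complex symbols, each
% built from two real values and carrying bitsPerSymbol bits).
EbNo = 3;                                   % dB
bitsPerSymbol = 2;
signalPower = 1;
EsNo = EbNo + 10*log10(bitsPerSymbol);      % dB
N0 = signalPower / 10^(EsNo/10);            % noise power per complex symbol
x = ones(numMsgs,2)/sqrt(2);                % placeholder unit-power symbols
y = x + sqrt(N0/2)*randn(size(x));          % N0/2 per real dimension
```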

Define a (7,4) autoencoder network with energy normalization and a training ${E}_{b}/{N}_{o}$ of 3 dB. In [1], the authors showed that two fully connected layers for both the encoder (transmitter) and the decoder (receiver) provide the best results with minimal complexity. The input layer (featureInputLayer (Deep Learning Toolbox)) accepts a one-hot vector of length M. The encoder has two fully connected layers (fullyConnectedLayer (Deep Learning Toolbox)). The first one has M inputs and M outputs and is followed by a ReLU layer (reluLayer (Deep Learning Toolbox)). The second fully connected layer has M inputs and n outputs and is followed by the normalization layer (helperAEWNormalizationLayer.m). The encoder layers are followed by the AWGN channel layer (helperAEWAWGNLayer.m). The output of the channel is passed to the decoder layers. The first decoder layer is a fully connected layer that has n inputs and M outputs and is followed by a ReLU layer. The second fully connected layer has M inputs and M outputs and is followed by a softmax layer (softmaxLayer (Deep Learning Toolbox)), which outputs the probability of each of the M symbols. The classification layer (classificationLayer (Deep Learning Toolbox)) outputs the most probable transmitted symbol from 0 to M-1.

k = 4;    % number of input bits
M = 2^k;  % number of possible input symbols
n = 7;    % number of channel uses
EbNo = 3; % Eb/No in dB

wirelessAutoencoder = [
featureInputLayer(M,"Name","One-hot input","Normalization","none")

fullyConnectedLayer(M,"Name","fc_1")
reluLayer("Name","relu_1")

fullyConnectedLayer(n,"Name","fc_2")

helperAEWNormalizationLayer("Method", "Energy", "Name", "wnorm")

helperAEWAWGNLayer("Name","channel",...
"NoiseMethod","EbNo",...
"EbNo",EbNo,...
"BitsPerSymbol",2,...
"SignalPower",1)

fullyConnectedLayer(M,"Name","fc_3")
reluLayer("Name","relu_2")

fullyConnectedLayer(M,"Name","fc_4")
softmaxLayer("Name","softmax")

classificationLayer("Name","classoutput")]
wirelessAutoencoder =
11×1 Layer array with layers:

1   'One-hot input'   Feature Input            16 features
2   'fc_1'            Fully Connected          16 fully connected layer
3   'relu_1'          ReLU                     ReLU
4   'fc_2'            Fully Connected          7 fully connected layer
5   'wnorm'           Wireless Normalization   Energy normalization layer
6   'channel'         AWGN Channel             AWGN channel with EbNo = 3
7   'fc_3'            Fully Connected          16 fully connected layer
8   'relu_2'          ReLU                     ReLU
9   'fc_4'            Fully Connected          16 fully connected layer
10   'softmax'         Softmax                  softmax
11   'classoutput'     Classification Output    crossentropyex
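
To get a feel for how small this network is, the number of learnable parameters can be counted from the layer sizes listed above. The sketch assumes that only the four fully connected layers carry learnable parameters, that is, the normalization and channel layers have none.

```matlab
k = 4;  M = 2^k;  n = 7;

% A fully connected layer with nIn inputs and nOut outputs has
% nIn*nOut weights plus nOut biases.
fcParams = @(nIn,nOut) nIn*nOut + nOut;

numEncoder = fcParams(M,M) + fcParams(M,n);   % fc_1 + fc_2
numDecoder = fcParams(n,M) + fcParams(M,M);   % fc_3 + fc_4
numTotal = numEncoder + numDecoder;
```

For the (7,4) case this gives 391 encoder parameters and 400 decoder parameters, 791 in total.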

The helperAEWTrainWirelessAutoencoder.m function defines such a network based on the (n,k), normalization method and the ${E}_{b}/{N}_{o}$ values. The Wireless Autoencoder Training Function section shows the contents of the helperAEWTrainWirelessAutoencoder.m function.

Train Autoencoder

Run the helperAEWTrainWirelessAutoencoder.m function to train a (2,2) autoencoder with energy normalization. This function uses the trainingOptions (Deep Learning Toolbox) function to select

• Initial learning rate of 0.01,

• Maximum epochs of 15,

• Minibatch size of 20*M,

• Piecewise learning schedule with drop period of 10 and drop factor of 0.1.

Then, the helperAEWTrainWirelessAutoencoder.m function runs the trainNetwork (Deep Learning Toolbox) function to train the autoencoder network with the selected options. Finally, this function separates the network into encoder and decoder parts. The encoder starts with the input layer and ends after the normalization layer. The decoder starts after the channel layer and ends with the classification layer. A feature input layer is added at the beginning of the decoder.
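
The piecewise schedule listed above can be written out per epoch to see what it means for a 15-epoch run. The sketch below is plain arithmetic on the stated settings; the number of training symbols, 2500*M, is taken from the helperAEWTrainWirelessAutoencoder.m listing, and the variable names are hypothetical.

```matlab
k = 2;  M = 2^k;
maxEpochs = 15;
initialLearnRate = 0.01;
dropPeriod = 10;
dropFactor = 0.1;

% Learning rate used in each epoch: multiplied by dropFactor after every
% dropPeriod epochs.
epochs = 1:maxEpochs;
learnRate = initialLearnRate * dropFactor.^floor((epochs-1)/dropPeriod);

% Number of mini-batches processed per epoch.
numTrainSymbols = 2500*M;
miniBatchSize = 20*M;
iterationsPerEpoch = floor(numTrainSymbols/miniBatchSize);
```

With these settings, epochs 1 to 10 run at a learning rate of 0.01, epochs 11 to 15 run at 0.001, and each epoch contains 125 iterations regardless of M, because both the training set and the mini-batch size scale with M.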

Train the autoencoder with an ${E}_{b}/{N}_{o}$ value that is low enough to result in some errors, but not so low that the training algorithm cannot extract any useful information from the received symbols, y. Set ${E}_{b}/{N}_{o}$ to 3 dB.

Training an autoencoder may take several minutes. Set trainNow to false to use saved networks.

trainNow = false; %#ok<*NASGU>

n = 2;                      % number of channel uses
k = 2;                      % bits per data symbol
EbNo = 3;                   % dB
normalization = "Energy";   % Normalization "Energy" | "Average power"

if trainNow
[txNet22e,rxNet22e,info22e,wirelessAutoEncoder22e] = ...
helperAEWTrainWirelessAutoencoder(n,k,normalization,EbNo); %#ok<*UNRCH>
else
load trainedNet_n2_k2_energy txNet rxNet info trainedNet
txNet22e = txNet;
rxNet22e = rxNet;
info22e = info;
wirelessAutoEncoder22e = trainedNet;
end

Plot the training progress. The validation accuracy quickly reaches more than 90% while the validation loss keeps slowly decreasing. This behavior shows that the training ${E}_{b}/{N}_{o}$ value was low enough to cause some errors but not so low as to prevent convergence. For definitions of validation accuracy and validation loss, see the Monitor Deep Learning Training Progress (Deep Learning Toolbox) section.

figure
helperAEWPlotTrainingPerformance(info22e)

Use the plot object function of the trained network objects to show the layer graphs of the full autoencoder, the encoder network (the transmitter), and the decoder network (the receiver).

figure
tiledlayout(2,2)
nexttile([2 1])
plot(wirelessAutoEncoder22e)
title('Autoencoder')
nexttile
plot(txNet22e)
title('Encoder/Tx')
nexttile
plot(rxNet22e)
title('Decoder/Rx')

Plot the constellation learned by the autoencoder to send symbols through the AWGN channel together with the received constellation. For a (2,2) configuration, the autoencoder learns a QPSK ($M={2}^{k}=4$) constellation with a phase rotation. The received constellation consists of the activation values at the output of the channel layer, obtained using the activations (Deep Learning Toolbox) function and treated as interleaved complex numbers.

subplot(1,2,1)
helperAEWPlotConstellation(txNet22e)
title('Learned Constellation')
subplot(1,2,2)

Simulate BLER Performance

Simulate the block error rate (BLER) performance of the (2,2) autoencoder. Set up the simulation parameters.

simParams.EbNoVec = 0:0.5:8;
simParams.MinNumErrors = 10;
simParams.MaxNumFrames = 300;
simParams.NumSymbolsPerFrame = 10000;
simParams.SignalPower = 1;

Generate random integers in the [0 $M$-1] range that represent $k$ random information bits. Encode these information bits into complex symbols with the helperAEWEncode.m function. The helperAEWEncode function runs the encoder part of the autoencoder and then maps the real-valued $\text{x}$ vector into a complex-valued ${\text{x}}_{c}$ vector such that the odd and even elements are mapped into the in-phase and the quadrature component of a complex symbol, respectively, where ${\text{x}}_{c}=\text{x}\left(1:2:end\right)+j\text{x}\left(2:2:end\right)$. In other words, treat the $\text{x}$ array as an interleaved complex array.
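
A small example makes the interleaved mapping concrete. The values in `x` are made up for illustration, and the inverse mapping is included only to show that no information is lost.

```matlab
x = [0.6 -0.8 -1 0];              % n = 4 real encoder outputs (made-up values)

% Odd elements become in-phase parts, even elements quadrature parts.
xc = x(1:2:end) + 1j*x(2:2:end);  % [0.6-0.8i, -1+0i]

% Inverse mapping: interleave real and imaginary parts again.
xBack = zeros(1,2*numel(xc));
xBack(1:2:end) = real(xc);
xBack(2:2:end) = imag(xc);
```

Here the four real channel uses become two complex symbols, so n real channel uses correspond to n/2 complex symbols when n is even.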

Pass the complex symbols through an AWGN channel. Decode the channel-impaired complex symbols with the helperAEWDecode.m function. The following code runs the simulation for each ${E}_{b}/{N}_{o}$ point until at least 10 block errors occur. To obtain more accurate results, increase the minimum number of errors to at least 100. If Parallel Computing Toolbox™ is installed and a license is available, the simulation runs on a parallel pool. Compare the results with those of an uncoded QPSK system with block length 2.

EbNoVec = simParams.EbNoVec;
R = k/n;

M = 2^k;
BLER = zeros(size(EbNoVec));
parfor EbNoIdx = 1:length(EbNoVec)
EbNo = EbNoVec(EbNoIdx) + 10*log10(R);
chan = comm.AWGNChannel("BitsPerSymbol",2, ...
"EbNo", EbNo, "SamplesPerSymbol", 1, "SignalPower", 1);

numBlockErrors = 0;
frameCnt = 0;
while (numBlockErrors < simParams.MinNumErrors) ...
&& (frameCnt < simParams.MaxNumFrames) %#ok<PFBNS>

d = randi([0 M-1],simParams.NumSymbolsPerFrame,1);    % Random information bits
x = helperAEWEncode(d,txNet22e);                      % Encoder
y = chan(x);                                          % Channel
dHat = helperAEWDecode(y,rxNet22e);                   % Decoder

numBlockErrors = numBlockErrors + sum(d ~= dHat);
frameCnt = frameCnt + 1;
end
BLER(EbNoIdx) = numBlockErrors / (frameCnt*simParams.NumSymbolsPerFrame);
end
Starting parallel pool (parpool) using the 'local' profile ...
Connected to the parallel pool (number of workers: 6).
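
The `10*log10(R)` offset applied to EbNo in the loop above can be motivated as follows, assuming that every complex symbol is built from two real channel uses. An (n,k) system then sends k information bits over n/2 complex symbols, so each complex symbol carries 2k/n = 2R information bits. In dB, the energy per complex symbol relative to the noise density is

```latex
\frac{E_s}{N_o}\,[\mathrm{dB}]
  = \frac{E_b}{N_o}\,[\mathrm{dB}] + 10\log_{10}\!\left(\frac{2k}{n}\right)
  = \underbrace{\frac{E_b}{N_o}\,[\mathrm{dB}] + 10\log_{10}R}_{\text{value passed as EbNo}}
    + \underbrace{10\log_{10}2}_{\text{BitsPerSymbol}=2}
```

Under this reading, the first term is what the code passes to the channel object, and setting BitsPerSymbol to 2 accounts for the remaining factor of two. For the (2,2) case R = 1, so the offset is 0 dB.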
figure
semilogy(simParams.EbNoVec,BLER,'-')
hold on
qpsk22BLER = 1-(1-berawgn(simParams.EbNoVec,'psk',4,'nondiff')).^2;
semilogy(simParams.EbNoVec,qpsk22BLER,'--')
hold off
ylim([1e-4 1])
grid on
xlabel('E_b/N_o (dB)')
ylabel('BLER')
legend('AE (2,2)','QPSK (2,2)')

The well-formed constellation together with the BLER results show that training for 15 epochs is enough to reach satisfactory convergence.

Compare Constellation Diagrams

Compare the learned constellations of several autoencoders normalized to unit energy and unit average power. Train a (2,4) autoencoder normalized to unit energy.

n = 2;      % number of channel uses
k = 4;      % bits per data symbol
EbNo = 3;   % dB
normalization = "Energy";
if trainNow
[txNet24e,rxNet24e,info24e,wirelessAutoEncoder24e] = ...
helperAEWTrainWirelessAutoencoder(n,k,normalization,EbNo);
else
load trainedNet_n2_k4_energy txNet rxNet info trainedNet
txNet24e = txNet;
rxNet24e = rxNet;
info24e = info;
wirelessAutoEncoder24e = trainedNet;
end

Train a (2,4) autoencoder normalized to unit average power.

n = 2;      % number of channel uses
k = 4;      % bits per data symbol
EbNo = 3;   % dB
normalization = "Average power";
if trainNow
[txNet24p,rxNet24p,info24p,wirelessAutoEncoder24p] = ...
helperAEWTrainWirelessAutoencoder(n,k,normalization,EbNo);
else
load trainedNet_n2_k4_power txNet rxNet info trainedNet
txNet24p = txNet;
rxNet24p = rxNet;
info24p = info;
wirelessAutoEncoder24p = trainedNet;
end

Train a (7,4) autoencoder normalized to unit energy.

n = 7;      % number of channel uses
k = 4;      % bits per data symbol
EbNo = 3;   % dB
normalization = "Energy";
if trainNow
[txNet74e,rxNet74e,info74e,wirelessAutoEncoder74e] = ...
helperAEWTrainWirelessAutoencoder(n,k,normalization,EbNo);
else
load trainedNet_n7_k4_energy txNet rxNet info trainedNet
txNet74e = txNet;
rxNet74e = rxNet;
info74e = info;
wirelessAutoEncoder74e = trainedNet;
end

Plot the constellations using the helperAEWPlotConstellation.m function. The trained (2,2) autoencoder converges on a QPSK constellation with a phase shift as the optimal constellation for the channel conditions experienced.

The (2,4) autoencoder with energy normalization converges to a 16PSK constellation with a phase shift. Note that energy normalization forces every symbol to have unit energy and places the symbols on the unit circle. Given this constraint, the best constellation is a PSK constellation with equal angular distance between symbols.

The (2,4) autoencoder with average power normalization converges to a three-tier constellation of 1-6-9 symbols. Average power normalization forces the symbols to have unity average power over time. This constraint results in an APSK constellation, which is different from the conventional QAM or APSK schemes. Note that this network configuration may also converge to a two-tier constellation with 7-9 symbols, depending on the random initial condition used during training.

The last plot shows the 2-D mapping of the 7-D constellation generated by the (7,4) autoencoder with energy constraint. The 2-D mapping is obtained using the t-Distributed Stochastic Neighbor Embedding (t-SNE) method (see the tsne (Statistics and Machine Learning Toolbox) function).
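
For comparison with the learned (2,4) energy-normalized constellation, an ideal 16PSK constellation can be generated directly. This is a reference sketch, not output of the trained network, and the phase offset is an arbitrary choice.

```matlab
M = 16;
phaseOffset = pi/M;                               % arbitrary rotation
psk = exp(1j*(2*pi*(0:M-1)/M + phaseOffset));     % equally spaced on the unit circle

symbolEnergy = abs(psk).^2;                       % all ones: unit energy per symbol

% Distance between neighboring points, equal to 2*sin(pi/M) for M-PSK.
dMin = abs(psk(2) - psk(1));
```

A common phase rotation leaves all pairwise distances unchanged, which is why the phase shift of the learned constellation has no effect on performance over an AWGN channel.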

figure
subplot(2,2,1)
helperAEWPlotConstellation(txNet22e)
title('(2,2) Energy')
subplot(2,2,2)
helperAEWPlotConstellation(txNet24e)
title('(2,4) Energy')
subplot(2,2,3)
helperAEWPlotConstellation(txNet24p)
title('(2,4) Average Power')
subplot(2,2,4)
helperAEWPlotConstellation(txNet74e,'t-sne')
title('(7,4) Energy')

Compare BLER Performance of Autoencoders with Coded and Uncoded QPSK

Compare the BLER performance of a (7,4) autoencoder with that of a (7,4) Hamming code with QPSK modulation for both hard decision and maximum likelihood (ML) decoding. Use uncoded (4,4) QPSK as a baseline. Uncoded (4,4) QPSK is a QPSK-modulated system that sends blocks of 4 bits and measures BLER. The data for the following figures is obtained using the helperAEWSimulateBLER.mlx and helperAEWPrepareAutoencoders.mlx files.

figure
qpsk44BLERTh = 1-(1-berawgn(simParams.EbNoVec,'psk',4,'nondiff')).^4;
semilogy(simParams.EbNoVec,qpsk44BLERTh,':*')
hold on
semilogy(simParams.EbNoVec,qpsk44BLER,':o')
semilogy(simParams.EbNoVec,hammingHard74BLER,'--s')
semilogy(simParams.EbNoVec,ae74eBLER,'-')
semilogy(simParams.EbNoVec,hammingML74BLER,'--d')
hold off
ylim([1e-5 1])
grid on
xlabel('E_b/N_o (dB)')
ylabel('BLER')
legend('Theoretical Uncoded QPSK (4,4)','Uncoded QPSK (4,4)','Hamming (7,4) Hard Decision',...
'Autoencoder (7,4)','Hamming (7,4) ML','Location','southwest')
title('BLER comparison of (7,4) Autoencoder')

As expected, the hard decision (7,4) Hamming code with QPSK modulation provides about 0.6 dB ${E}_{b}/{N}_{o}$ advantage over uncoded QPSK, while the ML decoding of the (7,4) Hamming code with QPSK modulation provides another 1.5 dB advantage for a BLER of $10^{-3}$. The (7,4) autoencoder BLER performance approaches the ML decoding of the (7,4) Hamming code when trained with 3 dB ${E}_{b}/{N}_{o}$. This BLER performance shows that the autoencoder is able to learn not only modulation but also channel coding to achieve a coding gain of about 2 dB for a coding rate of R=4/7.
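
The difference between hard decision and ML decoding of a (7,4) Hamming code can be sketched with an exhaustive search over the 16 codewords. This is an illustration under stated assumptions: the generator matrix is one common systematic choice and may differ from the one used in helperAEWSimulateBLER.mlx, each code bit is sent as a BPSK value as a stand-in for the QPSK mapping, and the noise level is arbitrary. Implicit expansion requires R2016b or later.

```matlab
% Systematic generator matrix of a (7,4) Hamming code (assumed choice).
P = [1 1 0; 0 1 1; 1 1 1; 1 0 1];
G = [eye(4) P];

msgs = dec2bin(0:15,4) - '0';        % 16x4, row i+1 holds message i
codebook = mod(msgs*G,2);            % 16x7, all codewords
bpsk = 1 - 2*codebook;               % bit 0 -> +1, bit 1 -> -1

% Transmit the codeword in row 10 and add noise at an arbitrary level.
txIdx = 10;
r = bpsk(txIdx,:) + 0.5*randn(1,7);

% Hard decision decoding: slice to bits first, then pick the codeword
% closest in Hamming distance.
hardBits = double(r < 0);
[~,hardIdx] = min(sum(codebook ~= hardBits,2));

% ML (soft decision) decoding: pick the codeword closest to r in
% Euclidean distance, without slicing first.
[~,mlIdx] = min(sum((bpsk - r).^2,2));
```

Slicing discards the reliability of each received value, which is the information ML decoding keeps and the reason it performs better at the same ${E}_{b}/{N}_{o}$.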

Next, compare the BLER performance of autoencoders with R=1 with that of uncoded QPSK systems. Use uncoded (2,2) and (8,8) QPSK as baselines. Compare the BLER performance of these systems with that of (2,2), (4,4) and (8,8) autoencoders.

qpsk22BLERTh = 1-(1-berawgn(simParams.EbNoVec,'psk',4,'nondiff')).^2;
semilogy(simParams.EbNoVec,qpsk22BLERTh,':*')
hold on
semilogy(simParams.EbNoVec,qpsk88BLER,'--*')
qpsk88BLERTh = 1-(1-berawgn(simParams.EbNoVec,'psk',4,'nondiff')).^8;
semilogy(simParams.EbNoVec,qpsk88BLERTh,':o')
semilogy(simParams.EbNoVec,ae22eBLER,'-o')
semilogy(simParams.EbNoVec,ae44eBLER,'-d')
semilogy(simParams.EbNoVec,ae88eBLER,'-s')
hold off
ylim([1e-5 1])
grid on
xlabel('E_b/N_o (dB)')
ylabel('BLER')
legend('Uncoded QPSK (2,2)','Uncoded QPSK (8,8)','Theoretical Uncoded QPSK (8,8)',...
'Autoencoder (2,2)','Autoencoder (4,4)','Autoencoder (8,8)','Location','southwest')
title('BLER performance of R=1 Autoencoders')

The bit error rate of QPSK is the same for both the (8,8) and (2,2) cases. However, the BLER depends on the block length, $n$, and gets worse as $n$ increases, as given by $\mathrm{BLER}=1-{\left(1-\mathrm{BER}\right)}^{n}$. As expected, the BLER performance of (8,8) QPSK is worse than that of the (2,2) QPSK system. The BLER performance of the (2,2) autoencoder matches the BLER performance of (2,2) QPSK. On the other hand, the (4,4) and (8,8) autoencoders optimize the channel coder and the constellation jointly to obtain a coding gain with respect to the corresponding uncoded QPSK systems.
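
The dependence of the uncoded BLER on block length can be reproduced without any toolbox function. The sketch below uses the standard closed-form bit error rate of Gray-coded QPSK over AWGN in place of berawgn and assumes independent bit errors within a block.

```matlab
EbNoVec = 0:0.5:8;                         % dB
EbNoLin = 10.^(EbNoVec/10);

% BER of Gray-coded QPSK over AWGN: Q(sqrt(2*Eb/No)) = 0.5*erfc(sqrt(Eb/No)).
ber = 0.5*erfc(sqrt(EbNoLin));

% A block is in error when at least one of its bits is in error.
bler2 = 1 - (1 - ber).^2;                  % blocks of 2 bits
bler8 = 1 - (1 - ber).^8;                  % blocks of 8 bits
```

At every ${E}_{b}/{N}_{o}$ value the 8-bit blocks fail more often than the 2-bit blocks, even though the underlying BER is identical.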

Effect of Training Eb/No on BLER Performance

Train the (7,4) autoencoder with energy normalization under different ${E}_{b}/{N}_{o}$ values and compare the BLER performance.

n = 7;
k = 4;
normalization = 'Energy';

EbNoVec = 1:3:10;
if trainNow
for EbNoIdx = 1:length(EbNoVec)
EbNo = EbNoVec(EbNoIdx);
[txNetVec{EbNoIdx},rxNetVec{EbNoIdx},infoVec{EbNoIdx},trainedNetVec{EbNoIdx}] = ...
helperAEWTrainWirelessAutoencoder(n,k,normalization,EbNo);
BLERVec{EbNoIdx} = helperAEWAutoencoderBLER(txNetVec{EbNoIdx},rxNetVec{EbNoIdx},simParams);
end
else
load ae74TrainedEbNo1to10 BLERVec trainParams simParams txNetVec rxNetVec infoVec trainedNetVec EbNoVec
end

Plot the BLER performance together with theoretical upper bound for hard decision decoded Hamming (7,4) code and simulated BLER of maximum likelihood decoded (MLD) Hamming (7,4) code. The BLER performance of the (7,4) autoencoder gets closer to the Hamming (7,4) code with MLD as the training ${E}_{b}/{N}_{o}$ decreases from 10 dB to 1 dB, at which point it almost matches the MLD Hamming (7,4) code.

berHamming = bercoding(simParams.EbNoVec,'hamming','hard',7);
blerHamming = 1-(1-berHamming).^7;
figure
semilogy(simParams.EbNoVec,blerHamming,':k')
hold on
linespec = {'-*','-d','-o','-s',};
for EbNoIdx=length(EbNoVec):-1:1
semilogy(simParams.EbNoVec,BLERVec{EbNoIdx},linespec{EbNoIdx})
end
semilogy(simParams.EbNoVec,hammingML74BLER,'--vk')
hold off
ylim([1e-5 1])
grid on
xlabel('E_b/N_o (dB)')
ylabel('BLER')
legend('(7,4) Hamming HDD Upper','(7,4) AE - Eb/No=10','(7,4) AE - Eb/No=7',...
'(7,4) AE - Eb/No=4','(7,4) AE - Eb/No=1','Hamming (7,4) MLD','location','southwest')

Conclusions and Further Exploration

The BLER results show that it is possible for autoencoders to learn joint coding and modulation schemes in an unsupervised way. It is even possible to train an autoencoder with R=1 to obtain a coding gain as compared to traditional methods. The example also shows the effect of hyperparameters such as ${E}_{b}/{N}_{o}$ on the BLER performance.

The results are obtained using the following default settings for training and BLER simulations:

trainParams.Plots = 'none';
trainParams.Verbose = false;
trainParams.MaxEpochs = 15;
trainParams.InitialLearnRate = 0.01;
trainParams.LearnRateSchedule = 'piecewise';
trainParams.LearnRateDropPeriod = 10;
trainParams.LearnRateDropFactor = 0.1;
trainParams.MiniBatchSize = 20*2^k;

simParams.EbNoVec = -2:0.5:8;
simParams.MinNumErrors = 100;
simParams.MaxNumFrames = 300;
simParams.NumSymbolsPerFrame = 10000;
simParams.SignalPower = 1;

Vary these parameters to train different autoencoders and test their BLER performance. Experiment with different n, k, normalization and ${E}_{b}/{N}_{o}$ values. See the help for helperAEWTrainWirelessAutoencoder.m, helperAEWPrepareAutoencoders.mlx and helperAEWAutoencoderBLER.m for more information.
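
One way to start such an experiment is to pass a custom training parameter structure as the fifth input of helperAEWTrainWirelessAutoencoder. The sketch below is only an illustration: the (4,3) configuration and the changed epoch settings are arbitrary, and the output variable names are hypothetical. All eight fields are set because the function listing shows that a supplied structure is used as is, without being merged with the defaults.

```matlab
n = 4;                                     % number of channel uses (arbitrary)
k = 3;                                     % bits per data symbol (arbitrary)
EbNo = 3;                                  % training Eb/No in dB

trainParams.Plots = 'none';
trainParams.Verbose = false;
trainParams.MaxEpochs = 30;                % longer than the default of 15
trainParams.InitialLearnRate = 0.01;
trainParams.LearnRateSchedule = 'piecewise';
trainParams.LearnRateDropPeriod = 20;      % drop later than the default of 10
trainParams.LearnRateDropFactor = 0.1;
trainParams.MiniBatchSize = 20*2^k;

% Guarded so that the snippet does nothing when the helper function is
% not on the MATLAB path.
if exist('helperAEWTrainWirelessAutoencoder','file') == 2
    [txNet43e,rxNet43e,info43e,trainedNet43e] = ...
        helperAEWTrainWirelessAutoencoder(n,k,"Energy",EbNo,trainParams);
end
```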

Wireless Autoencoder Training Function

This section shows the content of the helperAEWTrainWirelessAutoencoder function. To open the runnable version of the function in the MATLAB editor, click helperAEWTrainWirelessAutoencoder.m.

type helperAEWTrainWirelessAutoencoder
function [txNet,rxNet,info,trainedNet] = ...
helperAEWTrainWirelessAutoencoder(n,k,normalization,EbNo,varargin)
%helperAEWTrainWirelessAutoencoder Train wireless autoencoder
%   [TX,RX,INFO,AE] = helperAEWTrainWirelessAutoencoder(N,K,NORM,EbNo)
%   trains an autoencoder, AE, with (N,K), where K is the number of input
%   bits and N is the number of channel uses. The autoencoder employs NORM
%   normalization. NORM must be one of 'Energy' and 'Average power'. The
%   channel is an AWGN channel with Eb/No set to EbNo. TX and RX are the
%   encoder and decoder parts of the autoencoder that can be used in the
%   helperAEWEncode and helperAEWDecode functions, respectively. INFO is
%   the training information that can be used to check the convergence
%   behavior of the training process.
%
%   [TX,RX,INFO,AE] = helperAEWTrainWirelessAutoencoder(...,TP) provides
%   training parameters as follows:
%     TP.Plots: Plots to display during network training defined as one of
%               'none' (default) or 'training-progress'.
%     TP.Verbose: Indicator to display training progress information
%               defined as 1 (true) or 0 (false) (default).
%     TP.MaxEpochs: Maximum number of epochs defined as a positive integer.
%               The default is 15.
%     TP.InitialLearnRate: Initial learning rate as a floating point number
%               between 0 and 1. The default is 0.01.
%     TP.LearnRateSchedule: Learning rate schedule defined as one of
%               'piecewise' (default) or 'none'.
%     TP.LearnRateDropPeriod: Number of epochs for dropping the learning
%               rate as a positive integer. The default is 10.
%     TP.LearnRateDropFactor: Factor for dropping the learning rate,
%               defined as a scalar between 0 and 1. The default is 0.1.
%     TP.MiniBatchSize: Size of the mini-batch to use for each training
%               iteration, defined as a positive integer. The default is
%               20*M.
%
%   See also helperAEWDecode, helperAEWNormalizationLayer, helperAEWAWGNLayer.

%   Copyright 2020 The MathWorks, Inc.

% Derived parameters
M = 2^k;
R = k/n;

if nargin > 4
trainParams = varargin{1};
else
% Set default training options. Set maximum epochs to 15. SGD requires a
% representative mini-batch that has enough symbols to achieve
% convergence. Therefore, increase the mini-batch size with M. Set the
% initial learning rate to 0.01 and reduce the learning rate by a factor
% of 10 every 10 epochs. Do not plot or print training progress.
trainParams.MaxEpochs = 15;
trainParams.MiniBatchSize = 20*M;
trainParams.InitialLearnRate = 0.01;
trainParams.LearnRateSchedule = 'piecewise';
trainParams.LearnRateDropPeriod = 10;
trainParams.LearnRateDropFactor = 0.1;
trainParams.Plots = 'none';
trainParams.Verbose = false;
end

% Convert Eb/No to channel Eb/No values using the code rate
EbNoChannel = EbNo + 10*log10(R);

% As the number of possible input symbols increase, we need to increase the
% number of training symbols to give the network a chance to experience a
% large number of possible input combinations. The same is true for number
% of validation symbols.
numTrainSymbols = 2500 * M;
numValidationSymbols = 100 * M;

% Define autoencoder network. Input is a one-hot vector of length M. The
% encoder has two fully connected layers. The first one has M inputs and M
% outputs and is followed by an ReLU layer. The second fully connected
% layer has M inputs and n outputs and is followed by the normalization
% layer. Normalization layer imposes constraints on the encoder output and
% available methods are energy and average power normalization. The encoder
% layers are followed by the AWGN channel layer. Set BitsPerSymbol to 2
% since two output values are mapped onto a complex symbol. Set the signal
% power to 1 since the normalization layer outputs signals with unity
% power. The output of the channel is passed to the decoder layers. The
% first decoder layer is a fully connected layer that has n inputs and M
% outputs and is followed by an ReLU layer. Second fully connected layer
% has M inputs and M outputs and is followed by a softmax layer. The output
% of the decoder is chosen as the most probable transmitted symbol from 0
% to M-1.
wirelessAutoEncoder = [
featureInputLayer(M,"Name","One-hot input","Normalization","none")

fullyConnectedLayer(M,"Name","fc_1")
reluLayer("Name","relu_1")

fullyConnectedLayer(n,"Name","fc_2")

helperAEWNormalizationLayer("Method", normalization)

helperAEWAWGNLayer("NoiseMethod","EbNo",...
"EbNo",EbNoChannel,...
"BitsPerSymbol",2,...
"SignalPower",1)

fullyConnectedLayer(M,"Name","fc_3")
reluLayer("Name","relu_2")

fullyConnectedLayer(M,"Name","fc_4")
softmaxLayer("Name","softmax")

classificationLayer("Name","classoutput")];

% Generate random training data. Create one-hot input vectors and labels.
d = randi([0 M-1],numTrainSymbols,1);
trainSymbols = zeros(numTrainSymbols,M);
trainSymbols(sub2ind([numTrainSymbols, M],...
(1:numTrainSymbols)',d+1)) = 1;
trainLabels = categorical(d);

% Generate random validation data. Create one-hot input vectors and labels.
d = randi([0 M-1],numValidationSymbols,1);
validationSymbols = zeros(numValidationSymbols,M);
validationSymbols(sub2ind([numValidationSymbols, M],...
(1:numValidationSymbols)',d+1)) = 1;
validationLabels = categorical(d);

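% Set training options. The line that creates the options object is missing
% from this listing; the 'adam' solver used here is an assumption.
options = trainingOptions('adam', ...
'InitialLearnRate',trainParams.InitialLearnRate, ...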
'MaxEpochs',trainParams.MaxEpochs, ...
'MiniBatchSize',trainParams.MiniBatchSize, ...
'Shuffle','every-epoch', ...
'ValidationData',{validationSymbols,validationLabels}, ...
'LearnRateSchedule', trainParams.LearnRateSchedule, ...
'LearnRateDropPeriod', trainParams.LearnRateDropPeriod, ...
'LearnRateDropFactor', trainParams.LearnRateDropFactor, ...
'Plots', trainParams.Plots, ...
'Verbose', trainParams.Verbose);

% Train the autoencoder network
[trainedNet,info] = trainNetwork(trainSymbols,trainLabels,wirelessAutoEncoder,options);

% Separate the network into encoder and decoder parts. Encoder starts with
% the input layer and ends after the normalization layer.
for idxNorm = 1:length(trainedNet.Layers)
if isa(trainedNet.Layers(idxNorm), 'helperAEWNormalizationLayer')
break
end
end
lgraph = addLayers(layerGraph(trainedNet.Layers(1:idxNorm)), ...
regressionLayer('Name', 'txout'));
lgraph = connectLayers(lgraph,'wnorm','txout');
txNet = assembleNetwork(lgraph);

% Decoder starts after the channel layer and ends with the classification
% layer. Add a feature input layer at the beginning.
for idxChan = idxNorm:length(trainedNet.Layers)
if isa(trainedNet.Layers(idxChan), 'helperAEWAWGNLayer')
break
end
end
firstLayerName = trainedNet.Layers(idxChan+1).Name;
n = trainedNet.Layers(idxChan+1).InputSize;
lgraph = addLayers(layerGraph(featureInputLayer(n,'Name','rxin')), ...
trainedNet.Layers(idxChan+1:end));
lgraph = connectLayers(lgraph,'rxin',firstLayerName);
rxNet = assembleNetwork(lgraph);

References

[1] T. O’Shea and J. Hoydis, "An Introduction to Deep Learning for the Physical Layer," in IEEE Transactions on Cognitive Communications and Networking, vol. 3, no. 4, pp. 563-575, Dec. 2017, doi: 10.1109/TCCN.2017.2758370.

[2] S. Dörner, S. Cammerer, J. Hoydis and S. ten Brink, "Deep Learning Based Communication Over the Air," in IEEE Journal of Selected Topics in Signal Processing, vol. 12, no. 1, pp. 132-143, Feb. 2018, doi: 10.1109/JSTSP.2017.2784180.