# classificationLayer

Create a classification output layer

## Syntax

``coutputlayer = classificationLayer()``
``coutputlayer = classificationLayer('Name',Name)``

## Description

`coutputlayer = classificationLayer()` returns a classification output layer for a neural network. The classification output layer holds the name of the loss function that the software uses for training the network for multi-class classification, the size of the output, and the class labels.

`coutputlayer = classificationLayer('Name',Name)` returns a classification output layer with the name specified by `Name`.

## Examples

Create a classification output layer with the name `'coutput'`.

```
coutputlayer = classificationLayer('Name','coutput')
```

```
coutputlayer = 

  ClassificationOutputLayer with properties:

            Name: 'coutput'
      ClassNames: {1×0 cell}
      OutputSize: 'auto'

   Hyperparameters
    LossFunction: 'crossentropyex'
```

The default loss function for classification is cross entropy for mutually exclusive classes.

## Input Arguments

`'Name'`: Name for the layer, specified as the comma-separated pair consisting of `'Name'` and a character vector.

Example: `'Name','coutput'`

Data Types: `char`

## Output Arguments

`coutputlayer`: Classification output layer, returned as a `ClassificationOutputLayer` object.

For information on concatenating layers to construct a convolutional neural network architecture, see `Layer`.
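As a minimal sketch of where the classification output layer sits in a network, the following layer array places it last, directly after a softmax layer. The input size, filter settings, and the choice of 10 classes are illustrative assumptions and do not come from this page; the other layer functions are assumed to be available from the same toolbox.

```matlab
% Illustrative layer array; all sizes and the class count are assumptions.
layers = [
    imageInputLayer([28 28 1])          % 28-by-28 grayscale input (assumed)
    convolution2dLayer(5,20)            % 20 filters of size 5-by-5
    reluLayer
    maxPooling2dLayer(2,'Stride',2)
    fullyConnectedLayer(10)             % one output per class (10 assumed)
    softmaxLayer                        % converts activations to probabilities
    classificationLayer('Name','coutput')];
```

The classification output layer is the last element of the array, so `layers(end).Name` returns `'coutput'`.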

### Cross Entropy Function for k Mutually Exclusive Classes

For multi-class classification problems, the software assigns each input to one of k mutually exclusive classes. The loss (error) function for this case is the cross entropy function for a 1-of-k coding scheme [1]:

$$E(\theta) = -\sum_{i=1}^{n}\sum_{j=1}^{k} t_{ij} \ln y_j(x_i, \theta),$$

where $\theta$ is the parameter vector, $t_{ij}$ is the indicator that the $i$th sample belongs to the $j$th class, and $y_j(x_i, \theta)$ is the network output for class $j$ given sample $i$. The output $y_j(x_i, \theta)$ can be interpreted as the probability that the network associates the $i$th input with class $j$, that is, $P(t_j = 1 \mid x_i)$.
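To make the double sum concrete, here is a small worked sketch in base MATLAB. The matrices `Y` and `T` are hypothetical data for n = 3 samples and k = 3 classes; the code evaluates the formula above as written and is not a reproduction of the software's internal implementation.

```matlab
% Hypothetical predictions: row i holds y_j(x_i, theta) for j = 1..k.
Y = [0.7 0.2 0.1;
     0.1 0.8 0.1;
     0.2 0.3 0.5];

% 1-of-k targets: T(i,j) = 1 if sample i belongs to class j, else 0.
T = [1 0 0;
     0 1 0;
     0 0 1];

% E = -sum_i sum_j t_ij * ln(y_j(x_i, theta))
E = -sum(sum(T .* log(Y)));
```

Because each row of `T` has a single 1, only the predicted probability of the true class contributes for each sample, so here `E` equals `-(log(0.7) + log(0.8) + log(0.5))`, which is approximately 1.2730.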

The output unit activation function is the softmax function:

$$y_r(x) = \frac{\exp(a_r(x))}{\sum_{j=1}^{k} \exp(a_j(x))},$$

where $0\le {y}_{r}\le 1$ and $\sum _{j=1}^{k}{y}_{j}=1$.
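The softmax function can be sketched in base MATLAB as follows. The activation vector `a` is a hypothetical example with k = 3, and subtracting `max(a)` before exponentiating is a common numerical-stability step that is not mentioned on this page; it leaves the result unchanged because the same factor cancels in the numerator and denominator.

```matlab
% Hypothetical activations a_r(x) for k = 3 classes.
a = [2.0 1.0 0.1];

% Shift by the maximum to avoid overflow in exp; the ratio is unaffected.
expA = exp(a - max(a));

% y_r = exp(a_r) / sum_j exp(a_j)
y = expA / sum(expA);
```

Each element of `y` lies between 0 and 1 and the elements sum to 1, so `y` can be read as a vector of class probabilities (here approximately `[0.6590 0.2424 0.0986]`).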

## References

