From 907de31990a040c69093a0272a6443f90c1a9755 Mon Sep 17 00:00:00 2001
From: David Rotermund <54365609+davrot@users.noreply.github.com>
Date: Wed, 6 Dec 2023 00:00:10 +0100
Subject: [PATCH] Update README.md

Signed-off-by: David Rotermund <54365609+davrot@users.noreply.github.com>
---
 pytorch/layers/README.md | 117 ++++++++++++++++++++++++++++++---------
 1 file changed, 90 insertions(+), 27 deletions(-)

diff --git a/pytorch/layers/README.md b/pytorch/layers/README.md
index 01787ae..8ec1d23 100644
--- a/pytorch/layers/README.md
+++ b/pytorch/layers/README.md
@@ -169,33 +169,96 @@ In the following I will mark the relevant layers.
 |||
 |---|---|
-|torch.nn.nn.ELU|Applies the Exponential Linear Unit (ELU) function, element-wise, as described in the paper: Fast and Accurate Deep Network Learning by Exponential Linear Units (ELUs).|
-|torch.nn.nn.Hardshrink|Applies the Hard Shrinkage (Hardshrink) function element-wise.|
-|torch.nn.nn.Hardsigmoid|Applies the Hardsigmoid function element-wise.|
-|torch.nn.nn.Hardtanh|Applies the HardTanh function element-wise.|
-|torch.nn.nn.Hardswish|Applies the Hardswish function, element-wise, as described in the paper: Searching for MobileNetV3.|
-|**[torch.nn.nn.LeakyReLU](https://pytorch.org/docs/stable/generated/torch.nn.LeakyReLU.html#torch.nn.LeakyReLU)**|Applies the element-wise function:|
-|torch.nn.nn.LogSigmoid|Applies the element-wise function:|
-|torch.nn.nn.MultiheadAttention |Allows the model to jointly attend to information from different representation subspaces as described in the paper: Attention Is All You Need.|
-|torch.nn.nn.PReLU|Applies the element-wise function:|
-|**[torch.nn.nn.ReLU](https://pytorch.org/docs/stable/generated/torch.nn.ReLU.html#torch.nn.ReLU)**|Applies the rectified linear unit function element-wise:|
-|torch.nn.nn.ReLU6|Applies the element-wise function:|
-|torch.nn.nn.RReLU|Applies the randomized leaky rectified liner unit function, element-wise, as described in the paper:|
-|torch.nn.nn.SELU|Applied element-wise...|
-|torch.nn.nn.CELU|Applies the element-wise function...|
-|torch.nn.nn.GELU|Applies the Gaussian Error Linear Units function:|
-|**[torch.nn.nn.Sigmoid](https://pytorch.org/docs/stable/generated/torch.nn.Sigmoid.html#torch.nn.Sigmoid)**|Applies the element-wise function:...|
-|torch.nn.nn.SiLU|Applies the Sigmoid Linear Unit (SiLU) function, element-wise.|
-|torch.nn.nn.Mish|Applies the Mish function, element-wise.|
-|torch.nn.nn.Softplus|Applies the Softplus function |
-|torch.nn.nn.Softshrink|Applies the soft shrinkage function elementwise:|
-|torch.nn.nn.Softsign|Applies the element-wise function:...|
-|**[torch.nn.nn.Tanh](https://pytorch.org/docs/stable/generated/torch.nn.Tanh.html#torch.nn.Tanh)**|Applies the Hyperbolic Tangent (Tanh) function element-wise.|
-|torch.nn.nn.Tanhshrink |Applies the element-wise function:|
-|torch.nn.nn.Threshold |Thresholds each element of the input Tensor.|
-|torch.nn.nn.GLU |Applies the gated linear unit function |
+|torch.nn.ELU|Applies the Exponential Linear Unit (ELU) function, element-wise, as described in the paper: Fast and Accurate Deep Network Learning by Exponential Linear Units (ELUs).|
+|torch.nn.Hardshrink|Applies the Hard Shrinkage (Hardshrink) function element-wise.|
+|torch.nn.Hardsigmoid|Applies the Hardsigmoid function element-wise.|
+|torch.nn.Hardtanh|Applies the HardTanh function element-wise.|
+|torch.nn.Hardswish|Applies the Hardswish function, element-wise, as described in the paper: Searching for MobileNetV3.|
+|**[torch.nn.LeakyReLU](https://pytorch.org/docs/stable/generated/torch.nn.LeakyReLU.html#torch.nn.LeakyReLU)**|Applies the element-wise function:|
+|torch.nn.LogSigmoid|Applies the element-wise function:|
+|torch.nn.MultiheadAttention|Allows the model to jointly attend to information from different representation subspaces as described in the paper: Attention Is All You Need.|
+|torch.nn.PReLU|Applies the element-wise function:|
+|**[torch.nn.ReLU](https://pytorch.org/docs/stable/generated/torch.nn.ReLU.html#torch.nn.ReLU)**|Applies the rectified linear unit function element-wise:|
+|torch.nn.ReLU6|Applies the element-wise function:|
+|torch.nn.RReLU|Applies the randomized leaky rectified linear unit function, element-wise, as described in the paper: Empirical Evaluation of Rectified Activations in Convolutional Network.|
+|torch.nn.SELU|Applies the SELU function element-wise.|
+|torch.nn.CELU|Applies the element-wise function...|
+|torch.nn.GELU|Applies the Gaussian Error Linear Units function:|
+|**[torch.nn.Sigmoid](https://pytorch.org/docs/stable/generated/torch.nn.Sigmoid.html#torch.nn.Sigmoid)**|Applies the element-wise function:...|
+|torch.nn.SiLU|Applies the Sigmoid Linear Unit (SiLU) function, element-wise.|
+|torch.nn.Mish|Applies the Mish function, element-wise.|
+|torch.nn.Softplus|Applies the Softplus function element-wise.|
+|torch.nn.Softshrink|Applies the soft shrinkage function element-wise:|
+|torch.nn.Softsign|Applies the element-wise function:...|
+|**[torch.nn.Tanh](https://pytorch.org/docs/stable/generated/torch.nn.Tanh.html#torch.nn.Tanh)**|Applies the Hyperbolic Tangent (Tanh) function element-wise.|
+|torch.nn.Tanhshrink|Applies the element-wise function:|
+|torch.nn.Threshold|Thresholds each element of the input Tensor.|
+|torch.nn.GLU|Applies the gated linear unit function.|
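+
+As a quick orientation, the marked activation functions can be tried out on a small tensor. This is only a minimal sketch; the tensor shape and the LeakyReLU slope of 0.1 are arbitrary example values, not something the layers require:
+
+```python
+import torch
+
+x = torch.linspace(-3.0, 3.0, steps=7)  # a few sample inputs between -3 and 3
+
+relu = torch.nn.ReLU()                # max(0, x)
+leaky_relu = torch.nn.LeakyReLU(0.1)  # 0.1 * x for negative inputs
+sigmoid = torch.nn.Sigmoid()          # squashes values into (0, 1)
+tanh = torch.nn.Tanh()                # squashes values into (-1, 1)
+
+print(relu(x))
+print(leaky_relu(x))
+print(sigmoid(x))
+print(tanh(x))
+```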
+
+## [Non-linear Activations (other)](https://pytorch.org/docs/stable/nn.html#non-linear-activations-other)
+
+|||
+|---|---|
+|torch.nn.Softmin|Applies the Softmin function to an n-dimensional input Tensor, rescaling it so that the elements of the n-dimensional output Tensor lie in the range [0, 1] and sum to 1.|
+|**[torch.nn.Softmax](https://pytorch.org/docs/stable/generated/torch.nn.Softmax.html#torch.nn.Softmax)**|Applies the Softmax function to an n-dimensional input Tensor, rescaling it so that the elements of the n-dimensional output Tensor lie in the range [0, 1] and sum to 1.|
+|torch.nn.Softmax2d|Applies SoftMax over features to each spatial location.|
+|torch.nn.LogSoftmax|Applies the LogSoftmax function to an n-dimensional input Tensor.|
+|torch.nn.AdaptiveLogSoftmaxWithLoss|Efficient softmax approximation as described in Efficient softmax approximation for GPUs by Edouard Grave, Armand Joulin, Moustapha Cissé, David Grangier, and Hervé Jégou.|
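+
+The Softmax layer needs to know along which dimension it should normalize. A minimal sketch (the batch size of 2 and the 5 scores per sample are made-up numbers):
+
+```python
+import torch
+
+logits = torch.randn(2, 5)         # 2 samples, 5 raw scores each
+
+softmax = torch.nn.Softmax(dim=1)  # normalize across the scores, not across the batch
+probabilities = softmax(logits)
+
+print(probabilities.sum(dim=1))    # each row sums to 1
+```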
+
+## [Normalization Layers](https://pytorch.org/docs/stable/nn.html#normalization-layers)
+
+|||
+|---|---|
+|**[torch.nn.BatchNorm1d](https://pytorch.org/docs/stable/generated/torch.nn.BatchNorm1d.html#torch.nn.BatchNorm1d)**|Applies Batch Normalization over a 2D or 3D input as described in the paper Batch Normalization: Accelerating Deep Network Training by Reducing Internal Covariate Shift.|
+|**[torch.nn.BatchNorm2d](https://pytorch.org/docs/stable/generated/torch.nn.BatchNorm2d.html#torch.nn.BatchNorm2d)**|Applies Batch Normalization over a 4D input (a mini-batch of 2D inputs with additional channel dimension) as described in the paper Batch Normalization: Accelerating Deep Network Training by Reducing Internal Covariate Shift.|
+|**[torch.nn.BatchNorm3d](https://pytorch.org/docs/stable/generated/torch.nn.BatchNorm3d.html#torch.nn.BatchNorm3d)**|Applies Batch Normalization over a 5D input (a mini-batch of 3D inputs with additional channel dimension) as described in the paper Batch Normalization: Accelerating Deep Network Training by Reducing Internal Covariate Shift.|
+|torch.nn.LazyBatchNorm1d|A torch.nn.BatchNorm1d module with lazy initialization of the num_features argument of the BatchNorm1d that is inferred from the input.size(1).|
+|torch.nn.LazyBatchNorm2d|A torch.nn.BatchNorm2d module with lazy initialization of the num_features argument of the BatchNorm2d that is inferred from the input.size(1).|
+|torch.nn.LazyBatchNorm3d|A torch.nn.BatchNorm3d module with lazy initialization of the num_features argument of the BatchNorm3d that is inferred from the input.size(1).|
+|torch.nn.GroupNorm|Applies Group Normalization over a mini-batch of inputs as described in the paper Group Normalization.|
+|torch.nn.SyncBatchNorm|Applies Batch Normalization over an N-dimensional input (a mini-batch of [N-2]D inputs with additional channel dimension) as described in the paper Batch Normalization: Accelerating Deep Network Training by Reducing Internal Covariate Shift.|
+|torch.nn.InstanceNorm1d|Applies Instance Normalization over a 2D (unbatched) or 3D (batched) input as described in the paper Instance Normalization: The Missing Ingredient for Fast Stylization.|
+|torch.nn.InstanceNorm2d|Applies Instance Normalization over a 4D input (a mini-batch of 2D inputs with additional channel dimension) as described in the paper Instance Normalization: The Missing Ingredient for Fast Stylization.|
+|torch.nn.InstanceNorm3d|Applies Instance Normalization over a 5D input (a mini-batch of 3D inputs with additional channel dimension) as described in the paper Instance Normalization: The Missing Ingredient for Fast Stylization.|
+|torch.nn.LazyInstanceNorm1d|A torch.nn.InstanceNorm1d module with lazy initialization of the num_features argument of the InstanceNorm1d that is inferred from the input.size(1).|
+|torch.nn.LazyInstanceNorm2d|A torch.nn.InstanceNorm2d module with lazy initialization of the num_features argument of the InstanceNorm2d that is inferred from the input.size(1).|
+|torch.nn.LazyInstanceNorm3d|A torch.nn.InstanceNorm3d module with lazy initialization of the num_features argument of the InstanceNorm3d that is inferred from the input.size(1).|
+|torch.nn.LayerNorm|Applies Layer Normalization over a mini-batch of inputs as described in the paper Layer Normalization.|
+|torch.nn.LocalResponseNorm|Applies local response normalization over an input signal composed of several input planes, where channels occupy the second dimension.|
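+
+A minimal sketch of the marked batch-normalization layers, here with BatchNorm1d (the batch size of 32 and the 10 features are arbitrary example values):
+
+```python
+import torch
+
+norm = torch.nn.BatchNorm1d(num_features=10)  # one mean/variance pair per feature
+
+x = torch.randn(32, 10)  # BatchNorm1d expects (batch, features) or (batch, features, length)
+y = norm(x)              # normalized over the batch, then scaled/shifted by learnable parameters
+
+print(y.mean(dim=0))     # per-feature means close to 0
+print(y.std(dim=0))      # per-feature standard deviations close to 1
+```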
+
+## [Recurrent Layers](https://pytorch.org/docs/stable/nn.html#recurrent-layers)
+
+|||
+|---|---|
+|torch.nn.RNNBase|Base class for RNN modules (RNN, LSTM, GRU).|
+|**[torch.nn.RNN](https://pytorch.org/docs/stable/generated/torch.nn.RNN.html#torch.nn.RNN)**|Applies a multi-layer Elman RNN with tanh or ReLU non-linearity to an input sequence.|
+|**[torch.nn.LSTM](https://pytorch.org/docs/stable/generated/torch.nn.LSTM.html#torch.nn.LSTM)**|Applies a multi-layer long short-term memory (LSTM) RNN to an input sequence.|
+|**[torch.nn.GRU](https://pytorch.org/docs/stable/generated/torch.nn.GRU.html#torch.nn.GRU)**|Applies a multi-layer gated recurrent unit (GRU) RNN to an input sequence.|
+|torch.nn.RNNCell|An Elman RNN cell with tanh or ReLU non-linearity.|
+|torch.nn.LSTMCell|A long short-term memory (LSTM) cell.|
+|torch.nn.GRUCell|A gated recurrent unit (GRU) cell.|
+
+## [Transformer Layers](https://pytorch.org/docs/stable/nn.html#transformer-layers)
+
+|||
+|---|---|
+|torch.nn.Transformer|A transformer model.|
+|torch.nn.TransformerEncoder|TransformerEncoder is a stack of N encoder layers.|
+|torch.nn.TransformerDecoder|TransformerDecoder is a stack of N decoder layers.|
+|torch.nn.TransformerEncoderLayer|TransformerEncoderLayer is made up of self-attn and feedforward network.|
+|torch.nn.TransformerDecoderLayer|TransformerDecoderLayer is made up of self-attn, multi-head-attn and feedforward network.|
+
+## [Linear Layers](https://pytorch.org/docs/stable/nn.html#linear-layers)
+
+|||
+|---|---|
+|**[torch.nn.Identity](https://pytorch.org/docs/stable/generated/torch.nn.Identity.html#torch.nn.Identity)**|A placeholder identity operator that is argument-insensitive.|
+|**[torch.nn.Linear](https://pytorch.org/docs/stable/generated/torch.nn.Linear.html#torch.nn.Linear)**|Applies a linear transformation to the incoming data.|
+|**[torch.nn.Bilinear](https://pytorch.org/docs/stable/generated/torch.nn.Bilinear.html#torch.nn.Bilinear)**|Applies a bilinear transformation to the incoming data.|
+|**[torch.nn.LazyLinear](https://pytorch.org/docs/stable/generated/torch.nn.LazyLinear.html#torch.nn.LazyLinear)**|A torch.nn.Linear module where in_features is inferred.|
-
-### Transformer Layers
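+
+To close, a minimal sketch that combines the marked recurrent and linear layers into a toy sequence classifier. The class name and all sizes are made up for illustration:
+
+```python
+import torch
+
+
+class SequenceClassifier(torch.nn.Module):
+    """Toy model: an LSTM encoder followed by a Linear read-out."""
+
+    def __init__(self, n_features: int = 8, n_hidden: int = 16, n_classes: int = 4) -> None:
+        super().__init__()
+        self.lstm = torch.nn.LSTM(
+            input_size=n_features, hidden_size=n_hidden, batch_first=True
+        )
+        self.readout = torch.nn.Linear(in_features=n_hidden, out_features=n_classes)
+
+    def forward(self, x: torch.Tensor) -> torch.Tensor:
+        output, (h_n, c_n) = self.lstm(x)  # h_n: (num_layers, batch, hidden)
+        return self.readout(h_n[-1])       # classify from the last hidden state
+
+
+model = SequenceClassifier()
+x = torch.randn(5, 20, 8)  # batch of 5 sequences, 20 time steps, 8 features each
+print(model(x).shape)      # torch.Size([5, 4])
+```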