Create README.md

Signed-off-by: David Rotermund <54365609+davrot@users.noreply.github.com>
2025-07-01 23:00:01 +02:00 · 2024-01-02 20:51:54 +01:00 · 2024-01-02 20:51:54 +01:00 · 4373fc8872
commit 4373fc8872
parent ae71a3d945
1 changed files with 129 additions and 0 deletions
--- a/pytorch/train/README.md
+++ b/pytorch/train/README.md
@ -0,0 +1,129 @@
+# Train the network 
+{:.no_toc}
+
+<nav markdown="1" class="toc-class">
+* TOC
+{:toc}
+</nav>
+
+## Top
+
+Training the weights and biases of the network.
+
+Questions to [David Rotermund](mailto:davrot@uni-bremen.de)
+
+## An example setup
+
+### Network
+
+```python
+import torch
+
+# Some parameters
+input_number_of_channel: int = 1
+input_dim_x: int = 24
+input_dim_y: int = 24
+
+number_of_output_channels_conv1: int = 32
+number_of_output_channels_conv2: int = 64
+number_of_output_channels_flatten1: int = 576
+number_of_output_channels_full1: int = 10
+
+kernel_size_conv1: tuple[int, int] = (5, 5)
+kernel_size_pool1: tuple[int, int] = (2, 2)
+kernel_size_conv2: tuple[int, int] = (5, 5)
+kernel_size_pool2: tuple[int, int] = (2, 2)
+
+stride_conv1: tuple[int, int] = (1, 1)
+stride_pool1: tuple[int, int] = (2, 2)
+stride_conv2: tuple[int, int] = (1, 1)
+stride_pool2: tuple[int, int] = (2, 2)
+
+padding_conv1: int = 0
+padding_pool1: int = 0
+padding_conv2: int = 0
+padding_pool2: int = 0
+
+network = torch.nn.Sequential(
+    torch.nn.Conv2d(
+        in_channels=input_number_of_channel,
+        out_channels=number_of_output_channels_conv1,
+        kernel_size=kernel_size_conv1,
+        stride=stride_conv1,
+        padding=padding_conv1,
+    ),
+    torch.nn.ReLU(),
+    torch.nn.MaxPool2d(
+        kernel_size=kernel_size_pool1, stride=stride_pool1, padding=padding_pool1
+    ),
+    torch.nn.Conv2d(
+        in_channels=number_of_output_channels_conv1,
+        out_channels=number_of_output_channels_conv2,
+        kernel_size=kernel_size_conv2,
+        stride=stride_conv2,
+        padding=padding_conv2,
+    ),
+    torch.nn.ReLU(),
+    torch.nn.MaxPool2d(
+        kernel_size=kernel_size_pool2, stride=stride_pool2, padding=padding_pool2
+    ),
+    torch.nn.Flatten(
+        start_dim=1,
+    ),
+    torch.nn.Linear(
+        in_features=number_of_output_channels_flatten1,
+        out_features=number_of_output_channels_full1,
+        bias=True,
+    ),
+    torch.nn.Softmax(dim=1),
+)
+```
+
+### Data augmentation
+
+```python
+import torchvision
+
+test_processing_chain = torchvision.transforms.Compose(
+    transforms=[torchvision.transforms.CenterCrop((24, 24))],
+)
+
+train_processing_chain = torchvision.transforms.Compose(
+    transforms=[torchvision.transforms.RandomCrop((24, 24))],
+)
+```
+
+## [What makes it learn?](https://pytorch.org/docs/stable/optim.html)
+
+### [Optimizer Algorithms](https://pytorch.org/docs/stable/optim.html#algorithms)
+
+This is just a small selection of optimizers (i.e. the algorithm that learns the weights based on a loss). Nevertheless, typically [Adam](https://pytorch.org/docs/stable/generated/torch.optim.Adam.html#torch.optim.Adam) or [SGD](https://pytorch.org/docs/stable/generated/torch.optim.SGD.html#torch.optim.SGD) will be the first algorithm you will try.
+
+|||
+|---|---|
+|[Adagrad](https://pytorch.org/docs/stable/generated/torch.optim.Adagrad.html#torch.optim.Adagrad)|	Implements Adagrad algorithm.|
+|[Adam](https://pytorch.org/docs/stable/generated/torch.optim.Adam.html#torch.optim.Adam)|	Implements Adam algorithm.|
+|[ASGD](https://pytorch.org/docs/stable/generated/torch.optim.ASGD.html#torch.optim.ASGD)|	Implements Averaged Stochastic Gradient Descent.|
+|[RMSprop](https://pytorch.org/docs/stable/generated/torch.optim.RMSprop.html#torch.optim.RMSprop)|	Implements RMSprop algorithm.|
+|[Rprop](https://pytorch.org/docs/stable/generated/torch.optim.Rprop.html#torch.optim.Rprop)|	Implements the resilient backpropagation algorithm.|
+|[SGD](https://pytorch.org/docs/stable/generated/torch.optim.SGD.html#torch.optim.SGD)https://pytorch.org/docs/stable/generated/torch.optim.SGD.html#torch.optim.SGD|	Implements stochastic gradient descent (optionally with momentum).|
+
+
+### [Learning rate scheduler](https://pytorch.org/docs/stable/optim.html#how-to-adjust-learning-rate)
+
+"torch.optim.lr_scheduler provides several methods to adjust the learning rate based on the number of epochs."
+
+Why do you want to reduce the learning rate: Well, typically you want to start with a large learning rate for jumping over local minima but later you want to anneal the learning rate because otherwise the optimizer will jump over / oscillate around the minima.  
+
+A non-representative selection is
+
+|||
+|---|---|
+|[lr_scheduler.StepLR](https://pytorch.org/docs/stable/generated/torch.optim.lr_scheduler.StepLR.html#torch.optim.lr_scheduler.StepLR)|	Decays the learning rate of each parameter group by gamma every step_size epochs.|
+|[lr_scheduler.MultiStepLR](https://pytorch.org/docs/stable/generated/torch.optim.lr_scheduler.MultiStepLR.html#torch.optim.lr_scheduler.MultiStepLR)|	Decays the learning rate of each parameter group by gamma once the number of epoch reaches one of the milestones.|
+|[lr_scheduler.ConstantLR](https://pytorch.org/docs/stable/generated/torch.optim.lr_scheduler.ConstantLR.html#torch.optim.lr_scheduler.ConstantLR)|	Decays the learning rate of each parameter group by a small constant factor until the number of epoch reaches a pre-defined milestone: total_iters.|
+|[lr_scheduler.LinearLR](https://pytorch.org/docs/stable/generated/torch.optim.lr_scheduler.LinearLR.html#torch.optim.lr_scheduler.LinearLR)|	Decays the learning rate of each parameter group by linearly changing small multiplicative factor until the number of epoch reaches a pre-defined milestone: total_iters.|
+|[lr_scheduler.ExponentialLR](https://pytorch.org/docs/stable/generated/torch.optim.lr_scheduler.ExponentialLR.html#torch.optim.lr_scheduler.ExponentialLR)	|Decays the learning rate of each parameter group by gamma every epoch.|
+|[lr_scheduler.ReduceLROnPlateau](https://pytorch.org/docs/stable/generated/torch.optim.lr_scheduler.ReduceLROnPlateau.html#torch.optim.lr_scheduler.ReduceLROnPlateau)|	Reduce learning rate when a metric has stopped improving.|
+
+However, typically I only use [lr_scheduler.ReduceLROnPlateau](https://pytorch.org/docs/stable/generated/torch.optim.lr_scheduler.ReduceLROnPlateau.html#torch.optim.lr_scheduler.ReduceLROnPlateau).