Update README.md

Signed-off-by: David Rotermund <54365609+davrot@users.noreply.github.com>
David Rotermund 2023-12-20 17:56:36 +01:00 committed by GitHub
parent 0b078059ab
commit 27e1d3d962


@ -36,35 +36,39 @@ gzip -d *.gz
### Label file structure
-> # [offset] [type] [value] [description]
-> # 0000 32 bit integer 0x00000801(2049) magic number (MSB first)
-> # 0004 32 bit integer 60000 number of items
-> # 0008 unsigned byte ?? label
-> # 0009 unsigned byte ?? label
-> # ........
-> # xxxx unsigned byte ?? label
-> # The labels values are 0 to 9.
+> [offset] [type] [value] [description]
+> 0000 32 bit integer 0x00000801(2049) magic number (MSB first)
+> 0004 32 bit integer 60000 number of items
+> 0008 unsigned byte ?? label
+> 0009 unsigned byte ?? label
+> ........
+> xxxx unsigned byte ?? label
+The labels values are 0 to 9.
### Pattern file structure
-> # [offset] [type] [value] [description]
-> # 0000 32 bit integer 0x00000803(2051) magic number
-> # 0004 32 bit integer 60000 number of images
-> # 0008 32 bit integer 28 number of rows
-> # 0012 32 bit integer 28 number of columns
-> # 0016 unsigned byte ?? pixel
-> # 0017 unsigned byte ?? pixel
-> # ........
-> # xxxx unsigned byte ?? pixel
-> # Pixels are organized row-wise.
-> # Pixel values are 0 to 255. 0 means background (white),
-> # 255 means foreground (black).
+> [offset] [type] [value] [description]
+> 0000 32 bit integer 0x00000803(2051) magic number
+> 0004 32 bit integer 60000 number of images
+> 0008 32 bit integer 28 number of rows
+> 0012 32 bit integer 28 number of columns
+> 0016 unsigned byte ?? pixel
+> 0017 unsigned byte ?? pixel
+> ........
+> xxxx unsigned byte ?? pixel
+##
+Pixels are organized row-wise.
+Pixel values are 0 to 255. 0 means background (white),
+255 means foreground (black).
## Converting the dataset to numpy
My source code for that task: convert.py
# %%
```python
import numpy as np
# [offset] [type] [value] [description]
@ -199,11 +203,11 @@ def proprocess_dataset(testdata_mode: bool) -> None:
proprocess_dataset(testdata_mode=True)
proprocess_dataset(testdata_mode=False)
```
Now we have the files:
-test_label_storage.npy
-test_pattern_storage.npy
-train_label_storage.npy
-train_pattern_storage.npy
+* test_label_storage.npy
+* test_pattern_storage.npy
+* train_label_storage.npy
+* train_pattern_storage.npy
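
The label and pattern file structures described in the README text above can be decoded with a few lines of NumPy. The following is only a minimal sketch of that layout, not the contents of convert.py (which this diff elides). The function names `parse_label_bytes` and `parse_pattern_bytes` are hypothetical, and the header fields are assumed to be big-endian, as the "MSB first" note suggests. The demo runs on small synthetic byte strings instead of the real dataset files.

```python
import struct

import numpy as np


def parse_label_bytes(raw: bytes) -> np.ndarray:
    """Decode a label file: 8-byte header, then one unsigned byte per label."""
    # ">ii" reads two big-endian (MSB first) 32 bit integers: magic, item count.
    magic, n_items = struct.unpack(">ii", raw[:8])
    if magic != 2049:  # 0x00000801
        raise ValueError(f"unexpected label magic number: {magic}")
    # Labels start at offset 8, one uint8 each.
    return np.frombuffer(raw, dtype=np.uint8, count=n_items, offset=8)


def parse_pattern_bytes(raw: bytes) -> np.ndarray:
    """Decode a pattern file: 16-byte header, then row-wise uint8 pixels."""
    magic, n_images, n_rows, n_cols = struct.unpack(">iiii", raw[:16])
    if magic != 2051:  # 0x00000803
        raise ValueError(f"unexpected pattern magic number: {magic}")
    pixels = np.frombuffer(
        raw, dtype=np.uint8, count=n_images * n_rows * n_cols, offset=16
    )
    # Pixels are organized row-wise, so a plain reshape restores the images.
    return pixels.reshape(n_images, n_rows, n_cols)


# Synthetic stand-ins for the real files (assumption: same layout, tiny sizes).
demo_label_file = struct.pack(">ii", 2049, 3) + bytes([7, 2, 1])
demo_pattern_file = struct.pack(">iiii", 2051, 2, 2, 3) + bytes(range(12))

labels = parse_label_bytes(demo_label_file)
patterns = parse_pattern_bytes(demo_pattern_file)
print(labels.shape, patterns.shape)
```

With the real, decompressed files, the same functions could be fed the bytes read via `open(path, "rb").read()`, and the resulting arrays written with `np.save` under the file names listed above.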