pytutorial/python_basics/python_typing/README.md

433 lines
12 KiB
Markdown
Raw Permalink Normal View History

# Type annotations and static type checking for Python
{:.no_toc}
<nav markdown="1" class="toc-class">
* TOC
{:toc}
</nav>
## Goal
We want to use static type checking and type annotations in our Python code for detecting errors we made. We will use the mypy extension in VS code for that.
```python
a: int = 0
b: float = 0.0
a = b Incompatible types in assignment (expression has type "float", variable has type "int")
```
Questions to [David Rotermund](mailto:davrot@uni-bremen.de)
## Why [Type hints](https://peps.python.org/pep-0484/)?
Why we got type hints according PEP 484 -- Type Hints (29-Sep-2014):
> [Rationale and Goals](https://www.python.org/dev/peps/pep-0484/#id11)
>
> This PEP aims to provide a standard syntax for type annotations, opening up Python code to easier static analysis and refactoring, potential runtime type checking, and (perhaps, in some contexts) code generation utilizing type information.
>
> Of these goals, static analysis is the most important. This includes support for off-line type checkers such as mypy, as well as providing a standard notation that can be used by IDEs for code completion and refactoring.
>
> [...]
>
> It should also be emphasized that Python will remain a dynamically typed language, and the authors have no desire to ever make type hints mandatory, even by convention.
I would redefine this list a bit:
* It is a part of your automatic documentation (like with meaningful variable names). If another person gets your source code they understand it easier.
* You editor might thank you. Do to some new features in Python 3.10, the modern editors that do syntax highlighting and error checking have a harder time to infer what you mean. The more it need to think about what you mean, the slower your editor might get or even fail to show you syntax highlighting.
* Static code analysis is really helpful. It showed me any problems ahead that I would have figured out the hard way otherwise.
* Packages like the just-in-time compiler numba can produce better results if you can tell it what the variables are.
## How do we do it?
Variables are assigned to a type the ***first*** time when used or can be defined even before use:
```python
a: int
b: int = 0
```
You are allowed to connect a variable ***once and only once*** to a type. If you assign a type a second time to a variable then you will get an error and have to remove the second assignment.
For functions it looks a bit different because we have to handle the type of the return value with the **->** construct:
```python
def this_is_a_function() -> None:
pass
def this_is_a_function() -> int:
return 5
def this_is_a_function(a: int) -> int:
return a
def this_is_a_function(a: int, b: int = 8) -> int:
return a + b
def this_is_a_function(a: int, b: int = 8) -> tuple[int, int]:
return a, b
```
Please note, that there is a difference how type annotations worked for older version. I will cover only Python 3.10 and newer. The official documentation can be found [here](https://docs.python.org/3/library/typing.html).
## Ignore a line
You can force MyPy to ignore a line if you add the comment # type: ignore to that line.
Please check [Common issues and solutions](https://mypy.readthedocs.io/en/stable/common_issues.html) if you run in strange problems.
## MyPy (command line)
```shell
pip install mypy
```
If you don't vs code then you can use the mypy command line tool:
```shell
mypy test2.py
```
```shell
test2.py:3: error: Incompatible types in assignment (expression has type "float", variable has type "int")
Found 1 error in 1 file (checked 1 source file)
```
## MyPy under VS Code
Go under Extensions and install the **Mypy Type Checker** from **Microsoft**.
![figure 1](image0.png)
## [Built-in types](https://mypy.readthedocs.io/en/latest/builtin_types.html)
* If the type starts with an upper letter then you might import it from the typing module like
```python
from typing import Any
```
* If you have no clue what type something has, well use **type()**:
```python
import numpy as np
import torch
def func() -> None:
return
a = 0
b = np.zeros((10,))
c = torch.zeros((10, 1))
d = func
print(type(a))
print(type(b))
print(type(c))
print(type(d))
```
Output:
```python
<class 'int'>
<class 'numpy.ndarray'>
<class 'torch.Tensor'>
<class 'function'>
```
The correct typing would have been:
```python
import numpy as np
import torch
from typing import Callable
def func() -> None:
return
a: int = 0
b: np.ndarray = np.zeros((10,))
c: torch.Tensor = torch.zeros((10, 1))
d: Callable = func
```
As you can see, we had to change **b** a bit because we didn't use **import numpy** but used **import numpy as np**. Thus we had to use **np.ndarray** instead of **numpy.ndarray**.
Concerning **<class 'function'>**, this is a specical case. And requires an import from the typing module via **from typing import Callable**. More about that later.
### [Simple types](https://mypy.readthedocs.io/en/latest/builtin_types.html#simple-types)
> Here are examples of some common built-in types:
|Type|Description|
|---|---|
|int|integer|
|float|floating point number|
|bool|boolean value (subclass of int)|
|str|text, sequence of unicode codepoints|
|bytes|8-bit string, sequence of byte values|
|object|an arbitrary object (object is the common base class)|
```python
a: int = 0
b: float = 0.0
c: bool = True
d: str = "LaLa"
```
### [Any type](https://mypy.readthedocs.io/en/latest/builtin_types.html#any-type)
Special type indicating an unconstrained type.
```python
from typing import Any
a: Any = 0
b: float = 0.0
a = b
```
### [Generic types](https://mypy.readthedocs.io/en/latest/builtin_types.html#generic-types)
|Type|Description|
|---|---|
|list[str]|list of str objects|
|tuple[int, int]|tuple of two int objects (tuple[()] is the empty tuple)|
|tuple[int, ...]|tuple of an arbitrary number of int objects|
|dict[str, int]|dictionary from str keys to int values|
|Iterable[int]|iterable object containing ints|
|Sequence[bool]|sequence of booleans (read-only)|
|Mapping[str, int]|mapping from str keys to int values (read-only)|
|type[C]|type object of C (C is a class/type variable/union of types)|
Examples:
```python
la: list = ["a", 1, 3.3]
ta: tuple = ("a", 1, 3.3)
tb: tuple[str, int, float] = ("a", 1, 3.3)
```
Wrong:
```python
la: list[str, int, float] = ["a", 1, 3.3]
```
Correct:
```python
la: list[str | int | float] = ["a", 1, 3.3]
```
## [\|](https://docs.python.org/3/library/typing.html#typing.Union)
In the case you expect a variable that can have differnt types over it's lifetime. Let us say you initialize it with None and later want to store integer in it:
```python
a: None | int = None
```
An other example is this:
```python
import torch
import numpy as np
a: np.ndarray | torch.Tensor = torch.zeros((100,))
```
This is called a [Union](https://docs.python.org/3/library/typing.html#typing.Union). The Union with **None** is called [Optional](https://docs.python.org/3/library/typing.html#typing.Optional). But nowadays you just need to remember **\|**.
In the real world I encountered this problem:
```python
a: int | None = 1
b: int
b = a # Incompatible types in assignment (expression has type "int | None", variable has type "int")
```
The solution is to use [assert](https://docs.python.org/3/reference/simple_stmts.html#the-assert-statement):
```python
a: int | None = 1
b: int
assert a is not None
b = a
```
## [TypeAlias](https://docs.python.org/3/library/typing.html#typing.TypeAlias)
You can create an alias for more complicated types
```python
from typing import TypeAlias
Numbis: TypeAlias = int | float
a: Numbis
a = 1
a = 1.1
a = "Hello" # Incompatible types in assignment (expression has type "str", variable has type "int | float")
```
## [Tuple](https://docs.python.org/3/library/typing.html#typing.Tuple)
```python
a: tuple[int, str, int] = (5, "Hello", 6)
a = (
"Hello",
4,
4,
) # Incompatible types in assignment (expression has type "Tuple[str, int, int]", variable has type "Tuple[int, str, int]")
```
Or if you don't care about what is in the tuple
```python
a: tuple = (5, "Hello", 6)
a = ("Hello", 4, 4)
```
## [List](https://docs.python.org/3/library/typing.html#typing.List)
A generic list:
```python
mylist: list = []
mylist.append(1)
mylist.append(2)
mylist.append("Hello")
```
Or defining more details about the list:
```python
mylist: list[int] = []
mylist.append(1)
mylist.append(2)
mylist.append(
"Hello"
) # Argument 1 to "append" of "list" has incompatible type "str"; expected "int"
print(mylist) # -> [1, 2, 'Hello']
```
## [Dict](https://docs.python.org/3/library/typing.html#typing.Dict)
Generic dictionary:
```python
mydict: dict = {"A": 1, "B": 3.14, "C": "Hello"}
```
We can give it more information. However, we have to be careful to include the types correctly.
This is wrong:
```python
mydict_a: dict[str, int] = {"A": 1, "B": 3.14, "C": "Hello", 1: 1}
```
We get these errors:
```python
Dict entry 1 has incompatible type "str": "float"; expected "str": "int"
Dict entry 2 has incompatible type "str": "str"; expected "str": "int"
Dict entry 3 has incompatible type "int": "int"; expected "str": "int"
```
These are correct ways to handle it:
```python
mydict_a: dict[str | int, str | int | float] = {"A": 1, "B": 3.14, "C": "Hello", 1: 1}
mydict_b: dict[str, str | int | float] = {"A": 1, "B": 3.14, "C": "Hello", "1": 1}
```
## [Numpy](https://numpy.org/devdocs/reference/typing.html)
Generic:
```python
import numpy as np
a: np.ndarray = np.zeros((10, 1))
```
Protecting against a wrong dtype:
```python
import numpy as np
from typing import Any
a: np.ndarray[Any, np.dtype[np.uint64]]
a = np.zeros((10, 1), dtype=np.uint64)
a = np.zeros((10, 1)) # -> Incompatible types in assignment (expression has type "ndarray[Any, dtype[floating[_64Bit]]]", variable has type "ndarray[Any, dtype[unsignedinteger[_64Bit]]]")
```
## PyTorch
Please note the big T!
```python
import torch
a: torch.Tensor = torch.zeros((10, 1))
```
## [Callable](https://docs.python.org/3/library/typing.html#annotating-callable-objects)
Callable means "function". As you know, you can shove functions objects around and also use then as function arguments of other function. It is helpful to make sure that the function you get has the properties you expect.
```python
Callable[[Arg1Type, Arg2Type], ReturnType]
```
```python
from typing import Callable
def function_a(x: int) -> int:
return x + 1
def function_a_bad(x: int, y: int) -> int:
return x + y
def function_b(x, other_function: Callable[[int], int]) -> int:
return other_function(x) ** 2
print(function_b(1, function_a)) # -> 4
print(function_b(1, function_b)) # -> Argument 2 to "function_b" has incompatible type "Callable[[Any, Callable[[int], int]], int]"; expected "Callable[[int], int]"
```
## [NewType](https://docs.python.org/3/library/typing.html#newtype), [Generics](https://docs.python.org/3/library/typing.html#generics), [User-defined generic types](https://docs.python.org/3/library/typing.html#user-defined-generic-types)
{: .topic-optional}
This is an optional topic!
Well... this exists... never used it.
|||
|---|---|
|NewType | Use the NewType helper to create distinct types|
|Generics| Since type information about objects kept in containers cannot be statically inferred in a generic way, many container classes in the standard library support subscription to denote the expected types of container elements. |
|User-defined generic types|A user-defined class can be defined as a generic class.|
## References
* [Type hints cheat sheet](https://mypy.readthedocs.io/en/stable/cheat_sheet_py3.html)
* [typing — Support for type hints](https://docs.python.org/3/library/typing.html)