Signed-off-by: David Rotermund <54365609+davrot@users.noreply.github.com>
Python -- Type annotations and static type checking for Python
Goal
We want to use type annotations and static type checking in our Python code to detect errors before the code runs. We will use the tool mypy for that.
a: int = 0
b: float = 0.0
a = b  # Incompatible types in assignment (expression has type "float", variable has type "int")
Questions to David Rotermund
Why Type hints?
Why we got type hints, according to PEP 484 -- Type Hints (29-Sep-2014):
This PEP aims to provide a standard syntax for type annotations, opening up Python code to easier static analysis and refactoring, potential runtime type checking, and (perhaps, in some contexts) code generation utilizing type information.
Of these goals, static analysis is the most important. This includes support for off-line type checkers such as mypy, as well as providing a standard notation that can be used by IDEs for code completion and refactoring.
[...]
It should also be emphasized that Python will remain a dynamically typed language, and the authors have no desire to ever make type hints mandatory, even by convention.
I would redefine this list a bit:
- It is a part of your automatic documentation (like meaningful variable names). If another person gets your source code, they will understand it more easily.
- Your editor might thank you. Due to some new features in Python 3.10, modern editors that do syntax highlighting and error checking have a harder time inferring what you mean. The more the editor needs to think about what you mean, the slower it might get, or it may even fail to show you syntax highlighting.
- Static code analysis is really helpful. It has shown me many problems ahead of time that I would otherwise have found the hard way.
- Packages like the just-in-time compiler numba can produce better results if you tell them what types the variables have.
How do we do it?
A variable is given its type the first time it is used, or the type can be declared even before the variable is used:
a: int
b: int = 0
You are allowed to connect a variable to a type once and only once. If you annotate the same variable a second time, you will get an error and have to remove the second annotation.
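A minimal sketch of this rule. The names counter and label are made up for illustration. The annotation appears once, and later assignments simply reuse the name. The commented-out line shows the kind of second annotation that mypy is expected to reject (the exact message may differ between mypy versions).

```python
# Annotate each variable once; later assignments do not repeat the annotation.
counter: int = 0
label: str  # declared before use, no value assigned yet

counter = counter + 1  # fine: the value is still an int
label = "run"          # fine: the first value matches the declared type

# counter: int = 5     # a second annotation like this is what mypy complains about

print(counter, label)
```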
For functions it looks a bit different because we have to handle the type of the return value with the -> construct:
def this_is_a_function() -> None:
    pass
def this_is_a_function() -> int:
    return 5
def this_is_a_function(a: int) -> int:
    return a
def this_is_a_function(a: int, b: int = 8) -> int:
    return a + b
def this_is_a_function(a: int, b: int = 8) -> tuple[int, int]:
    return a, b
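The following sketch shows what happens to such annotations at runtime. The function name add_numbers is made up for illustration. Python stores the annotations on the function, where typing.get_type_hints can read them, but it does not enforce them when the function is called. Only a static checker like mypy would complain about the last call.

```python
from typing import get_type_hints


def add_numbers(a: int, b: int = 8) -> int:
    return a + b


# The annotations are stored on the function object.
hints = get_type_hints(add_numbers)
print(hints)

print(add_numbers(2))       # uses the default b = 8
print(add_numbers(1.5, 2))  # runs anyway; mypy would report the float argument
```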
Please note that type annotations worked differently in older Python versions. I will cover only Python 3.10 and newer. The official documentation can be found here.
MyPy under VS Code
(also the header packages)
Built-in types
- If the type name starts with an uppercase letter, then you may need to import it from the typing module, like
from typing import Any
- If you have no clue what type something has, use type():
import numpy as np
import torch
def func() -> None:
    return
a = 0
b = np.zeros((10,))
c = torch.zeros((10, 1))
d = func
print(type(a))
print(type(b))
print(type(c))
print(type(d))
Output:
<class 'int'>
<class 'numpy.ndarray'>
<class 'torch.Tensor'>
<class 'function'>
The correct typing would have been:
import numpy as np
import torch
from typing import Callable
def func() -> None:
    return
a: int = 0
b: np.ndarray = np.zeros((10,))
c: torch.Tensor = torch.zeros((10, 1))
d: Callable = func
As you can see, we had to change b a bit because we didn't use import numpy but used import numpy as np. Thus we had to use np.ndarray instead of numpy.ndarray.
Concerning <class 'function'>, this is a special case. It requires an import from the typing module via from typing import Callable. More about that later.
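As a small preview, here is a sketch of how Callable can also describe the parameters and the return value of a function. The names double and apply_twice are made up for illustration. Callable[[int], int] means "a function that takes one int and returns an int".

```python
from typing import Callable


def double(x: int) -> int:
    return 2 * x


def apply_twice(func: Callable[[int], int], value: int) -> int:
    # func is known to take an int and return an int
    return func(func(value))


operation: Callable[[int], int] = double
print(apply_twice(operation, 3))
```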
Simple types
Here are examples of some common built-in types:
Type | Description |
---|---|
int | integer |
float | floating point number |
bool | boolean value (subclass of int) |
str | text, sequence of unicode codepoints |
bytes | 8-bit string, sequence of byte values |
object | an arbitrary object (object is the common base class) |
a: int = 0
b: float = 0.0
c: bool = True
d: str = "LaLa"
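The table notes that bool is a subclass of int. A short sketch of what this means in practice (the variable names are made up for illustration): a bool is accepted wherever an int is expected. mypy also accepts an int where a float is expected, although the runtime value stays an int.

```python
flag: bool = True
count: int = flag  # accepted: bool is a subclass of int
ratio: float = 3   # accepted by mypy: an int may be used where a float is expected

print(isinstance(flag, int))
print(flag + 1)
print(type(ratio))  # the annotation does not convert the value
```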
Any type
Special type indicating an unconstrained type.
from typing import Any
a: Any = 0
b: float = 0.0
a = b
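To show what "unconstrained" means, here is a sketch that compares Any with object. The function names are made up for illustration. With Any the checker accepts every operation on the value. With object it only allows what every object supports, so the type has to be narrowed first.

```python
from typing import Any


def describe_any(value: Any) -> str:
    # Any switches checking off: mypy accepts any method call here,
    # even if it fails at runtime for some values.
    return value.upper()


def describe_object(value: object) -> str:
    # object is checked: narrow the type before using str methods.
    if isinstance(value, str):
        return value.upper()
    return repr(value)


print(describe_any("abc"))
print(describe_object("abc"))
print(describe_object(5))
```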
Generic types
Type | Description |
---|---|
list[str] | list of str objects |
tuple[int, int] | tuple of two int objects (tuple[()] is the empty tuple) |
tuple[int, ...] | tuple of an arbitrary number of int objects |
dict[str, int] | dictionary from str keys to int values |
Iterable[int] | iterable object containing ints |
Sequence[bool] | sequence of booleans (read-only) |
Mapping[str, int] | mapping from str keys to int values (read-only) |
type[C] | type object of C (C is a class/type variable/union of types) |
Examples:
la: list = ["a", 1, 3.3]
ta: tuple = ("a", 1, 3.3)
tb: tuple[str, int, float] = ("a", 1, 3.3)
Wrong:
la: list[str, int, float] = ["a", 1, 3.3]
Correct:
la: list[str | int | float] = ["a", 1, 3.3]
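The table also lists Iterable, Sequence and Mapping. A sketch of how they can be used for function parameters follows. The function names are made up, and importing from collections.abc is one option (the typing module offers the same names). These read-only types let a function accept lists, tuples and other containers alike.

```python
from collections.abc import Iterable, Mapping, Sequence


def total(values: Iterable[int]) -> int:
    # Works for anything we can loop over: list, tuple, generator, ...
    return sum(values)


def first_flag(flags: Sequence[bool]) -> bool:
    # A Sequence supports indexing, but promises no append or remove.
    return flags[0]


def lookup(table: Mapping[str, int], key: str) -> int:
    # A Mapping can be read but is not modified here.
    return table.get(key, 0)


print(total([1, 2, 3]))
print(total((4, 5)))
print(first_flag([True, False]))
print(lookup({"a": 1}, "b"))
```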
|
Sometimes you expect a variable to hold different types over its lifetime. Let us say you initialize it with None and later want to store an integer in it:
a: None | int = None
Another example is this:
import torch
import numpy as np
a: np.ndarray | torch.Tensor = torch.zeros((100,))
This is called a Union. The Union with None is called Optional. But nowadays you just need to remember |.
Tuple
a: tuple[int, str, int] = (5, "Hello", 6)
a = (
    "Hello",
    4,
    4,
)  # Incompatible types in assignment (expression has type "Tuple[str, int, int]", variable has type "Tuple[int, str, int]")
Or if you don't care about what is in the tuple:
a: tuple = (5, "Hello", 6)
a = ("Hello", 4, 4)
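There is a middle ground between the two cases: tuple[int, ...] from the table of generic types fixes the element type but not the length. A small sketch with made-up names:

```python
def mean(values: tuple[int, ...]) -> float:
    # Accepts a tuple of ints of any length.
    return sum(values) / len(values)


point: tuple[int, int] = (3, 4)  # exactly two ints
shape: tuple[int, ...] = (10,)   # any number of ints
shape = (10, 20, 30)             # fine: still only ints

x, y = point
print(x, y)
print(mean(shape))
```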
List
A generic list:
mylist: list = []
mylist.append(1)
mylist.append(2)
mylist.append("Hello")
Or defining more details about the list:
mylist: list[int] = []
mylist.append(1)
mylist.append(2)
mylist.append(
    "Hello"
)  # Argument 1 to "append" of "list" has incompatible type "str"; expected "int"
print(mylist) # -> [1, 2, 'Hello']
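The printed list still contains 'Hello' because Python does not enforce annotations at runtime; mypy only reports the problem when you run it. If you need a check while the program is running, you have to write it yourself. A minimal sketch, in which the helper all_ints is hypothetical and not part of any library:

```python
from collections.abc import Iterable


def all_ints(items: Iterable[object]) -> bool:
    # bool is a subclass of int, so it is excluded explicitly here.
    return all(
        isinstance(item, int) and not isinstance(item, bool) for item in items
    )


print(all_ints([1, 2]))
print(all_ints([1, 2, "Hello"]))
```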