diff --git a/PyBind11/direct/README.md b/PyBind11/direct/README.md
new file mode 100644
index 0000000..e43ac39
--- /dev/null
+++ b/PyBind11/direct/README.md
@@ -0,0 +1,81 @@
+# The fast and furious way
+{:.no_toc}
+
+
+
+## Top
+
+Let us assume that you know what you are doing. And let us also assume that you have noticed how extremely slow the "correct" way of communicating between Python and C++ is. Well, the following section is for you...
+
+Questions to [David Rotermund](mailto:davrot@uni-bremen.de)
+
+
+
+## On the Python side
+
+```python
+# If `input` is a (CPU) torch tensor then make a "view" to its numpy core (assumes `import numpy as np`)
+np_input: np.ndarray = input.contiguous().detach().numpy()
+
+# We need to make sure that the numpy ndarray is C_CONTIGUOUS.
+# If not then use numpy.ascontiguousarray() to make it so
+assert np_input.flags["C_CONTIGUOUS"] is True
+
+# Input is a 4d ndarray. And I will make sure that this is really the case
+assert np_input.ndim == 4
+
+# Now I extract the pointer (an integer address) to the data memory of the ndarray
+np_input_pointer, _ = np_input.__array_interface__["data"]
+
+# Also I need the shape information for the C++ program.
+np_input_dim_0: int = np_input.shape[0]
+np_input_dim_1: int = np_input.shape[1]
+np_input_dim_2: int = np_input.shape[2]
+np_input_dim_3: int = np_input.shape[3]
+```
+
+## On the C++ side
+
+Your C++ method needs to accept these arguments:
+
+```cpp
+int64_t np_input_pointer_addr,
+int64_t np_input_dim_0,
+int64_t np_input_dim_1,
+int64_t np_input_dim_2,
+int64_t np_input_dim_3,
+```
+
+Inside your C++ method you convert the address into a pointer. **BE WARNED:** Make absolutely sure that the dtype of the np.ndarray is correctly reflected in the pointer type:
+
+dtype=np.float32 --> float
+dtype=np.float64 --> double
+dtype=np.uint64 --> uint64_t
+
+If you mess this up then this will end in tears!
+
+```cpp
+float *np_input_pointer = reinterpret_cast<float *>(np_input_pointer_addr);
+
+// Input
+assert((np_input_pointer != nullptr));
+assert((np_input_dim_0 > 0));
+assert((np_input_dim_1 > 0));
+assert((np_input_dim_2 > 0));
+assert((np_input_dim_3 > 0));
+```
+
+Don't forget that C-contiguous is just a complicated way of saying row-major memory layout (see [Row- and column-major order](https://en.wikipedia.org/wiki/Row-_and_column-major_order)). For a 4d array with shape $(n_a, n_b, n_c, n_d)$, the element $[a,b,c,d]$ sits at the following offset (counted in elements, not bytes) in the flat memory:
+
+$$M[a,b,c,d] = M[\eta_a \cdot a + \eta_b \cdot b + \eta_c \cdot c + d]$$
+
+with
+
+$$\eta_c = n_d$$
+
+$$\eta_b = \eta_c \cdot n_c$$
+
+$$\eta_a = \eta_b \cdot n_b$$
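
The row-major formula at the end of the README can be checked from Python without compiling anything. The following is a minimal sketch and not part of the patch: the test array, its shape, and the helper `flat_index` are made up for illustration, and `ctypes` merely stands in for the pointer cast that the C++ side would do.

```python
import ctypes

import numpy as np

# Hypothetical 4d test array; the shape is arbitrary.
shape = (2, 3, 4, 5)
np_input = np.arange(np.prod(shape), dtype=np.float32).reshape(shape)
assert np_input.flags["C_CONTIGUOUS"]

n_a, n_b, n_c, n_d = np_input.shape

# Strides counted in elements (not bytes), as in the formulas above.
eta_c = n_d
eta_b = eta_c * n_c
eta_a = eta_b * n_b


def flat_index(a: int, b: int, c: int, d: int) -> int:
    """Row-major offset of element [a, b, c, d] in the flat buffer."""
    return eta_a * a + eta_b * b + eta_c * c + d


# The same integer address that would be handed to the C++ side.
np_input_pointer, _ = np_input.__array_interface__["data"]

# Mimic the C++ cast: dtype=np.float32 must be paired with a float pointer.
c_pointer = ctypes.cast(np_input_pointer, ctypes.POINTER(ctypes.c_float))

# Reading through the raw pointer agrees with normal numpy indexing.
a, b, c, d = 1, 2, 3, 4
assert c_pointer[flat_index(a, b, c, d)] == np_input[a, b, c, d]
assert np_input.ravel()[flat_index(a, b, c, d)] == np_input[a, b, c, d]
```

If the dtype and the pointer type disagree (for example `np.float64` data read through a `c_float` pointer), the first assertion would be expected to fail, which is the Python-side equivalent of the "ends in tears" case described above.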