diff --git a/numpy/JSON/README.md b/numpy/JSON/README.md new file mode 100644 index 0000000..0429776 --- /dev/null +++ b/numpy/JSON/README.md @@ -0,0 +1,141 @@ +# Numpy <-> JSON over Pandas +{:.no_toc} + + + +## The goal + +Normally we can not use JSON with Numpy. However, if we use [Pandas](https://pandas.pydata.org/) as an intermediary then we can do it. + +Questions to [David Rotermund](mailto:davrot@uni-bremen.de) + +**Note: Pandas can also be used for many other formats beside JSON.** + +## Writing JSON [pandas.DataFrame.to_json](https://pandas.pydata.org/docs/reference/api/pandas.DataFrame.to_json.html#pandas-dataframe-to-json) + +```python +DataFrame.to_json(path_or_buf=None, orient=None, date_format=None, double_precision=10, force_ascii=True, date_unit='ms', default_handler=None, lines=False, compression='infer', index=None, indent=None, storage_options=None, mode='w')[source] +``` + +> Convert the object to a JSON string. +> +> Note NaN’s and None will be converted to null and datetime objects will be converted to UNIX timestamps. + +```python +import numpy as np +import pandas as pd + +rng = np.random.default_rng() + +some_data = rng.random((11, 3)) + +df = pd.DataFrame(some_data) +# As file +filename = "mynumpydata.json" +df.to_json(filename, orient="index") + +# As string +output = df.to_json(orient="index") +print(output) +``` + +Output (reformated): + +```python +{ + "0": { + "0": 0.3145859169, + "1": 0.2517001569, + "2": 0.6685086575 + }, + "1": { + "0": 0.7324177066, + "1": 0.6750562092, + "2": 0.0086333192 + }, + "2": { + "0": 0.7529914827, + "1": 0.3597052352, + "2": 0.2780062722 + }, + "3": { + "0": 0.2847410336, + "1": 0.5572451873, + "2": 0.5591149362 + }, + "4": { + "0": 0.4507115703, + "1": 0.9623511422, + "2": 0.7180796014 + }, + "5": { + "0": 0.5406601852, + "1": 0.9315847158, + "2": 0.2456480951 + }, + "6": { + "0": 0.3441382077, + "1": 0.4714817658, + "2": 0.1777388975 + }, + "7": { + "0": 0.6994839505, + "1": 0.6520935819, + "2": 0.9870686976 + }, + "8": { + "0": 0.187576403, + "1": 0.7466669157, + "2": 0.2952841542 + }, + "9": { + "0": 0.9140410582, + "1": 0.6828387334, + "2": 0.165762789 + }, + "10": { + "0": 0.644055269, + "1": 0.6122094952, + "2": 0.9695111468 + } +} +``` + +## Read JSON [pandas.read_json](https://pandas.pydata.org/docs/reference/api/pandas.read_json.html#pandas-read-json) + +```python +pandas.read_json(path_or_buf, *, orient=None, typ='frame', dtype=None, convert_axes=None, convert_dates=True, keep_default_dates=True, precise_float=False, date_unit=None, encoding=None, encoding_errors='strict', lines=False, chunksize=None, compression='infer', nrows=None, storage_options=None, dtype_backend=_NoDefault.no_default, engine='ujson')[source] +``` + +> Convert a JSON string to pandas object. + +```python +import numpy as np +import pandas as pd + +filename = "mynumpydata.json" +df = pd.read_json(filename, orient="index") +output_np = df.to_numpy() +print(type(output_np)) # -> +print(output_np) +``` + +Output: + +```python +[[0.31458592 0.25170016 0.66850866] + [0.73241771 0.67505621 0.00863332] + [0.75299148 0.35970524 0.27800627] + [0.28474103 0.55724519 0.55911494] + [0.45071157 0.96235114 0.7180796 ] + [0.54066019 0.93158472 0.2456481 ] + [0.34413821 0.47148177 0.1777389 ] + [0.69948395 0.65209358 0.9870687 ] + [0.1875764 0.74666692 0.29528415] + [0.91404106 0.68283873 0.16576279] + [0.64405527 0.6122095 0.96951115]] +``` +