jangada.serialization.Serializable.register_dataset_type

classmethod Serializable.register_dataset_type(dataset_type: type, disassemble: Callable[[Any], tuple[NDArray, dict]], assemble: Callable[[NDArray, dict], Any]) -> None

Register a type that requires special handling for serialization.

Dataset types are types that need to be converted to/from numpy arrays for storage (e.g., in HDF5 datasets). The disassemble function converts the object to an array and metadata dict, while assemble reconstructs the object.

Parameters:
dataset_type : type

The type to register.

disassemble : Callable[[Any], tuple[NDArray, dict]]

Function that converts an object to (array, attributes_dict). The array will be stored as an HDF5 dataset, and attributes as HDF5 attributes or group metadata.

assemble : Callable[[NDArray, dict], Any]

Function that reconstructs the object from the array and attributes dict produced by disassemble.

See also

remove_dataset_type

Remove a dataset type registration

is_dataset_type

Check if a type is registered as a dataset type

Notes

Registering a dataset type also automatically registers it as a primitive type. The type is registered both by type object and by qualified name string for deserialization.

Built-in registered dataset types:

- numpy.ndarray
- pandas.Timestamp
- pandas.DatetimeIndex

Examples

Register a custom array-like type:

>>> import numpy as np
>>> from jangada.serialization import Serializable
>>> class CustomArray:
...     def __init__(self, data):
...         self.data = data
...
>>> def disassemble(obj):
...     return np.array(obj.data), {'length': len(obj.data)}
...
>>> def assemble(arr, attrs):
...     return CustomArray(arr.tolist())
...
>>> Serializable.register_dataset_type(CustomArray, disassemble, assemble)
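Before registering a converter pair, it can help to confirm that `assemble` inverts `disassemble`. The following is a minimal sketch of such a round-trip check; it calls the two functions directly and does not touch the library, so it says nothing about how jangada invokes them internally. The class mirrors the toy container used above.

```python
import numpy as np


class CustomArray:
    """Toy container wrapping a list of numbers."""

    def __init__(self, data):
        self.data = data


def disassemble(obj):
    # Convert the payload to an array; extra metadata goes in a plain dict.
    return np.asarray(obj.data), {"length": len(obj.data)}


def assemble(arr, attrs):
    # Rebuild the object from the array; attrs is not needed for this type.
    return CustomArray(arr.tolist())


original = CustomArray([1.0, 2.0, 3.0])
arr, attrs = disassemble(original)
restored = assemble(arr, attrs)

# The pair is usable only if the round trip preserves the object's state.
assert isinstance(arr, np.ndarray)
assert restored.data == original.data
```

If the attributes dict carries information that `assemble` needs (for example, a timezone or a unit), the same check will catch a pair that silently drops it.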