Serialization (jangada.serialization)#

A small, explicit serialization and persistence framework.

This module provides a descriptor-driven schema (SerializableProperty), a registry-backed in-memory serialization protocol (Serializable), and an HDF5 persistence layer (Persistable).

Design goals#

  • Explicit schemas: Classes declare which attributes are serialized via SerializableProperty descriptors (including inheritance across the MRO).

  • Portable serialized form: The in-memory serialized representation uses only Python-native containers (dict/list) plus registered primitive values, and a "__class__" envelope for reconstructing Serializable objects.

  • Extensibility: New primitives and dataset-backed types can be registered globally through the metaclass API.

  • Numerical efficiency: Large arrays and similar objects can be stored as true HDF5 datasets (not blobs), optionally accessed lazily through proxies.
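The explicit-schema goal can be illustrated with a small, self-contained sketch. It does not use the real SerializableProperty; `SchemaField` and `schema_of` are hypothetical names that only show how descriptors declared on a class and its bases could be collected by walking the MRO.

```python
class SchemaField:
    """Hypothetical stand-in for a schema descriptor (not the real SerializableProperty)."""

    def __init__(self, default=None):
        self.default = default

    def __set_name__(self, owner, name):
        # Remember the public name and a private slot for the stored value.
        self.name = name
        self.slot = "_" + name

    def __get__(self, instance, owner=None):
        if instance is None:
            return self  # class-level access returns the descriptor itself
        return getattr(instance, self.slot, self.default)

    def __set__(self, instance, value):
        setattr(instance, self.slot, value)


def schema_of(cls):
    """Collect descriptor names over the MRO, base classes first."""
    names = []
    for klass in reversed(cls.__mro__):
        for name, attr in vars(klass).items():
            if isinstance(attr, SchemaField) and name not in names:
                names.append(name)
    return names


class Shape:
    label = SchemaField(default="")


class Circle(Shape):
    radius = SchemaField(default=1.0)
    scratch = 0  # plain attribute: not part of the schema


print(schema_of(Circle))  # ['label', 'radius']
```

Only attributes declared through the descriptor appear in the schema; `scratch` is ignored because it is an ordinary class attribute.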

Serialized representation#

The serialized form returned by Serializable.serialize is a tree composed of:

  • None

  • lists (always serialized as list)

  • dictionaries

  • registered primitive values

  • Serializable instances, serialized as dictionaries that include a "__class__" key containing a fully qualified name.

The general object envelope is:

{
    "__class__": "module.QualifiedName",
    "<property_name_1>": <serialized_value>,
    …
}
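As a rough illustration of how such a tree might be produced, the sketch below walks a value recursively and emits the envelope for objects that declare their fields. It is not the module's implementation: `PRIMITIVES`, `Point`, the `fields` attribute, and `to_tree` are all hypothetical stand-ins for the registry, a Serializable subclass, its schema, and Serializable.serialize.

```python
from pathlib import Path

# Assumed contents of a primitive registry; the real registry may differ.
PRIMITIVES = (str, int, float, bool, Path)


class Point:
    """Toy class; `fields` plays the role of a declared schema."""

    fields = ("x", "y")

    def __init__(self, x=0, y=0):
        self.x, self.y = x, y


def to_tree(value):
    """Convert a value into None, lists, dicts, primitives, and envelopes."""
    if value is None or isinstance(value, PRIMITIVES):
        return value  # primitives pass through unchanged
    if isinstance(value, list):
        return [to_tree(item) for item in value]
    if isinstance(value, dict):
        return {key: to_tree(item) for key, item in value.items()}
    cls = type(value)
    if hasattr(cls, "fields"):
        # Object envelope: class name first, then one entry per declared field.
        tree = {"__class__": f"{cls.__module__}.{cls.__qualname__}"}
        for name in cls.fields:
            tree[name] = to_tree(getattr(value, name))
        return tree
    # Anything not explicitly supported is rejected rather than guessed at.
    raise TypeError(f"unsupported type: {cls.__name__}")


tree = to_tree({"origin": Point(1, 2), "path": [Point(3, 4), None]})
```

Here `tree["origin"]` is a plain dictionary whose "__class__" value ends in ".Point", with `x` and `y` stored alongside it.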

Primitive types#

A “primitive type” is any type that is allowed to pass through serialization unchanged (stored as-is in the in-memory representation). Examples in this module include strings, numbers, and pathlib.Path (with persistence support provided by Persistable).

Dataset-backed types#

A “dataset type” is a type that:

  • is treated as a primitive during in-memory serialization, but

  • is persisted as an HDF5 dataset when using Persistable.

Dataset types are registered with a pair of functions:

  • disassemble(obj) -> (ndarray, attrs_dict)

  • assemble(ndarray, attrs_dict) -> obj

During persistence, the ndarray becomes a dataset and attrs become dataset attributes.
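A disassemble/assemble pair for a made-up type might look like the sketch below. `Signal` and its fields are hypothetical, NumPy is assumed to be available, and the registration call itself is left out because its exact signature is not described here.

```python
from dataclasses import dataclass

import numpy as np


@dataclass
class Signal:
    """Hypothetical dataset-backed type: samples plus a little metadata."""

    samples: np.ndarray
    unit: str
    rate_hz: float


def disassemble(obj):
    # Bulk numeric data goes into the array; small scalars go into attrs.
    attrs = {"unit": obj.unit, "rate_hz": obj.rate_hz}
    return np.asarray(obj.samples), attrs


def assemble(array, attrs):
    # Inverse of disassemble: rebuild the object from array and attributes.
    return Signal(samples=array, unit=attrs["unit"], rate_hz=attrs["rate_hz"])


original = Signal(np.linspace(0.0, 1.0, 5), unit="V", rate_hz=100.0)
array, attrs = disassemble(original)
restored = assemble(array, attrs)
```

Keeping the attributes to simple scalars and strings fits the idea that they end up as dataset attributes, while the array carries everything large.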

Forward-compatibility behavior#

If Serializable.deserialize encounters a "__class__" name that is not registered, it creates a synthetic class named _Generic<ClassName> that accepts the serialized keys as properties. This allows data produced by newer code to be loaded by older code, at the cost of losing type-specific behavior.
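The fallback can be pictured with a small sketch that builds a placeholder class on the fly. Everything here is illustrative: `REGISTRY`, `generic_class_for`, and `from_tree` are hypothetical names, and the real synthetic class may be constructed differently.

```python
REGISTRY = {}  # hypothetical mapping: qualified name -> registered class


def generic_class_for(qualified_name, keys):
    """Build a placeholder class named _Generic<ClassName>."""
    short_name = qualified_name.rsplit(".", 1)[-1]
    keys = tuple(keys)

    def __init__(self, **values):
        # Accept exactly the serialized keys as plain attributes.
        for key in keys:
            setattr(self, key, values.get(key))

    namespace = {"__init__": __init__, "fields": keys}
    return type(f"_Generic{short_name}", (), namespace)


def from_tree(tree):
    """Rebuild objects from a serialized tree, falling back to generics."""
    if isinstance(tree, list):
        return [from_tree(item) for item in tree]
    if not isinstance(tree, dict):
        return tree
    if "__class__" not in tree:
        return {key: from_tree(item) for key, item in tree.items()}
    payload = {k: from_tree(v) for k, v in tree.items() if k != "__class__"}
    name = tree["__class__"]
    cls = REGISTRY.get(name) or generic_class_for(name, payload)
    return cls(**payload)


obj = from_tree({"__class__": "newer.module.Sensor", "gain": 2.0, "tags": ["a"]})
print(type(obj).__name__, obj.gain)  # _GenericSensor 2.0
```

The loaded object keeps its data, but none of the methods or validation that the original `Sensor` class would have provided.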

Notes#

This module is intentionally conservative: it does not attempt to serialize arbitrary Python objects (unlike pickle). Anything not explicitly supported must be registered as a primitive or dataset type, or be a Serializable.

Entities#

SerializableProperty([postinitializer, ...])

A descriptor for properties that support defaults, parsing, observation, and post-initialization hooks.

Serializable(*args, **kwargs)

Base class for objects that can be serialized to/from dictionaries.

Persistable(*args, **kwargs)

Base class for objects that can be persisted to HDF5 files.

SerializableMetatype(name, bases, namespace, ...)

Metaclass for automatic registration and introspection of Serializable classes.