Serialization (jangada.serialization)#
A small, explicit serialization and persistence framework.
This module provides a descriptor-driven schema (SerializableProperty),
a registry-backed in-memory serialization protocol (Serializable), and an
HDF5 persistence layer (Persistable).
Design goals#
- Explicit schemas: Classes declare which attributes are serialized via SerializableProperty descriptors (including inheritance across the MRO).
- Portable serialized form: The in-memory serialized representation uses only Python-native containers (dict/list) plus registered primitive values, and a "__class__" envelope for reconstructing Serializable objects.
- Extensibility: New primitives and dataset-backed types can be registered globally through the metaclass API.
- Numerical efficiency: Large arrays and similar objects can be stored as true HDF5 datasets (not blobs), optionally accessed lazily through proxies.
Serialized representation#
The serialized form returned by Serializable.serialize is a tree composed of:
- None
- lists (always serialized as list)
- dictionaries
- registered primitive values
- Serializable instances serialized as dictionaries that include a
"__class__" key containing a fully qualified name.
The general object envelope is:
    {
        "__class__": "module.QualifiedName",
        "<property_name_1>": <serialized_value>,
        …
    }
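The tree structure and envelope described above can be illustrated with a small standalone sketch. It does not use jangada itself: the `property_names` attribute, the `PRIMITIVES` tuple, the `Point` and `Polyline` classes, and the `TypeError` raised for unsupported values are all assumptions made for illustration. In the real module the schema comes from SerializableProperty descriptors and the primitive set from the registry.

```python
# Standalone sketch of the serialized tree; not the jangada implementation.
PRIMITIVES = (str, int, float, bool)  # assumed primitive set for this sketch


class Point:
    """Hypothetical stand-in for a Serializable subclass."""
    property_names = ("x", "y")  # assumed; the real schema uses descriptors

    def __init__(self, x, y):
        self.x = x
        self.y = y


class Polyline:
    """Hypothetical object that nests other serializable objects in a list."""
    property_names = ("label", "points")

    def __init__(self, label, points):
        self.label = label
        self.points = points


def serialize(value):
    # None and primitives pass through unchanged.
    if value is None or isinstance(value, PRIMITIVES):
        return value
    # Containers are serialized element by element.
    if isinstance(value, list):
        return [serialize(item) for item in value]
    if isinstance(value, dict):
        return {key: serialize(item) for key, item in value.items()}
    # Objects with a declared schema become a "__class__" envelope.
    names = getattr(type(value), "property_names", None)
    if names is not None:
        cls = type(value)
        envelope = {"__class__": f"{cls.__module__}.{cls.__qualname__}"}
        for name in names:
            envelope[name] = serialize(getattr(value, name))
        return envelope
    # Anything else is rejected rather than guessed at (assumed behavior).
    raise TypeError(f"unsupported type: {type(value).__name__}")


tree = serialize(Polyline("route", [Point(0, 1), Point(2, 3)]))
```

Here `tree` is a plain dict whose `"points"` entry is a list of nested envelopes, one per `Point`, each carrying its own `"__class__"` key.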
Primitive types#
A “primitive type” is any type that is allowed to pass through serialization
unchanged (stored as-is in the in-memory representation). Examples in this
module include strings, numbers, and pathlib.Path (with persistence support
provided by Persistable).
Dataset-backed types#
A “dataset type” is a type that:
- is treated as a primitive during in-memory serialization, but
- is persisted as an HDF5 dataset when using Persistable.
Dataset types are registered with a pair of functions:
- disassemble(obj) -> (ndarray, attrs_dict)
- assemble(ndarray, attrs_dict) -> obj
During persistence, the ndarray becomes a dataset and attrs become dataset attributes.
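As a rough sketch of what such a function pair might look like, the example below splits a hypothetical `Quantity` type into an array payload and a small attribute dictionary, then rebuilds it. The `Quantity` class and its `unit` field are invented for illustration, NumPy is assumed to be installed, and the registration call itself is omitted because its signature is not described here.

```python
from dataclasses import dataclass

import numpy as np


@dataclass
class Quantity:
    """Hypothetical dataset-backed type: numeric values plus a unit label."""
    values: np.ndarray
    unit: str


def disassemble(obj):
    # Bulk numeric data goes into the array; small metadata goes into attrs.
    return np.asarray(obj.values), {"unit": obj.unit}


def assemble(array, attrs):
    # Inverse of disassemble: rebuild the object from array plus attrs.
    return Quantity(values=array, unit=attrs["unit"])


original = Quantity(np.array([1.0, 2.5, 4.0]), "m/s")
array, attrs = disassemble(original)
restored = assemble(array, attrs)
```

The pair should round-trip: `assemble(*disassemble(obj))` yields an object equivalent to `obj`. Keeping `attrs` limited to small scalar or string values fits the way HDF5 attributes are typically used.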
Forward-compatibility behavior#
If Serializable.deserialize encounters a "__class__" name that is not
registered, it creates a synthetic class named _Generic<ClassName> that
accepts the serialized keys as properties. This lets older code load data
produced by newer code, at the cost of losing type-specific behavior.
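The fallback can be pictured with a minimal standalone sketch that builds a `_Generic<ClassName>` class on the fly using the built-in `type()`. The helper name `make_generic_class`, the `property_names` attribute, and the sample payload are assumptions for illustration; the actual mechanism inside Serializable.deserialize may differ.

```python
def make_generic_class(qualified_name, payload):
    """Build a placeholder class for an unregistered "__class__" name."""
    class_name = qualified_name.rsplit(".", 1)[-1]
    keys = tuple(key for key in payload if key != "__class__")

    def __init__(self, **kwargs):
        # Accept every serialized key as a plain attribute.
        for key in keys:
            setattr(self, key, kwargs.get(key))

    namespace = {"__init__": __init__, "property_names": keys}
    return type(f"_Generic{class_name}", (), namespace)


# Hypothetical payload written by newer code that defines a Sensor class.
payload = {"__class__": "newer_pkg.Sensor", "gain": 2.5, "label": "probe-A"}

cls = make_generic_class(payload["__class__"], payload)
fields = {key: value for key, value in payload.items() if key != "__class__"}
obj = cls(**fields)
```

The resulting object carries the stored values (`obj.gain`, `obj.label`) but none of the methods or validation that the original `Sensor` class would have provided.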
Notes#
This module is intentionally conservative: it does not attempt to serialize
arbitrary Python objects (unlike pickle). Anything not explicitly supported
must be registered as a primitive or dataset type, or be a Serializable.
Entities#
- SerializableProperty: A descriptor for properties that support defaults, parsing, observation, and post-initialization hooks.
- Serializable: Base class for objects that can be serialized to/from dictionaries.
- Persistable: Base class for objects that can be persisted to HDF5 files.
- Metaclass for automatic registration and introspection of Serializable classes.