structured-classes

Name	structured-classes JSON
Version	3.0.0 JSON
	download
home_page
Summary	Annotated classes that pack and unpack C structures.
upload_time	2023-01-14 21:43:17
maintainer
docs_url	None
author	lojack5
requires_python	>=3.9
license	BSD 3-Clause
keywords
VCS
bugtrack_url
requirements	No requirements were recorded.
Travis-CI	No Travis.
coveralls test coverage	No coveralls.

![tests](https://github.com/lojack5/structured/actions/workflows/tests.yml/badge.svg)
[![License](https://img.shields.io/badge/License-BSD_3--Clause-blue.svg)](https://opensource.org/licenses/BSD-3-Clause)
[![Code style: black](https://img.shields.io/badge/code%20style-black-000000.svg)](https://github.com/psf/black)


# structured - creating classes which pack and unpack with Python's `struct` module.
This is a small library that lets you use type hints to define classes which can be packed and unpacked using Python's `struct` module.  The basic usage is almost like a dataclass:

```python
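# Assumes these names are exported from the top-level `structured` package.
from structured import Structured, char, uint8

class MyClass(Structured):
  file_magic: char[4]
  version: uint8

# Create an instance with placeholder values, then fill it from a file.
a = MyClass(b'', 0)

with open('some_file.dat', 'rb') as ins:
  a.unpack_read(ins)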
```

# Contents

1. [Hint Types](#hint-types): For all the types you can use as type-hints.
    - [Basic Types](#basic-types)
    - [Complex Types](#complex-types)
    - [Custom Types](#custom-types)
2. [The `Structured` class](#the-structured-class)
3. [Generics](#generics)
4. [Serializers](#serializers)


# Hint Types
If you just want to use the library, these are the types you use to hint your instance variables to
make them detected as serialized by the packing/unpacking logic. I'll use the term **serializable**
to mean a hinted type that results in the variable being detected by the `Structured` class as being
handled for packing and unpacking. They're broken up into three basic categories:
- Basic Types: Those with direct correlation to the `struct` format specifiers, needing no extra
  logic.
- Complex Types: Those still using `struct` for packing and unpacking, but requiring extra logic
  so they do not always have the same format specifier.
- Custom Types: You can use your own custom classes and specify how they should be packed and
  unpacked.


Almost all types use `typing.Annotated` under the hood to just add extra serialization information
to the type they represent. For example, `bool8` is defined as
`Annotated[bool, StructSerializer('?')]`, so type-checkers will properly see it as a `bool`.

There are four exceptions to this.  For these types, almost everything should pass inspection by a
type-checker, except for assignment.  These are:
- `char`: subclassed from `bytes`.
- `pascal`: subclassed from `bytes`.
- `unicode`: subclassed from `str`.
- `array`: subclassed from `list`.

If you want to work around this, you can use `typing.Annotated` yourself to appease the
type-checker:

```python
class MyStruct1(Structured):
  name: unicode[100]

item = MyStruct1('Jane Doe')
item.name = 'Jessica'   # Type-checker complains about "str incompatible with unicode".

class MyStruct2(Structured):
  name: Annotated[str, unicode[100]]

item = MyStruct2('Jane Doe')
item.name = 'Jessica'   # No complaint from the type-checker.
```


## Basic Types
Almost every format specifier in `struct` is supported as a type:

| `struct` format | structured type | Python type | Notes |
|:---------------:|:---------------:|-------------|:-----:|
| `x`             | `pad`           |             |(1)(3) |
| `c`             | equivalent to `char[1]` | `bytes` with length 1 | |
| `?`             | `bool8`         | `bool`      |       |
| `b`             | `int8`          | `int`       |       |
| `B`             | `uint8`         | `int`       |       |
| `h`             | `int16`         | `int`       |       |
| `H`             | `uint16`        | `int`       |       |
| `i`             | `int32`         | `int`       |       |
| `I`             | `uint32`        | `int`       |       |
| `q`             | `int64`         | `int`       |       |
| `Q`             | `uint64`        | `int`       |       |
| `n`             | not supported   |             |       |
| `N`             | not supported   |             |       |
| `e`             | `float16`       | `float`     |  (2)  |
| `f`             | `float32`       | `float`     |       |
| `d`             | `float64`       | `float`     |       |
| `s`             | `char`          | `bytes`     |  (1)  |
| `p`             | `pascal`        | `bytes`     |  (1)  |
| `P`             | not supported   |             |       |

Notes:
 1. These types must be indexed to specify their length.  For a single byte `char` for example
    (`'s'`), use `char[1]`.
 2. The 16-bit float type is not supported on all platforms.
 3. Pad variables are skipped and not actually assigned when unpacking, nor used when packing. There
    is a special metaclass hook to allow you to name all of your pad variables `_`, and they still
    **all** count towards the final format specifier.  If you want to be able to override their
    type-hint in subclasses, choose a name other than `_`.

Consecutive variables with any of these type-hints will be combined into a single `struct` format
specifier.  Keep in mind that Python's `struct` module may insert extra padding bytes *between*
(but never before or after) format specifiers, depending on the Byte Order specification used.

Example:

```python
class MyStruct(Structured):
  a: int8
  b: int8
  c: uint32
  _: pad[4]
  d: char[10]
  _: pad[2]
  e: uint32
```
In this example, all of the instance variables are of the "basic" type, so the final result will be
as if packing or unpacking with `struct` using a format of `2bI4x10s2xI`.  Note we took advantage of
the `Structured` metaclass to specify the padding using the same name `_`.
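
To see what that combined format means at the byte level, here is a small sketch using only the standard `struct` module (not `structured` itself). The `<` and `@` prefixes are added purely for illustration, to compare a layout without alignment padding against the native one.

```python
import struct

# The format the example class above is described as using.
FORMAT = '2bI4x10s2xI'

# With '<' (standard sizes, no alignment) the size is the plain sum of the
# fields: 2 + 4 + 4 + 10 + 2 + 4 bytes.
packed_size = struct.calcsize('<' + FORMAT)

# With '@' (native alignment) struct may insert padding between fields,
# so the size can be larger depending on the platform.
native_size = struct.calcsize('@' + FORMAT)

# Pad bytes ('x') consume no arguments and produce no values:
# only a, b, c, d and e are passed in and come back out.
data = struct.pack('<' + FORMAT, 1, 2, 3, b'0123456789', 4)
values = struct.unpack('<' + FORMAT, data)
```

Five values go in and five come out, even though the class body has seven annotated names; the two `_` pads only affect the layout.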


### Byte Order
All of the specifiers are supported; the default is to use no specifier:

| `struct` specifier | `ByteOrder` |
|:------------------:|:-----------:|
| `<`                | `LITTLE_ENDIAN` (or `LE`) |
| `>`                | `BIG_ENDIAN` (or `BE`) |
| `=`                | `NATIVE_STANDARD` |
| `@`                | `NATIVE_NATIVE` |
| `!`                | `NETWORK`   |

To specify a byte order, pass `byte_order=ByteOrder.<option>` to the `Structured` sub-classing
machinery, like so:

```python
class MyStruct(Structured, byte_order=ByteOrder.NETWORK):
  a: int8
  b: uint32
```
In this example, the `NETWORK` (`!`) specifier was used, so `struct` will not insert any padding
bytes between variables `a` and `b`, and multi-byte values will be unpacked as Big Endian numbers.
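
As a rough illustration of what the byte order changes, this is the same `int8` + `uint32` layout packed directly with the standard `struct` module. It only shows `struct`'s behaviour for the `!` and `<` prefixes, not anything specific to `structured`.

```python
import struct

# Network order ('!') is big-endian with standard sizes and no alignment,
# so the 4-byte integer directly follows the single byte.
network = struct.pack('!bI', 1, 2)

# Little-endian ('<') has the same layout, but multi-byte values are
# stored least-significant byte first.
little = struct.pack('<bI', 1, 2)

# Both layouts are exactly 1 + 4 bytes long.
network_size = struct.calcsize('!bI')
```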


## Complex Types
All other types fall into the "complex" category.  They currently consist of:
- `tuple`: Fixed length tuples of serializable objects.
- `array`: Lists of a single type of serializable object.
- `char`: Although `char[3]` (or any other integer) is considered a basic type, `char` also supports
  variable length strings.
- `unicode`: A wrapper around `char` to add automatic encoding on pack and decoding on unpack.
- Unions: Unions of serializable types are supported as well.
- `Structured`-derived types: You can use any of your `Structured`-derived classes as a type-hint,
  and the variable will be serialized as well.


### Tuples
Both the `tuple` and `Tuple` type-hints are supported, including `TypeVar`s (see: `Generics`). To be
detected as serializable, the `tuple` type-hint must be for a fixed-size `tuple` (so no ellipsis,
`...`), and each type-hint in the `tuple` must be a serializable type.

Example:
```python
class MyStruct(Structured):
  position: tuple[int8, int8]
  size: tuple[int8, int8]
```

### Arrays
Arrays are `list`s of one kind of serializable type. You do need to specify how `Structured` will
determine the *length* of the list when unpacking, and how to write it when packing. To do this,
you choose a `Header` type. The final type-hint for your list then becomes
`array[<header_type>, <item_type>]`. Arrays also support `TypeVar`s.

Here are the header types:
- `Header[3]` (or any other positive integer): A fixed length array. No length byte is packed or
  unpacked, just the fixed number of items. When packing, if the list doesn't contain the fixed
  number of elements specified, a `ValueError` is raised.
- `Header[uint32]` (or any other `uint*`-type): An array with the length stored as a `uint32` (or
  other `uint*`-type) just before the items.
- `Header[uint32, uint16]`: An array with two values stored just prior to the items. The first
  value (in this case a `uint32`) is the length of the array. The second value (in this case a
  `uint16`) denotes how many bytes of data the array items take up. When unpacking, this size is
  checked against how many bytes were actually required to unpack that many items. In the case of a
  mismatch, a `ValueError` will be raised.

Example:
```python
class MyItem(Structured):
  name: unicode[100]
  age: uint8

class MyStruct(Structured):
  students: array[Header[uint32], MyItem]
```
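
If it helps to picture the on-disk layout of a `Header[uint32]` array, here is a minimal sketch of a length-prefixed list using only the standard `struct` module. The item layout (`<hh`) and the helper names `pack_array` and `unpack_array` are made up for this illustration; this is not how `structured` implements arrays internally.

```python
import struct

ITEM = struct.Struct('<hh')   # hypothetical item layout: two int16 values
COUNT = struct.Struct('<I')   # uint32 item count, stored just before the items

def pack_array(items):
    """Write the item count, then each item in order."""
    parts = [COUNT.pack(len(items))]
    parts.extend(ITEM.pack(*item) for item in items)
    return b''.join(parts)

def unpack_array(data, offset=0):
    """Read the item count, then that many items."""
    (count,) = COUNT.unpack_from(data, offset)
    offset += COUNT.size
    items = []
    for _ in range(count):
        items.append(ITEM.unpack_from(data, offset))
        offset += ITEM.size
    return items

points = [(1, 2), (3, 4), (-5, 6)]
blob = pack_array(points)
```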


### `char`
For unpacking bytes other than with a fixed length, you have a few more options with `char`:
- `char[uint8]` (or any other `uint*` type): This indicates that a value (a `uint8` in this case)
  will be packed/unpacked just prior to the `bytes`.  The value holds the number of `bytes` to pack
  or unpack.
- `char[b'\0']` (or any other single byte): This indicates a terminated byte-string. For
  unpacking, bytes will be read until the terminator is encountered (the terminator will be
  discarded). For packing, the `bytes` will be written, and a terminator will be written at the end
  if not already written.  The usual case for this is NULL-terminated byte-strings, so a quick alias
  for that is provided: `null_char`.
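
The two variable-length styles above can be pictured with a short standard-library sketch. The helper names `read_terminated` and `read_length_prefixed` are hypothetical, and raising `ValueError` at end of stream is a choice made for this sketch, not documented library behaviour.

```python
import io
import struct

def read_terminated(stream, terminator=b'\0'):
    """Read bytes up to the terminator; the terminator itself is discarded."""
    chunks = []
    while True:
        byte = stream.read(1)
        if not byte:
            raise ValueError('stream ended before the terminator was found')
        if byte == terminator:
            return b''.join(chunks)
        chunks.append(byte)

def read_length_prefixed(stream):
    """Read a uint8 length, then that many bytes."""
    (length,) = struct.unpack('<B', stream.read(1))
    return stream.read(length)

# One NULL-terminated string followed by one length-prefixed string.
stream = io.BytesIO(b'hello\0' + b'\x03abc')
first = read_terminated(stream)
second = read_length_prefixed(stream)
```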


### `unicode`
For cases where you want to read a byte-string and treat it as text, `unicode` will automatically
encode/decode it for you.  The options are the same as for `char`, but with an optional second
argument to specify how to encode/decode it.  The second option can either be a string indicating
the encoding to use (defaults to `utf8`), or for more complex solutions you may provide an
`EncoderDecoder` class.  Similar to `char`, we provide `null_unicode` as an alias for
`unicode[b'\0', 'utf8']`.

```python
class MyStruct(Structured):
  name: null_unicode
  description: unicode[255, 'ascii']
  bio: unicode[uint32, MyEncoderDecoder]
```

To write a custom encoder-decoder, you must subclass from `EncoderDecoder` and implement the two
class methods `encode` and `decode`:

```python
class MyEncoderDecoder(EncoderDecoder):
  @classmethod
  def encode(cls, text: str) -> bytes: ...

  @classmethod
  def decode(cls, bytestring: bytes) -> str: ...
```


### Unions
Sometimes, the data structure you're packing/unpacking depends on certain conditions.  Maybe a
`uint8` is used to indicate what follows next.  In cases like this, `Structured` supports unions in
its typehints.  To hint for this, you need to do three things:
1. Every type in your union must be a serializable type.
2. You need to create a *decider* which will perform the logic on deciding how to unpack the data.
3. Use `typing.Annotated` to indicate the decider to use for packing/unpacking.

#### Deciders
All deciders provide some method to take in information and produce a value to be used to make a
decision. The decision is made with a "decision map", which is a mapping of value to serializable
types. You can also provide a default serializable type, or `None` if you want an error to be raised
if your decision method doesn't produce a value in the decision map.

There are currently two deciders.  In addition to the decision map and default, you will need to
provide a few more things for each:
- `LookbackDecider`: You provide a method that accepts the object to be packed/unpacked and produces
  a decision value.  Commonly, `operator.attrgetter` is used here.  A minor detail: for unpacking
  operations, the object sent to your method will not be the actual unpacked object, merely a proxy
  with the values unpacked so far set on it.
- `LookaheadDecider`: For packing, this behaves just like `LookbackDecider`.  For unpacking, you
  need to specify a serializable type which is unpacked first and used as the value to look up
  in the decision map.  After this first value is unpacked, the data-stream is rewound back for
  unpacking the object.

Here are a few examples:
```python
class MyStruct(Structured):
  a_type: uint8
  a: Annotated[uint32 | float32 | char[4], LookbackDecider(attrgetter('a_type'), {0: uint32, 1: float32}, char[4])]
```
This example first unpacks a `uint8` and stores it in `a_type`. The union `a` polls that value with
`attrgetter`, if the value is 0 it unpacks a `uint32`, if it is 1 it unpacks a `float32`, and if it
is anything else it unpacks just 4 bytes (raw data), storing whatever was unpacked in `a`.
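
The decision-map idea can be sketched without the library. The snippet below uses plain `struct` formats in place of serializable types, and the names `DECISION_MAP`, `DEFAULT` and `unpack_tagged` are invented for the illustration.

```python
import struct

# Hypothetical decision map: tag value -> struct format for the payload.
DECISION_MAP = {0: '<I', 1: '<f'}
DEFAULT = '<4s'   # fallback when the tag is not in the map

def unpack_tagged(data):
    """Unpack a uint8 tag, then a 4-byte payload chosen by that tag."""
    (tag,) = struct.unpack_from('<B', data, 0)
    fmt = DECISION_MAP.get(tag, DEFAULT)
    (value,) = struct.unpack_from(fmt, data, 1)
    return tag, value

as_int = unpack_tagged(b'\x00' + struct.pack('<I', 42))
as_float = unpack_tagged(b'\x01' + struct.pack('<f', 1.5))
as_raw = unpack_tagged(b'\x07abcd')
```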

```python
class IntRecord(Structured):
  sig: char[4]
  value: int32

class FloatRecord(Structured):
  sig: char[4]
  value: float32

class MyStruct(Structured):
  record: Annotated[IntRecord | FloatRecord, LookaheadDecider(char[4], attrgetter('record.sig'), {b'IIII': IntRecord, b'FFFF': FloatRecord}, None)]
```
For unpacking, this example first reads in 4 bytes (`char[4]`), then looks up that value in the
dictionary. If it was `b'IIII'` then it rewinds and unpacks an `IntRecord` (note: `IntRecord`'s
`sig` attribute will be set to those same 4 bytes, `b'IIII'`). If it was `b'FFFF'` it rewinds and
unpacks a `FloatRecord`, and if it was neither it raises an exception.

For packing, this example uses `attrgetter('record.sig')` on the object to decide how to pack it.
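
The "peek, then rewind" step is easy to picture with a seekable stream. This sketch uses only the standard library; `RECORDS` and `read_record` are hypothetical names, and the real decider may work differently internally.

```python
import io
import struct

# Hypothetical record layouts, keyed by their 4-byte signature.
RECORDS = {
    b'IIII': struct.Struct('<4si'),
    b'FFFF': struct.Struct('<4sf'),
}

def read_record(stream):
    """Peek at the signature, rewind, then unpack the whole record."""
    start = stream.tell()
    sig = stream.read(4)
    stream.seek(start)   # rewind so the record is unpacked including its sig
    try:
        layout = RECORDS[sig]
    except KeyError:
        raise ValueError(f'unknown record signature: {sig!r}') from None
    return layout.unpack(stream.read(layout.size))

stream = io.BytesIO(
    b'IIII' + struct.pack('<i', -7) + b'FFFF' + struct.pack('<f', 0.5)
)
first = read_record(stream)
second = read_record(stream)
```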


### Structured
You can also type-hint with one of your `Structured` derived classes, and the value will be unpacked
and packed just as expected.  `Structured` doesn't *fully* support `Generic`s, so make sure to read
the section on that to see how to hint properly with a `Generic` `Structured` class.

Example:
```python
class MyStruct(Structured):
  a: int8
  b: char[100]

class MyStruct2(Structured):
  magic: char[4]
  item: MyStruct
```


## Custom Types
When the above are not enough, and your problem is fairly simple, you can use `SerializeAs` to tell
the `Structured` class how to pack and unpack your custom type. To do so, you choose one of the
above "basic" types to use as its serialization method, then type-hint with `typing.Annotated` to
provide that information via a `SerializeAs` object.

For example, say you have a class that encapsulates an integer, providing some custom functionality.
You can tell your `Structured` class how to pack and unpack it. Say the value will be stored as a
4-byte unsigned integer:

```python
class MyInt:
  _wrapped: int

  def __init__(self, value: int) -> None:
    self._wrapped = value

  def __index__(self) -> int:
    return self._wrapped

class MyStruct(Structured):
  version: Annotated[MyInt, SerializeAs(uint32)]
```

If you use your type a lot, you can use a `TypeAlias` to make things easier:

```python
MyInt32: TypeAlias = Annotated[MyInt, SerializeAs(int32)]

class MyStruct(Structured):
  version: MyInt32
```

Note a few things required for this to work as expected:
- Your class needs to accept a single value as its initializer, which is the value unpacked by the
  serializer you specified in `SerializeAs`.
- Your class must be compatible with your chosen type for packing as well.  This means:
  - for integer-like types, it must have an `__index__` method.
  - for float-like types, it must have a `__float__` method.

Finally, if the `__init__` requirement is too constraining, you can supply a factory method for
creating your objects from the single unpacked value, and use `SerializeAs.with_factory` instead.
The factory method must accept the single unpacked value, and return an instance of your type.
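
The `__index__` and `__float__` requirements come from how Python's `struct` module converts arguments. A quick standard-library check, with `MyFloat` as a hypothetical float-like wrapper added for this sketch:

```python
import struct

class MyInt:
    def __init__(self, value: int) -> None:
        self._wrapped = value

    def __index__(self) -> int:
        return self._wrapped

class MyFloat:   # hypothetical float-like wrapper
    def __init__(self, value: float) -> None:
        self._wrapped = value

    def __float__(self) -> float:
        return self._wrapped

# struct calls __index__ for integer codes and __float__ for float codes.
packed_int = struct.pack('<I', MyInt(5))
packed_float = struct.pack('<f', MyFloat(0.5))

# Unpacking yields a plain int, which is why the wrapper's initializer
# (or a factory) has to accept the single unpacked value.
(raw,) = struct.unpack('<I', packed_int)
restored = MyInt(raw)
```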


## The `Structured` class
The above examples should give you the basics of defining your own `Structured`-derived class, but
there are a few details you probably want to know, including *how* to use it to pack and unpack
your data.


### dunders
- `__init__`: By default, `Structured` generates an `__init__` for your class which requires an
  initializer for each of the serializable types in your definition. You can block this generated
  `__init__` by passing `init=False` to the subclassing machinery. Keep in mind that whatever you
  decide, the final class's `__init__` must be compatible with being initialized in the original way
  (one value provided for each serializable member). Otherwise your class cannot be used as a
  type-hint or as the item type for `array`.
- `__eq__`: `Structured` instances can be compared for equality / inequality.  Comparison is done by
  comparing each of the instance variables that are serialized.  You can of course override this
  in your subclass to add more checks, and allow `super().__eq__` to handle the serializable types.
- `__str__`: `Structured` provides a nice string representation with the values of all its
  serializable types.
- `__repr__`: The repr is almost identical to `__str__`, just with angled brackets (`<>`).

### Class variables
There are three public class variables associated with your class:
- `.serializer`: This is the **serializer** (see: Serializers) used for packing and unpacking the
  instance variables of your class.
- `.byte_order`: This is a `ByteOrder` enum value showing the byte order and alignment option used
  for your class.
- `.attrs`: This is a tuple containing the names of the attributes which are serialized for you, in
  the order they were detected as serializable.  This can be helpful when troubleshooting why your
  class isn't working the way you intended.

### Packing methods
There are three ways you might pack the data contained in your class, two of which should be
familiar from Python's `struct` library:
- `pack() -> bytes`: This just packs your data into a bytestring and returns it.
- `pack_into(buffer, offset = 0) -> None`: This packs your data into an object supporting the
  [Buffer Protocol](https://docs.python.org/3/c-api/buffer.html), starting at the given offset.
- `pack_write(writable) -> None`: This packs your data, writing to the file-like object `writable`
  (which should be open in binary mode).
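
For comparison, these are the equivalent three operations done by hand with `struct.Struct`. The layout (`<4sB`) is a hypothetical stand-in for a class with a 4-byte magic and a `uint8` version.

```python
import io
import struct

layout = struct.Struct('<4sB')   # hypothetical layout: 4-byte magic + uint8

# 1. Pack to a new bytes object.
data = layout.pack(b'DEMO', 3)

# 2. Pack into an existing writable buffer, starting at offset 2.
buffer = bytearray(8)
layout.pack_into(buffer, 2, b'DEMO', 3)

# 3. Pack and write to a binary file-like object.
stream = io.BytesIO()
stream.write(layout.pack(b'DEMO', 3))
```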


### Unpacking methods
Similar to packing, there are three methods for unpacking data into an already existing instance of
your class. There are also three similar class methods for creating a new object from freshly
unpacked data:
- `unpack(buffer) -> None`: Unpacks data from a bytes-like buffer, assigning values to the instance.
- `unpack_from(buffer, offset=0) -> None`: Like `unpack`, but works with an object supporting the
  [Buffer Protocol](https://docs.python.org/3/c-api/buffer.html).
- `unpack_read(readable) -> None`: Reads data from a file-like object (which should be open in binary mode),
  unpacking until all instance variables are unpacked.
- `create_unpack(buffer) -> instance`: Class method that unpacks from a bytes-like buffer to create
  a new instance of your class.
- `create_unpack_from(buffer, offset=0) -> instance`: Class method that unpacks from a buffer
  supporting the [Buffer Protocol](https://docs.python.org/3/c-api/buffer.html) to create a new
  instance of your class.
- `create_unpack_read(readable) -> instance`: Class method that reads data from a file-like object
  until enough data has been processed to create a new instance of your class.
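
The same ideas expressed with the standard `struct` module, as a point of reference. The layout is again a hypothetical `<4sB`.

```python
import io
import struct

layout = struct.Struct('<4sB')   # hypothetical layout: 4-byte magic + uint8
data = b'\xff\xff' + b'DEMO\x03' + b'trailing'

# unpack_from accepts an offset and tolerates extra bytes after the record.
from_buffer = layout.unpack_from(data, 2)

# For a stream, read exactly as many bytes as the layout needs, leaving
# the rest of the stream untouched for whatever comes next.
stream = io.BytesIO(b'DEMO\x03' + b'next record...')
from_stream = layout.unpack(stream.read(layout.size))
remaining = stream.read()
```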


### Subclassing
Subclassing from your `Structured`-derived class is very straightforward. New members are inserted
after the previous ones in the serialization order. You can redefine the type of a super-class's member
and it will not change the order. For example, you could remove a super-class's serializable member
entirely from serialization, by redefining its type-hint with `None`.

Multiple inheritance from `Structured` classes is not supported (so no diamonds). By default, your
sub-class must also use the same `ByteOrder` option as its super-class. This is to prevent
unintended serialization errors, so if you really want to change the `ByteOrder`, you can pass
`byte_order_mode=ByteOrderMode.OVERRIDE` to the sub-classing machinery.


An example of using a different byte order than the super-class:
```python
class MyStructLE(Structured, byte_order=ByteOrder.LE):
  a: int8
  b: int32

class MyStructBE(MyStructLE, byte_order=ByteOrder.BE, byte_order_mode=ByteOrderMode.OVERRIDE):
  pass
```

A simple example of extending:
```python
class MyStructV1(Structured):
  size: uint32
  a: int8
  b: char[100]

class MyStructV2(MyStructV1):
  c: float32
```
Here, the sub-class will pack and unpack equivalent to the `struct` format `'Ib100sf'`.

An example of removing a member from serialization:
```python
class MyStruct(Structured):
  a: int8
  b: uint32
  c: float32

class DerivedStruct(MyStruct):
  b: None
```
Here, the sub-class will pack and unpack equivalent to the `struct` format `'bf'`.
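
The ordering rules described in this section (new members appended, redefinitions keep their slot, `None` removes a member) can be modelled with a plain dictionary. This is only a mental model, not the library's implementation; `merge_hints` and the string type names are made up for the sketch.

```python
def merge_hints(base: dict, derived: dict) -> dict:
    """Combine hints so that redefinitions keep their original position."""
    merged = dict(base)
    merged.update(derived)   # existing keys keep their slot; new keys append
    # A hint of None removes the member from serialization.
    return {name: hint for name, hint in merged.items() if hint is not None}

base_hints = {'a': 'int8', 'b': 'uint32', 'c': 'float32'}

# Redefining 'a' does not move it; the new member 'd' goes last.
extended = merge_hints(base_hints, {'d': 'char[4]', 'a': 'int16'})

# Hinting 'b' with None drops it, leaving the equivalent of 'bf'.
trimmed = merge_hints(base_hints, {'b': None})
```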


### Generics
`Structured` classes can be used with `typing.Generic`, and most things will work the way you want
with an extra step. In order for your specializations to detect the specialized `TypeVar`s, you must
subclass the specialization. After doing so, you have a concrete class which should serialize as you
expect.

Here's an example:
```python
T = TypeVar('T')  # requires: from typing import Generic, TypeVar

class MyGeneric(Generic[T], Structured):
  a: int32
  b: T
```
This generic class is equivalent to the `struct` format `i`, since it hasn't been specialized yet.
To make a concrete version, subclass:

```python
class MyGenericUint32(MyGeneric[uint32]):
  pass
```
This subclass now is equivalent to the `struct` format `iI`.

You can also use `TypeVar`s in `tuple`s, `array`s, `char`s, and `unicode`s, but similarly you will
have to sub-class in order to get the concrete implementation of your class.

NOTE: This means using your generic `Structured` class as the element type of `array` or `tuple`
won't work as expected unless you first sub-class to make the concrete version of it.
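
For background on why subclassing helps, this standard-library sketch shows what Python itself records when you subclass a specialization. `Box` and `IntBox` are hypothetical classes, and whether `structured` reads exactly these attributes is an assumption.

```python
from typing import Generic, TypeVar, get_args, get_origin

T = TypeVar('T')

class Box(Generic[T]):    # hypothetical generic class
    value: T

class IntBox(Box[int]):   # subclass of a specialization
    pass

# The unspecialized class only knows about the TypeVar itself.
params = Box.__parameters__

# The subclass records which specialization it was derived from, so the
# concrete type argument can be recovered at class-creation time.
specialized_base = IntBox.__orig_bases__[0]
origin = get_origin(specialized_base)
args = get_args(specialized_base)
```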


## Serializers
For those more interested in what goes on under the hood, or need more access to implement
serialization of a custom type, read on to learn about what **serializers** are and how they work!

Serializers use `typing.Generic` and `typing.TypeVarTuple` in their class hierarchy, so if you
want to include the types the serializer unpacks this *could* help find errors.  For example:

```python
class MySerializer(Serializer[int, int, float]):
  ...
```
would indicate that this serializer packs and unpacks three items, an `(int, int, float)`.

### The API
The `Serializer` class exposes a public API very similar to that of `struct.Struct`. All of these
methods must be implemented (unless noted otherwise) in order to work fully.

#### Attributes
- `.num_values: int`: In most cases this can just be a class variable. It represents the number of
  items unpacked or packed by the serializer.  For example, a `StructSerializer('2I')` has
  `num_values == 2`.  Note that `array` has `num_values == 1`, since it unpacks a *single* list.
- `.size`: This is similar to `struct.Struct.size`.  It holds the number of bytes required for a
  pack or unpack operation. However unlike `struct.Struct`, the serializer may not know this size
  until the item(s) have been fully packed or unpacked. For this reason, the `.size` attribute is
  only required to be up to date with the most recently completed pack or unpack call.

#### Packing methods
- `.prepack(self, partial_object) -> Serializer` (**not required**): This will be called just prior
  to any of the pack methods of the `Serializer`, with a (maybe proxy of) the `Structured` object to
  be packed. This is to allow union serializers (for example) to make decisions based on the state
  of the object to be packed.  This method should return an appropriate serializer to be used for
  packing, based on the information contained in `partial_object`.  In most cases, the default
  implementation will do just fine, which just returns itself unchanged.
- `.pack(self, *values) -> bytes`: Pack the values according to this serializer's logic. The number
  of items in `values` must be `.num_values`.  Return the values in packed `bytes` form.
- `.pack_into(self, buffer, offset, *values) -> None`: Pack the values into an object supporting the
  [Buffer Protocol](https://docs.python.org/3/c-api/buffer.html), at the given offset.
- `.pack_write(self, writable, *values) -> None`: Pack the values and write them to the file-like
  object `writable`.

#### Unpacking methods
- `.preunpack(self, partial_object) -> Serializer` (**not required**): This will be called just
  prior to any of the unpack methods of the `Serializer`, with a (maybe proxy of) the `Structured`
  object to be unpacked. This means the only attributes guaranteed to exist on the object are
  those that were serialized *before* those handled by this serializer. Again, in most cases the
  default implementation should work fine, which just returns itself unchanged.
- `.unpack(self, byteslike) -> Iterable`: Unpack from the bytes-like object, returning the values in
  an iterable. In most cases, just returning the values in a tuple should be fine. Iterables are
  supported so that the partial-proxy objects can have their attributes set more easily during
  unpacking.  Note: the number of values in the iterable must be `.num_values`. NOTE: unlike
  `struct.unpack`, the byteslike object is not required to be the *exact* length needed for
  unpacking, only *at least* as long as required.
- `.unpack_from(self, buffer, offset=0) -> Iterable`: Like `.unpack`, but from an object supporting
  the [Buffer Protocol](https://docs.python.org/3/c-api/buffer.html), at the given offset.
- `.unpack_read(self, readable) -> Iterable`: Like `.unpack`, but reading the data from the
  file-like object `readable`.

#### Other
- `.with_byte_order(self, byte_order: ByteOrder) -> Serializer`: Return a (possibly new) serializer
  configured to use the `ByteOrder` specified.  The default implementation returns itself unchanged,
  but in most cases this should be overridden with a correct implementation.
- `.__add__(self, other) -> Serializer` (**not required**): The final serializer used for a
  `Structured` class is determined by "adding" all of the individual serializers for each attribute
  together.  In most cases the default implementation will suffice.  You can provide your own
  implementation if optimizations can be made (for example, see `StructSerializer`'s implementation).
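
To make the shape of that API concrete, here is a standalone class that mirrors the attribute and method names listed above for a fixed pair of `uint16` values. `PairSerializer` is hypothetical and deliberately does **not** subclass the real `Serializer`; it omits `prepack`, `preunpack`, `with_byte_order` and `__add__`, which the real base class provides or requires.

```python
import io
import struct

class PairSerializer:
    """Hypothetical stand-in that mirrors the method names listed above."""

    num_values = 2   # packs and unpacks exactly two values

    def __init__(self) -> None:
        self._struct = struct.Struct('<HH')
        self.size = self._struct.size   # fixed here, so known up front

    def pack(self, *values) -> bytes:
        return self._struct.pack(*values)

    def pack_into(self, buffer, offset, *values) -> None:
        self._struct.pack_into(buffer, offset, *values)

    def pack_write(self, writable, *values) -> None:
        writable.write(self.pack(*values))

    def unpack(self, byteslike):
        # unpack_from accepts buffers longer than needed, matching the
        # "at least as long as required" note above.
        return self._struct.unpack_from(byteslike, 0)

    def unpack_from(self, buffer, offset=0):
        return self._struct.unpack_from(buffer, offset)

    def unpack_read(self, readable):
        return self._struct.unpack(readable.read(self.size))
```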


### The "building" Serializers
There are a few basic serializers used for building others:
- `NullSerializer`: This is a serializer that packs and unpacks nothing. This will be the serializer
  used by a `Structured` class if *no* serializable instance variables are detected. It is also used
  as the starting value to `sum(...)` when generating the final serializer for a `Structured` class.
- `CompoundSerializer`: This is a generic "chaining" serializer. Most serializers don't have an
  easy way to combine their logic, so `CompoundSerializer` handles the logic of calling the packing
  and unpacking methods one after another. This is a common serializer to see as the final
  serializer for a `Structured` class. This is also an interesting example to see how to handle
  variable `.size`, and handling `.preunpack` and `.prepack`.
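
The "one after another" chaining can be pictured as unpacking several layouts in sequence while tracking a running offset. This is only a sketch of the idea using `struct.Struct` objects; `unpack_chain` is a made-up helper, not part of `structured`.

```python
import struct

def unpack_chain(layouts, data, offset=0):
    """Unpack each layout in turn, advancing the offset by its size."""
    values = []
    start = offset
    for layout in layouts:
        values.extend(layout.unpack_from(data, offset))
        offset += layout.size
    # The total size is only known once every part has been processed.
    return values, offset - start

layouts = [struct.Struct('<B'), struct.Struct('<2s'), struct.Struct('<H')]
values, consumed = unpack_chain(layouts, b'\x01ab\x02\x00rest')
```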


### Specific Serializers
The rest of the Serializer classes are for handling specific serialization types.  They range from
very simple, to quite complex.

- `StructSerializer`: For packing/unpacking types which can be directly related to a constant
  `struct` format string.  For example, `uint32` is implemented as
  `Annotated[int, StructSerializer('I')]`.
- `StructActionSerializer`: This is the class used for custom types that can be handled by a
  `StructSerializer`, but need a custom callable applied to the result(s) to convert them to their
  final type.  It is almost identical to `StructSerializer`, but calls an `action` on each value
  unpacked.
- `TupleSerializer`: A fairly simple serializer that handles the `tuple` type-hints.
- `AUnion`: The base for both union serializers.
- `LookbackDecider`: The union serializer which allows for reading attributes already unpacked on
  the object to make a decision.
- `LookaheadDecider`: The union serializer which unpacks a little data then rewinds, using the
  unpacked value to make a decision.
- `StructuredSerializer`: A fairly simple serializer to handle translating the `Structured` class
  methods into the `Serializer` API.
- `DynamicCharSerializer`: The serializer used to handle `char[uint*]` type-hints.
- `TerminatedCharSerializer`: The serializer used to handle `char[b'\x00']` type-hints.
- `UnicodeSerializer`: A wrapper around one of the `char[]` serializers to handle encoding on
  packing and decoding on unpacking.


### Type detection
This is a very internal-level detail, but may be required if you write your own `Serializer` class.

Almost all of the typehints use `typing.Annotated` to specify the `Serializer` instance to use for
a hint. In most cases, it's as simple as creating your serializer, then defining a type using this.
See all of the "basic" types for example.  In some more complicated examples, which are configured
via the `__class_getitem__` method, these return `Annotated` objects with the correct serializer.

In any case, the `Structured` class detects the serializers by inspecting the `Annotated` objects
for serializers.  To support things like `a: Annotated[int, int8]`, it even recursively looks inside
nested `Annotated` objects. For most of this work, `structured` internally uses a singleton object
`structured.type_checking.annotated` to help extract this information. There is a step to perform
extra transformations on these `Annotated` extras, that a new `Serializer` you implement might need
to work.  See, for example, `TupleSerializer` and `StructuredSerializer` for where that might
be necessary.
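
As a rough picture of the recursive lookup, the following standard-library sketch searches an `Annotated` hint, and any `Annotated` objects nested in its extras, for a marker object. `Marker`, `Int8` and `find_marker` are hypothetical stand-ins; the real library uses its own `Serializer` classes and helper object.

```python
from typing import Annotated, get_args, get_origin, get_type_hints

class Marker:
    """Hypothetical stand-in for a serializer instance."""
    def __init__(self, fmt: str) -> None:
        self.fmt = fmt

def find_marker(hint):
    """Return the first Marker found in an Annotated hint, else None."""
    if get_origin(hint) is Annotated:
        _base, *extras = get_args(hint)
        for extra in extras:
            if isinstance(extra, Marker):
                return extra
            # An extra may itself be an Annotated alias: look inside it.
            found = find_marker(extra)
            if found is not None:
                return found
    return None

Int8 = Annotated[int, Marker('b')]

class Example:
    a: Int8                    # marker attached directly
    b: Annotated[int, Int8]    # marker nested one level down
    c: int                     # not serializable

# include_extras=True keeps the Annotated metadata in the result.
hints = get_type_hints(Example, include_extras=True)
found = {name: find_marker(hint) for name, hint in hints.items()}
```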

Raw data

            {
    "_id": null,
    "home_page": "",
    "name": "structured-classes",
    "maintainer": "",
    "docs_url": null,
    "requires_python": ">=3.9",
    "maintainer_email": "",
    "keywords": "",
    "author": "lojack5",
    "author_email": "",
    "download_url": "https://files.pythonhosted.org/packages/32/98/968b53b5a668b76f1d0e1f43ea0bf682ab83aca64002c40e823581f2c1df/structured_classes-3.0.0.tar.gz",
    "platform": null,
    "description": "![tests](https://github.com/lojack5/structured/actions/workflows/tests.yml/badge.svg)\r\n[![License](https://img.shields.io/badge/License-BSD_3--Clause-blue.svg)](https://opensource.org/licenses/BSD-3-Clause)\r\n[![Code style: black](https://img.shields.io/badge/code%20style-black-000000.svg)](https://github.com/psf/black)\r\n\r\n\r\n# structured - creating classes which pack and unpack with Python's `struct` module.\r\nThis is a small little library to let you leverage type hints to define classes which can also be packed and unpacked using Python's `struct` module.  The basic usage is almost like a dataclass:\r\n\r\n```python\r\nclass MyClass(Structured):\r\n  file_magic: char[4]\r\n  version: uint8\r\n\r\na = MyClass(b'', 0)\r\n\r\nwith open('some_file.dat', 'rb') as ins:\r\n  a.unpack_read(ins)\r\n```\r\n\r\n# Contents\r\n\r\n1. [Hint Types](#hint-types): For all the types you can use as type-hints.\r\n    - [Basic Types](#basic-types)\r\n    - [Complex Types](#complex-types)\r\n    - [Custom Types](#custom-types)\r\n2. [The `Structured` class](#the-structured-class)\r\n3. [Generics](#generics)\r\n4. [Serializers](#serializers)\r\n\r\n\r\n# Hint Types\r\nIf you just want to use the library, these are the types you use to hint your instance variables to\r\nmake them detected as serialized by the packing/unpacking logic. I'll use the term **serializable**\r\nto mean a hinted type that results in the variable being detected by the `Structured` class as being\r\nhandled for packing and unpacking. 
They're broken up into two basic\r\ncatergories:\r\n- Basic Types: Those with direct correlation to the `struct` format specifiers, needing no extra\r\n  logic.\r\n- Complex Types: Those still using `struct` for packing and unpacking, but requiring extra logic\r\n  so they do not always have the same format specifier.\r\n- Custom Types: You can use your own custom classes and specify how they should be packed and\r\n  unpacked.\r\n\r\n\r\nAlmost all types use `typing.Annotated` under the hood to just add extra serialization information\r\nto the type they represent. For example `bool8` is defined as\r\n`Annotated[bool8, StructSerializer('?')]`, so type-checkers will properly see it as a `bool`.\r\n\r\nThere are four exceptions to this.  For these types, almost everything should pass inspection by a\r\ntype-checker, except for assignment.  These are:\r\n- `char`: subclassed from `bytes`.\r\n- `pascal`: subclassed from `bytes`.\r\n- `unicode`: subclassed from `str`.\r\n- `array`: subclassed from `list`.\r\n\r\nIf you want to work around this, you can use `typing.Annotated` yourself to appease the\r\ntype-checker:\r\n\r\n```python\r\nclass MyStruct1(Structured):\r\n  name: unicode[100]\r\n\r\nitem = MyStruct('Jane Doe')\r\nitem.name = 'Jessica'   # Type-checker complains about \"str incompatible with unicode\".\r\n\r\nclass MyStruct2(Structured):\r\n  name: Annotated[str, unicode[100]]\r\n\r\nitem = MyStruct('Jane Doe')\r\nitem.name = 'Jessica'   # No complaint from the type-checker.\r\n```\r\n\r\n\r\n## Basic Types\r\nAlmost every format specifier in `struct` is supported as a type:\r\n\r\n| `struct` format | structured type | Python type | Notes |\r\n|:---------------:|:---------------:|-------------|:-----:|\r\n| `x`             | `pad`           |             |(1)(3) |\r\n| `c`             | equivalent to `char[1]` | `bytes` with length 1 | |\r\n| `?`             | `bool8`         | `int`       |       |\r\n| `b`             | `int8`          | `int`       |       
|\r\n| `B`             | `uint8`         | `int`       |       |\r\n| `h`             | `int16`         | `int`       |       |\r\n| `H`             | `uint16`        | `int`       |       |\r\n| `i`             | `int32`         | `int`       |       |\r\n| `I`             | `uint32`        | `int`       |       |\r\n| `q`             | `int64`         | `int`       |       |\r\n| `Q`             | `uint64`        | `int`       |       |\r\n| `n`             | not supported   |             |       |\r\n| `N`             | not supported   |             |       |\r\n| `e`             | `float16`       | `float`     |  (2)  |\r\n| `f`             | `float32`       | `float`     |       |\r\n| `d`             | `float64`       | `float`     |       |\r\n| `s`             | `char`          | `bytes`     |  (1)  |\r\n| `p`             | `pascal`        | `bytes`     |  (1)  |\r\n| `P`             | not supported   |             |       |\r\n\r\nNotes:\r\n 1. These types must be indexed to specify their length.  For a single byte `char` for example\r\n    (`'s'`), use `char[1]`.\r\n 2. The 16-bit float type is not supported on all platforms.\r\n 3. Pad variables are skipped and not actually assigned when unpacking, nor used when packing. There\r\n    is a special metaclass hook to allow you to name all of your pad variables `_`, and they still\r\n    **all** count towards the final format specifier.  If you want to be able to override their\r\n    type-hint in subclasses, choose a name other than `_`.\r\n\r\nConsecutive variables with any of these type-hints will be combined into a single `struct` format\r\nspecifier.  
Keep in mind that Python's `struct` module may insert extra padding bytes *between*\r\n(but never before or after) format specifiers, depending on the Byte Order specification used.\r\n\r\nExample:\r\n\r\n```python\r\nclass MyStruct(Structured):\r\n  a: int8\r\n  b: int8\r\n  c: uint32\r\n  _: pad[4]\r\n  d: char[10]\r\n  _: pad[2]\r\n  e: uint32\r\n```\r\nIn this example, all of the instance variables are of the \"basic\" type, so the final result will be\r\nas if packing or unpacking with `struct` using a format of `2bI4x10s2xI`.  Note we took advantage of\r\nthe `Structured` metaclass to specify the padding using the same name `_`.\r\n\r\n\r\n### Byte Order\r\nAll of the specifiers are supported; the default is to use no specifier:\r\n\r\n| `struct` specifier | `ByteOrder` |\r\n|:------------------:|:-----------:|\r\n| `<`                | `LITTLE_ENDIAN` (or `LE`) |\r\n| `>`                | `BIG_ENDIAN` (or `BE`) |\r\n| `=`                | `NATIVE_STANDARD` |\r\n| `@`                | `NATIVE_NATIVE` |\r\n| `!`                | `NETWORK`   |\r\n\r\nTo specify a byte order, pass `byte_order=ByteOrder.<option>` to the `Structured` sub-classing\r\nmachinery, like so:\r\n\r\n```python\r\nclass MyStruct(Structured, byte_order=ByteOrder.NETWORK):\r\n  a: int8\r\n  b: uint32\r\n```\r\nIn this example, the `NETWORK` (`!`) specifier was used, so `struct` will not insert any padding\r\nbytes between variables `a` and `b`, and multi-byte values will be unpacked as Big Endian numbers.\r\n\r\n\r\n## Complex Types\r\nAll other types fall into the \"complex\" category.  
They currently consist of:\r\n- `tuple`: Fixed length tuples of serializable objects.\r\n- `array`: Lists of a single type of serializable object.\r\n- `char`: Although `char[3]` (or any other integer) is considered a basic type, `char` also supports\r\n  variable length strings.\r\n- `unicode`: A wrapper around `char` to add automatic encoding on pack and decoding on unpack.\r\n- `unions`: Unions of serializable types are supported as well.\r\n- `Structured`-derived types: You can use any of your `Structured`-derived classes as a type-hint,\r\n  and the variable will be serialized as well.\r\n\r\n\r\n### Tuples\r\nBoth the `tuple` and `Tuple` type-hints are supported, including `TypeVar`s (see: `Generics`). To be\r\ndetected as serializable, the `tuple` type-hint must be for a fixed sized `tuple` (so no ellipsis\r\n`...`), and each type-hint in the `tuple` must be a serializable type.\r\n\r\nExample:\r\n```python\r\nclass MyStruct(Structured):\r\n  position: tuple[int8, int8]\r\n  size: tuple[int8, int8]\r\n```\r\n\r\n### Arrays\r\nArrays are `list`s of one kind of serializable type. You do need to specify how `Structured` will\r\ndetermine the *length* of the list when unpacking, and how to write it when packing. To do this,\r\nyou choose a `Header` type. The final type-hint for your list then becomes\r\n`array[<header_type>, <item_type>]`. Arrays also support `TypeVar`s.\r\n\r\nHere are the header types:\r\n- `Header[3]` (or any other positive integer): A fixed length array. No length byte is packed or\r\n  unpacked, just the fixed number of items. When packing, if the list doesn't contain the fixed\r\n  number of elements specified, a `ValueError` is raised.\r\n- `Header[uint32]` (or any other `uint*`-type): An array with the length stored as a `uint32` (or\r\n  other `uint*`-type) just before the items.\r\n- `Header[uint32, uint16]`: An array with two values stored just prior to the items. 
The first\r\n  value (in this case a `uint32`) is the length of the array. The second value (in this case a\r\n  `uint16`) denotes how many bytes of data the array items take up. When unpacking, this size is\r\n  checked against how many bytes were actually required to unpack that many items. In the case of a\r\n  mismatch, a `ValueError` will be raised.\r\n\r\nExample:\r\n```python\r\nclass MyItem(Structured):\r\n  name: unicode[100]\r\n  age: uint8\r\n\r\nclass MyStruct(Structured):\r\n  students: array[Header[uint32], MyItem]\r\n```\r\n\r\n\r\n### `char`\r\nFor unpacking bytes other than with a fixed length, you have a few more options with `char`:\r\n- `char[uint8]` (or any other `uint*` type): This indicates that a value (a `uint8` in this case)\r\n  will be packed/unpacked just prior to the `bytes`.  The value holds the number of `bytes` to pack or\r\n  unpack.\r\n- `char[b'\\0']` (or any other single byte): This indicates a terminated byte-string. For\r\n  unpacking, bytes will be read until the terminator is encountered (the terminator will be\r\n  discarded). For packing, the `bytes` will be written, and a terminator will be written at the end\r\n  if not already written.  The usual case for this is NULL-terminated byte-strings, so a quick alias\r\n  for that is provided: `null_char`.\r\n\r\n\r\n### `unicode`\r\nFor cases where you want to read a byte-string and treat it as text, `unicode` will automatically\r\nencode/decode it for you.  The options are the same as for `char`, but with an optional second\r\nargument to specify how to encode/decode it.  The second option can either be a string indicating\r\nthe encoding to use (defaults to `utf8`), or for more complex solutions you may provide an\r\n`EncoderDecoder` class.  
Similar to `char`, we provide `null_unicode` as an alias for\r\n`unicode[b'\\0', 'utf8']`.\r\n\r\n```python\r\nclass MyStruct(Structured):\r\n  name: null_unicode\r\n  description: unicode[255, 'ascii']\r\n  bio: unicode[uint32, MyEncoderDecoder]\r\n```\r\n\r\nTo write a custom encoder-decoder, you must subclass from `EncoderDecoder` and implement the two\r\nclass methods `encode` and `decode`:\r\n\r\n```python\r\nclass MyEncoderDecoder(EncoderDecoder):\r\n  @classmethod\r\n  def encode(cls, text: str) -> bytes: ...\r\n\r\n  @classmethod\r\n  def decode(cls, bytestring: bytes) -> str: ...\r\n```\r\n\r\n\r\n### Unions\r\nSometimes, the data structure you're packing/unpacking depends on certain conditions.  Maybe a\r\n`uint8` is used to indicate what follows next.  In cases like this, `Structured` supports unions in\r\nits type-hints.  To hint for this, you need to do three things:\r\n1. Every type in your union must be a serializable type.\r\n2. You need to create a *decider* which will perform the logic on deciding how to unpack the data.\r\n3. Use `typing.Annotated` to indicate the decider to use for packing/unpacking.\r\n\r\n#### Deciders\r\nAll deciders provide some method to take in information and produce a value to be used to make a\r\ndecision. The decision is made with a \"decision map\", which is a mapping of value to serializable\r\ntypes. You can also provide a default serializable type, or `None` if you want an error to be raised\r\nif your decision method doesn't produce a value in the decision map.\r\n\r\nThere are currently two deciders.  In addition to the decision map and default, you will need to\r\nprovide a few more things for each:\r\n- `LookbackDecider`: You provide a method that accepts the object to be packed/unpacked and produces\r\n  a decision value.  Commonly, `operator.attrgetter` is used here.  
A minor detail: for unpacking\r\n  operations, the object sent to your method will not be the actual unpacked object, merely a proxy\r\n  with the values unpacked so far set on it.\r\n- `LookaheadDecider`: For packing, this behaves just like `LookbackDecider`.  For unpacking, you\r\n  need to specify a serializable type which is unpacked first and used as the value to look up\r\n  in the decision map.  After this first value is unpacked, the data-stream is rewound for\r\n  unpacking the object.\r\n\r\nHere are a few examples:\r\n```python\r\nclass MyStruct(Structured):\r\n  a_type: uint8\r\n  a: Annotated[uint32 | float32 | char[4], LookbackDecider(attrgetter('a_type'), {0: uint32, 1: float32}, char[4])]\r\n```\r\nThis example first unpacks a `uint8` and stores it in `a_type`. The union `a` polls that value with\r\n`attrgetter`: if the value is 0 it unpacks a `uint32`, if it is 1 it unpacks a `float32`, and if it\r\nis anything else it unpacks just 4 bytes (raw data), storing whatever was unpacked in `a`.\r\n\r\n```python\r\nclass IntRecord(Structured):\r\n  sig: char[4]\r\n  value: int32\r\n\r\nclass FloatRecord(Structured):\r\n  sig: char[4]\r\n  value: float32\r\n\r\nclass MyStruct(Structured):\r\n  record: Annotated[IntRecord | FloatRecord, LookaheadDecider(char[4], attrgetter('record.sig'), {b'IIII': IntRecord, b'FFFF': FloatRecord}, None)]\r\n```\r\nFor unpacking, this example first reads in 4 bytes (`char[4]`), then looks up that value in the\r\ndictionary. If it was `b'IIII'` then it rewinds and unpacks an `IntRecord` (note: `IntRecord`'s\r\n`sig` attribute will be set to `char[4]`). If it was `b'FFFF'` it rewinds and unpacks a\r\n`FloatRecord`, and if it was neither it raises an exception.\r\n\r\nFor packing, this example uses `attrgetter('record.sig')` on the object to decide how to pack it.\r\n\r\n\r\n### Structured\r\nYou can also type-hint with one of your `Structured` derived classes, and the value will be unpacked\r\nand packed just as expected. 
 `Structured` doesn't *fully* support `Generic`s, so make sure to read\r\nthe section on that to see how to hint properly with a `Generic` `Structured` class.\r\n\r\nExample:\r\n```python\r\nclass MyStruct(Structured):\r\n  a: int8\r\n  b: char[100]\r\n\r\nclass MyStruct2(Structured):\r\n  magic: char[4]\r\n  item: MyStruct\r\n```\r\n\r\n\r\n## Custom Types\r\nWhen the above are not enough, and your problem is fairly simple, you can use `SerializeAs` to tell\r\nthe `Structured` class how to pack and unpack your custom type. To do so, you choose one of the\r\nabove \"basic\" types to use as its serialization method, then type-hint with `typing.Annotated` to\r\nprovide that information via a `SerializeAs` object.\r\n\r\nFor example, say you have a class that encapsulates an integer, providing some custom functionality.\r\nYou can tell your `Structured` class how to pack and unpack it. Say the value will be stored as a\r\n4-byte unsigned integer:\r\n\r\n```python\r\nclass MyInt:\r\n  _wrapped: int\r\n\r\n  def __init__(self, value: int) -> None:\r\n    self._wrapped = value\r\n\r\n  def __index__(self) -> int:\r\n    return self._wrapped\r\n\r\nclass MyStruct(Structured):\r\n  version: Annotated[MyInt, SerializeAs(uint32)]\r\n```\r\n\r\nIf you use your type a lot, you can use a `TypeAlias` to make things easier:\r\n\r\n```python\r\nMyInt32: TypeAlias = Annotated[MyInt, SerializeAs(int32)]\r\n\r\nclass MyStruct(Structured):\r\n  version: MyInt32\r\n```\r\n\r\nNote a few things required for this to work as expected:\r\n- Your class needs to accept a single value as its initializer, which is the value unpacked by the\r\n  serializer you specified in `SerializeAs`.\r\n- Your class must be compatible with your chosen type for packing as well.  
This means:\r\n  - for integer-like types, it must have an `__index__` method.\r\n  - for float-like types, it must have a `__float__` method.\r\n\r\nFinally, if the `__init__` requirement is too constraining, you can supply a factory method for\r\ncreating your objects from the single unpacked value, and use `SerializeAs.with_factory` instead.\r\nThe factory method must accept the single unpacked value, and return an instance of your type.\r\n\r\n\r\n## The `Structured` class\r\nThe above examples should give you the basics of defining your own `Structured`-derived class, but\r\nthere are a few details you probably want to know, as well as *how* to use it to pack and unpack your\r\ndata.\r\n\r\n\r\n### dunders\r\n- `__init__`: By default, `Structured` generates an `__init__` for your class which requires an\r\n  initializer for each of the serializable types in your definition. You can block this generated\r\n  `__init__` by passing `init=False` to the subclassing machinery. Keep in mind, whatever you\r\n  decide, the final class's `__init__` must be compatible with being initialized in the original way\r\n  (one value provided for each serializable member). Otherwise your class cannot be used as a\r\n  type-hint or as the item type for `array`.\r\n- `__eq__`: `Structured` instances can be compared for equality / inequality.  Comparison is done by\r\n  comparing each of the instance variables that are serialized.  
You can of course override this\r\n  in your subclass to add more checks, and allow `super().__eq__` to handle the serializable types.\r\n- `__str__`: `Structured` provides a nice string representation with the values of all its\r\n  serializable types.\r\n- `__repr__`: The repr is almost identical to `__str__`, just with angled brackets (`<>`).\r\n\r\n### Class variables\r\nThere are three public class variables associated with your class:\r\n- `.serializer`: This is the **serializer** (see: Serializers) used for packing and unpacking the\r\n  instance variables of your class.\r\n- `.byte_order`: This is a `ByteOrder` enum value showing the byte order and alignment option used\r\n  for your class.\r\n- `.attrs`: This is a tuple containing the names of the attributes which are serialized for you, in\r\n  the order they were detected as serializable.  This can be helpful when troubleshooting why your\r\n  class isn't working the way you intended.\r\n\r\n### Packing methods\r\nThere are three ways you might pack the data contained in your class, two should be familiar from\r\nPython's `struct` library:\r\n- `pack() -> bytes`: This just packs your data into a bytestring and returns it.\r\n- `pack_into(buffer, offset = 0) -> None`: This packs your data into an object supporting the\r\n  [Buffer Protocol](https://docs.python.org/3/c-api/buffer.html), starting at the given offset.\r\n- `pack_write(writable) -> None`: This packs your data, writing to the file-like object `writable`\r\n  (which should be open in binary mode).\r\n\r\n\r\n### Unpacking methods\r\nSimilar to packing, there are three methods for unpacking data into an already existing instance of\r\nyour class. 
There are also three similar class methods for creating a new object from freshly\r\nunpacked data:\r\n- `unpack(buffer) -> None`: Unpacks data from a bytes-like buffer, assigning values to the instance.\r\n- `unpack_from(buffer, offset=0) -> None`: Like `unpack`, but works with an object supporting the\r\n  [Buffer Protocol](https://docs.python.org/3/c-api/buffer.html).\r\n- `unpack_read(readable)`: Reads data from a file-like object (which should be open in binary mode),\r\n  unpacking until all instance variables are unpacked.\r\n- `create_unpack(buffer) -> instance`: Class method that unpacks from a bytes-like buffer to create\r\n  a new instance of your class.\r\n- `create_unpack_from(buffer, offset=0) -> instance`: Class method that unpacks from a buffer\r\n  supporting the [Buffer Protocol](https://docs.python.org/3/c-api/buffer.html) to create a new\r\n  instance of your class.\r\n- `create_unpack_read(readable) -> instance`: Class method that reads data from a file-like object\r\n  until enough data has been processed to create a new instance of your class.\r\n\r\n\r\n### Subclassing\r\nSubclassing from your `Structured`-derived class is very straightforward. New members are inserted\r\nafter the previous ones in the serialization order. You can redefine the type of a super-class's member\r\nand it will not change the order. For example, you could remove a super-class's serializable member\r\nentirely from serialization, by redefining its type-hint with `None`.\r\n\r\nMultiple inheritance from `Structured` classes is not supported (so no diamonds). By default, your\r\nsub-class must also use the same `ByteOrder` option as its super-class. 
This is to prevent\r\nunintended serialization errors, so if you really want to change the `ByteOrder`, you can pass\r\n`byte_order_mode=ByteOrderMode.OVERRIDE` to the sub-classing machinery.\r\n\r\n\r\nAn example of using a different byte order than the super-class:\r\n```python\r\nclass MyStructLE(Structured, byte_order=ByteOrder.LE):\r\n  a: int8\r\n  b: int32\r\n\r\nclass MyStructBE(MyStructLE, byte_order=ByteOrder.BE, byte_order_mode=ByteOrderMode.OVERRIDE):\r\n  pass\r\n```\r\n\r\nA simple example of extending:\r\n```python\r\nclass MyStructV1(Structured):\r\n  size: uint32\r\n  a: int8\r\n  b: char[100]\r\n\r\nclass MyStructV2(MyStructV1):\r\n  c: float32\r\n```\r\nHere, the sub-class will pack and unpack equivalent to the `struct` format `'Ib100sf'`.\r\n\r\nAn example of removing a member from serialization:\r\n```python\r\nclass MyStruct(Structured):\r\n  a: int8\r\n  b: uint32\r\n  c: float32\r\n\r\nclass DerivedStruct(MyStruct):\r\n  b: None\r\n```\r\nHere, the sub-class will pack and unpack equivalent to the `struct` format `'bf'`.\r\n\r\n\r\n### Generics\r\n`Structured` classes can be used with `typing.Generic`, and most things will work the way you want\r\nwith an extra step. In order for your specializations to detect the specialized `TypeVar`s, you must\r\nsubclass the specialization. 
After doing so, you have a concrete class which should serialize as you\r\nexpect.\r\n\r\nHere's an example:\r\n```python\r\nclass MyGeneric(Generic[T], Structured):\r\n  a: int32\r\n  b: T\r\n```\r\nThis generic class is equivalent to the `struct` format `i`, since it hasn't been specialized yet.\r\nTo make a concrete version, subclass:\r\n\r\n```python\r\nclass MyGenericUint32(MyGeneric[uint32]):\r\n  pass\r\n```\r\nThis subclass now is equivalent to the `struct` format `iI`.\r\n\r\nYou can also use `TypeVar`s in `tuple`s, `array`s, `char`s, and `unicode`s, but similarly you will\r\nhave to sub-class in order to get the concrete implementation of your class.\r\n\r\nNOTE: This means using your generic `Structured` class as the element type of `array` or `tuple`\r\nwon't work as expected unless you first sub-class to make the concrete version of it.\r\n\r\n\r\n## Serializers\r\nFor those more interested in what goes on under the hood, or who need more access to implement\r\nserialization of a custom type, read on to learn about what **serializers** are and how they work!\r\n\r\nSerializers use `typing.Generic` and `typing.TypeVarTuple` in their class hierarchy, so if you\r\nwant to include the types the serializer unpacks this *could* help find errors.  For example:\r\n\r\n```python\r\nclass MySerializer(Serializer[int, int, float]):\r\n  ...\r\n```\r\nwould indicate that this serializer packs and unpacks three items, an `(int, int, float)`.\r\n\r\n### The API\r\nThe `Serializer` class exposes a public API very similar to that of `struct.Struct`. All of these\r\nmethods must be implemented (unless noted otherwise) in order to work fully.\r\n\r\n#### Attributes\r\n- `.num_values: int`: In most cases this can just be a class variable, this represents the number of\r\n  items unpacked or packed by the serializer.  For example, a `StructSerializer('2I')` has\r\n  `num_values == 2`.  
Note that `array` has `num_values == 1`, since it unpacks a *single* list.\r\n- `.size`: This is similar to `struct.Struct.size`.  It holds the number of bytes required for a\r\n  pack or unpack operation. However unlike `struct.Struct`, the serializer may not know this size\r\n  until the item(s) have been fully packed or unpacked. For this reason, the `.size` attribute is\r\n  only required to be up to date with the most recently completed pack or unpack call.\r\n\r\n#### Packing methods\r\n- `.prepack(self, partial_object) -> Serializer` (**not required**): This will be called just prior\r\n  to any of the pack methods of the `Serializer`, with a (maybe proxy of) the `Structured` object to\r\n  be packed. This is to allow union serializers (for example) to make decisions based on the state\r\n  of the object to be packed.  This method should return an appropriate serializer to be used for\r\n  packing, based on the information contained in `partial_object`.  In most cases, the default\r\n  implementation will do just fine, which just returns itself unchanged.\r\n- `.pack(self, *values) -> bytes`: Pack the values according to this serializer's logic. The number\r\n  of items in `values` must be `.num_values`.  Return the values in packed `bytes` form.\r\n- `.pack_into(self, buffer, offset, *values) -> None`: Pack the values into an object supporting the\r\n  [Buffer Protocol](https://docs.python.org/3/c-api/buffer.html), at the given offset.\r\n- `.pack_write(self, writable, *values) -> None`: Pack the values and write them to the file-like\r\n  object `writable`.\r\n\r\n#### Unpacking methods\r\n- `.preunpack(self, partial_object) -> Serializer` (**not required**): This will be called just\r\n  prior to any of the unpack methods of the `Serializer`, with a (maybe proxy of) the `Structured`\r\n  object to be unpacked. This means the only attributes guaranteed to exist on the object are\r\n  those that were serialized *before* those handled by this serializer. 
Again, in most cases the\r\n  default implementation should work fine, which just returns itself unchanged.\r\n- `.unpack(self, byteslike) -> Iterable`: Unpack from the bytes-like object, returning the values in\r\n  an iterable. In most cases, just returning the values in a tuple should be fine. Iterables are\r\n  supported so that the partial-proxy objects can have their attributes set more easily during\r\n  unpacking.  Note: the number of values in the iterable must be `.num_values`. NOTE: unlike\r\n  `struct.unpack`, the byteslike object is not required to be the *exact* length needed for\r\n  unpacking, only *at least* as long as required.\r\n- `.unpack_from(self, buffer, offset=0) -> Iterable`: Like `.unpack`, but from an object supporting\r\n  the [Buffer Protocol](https://docs.python.org/3/c-api/buffer.html), at the given offset.\r\n- `.unpack_read(self, readable) -> Iterable`: Like `.unpack`, but reading the data from the\r\n  file-like object `readable`.\r\n\r\n#### Other\r\n- `.with_byte_order(self, byte_order: ByteOrder) -> Serializer`: Return a (possibly new) serializer\r\n  configured to use the `ByteOrder` specified.  The default implementation returns itself unchanged,\r\n  but in most cases this should be overridden with a correct implementation.\r\n- `.__add__(self, other) -> Serializer` (**not required**): The final serializer used for a\r\n  `Structured` class is determined by \"adding\" all of the individual serializers for each attribute\r\n  together.  In most cases the default implementation will suffice.  You can provide your own\r\n  implementation if optimizations can be made (for example, see `StructSerializer`'s implementation).\r\n\r\n\r\n### The \"building\" Serializers\r\nThere are a few basic serializers used for building others:\r\n- `NullSerializer`: This is a serializer that packs and unpacks nothing. This will be the serializer\r\n  used by a `Structured` class if *no* serializable instance variables are detected. 
It is also used\r\n  as the starting value to `sum(...)` when generating the final serializer for a `Structured` class.\r\n- `CompoundSerializer`: This is a generic \"chaining\" serializer. Most serializers don't have an\r\n  easy way to combine their logic, so `CompoundSerializer` handles the logic of calling the packing\r\n  and unpacking methods one after another. This is a common serializer to see as the final\r\n  serializer for a `Structured` class. This is also an interesting example to see how to handle\r\n  variable `.size`, and handling `.preunpack` and `.prepack`.\r\n\r\n\r\n### Specific Serializers\r\nThe rest of the Serializer classes are for handling specific serialization types.  They range from\r\nvery simple to quite complex.\r\n\r\n- `StructSerializer`: For packing/unpacking types which can be directly related to a constant\r\n  `struct` format string.  For example, `uint32` is implemented as\r\n  `Annotated[int, StructSerializer('I')]`.\r\n- `StructActionSerializer`: This is the class used for custom types that are `StructSerializer`-able but\r\n  need a custom callable applied to the result(s) to convert them to their final type.  
It is\r\n  almost identical to `StructSerializer`, but calls an `action` on each value unpacked.\r\n- `TupleSerializer`: A fairly simple serializer that handles the `tuple` type-hints.\r\n- `AUnion`: The base for both union serializers.\r\n- `LookbackDecider`: The union serializer which allows for reading attributes already unpacked on\r\n  the object to make a decision.\r\n- `LookaheadDecider`: The union serializer which unpacks a little data then rewinds, using the\r\n  unpacked value to make a decision.\r\n- `StructuredSerializer`: A fairly simple serializer to handle translating the `Structured` class\r\n  methods into the `Serializer` API.\r\n- `DynamicCharSerializer`: The serializer used to handle `char[uint*]` type-hints.\r\n- `TerminatedCharSerializer`: The serializer used to handle `char[b'\\x00']` type-hints.\r\n- `UnicodeSerializer`: A wrapper around one of the `char[]` serializers to handle encoding on\r\n  packing and decoding on unpacking.\r\n\r\n\r\n### Type detection\r\nThis is a very internal-level detail, but may be required if you write your own `Serializer` class.\r\n\r\nAlmost all of the typehints use `typing.Annotated` to specify the `Serializer` instance to use for\r\na hint. In most cases, it's as simple as creating your serializer, then defining a type using this.\r\nSee all of the \"basic\" types for example.  In some more complicated examples, which are configured\r\nvia the `__class_getitem__` method, these return `Annotated` objects with the correct serializer.\r\n\r\nIn any case, the `Structured` class detects the serializers by inspecting the `Annotated` objects\r\nfor serializers.  To support things like `a: Annotated[int, int8]`, it even recursively looks inside\r\nnested `Annotated` objects. For most of this work, `structured` internally uses a singleton object\r\n`structured.type_checking.annotated` to help extract this information. 
There is a step to perform\r\nextra transformations on these `Annotated` extras, that a new `Serializer` you implement might need\r\nto work.  Check out for example, `TupleSerializer` and `StructuredSerializer` on where that might\r\nbe necessary.\r\n",
    "bugtrack_url": null,
    "license": "BSD 3-Clause",
    "summary": "Annotated classes that pack and unpack C structures.",
    "version": "3.0.0",
    "split_keywords": [],
    "urls": [
        {
            "comment_text": "",
            "digests": {
                "blake2b_256": "c416d2aecb3d9e1b251ed789c2bf559d5bfceed881e808d6e2a4d61794d31298",
                "md5": "768dd19eaf0ab9b1df19ed1c2564d290",
                "sha256": "ae9adc15eb348be16fc7ae6cccccbbd5de67c1d07d43d72b48eef0b957c3f044"
            },
            "downloads": -1,
            "filename": "structured_classes-3.0.0-py3-none-any.whl",
            "has_sig": false,
            "md5_digest": "768dd19eaf0ab9b1df19ed1c2564d290",
            "packagetype": "bdist_wheel",
            "python_version": "py3",
            "requires_python": ">=3.9",
            "size": 44216,
            "upload_time": "2023-01-14T21:43:15",
            "upload_time_iso_8601": "2023-01-14T21:43:15.089749Z",
            "url": "https://files.pythonhosted.org/packages/c4/16/d2aecb3d9e1b251ed789c2bf559d5bfceed881e808d6e2a4d61794d31298/structured_classes-3.0.0-py3-none-any.whl",
            "yanked": false,
            "yanked_reason": null
        },
        {
            "comment_text": "",
            "digests": {
                "blake2b_256": "3298968b53b5a668b76f1d0e1f43ea0bf682ab83aca64002c40e823581f2c1df",
                "md5": "afb192edde6994bd94951ecec82bdd22",
                "sha256": "49aa8d4051d683b8e947d41917ede0ec226926712f481fabcd2343f4d16b080f"
            },
            "downloads": -1,
            "filename": "structured_classes-3.0.0.tar.gz",
            "has_sig": false,
            "md5_digest": "afb192edde6994bd94951ecec82bdd22",
            "packagetype": "sdist",
            "python_version": "source",
            "requires_python": ">=3.9",
            "size": 48104,
            "upload_time": "2023-01-14T21:43:17",
            "upload_time_iso_8601": "2023-01-14T21:43:17.390272Z",
            "url": "https://files.pythonhosted.org/packages/32/98/968b53b5a668b76f1d0e1f43ea0bf682ab83aca64002c40e823581f2c1df/structured_classes-3.0.0.tar.gz",
            "yanked": false,
            "yanked_reason": null
        }
    ],
    "upload_time": "2023-01-14 21:43:17",
    "github": false,
    "gitlab": false,
    "bitbucket": false,
    "lcname": "structured-classes"
}

lojack5