> :warning: **This project is pre-alpha and there are no guarantees of API stability. The documentation is sometimes more aspirational than accurate.**
[![bitformat](https://raw.githubusercontent.com/scott-griffiths/bitformat/main/doc/bitformat_logo_small.png)](https://github.com/scott-griffiths/bitformat)
[![CI badge](https://github.com/scott-griffiths/bitformat/actions/workflows/.github/workflows/ci.yml/badge.svg)](https://github.com/scott-griffiths/bitformat/actions/workflows/ci.yml)
[![Docs](https://img.shields.io/readthedocs/bitformat?logo=readthedocs&logoColor=white)](https://bitformat.readthedocs.io/en/latest/)
<!--
[![Dependents (via libraries.io)](https://img.shields.io/librariesio/dependents/pypi/bitformat?logo=libraries.io&logoColor=white)](https://libraries.io/pypi/bitformat)
[![Codacy Badge](https://img.shields.io/codacy/grade/b61ae16cc6404d0da5dbcc21ee19ddda?logo=codacy)](https://app.codacy.com/gh/scott-griffiths/bitformat/dashboard?utm_source=gh&utm_medium=referral&utm_content=&utm_campaign=Badge_grade)
[![Pepy Total Downloads](https://img.shields.io/pepy/dt/bitformat?logo=python&logoColor=white&labelColor=blue&color=blue)](https://www.pepy.tech/projects/bitformat)
[![PyPI - Downloads](https://img.shields.io/pypi/dm/bitformat?label=%40&labelColor=blue&color=blue)](https://pypistats.org/packages/bitformat)
-->
---------
**bitformat** is a Python module for creating, manipulating and interpreting binary data.
It also supports parsing and creating more complex binary formats.
It is from the author of the widely used [**bitstring**](https://github.com/scott-griffiths/bitstring) module.

----
## Features :hammer_and_wrench:
* The `Bits` class represents a sequence of binary data of arbitrary length. It provides methods for creating, modifying and interpreting the data.
* The `Format` class provides a way to define a binary format using a simple and flexible syntax.
* A wide array of data types is supported with no arbitrary restrictions on length.
* Data is always stored efficiently as a contiguous array of bits.
> [!NOTE]
> To see what has been added, improved or fixed, and what's coming in the next version, see the [release notes](https://github.com/scott-griffiths/bitformat/blob/main/release_notes.md).
## Documentation :book:
* [The bitformat documentation](https://bitformat.readthedocs.io/en/latest/) includes a full reference for the library.
* [A Tour of bitformat](https://nbviewer.org/github/scott-griffiths/bitformat/blob/main/doc/bitformat_tour.ipynb) is a notebook which gives a quick introduction to the library and some worked examples.
## Some Examples :bulb:
### Creating some Bits
A variety of constructor methods are available to create `Bits`, including from binary, hexadecimal or octal strings, formatted strings, byte literals and iterables.
```pycon
>>> from bitformat import *
>>> a = Bits('0b1010') # Create from a binary string
>>> b = Bits('u12 = 54') # Create from a formatted string.
>>> c = Bits.from_bytes(b'\x01\x02\x03') # Create from a bytes or bytearray object.
>>> d = Bits.pack('f16', -0.75) # Pack a value into a data type.
>>> e = Bits.join([a, b, c, d]) # The best way to join lots of bits together.
```
### Interpreting those Bits
Although the examples above were created from a variety of data types, the `Bits` instance doesn't retain any knowledge of how it was created - it's just a sequence of bits.
You can therefore interpret them however you'd like:
```pycon
>>> a.i
-6
>>> b.hex
'036'
>>> c.unpack(['u4', 'f16', 'u4'])
[0, 0.0005035400390625, 3]
>>> d.bytes
b'\xba\x00'
```
The `unpack` method is available as a general-case way to unpack the bits into a single or multiple data types.
If you only want to unpack to a single data type you can use properties of the `Bits` as a short-cut.
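
As a cross-check of the values above, here is a minimal sketch that reproduces the same interpretations using only the standard library (`struct` and `int`), not `bitformat` itself. The helper `to_signed` and the variable names are made up for this illustration, and it assumes the big-endian, most-significant-bit-first layout that the outputs above imply.

```python
import struct

raw = bytes([0x01, 0x02, 0x03])        # the same bytes as `c` above
as_int = int.from_bytes(raw, "big")    # view the 24 bits as one integer

# Split the 24 bits, most significant first, into 4 + 16 + 4 bits.
first_u4 = as_int >> 20
middle16 = (as_int >> 4) & 0xFFFF
last_u4 = as_int & 0xF

# Reinterpret the middle 16 bits as a big-endian IEEE half-precision float.
(middle_f16,) = struct.unpack(">e", middle16.to_bytes(2, "big"))


def to_signed(value: int, width: int) -> int:
    """Interpret an unsigned `width`-bit value as two's complement."""
    return value - (1 << width) if value & (1 << (width - 1)) else value


a_as_int = to_signed(0b1010, 4)        # the 4 bits of `a` read as signed
d_bytes = struct.pack(">e", -0.75)     # the 16 bits of `d` as bytes

print(first_u4, middle_f16, last_u4)
print(a_as_int, d_bytes)
```

The point is the same one the text makes: the bits carry no type, so `0b1010` is 10 when read as unsigned and -6 when read as signed.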
### Data types
A wide range of data types are supported. These are essentially descriptions of how binary data can be converted to a useful value. The `Dtype` class is used to define these, but usually just the string representation can be used.
Some example data type strings are:
* `'u3'` - a 3 bit unsigned integer.
* `'i_le32'` - a 32 bit little-endian signed integer.
* `'f64'` - a 64 bit IEEE float. Lengths of 16, 32 and 64 are supported.
* `'bool'` - a single bit boolean value.
* `'bytes10'` - a 10 byte sequence.
* `'hex'` - a hexadecimal string.
* `'bin'` - a binary string.
* `'[u8; 40]'` - an array of 40 unsigned 8 bit integers.
Byte endianness for floating point and integer data types is specified with `_le`, `_be` and `_ne` suffixes to the base type.
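
To make the effect of byte endianness concrete, the sketch below uses the standard library's `struct` module rather than `bitformat`. Treating `struct`'s `<`, `>` and `=` prefixes as analogues of the `_le`, `_be` and `_ne` suffixes is an assumption made for illustration only; the example value is arbitrary.

```python
import struct
import sys

value = -2  # an arbitrary signed value for illustration

little = struct.pack("<i", value)  # 32-bit signed, little-endian
big = struct.pack(">i", value)     # 32-bit signed, big-endian
native = struct.pack("=i", value)  # 32-bit signed, this machine's byte order

# The two explicit orders hold the same bytes in reverse.
print(little.hex(), big.hex())

# Native order matches whichever order the host uses.
native_matches_host = native == (little if sys.byteorder == "little" else big)

# Reading the bytes back requires knowing which order was used.
round_trip = int.from_bytes(little, "little", signed=True)
wrong_order = int.from_bytes(little, "big", signed=True)
```

Reading little-endian bytes as big-endian gives a different number (`wrong_order` here), which is why the byte order has to be part of the data type.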
### Bit operations
An extensive set of operations is available to query `Bits` or to create new ones. For example:
```pycon
>>> a + b # Concatenation
Bits('0xa036')
>>> c.find('0b11') # Returns found bit position
22
>>> b.replace('0b1', '0xfe')
Bits('0x03fbf9fdfc')
>>> b[0:10] | d[2:12] # Slicing and logical operators
Bits('0b1110101101')
```
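
The results above can be checked by hand with plain Python strings of `'0'` and `'1'` characters. This is only a sketch to show what each operation does to the bits; it says nothing about how `bitformat` stores or processes data, and the helper `to_hex` is hypothetical.

```python
a = "1010"                     # the bits of Bits('0b1010')
b = format(54, "012b")         # the bits of Bits('u12 = 54')
c = "".join(format(byte, "08b") for byte in b"\x01\x02\x03")
d = "1011101000000000"         # 0xba00, the f16 encoding of -0.75


def to_hex(bits: str) -> str:
    """Render a bit string whose length is a multiple of 4 as hex."""
    return format(int(bits, 2), f"0{len(bits) // 4}x")


concatenated = a + b                             # concatenation
found_at = c.find("11")                          # first position of the pattern
replaced = b.replace("1", format(0xFE, "08b"))   # every 1 bit becomes 0xfe
ored = format(int(b[0:10], 2) | int(d[2:12], 2), "010b")  # slice, then OR

print(to_hex(concatenated), found_at, to_hex(replaced), ored)
```

Note that the replacement grows the data from 12 bits to 40, since each of the four set bits in `b` is swapped for eight bits.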
### Arrays
An `Array` class is provided which stores a contiguous sequence of `Bits` of the same data type.
This is similar to the `array` type in the standard module of the same name, but it's not restricted to just a dozen or so types.
```pycon
>>> r = Array('i5', [4, -3, 0, 1, -5, 15]) # An array of 5 bit signed ints
>>> r -= 2 # Operates on each element
>>> r.unpack()
[2, -5, -2, -1, -7, 13]
>>> r.dtype = 'u6' # You can freely change the data type
>>> r
Array('u6', [5, 47, 55, 60, 45])
>>> r.to_bits()
Bits('0b000101101111110111111100101101')
```
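
Changing the data type does not touch the underlying bits; it only changes where the element boundaries fall. The sketch below shows this with plain Python and bit strings. The functions `pack_signed` and `unpack_unsigned` are hypothetical helpers written for this illustration, not part of `bitformat`.

```python
def pack_signed(values, width):
    """Concatenate two's-complement integers into one bit string."""
    mask = (1 << width) - 1
    return "".join(format(v & mask, f"0{width}b") for v in values)


def unpack_unsigned(bits, width):
    """Split a bit string into unsigned integers.

    Assumes len(bits) is a multiple of width.
    """
    return [int(bits[i:i + width], 2) for i in range(0, len(bits), width)]


values = [v - 2 for v in [4, -3, 0, 1, -5, 15]]  # the array after `r -= 2`
bits = pack_signed(values, 5)                    # six 5-bit elements, 30 bits
as_u6 = unpack_unsigned(bits, 6)                 # the same 30 bits as five u6

print(bits)
print(as_u6)
```

Six 5-bit elements and five 6-bit elements both occupy exactly 30 bits, which is why the reinterpretation in the example leaves nothing over.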
### A `Format` example
The `Format` class can be used to give structure to bits, as well as storing the data in a human-readable form.
```pycon
>>> f = Format('[width: u12, height: u12, flags: [bool; 4]]')
>>> f.pack([320, 240, [True, False, True, False]])
Bits('0x1400f0a')
>>> print(f)
[
    width: u12 = 320,
    height: u12 = 240,
    flags: [bool; 4] = (True, False, True, False)
]
>>> f['height'].value //= 2
>>> f.to_bits()
Bits('0x140078a')
>>> f.to_bits() == 'u12=320, u12=120, 0b1010'
True
```
The `Format` and its fields can optionally have names (the `Format` above is unnamed, but its fields are named).
In this example the `pack` method was used with appropriate values, which then returned a `Bits` object.
The `Format` now contains all the interpreted values, which can be easily accessed and modified.
The final line in the example above demonstrates how new `Bits` objects can be created when needed by promoting other types. In this case the formatted string is promoted to a `Bits` object before the comparison is made.
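
To see where the hex values in the example come from, here is a sketch of the same 28-bit layout packed by hand with integer shifts. It uses no `bitformat` calls; `pack_header` is a hypothetical function, and the field order and most-significant-bit-first layout are taken from the outputs shown above.

```python
def pack_header(width: int, height: int, flags) -> int:
    """Pack width (12 bits), height (12 bits) and four flags into 28 bits."""
    if not (0 <= width < 4096 and 0 <= height < 4096):
        raise ValueError("width and height must each fit in 12 bits")
    if len(flags) != 4:
        raise ValueError("exactly four flags are required")
    # The first flag lands in the most significant of the four flag bits.
    flag_bits = sum(int(bool(flag)) << (3 - i) for i, flag in enumerate(flags))
    return (width << 16) | (height << 4) | flag_bits


original = pack_header(320, 240, (True, False, True, False))
halved = pack_header(320, 120, (True, False, True, False))

print(format(original, "07x"), format(halved, "07x"))
```

Only the middle 12 bits change when the height is halved, which matches the change from `0f0` to `078` in the hex output.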
The `Format` can be used symmetrically to both create and parse binary data:
```pycon
>>> f.parse(b'x\x048\x10')
28
>>> f
Format([
    'width: u12 = 1920',
    'height: u12 = 1080',
    'flags: [bool; 4] = (False, False, False, True)'
])
```
The `parse` method is able to lazily parse the input bytes, and simply returns the number of bits that were consumed. The actual values of the individual fields aren't calculated until they are needed, which allows large and complex file formats to be efficiently dealt with.
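
For comparison, the sketch below parses the same four bytes by hand with the standard library. It is eager rather than lazy and is not how `bitformat` works internally; `parse_header` is a hypothetical function that only shows how the 28 consumed bits map onto the three fields.

```python
def parse_header(data: bytes) -> dict:
    """Parse width (u12), height (u12) and four flag bits, MSB first."""
    if len(data) < 4:
        raise ValueError("need at least 4 bytes to read 28 bits")
    # Keep the leading 28 bits of the first four bytes; drop the last 4.
    word = int.from_bytes(data[:4], "big") >> 4
    flag_bits = word & 0xF
    return {
        "width": word >> 16,
        "height": (word >> 4) & 0xFFF,
        "flags": tuple(bool((flag_bits >> s) & 1) for s in (3, 2, 1, 0)),
    }


header = parse_header(b"x\x048\x10")
print(header)
```

The bytes `b'x\x048\x10'` are `0x78 0x04 0x38 0x10`, so the first 12 bits are `0x780` (1920), the next 12 are `0x438` (1080), and the following four bits are `0001`.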
## More to come :construction:
The `bitformat` library is still pre-alpha and is being actively developed.
I'm hoping to make an alpha release or two in late 2024, with more features added in 2025.
There are a number of important features planned, some of which are from the `bitstring` library on which much of the core is based, and others are needed for a full binary format experience.
The (unordered) :todo: list includes:
* **Streaming methods.** There is no concept of a bit position, or of reading through a `Bits`. This is available in `bitstring`, but I want to find a better way of doing it before adding it to `bitformat`.
* **Field expressions.** Rather than hard coding everything in a field, some parts will be calculated during the parsing process. For example in the format `'[w: u16, h: u16, [u8; {w * h}]]'` the size of the `'u8'` array would depend on the values parsed just before it.
* **New field types.** Fields like `Repeat`, `Find` and `If` are planned which will allow more flexible formats to be written.
* **Exotic floating point types.** In `bitstring` there are a number of extra floating point types such as `bfloat` and the MXFP 8, 6 and 4-bit variants. These will be ported over to `bitformat`.
* **Performance improvements.** A primary focus of the design of `bitformat` is that it should be fast. Early versions won't be well optimized, but tests so far are quite promising, and the design philosophy should mean that it can be made even more performant later.
* **LSB0.** Currently all bit positions are numbered with the most significant bit as bit zero (MSB0). I plan to add support for least significant bit zero (LSB0) bit numbering as well.
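
To illustrate what the planned field expressions are meant to express, here is a hand-written equivalent of the `'[w: u16, h: u16, [u8; {w * h}]]'` format using the standard library's `struct` module. This is a sketch only: the feature does not exist in `bitformat` yet, `parse_image` is a hypothetical function, and big-endian byte order is assumed.

```python
import struct


def parse_image(data: bytes):
    """Parse w: u16, h: u16, then w * h unsigned bytes (big-endian)."""
    w, h = struct.unpack_from(">HH", data, 0)
    count = w * h  # the array length depends on the two fields just parsed
    pixels = list(data[4:4 + count])
    if len(pixels) != count:
        raise ValueError("not enough data for the pixel array")
    return w, h, pixels


sample = struct.pack(">HH", 3, 2) + bytes(range(6))
w, h, pixels = parse_image(sample)
print(w, h, pixels)
```

The array length cannot be known until `w` and `h` have been read, which is the dependency a field expression would capture inside the format string.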
<sub>Copyright (c) 2024 Scott Griffiths</sub>