# binarysearchfile
Binary search in a sorted binary file for fast random access
## Usage
Define and use your own binary search file:
```py
from binarysearchfile import BinarySearchFile

class MyBinarySearchFile(BinarySearchFile):
    magic = b'\xfe\xff\x01\x01'  # magic string, you can change 2nd and 4th byte
    headerstart = b'MyBinarySearchFile'  # name of the file format
    record = (50, 50)  # record structure, here two ints, first field can be searched binarily

bsf = MyBinarySearchFile('mybinarysearchfile')
data = [(10, 42), (4, 10), (5, 5)]
bsf.write(data) # write sorted data
print(len(bsf)) # number of records
print(bsf.search(10)) # get index
print(bsf.get(10)) # get record
print(bsf) # print file information
#Output:
#3
#2
#(10, 42)
#MyBinarySearchFile
# fname: mybinarysearchfile
# records: 3
# size: 40.00 Byte
# recsize: 2 Byte (1, 1)
```
The example above defines records consisting of two integers.
The first element of each record (the "key") is the field used for binary search.
The following types are currently available out of the box:
```
0: binary adding whitespace
10: ascii adding whitespace
20: utf-8 adding whitespace
50: int
51: signedint
```
The file can also be read with the base `BinarySearchFile` class:
```py
bsf = BinarySearchFile('mybinarysearchfile')
print(bsf.get(10))
#Output:
#(10, 42)
```
The file format is specified in the module's docstring.
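
To illustrate why fixed-length records make this kind of lookup cheap, here is a minimal standard-library sketch of a binary search over sorted, fixed-size records. It is not the library's implementation and does not reproduce its on-disk format (there is no magic string or header here); the `RECORD` layout and the helper names `write_records` and `search` are assumptions made only for this example.

```python
import io
import struct

# Hypothetical layout for this sketch only: two unsigned 32-bit big-endian ints.
RECORD = struct.Struct('>II')


def write_records(stream, records):
    """Write records sorted by key (first field) as fixed-length entries."""
    for rec in sorted(records):
        stream.write(RECORD.pack(*rec))


def search(stream, key, start=0):
    """Return the index of the record whose first field equals key, or None."""
    stream.seek(0, io.SEEK_END)
    lo, hi = 0, (stream.tell() - start) // RECORD.size
    while lo < hi:
        mid = (lo + hi) // 2
        # Fixed record size: record i always starts at start + i * RECORD.size.
        stream.seek(start + mid * RECORD.size)
        rec = RECORD.unpack(stream.read(RECORD.size))
        if rec[0] < key:
            lo = mid + 1
        elif rec[0] > key:
            hi = mid
        else:
            return mid
    return None


buf = io.BytesIO()
write_records(buf, [(10, 42), (4, 10), (5, 5)])
print(search(buf, 10))  # index of key 10 among the sorted records
print(search(buf, 7))   # key not present
```

Because every record has the same size, each probe is a single seek and a single small read, so a lookup touches only about log2(n) records regardless of file size.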
### Defining your own data types
Use the following approach to define additional custom types with the `DTypeDef` class.
Its `__init__` method takes the arguments `name`, `len`, `encode` and `decode`.
`len`, the byte length of an object, is usually a function of the object, but can be an integer for fixed-length types.
Register custom types only under keys greater than 99.
```py
from binarysearchfile import BinarySearchFile, DTypeDef

class MyBinarySearchFile(BinarySearchFile):
    DTYPE = BinarySearchFile.DTYPE.copy()
    DTYPE[100] = DTypeDef(
        'fixedlenint', 5,
        # explicit byte order; the argument is optional only on Python >= 3.11
        encode=lambda v, s: v.to_bytes(s, 'big'),
        decode=lambda v: int.from_bytes(v, 'big')
    )
    # definitions of other class properties follow
```
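
The encode/decode pair of the `fixedlenint` example can be tried on its own, without the library. The sketch below is only an illustration: the names `encode`, `decode` and `SIZE` are hypothetical, and the big-endian byte order is an explicit choice made here. It shows the round trip and one property of this encoding: for non-negative integers of fixed length, big-endian bytes sort in the same order as the numbers they represent. Whether the library relies on that property is not stated in this README.

```python
SIZE = 5  # fixed byte length, matching the 'fixedlenint' example


def encode(value, size=SIZE):
    """Encode a non-negative int as exactly `size` big-endian bytes."""
    return value.to_bytes(size, 'big')


def decode(raw):
    """Decode big-endian bytes back into an int."""
    return int.from_bytes(raw, 'big')


raw = encode(1000)
print(len(raw), decode(raw))

# Byte-wise ordering matches numeric ordering for this encoding.
values = [70000, 3, 1000, 255, 256]
by_bytes = [decode(b) for b in sorted(encode(v) for v in values)]
print(by_bytes == sorted(values))

# Values that do not fit into SIZE bytes are rejected.
try:
    encode(2 ** 40)
    overflowed = False
except OverflowError:
    overflowed = True
print(overflowed)
```

A fixed length of 5 bytes covers integers from 0 up to 2**40 - 1; negative values would need a signed encoding instead.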
### Use binary sequential file
We provide a `BinarySequentialFile` class that uses the same file layout and can be used for sequential reading and writing.
```py
from binarysearchfile import BinarySequentialFile

with BinarySequentialFile('mybinarysearchfile') as bseqf:
    print(bseqf[2])
#Output:
#(10, 42)
```
Raw data
{
"_id": null,
"home_page": null,
"name": "binarysearchfile",
"maintainer": null,
"docs_url": null,
"requires_python": null,
"maintainer_email": null,
"keywords": "binary search, random access, direct access",
"author": "Tom Eulenfeld",
"author_email": null,
"download_url": "https://files.pythonhosted.org/packages/62/ea/9e8b2b6fb83daaad42c9ff9a26aa18b973bbd8806886b4cf08cc704a8f54/binarysearchfile-0.2.0.tar.gz",
"platform": null,
"bugtrack_url": null,
"license": "MIT License",
"summary": "Binary search binary file for fast random access",
"version": "0.2.0",
"project_urls": {
"Bug Tracker": "https://github.com/trichter/binarysearchfile/issues",
"Homepage": "https://github.com/trichter/binarysearchfile"
},
"split_keywords": [
"binary search",
" random access",
" direct access"
],
"urls": [
{
"comment_text": "",
"digests": {
"blake2b_256": "28dd72f96fb8e6a8f0a23c15f3873e943397701a1ebd8950e1d5cd34e596e8d5",
"md5": "af03a2640f478df5eb967c03ef48b89e",
"sha256": "08280c7a7cf7a0a32de25511451f6db9d1a171de08d5bbdbc2e2000283276b5b"
},
"downloads": -1,
"filename": "binarysearchfile-0.2.0-py3-none-any.whl",
"has_sig": false,
"md5_digest": "af03a2640f478df5eb967c03ef48b89e",
"packagetype": "bdist_wheel",
"python_version": "py3",
"requires_python": null,
"size": 7216,
"upload_time": "2024-05-27T13:58:55",
"upload_time_iso_8601": "2024-05-27T13:58:55.350291Z",
"url": "https://files.pythonhosted.org/packages/28/dd/72f96fb8e6a8f0a23c15f3873e943397701a1ebd8950e1d5cd34e596e8d5/binarysearchfile-0.2.0-py3-none-any.whl",
"yanked": false,
"yanked_reason": null
},
{
"comment_text": "",
"digests": {
"blake2b_256": "62ea9e8b2b6fb83daaad42c9ff9a26aa18b973bbd8806886b4cf08cc704a8f54",
"md5": "22c875547abd8bbdfecb66868493a9db",
"sha256": "04cbaefc12a04482b5fde3000a9975f37df00a8ceda42b64a13a518e91a5147b"
},
"downloads": -1,
"filename": "binarysearchfile-0.2.0.tar.gz",
"has_sig": false,
"md5_digest": "22c875547abd8bbdfecb66868493a9db",
"packagetype": "sdist",
"python_version": "source",
"requires_python": null,
"size": 7069,
"upload_time": "2024-05-27T13:58:56",
"upload_time_iso_8601": "2024-05-27T13:58:56.728651Z",
"url": "https://files.pythonhosted.org/packages/62/ea/9e8b2b6fb83daaad42c9ff9a26aa18b973bbd8806886b4cf08cc704a8f54/binarysearchfile-0.2.0.tar.gz",
"yanked": false,
"yanked_reason": null
}
],
"upload_time": "2024-05-27 13:58:56",
"github": true,
"gitlab": false,
"bitbucket": false,
"codeberg": false,
"github_user": "trichter",
"github_project": "binarysearchfile",
"travis_ci": false,
"coveralls": false,
"github_actions": true,
"lcname": "binarysearchfile"
}