Name | read-protobuf |
Version | 0.2.0 |
home_page | |
Summary | Small library to read serialized protobuf(s) directly into Pandas DataFrame |
upload_time | 2023-05-23 11:27:58 |
maintainer | |
docs_url | None |
author | Marc Shapiro |
requires_python | >=3.7 |
license | MIT |
keywords | pandas, protobuf |
VCS | |
bugtrack_url | |
requirements | No requirements were recorded. |
Travis-CI | No Travis. |
coveralls test coverage | No coveralls. |
# read-protobuf
Small library to read serialized protobuf(s) directly into Pandas DataFrame.
This is intended to be a simple shortcut for translating serialized
protobuf bytes / files directly to a dataframe.
## Install
Available via pip:
```bash
$ pip install read-protobuf
```
## Usage
Run the [demo-notebook](tests/demo.ipynb) for an interactive demo.
```python
import demo_pb2 # compiled protobuf message module
from read_protobuf import read_protobuf
MessageType = demo_pb2.MessageType() # instantiate a new message type
df = read_protobuf(b'\x00\x00', MessageType) # create a dataframe from serialized protobuf bytes
df = read_protobuf([b'\x00\x00', b'\x00\x00'], MessageType) # read multiple protobuf bytes
df = read_protobuf('demo.pb', MessageType) # use file instead of bytes
df = read_protobuf(['demo.pb', 'demo2.pb'], MessageType) # read multiple files
# options
df = read_protobuf('demo.pb', MessageType, flatten=False) # don't flatten pb messages
df = read_protobuf('demo.pb', MessageType, prefix_nested=True) # prefix nested messages with parent keys (like pandas.json_normalize)
```
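To illustrate what the `flatten` and `prefix_nested` options describe, here is a minimal sketch on plain dictionaries. It is not the library's implementation: `flatten_record` and its `sep` parameter are hypothetical names, and the sketch assumes that flattening lifts nested fields to top-level columns and that prefixing joins parent and child keys with a separator, as `pandas.json_normalize` does.

```python
def flatten_record(record, prefix_nested=False, sep=".", _parent=""):
    """Flatten a nested dict into a single-level dict (illustrative only)."""
    flat = {}
    for key, value in record.items():
        # Only prepend the parent key when prefixing is requested.
        name = f"{_parent}{sep}{key}" if (prefix_nested and _parent) else key
        if isinstance(value, dict):
            # Recurse into nested "messages" and merge their fields.
            flat.update(flatten_record(value, prefix_nested, sep, name))
        else:
            flat[name] = value
    return flat


# A stand-in for a decoded message with one nested sub-message.
record = {"id": 1, "position": {"x": 2.0, "y": 3.0}}

print(flatten_record(record))
print(flatten_record(record, prefix_nested=True))
```

Without prefixing, nested keys can collide with top-level keys of the same name; prefixing avoids this at the cost of longer column names.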
To compile a `.proto` file into a Python module containing the protobuf Message classes, use:
```bash
$ protoc --python_out="." demo.proto
```
## Alternatives
#### protobuf-to-dict
https://github.com/benhodgson/protobuf-to-dict
This library was developed earlier to convert protobufs to JSON via a dict.
#### MessageToDict, MessageToJson
The Google protobuf library comes with utilities to convert messages to a `dict` or JSON,
which can then be loaded by Pandas.
```python
from google.protobuf.json_format import MessageToJson
from google.protobuf.json_format import MessageToDict
```
In brief tests, the `read_protobuf` package is about 2x as fast
as using `MessageToDict` and 3x as fast as `MessageToJson`.
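For comparison, the `MessageToDict` route can be sketched as follows. This example assumes `protobuf` and `pandas` are installed, and it uses `FieldDescriptorProto` only as a stand-in message type because it ships with the protobuf package; in practice you would use your own compiled message class (such as `demo_pb2.MessageType`).

```python
import pandas as pd
from google.protobuf import descriptor_pb2
from google.protobuf.json_format import MessageToDict

# Build two serialized messages to stand in for protobuf bytes read from disk.
payloads = [
    descriptor_pb2.FieldDescriptorProto(name="id", number=1).SerializeToString(),
    descriptor_pb2.FieldDescriptorProto(name="label", number=2).SerializeToString(),
]

rows = []
for payload in payloads:
    message = descriptor_pb2.FieldDescriptorProto()
    message.ParseFromString(payload)      # decode the bytes into a message
    rows.append(MessageToDict(message))   # convert the message to a plain dict

# Each dict becomes one row; only fields that were set appear as columns.
df = pd.DataFrame(rows)
print(df)
```

Each message goes through an intermediate `dict` here before reaching the DataFrame, which is the extra step `read_protobuf` is meant to shortcut.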
## Develop
To install a development version of the package, run from the root directory:
```bash
$ pip install -e .
```
To install development dependencies, use the optional `[dev]` extra:
```bash
$ pip install -e ".[dev]"
```
## Format
Uses `black` and `isort` to format files.
```bash
$ make black
$ make isort
```
## Lint
Uses `ruff` to lint the application.
```bash
$ make ruff
```
## Test
Uses `pytest` to run unit tests. From the root of the repository, run:
```bash
$ make pytest
# specify test
$ pytest -k "TestRead::test_read_bytes"
```
## Code Coverage
Use `coverage` to monitor code coverage during tests.
To record coverage while running tests, run:
```bash
$ make pytest-cov
```
## License
[MIT License](LICENSE)
Raw data
{
"_id": null,
"home_page": "",
"name": "read-protobuf",
"maintainer": "",
"docs_url": null,
"requires_python": ">=3.7",
"maintainer_email": "",
"keywords": "pandas,protobuf",
"author": "Marc Shapiro",
"author_email": "",
"download_url": "https://files.pythonhosted.org/packages/6d/02/15b98644924d4d2d7b637ba6b6d8ca033d465415f66e00e157033caf4b4a/read-protobuf-0.2.0.tar.gz",
"platform": null,
"bugtrack_url": null,
"license": "MIT",
"summary": "Small library to read serialized protobuf(s) directly into Pandas DataFrame",
"version": "0.2.0",
"project_urls": {
"Issues": "https://github.com/mlshapiro/read-protobuf/issues",
"Repository": "https://github.com/mlshapiro/read-protobuf.git"
},
"split_keywords": [
"pandas",
"protobuf"
],
"urls": [
{
"comment_text": "",
"digests": {
"blake2b_256": "9d1b46c10517f5fe91298be9f4bc2f185a87deb691b7f3b10401c55aa85772c5",
"md5": "980be6975bcdd9e0a31d42b48c767fb8",
"sha256": "643ff9dfc4185f7e5f89d4447c6e452e5d86cf95051ab3f95cd5c39aec7a3d79"
},
"downloads": -1,
"filename": "read_protobuf-0.2.0-py3-none-any.whl",
"has_sig": false,
"md5_digest": "980be6975bcdd9e0a31d42b48c767fb8",
"packagetype": "bdist_wheel",
"python_version": "py3",
"requires_python": ">=3.7",
"size": 4728,
"upload_time": "2023-05-23T11:27:55",
"upload_time_iso_8601": "2023-05-23T11:27:55.537221Z",
"url": "https://files.pythonhosted.org/packages/9d/1b/46c10517f5fe91298be9f4bc2f185a87deb691b7f3b10401c55aa85772c5/read_protobuf-0.2.0-py3-none-any.whl",
"yanked": false,
"yanked_reason": null
},
{
"comment_text": "",
"digests": {
"blake2b_256": "6d0215b98644924d4d2d7b637ba6b6d8ca033d465415f66e00e157033caf4b4a",
"md5": "cc41c5ad56efc964ae421384a16f31d6",
"sha256": "42755793bc107317bca4400851056f7e1c73376a8e90d748c2ed8bbc9c6372c2"
},
"downloads": -1,
"filename": "read-protobuf-0.2.0.tar.gz",
"has_sig": false,
"md5_digest": "cc41c5ad56efc964ae421384a16f31d6",
"packagetype": "sdist",
"python_version": "source",
"requires_python": ">=3.7",
"size": 5605,
"upload_time": "2023-05-23T11:27:58",
"upload_time_iso_8601": "2023-05-23T11:27:58.044845Z",
"url": "https://files.pythonhosted.org/packages/6d/02/15b98644924d4d2d7b637ba6b6d8ca033d465415f66e00e157033caf4b4a/read-protobuf-0.2.0.tar.gz",
"yanked": false,
"yanked_reason": null
}
],
"upload_time": "2023-05-23 11:27:58",
"github": true,
"gitlab": false,
"bitbucket": false,
"codeberg": false,
"github_user": "mlshapiro",
"github_project": "read-protobuf",
"travis_ci": false,
"coveralls": false,
"github_actions": false,
"lcname": "read-protobuf"
}