cprotobuf


:Name: cprotobuf
:Version: 0.1.11
:Home page: https://github.com/yihuang/cprotobuf
:Summary: Pythonic and high performance protocol buffer implementation.
:Upload time: 2022-10-17 01:51:10
:Author: huangyi

A minimal, fast protobuf implementation written with Cython.
The benchmark below shows it encoding roughly five times faster, and decoding slightly faster, than Google's official experimental cpp-python implementation.

I've been using it in production since 2013. It has only been tested with Python 2.7; feedback on other Python releases is welcome.

Benchmark
=========

.. code-block:: bash

  $ ./setup.py build_ext --inplace
  $ cd benchmark
  $ ./bench.sh
  encode[google official pure python]:
  10 loops, best of 3: 68.8 msec per loop
  encode[google official cpp python]:
  100 loops, best of 3: 19.4 msec per loop
  encode[py-protobuf][cprotobuf]:
  100 loops, best of 3: 3.58 msec per loop
  decode[google official pure python]:
  10 loops, best of 3: 47.5 msec per loop
  decode[google official cpp python]:
  100 loops, best of 3: 4.55 msec per loop
  decode[py-protobuf][cprotobuf]:
  100 loops, best of 3: 3.98 msec per loop
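
To make the comparison easier to read, the timings above can be turned into speed-up factors. The following is only a small arithmetic sketch (written for Python 3); the dictionary layout and the ``speedups`` helper are made up for illustration and are not part of cprotobuf or its benchmark scripts.

.. code-block:: python

    # Timings in milliseconds per loop, copied from the benchmark output above.
    timings_ms = {
        'encode': {'pure python': 68.8, 'cpp python': 19.4, 'cprotobuf': 3.58},
        'decode': {'pure python': 47.5, 'cpp python': 4.55, 'cprotobuf': 3.98},
    }

    def speedups(results, baseline='cprotobuf'):
        """Return how many times slower each implementation is than the baseline."""
        ratios = {}
        for operation, by_impl in results.items():
            base = by_impl[baseline]
            ratios[operation] = {
                impl: round(ms / base, 1)
                for impl, ms in by_impl.items()
                if impl != baseline
            }
        return ratios

    ratios = speedups(timings_ms)
    for operation, by_impl in ratios.items():
        for impl, factor in by_impl.items():
            print('%s: cprotobuf is %.1fx faster than %s' % (operation, factor, impl))

With these figures, the gap to the cpp implementation is about 5.4x for encoding but only about 1.1x for decoding.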

Tutorial
========

Use plugin
----------

You write a ``person.proto`` file like this:

.. code-block:: protobuf

    package foo;

    message Person {
      required int32 id = 1;
      required string name = 2;
      optional string email = 3;
    }

And a ``people.proto`` file like this:

.. code-block:: protobuf

    package foo;
    import "person.proto";

    message People {
      repeated Person people = 1;
    }

Then compile them with the provided plugin:

.. code-block:: bash

    $ protoc --cprotobuf_out=. person.proto people.proto

If you have trouble running a protobuf plugin, for example on Windows, you can run ``protoc-gen-cprotobuf`` directly like this:

.. code-block:: bash

    $ protoc -ofoo.pb person.proto people.proto
    $ protoc-gen-cprotobuf foo.pb -d .

You then get a Python module ``foo_pb.py``. cprotobuf generates one Python module per package rather than one per protocol file.

The generated code is quite readable:

.. code-block:: python

    # coding: utf-8
    from cprotobuf import ProtoEntity, Field
    # file: person.proto
    class Person(ProtoEntity):
        id              = Field('int32',	1)
        name            = Field('string',	2)
        email           = Field('string',	3, required=False)

    # file: people.proto
    class People(ProtoEntity):
        people          = Field(Person,	1, repeated=True)

In fact, if you only use Python, you can write this module by hand and skip code generation.

The API
-------

Now that you have this lovely Python module, how do you parse and serialize messages?

When designing this package, we tried to minimise the migration effort, so the API names are kept close to those of the official protocol buffer implementation.

.. note::
    
    Since there is no need to reuse a message instance and call ``Clear`` on it in Python, no ``Clear`` API is provided.
    As a result, ``ParseFromString`` behaves more like ``MergeFromString`` in the official implementation, because it does not call ``Clear`` first.
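
The practical consequence of merge-style parsing can be shown with a toy model. This is only a sketch using plain dictionaries (Python 3); ``merge_fields`` is a hypothetical helper and does not reflect how cprotobuf is implemented. It assumes the usual protobuf merge rules: repeated fields accumulate and scalar fields are overwritten.

.. code-block:: python

    # Toy model only: this helper is hypothetical and not part of cprotobuf.
    def merge_fields(target, decoded):
        """Merge decoded fields into target the way a merge-style parse would."""
        for name, value in decoded.items():
            if isinstance(value, list):
                # Repeated fields accumulate across parses.
                target.setdefault(name, []).extend(value)
            else:
                # Scalar fields are overwritten by the latest value.
                target[name] = value
        return target

    # Reusing one container: nothing is cleared between the two "parses".
    state = {}
    merge_fields(state, {'people': [{'id': 1, 'name': 'jim'}]})
    merge_fields(state, {'people': [{'id': 2, 'name': 'jake'}]})

    # Starting from a fresh container gives plain "parse" semantics.
    fresh = merge_fields({}, {'people': [{'id': 2, 'name': 'jake'}]})

If the same caveat applies to real messages, the safe habit is to create a new message instance for each payload rather than parsing into one that already holds data.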

encode/decode
~~~~~~~~~~~~~

.. code-block:: python

    >>> from foo_pb import Person, People
    >>> msg = People()
    >>> msg.people.add(
    ...    id = 1,
    ...    name = 'jim',
    ...    email = 'jim@gmail.com',
    ... )
    >>> s = msg.SerializeToString()
    >>> msg2 = People()
    >>> msg2.ParseFromString(s)
    >>> len(msg2.people)
    1
    >>> msg2.people[0].name
    'jim'
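
For readers curious about what the serialized string contains, the bytes for this message can be assembled by hand from the protobuf wire format. The sketch below (Python 3) is independent of cprotobuf: the helper names are invented for illustration, it only handles keys, lengths, and values that fit in one byte, and it assumes fields are written in field-number order, which the wire format does not strictly require.

.. code-block:: python

    def tag(field_number, wire_type):
        """Field key byte: field number in the high bits, wire type in the low three."""
        key = (field_number << 3) | wire_type
        assert key < 0x80, 'this sketch only handles single-byte keys'
        return bytes([key])

    def length_delimited(field_number, payload):
        """Wire type 2: key, payload length, then the payload itself."""
        assert len(payload) < 0x80, 'this sketch only handles short payloads'
        return tag(field_number, 2) + bytes([len(payload)]) + payload

    def small_varint_field(field_number, value):
        """Wire type 0 with a value small enough to fit in one varint byte."""
        assert 0 <= value < 0x80
        return tag(field_number, 0) + bytes([value])

    person = (
        small_varint_field(1, 1)                  # id = 1
        + length_delimited(2, b'jim')             # name = 'jim'
        + length_delimited(3, b'jim@gmail.com')   # email = 'jim@gmail.com'
    )
    people = length_delimited(1, person)          # People.people[0]

Nested messages are simply length-delimited fields whose payload is the encoded inner message, which is why ``people`` wraps ``person`` with one more key and length.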

reflection
~~~~~~~~~~

.. code-block:: python

    >>> from foo_pb import Person, People
    >>> dir(Person._fields[0])
    ['__class__', '__delattr__', '__doc__', '__format__', '__get__', '__getattribute__', '__hash__', '__init__', '__new__', '__pyx_vtable__', '__reduce__', '__reduce_ex__', '__repr__', '__setattr__', '__sizeof__', '__str__', '__subclasshook__', 'index', 'name', 'packed', 'repeated', 'required', 'wire_type']
    >>> Person._fields[0].name
    'email'
    >>> Person._fieldsmap
    {1: <cprotobuf.Field object at 0xb74a538c>, 2: <cprotobuf.Field object at 0xb74a541c>, 3: <cprotobuf.Field object at 0xb74a5c8c>}
    >>> Person._fieldsmap_by_name
    {'email': <cprotobuf.Field object at 0xb74a5c8c>, 'name': <cprotobuf.Field object at 0xb74a541c>, 'id': <cprotobuf.Field object at 0xb74a538c>}

repeated container
~~~~~~~~~~~~~~~~~~

A repeated field is represented by a ``RepeatedContainer``. Since ``RepeatedContainer`` inherits from ``list``, you can manipulate it like a ``list`` or use APIs like those of Google's implementation.

.. code-block:: python

    >>> from foo_pb import Person, People
    >>> msg = People()
    >>> msg.people.add(
    ...    id = 1,
    ...    name = 'jim',
    ...    email = 'jim@gmail.com',
    ... )
    >>> p = msg.people.add()
    >>> p.id = 2
    >>> p.name = 'jake'
    >>> p.email = 'jake@gmail.com'
    >>> p2 = Person(id=3, name='lucy', email='lucy@gmail.com')
    >>> msg.people.append(p2)
    >>> msg.people.append({
    ...     'id' : 4,
    ...     'name' : 'lily',
    ...     'email' : 'lily@gmail.com',
    ... })
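
The idea of a ``list`` subclass that also offers an ``add`` method can be sketched in a few lines. ``ToyRepeated`` and ``ToyPerson`` below are hypothetical stand-ins written for this illustration (Python 3); they are not the actual ``RepeatedContainer``, and in particular the real class may convert dictionaries at a different point than ``append``.

.. code-block:: python

    class ToyPerson(object):
        """Stand-in message type that just stores keyword arguments as attributes."""
        def __init__(self, **fields):
            for name, value in fields.items():
                setattr(self, name, value)

    class ToyRepeated(list):
        """A list that can also build its own items, like the official add() API."""
        def __init__(self, item_type):
            super().__init__()
            self._item_type = item_type

        def add(self, **fields):
            # Create a new item, store it, and hand it back for further edits.
            item = self._item_type(**fields)
            list.append(self, item)
            return item

        def append(self, value):
            # Accept either a ready-made item or a dict of field values.
            if isinstance(value, dict):
                value = self._item_type(**value)
            list.append(self, value)

    people = ToyRepeated(ToyPerson)
    first = people.add(id=1, name='jim')
    people.append(ToyPerson(id=3, name='lucy'))
    people.append({'id': 4, 'name': 'lily'})

Because the container is still a ``list``, slicing, iteration, and ``len`` work without any extra code.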

encode raw data fast
~~~~~~~~~~~~~~~~~~~~

If your messages are already represented as ``list`` and ``dict`` objects, you can encode them without constructing intermediate objects, getting rid of a lot of overhead:

.. code-block:: python

    >>> from cprotobuf import encode_data
    >>> from foo_pb import Person, People
    >>> s = encode_data(People, [
    ...     { 'id': 1, 'name': 'tom', 'email': 'tom@gmail.com' }
    ... ])
    >>> msg = People()
    >>> msg.ParseFromString(s)
    >>> msg.people[0].name
    'tom'

Utility APIs
------------

.. code-block:: python

    >>> from cprotobuf import encode_primitive, decode_primitive
    >>> encode_primitive('uint64', 10)
    bytearray(b'\n')
    >>> decode_primitive(b'\n', 'uint64')
    (10, 1)
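
Integer types such as ``uint64`` use the protobuf varint encoding, which explains why the single byte ``b'\n'`` (``0x0A``) decodes to 10. The pure-Python sketch below (Python 3) illustrates that encoding; ``encode_varint`` and ``decode_varint`` are names chosen for this example, not cprotobuf functions, and reading the second tuple element as "bytes consumed" is an assumption based on the output shown above.

.. code-block:: python

    def encode_varint(value):
        """Encode a non-negative integer as a protobuf varint."""
        if value < 0:
            raise ValueError('this sketch only handles unsigned values')
        out = bytearray()
        while True:
            low7 = value & 0x7F
            value >>= 7
            if value:
                # More bytes follow: set the continuation bit.
                out.append(low7 | 0x80)
            else:
                out.append(low7)
                return out

    def decode_varint(data, pos=0):
        """Decode one varint from data, returning (value, bytes consumed)."""
        result = 0
        shift = 0
        start = pos
        while True:
            byte = data[pos]
            pos += 1
            # Each byte contributes seven bits, least significant group first.
            result |= (byte & 0x7F) << shift
            if not byte & 0x80:
                return result, pos - start
            shift += 7

Values below 128 fit in one byte, while a larger value such as 300 needs two: ``encode_varint(300)`` gives ``bytearray(b'\xac\x02')``.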

Run Tests
=========

.. code-block:: bash

    $ nosetests
            
