beavers


| Field | Value |
| --- | --- |
| Name | beavers |
| Version | 0.12.0 |
| Home page | https://github.com/tradewelltech/beavers |
| Summary | Python stream processing |
| Upload time | 2024-11-25 12:11:10 |
| Maintainer | 0x26res |
| Docs URL | None |
| Author | Tradewell Tech |
| Requires Python | `<4,>=3.10` |
| License | Apache-2.0 |
| Keywords | apache-arrow, streaming, data |
| Requirements | No requirements were recorded. |
| Travis-CI | No Travis. |
| Coveralls test coverage | No coveralls. |
            
[![PyPI Version][pypi-image]][pypi-url]
[![Python Version][versions-image]][versions-url]
[![Github Stars][stars-image]][stars-url]
[![codecov][codecov-image]][codecov-url]
[![Build Status][build-image]][build-url]
[![Documentation][doc-image]][doc-url]
[![License][license-image]][license-url]
[![Downloads][downloads-image]][downloads-url]
[![Downloads][downloads-month-image]][downloads-month-url]
[![Code style: black][codestyle-image]][codestyle-url]

![Beavers Logo][5]

# Beavers

[Documentation][6] / [Installation][7] / [Repository][1] / [PyPI][8]

[Beavers][1] is a Python library for stream processing, optimized for analytics.

It is used at [Tradewell Technologies][2]
to calculate analytics and serve model predictions
for both real-time and batch jobs.

## Key Features

- Works in **real time** (e.g. reading from Kafka) and in **replay mode** (e.g. reading from Parquet files).
- Optimized for analytics: it processes micro-batches instead of individual records.
- Like [incremental][3], it updates nodes in a DAG incrementally.
- Inspired by [Kafka Streams][4], the DAG has two types of nodes:
    - **Stream**: ephemeral micro-batches of events (cleared after every cycle).
    - **State**: durable state derived from streams.
- Clear separation between the business logic and the IO,
  so the same DAG can run in real-time mode, run in replay mode, or be tested easily.
- Functional interface: no inheritance or decorator required.
- Support for complicated joins, not just "linear" data flow.
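
The difference between the two node types can be illustrated with a minimal plain-Python sketch. It does not use the Beavers API: `StreamNode`, `StateNode` and `run_cycle` are hypothetical names made up for this illustration, and the event fields (`symbol`, `price`) are assumptions.

```python
from dataclasses import dataclass, field


@dataclass
class StreamNode:
    """Holds the micro-batch of the current cycle only (hypothetical class)."""

    batch: list = field(default_factory=list)


@dataclass
class StateNode:
    """Holds a value that survives across cycles (hypothetical class)."""

    value: dict = field(default_factory=dict)


def run_cycle(events: list[dict], stream: StreamNode, state: StateNode) -> None:
    """Process one micro-batch: refill the stream, then fold it into the state."""
    # Each cycle starts from a fresh batch, so events from earlier cycles are gone.
    stream.batch = list(events)
    # The state is updated incrementally from the new batch only.
    for event in stream.batch:
        state.value[event["symbol"]] = event["price"]


stream, state = StreamNode(), StateNode()
run_cycle([{"symbol": "A", "price": 1.0}, {"symbol": "B", "price": 2.0}], stream, state)
run_cycle([{"symbol": "A", "price": 1.5}], stream, state)
```

After the second cycle, `stream.batch` holds only the latest event, while `state.value` still remembers the last price seen for both symbols.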

## Limitations

- No concurrency support.
  To speed up calculations, use libraries like pandas, pyarrow or polars.
- No async code.
  To speed up IO, use the Kafka driver's native threads or the Parquet IO thread pool.
- No support for persistent state.
  Instead of saving state, replay historic data from Kafka to prime stateful nodes.
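
Priming state by replay can be sketched in plain Python as follows. This is not the Beavers API: `update_positions` and `run` are hypothetical helpers, and the in-memory lists stand in for batches that a real job would read from Kafka.

```python
from typing import Iterable


def update_positions(positions: dict[str, int], batch: Iterable[dict]) -> dict[str, int]:
    """Pure business logic: fold one micro-batch of trades into the positions."""
    updated = dict(positions)
    for trade in batch:
        updated[trade["symbol"]] = updated.get(trade["symbol"], 0) + trade["quantity"]
    return updated


def run(batches: Iterable[list[dict]], positions: dict[str, int]) -> dict[str, int]:
    """Drive the same logic over any source of micro-batches."""
    for batch in batches:
        positions = update_positions(positions, batch)
    return positions


# Stand-in for historic messages replayed from the start of a topic.
historic_batches = [
    [{"symbol": "A", "quantity": 10}, {"symbol": "B", "quantity": 5}],
    [{"symbol": "A", "quantity": -4}],
]
# Stand-in for messages that arrive once the job is live.
live_batches = [[{"symbol": "B", "quantity": 1}]]

primed = run(historic_batches, {})  # rebuild the state instead of loading a snapshot
current = run(live_batches, primed)  # carry on with live data
```

Because the logic is a pure function of the previous state and the incoming batch, the same code serves the replay phase and the live phase, and no snapshot has to be written to disk.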

## Talks

- [Unified batch and stream processing in python | PyData Global 2023][9]

[1]: https://github.com/tradewelltech/beavers
[2]: https://www.tradewelltech.co/
[3]: https://github.com/janestreet/incremental
[4]: https://www.confluent.io/blog/kafka-streams-tables-part-1-event-streaming/
[5]: https://raw.githubusercontent.com/tradewelltech/beavers/master/docs/static/icons/beavers/logo.svg
[6]: https://beavers.readthedocs.io/en/latest/
[7]: https://beavers.readthedocs.io/en/latest/install/
[8]: https://pypi.org/project/beavers/
[9]: https://www.youtube.com/watch?v=8pUwsGA8SQM

[pypi-image]: https://img.shields.io/pypi/v/beavers
[pypi-url]: https://pypi.org/project/beavers/
[build-image]: https://github.com/tradewelltech/beavers/actions/workflows/ci.yaml/badge.svg
[build-url]: https://github.com/tradewelltech/beavers/actions/workflows/ci.yaml
[stars-image]: https://img.shields.io/github/stars/tradewelltech/beavers
[stars-url]: https://github.com/tradewelltech/beavers
[versions-image]: https://img.shields.io/pypi/pyversions/beavers
[versions-url]: https://pypi.org/project/beavers/
[doc-image]: https://readthedocs.org/projects/beavers/badge/?version=latest
[doc-url]: https://beavers.readthedocs.io/en/latest/?badge=latest
[license-image]: http://img.shields.io/:license-Apache%202-blue.svg
[license-url]: https://github.com/tradewelltech/beavers/blob/main/LICENSE
[codecov-image]: https://codecov.io/gh/tradewelltech/beavers/branch/main/graph/badge.svg?token=GY6KL7NT1Q
[codecov-url]: https://codecov.io/gh/tradewelltech/beavers
[downloads-image]: https://pepy.tech/badge/beavers
[downloads-url]: https://static.pepy.tech/badge/beavers
[downloads-month-image]: https://pepy.tech/badge/beavers/month
[downloads-month-url]: https://static.pepy.tech/badge/beavers/month
[codestyle-image]: https://img.shields.io/badge/code%20style-black-000000.svg
[codestyle-url]: https://github.com/ambv/black
[snyk-image]: https://snyk.io/advisor/python/beavers/badge.svg
[snyk-url]: https://snyk.io/advisor/python/beavers

            

Raw data

{
    "_id": null,
    "home_page": "https://github.com/tradewelltech/beavers",
    "name": "beavers",
    "maintainer": "0x26res",
    "docs_url": null,
    "requires_python": "<4,>=3.10",
    "maintainer_email": "0x26res@gmail.com",
    "keywords": "apache-arrow, streaming, data",
    "author": "Tradewell Tech",
    "author_email": "engineering@tradewelltech.co",
    "download_url": "https://files.pythonhosted.org/packages/44/bc/2274205c7c3bede375e2164b9e8defbfe73179f4f69216302c7fcbf63b05/beavers-0.12.0.tar.gz",
    "platform": null,
    "bugtrack_url": null,
    "license": "Apache-2.0",
    "summary": "Python stream processing",
    "version": "0.12.0",
    "project_urls": {
        "Bug Tracker": "https://github.com/tradewelltech/beavers/issues",
        "Changelog": "https://github.com/tradewelltech/beavers/blob/main/CHANGELOG.md",
        "Documentation": "https://beavers.readthedocs.io/en/latest/",
        "Homepage": "https://github.com/tradewelltech/beavers",
        "Repository": "https://github.com/tradewelltech/beavers"
    },
    "split_keywords": [
        "apache-arrow",
        " streaming",
        " data"
    ],
    "urls": [
        {
            "comment_text": "",
            "digests": {
                "blake2b_256": "f6800595d03cb2dc49c6cb76e60bfa6242559de5cce07cee43e26b4ff1773023",
                "md5": "97a0432f6cf0d5f38e89958c1a1b8a96",
                "sha256": "69bde622542d34dbb50cc010856846ae6f85ca0ebbbb3b4e9fabd718399f3f54"
            },
            "downloads": -1,
            "filename": "beavers-0.12.0-py3-none-any.whl",
            "has_sig": false,
            "md5_digest": "97a0432f6cf0d5f38e89958c1a1b8a96",
            "packagetype": "bdist_wheel",
            "python_version": "py3",
            "requires_python": "<4,>=3.10",
            "size": 31366,
            "upload_time": "2024-11-25T12:11:08",
            "upload_time_iso_8601": "2024-11-25T12:11:08.837485Z",
            "url": "https://files.pythonhosted.org/packages/f6/80/0595d03cb2dc49c6cb76e60bfa6242559de5cce07cee43e26b4ff1773023/beavers-0.12.0-py3-none-any.whl",
            "yanked": false,
            "yanked_reason": null
        },
        {
            "comment_text": "",
            "digests": {
                "blake2b_256": "44bc2274205c7c3bede375e2164b9e8defbfe73179f4f69216302c7fcbf63b05",
                "md5": "4953f48f7974dbfaa0e36bf5bb8221b5",
                "sha256": "3883a38efa2bc8dc8994c4ee05a90b2fface258f120b738bf03909536f47be25"
            },
            "downloads": -1,
            "filename": "beavers-0.12.0.tar.gz",
            "has_sig": false,
            "md5_digest": "4953f48f7974dbfaa0e36bf5bb8221b5",
            "packagetype": "sdist",
            "python_version": "source",
            "requires_python": "<4,>=3.10",
            "size": 29627,
            "upload_time": "2024-11-25T12:11:10",
            "upload_time_iso_8601": "2024-11-25T12:11:10.403072Z",
            "url": "https://files.pythonhosted.org/packages/44/bc/2274205c7c3bede375e2164b9e8defbfe73179f4f69216302c7fcbf63b05/beavers-0.12.0.tar.gz",
            "yanked": false,
            "yanked_reason": null
        }
    ],
    "upload_time": "2024-11-25 12:11:10",
    "github": true,
    "gitlab": false,
    "bitbucket": false,
    "codeberg": false,
    "github_user": "tradewelltech",
    "github_project": "beavers",
    "travis_ci": false,
    "coveralls": false,
    "github_actions": true,
    "tox": true,
    "lcname": "beavers"
}
        