:Name: pandas-streaming
:Version: 0.5.0
:Home page: https://github.com/sdpython/pandas-streaming
:Summary: Array (and numpy) API for ONNX
:Upload time: 2024-01-13 12:27:56
:Author: Xavier Dupré

pandas-streaming: streaming API over pandas
===========================================

.. image:: https://ci.appveyor.com/api/projects/status/4te066r8ne1ymmhy?svg=true
    :target: https://ci.appveyor.com/project/sdpython/pandas-streaming
    :alt: Build Status Windows

.. image:: https://dl.circleci.com/status-badge/img/gh/sdpython/pandas-streaming/tree/main.svg?style=svg
    :target: https://dl.circleci.com/status-badge/redirect/gh/sdpython/pandas-streaming/tree/main

.. image:: https://dev.azure.com/xavierdupre3/pandas_streaming/_apis/build/status/sdpython.pandas_streaming
    :target: https://dev.azure.com/xavierdupre3/pandas_streaming/

.. image:: https://badge.fury.io/py/pandas_streaming.svg
    :target: http://badge.fury.io/py/pandas_streaming

.. image:: https://img.shields.io/badge/license-MIT-blue.svg
    :alt: MIT License
    :target: https://opensource.org/license/MIT/

.. image:: https://codecov.io/gh/sdpython/pandas-streaming/branch/main/graph/badge.svg?token=0caHX1rhr8
    :target: https://codecov.io/gh/sdpython/pandas-streaming

.. image:: http://img.shields.io/github/issues/sdpython/pandas_streaming.png
    :alt: GitHub Issues
    :target: https://github.com/sdpython/pandas_streaming/issues

.. image:: https://pepy.tech/badge/pandas_streaming/month
    :target: https://pepy.tech/project/pandas_streaming/month
    :alt: Downloads

.. image:: https://img.shields.io/github/forks/sdpython/pandas_streaming.svg
    :target: https://github.com/sdpython/pandas_streaming/
    :alt: Forks

.. image:: https://img.shields.io/github/stars/sdpython/pandas_streaming.svg
    :target: https://github.com/sdpython/pandas_streaming/
    :alt: Stars

.. image:: https://img.shields.io/github/repo-size/sdpython/pandas_streaming
    :target: https://github.com/sdpython/pandas_streaming/
    :alt: size

`pandas-streaming <https://sdpython.github.io/doc/pandas-streaming/dev/>`_
processes big files with `pandas <https://pandas.pydata.org/>`_:
files too big to hold in memory, yet too small for parallelization to bring a significant gain.
The module replicates a subset of the *pandas* API
and implements additional functionality for machine learning.

.. code-block:: python

    from pandas_streaming.df import StreamingDataFrame
    sdf = StreamingDataFrame.read_csv("filename", sep="\t", encoding="utf-8")

    for df in sdf:
        # process this chunk of data
        # df is a dataframe
        print(df)
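
To make the chunk-by-chunk idea concrete, here is a minimal sketch that
uses plain *pandas* only (``pandas.read_csv`` with ``chunksize``), not
``StreamingDataFrame`` itself. The in-memory CSV and the column names
``key`` and ``value`` are assumptions made for illustration. It shows
how an aggregate can be accumulated without ever holding the whole
dataset in memory.

.. code-block:: python

    import io

    import pandas

    # Hypothetical in-memory CSV standing in for a file too big to load at once.
    data = io.StringIO("key,value\na,1\nb,2\na,3\nb,4\na,5\n")

    total = 0
    n_rows = 0
    # With chunksize, pandas.read_csv returns an iterator of dataframes.
    for chunk in pandas.read_csv(data, chunksize=2):
        # Only the current chunk is in memory; keep running totals.
        total += chunk["value"].sum()
        n_rows += len(chunk)

    mean = total / n_rows
    print(n_rows, total, mean)

The same loop body should apply to the dataframes yielded when iterating
over a ``StreamingDataFrame``, since each of them is a regular dataframe.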

The module can also stream an existing dataframe.

.. code-block:: python

    import pandas
    from pandas_streaming.df import StreamingDataFrame

    df = pandas.DataFrame([dict(cf=0, cint=0, cstr="0"),
                           dict(cf=1, cint=1, cstr="1"),
                           dict(cf=3, cint=3, cstr="3")])
    sdf = StreamingDataFrame.read_df(df)

    for chunk in sdf:
        # process this chunk of data
        # chunk is a dataframe
        print(chunk)

It also contains helpers to split datasets into
train and test sets under unusual constraints.
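
The text does not list these constraints. As an illustration only, the
sketch below assumes one plausible constraint: all rows sharing a group
key must land on the same side of the split. It is written with plain
*pandas* and does not use the module's own helpers; the column names
``user`` and ``x`` and the 25% test ratio are hypothetical.

.. code-block:: python

    import pandas

    # Hypothetical dataset: several rows per user.
    df = pandas.DataFrame({
        "user": ["u1", "u1", "u2", "u3", "u3", "u4"],
        "x": [0, 1, 2, 3, 4, 5],
    })

    # Shuffle the distinct groups deterministically,
    # then assign whole groups to the test set.
    groups = pandas.Series(df["user"].unique()).sample(frac=1, random_state=0)
    n_test = max(1, int(len(groups) * 0.25))
    test_groups = set(groups.iloc[:n_test])

    is_test = df["user"].isin(test_groups)
    train, test = df[~is_test], df[is_test]

    # No group appears on both sides of the split.
    assert set(train["user"]).isdisjoint(test["user"])

Splitting on groups rather than rows avoids leaking information about
a group from the train set into the test set.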

            
