tsum


| Field | Value |
| --- | --- |
| Name | tsum |
| Version | 0.1.0 |
| home_page | None |
| Summary | Summarize data in Dask DataFrames. |
| upload_time | 2024-04-07 06:09:21 |
| maintainer | None |
| docs_url | None |
| author | Fasih Khatib |
| requires_python | `<4.0,>=3.9` |
| license | None |
| keywords | |
| VCS | |
| bugtrack_url | |
| requirements | No requirements were recorded. |
| Travis-CI | No Travis. |
| coveralls test coverage | No coveralls. |

## TSum - Table Summarization

> Given a table where rows correspond to records and columns correspond to attributes, we want to find a small number of patterns that succinctly summarize the dataset. 

TSum is a [table summarization algorithm published by Google Research](https://static.googleusercontent.com/media/research.google.com/en//pubs/archive/41683.pdf). This package is a Python implementation of the algorithm that uses Dask DataFrames for scale.
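
To make the idea of a "pattern" concrete, here is a minimal standard-library sketch. It assumes a pattern is a set of attribute=value conditions, with any attribute left out acting as a wildcard. The helpers `matches` and `coverage` and the sample rows are hypothetical and are not part of the `tsum` API; the score shown is plain row coverage, not the measure the algorithm itself uses to rank patterns.

```python
from typing import Mapping, Sequence

Row = Mapping[str, object]


def matches(pattern: Mapping[str, object], row: Row) -> bool:
    """Return True if the row satisfies every attribute=value condition."""
    # Attributes absent from the pattern are wildcards, so an empty
    # pattern matches every row.
    return all(row.get(attr) == value for attr, value in pattern.items())


def coverage(pattern: Mapping[str, object], rows: Sequence[Row]) -> float:
    """Return the fraction of rows that the pattern matches."""
    if not rows:
        return 0.0
    return sum(matches(pattern, row) for row in rows) / len(rows)


# Hypothetical toy table: each dict is one record.
rows = [
    {"gender": "M", "age": "adult", "blood_pressure": "normal"},
    {"gender": "M", "age": "adult", "blood_pressure": "low"},
    {"gender": "F", "age": "child", "blood_pressure": "low"},
    {"gender": "M", "age": "adult", "blood_pressure": "normal"},
]

# One pattern with two conditions; blood_pressure is a wildcard.
pattern = {"gender": "M", "age": "adult"}
print(coverage(pattern, rows))  # 0.75
```

A pattern with more conditions is more specific but usually covers fewer rows, which is the trade-off a summary made of only a few patterns has to balance.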

### Usage

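```python
import dask.dataframe as dd
from dask.distributed import LocalCluster
from tsum import summarize, Pattern

# Start a local Dask cluster: one worker process with eight threads,
# serving the diagnostics dashboard on port 8787.
cluster = LocalCluster(n_workers=1, threads_per_worker=8, dashboard_address=":8787")
client = cluster.get_client()

# Load the table to summarize as a Dask DataFrame.
ddf: dd.DataFrame = ...
patterns: list[Pattern] = summarize(ddf=ddf)
```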

            

Raw data

{
    "_id": null,
    "home_page": null,
    "name": "tsum",
    "maintainer": null,
    "docs_url": null,
    "requires_python": "<4.0,>=3.9",
    "maintainer_email": null,
    "keywords": null,
    "author": "Fasih Khatib",
    "author_email": "hellofasih.confound928@passinbox.com",
    "download_url": "https://files.pythonhosted.org/packages/04/9e/51a253c6bb4690411190d5acaa78d324ec11fd62e25cb0ef96877483b432/tsum-0.1.0.tar.gz",
    "platform": null,
    "description": "## TSum - Table Summarization\n\n> Given a table where rows correspond to records and columns correspond to attributes, we want to find a small number of patterns that succinctly summarize the dataset. \n\nTSum is a [table summarization algorithm published by Google Research.](https://static.googleusercontent.com/media/research.google.com/en//pubs/archive/41683.pdf) This is a Python implementation of the algorithm using Dask Dataframes for scale.  \n\n### Usage\n\n```python\nimport dask.dataframe as dd\nfrom tsum import summarize, Pattern\nfrom dask.distributed import LocalCluster\n\ncluster = LocalCluster(n_workers=1, nthreads=8, diagnostics_port=8787)\nclient = cluster.get_client()\nddf: dd.DataFrame = ...\npatterns: list[Pattern] = summarize(ddf=ddf)\n```\n",
    "bugtrack_url": null,
    "license": null,
    "summary": "Summarize data in Dask DataFrames.",
    "version": "0.1.0",
    "project_urls": null,
    "split_keywords": [],
    "urls": [
        {
            "comment_text": "",
            "digests": {
                "blake2b_256": "f128c22302bfd944aa9f0b4616b4a457a51de7f89de104118b552338472d8c3b",
                "md5": "884cea66f693887829e87e5fa2101e0b",
                "sha256": "8714a2e3e34d229bf584ddec635b3259ef1f3b9c58a53b11842edebc38f27bee"
            },
            "downloads": -1,
            "filename": "tsum-0.1.0-py3-none-any.whl",
            "has_sig": false,
            "md5_digest": "884cea66f693887829e87e5fa2101e0b",
            "packagetype": "bdist_wheel",
            "python_version": "py3",
            "requires_python": "<4.0,>=3.9",
            "size": 13993,
            "upload_time": "2024-04-07T06:09:17",
            "upload_time_iso_8601": "2024-04-07T06:09:17.893007Z",
            "url": "https://files.pythonhosted.org/packages/f1/28/c22302bfd944aa9f0b4616b4a457a51de7f89de104118b552338472d8c3b/tsum-0.1.0-py3-none-any.whl",
            "yanked": false,
            "yanked_reason": null
        },
        {
            "comment_text": "",
            "digests": {
                "blake2b_256": "049e51a253c6bb4690411190d5acaa78d324ec11fd62e25cb0ef96877483b432",
                "md5": "9f33d1eb019852e10e6d727b3a49320f",
                "sha256": "4828d78e827290848a19c028dce1056abca56a75a4c62fea3b0d50bac2f4eb67"
            },
            "downloads": -1,
            "filename": "tsum-0.1.0.tar.gz",
            "has_sig": false,
            "md5_digest": "9f33d1eb019852e10e6d727b3a49320f",
            "packagetype": "sdist",
            "python_version": "source",
            "requires_python": "<4.0,>=3.9",
            "size": 7665,
            "upload_time": "2024-04-07T06:09:21",
            "upload_time_iso_8601": "2024-04-07T06:09:21.033451Z",
            "url": "https://files.pythonhosted.org/packages/04/9e/51a253c6bb4690411190d5acaa78d324ec11fd62e25cb0ef96877483b432/tsum-0.1.0.tar.gz",
            "yanked": false,
            "yanked_reason": null
        }
    ],
    "upload_time": "2024-04-07 06:09:21",
    "github": false,
    "gitlab": false,
    "bitbucket": false,
    "codeberg": false,
    "lcname": "tsum"
}
        