# Tabeline
Tabeline is a data frame and data grammar library. You write expressions as strings and pass them to methods on the `DataFrame` class. The strings are parsed by Parsita and converted into Polars expressions for execution.
Tabeline draws inspiration from dplyr, the data grammar of R's tidyverse, especially for its method names. The `filter`, `mutate`, `group_by`, and `summarize` methods should all feel familiar. But Tabeline aims to be as proper a Python library as can be, chaining methods rather than using the pipes that are standard in R.
Tabeline uses Polars under the hood but adds handling for many edge cases that, in Polars alone, result in crashes or behavior that is not type stable.
See the [Documentation](https://tabeline.drhagen.com) for the full user guide.
## Installation
It is recommended to install Tabeline from PyPI using `pip`.
```shell
pip install tabeline
```
## Motivating example
```python
from tabeline import DataFrame

# Construct a data frame using clean syntax
# from_csv, from_pandas, and from_polars are also available
df = DataFrame(
    id=[0, 0, 0, 0, 1, 1, 1, 1, 1],
    t=[0, 6, 12, 24, 0, 6, 12, 24, 48],
    y=[0, 2, 3, 1, 0, 4, 3, 2, 1],
)

# Use data grammar methods and string expressions to define
# transformed data frames
analysis = (
    df
    .filter("t <= 24")
    .group_by("id")
    .summarize(auc="trapz(t, y)")
)

print(analysis)
# shape: (2, 2)
# ┌─────┬──────┐
# │ id  ┆ auc  │
# │ --- ┆ ---  │
# │ i64 ┆ f64  │
# ╞═════╪══════╡
# │ 0   ┆ 45.0 │
# ├╌╌╌╌╌┼╌╌╌╌╌╌┤
# │ 1   ┆ 63.0 │
# └─────┴──────┘
```
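To make the numbers in the output easier to check, here is a minimal plain-Python sketch of the same pipeline. It assumes that `trapz(t, y)` is the ordinary trapezoidal rule, which is consistent with the printed result. The `trapezoid` helper and the `groups` dictionary are hypothetical names used only for illustration; this is not how Tabeline computes the result internally.

```python
from collections import defaultdict

# Same columns as the motivating example
ids = [0, 0, 0, 0, 1, 1, 1, 1, 1]
ts = [0, 6, 12, 24, 0, 6, 12, 24, 48]
ys = [0, 2, 3, 1, 0, 4, 3, 2, 1]


def trapezoid(t, y):
    """Trapezoidal-rule area under y(t); hypothetical helper, not Tabeline code."""
    return sum(
        (t1 - t0) * (y0 + y1) / 2
        for t0, t1, y0, y1 in zip(t, t[1:], y, y[1:])
    )


# Equivalent of filter("t <= 24") followed by group_by("id")
groups = defaultdict(lambda: ([], []))
for i, t, y in zip(ids, ts, ys):
    if t <= 24:
        groups[i][0].append(t)
        groups[i][1].append(y)

# Equivalent of summarize(auc="trapz(t, y)")
auc = {i: trapezoid(t, y) for i, (t, y) in groups.items()}
print(auc)  # {0: 45.0, 1: 63.0}
```

For `id` 0 the three trapezoids contribute 6 + 15 + 24 = 45, and for `id` 1 (after the `t = 48` row is filtered out) 12 + 21 + 30 = 63, matching the `auc` column above.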