Name | dask-expr |
Version | 1.1.0 |
home_page | None |
Summary | High Level Expressions for Dask |
upload_time | 2024-05-03 21:20:28 |
maintainer | None |
docs_url | None |
author | None |
requires_python | >=3.9 |
license | BSD |
keywords | dask, pandas |
VCS | |
bugtrack_url | |
requirements | No requirements were recorded. |
Travis-CI | No Travis. |
coveralls test coverage | No coveralls. |
Dask Expressions
================
Dask DataFrames with query optimization.
This is a rewrite of Dask DataFrame that includes query
optimization and generally improved organization.
More in our blog posts:
- [Dask Expressions overview](https://blog.dask.org/2023/08/25/dask-expr-introduction)
- [TPC-H benchmark results vs. Dask DataFrame](https://blog.coiled.io/blog/dask-expr-tpch-dask.html)
Example
-------
```python
import dask_expr as dx
df = dx.datasets.timeseries()
df.head()
df.groupby("name").x.mean().compute()
```
Query Representation
--------------------
Dask-expr encodes user code in an expression tree:
```python
>>> df.x.mean().pprint()
Mean:
  Projection: columns='x'
    Timeseries: seed=1896674884
```
This expression tree will be optimized and modified before execution:
```python
>>> df.x.mean().optimize().pprint()
Div:
  Sum:
    Fused(375f9):
    | Projection: columns='x'
    |   Timeseries: dtypes={'x': <class 'float'>} seed=1896674884
  Count:
    Fused(375f9):
    | Projection: columns='x'
    |   Timeseries: dtypes={'x': <class 'float'>} seed=1896674884
```
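The rewrite from `Mean` to `Div` over `Sum` and `Count` can be illustrated with a toy expression tree in plain Python. This is only a sketch of the general idea: the classes `Expr`, `Source`, `Projection`, `Mean`, `Sum`, `Count`, `Div` and the function `lower` are hypothetical names chosen for illustration, not dask-expr's internals, and the sketch does not model the fusion step that appears as `Fused(...)` in the real output.

```python
from dataclasses import dataclass, fields


@dataclass(frozen=True)
class Expr:
    """Node of a toy expression tree (hypothetical, not dask-expr's class)."""

    def children(self):
        # Fields that hold other Expr nodes are this node's children.
        return [getattr(self, f.name) for f in fields(self)
                if isinstance(getattr(self, f.name), Expr)]

    def tree(self, depth=0):
        # Render the node and its children, one indented line per node.
        params = " ".join(
            f"{f.name}={getattr(self, f.name)!r}" for f in fields(self)
            if not isinstance(getattr(self, f.name), Expr)
        )
        head = "  " * depth + f"{type(self).__name__}: {params}".rstrip()
        return "\n".join([head] + [c.tree(depth + 1) for c in self.children()])


@dataclass(frozen=True)
class Source(Expr):
    name: str


@dataclass(frozen=True)
class Projection(Expr):
    frame: Expr
    columns: str


@dataclass(frozen=True)
class Mean(Expr):
    frame: Expr


@dataclass(frozen=True)
class Sum(Expr):
    frame: Expr


@dataclass(frozen=True)
class Count(Expr):
    frame: Expr


@dataclass(frozen=True)
class Div(Expr):
    left: Expr
    right: Expr


def lower(expr):
    """Rewrite every Mean(x) in the tree into Div(Sum(x), Count(x))."""
    # Rebuild the node with its children lowered first (bottom-up rewrite).
    kwargs = {}
    for f in fields(expr):
        value = getattr(expr, f.name)
        kwargs[f.name] = lower(value) if isinstance(value, Expr) else value
    node = type(expr)(**kwargs)
    if isinstance(node, Mean):
        return Div(Sum(node.frame), Count(node.frame))
    return node


query = Mean(Projection(Source("timeseries"), columns="x"))
print(query.tree())
print(lower(query).tree())
```

Because the nodes are immutable, a rewrite returns a new tree and leaves the user's original expression untouched, so the unoptimized tree can still be inspected.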
Stability
---------
Dask-expr has been the default backend for `dask.dataframe` since Dask version 2024.3.0.
API Coverage
------------
Dask-Expr covers almost all of the Dask DataFrame API. The only missing feature is:
- named GroupBy Aggregations
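
For readers unfamiliar with the term, "named aggregation" is the pandas keyword form `agg(new_name=(column, func))`. The sketch below uses plain pandas, not dask-expr, to show that form next to a dict-based aggregation followed by a rename, which gives the same result in pandas. Whether the dict-plus-rename form is a suitable stand-in in dask-expr 1.1.0 is an assumption to verify; the sample data is made up.

```python
import pandas as pd

# Made-up sample data for illustration.
df = pd.DataFrame({"name": ["a", "a", "b"], "x": [1.0, 3.0, 5.0]})

# Named aggregation: the keyword becomes the output column name.
named = df.groupby("name").agg(x_mean=("x", "mean"))

# Dict-based aggregation, then rename the output column.
renamed = df.groupby("name").agg({"x": "mean"}).rename(columns={"x": "x_mean"})

print(named)
print(renamed)
```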
Raw data
{
"_id": null,
"home_page": null,
"name": "dask-expr",
"maintainer": null,
"docs_url": null,
"requires_python": ">=3.9",
"maintainer_email": "Matthew Rocklin <mrocklin@gmail.com>",
"keywords": "dask pandas",
"author": null,
"author_email": null,
"download_url": "https://files.pythonhosted.org/packages/93/6e/201bfc4010d6b9eba3f510aabec223d6d0738032a010f0604e0dba4131c7/dask_expr-1.1.0.tar.gz",
"platform": null,
"description": "Dask Expressions\n================\n\nDask DataFrames with query optimization.\n\nThis is a rewrite of Dask DataFrame that includes query\noptimization and generally improved organization.\n\nMore in our blog posts:\n- [Dask Expressions overview](https://blog.dask.org/2023/08/25/dask-expr-introduction)\n- [TPC-H benchmark results vs. Dask DataFrame](https://blog.coiled.io/blog/dask-expr-tpch-dask.html)\n\nExample\n-------\n\n```python\nimport dask_expr as dx\n\ndf = dx.datasets.timeseries()\ndf.head()\n\ndf.groupby(\"name\").x.mean().compute()\n```\n\nQuery Representation\n--------------------\n\nDask-expr encodes user code in an expression tree:\n\n```python\n>>> df.x.mean().pprint()\n\nMean:\n Projection: columns='x'\n Timeseries: seed=1896674884\n```\n\nThis expression tree will be optimized and modified before execution:\n\n```python\n>>> df.x.mean().optimize().pprint()\n\nDiv:\n Sum:\n Fused(375f9):\n | Projection: columns='x'\n | Timeseries: dtypes={'x': <class 'float'>} seed=1896674884\n Count:\n Fused(375f9):\n | Projection: columns='x'\n | Timeseries: dtypes={'x': <class 'float'>} seed=1896674884\n```\n\nStability\n---------\n\nThis is the default backend for dask.DataFrame since version 2024.3.0.\n\nAPI Coverage\n------------\n\nDask-Expr covers almost everything of the Dask DataFrame API. The only missing features are:\n\n- named GroupBy Aggregations\n",
"bugtrack_url": null,
"license": "BSD",
"summary": "High Level Expressions for Dask",
"version": "1.1.0",
"project_urls": {
"Source code": "https://github.com/dask-contrib/dask-expr/"
},
"split_keywords": [
"dask",
"pandas"
],
"urls": [
{
"comment_text": "",
"digests": {
"blake2b_256": "f6b2ae3bccc04cab1a2fd54a11ba3db6ba0290197f9a8620b1d832e768002e3a",
"md5": "d15f7829b31e3730f16f519d8f29f263",
"sha256": "15c30460fd8493417f00b9c00e070d7037d20848b51d69b3347d0ebc6859e81f"
},
"downloads": -1,
"filename": "dask_expr-1.1.0-py3-none-any.whl",
"has_sig": false,
"md5_digest": "d15f7829b31e3730f16f519d8f29f263",
"packagetype": "bdist_wheel",
"python_version": "py3",
"requires_python": ">=3.9",
"size": 205125,
"upload_time": "2024-05-03T21:20:26",
"upload_time_iso_8601": "2024-05-03T21:20:26.112692Z",
"url": "https://files.pythonhosted.org/packages/f6/b2/ae3bccc04cab1a2fd54a11ba3db6ba0290197f9a8620b1d832e768002e3a/dask_expr-1.1.0-py3-none-any.whl",
"yanked": false,
"yanked_reason": null
},
{
"comment_text": "",
"digests": {
"blake2b_256": "936e201bfc4010d6b9eba3f510aabec223d6d0738032a010f0604e0dba4131c7",
"md5": "30a96b0a413f14893e52f272cc2c22cb",
"sha256": "cd0e0ad4d65229a4fd22a231fa9c7e632cef09c5b6ab7259bdaaaadd20179cdb"
},
"downloads": -1,
"filename": "dask_expr-1.1.0.tar.gz",
"has_sig": false,
"md5_digest": "30a96b0a413f14893e52f272cc2c22cb",
"packagetype": "sdist",
"python_version": "source",
"requires_python": ">=3.9",
"size": 188224,
"upload_time": "2024-05-03T21:20:28",
"upload_time_iso_8601": "2024-05-03T21:20:28.698587Z",
"url": "https://files.pythonhosted.org/packages/93/6e/201bfc4010d6b9eba3f510aabec223d6d0738032a010f0604e0dba4131c7/dask_expr-1.1.0.tar.gz",
"yanked": false,
"yanked_reason": null
}
],
"upload_time": "2024-05-03 21:20:28",
"github": true,
"gitlab": false,
"bitbucket": false,
"codeberg": false,
"github_user": "dask-contrib",
"github_project": "dask-expr",
"travis_ci": false,
"coveralls": false,
"github_actions": true,
"lcname": "dask-expr"
}