| Field | Value |
|---|---|
| Name | polars-expr-hopper |
| Version | 0.6.5 |
| home_page | None |
| Summary | A Polars plugin providing a 'hopper' of expressions for automatic, schema-aware application. |
| upload_time | 2025-02-13 09:53:06 |
| maintainer | None |
| docs_url | None |
| author | None |
| requires_python | >=3.9 |
| license | MIT |
| keywords | polars, plugin, filter, expr, metadata |
| requirements | No requirements were recorded. |
| Travis-CI | No Travis. |
| coveralls test coverage | No coveralls. |
# polars-expr-hopper
**Polars plugin providing an “expression hopper”**—a flexible, DataFrame-level container of **Polars expressions** (`pl.Expr`) that apply themselves **as soon as** the relevant columns are available.
Powered by [polars-config-meta](https://pypi.org/project/polars-config-meta/) for persistent DataFrame-level metadata.
Simplify data pipelines by storing your expressions in a single location and letting them apply **as soon as** the corresponding columns exist in the DataFrame schema.
## Installation
```bash
pip install polars-expr-hopper
```
> The `polars` dependency is required but not included in the package by default.
> It is shipped as an optional extra which can be activated by passing it in square brackets:
> ```bash
> pip install "polars-expr-hopper[polars]" # for standard Polars
> pip install "polars-expr-hopper[polars-lts-cpu]" # for older CPUs
> ```
### Requirements
- Python 3.9+
- Polars (any recent version, installed via `[polars]` or `[polars-lts-cpu]` extras)
- _(Optional)_ [pyarrow](https://pypi.org/project/pyarrow) if you want Parquet I/O features that preserve metadata in the hopper
## Features
- **DataFrame-Level Expression Management**: Store multiple Polars **expressions** on a DataFrame via the `.hopper` namespace.
- **Apply When Ready**: Each expression is automatically applied once the DataFrame has all columns required by that expression.
- **Namespace Plugin**: Access everything through `df.hopper.*(...)`—no subclassing or monkey-patching.
- **Metadata Preservation**: Transformations called through `df.hopper.<method>()` keep the same expression hopper on the new DataFrame.
- **No Central Orchestration**: Avoid fiddly pipeline step names or schemas—just attach your expressions once, and they get applied in the right order automatically.
- **Optional Serialisation**: If you want to store or share expressions across runs (e.g., Parquet round-trip), you can serialise them to JSON or binary and restore them later—without forcing overhead in normal usage.
## Usage
### Basic Usage Example
```python
import polars as pl
import polars_hopper  # This registers the .hopper plugin under pl.DataFrame

# Create an initial DataFrame
df = pl.DataFrame({
    "user_id": [1, 2, 3, 0],
    "name": ["Alice", "Bob", "Charlie", "NullUser"]
})

# Add expressions to the hopper:
# - This one is valid right away: pl.col("user_id") != 0
# - Another needs a future 'age' column
df.hopper.add_filters(pl.col("user_id") != 0)
df.hopper.add_filters(pl.col("age") > 18)  # 'age' doesn't exist yet

# Apply what we can; the first expression is immediately valid:
df = df.hopper.apply_ready_filters()
print(df)
# Rows with user_id=0 are dropped.

# Now let's do a transformation that adds an 'age' column.
# By calling df.hopper.with_columns(...), the plugin
# automatically copies the hopper metadata to the new DataFrame.
df2 = df.hopper.with_columns(
    pl.Series("age", [25, 15, 30])  # new column
)

# Now the second expression can be applied:
df2 = df2.hopper.apply_ready_filters()
print(df2)
# Only rows with age > 18 remain. That expression is then removed from the hopper.
```
### How It Works
Internally, **polars-expr-hopper** attaches a small “manager” object (a plugin namespace) to each `DataFrame`. This manager leverages [polars-config-meta](https://pypi.org/project/polars-config-meta/) to store data in `df.config_meta.get_metadata()`, keyed by the `id(df)`.
1. **List of In-Memory Expressions**
   - Maintains a `hopper_filters` list of Polars expressions (`pl.Expr`) in the DataFrame’s metadata.
   - Avoids Python callables or lambdas so that **.meta.root_names()** can be used for schema checks and optional serialisation is possible.
2. **Automatic Column Check** (`apply_ready_filters()`)
   - On `apply_ready_filters()`, each expression’s required columns (via `.meta.root_names()`) are compared to the current DataFrame schema.
   - Expressions referencing missing columns remain pending.
   - Expressions referencing all present columns are applied via `df.filter(expr)`.
   - Successfully applied expressions are removed from the hopper.
3. **Metadata Preservation**
   - Because we rely on **polars-config-meta**, transformations called through `df.hopper.select(...)`, `df.hopper.with_columns(...)`, etc. automatically copy the same `hopper_filters` list to the new DataFrame.
   - This ensures **pending** expressions remain valid throughout your pipeline until their columns finally appear.
4. **No Monkey-Patching**
   - Polars’ plugin system is used, so there is no monkey-patching of core Polars classes.
   - The plugin registers a `.hopper` namespace, just like `df.config_meta`, but specialised for expression management.
Together, these features allow you to:
- store a **set** of Polars expressions in one place
- apply them **as soon as** their required columns exist
- easily carry them forward through the pipeline
All without global orchestration or repeated expression checks.
This was motivated by wanting to build a flexible CLI tool that could express filters on its results at different steps without a proliferation of CLI flags. From there came the idea of a 'queue' that is pulled from on demand in FIFO order, on the condition that the current schema contains the columns each expression needs.
This idea **could be extended to `select` statements**, but initially filtering was the primary deliverable.
### API Methods
- `add_filters(*exprs: pl.Expr)`
  Add one or more Polars expressions (`pl.Expr`) to the hopper. Python callables and lambdas are not supported, because readiness is checked via `.meta.root_names()`.
- `apply_ready_filters() -> pl.DataFrame`
  Check each stored expression’s root names. If the columns exist, `df.filter(expr)` is applied. Successfully applied expressions are removed.
- `list_filters() -> List[pl.Expr]`
  Inspect the still-pending expressions in the hopper.
- `serialise_filters(format="binary"|"json") -> List[str|bytes]`
  Convert expressions to JSON strings or binary bytes.
- `deserialise_filters(serialised_list, format="binary"|"json")`
  Re-create in-memory `pl.Expr` objects from the serialised data, overwriting any existing expressions.
## Contributing
Maintained by [Louis Maddox](https://github.com/lmmx/polars-expr-hopper). Contributions welcome!
1. **Issues & Discussions**: Please open a GitHub issue or discussion for bugs, feature requests, or questions.
2. **Pull Requests**: PRs are welcome!
   - Install the dev extra (e.g. with [uv](https://docs.astral.sh/uv/)):
     `uv pip install -e ".[dev]"`
   - Run tests (when available) and include updates to docs or examples if relevant.
   - If reporting a bug, please include the version and any error messages/tracebacks.
## License
This project is licensed under the [MIT License](https://opensource.org/licenses/MIT).
Raw data
{
"_id": null,
"home_page": null,
"name": "polars-expr-hopper",
"maintainer": null,
"docs_url": null,
"requires_python": ">=3.9",
"maintainer_email": null,
"keywords": "polars, plugin, filter, expr, metadata",
"author": null,
"author_email": "Louis Maddox <louismmx@gmail.com>",
"download_url": "https://files.pythonhosted.org/packages/53/27/f479f2b6a79c0ea0fbc82a3faed2f83634d2e2b141edc7a04d77a0dfe5d1/polars_expr_hopper-0.6.5.tar.gz",
"platform": null,
"description": "# polars-expr-hopper\n\n<!-- [](https://pepy.tech/project/polars-expr-hopper) -->\n[](https://github.com/astral-sh/uv)\n[](https://pdm.fming.dev)\n[](https://pypi.org/project/polars-expr-hopper)\n[](https://pypi.org/project/polars-expr-hopper)\n[](https://pypi.org/project/polars-expr-hopper)\n[](https://results.pre-commit.ci/latest/github/lmmx/polars-expr-hopper/master)\n\n**Polars plugin providing an \u201cexpression hopper\u201d**\u2014a flexible, DataFrame-level container of **Polars expressions** (`pl.Expr`) that apply themselves **as soon as** the relevant columns are available.\n\nPowered by [polars-config-meta](https://pypi.org/project/polars-config-meta/) for persistent DataFrame-level metadata.\n\nSimplify data pipelines by storing your expressions in a single location and letting them apply **as soon as** the corresponding columns exist in the DataFrame schema.\n\n## Installation\n\n```bash\npip install polars-expr-hopper\n```\n\n> The `polars` dependency is required but not included in the package by default.\n> It is shipped as an optional extra which can be activated by passing it in square brackets:\n> ```bash\n> pip install polars-expr-hopper[polars] # for standard Polars\n> pip install polars-expr-hopper[polars-lts-cpu] # for older CPUs\n> ```\n\n### Requirements\n\n- Python 3.9+\n- Polars (any recent version, installed via `[polars]` or `[polars-lts-cpu]` extras)\n- _(Optional)_ [pyarrow](https://pypi.org/project/pyarrow) if you want Parquet I/O features that preserve metadata in the hopper\n\n## Features\n\n- **DataFrame-Level Expression Management**: Store multiple Polars **expressions** on a DataFrame via the `.hopper` namespace.\n- **Apply When Ready**: Each expression is automatically applied once the DataFrame has all columns required by that expression.\n- **Namespace Plugin**: Access everything through `df.hopper.*(...)`\u2014no subclassing or monkey-patching.\n- **Metadata Preservation**: Transformations called through 
`df.hopper.<method>()` keep the same expression hopper on the new DataFrame.\n- **No Central Orchestration**: Avoid fiddly pipeline step names or schemas\u2014just attach your expressions once, and they get applied in the right order automatically.\n- **Optional Serialisation**: If you want to store or share expressions across runs (e.g., Parquet round-trip), you can serialise them to JSON or binary and restore them later\u2014without forcing overhead in normal usage.\n\n## Usage\n\n### Basic Usage Example\n\n```python\nimport polars as pl\nimport polars_hopper # This registers the .hopper plugin under pl.DataFrame\n\n# Create an initial DataFrame\ndf = pl.DataFrame({\n \"user_id\": [1, 2, 3, 0],\n \"name\": [\"Alice\", \"Bob\", \"Charlie\", \"NullUser\"]\n})\n\n# Add expressions to the hopper:\n# - This one is valid right away: pl.col(\"user_id\") != 0\n# - Another needs a future 'age' column\ndf.hopper.add_filters(pl.col(\"user_id\") != 0)\ndf.hopper.add_filters(pl.col(\"age\") > 18) # 'age' doesn't exist yet\n\n# Apply what we can; the first expression is immediately valid:\ndf = df.hopper.apply_ready_filters()\nprint(df)\n# Rows with user_id=0 are dropped.\n\n# Now let's do a transformation that adds an 'age' column.\n# By calling df.hopper.with_columns(...), the plugin\n# automatically copies the hopper metadata to the new DataFrame.\ndf2 = df.hopper.with_columns(\n pl.Series(\"age\", [25, 15, 30]) # new column\n)\n\n# Now the second expression can be applied:\ndf2 = df2.hopper.apply_ready_filters()\nprint(df2)\n# Only rows with age > 18 remain. That expression is then removed from the hopper.\n```\n\n### How It Works\n\nInternally, **polars-expr-hopper** attaches a small \u201cmanager\u201d object (a plugin namespace) to each `DataFrame`. This manager leverages [polars-config-meta](https://pypi.org/project/polars-config-meta/) to store data in `df.config_meta.get_metadata()`, keyed by the `id(df)`.\n\n1. 
**List of In-Memory Expressions**:\n - Maintains a `hopper_filters` list of Polars expressions (`pl.Expr`) in the DataFrame\u2019s metadata.\n - Avoids Python callables or lambdas so that **.meta.root_names()** can be used for schema checks and optional serialisation is possible.\n\n2. **Automatic Column Check** (`apply_ready_filters()`)\n - On `apply_ready_filters()`, each expression\u2019s required columns (via `.meta.root_names()`) are compared to the current DataFrame schema.\n - Expressions referencing missing columns remain pending.\n - Expressions referencing all present columns are applied via `df.filter(expr)`.\n - Successfully applied expressions are removed from the hopper.\n\n3. **Metadata Preservation**\n - Because we rely on **polars-config-meta**, transformations called through `df.hopper.select(...)`, `df.hopper.with_columns(...)`, etc. automatically copy the same `hopper_filters` list to the new DataFrame.\n - This ensures **pending** expressions remain valid throughout your pipeline until their columns finally appear.\n\n4. **No Monkey-Patching**\n - Polars\u2019 plugin system is used, so there is no monkey-patching of core Polars classes.\n - The plugin registers a `.hopper` namespace\u2014just like `df.config_meta`, but specialised for expression management.\n\nTogether, these features allow you to:\n\n- store a **set** of Polars expressions in one place\n- apply them **as soon as** their required columns exist\n- easily carry them forward through the pipeline\n\nAll without global orchestration or repeated expression checks.\n\nThis was motivated by wanting a way to make a flexible CLI tool and express filters for the results\nat different steps, without a proliferation of CLI flags. 
From there, the idea of a 'queue' which\nwas pulled from on demand, in FIFO order but on the condition that the schema must be amenable was born.\n\nThis idea **could be extended to `select` statements**, but initially filtering was the primary deliverable.\n\n### API Methods\n\n- `add_filters(*exprs: tuple[pl.Expr, ...])`\n Add a new predicate (lambda, function, Polars expression, etc.) to the hopper.\n\n- `apply_ready_filters() -> pl.DataFrame`\n Check each stored expression\u2019s root names. If the columns exist, `df.filter(expr)` is applied. Successfully applied expressions are removed.\n- `list_filters() -> List[pl.Expr]`\n Inspect the still-pending expressions in the hopper.\n- `serialise_filters(format=\"binary\"|\"json\") -> List[str|bytes]`\n Convert expressions to JSON strings or binary bytes.\n- `deserialise_filters(serialised_list, format=\"binary\"|\"json\")`\n Re-create in-memory `pl.Expr` objects from the serialised data, overwriting any existing expressions.\n\n## Contributing\n\nMaintained by [Louis Maddox](https://github.com/lmmx/polars-expr-hopper). Contributions welcome!\n\n1. **Issues & Discussions**: Please open a GitHub issue or discussion for bugs, feature requests, or questions.\n2. **Pull Requests**: PRs are welcome!\n - Install the dev extra (e.g. with [uv](https://docs.astral.sh/uv/)):\n `uv pip install -e .[dev]`\n - Run tests (when available) and include updates to docs or examples if relevant.\n - If reporting a bug, please include the version and any error messages/tracebacks.\n\n## License\n\nThis project is licensed under the [MIT License](https://opensource.org/licenses/MIT).\n",
"bugtrack_url": null,
"license": "MIT",
"summary": "A Polars plugin providing a 'hopper' of expressions for automatic, schema-aware application.",
"version": "0.6.5",
"project_urls": {
"Documentation": "https://polars-expr-hopper.vercel.app/",
"Homepage": "https://github.com/lmmx/polars-expr-hopper",
"Repository": "https://github.com/lmmx/polars-expr-hopper.git"
},
"split_keywords": [
"polars",
" plugin",
" filter",
" expr",
" metadata"
],
"urls": [
{
"comment_text": "",
"digests": {
"blake2b_256": "ded35ffab6436cf1bf940ccf378d4610224563c1a35b412c7570e689f424b270",
"md5": "92a592dc82d52a15f16fea183fe87586",
"sha256": "57d71e189fa4f177a14d1a3d0fe685a5cefe545f1beed440a795fed6293b0af6"
},
"downloads": -1,
"filename": "polars_expr_hopper-0.6.5-py3-none-any.whl",
"has_sig": false,
"md5_digest": "92a592dc82d52a15f16fea183fe87586",
"packagetype": "bdist_wheel",
"python_version": "py3",
"requires_python": ">=3.9",
"size": 10478,
"upload_time": "2025-02-13T09:53:04",
"upload_time_iso_8601": "2025-02-13T09:53:04.726699Z",
"url": "https://files.pythonhosted.org/packages/de/d3/5ffab6436cf1bf940ccf378d4610224563c1a35b412c7570e689f424b270/polars_expr_hopper-0.6.5-py3-none-any.whl",
"yanked": false,
"yanked_reason": null
},
{
"comment_text": "",
"digests": {
"blake2b_256": "5327f479f2b6a79c0ea0fbc82a3faed2f83634d2e2b141edc7a04d77a0dfe5d1",
"md5": "bdb185e7a6021b54bb9023dabf92454e",
"sha256": "dca7335750901b85209145ab870371143e356261105392ec1a3db54e18216134"
},
"downloads": -1,
"filename": "polars_expr_hopper-0.6.5.tar.gz",
"has_sig": false,
"md5_digest": "bdb185e7a6021b54bb9023dabf92454e",
"packagetype": "sdist",
"python_version": "source",
"requires_python": ">=3.9",
"size": 21306,
"upload_time": "2025-02-13T09:53:06",
"upload_time_iso_8601": "2025-02-13T09:53:06.662917Z",
"url": "https://files.pythonhosted.org/packages/53/27/f479f2b6a79c0ea0fbc82a3faed2f83634d2e2b141edc7a04d77a0dfe5d1/polars_expr_hopper-0.6.5.tar.gz",
"yanked": false,
"yanked_reason": null
}
],
"upload_time": "2025-02-13 09:53:06",
"github": true,
"gitlab": false,
"bitbucket": false,
"codeberg": false,
"github_user": "lmmx",
"github_project": "polars-expr-hopper",
"travis_ci": false,
"coveralls": false,
"github_actions": false,
"lcname": "polars-expr-hopper"
}