# Welcome to oups!
## What is oups?
*oups* stands for Ordered Unified Processing Stack — out-of-core processing for ordered data (batch + live).
*oups* is a Python toolkit for building end-to-end pipelines over ordered data with the same code in offline training and live streaming/batch contexts.
It centers on ``StatefulLoop`` (``loop.bind_function_state``, ``loop.iterate``, ``loop.buffer``), which binds and persists function/object state, orchestrates chunked iteration, and buffers DataFrames under a memory cap with flush-on-limit or last-iteration semantics.
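
As a point of reference, the snippet below hand-rolls that pattern in plain pandas: persistent per-function state, chunked iteration, and a memory-capped buffer flushed on limit or on the last chunk. It is not ``StatefulLoop``'s API; the column name, memory cap, and ``flush`` callback are invented for illustration.

```python
import pandas as pd

MEMORY_CAP_BYTES = 50_000_000  # illustrative cap, not an oups default

def process(chunk: pd.DataFrame, state: dict) -> pd.DataFrame:
    # 'state' survives across chunks, e.g. a running total that a live run
    # would need to resume from after a restart.
    state["running_total"] = state.get("running_total", 0.0) + chunk["value"].sum()
    out = chunk.copy()
    out["running_total"] = state["running_total"]
    return out

def run(chunks: list[pd.DataFrame], flush) -> None:
    state: dict = {}                    # persisted function state
    buffer: list[pd.DataFrame] = []
    buffered_bytes = 0
    for i, chunk in enumerate(chunks):
        result = process(chunk, state)
        buffer.append(result)
        buffered_bytes += int(result.memory_usage(deep=True).sum())
        is_last = i == len(chunks) - 1
        # Flush when the cap is exceeded, or unconditionally on the last chunk.
        if buffered_bytes >= MEMORY_CAP_BYTES or is_last:
            flush(pd.concat(buffer, ignore_index=True))
            buffer, buffered_bytes = [], 0
```

``StatefulLoop`` is described as taking over the bookkeeping done in ``run`` here (binding ``state``, driving the chunked iteration, and owning the buffer), so that ``process`` stays identical between a historical backtest and a live feed.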
Complementing the loop, ``stateful_ops`` provides vectorized, chunk-friendly primitives like ``AsofMerger`` for multi-DataFrame as-of joins (with optional windows of previous values) and ``SegmentedAggregator`` (planned) for streamed segmentation and aggregation.
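
``AsofMerger``'s exact interface is not spelled out here; for intuition, the sketch below does a pairwise as-of join with ``pandas.merge_asof``, the operation that the multi-DataFrame, chunk-friendly primitive is described as generalizing. The sample frames and column names are made up.

```python
import pandas as pd

trades = pd.DataFrame({
    "ts": pd.to_datetime(["2025-01-01 09:00:01", "2025-01-01 09:00:05"]),
    "qty": [10, 7],
})
quotes = pd.DataFrame({
    "ts": pd.to_datetime(["2025-01-01 09:00:00", "2025-01-01 09:00:04"]),
    "bid": [99.5, 99.7],
})

# For each trade, take the most recent quote at or before the trade timestamp.
merged = pd.merge_asof(trades, quotes, on="ts", direction="backward")
print(merged)
```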
The ``store`` package manages ordered Parquet datasets via schema-driven keys (``@toplevel``), supports incremental updates (``store[key].write(...)``) and duplicate handling, and offers synchronized iteration across datasets via ``store.iter_intersections(...)`` with optional warm-up (``n_prev``).
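
A rough sketch of how those pieces could fit together follows. Only ``@toplevel``, ``store[key].write(...)``, and ``store.iter_intersections(...)`` with ``n_prev`` are named above, so the ``Store`` entry point, the key fields, and every keyword argument in the sketch are assumptions for illustration, not confirmed signatures.

```python
import pandas as pd
from oups import Store, toplevel   # 'Store' name/import path is an assumption

@toplevel
class DatasetKey:                  # schema-driven key; fields are illustrative
    exchange: str
    symbol: str

store = Store("/tmp/oups-demo", DatasetKey)          # assumed constructor
key = DatasetKey("some_exchange", "BTCUSDT")

df = pd.DataFrame({
    "ts": pd.date_range("2025-01-01", periods=3, freq="1min"),
    "price": [42_000.0, 42_010.5, 41_990.0],
})
store[key].write(df, ordered_on="ts")                # 'ordered_on' kwarg assumed

# Synchronized iteration across one or more datasets, with an assumed
# warm-up of 2 previous rows per chunk.
for chunk in store.iter_intersections([key], n_prev=2):
    ...
```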
Together, these pieces enable out-of-core processing with resumability and deterministic buffering. The design favors explicit, minimal APIs and reproducible results, keeping offline feature generation aligned with online serving.
## Links
- 📖 **[Documentation](https://pierrot.codeberg.page/oups/)** - Guides and API reference
- 📋 **[Changelog](CHANGELOG.md)** - Release notes and version history
## Raw data

```json
{
"_id": null,
"home_page": null,
"name": "oups",
"maintainer": null,
"docs_url": null,
"requires_python": "<4.0,>=3.13",
"maintainer_email": null,
"keywords": "out-of-core, streaming, stateful, time-series, pandas, parquet, data-engineering",
"author": "pierrot",
"author_email": null,
"download_url": "https://files.pythonhosted.org/packages/fe/01/657d8e31d71c9009cb4ecb050dbe1419c373abfa13d68e12789976d67d12/oups-2025.9.5.tar.gz",
"platform": null,
"description": "# Welcome to oups!\n\n## What is oups?\n*oups* stands for Ordered Unified Processing Stack \u2014 out-of-core processing for ordered data (batch + live).\n\n*oups* is a Python toolkit for building end-to-end pipelines over ordered data with the same code in offline training and live streaming/batch contexts.\n\nIt centers on ``StatefulLoop`` (``loop.bind_function_state``, ``loop.iterate``, ``loop.buffer``), which binds and persists function/object state, orchestrates chunked iteration, and buffers DataFrames under a memory cap with flush-on-limit or last-iteration semantics.\nComplementing the loop, ``stateful_ops`` provides vectorized, chunk-friendly primitives like ``AsofMerger`` for multi-DataFrame as-of joins (with optional windows of previous values) and ``SegmentedAggregator`` (planned) for streamed segmentation and aggregation.\nThe ``store`` package manages ordered Parquet datasets via schema-driven keys (``@toplevel``), supports incremental updates (``store[key].write(...)``) and duplicate handling, and offers synchronized iteration across datasets via ``store.iter_intersections(...)`` with optional warm-up (``n_prev``).\n\nTogether these pieces enable out-of-core processing with resumability, and deterministic buffering. The design favors explicit, minimal APIs and reproducible results, aligning offline feature generation with online serving.\n\n## Links\n\n- \ud83d\udcd6 **[Documentation](https://pierrot.codeberg.page/oups/)** - Guides and API reference\n- \ud83d\udccb **[Changelog](CHANGELOG.md)** - Release notes and version history\n\n",
"bugtrack_url": null,
"license": "Apache-2.0",
"summary": "Out-of-core pipelines over ordered data: StatefulLoop, stateful ops, and ordered Parquet Store.",
"version": "2025.9.5",
"project_urls": {
"Changelog": "https://codeberg.org/pierrot/oups/src/branch/main/CHANGELOG.md",
"Documentation": "https://codeberg.org/pierrot/oups/src/branch/main/docs",
"Homepage": "https://codeberg.org/pierrot/oups",
"Issues": "https://codeberg.org/pierrot/oups/issues",
"Source": "https://codeberg.org/pierrot/oups"
},
"split_keywords": [
"out-of-core",
" streaming",
" stateful",
" time-series",
" pandas",
" parquet",
" data-engineering"
],
"urls": [
{
"comment_text": null,
"digests": {
"blake2b_256": "d0341b6357342f8349e87afd2be8c13e28ff9f0d44935e9a28fe741ffb7af8cd",
"md5": "21dd7d5bfcf847f00f5b5a91f93f5443",
"sha256": "f94565c88e0773f3c7b02925e165991e1b1c9479287d268eee8fe6759c8f086e"
},
"downloads": -1,
"filename": "oups-2025.9.5-py3-none-any.whl",
"has_sig": false,
"md5_digest": "21dd7d5bfcf847f00f5b5a91f93f5443",
"packagetype": "bdist_wheel",
"python_version": "py3",
"requires_python": "<4.0,>=3.13",
"size": 132304,
"upload_time": "2025-09-10T07:25:46",
"upload_time_iso_8601": "2025-09-10T07:25:46.098955Z",
"url": "https://files.pythonhosted.org/packages/d0/34/1b6357342f8349e87afd2be8c13e28ff9f0d44935e9a28fe741ffb7af8cd/oups-2025.9.5-py3-none-any.whl",
"yanked": false,
"yanked_reason": null
},
{
"comment_text": null,
"digests": {
"blake2b_256": "fe01657d8e31d71c9009cb4ecb050dbe1419c373abfa13d68e12789976d67d12",
"md5": "eae15a5051d5782e7774218152166869",
"sha256": "1490e0604f5b98f2e367524ff1a084bf23f704bd966fa0032a6cf9afd1a9310a"
},
"downloads": -1,
"filename": "oups-2025.9.5.tar.gz",
"has_sig": false,
"md5_digest": "eae15a5051d5782e7774218152166869",
"packagetype": "sdist",
"python_version": "source",
"requires_python": "<4.0,>=3.13",
"size": 109501,
"upload_time": "2025-09-10T07:25:47",
"upload_time_iso_8601": "2025-09-10T07:25:47.531127Z",
"url": "https://files.pythonhosted.org/packages/fe/01/657d8e31d71c9009cb4ecb050dbe1419c373abfa13d68e12789976d67d12/oups-2025.9.5.tar.gz",
"yanked": false,
"yanked_reason": null
}
],
"upload_time": "2025-09-10 07:25:47",
"github": false,
"gitlab": false,
"bitbucket": false,
"codeberg": true,
"codeberg_user": "pierrot",
"codeberg_project": "oups",
"lcname": "oups"
}
```