<picture align="center">
<img alt="Digout logo" src="https://gitlab.cern.ch/particlepredatorinvasion/digout/raw/master/docs/source/_static/digout.svg">
</picture>
<p align="center">
<a href="https://gitlab.cern.ch/particlepredatorinvasion/digout/-/pipelines/">
<img alt="Pipeline Status" src="https://gitlab.cern.ch/particlepredatorinvasion/digout/badges/master/pipeline.svg" />
</a>
<a href="https://gitlab.cern.ch/particlepredatorinvasion/digout/-/blob/master/LICENSE">
<img alt="License" src="https://img.shields.io/pypi/l/digout" />
</a>
<a href="https://gitlab.cern.ch/particlepredatorinvasion/digout/-/releases">
<img alt="Latest Release" src="https://gitlab.cern.ch/particlepredatorinvasion/digout/-/badges/release.svg" />
</a>
<a href="https://pypi.org/project/digout/">
<img alt="PyPI - Version" src="https://img.shields.io/pypi/v/digout" />
</a>
<a href="https://pypi.org/project/digout/">
<img alt="Python Version" src="https://img.shields.io/pypi/pyversions/digout" />
</a>
<a href="https://digout.docs.cern.ch">
<img alt="Documentation Status" src="https://img.shields.io/badge/documentation-view-blue.svg" />
</a>
<a href="https://digout.docs.cern.ch/master/development/contributing.html">
<img alt="Contributing Guide" src="https://img.shields.io/badge/contributing-guide-blue.svg" />
</a>
</p>
`digout` is a Python library purpose-built to execute the multi-stage workflow
of converting raw LHCb `DIGI` files into analysis-ready `parquet` dataframes
of particles and hits.

To manage this process in a scalable and reproducible manner,
it implements a workflow framework organized around configurable **steps**
(e.g., `digi2root`, `root2df`).
The framework operates on a two-phase execution model:
a **stream phase** runs once to prepare the dataset from a bookkeeping path,
and a **chunk phase** processes each input file in parallel.
This parallel execution is managed by swappable **schedulers**
(such as `local` for running on the current machine
or `htcondor` for cluster submission).
The entire workflow is defined through YAML configuration files,
ensuring every run is fully reproducible.
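
To make this model concrete, below is a hypothetical configuration sketch.
All key names (`scheduler`, `bookkeeping_path`, `steps`) are assumptions chosen
for illustration, not digout's actual schema; the
[Full Documentation](https://digout.docs.cern.ch) describes the real
configuration format.

```yaml
# Illustrative sketch only: these keys are assumptions, not digout's real schema.
scheduler: htcondor            # swap for "local" to run on the current machine

# Stream phase: resolve the dataset once from a bookkeeping path (placeholder).
bookkeeping_path: /MC/.../DIGI

# Chunk phase: each step below runs once per input file, in parallel.
steps:
  - digi2root                  # DIGI -> ROOT
  - root2df                    # ROOT -> parquet dataframes
```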

## Resources

| Link | Description |
|:----------------------------------------------------------------------------------------------|:-----------------------------------------------------------------------------|
| 📖 **[Full Documentation](https://digout.docs.cern.ch)** | The complete guide to installation, configuration, and concepts. |
| 🚀 **[Quickstart Guide](https://digout.docs.cern.ch/master/getstarted/quickstart.html)** | The fastest way to get a working example running. |
| 💡 **[Contributing Guide](https://digout.docs.cern.ch/master/development/contributing.html)** | Learn how to set up a development environment and contribute to the project. |
| 🐛 **[Report a Bug](https://gitlab.cern.ch/particlepredatorinvasion/digout/-/issues)** | Found an issue? Let us know by creating a bug report. |
| 📜 **[Changelog](https://gitlab.cern.ch/particlepredatorinvasion/digout/-/releases)** | See the latest changes on the releases page. |

## Core Features

- **Automated Metadata Discovery**:
Automatically queries the LHCb bookkeeping system to retrieve necessary
metadata (`dddb_tag`, `conddb_tag`, etc.), eliminating manual lookup.
- **Scalable Parallel Processing**:
Built-in support for processing large datasets in parallel on a local machine
or on a distributed cluster like HTCondor.
- **Configuration-Driven and Reproducible**:
Define your entire workflow in YAML files.
`digout` saves the final, resolved configuration for every run,
ensuring any result can be reproduced.
- **Idempotent Execution**:
Automatically detects and skips steps that have already been completed.
- **Extensible Architecture**: Easily define new steps or schedulers (sketched below).
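
As a rough illustration of the last bullet, a user-defined step might look
something like the sketch below. The class shape, method name, and config
wiring are hypothetical, invented here only to convey the idea; digout's
actual extension API is covered in the documentation.

```python
# Purely hypothetical sketch: this is NOT digout's real step API.
from pathlib import Path

import pandas as pd


class MomentumFilterStep:
    """Imagined chunk-phase step: filter a particles dataframe and write it out."""

    name = "momentum_filter"  # hypothetical identifier to reference from the YAML config

    def run(self, input_path: Path, output_path: Path) -> None:
        particles = pd.read_parquet(input_path)
        # Keep particles above 5 GeV/c of momentum (column name is made up).
        particles = particles[particles["p"] > 5000.0]
        particles.to_parquet(output_path)
```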

## Main Workflows

- **DIGI to DataFrame Conversion**:
Produce analysis-ready `parquet` dataframes from LHCb `DIGI` files.
  The available output dataframes are detailed
  on the [DataFrames Page](https://digout.docs.cern.ch/master/concepts/dataframes.html),
  and a minimal example of loading them follows this list.
- **DIGI to MDF Conversion**:
Convert LHCb `DIGI` files into the `.mdf` format required as input
for the [Allen framework](https://gitlab.cern.ch/lhcb/Allen).
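
Because the output is plain `parquet`, any standard columnar tool can read it
once a run finishes. A minimal pandas example is shown below; the file path is
illustrative, and the actual column schemas are documented on the DataFrames
Page linked above.

```python
import pandas as pd

# Load one chunk of the particles dataframe (path is illustrative).
particles = pd.read_parquet("output/particles_000.parquet")

# From here, ordinary dataframe operations apply.
print(particles.columns.tolist())
print(f"{len(particles)} particles in this chunk")
```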