# Bamboo: A high-level HEP analysis library for ROOT::RDataFrame
[![Documentation Status](https://readthedocs.org/projects/bamboo-hep/badge/?version=latest)](https://bamboo-hep.readthedocs.io/en/latest/?badge=latest)
The [`ROOT::RDataFrame`](https://root.cern.ch/doc/master/classROOT_1_1RDataFrame.html)
class provides an efficient and flexible way to process per-event information
(stored in a `TTree`) and e.g. aggregate it into histograms.
With the typical pattern of storing object arrays as a structure of arrays
(variable-sized branches with a common prefix in the names and length),
the expressions that are typically needed for a complete analysis quickly become
cumbersome to write (with indices to match, repeated sub-expressions etc.).
As an example, imagine the expression needed to calculate the invariant mass
of the two leading muons from a CMS NanoAOD file (which stores 4-momenta as
pt, eta, phi and mass branches): one way is to construct LorentzVector objects,
sum them and evaluate the invariant mass of the sum.
Next imagine doing the same thing with the two highest-pt jets that have a b-tag
and are not within some cone of the two leptons you already selected in another
way (while keeping the code maintainable enough to allow for passing jet momenta
with a systematic variation applied).
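
To make the bookkeeping concrete, here is a minimal pure-Python sketch of the dimuon case written by hand, without Bamboo or ROOT. The branch names (`nMuon`, `Muon_pt`, `Muon_eta`, `Muon_phi`, `Muon_mass`) follow the usual NanoAOD naming; representing an event as a plain `dict` and the helper names are assumptions made for this illustration only.

```python
import math


def four_momentum(pt, eta, phi, mass):
    """Convert (pt, eta, phi, mass) to Cartesian components (E, px, py, pz)."""
    px = pt * math.cos(phi)
    py = pt * math.sin(phi)
    pz = pt * math.sinh(eta)
    energy = math.sqrt(px * px + py * py + pz * pz + mass * mass)
    return energy, px, py, pz


def invariant_mass(p4_a, p4_b):
    """Invariant mass of the sum of two (E, px, py, pz) four-momenta."""
    e, px, py, pz = (a + b for a, b in zip(p4_a, p4_b))
    m2 = e * e - px * px - py * py - pz * pz
    # Guard against tiny negative values caused by floating-point rounding.
    return math.sqrt(max(m2, 0.0))


def leading_dimuon_mass(event):
    """Invariant mass of the two highest-pt muons, or None if fewer than two.

    `event` is a hypothetical dict holding the structure-of-arrays columns.
    """
    n_muons = event["nMuon"]
    if n_muons < 2:
        return None
    # Every per-muon quantity must be read with the same index.
    by_pt = sorted(range(n_muons), key=lambda i: event["Muon_pt"][i], reverse=True)
    p4s = [
        four_momentum(
            event["Muon_pt"][i],
            event["Muon_eta"][i],
            event["Muon_phi"][i],
            event["Muon_mass"][i],
        )
        for i in by_pt[:2]
    ]
    return invariant_mass(p4s[0], p4s[1])


if __name__ == "__main__":
    # Two back-to-back muons in the transverse plane (made-up values).
    event = {
        "nMuon": 2,
        "Muon_pt": [45.0, 45.0],
        "Muon_eta": [0.0, 0.0],
        "Muon_phi": [0.0, math.pi],
        "Muon_mass": [0.105658, 0.105658],
    }
    print(leading_dimuon_mass(event))
```

Even in this simple case the index has to be repeated for every branch; adding a b-tag requirement, a cone veto and a systematic variation multiplies that bookkeeping.
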
Bamboo attempts to solve this problem by automatically constructing
lightweight Python wrappers based on the structure of the `TTree`,
which make it possible to construct such expressions with high-level code,
similar to the language that is commonly used to discuss and describe them.
Because Bamboo builds an object representation of each expression, a few
powerful operations are enough to compose complex expressions.
This also makes it possible to automate the construction of derived
expressions, e.g. for shape systematic variation histograms.
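
The idea of deriving expressions from an object representation can be illustrated with a toy expression tree. This is a sketch of the general technique only, not Bamboo's actual classes or API: the names `Item`, `BinaryOp` and `with_branches`, as well as the varied branch name `Jet_pt_jesUp`, are hypothetical.

```python
from dataclasses import dataclass
from typing import Dict, Union


@dataclass(frozen=True)
class Item:
    """One element of a variable-sized branch, e.g. Jet_pt[0]."""

    branch: str
    index: int

    def to_string(self) -> str:
        return f"{self.branch}[{self.index}]"

    def with_branches(self, renames: Dict[str, str]) -> "Item":
        # Swap the branch name if a replacement is given, keep the index.
        return Item(renames.get(self.branch, self.branch), self.index)


@dataclass(frozen=True)
class BinaryOp:
    """A binary operation on two sub-expressions."""

    symbol: str
    left: "Expr"
    right: "Expr"

    def to_string(self) -> str:
        return f"({self.left.to_string()} {self.symbol} {self.right.to_string()})"

    def with_branches(self, renames: Dict[str, str]) -> "BinaryOp":
        # Rebuild the same operation on top of the renamed sub-expressions.
        return BinaryOp(
            self.symbol,
            self.left.with_branches(renames),
            self.right.with_branches(renames),
        )


Expr = Union[Item, BinaryOp]

# Nominal expression: scalar sum of the two leading jet pt values.
nominal = BinaryOp("+", Item("Jet_pt", 0), Item("Jet_pt", 1))

# Derived expression: same structure, reading a (hypothetical) varied branch.
varied = nominal.with_branches({"Jet_pt": "Jet_pt_jesUp"})

if __name__ == "__main__":
    print(nominal.to_string())
    print(varied.to_string())
```

Because the expression is data rather than a hand-written string, the varied version is produced by walking the tree once, and the analyst only writes the nominal expression.
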
Building selections, plots etc. with such expressions is analysis-specific, but
the mechanics of loading data samples, processing them locally or on a batch
system, combining the outputs for different samples in an overview etc.
are very similar across a broad range of use cases.
Therefore a common implementation of these is provided, such that the analyst
only needs to provide a subclass with their selection and plot definitions,
and a configuration file with a list of samples and instructions on how to
display them.
## Documentation
The HTML documentation (with a longer introduction, installation instructions,
recipes for common tasks and an API reference of the classes and methods) is
available [here](https://bamboo-hep.readthedocs.io/).
## Development
Bamboo has been in development since early 2019, and is actively used by
several analyses.
With the experience from daily use and the addition of new features in the
underlying ROOT::RDataFrame package, ideas for improvements and further
development continue to pop up.
To start contributing, please have a look at the
[guidelines](https://gitlab.cern.ch/cp3-cms/bamboo/-/blob/master/CONTRIBUTING.md).
Raw data
{
"_id": null,
"home_page": "https://gitlab.cern.ch/cp3-cms/bamboo",
"name": "bamboo-hep",
"maintainer": "",
"docs_url": null,
"requires_python": ">=3.6",
"maintainer_email": "",
"keywords": "ROOT,RDataFrame",
"author": "Pieter David",
"author_email": "pieter.david@cern.ch",
"download_url": "https://files.pythonhosted.org/packages/e8/2a/a253630e20f79de6f54f1ea5846c50452ec3da9320d5935d9b5e66c0dfac/bamboo-hep-1.1.0.tar.gz",
"platform": null,
"bugtrack_url": null,
"license": "GPL-3.0-or-later",
"summary": "A high-level HEP analysis library for ROOT::RDataFrame",
"version": "1.1.0",
"split_keywords": [
"root",
"rdataframe"
],
"urls": [
{
"comment_text": "",
"digests": {
"blake2b_256": "e82aa253630e20f79de6f54f1ea5846c50452ec3da9320d5935d9b5e66c0dfac",
"md5": "c562061153eb34925ad582c1efadb4fb",
"sha256": "11f79c7f6fbbda9774c2a8212d028cb84fd8cd7ea644bf0c2c44efebf08c3531"
},
"downloads": -1,
"filename": "bamboo-hep-1.1.0.tar.gz",
"has_sig": false,
"md5_digest": "c562061153eb34925ad582c1efadb4fb",
"packagetype": "sdist",
"python_version": "source",
"requires_python": ">=3.6",
"size": 1803566,
"upload_time": "2023-04-21T08:49:27",
"upload_time_iso_8601": "2023-04-21T08:49:27.218305Z",
"url": "https://files.pythonhosted.org/packages/e8/2a/a253630e20f79de6f54f1ea5846c50452ec3da9320d5935d9b5e66c0dfac/bamboo-hep-1.1.0.tar.gz",
"yanked": false,
"yanked_reason": null
}
],
"upload_time": "2023-04-21 08:49:27",
"github": false,
"gitlab": false,
"bitbucket": false,
"lcname": "bamboo-hep"
}