SQLamarr


 * Name: SQLamarr
 * Version: 0.0rc6
 * Summary: The stand-alone ultra-fast simulation option for LHCb
 * Upload time: 2024-02-23 17:34:30
 * License: GPL-3
 * Keywords: lhcb, fast-simulation, simulation, hep, physics

[![GitHub address](https://badgen.net/badge/icon/repository?icon=github&label)](https://github.com/lamarrsim/SQLamarr)
[![Doxygen on Pages](https://github.com/LamarrSim/SQLamarr/actions/workflows/main.yml/badge.svg)](https://lamarrsim.github.io/SQLamarr)

![Lamarr logo](https://avatars.githubusercontent.com/u/125392434?s=200&v=4)

# SQLamarr
*The stand-alone ultra-fast simulation option for the LHCb experiment.*

The detailed simulation of the hadron collisions at the LHC, and of the 
interaction of the generated particles with the detector material,
dominates the cost of the computing infrastructure pledged to the 
LHCb Collaboration.

Among the various options explored towards a faster simulation, 
there is Lamarr, a framework defining a pipeline of parametrizations
transforming generator-level quantities to reconstructed, analysis-level 
features. 
Most of the parametrizations are defined using *machine-learning*, and 
in particular Deep Neural Networks and Gradient Boosted Decision Trees,
with a training procedure defined in independent packages (*e.g.* 
[landerlini/lb-trksim-train](https://github.com/landerlini/lb-trksim-train)
and [mbarbetti/lb-pidsim-train](https://github.com/mbarbetti/lb-pidsim-train)).

To be integrated in the LHCb software stack, models must be queried 
from a C++ application running in the Gaudi framework, which includes a 
dedicated multithreading scheduler that was found to conflict with 
the schedulers of the TensorFlow and ONNX runtimes.
In addition, since the models are relatively simple and fast to 
evaluate, the overhead of context switching from Gaudi to a dedicated 
runtime was observed to be unacceptably large.
Hence, models are converted into compatible C code using the 
[landerlini/scikinC](https://github.com/landerlini/scikinC)
package and distributed through the CernVM FileSystem as releases of 
the [LamarrData package](https://gitlab.cern.ch/lhcb-datapkg/LamarrData).

While crucial to the applications within LHCb, the integration with
Gaudi and [Gauss](https://gitlab.cern.ch/lhcb/Gauss) makes the adoption 
of Lamarr unappealing for researchers outside of the LHCb community 
who approach the LHCb simulation to evaluate 
the experiment's sensitivity to new physics phenomena or to study the 
recently released LHCb Open Data.
The [landerlini/SQLamarr](https://github.com/landerlini/SQLamarr)
package aims at decoupling Lamarr from Gaudi, providing a stand-alone 
application with minimal dependencies that can be easily set up and 
run on any Linux machine.
The parametrizations are shared between the Gauss-embedded implementation
[LbLamarr](https://gitlab.cern.ch/lhcb/Gauss/-/tree/master/Sim/LbLamarr) 
and `SQLamarr`.
In the future, the exact same package might be integrated within Gaudi 
to reduce the maintenance effort.

To replace the ROOT-based TransientEventStore concept defined in Gaudi,
`SQLamarr` adopts the SQLite3 package, enabling vectorized processing 
of batches of events for better performance.
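
To illustrate what processing a whole batch of events at once can look like, the sketch below uses Python's built-in `sqlite3` module on an in-memory database. The table `particles` and its columns are hypothetical and do not reflect the actual SQLamarr schema; the point is only that a single SQL statement acts on every event in the batch, with no explicit per-event loop.

```python
import sqlite3

# Hypothetical schema: one row per generated particle, tagged by event.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE particles (event_id INTEGER, px REAL, py REAL, pz REAL)")
conn.executemany(
    "INSERT INTO particles VALUES (?, ?, ?, ?)",
    [
        (0, 1.0, 0.0, 10.0),
        (0, 0.0, 2.0, 20.0),
        (1, 3.0, 4.0, 30.0),
    ],
)

# A single query summarises every event in the batch:
# multiplicity, summed pz and largest squared transverse momentum.
per_event = conn.execute(
    """
    SELECT event_id,
           COUNT(*)               AS n_particles,
           SUM(pz)                AS sum_pz,
           MAX(px * px + py * py) AS max_pt2
    FROM particles
    GROUP BY event_id
    ORDER BY event_id
    """
).fetchall()

for row in per_event:
    print(row)
```

The same pattern extends to joins between tables, which is where a relational store tends to be more convenient than looping over per-event containers.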

To avoid dependencies on ROOT, persistency is also handled using 
SQLite3, writing the reconstructed (or intermediate) quantities in the 
form of SQLite3 databases. 
Note that converting an SQLite3 table to a ROOT nTuple requires no more 
than 3 lines of Python:

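```python
import sqlite3, uproot, pandas
# read_sql_query accepts a plain sqlite3 connection (read_sql_table requires SQLAlchemy)
with sqlite3.connect("SomeInput.db") as conn, uproot.recreate("SomeFile.root") as root_file:
  root_file["myTree"] = pandas.read_sql_query("SELECT * FROM myTable", conn)
```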
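
Before converting, it can help to check which tables an output database actually contains. The following sketch relies only on the standard-library `sqlite3` module; the helper name `list_tables` and the table `myTable` are illustrative, and an in-memory database stands in for a real output file.

```python
import sqlite3

def list_tables(conn):
    """Return the names of the tables stored in an SQLite3 database."""
    rows = conn.execute(
        "SELECT name FROM sqlite_master WHERE type = 'table' ORDER BY name"
    ).fetchall()
    return [name for (name,) in rows]

# An in-memory database stands in for a file such as "SomeInput.db".
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE myTable (x REAL, y REAL)")
conn.executemany("INSERT INTO myTable VALUES (?, ?)", [(1.0, 2.0), (3.0, 4.0)])

tables = list_tables(conn)
n_rows = conn.execute("SELECT COUNT(*) FROM myTable").fetchone()[0]
print(tables, n_rows)
```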

## Dependencies
 * [SQLite3](https://www.sqlite.org/index.html) with C/C++ headers
 * [HepMC3](http://hepmc.web.cern.ch/hepmc/) as a standard interface
  to event generators.

## Build from source
Make sure you have conda (or similar) installed; if not, 
get [miniconda3](https://docs.conda.io/en/latest/miniconda.html).
Create and activate a dedicated conda environment, say `sqlamarr`:
```bash
conda create -y -n sqlamarr -c conda-forge python=3.10 gxx gxx_linux-64 hepmc3 doxygen
conda activate sqlamarr
```

Create an out-of-source build directory:
```bash
mkdir build
cd build
```

Configure and build:
```bash
cmake .. 
cmake --build .
```


## How to use SQLamarr
The project is not mature enough to provide a good user experience.
For the time being, clone the repository and compile the package with CMake, 
then edit the file `src/main.cpp` to
define the desired pipeline, by using the building blocks provided by
the package.
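
As a purely conceptual illustration of a pipeline made of building blocks that exchange data through a shared SQLite3 database, consider the Python sketch below. It does not use the SQLamarr API: the function names (`make_scaling_block`, `make_selection_block`, `run_pipeline`) and the tables `gen` and `reco` are hypothetical, and the actual building blocks are those defined in the C++ library.

```python
import sqlite3

def make_scaling_block(factor):
    """Hypothetical block: fill `reco` with momenta from `gen` scaled by `factor`."""
    def block(conn):
        conn.execute("CREATE TABLE IF NOT EXISTS reco (event_id INTEGER, p REAL)")
        conn.execute("DELETE FROM reco")
        conn.execute("INSERT INTO reco SELECT event_id, p * ? FROM gen", (factor,))
    return block

def make_selection_block(p_min):
    """Hypothetical block: drop rows of `reco` with momentum below `p_min`."""
    def block(conn):
        conn.execute("DELETE FROM reco WHERE p < ?", (p_min,))
    return block

def run_pipeline(conn, blocks):
    """Execute the blocks in order; each reads and writes the shared database."""
    for block in blocks:
        block(conn)

# Toy generator-level input, standing in for what an event generator would provide.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE gen (event_id INTEGER, p REAL)")
conn.executemany("INSERT INTO gen VALUES (?, ?)", [(0, 1.0), (0, 5.0), (1, 10.0)])

run_pipeline(conn, [make_scaling_block(2.0), make_selection_block(5.0)])
selected = conn.execute("SELECT event_id, p FROM reco ORDER BY p").fetchall()
print(selected)
```

Keeping each block a self-contained callable that only touches the database makes it straightforward to reorder, add, or remove steps without changing the others.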

## BlockLib
The namespace `SQLamarr::BlockLib` groups functions defining specialized blocks
that make assumptions about the workflow in which these blocks will be deployed.
While useful for testing and for organizing the code defining pipelines, 
it is not supposed to be stable (it changes whenever the workflow
under test changes) and should not be used as part of other packages. 

Other packages, however, may take inspiration from the structure of `SQLamarr::BlockLib`
to design specialized blocks, resident in their codebase, in a more 
organized way than having everything pipelined in a single file.

To test the completeness of the feature set in the main part of the library,
`SQLamarr::BlockLib` is designed to include only functions that access public methods 
of the objects defined in the main part of the library.


## Copyright and Licence
(c) Copyright 2022 CERN for the benefit of the LHCb Collaboration. 
                                                                            
This software is distributed under the terms of the GNU General Public Licence version 3 (GPL Version 3), copied verbatim in the file "LICENCE".
                                                                            
In applying this licence, CERN does not waive the privileges and immunities granted to it by virtue of its status as an Intergovernmental Organization or submit itself to any jurisdiction.

We acknowledge the support of the ICSC Foundation to the development of SQLamarr.

![image](https://user-images.githubusercontent.com/44908794/227858127-47d2b66f-4f1b-4f34-b505-814748957123.png)


            

Raw data

            {
    "_id": null,
    "home_page": "",
    "name": "SQLamarr",
    "maintainer": "",
    "docs_url": null,
    "requires_python": "",
    "maintainer_email": "",
    "keywords": "lhcb,fast-simulation,simulation,hep,physics",
    "author": "",
    "author_email": "Lucio Anderlini <Lucio.Anderlini@fi.infn.it>, Matteo Barbetti <Matteo.Barbetti@fi.infn.it>",
    "download_url": "",
    "platform": null,
    "bugtrack_url": null,
    "license": "GPL-3",
    "summary": "The stand-alone ultra-fast simulation option for LHCb",
    "version": "0.0rc6",
    "project_urls": {
        "Homepage": "https://lamarrsim.github.io/SQLamarr/",
        "Source": "https://github.com/LamarrSim/SQLamarr"
    },
    "split_keywords": [
        "lhcb",
        "fast-simulation",
        "simulation",
        "hep",
        "physics"
    ],
    "urls": [
        {
            "comment_text": "",
            "digests": {
                "blake2b_256": "35c1a5ce463345aac2ddc8052c41b9b11c9430a0855bac8b08cbb08ff0143470",
                "md5": "a86072e3444089984ccf405f11d0a69d",
                "sha256": "8d2c435695448ec093098b4a78d782af512b35b58fc3a54551bcfddff5b01220"
            },
            "downloads": -1,
            "filename": "SQLamarr-0.0rc6-cp310-cp310-manylinux_2_17_x86_64.manylinux2014_x86_64.whl",
            "has_sig": false,
            "md5_digest": "a86072e3444089984ccf405f11d0a69d",
            "packagetype": "bdist_wheel",
            "python_version": "cp310",
            "requires_python": null,
            "size": 2131556,
            "upload_time": "2024-02-23T17:34:30",
            "upload_time_iso_8601": "2024-02-23T17:34:30.518818Z",
            "url": "https://files.pythonhosted.org/packages/35/c1/a5ce463345aac2ddc8052c41b9b11c9430a0855bac8b08cbb08ff0143470/SQLamarr-0.0rc6-cp310-cp310-manylinux_2_17_x86_64.manylinux2014_x86_64.whl",
            "yanked": false,
            "yanked_reason": null
        },
        {
            "comment_text": "",
            "digests": {
                "blake2b_256": "1db36f93d83df9c5e1e7d6fe19ba961d8e363924ec12e735aa5b358383f5a073",
                "md5": "fcd9d80be6d48766a6e362b9986215a2",
                "sha256": "6ae6e02eaca4f2ecee67d2a517acb31651f8253749b0b6c15718bf4c561e6ce5"
            },
            "downloads": -1,
            "filename": "SQLamarr-0.0rc6-cp311-cp311-manylinux_2_17_x86_64.manylinux2014_x86_64.whl",
            "has_sig": false,
            "md5_digest": "fcd9d80be6d48766a6e362b9986215a2",
            "packagetype": "bdist_wheel",
            "python_version": "cp311",
            "requires_python": null,
            "size": 2131555,
            "upload_time": "2024-02-23T17:34:32",
            "upload_time_iso_8601": "2024-02-23T17:34:32.703719Z",
            "url": "https://files.pythonhosted.org/packages/1d/b3/6f93d83df9c5e1e7d6fe19ba961d8e363924ec12e735aa5b358383f5a073/SQLamarr-0.0rc6-cp311-cp311-manylinux_2_17_x86_64.manylinux2014_x86_64.whl",
            "yanked": false,
            "yanked_reason": null
        },
        {
            "comment_text": "",
            "digests": {
                "blake2b_256": "7c15fb58d4fb8437dafbc29f6a5ef855c46a4935cbed2b11f7519821c551037a",
                "md5": "07226881c6570b293d19a066f1afc494",
                "sha256": "2dbcb4a0399b1ed590ba1e74fbe7e8cf75a6edf5612a8f5bb471af53be00102d"
            },
            "downloads": -1,
            "filename": "SQLamarr-0.0rc6-cp37-cp37m-manylinux_2_17_x86_64.manylinux2014_x86_64.whl",
            "has_sig": false,
            "md5_digest": "07226881c6570b293d19a066f1afc494",
            "packagetype": "bdist_wheel",
            "python_version": "cp37",
            "requires_python": null,
            "size": 2131557,
            "upload_time": "2024-02-23T17:34:34",
            "upload_time_iso_8601": "2024-02-23T17:34:34.379372Z",
            "url": "https://files.pythonhosted.org/packages/7c/15/fb58d4fb8437dafbc29f6a5ef855c46a4935cbed2b11f7519821c551037a/SQLamarr-0.0rc6-cp37-cp37m-manylinux_2_17_x86_64.manylinux2014_x86_64.whl",
            "yanked": false,
            "yanked_reason": null
        },
        {
            "comment_text": "",
            "digests": {
                "blake2b_256": "b805497950ee6d79bbb6ae27398c2bf2a5df0aa9bb5fa8d79b7b7251403aae90",
                "md5": "32173c499f7d1fcde8c762b8adcc00bd",
                "sha256": "269b4a6341fad0f094be9c204205f5dfc71284fa0c3ba987fc159dc19e34387a"
            },
            "downloads": -1,
            "filename": "SQLamarr-0.0rc6-cp38-cp38-manylinux_2_17_x86_64.manylinux2014_x86_64.whl",
            "has_sig": false,
            "md5_digest": "32173c499f7d1fcde8c762b8adcc00bd",
            "packagetype": "bdist_wheel",
            "python_version": "cp38",
            "requires_python": null,
            "size": 2131553,
            "upload_time": "2024-02-23T17:34:37",
            "upload_time_iso_8601": "2024-02-23T17:34:37.039881Z",
            "url": "https://files.pythonhosted.org/packages/b8/05/497950ee6d79bbb6ae27398c2bf2a5df0aa9bb5fa8d79b7b7251403aae90/SQLamarr-0.0rc6-cp38-cp38-manylinux_2_17_x86_64.manylinux2014_x86_64.whl",
            "yanked": false,
            "yanked_reason": null
        },
        {
            "comment_text": "",
            "digests": {
                "blake2b_256": "d24f829e98c7c46329f59bcbc7845442e964588f513dff6ac8929697f39b32b4",
                "md5": "e49eb27872c208bee8b8d5e052a42613",
                "sha256": "73bf08a6f8be9bc0a0eb8f33d2a4b40428eed04aa9687db8ca9a07183b5b1393"
            },
            "downloads": -1,
            "filename": "SQLamarr-0.0rc6-cp39-cp39-manylinux_2_17_x86_64.manylinux2014_x86_64.whl",
            "has_sig": false,
            "md5_digest": "e49eb27872c208bee8b8d5e052a42613",
            "packagetype": "bdist_wheel",
            "python_version": "cp39",
            "requires_python": null,
            "size": 2131553,
            "upload_time": "2024-02-23T17:34:38",
            "upload_time_iso_8601": "2024-02-23T17:34:38.754293Z",
            "url": "https://files.pythonhosted.org/packages/d2/4f/829e98c7c46329f59bcbc7845442e964588f513dff6ac8929697f39b32b4/SQLamarr-0.0rc6-cp39-cp39-manylinux_2_17_x86_64.manylinux2014_x86_64.whl",
            "yanked": false,
            "yanked_reason": null
        }
    ],
    "upload_time": "2024-02-23 17:34:30",
    "github": true,
    "gitlab": false,
    "bitbucket": false,
    "codeberg": false,
    "github_user": "LamarrSim",
    "github_project": "SQLamarr",
    "travis_ci": false,
    "coveralls": false,
    "github_actions": true,
    "lcname": "sqlamarr"
}
        