Name | steamroller |
Version | 3.0.3 |
home_page | None |
Summary | Framework for flexible, reproducible, and scalable empirical research. |
upload_time | 2024-11-28 00:35:29 |
maintainer | None |
docs_url | None |
author | None |
requires_python | >=3.9 |
license | None |
keywords | machine learning, parallel computing |
requirements | No requirements were recorded. |
Travis-CI | No Travis. |
coveralls test coverage | No coveralls. |
-------------------
SteamRoller |Logo|
-------------------
------------------------------------
Getting started with an example
------------------------------------
Steamroller builds on `SCons <https://scons.org/>`__, which has a very in-depth `man page <https://scons.org/doc/production/HTML/scons-man.html>`__ and `API reference <https://scons.org/doc/latest/HTML/scons-api/index.html>`__, as well as a probably more useful `user guide <https://scons.org/doc/production/HTML/scons-user/index.html>`__.
Assuming you're starting in an empty directory named after the experiment (e.g. ``~/my_experiment``) and have Python 3.9 or later on your path::

    $ python -m venv local
    $ source local/bin/activate
    (my_experiment) $ pip install pip -U
    (my_experiment) $ pip install steamroller

Steamroller can create a simple dummy project to help get started::

    (my_experiment) $ steamroller --new_project

As it runs, this prints information about what it is creating, and why. The only difference from a basic SCons project, as can be seen in the example SConstruct file, is that ``Environment`` is imported from steamroller rather than from SCons. Other than the ``--new_project`` option, the ``steamroller`` command behaves like the normal ``scons`` command, and the following runs the experiment one step at a time::

    (my_experiment) $ steamroller -Q

However, steamroller makes it easy to run on a grid with maximal parallelism instead, simply by setting a few variables. Their values depend on the details of your particular grid, and you would typically put them in ``custom.py`` so that the plain ``steamroller -Q`` command still works::

    (my_experiment) $ steamroller -Q STEAMROLLER_ENGINE=slurm CPU_QUEUE=parallel CPU_ACCOUNT=tlippin1 GPU_QUEUE=parallel GPU_ACCOUNT=tlippin1 GPU_COUNT=0

This submits the experiment to the grid and returns immediately. You can check the status of the experiment by running ``sacct -s R,PD,F``. Work is underway to make monitoring a first-class part of steamroller, but for the moment be careful not to resubmit while jobs from a previous invocation are still pending. See below for the broader set of steamroller variables that can be overridden.
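Because resubmitting while earlier jobs are still pending is risky, it can help to check for unfinished jobs before running ``steamroller`` again. The following is a minimal sketch rather than part of steamroller: the helper names ``active_jobs`` and ``query_sacct`` are hypothetical, and it assumes ``sacct`` is asked for header-free, pipe-separated ``JobID|State`` records.

```python
import subprocess

# Slurm states that mean a job has not finished yet.
ACTIVE_STATES = {"PENDING", "RUNNING"}


def active_jobs(sacct_output):
    """Return (job_id, state) pairs for jobs that are still pending or running.

    Expects text shaped like the output of `sacct -n -P -X -o JobID,State`,
    i.e. one `JobID|State` record per line with no header row.
    """
    jobs = []
    for line in sacct_output.splitlines():
        line = line.strip()
        if not line:
            continue
        job_id, _, state = line.partition("|")
        # A state such as "CANCELLED by 1234" carries extra text after the name.
        if state.split(" ")[0] in ACTIVE_STATES:
            jobs.append((job_id, state))
    return jobs


def query_sacct():
    """Ask Slurm for running and pending jobs; only works where sacct exists."""
    result = subprocess.run(
        ["sacct", "-n", "-P", "-X", "-s", "R,PD", "-o", "JobID,State"],
        capture_output=True,
        text=True,
        check=True,
    )
    return result.stdout


# Parse a hand-written sample instead of calling sacct, so this runs anywhere.
sample = "101|COMPLETED\n102|RUNNING\n103|PENDING\n104|FAILED\n"
print(active_jobs(sample))
```

On a cluster, ``active_jobs(query_sacct())`` returning an empty list would suggest that no earlier jobs are still waiting or running.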
------------
Scaling up
------------
As described above, steamroller can be used exactly like `SCons <https://scons.org/>`__ itself, which is very well documented, so one could simply run experiments serially and locally (e.g. on a laptop). Steamroller's power comes from the ability to flip a switch and take advantage of the massively parallel architecture of a high-performance compute cluster without changing the underlying code.
When following the pattern described above, this is accomplished by creating a ``custom.py`` file in the experiment directory and setting a few variables in it. The most important is ``STEAMROLLER_ENGINE``, which defaults to ``"local"`` but can alternatively be set to ``"slurm"``, ``"univa"``, or ``"sge"``, depending on which grid system is used. There are a few other special variables, all starting with ``STEAMROLLER``, though bear in mind that these can also be set when a particular build rule is called (see, above, how the ``PARAM`` and ``SPLIT`` variables are being set)::

    STEAMROLLER_ACCOUNT = "my_account"
    STEAMROLLER_QUEUE = "some_queue"
    STEAMROLLER_TIME = "06:00:00"
    STEAMROLLER_MEMORY = "64G"
    STEAMROLLER_GPU_COUNT = 1

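Put together, a complete ``custom.py`` for a Slurm cluster might look like the sketch below. It uses only the variable names mentioned in this section. The account and queue values are placeholders, and whether your cluster needs every one of these settings is an assumption to check against your own grid.

```python
# custom.py: read when steamroller is invoked from the experiment directory.

# Which engine runs the tasks: "local" (the default), "slurm", "univa", or "sge".
STEAMROLLER_ENGINE = "slurm"

# Placeholder account and queue names; replace with the ones your grid uses.
STEAMROLLER_ACCOUNT = "my_account"
STEAMROLLER_QUEUE = "some_queue"

# Per-task resource requests.
STEAMROLLER_TIME = "06:00:00"
STEAMROLLER_MEMORY = "64G"
STEAMROLLER_GPU_COUNT = 1
```

Changing ``STEAMROLLER_ENGINE`` back to ``"local"`` should then be enough to run the same experiment serially on a laptop.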
There are a few variables that steamroller uses internally and that you should only set if you *really* know what you're doing::

    STEAMROLLER_SUBMIT_STRING
    STEAMROLLER_NAME_PREFIX
    STEAMROLLER_NAME
    STEAMROLLER_LOG
    STEAMROLLER_WORKING_DIRECTORY
    STEAMROLLER_DEPENDENCIES
    STEAMROLLER_SHELL

If there is no grid (e.g. you're running on a laptop), steamroller should behave just like SCons and run each needed task as a simple command-line invocation, one at a time, in an order that respects the dependencies. If a grid is specified (e.g. ``STEAMROLLER_ENGINE = "slurm"``) and is in fact available, steamroller will instead *submit* each task in the appropriate order, propagating the *task IDs* so that the grid jobs respect the dependency structure.
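To make the idea of propagating task IDs concrete, here is a toy sketch of how a dependency graph can be turned into ordered Slurm submissions. It is not steamroller's actual implementation: the function ``plan_submissions``, the ``<task>.sh`` script names, and the fake job IDs starting at 1000 are all assumptions made for illustration. It only builds ``sbatch`` command lines (using Slurm's ``--dependency=afterok:`` option) and submits nothing.

```python
from graphlib import TopologicalSorter


def plan_submissions(dependencies):
    """Return sbatch-style command lines in an order that respects dependencies.

    `dependencies` maps each task name to the set of tasks it depends on.
    Job IDs are faked with a counter; a real submission would instead read
    each ID from the scheduler's reply before submitting dependent tasks.
    """
    job_ids = {}
    commands = []
    order = TopologicalSorter(dependencies).static_order()
    for number, task in enumerate(order, start=1000):
        parents = sorted(dependencies.get(task, ()))
        command = ["sbatch", f"--job-name={task}"]
        if parents:
            # Parents were submitted earlier, so their IDs are already known.
            ids = ":".join(str(job_ids[parent]) for parent in parents)
            command.append(f"--dependency=afterok:{ids}")
        command.append(f"{task}.sh")
        job_ids[task] = number
        commands.append(" ".join(command))
    return commands


tasks = {
    "preprocess": set(),
    "train": {"preprocess"},
    "evaluate": {"train", "preprocess"},
}
for line in plan_submissions(tasks):
    print(line)
```

Each command names the jobs it must wait for, so the scheduler, rather than the submitting process, enforces the ordering. That is what allows the submission step to return immediately.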
----
FAQ
----
.. |Logo| image:: logo.png
Raw data
{
"_id": null,
"home_page": null,
"name": "steamroller",
"maintainer": null,
"docs_url": null,
"requires_python": ">=3.9",
"maintainer_email": "Tom Lippincott <tom@cs.jhu.edu>",
"keywords": "machine learning, parallel computing, ",
"author": null,
"author_email": "Tom Lippincott <tom@cs.jhu.edu>",
"download_url": "https://files.pythonhosted.org/packages/68/c6/22971b35d39afb2da6618c0cff918075d84aa6a51b406ef1c804e035a116/steamroller-3.0.3.tar.gz",
"platform": null,
"bugtrack_url": null,
"license": null,
"summary": "Framework for flexible, reproducible, and scalable empirical research.",
"version": "3.0.3",
"project_urls": {
"issues": "https://github.com/comp-int-hum/steamroller/issues",
"repository": "https://github.com/comp-int-hum/steamroller"
},
"split_keywords": [
"machine learning",
" parallel computing",
" "
],
"urls": [
{
"comment_text": "",
"digests": {
"blake2b_256": "68c622971b35d39afb2da6618c0cff918075d84aa6a51b406ef1c804e035a116",
"md5": "b847050aa0e857d92fea72dee9841726",
"sha256": "9b6c1d484db2bdf7ced2a8b72711ca01c8da2e0a816808861bf5f2e9b4b15c2f"
},
"downloads": -1,
"filename": "steamroller-3.0.3.tar.gz",
"has_sig": false,
"md5_digest": "b847050aa0e857d92fea72dee9841726",
"packagetype": "sdist",
"python_version": "source",
"requires_python": ">=3.9",
"size": 28982,
"upload_time": "2024-11-28T00:35:29",
"upload_time_iso_8601": "2024-11-28T00:35:29.678564Z",
"url": "https://files.pythonhosted.org/packages/68/c6/22971b35d39afb2da6618c0cff918075d84aa6a51b406ef1c804e035a116/steamroller-3.0.3.tar.gz",
"yanked": false,
"yanked_reason": null
}
],
"upload_time": "2024-11-28 00:35:29",
"github": true,
"gitlab": false,
"bitbucket": false,
"codeberg": false,
"github_user": "comp-int-hum",
"github_project": "steamroller",
"travis_ci": false,
"coveralls": false,
"github_actions": false,
"lcname": "steamroller"
}