<p align="center">
<img src="https://github.com/Computational-Plant-Science/plantit/blob/master/plantit/front_end/src/assets/logo.png?raw=true" />
</p>

# PlantIT CLI

Container orchestration for reproducible phenotyping on laptops, servers, & HPC/HTC clusters

- Easy transfers to/from the CyVerse Data Store via the Terrain API
- Deploy Docker images as Singularity containers to clusters/servers
- Compatible with any substrate/scheduler supported by [Dask-Jobqueue](https://jobqueue.dask.org/en/latest/index.html)

This package must be installed and available on the `$PATH` on agents bound to PlantIT.

![CI](https://github.com/Computational-Plant-Science/plantit-cli/workflows/CI/badge.svg)
[![PyPI version](https://badge.fury.io/py/plantit-cli.svg)](https://badge.fury.io/py/plantit-cli)
[![Coverage Status](https://coveralls.io/repos/github/Computational-Plant-Science/plantit-cli/badge.svg?branch=master)](https://coveralls.io/github/Computational-Plant-Science/plantit-cli) 

**This project is in open beta and is not yet stable.**

<!-- START doctoc generated TOC please keep comment here to allow auto update -->
<!-- DON'T EDIT THIS SECTION, INSTEAD RE-RUN doctoc TO UPDATE -->
**Contents**

- [Requirements](#requirements)
- [Installation](#installation)
- [Usage](#usage)
  - [Commands](#commands)
    - [Pull](#pull)
    - [Run](#run)
      - [Compound commands](#compound-commands)
      - [Inputs](#inputs)
        - [File](#file)
        - [Files](#files)
        - [Directory](#directory)
      - [Bind mounts](#bind-mounts)
      - [GPU mode](#gpu-mode)
      - [HPC/HTC](#hpchtc)
        - [Virtual memory](#virtual-memory)
        - [Other resource requests](#other-resource-requests)
    - [Zip](#zip)
    - [Push](#push)
    - [Ping](#ping)
  - [Authenticating with Docker](#authenticating-with-docker)
  - [Authenticating with Terrain](#authenticating-with-terrain)
  - [Authenticating with PlantIT](#authenticating-with-plantit)
  - [Logging](#logging)
- [Development](#development)
  - [Tests](#tests)

<!-- END doctoc generated TOC please keep comment here to allow auto update -->

## Requirements


- Python 3.6.9+
- [Singularity](https://sylabs.io/docs/)

## Installation

To install the PlantIT CLI, use pip:

```shell
pip3 install plantit-cli
```

## Usage

Once the CLI is installed, it can be invoked with `plantit <command>`.
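
For instance, a quick sanity check (the `ping` command, described below, just prints `pong`):

```shell
plantit ping
```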

### Commands

The CLI supports the following commands:

- `ping`: Print `pong`.
- `pull`: Download files from the CyVerse Data Store.
- `run`: Run a workflow.
- `zip`: Zip files produced by a workflow.
- `push`: Upload files to the CyVerse Data Store.

#### Pull

To pull files from the `/iplant/home/shared/iplantcollaborative/testing_tools/cowsay/` directory in the CyVerse Data Store to the current working directory, use:

```shell
plantit terrain pull /iplant/home/shared/iplantcollaborative/testing_tools/cowsay/ --terrain_token <token>
```

Optional arguments are:

- `--local_path (-p)`: Local path to download files to.
- `--pattern`: File patterns to include (one or more).
- `--overwrite`: Whether to overwrite already-existing files.
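
For instance, a sketch (paths and token are placeholders) that pulls only JPG and PNG files into a local `inputs` directory, overwriting existing copies; `--pattern` is repeated on the assumption that "one or more" flags are passed by repetition:

```shell
plantit terrain pull /iplant/home/shared/iplantcollaborative/testing_tools/cowsay/ \
    --local_path inputs \
    --pattern jpg \
    --pattern png \
    --overwrite \
    --terrain_token <token>
```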

#### Run

PlantIT workflows are defined in YAML files. To run a workflow defined in `hello_world.yaml`, use `plantit run hello_world.yaml`. At minimum, the schema should include the following attributes:

```yaml
image: docker://alpine              # Docker image
workdir: /your/working/directory    # working directory
command: echo "Hello, world!"       # entrypoint
```

##### Compound commands

Note that your `command` may fail on some images if it contains `&&`. If you must run multiple consecutive commands, it's probably best to package them into a script.
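
For instance, a minimal sketch: the `run.sh` name is illustrative, and the script is assumed to sit in the (automatically mounted) host working directory:

```yaml
image: docker://alpine
workdir: /your/working/directory
command: sh run.sh   # run.sh contains the consecutive commands that would otherwise be chained with &&
```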

##### Inputs

Runs involving inputs fall into 3 categories:

- spawn a single container to process a single file
- spawn a single container to process a single directory
- spawn a container per file to process files in a directory

To pull a file or directory, add an `input` section.

###### File

To pull a file from the Data Store and spawn a single container to process it, use `kind: file` and `path: <file path>`:

```yaml
input:
  kind: file
  path: /iplant/home/username/directory/file
```

###### Files

To pull a directory from the Data Store and spawn a container for each file, use `kind: files` and `path: <directory path>`:

```yaml
input:
  kind: files
  path: /iplant/home/username/directory
  patterns:  # optional
  - jpg
  - png
```

###### Directory

To pull the contents of a directory from the Data Store and spawn a single container to process it, use `kind: directory` and `path: <directory path>`:

```yaml
input:
  kind: directory
  path: /iplant/home/username/directory
```
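
Putting the pieces together, a sketch of a complete configuration (paths illustrative; the command assumes inputs are staged into the working directory):

```yaml
image: docker://alpine
workdir: /your/working/directory
command: ls -lh          # list the working directory's contents
input:
  kind: directory
  path: /iplant/home/username/directory
```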

##### Bind mounts

If your code needs to write temporary files somewhere other than the (automatically mounted) host working directory, use the `bind_mounts` attribute:

```yaml
bind_mounts:
  - /path/in/your/container # host path defaults to the working directory
  - path/relative/to/host/working/directory:/another/path/in/your/container
```
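
For instance, a sketch (names illustrative) mounting a `scratch` subdirectory of the host working directory at `/scratch` in the container:

```yaml
image: docker://alpine
workdir: /your/working/directory
command: touch /scratch/temp.txt   # writes land in <workdir>/scratch on the host
bind_mounts:
  - scratch:/scratch
```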

##### GPU mode

To instruct Singularity to bind to NVIDIA GPU drivers on the host, add a `gpu: True` attribute to your configuration.
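
A minimal sketch: the CUDA image is illustrative, and the CLI presumably passes Singularity's `--nv` flag under the hood:

```yaml
image: docker://nvidia/cuda        # illustrative GPU-enabled image
workdir: /your/working/directory
command: nvidia-smi                # should list the host's GPUs if driver binding works
gpu: True
```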

##### HPC/HTC

On high-performance or high-throughput computing systems with a scheduler like Torque or SLURM, you can parallelize multi-file runs by adding a `jobqueue` section like the following:

```yaml
...
jobqueue:
  slurm:
    cores: 1
    processes: 10
    project: '<your allocation>'
    walltime: '01:00:00'
    queue: '<your queue>'
```

Substitute `pbs`, `moab`, `slurm`, or any other [Dask Jobqueue](https://jobqueue.dask.org/) cluster configuration section (the CLI uses Dask internally and passes your configuration directly through).
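
Putting it together, a sketch of a parallelized multi-file run combining the attributes above (paths, allocation, and queue are placeholders):

```yaml
image: docker://alpine
workdir: /your/working/directory
command: echo "processing on $(hostname)"
input:
  kind: files
  path: /iplant/home/username/directory
jobqueue:
  slurm:
    cores: 1
    processes: 10
    project: '<your allocation>'
    walltime: '01:00:00'
    queue: '<your queue>'
```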

###### Virtual memory

For clusters with virtual memory, you may need to use `header_skip` to alter Dask's resource request from the scheduler:

```yaml
...
jobqueue:
  slurm:
    ...
    header_skip:
      - '--mem' # for clusters with virtual memory
```

###### Other resource requests

You can add other cluster-specific resource requests, like GPU-enabled nodes, with an `extra` section:

```yaml
...
jobqueue:
  slurm:
    ...
    extra:
      - '--gres=gpu:1'
```

#### Zip

To zip all files in a directory, use `plantit zip <input directory>`.

To include only certain file patterns or names, pass one or more `--include_pattern` (`-ip`) or `--include_name` (`-in`) flags.

To exclude file patterns or names, pass one or more `--exclude_pattern` (`-ep`) or `--exclude_name` (`-en`) flags.

Included files are gathered first, then any excluded patterns and names are filtered out of that collection.
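
For example, a sketch (names illustrative) zipping images from a run directory while excluding one file by name:

```shell
plantit zip ./my_run --include_pattern jpg --include_pattern png --exclude_name scratch.jpg
```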

#### Push

To push files from the current working directory to the `/iplant/home/<username>/<directory>/` directory in the CyVerse Data Store, use `plantit terrain push /iplant/home/<username>/<directory>/ --terrain_token <token>`.

Options are:

- `--local_path (-p)`: Local path to upload files from.
- `--include_pattern (-ip)`: File patterns to include (one or more).
- `--include_name (-in)`: File names to include (one or more).
- `--exclude_pattern (-ep)`: File patterns to exclude (one or more).
- `--exclude_name (-en)`: File names to exclude (one or more).

If only `include_...` flags are provided, only the specified file patterns and names will be included. If only `exclude_...` flags are provided, all files except those matching the specified patterns and names will be included. If both `include_...` and `exclude_...` flags are provided, the `include_...` rules are applied first to produce a subset of files, which is then filtered by the `exclude_...` rules.
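
For example, a sketch (paths and token are placeholders) pushing only CSV results from a local `output` directory:

```shell
plantit terrain push /iplant/home/username/results \
    --local_path output \
    --include_pattern csv \
    --exclude_name temp.csv \
    --terrain_token <token>
```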

<!--#### Verifying input file checksums

To verify checksums associated with input files, add a `checksums` attribute to the `input` section, with a list of `name`/`md5` pairs. For instance:

```yaml
input:
  kind: directory
  from: /iplant/home/username/directory
  checksums: 
    - name: file1.txt
      md5: 94fc3699a0f99317534736f0ec982dea
    - name: file2.txt
      md5: 8540f05638ac10899e8bc31c13d5074a
```-->

#### Ping

The `plantit ping` command is used internally by the PlantIT web application to test whether the CLI is properly installed on user-defined agents.

### Authenticating with Docker

To authenticate with Docker and bypass Docker Hub rate limits, provide a `--docker_username` and `--docker_password`. For instance:

```shell
plantit run hello_world.yaml --docker_username <your username> --docker_password <your password>
```

This is only required for the `plantit run` command.

### Authenticating with Terrain

The `pull`, `push`, and `run` commands use the Terrain API to access the CyVerse Data Store. Runs with inputs and outputs must provide a `--cyverse_token` argument. For instance, to run `hello_world.yaml`:

```shell
plantit run hello_world.yaml --cyverse_token 'eyJhbGciOiJSUzI1N...'
```

A CyVerse access token can be obtained from the Terrain API with a `GET` request (providing username/password for basic auth):

```shell
GET https://de.cyverse.org/terrain/token/cas
```
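
For example, with `curl` (substitute your CyVerse credentials; the JSON response should contain the access token to pass to the CLI):

```shell
curl --user '<username>:<password>' https://de.cyverse.org/terrain/token/cas
```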

### Authenticating with PlantIT

When the `run` command is invoked, `--plantit_url` and `--plantit_token` options may be provided to authenticate with PlantIT's RESTful API and push task status updates and logs back to the web application. This is only intended for internal use &mdash; requests with an invalid token or for a nonexistent task will be rejected.

### Logging

By default, the CLI will print all output to `stdout`. If a `--plantit_url` and `--plantit_token` are provided, output will be POSTed back to the PlantIT web application (only output generated by the CLI itself &mdash; container output will just be printed to `stdout`).  This is suitable for most cluster deployment targets, whose schedulers should automatically capture job output. To configure the CLI itself to write container output to a file, add the following to your configuration file:

```yaml
log_file: relative/path/to/logfile
```

## Development

To set up a development environment, clone the repo with `git clone https://github.com/Computational-Plant-Science/plantit-cli.git`. Then run `scripts/bootstrap.sh` (this will pull/build images for a small `docker-compose` SLURM cluster test environment).

### Tests

To run unit tests:

```shell
docker compose -f docker-compose.test.yml run -w /opt/plantit-cli/runs slurmctld python3 -m pytest /opt/plantit-cli/plantit_cli/tests/unit -s
```

Note that integration tests invoke the Terrain API and may take some time to complete. They include a delay to allow writes to propagate from Terrain to the CyVerse Data Store (otherwise some tests pass or fail non-deterministically). To run integration tests:

```shell
docker compose -f docker-compose.test.yml run -w /opt/plantit-cli/runs slurmctld python3 -m pytest /opt/plantit-cli/plantit_cli/tests/integration -s
```



            
