rms-cloud-tasks


- **Name:** rms-cloud-tasks
- **Version:** 0.0.2
- **Summary:** A scalable, multi-cloud task processing system
- **Upload time:** 2025-07-08 20:38:56
- **Requires Python:** >=3.10
- **License:** Apache-2.0
- **Keywords:** aws, gcp, azure
- **Requirements:** aiohttp, asyncio, click, json-stream, numpy, prettytable, pydantic, pyyaml, rms-filecache, shortuuid, tqdm, black, flake8, moto, mypy, types-aiofiles, types-PyYAML, types-tqdm, boto3, google-cloud-billing, google-cloud-compute, google-cloud-monitoring, google-cloud-pubsub, google-cloud-storage, pytest, pytest-asyncio, pytest-cov, pytest-mock, myst-parser, sphinx, sphinx-rtd-theme, sphinx-tabs, sphinxcontrib-napoleon

[![GitHub release; latest by date](https://img.shields.io/github/v/release/SETI/rms-cloud-tasks)](https://github.com/SETI/rms-cloud-tasks/releases)
[![GitHub Release Date](https://img.shields.io/github/release-date/SETI/rms-cloud-tasks)](https://github.com/SETI/rms-cloud-tasks/releases)
[![Test Status](https://img.shields.io/github/actions/workflow/status/SETI/rms-cloud-tasks/run-tests.yml?branch=main)](https://github.com/SETI/rms-cloud-tasks/actions)
[![Documentation Status](https://readthedocs.org/projects/rms-cloud-tasks/badge/?version=latest)](https://rms-cloud-tasks.readthedocs.io/en/latest/?badge=latest)
[![Code coverage](https://img.shields.io/codecov/c/github/SETI/rms-cloud-tasks/main?logo=codecov)](https://codecov.io/gh/SETI/rms-cloud-tasks)
<br />
[![PyPI - Version](https://img.shields.io/pypi/v/rms-cloud-tasks)](https://pypi.org/project/rms-cloud-tasks)
[![PyPI - Format](https://img.shields.io/pypi/format/rms-cloud-tasks)](https://pypi.org/project/rms-cloud-tasks)
[![PyPI - Downloads](https://img.shields.io/pypi/dm/rms-cloud-tasks)](https://pypi.org/project/rms-cloud-tasks)
[![PyPI - Python Version](https://img.shields.io/pypi/pyversions/rms-cloud-tasks)](https://pypi.org/project/rms-cloud-tasks)
<br />
[![GitHub commits since latest release](https://img.shields.io/github/commits-since/SETI/rms-cloud-tasks/latest)](https://github.com/SETI/rms-cloud-tasks/commits/main/)
[![GitHub commit activity](https://img.shields.io/github/commit-activity/m/SETI/rms-cloud-tasks)](https://github.com/SETI/rms-cloud-tasks/commits/main/)
[![GitHub last commit](https://img.shields.io/github/last-commit/SETI/rms-cloud-tasks)](https://github.com/SETI/rms-cloud-tasks/commits/main/)
<br />
[![Number of GitHub open issues](https://img.shields.io/github/issues-raw/SETI/rms-cloud-tasks)](https://github.com/SETI/rms-cloud-tasks/issues)
[![Number of GitHub closed issues](https://img.shields.io/github/issues-closed-raw/SETI/rms-cloud-tasks)](https://github.com/SETI/rms-cloud-tasks/issues)
[![Number of GitHub open pull requests](https://img.shields.io/github/issues-pr-raw/SETI/rms-cloud-tasks)](https://github.com/SETI/rms-cloud-tasks/pulls)
[![Number of GitHub closed pull requests](https://img.shields.io/github/issues-pr-closed-raw/SETI/rms-cloud-tasks)](https://github.com/SETI/rms-cloud-tasks/pulls)
<br />
![GitHub License](https://img.shields.io/github/license/SETI/rms-cloud-tasks)
[![Number of GitHub stars](https://img.shields.io/github/stars/SETI/rms-cloud-tasks)](https://github.com/SETI/rms-cloud-tasks/stargazers)
![GitHub forks](https://img.shields.io/github/forks/SETI/rms-cloud-tasks)

# Introduction

Cloud Tasks (contained in the `rms-cloud-tasks` package) is a framework for running
independent tasks on cloud providers with automatic compute instance and task queue
management. It is specifically designed for running the same code multiple times in a
batch environment to process a series of different inputs. For example, the program could
be an image processing program that takes the image filename as an argument, downloads the
image from the cloud, performs some manipulations, and writes the result to a cloud-based
location. It is very important that the tasks are completely independent; no communication
between them is supported. Also, the processing happens entirely in a batch mode: a
certain number of compute instances are created, they all process tasks in parallel, and
then the compute instances are destroyed.

`rms-cloud-tasks` is a product of the [PDS Ring-Moon Systems Node](https://pds-rings.seti.org).

# Features

Cloud Tasks is designed to be easy to use, with a simple command line interface and a
straightforward configuration file. It supports AWS and GCP compute instances and queues,
along with the ability to run jobs on a local workstation, all using a
provider-independent API. Although each cloud provider has implemented similar
functionality as part of its offering (e.g., GCP's Cloud Batch), Cloud Tasks unifies all
supported providers into a single, simple system that does not require learning the
often-complicated details of the official full-featured services.

Cloud Tasks consists of four primary components:

- **A Python module to make parallel execution simple**
  - Allows conversion of an existing Python program to a parallel task with only a few lines
    of code
  - Supports both cloud compute instance and local machine environments
  - Executes each task in its own process for complete isolation
  - Reads task information from a cloud-based task queue or directly from a local file
  - Monitors the state of spot instances to notify tasks of upcoming preemption
- **A command line interface to manage the task queue system, which allows**
  - Loading of tasks from a JSON or YAML file
  - Checking the status of a queue
  - Purging a queue of remaining tasks
  - Deleting a queue entirely
- **A command line interface to query the cloud about available resources, given certain
  constraints**
  - Types of compute instances available, including price (both on-demand and spot instances)
  - VM boot images available
  - Regions and zones
- **A command line interface to manage a pool of compute instances optimized for price,
  given certain constraints**
  - Automatically finds the optimal compute instance type given pricing and other constraints
  - Automatically determines the number of simultaneous instances to use
  - Creates new instances and runs a specified startup script to execute the task manager
  - Monitors instances for failure or preemption and creates new instances as needed to keep
    the compute pool full
  - Detects when all jobs are complete and terminates the instances
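
As a rough illustration of the per-task process isolation listed above, the following
Python sketch runs each task in a separate interpreter process, so a task that crashes
cannot take down the others. This is not the `cloud_tasks` implementation or API: the
worker source, the `run_isolated` helper, and the task fields (`task_id`, `name`, `crash`)
are all hypothetical names chosen for the example.

```python
import json
import subprocess
import sys

# Hypothetical worker program: reads one task as JSON on stdin and prints a
# JSON result on stdout. A task with "crash": true exits with a non-zero code
# to simulate a hard failure.
WORKER_SOURCE = """
import json, sys
task = json.load(sys.stdin)
if task.get("crash"):
    sys.exit(3)
print(json.dumps({"task_id": task["task_id"], "length": len(task["name"])}))
"""


def run_isolated(task: dict, timeout: float = 30.0) -> dict:
    """Run one task in its own Python process and report the outcome."""
    proc = subprocess.run(
        [sys.executable, "-c", WORKER_SOURCE],
        input=json.dumps(task),
        capture_output=True,
        text=True,
        timeout=timeout,
    )
    if proc.returncode != 0:
        # The child failed; the parent keeps running and records the failure.
        return {"task_id": task["task_id"], "ok": False, "exit_code": proc.returncode}
    return {"ok": True, **json.loads(proc.stdout)}


if __name__ == "__main__":
    tasks = [
        {"task_id": "a", "name": "img.png"},
        {"task_id": "b", "crash": True},
        {"task_id": "c", "name": "other.png"},
    ]
    for task in tasks:
        print(run_isolated(task))
```

Because each task runs in a fresh process, the failure of task `b` is reported as a
result and task `c` still runs normally.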

# Installation

`cloud_tasks` consists of a command line interface (called `cloud_tasks`) and a Python
module (also called `cloud_tasks`). They are available via the `rms-cloud-tasks` package
on PyPI and can be installed with:

```sh
pip install rms-cloud-tasks
```

Note that this will install `cloud_tasks` into your current system Python, or into your
currently activated virtual environment (venv), if any.

If you already have the `rms-cloud-tasks` package installed but wish to upgrade to a
more recent version, you can use:

```sh
pip install --upgrade rms-cloud-tasks
```

You may also install `cloud_tasks` using `pipx`, which will isolate the installation from
your system Python without requiring the creation of a virtual environment. To install
`pipx`, please see the [installation
instructions](https://pipx.pypa.io/stable/installation/). Once `pipx` is available, you
may install `cloud_tasks` with:

```sh
pipx install rms-cloud-tasks
```

If you already have the `rms-cloud-tasks` package installed with `pipx`, you may
upgrade to a more recent version with:

```sh
pipx upgrade rms-cloud-tasks
```

Using `pipx` is only useful if you want the command line interface and do not need to
import the Python module; in exchange, you do not have to worry about the Python version,
setting up a virtual environment, etc.

# Basic Examples

The `cloud_tasks` command line program supports many commands that control the task
queue, manage the compute instance pool, and retrieve general information about the cloud
in a provider-independent manner. A few examples are given below.

To get a list of available commands:

```bash
cloud_tasks --help
```

To get help on a particular command:

```bash
cloud_tasks load_queue --help
```

To list all ARM64-based compute instance types that have 2 to 4 vCPUs and at most 4 GB of
memory per vCPU:

```bash
cloud_tasks list_instance_types \
  --provider gcp --region us-central1 \
  --min-cpu 2 --max-cpu 4 --arch ARM64 --max-memory-per-cpu 4
```
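
The constraints in this command amount to a filter over an instance catalog. The sketch
below applies the same three conditions (architecture, vCPU range, memory per vCPU) to a
small in-memory list and sorts the matches by price. The `InstanceType` record, the
`filter_instance_types` helper, and every catalog entry (names, sizes, prices) are
invented for illustration; they do not reflect real GCP offerings or how `cloud_tasks`
is implemented.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class InstanceType:
    """Hypothetical description of one compute instance type."""
    name: str
    arch: str
    vcpus: int
    memory_gb: float
    price_per_hour: float


def filter_instance_types(types, *, arch, min_cpu, max_cpu, max_memory_per_cpu):
    """Return the instance types that satisfy all constraints, cheapest first."""
    matches = [
        t for t in types
        if t.arch == arch
        and min_cpu <= t.vcpus <= max_cpu
        and t.memory_gb / t.vcpus <= max_memory_per_cpu
    ]
    return sorted(matches, key=lambda t: t.price_per_hour)


# Made-up catalog; names and prices are placeholders, not real offerings.
CATALOG = [
    InstanceType("arm-small", "ARM64", 2, 8, 0.08),
    InstanceType("arm-medium", "ARM64", 4, 16, 0.15),
    InstanceType("arm-highmem", "ARM64", 4, 32, 0.20),   # 8 GB per vCPU: too much
    InstanceType("arm-large", "ARM64", 8, 32, 0.30),     # 8 vCPUs: too many
    InstanceType("x86-small", "X86_64", 2, 8, 0.07),     # wrong architecture
]

if __name__ == "__main__":
    selected = filter_instance_types(
        CATALOG, arch="ARM64", min_cpu=2, max_cpu=4, max_memory_per_cpu=4
    )
    for t in selected:
        print(t.name, t.vcpus, t.memory_gb, t.price_per_hour)
```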

To load a JSON file containing task descriptions into the task queue:

```bash
cloud_tasks load_queue \
  --provider gcp --region us-central1 --project-id my-project \
  --job-id my-job --task-file mytasks.json
```
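
A task file such as `mytasks.json` can be generated with a short script. The sketch below
assumes, purely for illustration, that each task is a JSON object with a `task_id` string
and a `data` dictionary; this schema is an assumption, so check the project documentation
for the format `load_queue` actually expects. The helper names and image filenames are
hypothetical.

```python
import json
from pathlib import Path


def build_tasks(image_names):
    """Create one task description per input image (assumed schema)."""
    return [
        {"task_id": f"task-{index:04d}", "data": {"image": name}}
        for index, name in enumerate(image_names, start=1)
    ]


def write_task_file(path, tasks):
    """Write the task list as JSON and return the number of tasks written."""
    Path(path).write_text(json.dumps(tasks, indent=2) + "\n", encoding="utf-8")
    return len(tasks)


if __name__ == "__main__":
    # Print the task list; call write_task_file("mytasks.json", tasks) to save it.
    tasks = build_tasks(["image_001.img", "image_002.img", "image_003.img"])
    print(json.dumps(tasks, indent=2))
```

Each task carries only its own input, which matches the requirement that tasks be
completely independent of one another.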

To start automatic creation and management of a compute instance pool:

```bash
cloud_tasks manage_pool --provider gcp --config myconfig.yaml
```
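
Conceptually, managing a pool means repeatedly comparing the number of running instances
with the remaining work. The following sketch shows one possible decision rule as a pure
function. It is an assumption made for illustration, not the logic used by `manage_pool`;
`plan_pool_action` and its parameters are hypothetical.

```python
import math


def plan_pool_action(
    tasks_remaining: int,
    running: int,
    max_instances: int,
    tasks_per_instance: int = 1,
) -> tuple[str, int]:
    """Decide what to do with the pool for one monitoring cycle.

    Returns an (action, count) pair where action is "start", "hold",
    or "terminate".
    """
    if tasks_remaining <= 0:
        # No work left: shut down everything that is still running.
        return ("terminate", running)
    # Never request more instances than the remaining work can keep busy.
    wanted = min(max_instances, math.ceil(tasks_remaining / tasks_per_instance))
    if running < wanted:
        # Replace failed or preempted instances to refill the pool.
        return ("start", wanted - running)
    return ("hold", 0)


if __name__ == "__main__":
    print(plan_pool_action(tasks_remaining=100, running=3, max_instances=10))
    print(plan_pool_action(tasks_remaining=2, running=5, max_instances=10))
    print(plan_pool_action(tasks_remaining=0, running=4, max_instances=10))
```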

# Contributing

Information on contributing to this package can be found in the
[Contributing Guide](https://github.com/SETI/rms-cloud-tasks/blob/main/CONTRIBUTING.md).

# Links

- [Documentation](https://rms-cloud-tasks.readthedocs.io)
- [Repository](https://github.com/SETI/rms-cloud-tasks)
- [Issue tracker](https://github.com/SETI/rms-cloud-tasks/issues)
- [PyPI](https://pypi.org/project/rms-cloud-tasks)

# Licensing

This code is licensed under the [Apache License v2.0](https://github.com/SETI/rms-cloud-tasks/blob/main/LICENSE).

            

Raw data

            {
    "_id": null,
    "home_page": null,
    "name": "rms-cloud-tasks",
    "maintainer": null,
    "docs_url": null,
    "requires_python": ">=3.10",
    "maintainer_email": "\"Robert S. French\" <rfrench@seti.org>",
    "keywords": "AWS, GCP, Azure",
    "author": null,
    "author_email": null,
    "download_url": "https://files.pythonhosted.org/packages/75/8f/e18f8bcb95fa545296f2a4cb21b406c70a8f43a2a5e955dfc74eba8437c5/rms_cloud_tasks-0.0.2.tar.gz",
    "platform": null,
    "bugtrack_url": null,
    "license": "Apache-2.0",
    "summary": "A scalable, multi-cloud task processing system",
    "version": "0.0.2",
    "project_urls": {
        "Homepage": "https://github.com/SETI/rms-cloud-tasks",
        "Issues": "https://github.com/SETI/rms-cloud-tasks/issues",
        "Repository": "https://github.com/SETI/rms-cloud-tasks",
        "Source": "https://github.com/SETI/rms-cloud-tasks"
    },
    "split_keywords": [
        "aws",
        " gcp",
        " azure"
    ],
    "urls": [
        {
            "comment_text": null,
            "digests": {
                "blake2b_256": "32e5b321b5bfb1aebf3d969c8a7f4e4f335589f903c55fe9a744cf758b07a475",
                "md5": "b04e1cbfd172e475eae9fe4c8255a1d9",
                "sha256": "ef7f35eea5ffca070f9cc97814c0455581bf106fe65eb9a67e5c44ded0e16b60"
            },
            "downloads": -1,
            "filename": "rms_cloud_tasks-0.0.2-py3-none-any.whl",
            "has_sig": false,
            "md5_digest": "b04e1cbfd172e475eae9fe4c8255a1d9",
            "packagetype": "bdist_wheel",
            "python_version": "py3",
            "requires_python": ">=3.10",
            "size": 110470,
            "upload_time": "2025-07-08T20:38:55",
            "upload_time_iso_8601": "2025-07-08T20:38:55.028617Z",
            "url": "https://files.pythonhosted.org/packages/32/e5/b321b5bfb1aebf3d969c8a7f4e4f335589f903c55fe9a744cf758b07a475/rms_cloud_tasks-0.0.2-py3-none-any.whl",
            "yanked": false,
            "yanked_reason": null
        },
        {
            "comment_text": null,
            "digests": {
                "blake2b_256": "758fe18f8bcb95fa545296f2a4cb21b406c70a8f43a2a5e955dfc74eba8437c5",
                "md5": "29af47f9c3841a5e01045bbf31c309ba",
                "sha256": "3e56a2e83a04235b8bc0b81d83d84825126327736a2a9e67f035f81c4b637358"
            },
            "downloads": -1,
            "filename": "rms_cloud_tasks-0.0.2.tar.gz",
            "has_sig": false,
            "md5_digest": "29af47f9c3841a5e01045bbf31c309ba",
            "packagetype": "sdist",
            "python_version": "source",
            "requires_python": ">=3.10",
            "size": 317657,
            "upload_time": "2025-07-08T20:38:56",
            "upload_time_iso_8601": "2025-07-08T20:38:56.332908Z",
            "url": "https://files.pythonhosted.org/packages/75/8f/e18f8bcb95fa545296f2a4cb21b406c70a8f43a2a5e955dfc74eba8437c5/rms_cloud_tasks-0.0.2.tar.gz",
            "yanked": false,
            "yanked_reason": null
        }
    ],
    "upload_time": "2025-07-08 20:38:56",
    "github": true,
    "gitlab": false,
    "bitbucket": false,
    "codeberg": false,
    "github_user": "SETI",
    "github_project": "rms-cloud-tasks",
    "travis_ci": false,
    "coveralls": true,
    "github_actions": true,
    "requirements": [
        {
            "name": "aiohttp",
            "specs": []
        },
        {
            "name": "asyncio",
            "specs": []
        },
        {
            "name": "click",
            "specs": []
        },
        {
            "name": "json-stream",
            "specs": []
        },
        {
            "name": "numpy",
            "specs": []
        },
        {
            "name": "prettytable",
            "specs": []
        },
        {
            "name": "pydantic",
            "specs": []
        },
        {
            "name": "pyyaml",
            "specs": []
        },
        {
            "name": "rms-filecache",
            "specs": []
        },
        {
            "name": "shortuuid",
            "specs": []
        },
        {
            "name": "tqdm",
            "specs": []
        },
        {
            "name": "black",
            "specs": []
        },
        {
            "name": "flake8",
            "specs": []
        },
        {
            "name": "moto",
            "specs": []
        },
        {
            "name": "mypy",
            "specs": []
        },
        {
            "name": "types-aiofiles",
            "specs": []
        },
        {
            "name": "types-PyYAML",
            "specs": []
        },
        {
            "name": "types-tqdm",
            "specs": []
        },
        {
            "name": "boto3",
            "specs": []
        },
        {
            "name": "google-cloud-billing",
            "specs": []
        },
        {
            "name": "google-cloud-compute",
            "specs": []
        },
        {
            "name": "google-cloud-monitoring",
            "specs": []
        },
        {
            "name": "google-cloud-pubsub",
            "specs": []
        },
        {
            "name": "google-cloud-storage",
            "specs": []
        },
        {
            "name": "pytest",
            "specs": []
        },
        {
            "name": "pytest-asyncio",
            "specs": []
        },
        {
            "name": "pytest-cov",
            "specs": []
        },
        {
            "name": "pytest-mock",
            "specs": []
        },
        {
            "name": "myst-parser",
            "specs": []
        },
        {
            "name": "sphinx",
            "specs": []
        },
        {
            "name": "sphinx-rtd-theme",
            "specs": []
        },
        {
            "name": "sphinx-tabs",
            "specs": []
        },
        {
            "name": "sphinxcontrib-napoleon",
            "specs": []
        }
    ],
    "lcname": "rms-cloud-tasks"
}
        