datacube-alchemist

Name: datacube-alchemist
Version: 0.6.7
Home page: https://github.com/opendatacube/datacube-alchemist
Summary: Batch process Open Data Cube datasets
Upload time: 2023-09-01 06:00:07
License: Apache License 2.0
Keywords: datacube-alchemist, opendatacube
# Datacube Alchemist - ODC Dataset to Dataset Converter


![Scan](https://github.com/opendatacube/datacube-alchemist/workflows/Scan/badge.svg)
![Test](https://github.com/opendatacube/datacube-alchemist/workflows/Test/badge.svg)
![Push](https://github.com/opendatacube/datacube-alchemist/workflows/Push/badge.svg)
[![codecov](https://codecov.io/gh/opendatacube/datacube-alchemist/branch/main/graph/badge.svg?token=8dsJGc99qY)](https://codecov.io/gh/opendatacube/datacube-alchemist)

## PURPOSE

Datacube Alchemist is a command line application for performing Dataset to Dataset transformations in the context
of an Open Data Cube system.

It uses a configuration file that specifies an input _Product_ or _Products_, a _Transformation_ to
perform, and the output parameters and destination (an illustrative sketch follows the feature list below).

Features:

* Writes output as Cloud Optimised GeoTIFFs
* Runs easily within a Docker container
* Parallelises work using AWS SQS queues and Kubernetes
* Writes output data to S3 or a local file system
* Generates `eo3` format dataset metadata, along with processing information
* Generates STAC 1.0.0.beta2 dataset metadata
* Supports configurable thumbnail generation
* Accepts any command line option as an environment variable
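
The exact configuration schema comes from the example files shipped with the repository
(e.g. `./examples/c3_config_wo.yaml`, used in the commands below). The sketch here is
illustrative only; every field name in it is an assumption, not the documented schema:

``` yaml
# Illustrative sketch only -- field names are assumptions, not the actual schema.
# Consult ./examples/c3_config_wo.yaml in the repository for a real configuration.
specification:
  products:                        # the input Product(s) to read
    - ga_ls_wo_3                   # product name is a placeholder
  transform: mypackage.MyTransform # dotted path to the Transformation (hypothetical)
output:
  location: s3://example-bucket/outputs/  # S3 or file system destination
  write_stac: true                 # STAC metadata generation (assumed flag)
```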

## INSTALLATION

You can build the Docker image locally with Docker or Docker Compose. The commands are
`docker build --tag opendatacube/datacube-alchemist .` or `docker-compose build`.

There's a Python setup file, so you can run `pip3 install .` in the root folder. You will
need to ensure that the Open Data Cube and all of its dependencies install cleanly, though.
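
Released versions are also published to PyPI, so you can install the package directly,
pinning the version shown on this page:

``` bash
pip3 install datacube-alchemist==0.6.7
```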

## USAGE

### Development environment

To run some example processes you can use the Docker Compose file to create a local workspace.
To start the workspace and run an example, do the following (a consolidated session is sketched after this list):

* Export the environment variables `ODC_ACCESS_KEY` and `ODC_SECRET_KEY` with valid AWS credentials
* Run `make up` or `docker-compose up` to start the postgres and datacube-alchemist Docker containers
* `make initdb` to initialise the ODC database (or see the Makefile for the specific command)
* `make metadata` will add the metadata that the Landsat example product needs
* `make product` will add the Landsat product definitions
* `make index` will index a range of Landsat scenes to test processing with
* `make wofs-one` or `make fc-one` will process a single Water Observations from Space or
Fractional Cover scene and write the results to the `./examples` folder in this repository
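
Put together, one end-to-end session using the targets above might look like this
(the credential values are placeholders):

``` bash
export ODC_ACCESS_KEY="<aws-access-key>"   # placeholder
export ODC_SECRET_KEY="<aws-secret-key>"   # placeholder
make up        # start the postgres and datacube-alchemist containers
make initdb    # initialise the ODC database
make metadata  # add the metadata the Landsat example product needs
make product   # add the Landsat product definitions
make index     # index a range of Landsat scenes to test with
make wofs-one  # process one Water Observations scene into ./examples
```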

## Commands

Note that the `--config-file` can be a local path or a URI.
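
For example, both of these forms are accepted (the HTTPS URL and the UUID are placeholders):

``` bash
# Local path
datacube-alchemist run-one --config-file ./examples/c3_config_wo.yaml --uuid <dataset-uuid>

# Remote URI (placeholder URL)
datacube-alchemist run-one --config-file https://example.com/c3_config_wo.yaml --uuid <dataset-uuid>
```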

### datacube-alchemist run-one

Note that `--dryrun` is optional; it runs a 1/10 scale load and does not
write output to the final destination.

``` bash
datacube-alchemist run-one \
  --config-file ./examples/c3_config_wo.yaml \
  --uuid 7b9553d4-3367-43fe-8e6f-b45999c5ada6 \
  --dryrun
```
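
The features list above notes that command line options can also be supplied as environment
variables. The variable name below is a guess at the convention, not a documented name; check
`datacube-alchemist run-one --help` for the authoritative spelling:

``` bash
# Hypothetical variable name -- the exact convention is an assumption
export ALCHEMIST_CONFIG_FILE=./examples/c3_config_wo.yaml
datacube-alchemist run-one --uuid 7b9553d4-3367-43fe-8e6f-b45999c5ada6 --dryrun
```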

### datacube-alchemist run-many

Note that the final argument is a datacube _expression_; see the
[Datacube Search documentation](https://datacube-core.readthedocs.io/en/latest/ops/tools.html?highlight=expressions#datacube-dataset-search).

``` bash
datacube-alchemist run-many \
  --config-file ./examples/c3_config_wo.yaml \
  --limit=2 \
  --dryrun \
  time in 2020-01
```
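
Expressions can combine several search terms. A sketch (the product name is a placeholder;
check the exact syntax against the Datacube Search documentation linked above):

``` bash
datacube-alchemist run-many \
  --config-file ./examples/c3_config_wo.yaml \
  --limit=2 \
  --dryrun \
  product=ga_ls_wo_3 time in 2020-01
```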

### datacube-alchemist run-from-queue

Notes on queues: to run jobs from an SQS queue, it is good practice to create a deadletter queue
alongside the main queue. Jobs (messages) are picked up off the main queue and deleted once they
complete successfully. If a job fails, its message is not deleted; it becomes visible on the
main queue again after a defined amount of time (the SQS visibility timeout). If this happens
more than a defined number of times, the message is moved to the deadletter queue. In this way,
you can track work completion.
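
A sketch of creating such a queue pair with the AWS CLI (the region and account ID in the ARN
are placeholders; `maxReceiveCount` is the "defined number of times" above):

``` bash
# Create the deadletter queue first
aws sqs create-queue --queue-name example-queue-name-deadletter

# Create the main queue with a redrive policy pointing at the deadletter queue.
# Look up the real deadletter ARN with `aws sqs get-queue-attributes`.
aws sqs create-queue \
  --queue-name example-queue-name \
  --attributes '{"RedrivePolicy":"{\"deadLetterTargetArn\":\"arn:aws:sqs:<region>:<account-id>:example-queue-name-deadletter\",\"maxReceiveCount\":\"3\"}"}'
```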

``` bash
datacube-alchemist run-from-queue \
  --config-file ./examples/c3_config_wo.yaml \
  --queue example-queue-name \
  --limit=1 \
  --queue-timeout=600 \
  --dryrun
```

### datacube-alchemist add-to-queue

The `--limit` is the total number of datasets to add to the queue, whereas `--product-limit` is
the maximum number of datasets per product, for the case where you have multiple input products.
For example, with two input products the command below queues at most 100 datasets from each,
so no more than 200 in total, even though `--limit=300` would allow more.

``` bash
datacube-alchemist add-to-queue \
  --config-file ./examples/c3_config_wo.yaml \
  --queue example-queue-name \
  --limit=300 \
  --product-limit=100
```

### datacube-alchemist redrive-to-queue

This takes items from a deadletter queue and pushes them onto a live queue. Be careful:
the command does not know which queue is which, so you need to pass them in the
right order!

``` bash
datacube-alchemist redrive-to-queue \
  --queue example-from-queue \
  --to-queue example-to-queue
```

## License

Apache License 2.0

## Copyright

© 2021, Open Data Cube Community

            
