galv-harvester


- Name: galv-harvester
- Version: 1.3.4
- Upload time: 2024-10-07 14:27:44
- Requires Python: >=3.9
- License: Copyright (c) 2020-2023, The Chancellor, Masters and Scholars of the University of Oxford, and the 'Galv' Developers. All rights reserved. Redistribution and use in source and binary forms, with or without modification, are permitted provided that the following conditions are met: 1. Redistributions of source code must retain the above copyright notice, this list of conditions and the following disclaimer. 2. Redistributions in binary form must reproduce the above copyright notice, this list of conditions and the following disclaimer in the documentation and/or other materials provided with the distribution. THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT HOLDER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
- Keywords: battery, cycler, galv
- Requirements: click, requests, pandas, dask, fastnumbers, holoviews, matplotlib, datashader, galvani, maya, xlrd, psutil

# Galv Harvester (Python program)
> A metadata secretary for battery science

[![PyPI - Version](https://img.shields.io/pypi/v/galv-harvester)](https://pypi.org/project/galv-harvester/)
[![PyPI - Downloads](https://img.shields.io/pypi/dm/galv-harvester)](https://pypi.org/project/galv-harvester/)

[![Hatch project](https://img.shields.io/badge/%F0%9F%A5%9A-Hatch-4051b5.svg)](https://github.com/pypa/hatch)
[![Test, Build, and Publish](https://github.com/galv-team/galv-harvester/actions/workflows/publish.yml/badge.svg)](https://github.com/galv-team/galv-harvester/actions/workflows/publish.yml)

## Galv Project
- [Backend](https://github.com/galv-team/galv-backend)
- [Frontend](https://github.com/galv-team/galv-frontend)
- [**Harvester**](https://github.com/galv-team/galv-harvester)

## Installation

The Galv Harvester can be installed from the [Python Package Index](https://pypi.org/project/galv-harvester/).

```bash
pip install galv-harvester
```

This will install the harvester and its dependencies, and make the `galv-harvester` command available.

## Usage

The first time you use the harvester, it will have to register itself with the Galv server.
To set up the harvester, use the following command:

```bash
galv-harvester setup
```

The harvester will prompt you for the necessary settings to connect to the Galv server (see [Initial Setup](#initial-setup)).

Alternatively, you can specify the settings as environment variables 
(see [Using Environment Variables](#using-environment-variables))
or as [command line arguments](#using-command-line-arguments) to the `galv-harvester` program.

## Initial Setup

There are three ways to set up the harvester: using the [setup wizard](#using-the-setup-wizard), 
using [command line arguments](#using-command-line-arguments),
or by specifying [environment variables](#using-environment-variables).
You can combine these methods, for example by specifying some settings in the environment and others in the wizard.

If you launch the program using the command above, the wizard will prompt you for any necessary settings that are still missing.

### Using the setup wizard

First, you'll be asked for the [Galv server](https://github.com/galv-team/galv-backend) URL.
This should be the URL of the Galv server you have set up.
Providing a frontend URL will not work, as the harvester needs to communicate with the backend.

Next, you'll be asked for your API token. 
This can be generated in either the Galv frontend or backend.
The token should be for a User who administers the Lab the Harvester will belong to.

Next, you'll be asked to specify a name for the new Harvester. 

Finally, you'll be asked if you want to monitor a directory.
If you answer 'yes', you'll be asked for the path to the directory you want to monitor,
and the Team that the monitored path will belong to.

The Harvester will register itself with the Galv server and begin to monitor for data files.

You can optionally pass the `--foreground` flag to `galv-harvester setup`; once setup completes, it keeps the harvester running in the foreground.

### Using environment variables

You can specify harvester properties as environment variables.
If you are using docker-compose, you can specify these in the `docker-compose.yml` file (see below),
or you can specify them in your shell environment before running the harvester if you are running it as a standalone Python program.
Any environment variable can be omitted; if a necessary one is missing, the harvester will prompt you for it when you start it.
For details on the variables you can set, and whether they are necessary, see the [variable details](#variable-details) section.

```yaml
# .env
GALV_HARVESTER_SERVER_URL=<your_server_url>
GALV_HARVESTER_NAME=<your_harvester_name>
GALV_HARVESTER_API_TOKEN=<your_api_token>
GALV_HARVESTER_LAB_ID=<your_lab_id>
GALV_HARVESTER_TEAM_ID=<your_team_id>
GALV_HARVESTER_MONITOR_PATH=<your_monitor_path>
GALV_HARVESTER_MONITOR_PATH_REGEX=<your_monitor_path_regex>
GALV_HARVESTER_SKIP_WIZARD=<true_or_omit>
GALV_HARVESTER_FOREGROUND=<true_or_omit>
```

If you don't want to have to specify the path to the data directory every time you start the harvester,
you can also edit the `docker-compose.yml` file to include the path as a volume.
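
When running the harvester as a standalone Python program, a `.env`-style file like the one above is not loaded into your shell automatically. The following is a minimal sketch of reading such a file into a dictionary. The helper `parse_env_lines` and the example values are hypothetical, and the parser is deliberately simplified: it handles plain `KEY=VALUE` lines only, with no quoting or variable expansion.

```python
def parse_env_lines(lines):
    """Parse simple KEY=VALUE lines, skipping blanks and # comments."""
    values = {}
    for raw in lines:
        line = raw.strip()
        # Ignore empty lines, comments, and anything without an '='.
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        values[key.strip()] = value.strip()
    return values


# Hypothetical file contents, for illustration only.
example = """
# .env
GALV_HARVESTER_NAME=my-harvester
GALV_HARVESTER_SKIP_WIZARD=true
""".splitlines()

settings = parse_env_lines(example)
print(settings)
```

The resulting dictionary could be merged with `os.environ` and passed as the `env` argument of `subprocess.run` when launching `galv-harvester setup`.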

### Using command line arguments

You can also specify harvester properties as command line arguments:

```text
Usage: galv-harvester setup [OPTIONS]

Options:
  --version                  Show the version and exit.
  --url TEXT                 API URL to register harvester with.
  --name TEXT                Name for the harvester.
  --api_token TEXT           Your API token. You must have admin access to at
                             least one Lab.
  --lab_id INTEGER           Id of the Lab to assign the Harvester to. Only
                             required if you administrate multiple Labs.
  --team_id INTEGER          Id of the Team to create a Monitored Path for.
                             Only required if you administrate multiple Teams
                             and wish to create a monitored path.
  --monitor_path TEXT        Path to harvest files from.
  --monitor_path_regex TEXT  Regex to match files to harvest. Other options
                             can be specified using the frontend.
  --foreground               On completion, run the harvester in the
                             foreground (will not close the thread, useful for
                             Dockerized application).
  --restart                  Ignore other options and run harvester if config
                             file already exists.
  --help                     Show this message and exit.
```

For details on the variables you can set, and when they are necessary, see the [variable details](#variable-details) section.
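
If you script the setup, it can help to assemble the command line in one place. The following is a minimal sketch that uses only flags shown in the help text above; the helper `build_setup_args` is hypothetical and not part of the harvester, and the placeholder values must be replaced with your own.

```python
import shlex


def build_setup_args(url, name, api_token, monitor_path=None,
                     monitor_path_regex=None, foreground=False):
    """Assemble an argument list for `galv-harvester setup`."""
    args = ["galv-harvester", "setup",
            "--url", url, "--name", name, "--api_token", api_token]
    # Optional settings are only added when a value is supplied.
    if monitor_path is not None:
        args += ["--monitor_path", monitor_path]
    if monitor_path_regex is not None:
        args += ["--monitor_path_regex", monitor_path_regex]
    if foreground:
        args.append("--foreground")
    return args


# Placeholder values; this only prints the command, it does not run it.
args = build_setup_args("<your_server_url>", "<your_harvester_name>",
                        "<your_api_token>", foreground=True)
print(shlex.join(args))
```

The returned list can be handed to `subprocess.run(args, check=True)` once real values are filled in. Passing a list rather than a single string avoids shell quoting problems with tokens and paths.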

## Variable details

If not restarting from a previous configuration, the following variables are required, 
and will be prompted for by the wizard if not set (unless `GALV_HARVESTER_SKIP_WIZARD` is set to `true`).
If `GALV_HARVESTER_SKIP_WIZARD` is set to `true`, you must provide these variables in the environment or the docker-compose file:

- `GALV_HARVESTER_SERVER_URL`: The URL of the Galv server.
- `GALV_HARVESTER_NAME`: The name of the harvester.
- `GALV_HARVESTER_API_TOKEN`: The API token for a User who administers the Lab the Harvester will be associated with.
- `GALV_HARVESTER_LAB_ID`: The ID of the lab the harvester belongs to. Only required if the User administers multiple Labs.
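
Because setup fails when the wizard is skipped and a required variable is missing, a small pre-flight check can be useful. The following is a minimal sketch; the helper `missing_required` is hypothetical and not part of the harvester. Leaving `GALV_HARVESTER_LAB_ID` out of the check is an assumption that the User administers a single Lab.

```python
import os

# Variables listed above as required when the wizard is skipped.
# GALV_HARVESTER_LAB_ID is omitted on the assumption that the User
# administers only one Lab.
REQUIRED = (
    "GALV_HARVESTER_SERVER_URL",
    "GALV_HARVESTER_NAME",
    "GALV_HARVESTER_API_TOKEN",
)


def missing_required(env=None):
    """Return the required variable names that are unset or empty."""
    env = os.environ if env is None else env
    return [name for name in REQUIRED if not env.get(name)]


missing = missing_required()
if missing:
    print("Missing variables:", ", ".join(missing))
else:
    print("All required variables are set.")
```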

If you want to set up a monitored path, the following variables are required:
- `GALV_HARVESTER_TEAM_ID`: The ID of the team the monitored path will belong to. Only required if the User has multiple Teams.
- `GALV_HARVESTER_MONITOR_PATH`: The path to the directory you want to monitor.

You may also optionally specify the following variables:
- `GALV_HARVESTER_MONITOR_PATH_REGEX`: A regex pattern to match files in the monitored path. Only files that match this pattern will be uploaded to the Galv server.
- `GALV_HARVESTER_SKIP_WIZARD`: If set to `true`, the harvester will not prompt you for any missing variables and setup will fail if necessary variables are not set.
- `GALV_HARVESTER_FOREGROUND`: If set to `true`, the harvester will run in the foreground.
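
Before setting `GALV_HARVESTER_MONITOR_PATH_REGEX`, you may want to preview which files a pattern would select. The following is a minimal sketch using Python's `re` module. How the harvester applies the pattern (for example, whether it matches against the full path or a relative one, and whether it anchors the match) is not described here, so the use of `re.search` on relative paths is an assumption, and the file names are hypothetical.

```python
import re


def preview_matches(pattern, relative_paths):
    """Return the paths in which the pattern is found (re.search)."""
    compiled = re.compile(pattern)
    return [p for p in relative_paths if compiled.search(p)]


# Hypothetical file names, for illustration only.
candidates = ["cell_01/cycle.csv", "cell_01/notes.txt", "cell_02/cycle.mpr"]
print(preview_matches(r"\.(csv|mpr)$", candidates))
```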

## Further setup

Further setup can be done in the web frontend.

When you log into the frontend as a User who belongs to the same Lab as the Harvester,
you'll see the Harvester listed in the 'Harvesters' tab.

You can add new monitored paths to the Harvester, or change the Harvester's settings (if you have appropriate permissions).
Monitored Paths can only be created and edited by Team administrators, as a security measure.

## Starting the harvester

When the harvester is set up, you can start it by running the following command:

```bash
galv-harvester start
```

This will start the harvester using the previously-configured settings.

## Harvesting specific files/directories

If you want to harvest specific files or directories, or run the entire harvest cycle manually, 
you can do so with the command `galv-harvester harvest`.

With no arguments, this will harvest each monitored path in turn.

You can also specify paths to harvest:

```bash
galv-harvester harvest /path/to/directory /path/to/another/directory/file.csv /path/to/somewhere/else
```

This will harvest the specified paths, and only those paths.
**Note**: Each path must fall within one of the harvester's monitored paths,
and must match that monitored path's regex pattern, if one is specified.
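
To check locally whether a path you intend to pass to `galv-harvester harvest` lies under a monitored directory, a sketch like the following can help. The helper `is_within` and the example paths are hypothetical. The comparison is purely lexical (it does not resolve symlinks or `..`), and it only approximates whatever check the harvester itself performs.

```python
from pathlib import PurePosixPath


def is_within(path, monitored_path):
    """True if `path` is the monitored directory or lies beneath it."""
    # is_relative_to compares whole path components (Python 3.9+).
    return PurePosixPath(path).is_relative_to(PurePosixPath(monitored_path))


# Hypothetical paths, for illustration only.
print(is_within("/data/cyclers/run1/file.csv", "/data/cyclers"))
print(is_within("/data/other/file.csv", "/data/cyclers"))
```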

            

Raw data

{
    "_id": null,
    "home_page": null,
    "name": "galv-harvester",
    "maintainer": null,
    "docs_url": null,
    "requires_python": ">=3.9",
    "maintainer_email": null,
    "keywords": "battery, cycler, galv",
    "author": null,
    "author_email": "Matt Jaquiery <matt.jaquiery@dtc.ox.ac.uk>",
    "download_url": "https://files.pythonhosted.org/packages/1f/44/f2d96e6ce2ad18c33b36f9a3ee9fa6fbce4f8d4c72ea5b437a19b387dd0b/galv_harvester-1.3.4.tar.gz",
    "platform": null,
    "bugtrack_url": null,
    "license": "Copyright  (c) 2020-2023, The Chancellor, Masters and Scholars of the University of Oxford, and the 'Galv' Developers. All rights reserved.  Redistribution and use in source and binary forms, with or without modification, are permitted provided that the following conditions are met:  1. Redistributions of source code must retain the above copyright notice, this list of conditions and the following disclaimer.  2. Redistributions in binary form must reproduce the above copyright notice, this list of conditions and the following disclaimer in the documentation and/or other materials provided with the distribution.  THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS \"AS IS\" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT HOLDER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.",
    "summary": null,
    "version": "1.3.4",
    "project_urls": {
        "Documentation": "https://github.com/galv-team/galv-harvester#readme",
        "Issues": "https://github.com/galv-team/galv-harvester/issues",
        "Source": "https://github.com/galv-team/galv-harvester"
    },
    "split_keywords": [
        "battery",
        " cycler",
        " galv"
    ],
    "urls": [
        {
            "comment_text": "",
            "digests": {
                "blake2b_256": "0b6e54d372813ed7c36902a103036cb8ee928256b84380657e810f2850d0c923",
                "md5": "7b29d78cd603361019f9ac191d60a3bc",
                "sha256": "f184e4623e4571241820138d40cdf959425824875482af7ae9d34373fff4dc4f"
            },
            "downloads": -1,
            "filename": "galv_harvester-1.3.4-py3-none-any.whl",
            "has_sig": false,
            "md5_digest": "7b29d78cd603361019f9ac191d60a3bc",
            "packagetype": "bdist_wheel",
            "python_version": "py3",
            "requires_python": ">=3.9",
            "size": 38362,
            "upload_time": "2024-10-07T14:27:42",
            "upload_time_iso_8601": "2024-10-07T14:27:42.311839Z",
            "url": "https://files.pythonhosted.org/packages/0b/6e/54d372813ed7c36902a103036cb8ee928256b84380657e810f2850d0c923/galv_harvester-1.3.4-py3-none-any.whl",
            "yanked": false,
            "yanked_reason": null
        },
        {
            "comment_text": "",
            "digests": {
                "blake2b_256": "1f44f2d96e6ce2ad18c33b36f9a3ee9fa6fbce4f8d4c72ea5b437a19b387dd0b",
                "md5": "10c331f64523ecf9c794038bb0688fc8",
                "sha256": "67ba927ad7f02f9a3970c87517b1fac8d063a615bc8d9414a8b94313c4eb4139"
            },
            "downloads": -1,
            "filename": "galv_harvester-1.3.4.tar.gz",
            "has_sig": false,
            "md5_digest": "10c331f64523ecf9c794038bb0688fc8",
            "packagetype": "sdist",
            "python_version": "source",
            "requires_python": ">=3.9",
            "size": 32617,
            "upload_time": "2024-10-07T14:27:44",
            "upload_time_iso_8601": "2024-10-07T14:27:44.228123Z",
            "url": "https://files.pythonhosted.org/packages/1f/44/f2d96e6ce2ad18c33b36f9a3ee9fa6fbce4f8d4c72ea5b437a19b387dd0b/galv_harvester-1.3.4.tar.gz",
            "yanked": false,
            "yanked_reason": null
        }
    ],
    "upload_time": "2024-10-07 14:27:44",
    "github": true,
    "gitlab": false,
    "bitbucket": false,
    "codeberg": false,
    "github_user": "galv-team",
    "github_project": "galv-harvester#readme",
    "travis_ci": false,
    "coveralls": false,
    "github_actions": true,
    "requirements": [
        {
            "name": "click",
            "specs": [
                [
                    "==",
                    "8.1.3"
                ]
            ]
        },
        {
            "name": "requests",
            "specs": [
                [
                    "==",
                    "2.28.1"
                ]
            ]
        },
        {
            "name": "pandas",
            "specs": [
                [
                    "==",
                    "2.2.1"
                ]
            ]
        },
        {
            "name": "dask",
            "specs": [
                [
                    "==",
                    "2024.4.2"
                ]
            ]
        },
        {
            "name": "fastnumbers",
            "specs": [
                [
                    "==",
                    "5.1.0"
                ]
            ]
        },
        {
            "name": "holoviews",
            "specs": [
                [
                    "==",
                    "1.18.3"
                ]
            ]
        },
        {
            "name": "matplotlib",
            "specs": [
                [
                    "==",
                    "3.9.0"
                ]
            ]
        },
        {
            "name": "datashader",
            "specs": [
                [
                    "==",
                    "0.16.1"
                ]
            ]
        },
        {
            "name": "galvani",
            "specs": [
                [
                    "==",
                    "0.4.1"
                ]
            ]
        },
        {
            "name": "maya",
            "specs": [
                [
                    "==",
                    "0.6.1"
                ]
            ]
        },
        {
            "name": "xlrd",
            "specs": [
                [
                    "==",
                    "2.0.1"
                ]
            ]
        },
        {
            "name": "psutil",
            "specs": [
                [
                    "==",
                    "5.9.4"
                ]
            ]
        }
    ],
    "lcname": "galv-harvester"
}
        