pandastic

- Name: pandastic
- Version: 0.0.2
- Summary: A command line tool for manipulating PanDA tasks and rucio datasets
- Upload time: 2024-08-26 19:35:58
- Requires Python: >=3.10

# Pandastic - Connecting Rucio and PanDA

`Pandastic` is a simple command line tool that uses the PanDA and Rucio clients to let you operate on PanDA tasks and on their associated input and output datasets, which live on Rucio.
The PanDA-Rucio bridge is a key feature of `Pandastic`, but you do not have to use it:
`Pandastic` also lets you search for and manipulate Rucio datasets and PanDA tasks independently.

## Getting the Package

The tool is available on PyPI and can be installed with `pip`:

```
pip install pandastic
```

**Note: you must be able to generate a VOMS proxy to manipulate ATLAS datasets. On `lxplus`, you will be prompted for your Grid certificate passphrase. On a local machine, you need to install VOMS.**

You may be prompted to create a Rucio configuration.
In this case, create a file at the location given in the prompt with the following content:
```
[client]
rucio_host = https://rucio-lb-prod.cern.ch
auth_host = https://atlas-rucio-auth.cern.ch
ca_cert = /etc/grid-security/certificates/
account = <rucio_account>
auth_type = x509_proxy
```
or the equivalent settings for a different authentication method or a non-ATLAS host.
If you are on `lxplus`, or otherwise have access to the ATLAS environment setup, running `setupATLAS -q && lsetup rucio` achieves the same result.
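
If you would rather generate this file from a script, the following is a minimal sketch using Python's standard `configparser` module. The helper name `write_rucio_config` is hypothetical and not part of `Pandastic` or Rucio; the path and account name you pass to it are up to you, and the path should be the location that the prompt asks for.

```python
import configparser
from pathlib import Path


def write_rucio_config(path: Path, account: str) -> None:
    """Write a [client] section containing the settings listed above."""
    config = configparser.ConfigParser()
    config["client"] = {
        "rucio_host": "https://rucio-lb-prod.cern.ch",
        "auth_host": "https://atlas-rucio-auth.cern.ch",
        "ca_cert": "/etc/grid-security/certificates/",
        "account": account,
        "auth_type": "x509_proxy",
    }
    # Create the parent directory if it does not exist yet.
    path.parent.mkdir(parents=True, exist_ok=True)
    with path.open("w") as handle:
        config.write(handle)
```

Calling, for example, `write_rucio_config(Path("rucio.cfg"), "jdoe")` would write the file into the current directory for a made-up account `jdoe`.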

## What can you do with the package?

### For datasets:

- You can retrieve datasets under some scope with a `regex` pattern
- You can retrieve datasets that are the inputs/outputs of grid tasks (specified by a `regex` pattern, from a given user) with a task status of choice
    - Support is available for production tasks with an extra flag
- You can retrieve datasets under some scope with a `regex` pattern, restricted to datasets
    - that have, or do not have, rules or replicas on specific RSEs
    - whose history never involved a rule on a particular RSE/site
    - whose rules have a particular time left in their lifetime
    - whose containers respect those criteria
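
To make these filtering criteria concrete, here is a small sketch in plain Python. It does not use the Rucio client or `Pandastic` itself: the `Rule` and `Dataset` classes, the `select` function, the RSE names and the dataset names are all hypothetical stand-ins for the kind of information such a search works with.

```python
import re
from dataclasses import dataclass, field
from datetime import datetime, timedelta


@dataclass
class Rule:
    rse: str                # hypothetical RSE name
    expires_at: datetime    # moment at which the rule's lifetime ends


@dataclass
class Dataset:
    name: str
    rules: list[Rule] = field(default_factory=list)


def select(datasets, pattern, rse, max_time_left, now):
    """Keep datasets whose name matches `pattern` and that have a rule on
    `rse` which has not expired yet but expires within `max_time_left`."""
    regex = re.compile(pattern)
    selected = []
    for dataset in datasets:
        if not regex.match(dataset.name):
            continue
        for rule in dataset.rules:
            time_left = rule.expires_at - now
            if rule.rse == rse and timedelta(0) <= time_left <= max_time_left:
                selected.append(dataset.name)
                break
    return selected


# Made-up example data.
now = datetime(2024, 8, 26)
datasets = [
    Dataset("user.jdoe.ttbar.v1", [Rule("SITE_A_DISK", now + timedelta(days=3))]),
    Dataset("user.jdoe.ttbar.v2", [Rule("SITE_A_DISK", now + timedelta(days=30))]),
    Dataset("user.jdoe.zjets.v1", [Rule("SITE_B_DISK", now + timedelta(days=3))]),
]

expiring = select(datasets, r"user\.jdoe\..*", "SITE_A_DISK", timedelta(days=7), now)
print(expiring)
```

Only the first dataset is selected: the second has more than seven days left on its rule, and the third has no rule on the requested RSE.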

Once the list of datasets is retrieved, you can perform one of the following actions on each dataset:

- Create, extend or delete rules associated with a dataset or its container
- List the files within the dataset (to access them later with the `XrootD` or `davs` protocols)
- Download the datasets to a specified destination

### For tasks:

- You can retrieve tasks with a status of choice for some user with a `regex` pattern, looking back `N` days

Once the list of tasks is retrieved, you can perform one of the following actions on each task:

- Pause/unpause the task
- Retry the task (optionally with new arguments)
- Kill the task

### Some general comments:

- You do not have to search on the fly for the tasks/datasets you want to manipulate. You can instead provide a list of tasks or files to perform the actions on.

- Patterns are regular expressions, not shell globs. To match any sequence of characters, use `.*` rather than a bare `*`.
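
The difference between regular expressions and shell globbing can be illustrated with Python's standard `re` and `fnmatch` modules. This is only a sketch: the dataset names are made up, and whether `Pandastic` anchors a pattern at the start of the name or searches anywhere in it is not stated here, so the use of `re.match` below is an assumption.

```python
import fnmatch
import re

# Made-up dataset names, for illustration only.
names = [
    "user.jdoe.ttbar.v1_output.root",
    "user.jdoe.ttbar.v2_output.root",
    "user.jdoe.zjets.v1_output.root",
]

# Regular expression: `.*` matches any run of characters.
regex = re.compile(r"user\.jdoe\.ttbar.*")
regex_hits = [name for name in names if regex.match(name)]

# Shell-style glob: a bare `*` plays the same role.
glob_hits = fnmatch.filter(names, "user.jdoe.ttbar*")

# Both approaches select the two ttbar datasets.
print(regex_hits == glob_hits)

# Read as a regular expression, the glob-style pattern means something else:
# `r*` is "zero or more r", so it cannot cover the rest of the name.
print(re.fullmatch(r"user.jdoe.ttbar*", names[0]) is None)
```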


### Features to look forward to

- Ability to monitor rule creation in an intuitive way
- Ability to build a nicely presented report of a user's usage on any RSE, including datasets with rules occupying space
- Build a table of statuses of `PanDA` tasks differentiated by a pattern in their name
- Ability to upload datasets to Rucio
- More color and proper logging!
- Download the logs for failed jobs in a given task

## Usage

### Manipulating datasets

The actions available for a given dataset are:

- *find*:      used to just dump the list of datasets, with no action
- *replicate*: used to create new rules for datasets
- *update*:    used to increase the lifetime for rules
- *delete*:    used to delete rules
- *download*:  used to download datasets

You can then use the command line tool to perform one of these actions (`<action>`) with some filter on the dataset specified by the arguments `<args>`:

```
pandastic-data <action> <args>
```

The required and allowed `<args>` are described extensively in the respective help messages. You should have a look at the available arguments with
```
pandastic-data --help
```
*A tabulated summary coming here soon!*

### Manipulating Tasks

The actions available for a given task are:

- *find*:       used to just dump the list of tasks, with no action
- *pause*:      used to pause tasks
- *unpause*:    used to unpause tasks
- *retry*:      used to retry tasks
- *kill*:       used to kill tasks

You can then use the command line tool to perform one of these actions (`<action>`) with some filter on the tasks specified by the arguments `<args>`:

```
pandastic-tasks <action> <args>
```

The required and allowed `<args>` are described extensively in the respective help messages. You should have a look at the available arguments with
```
pandastic-tasks --help
```
*A tabulated summary coming here soon!*
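
If you drive the two commands from a Python script, a small wrapper can reject unsupported actions before anything is executed. This is a sketch under stated assumptions: `build_command` and `show_help` are hypothetical helpers, not part of `Pandastic`, and the only argument they rely on is `--help`, which is documented above. Consult the help output for the arguments each action really accepts.

```python
import shutil
import subprocess

# Actions listed in this README for each command line tool.
ACTIONS = {
    "pandastic-data": {"find", "replicate", "update", "delete", "download"},
    "pandastic-tasks": {"find", "pause", "unpause", "retry", "kill"},
}


def build_command(tool: str, action: str, args: list[str]) -> list[str]:
    """Return the argument list for `<tool> <action> <args>`."""
    if tool not in ACTIONS:
        raise ValueError(f"unknown tool: {tool}")
    if action not in ACTIONS[tool]:
        raise ValueError(f"{tool} has no action {action!r}")
    return [tool, action, *args]


def show_help(tool: str) -> None:
    """Print the tool's help text if the tool is found on the PATH."""
    if shutil.which(tool) is None:
        print(f"{tool} not found; is pandastic installed?")
        return
    subprocess.run([tool, "--help"], check=False)
```

The list returned by `build_command` can be passed directly to `subprocess.run`, which avoids shell quoting problems with regular expressions such as `.*`.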

            

Raw data

{
    "_id": null,
    "home_page": null,
    "name": "pandastic",
    "maintainer": null,
    "docs_url": null,
    "requires_python": ">=3.10",
    "maintainer_email": "Mohamed Aly <mohamed.aly@cern.ch>",
    "keywords": null,
    "author": null,
    "author_email": "Mohamed Aly <mohamed.aly@cern.ch>",
    "download_url": "https://files.pythonhosted.org/packages/43/b1/7c979b7d8eefd9172684ccc6874db6486af2383a5964161177559335669e/pandastic-0.0.2.tar.gz",
    "platform": null,
    "bugtrack_url": null,
    "license": null,
    "summary": "A command line tool for manipulating PanDA tasks and rucio datasets",
    "version": "0.0.2",
    "project_urls": {
        "homepage": "https://github.com/MoAly98/pandastic/tree/main"
    },
    "split_keywords": [],
    "urls": [
        {
            "comment_text": "",
            "digests": {
                "blake2b_256": "1697037163eaab4521393ca250527b3ed46e4d812146bf3ec43992041548f3d7",
                "md5": "8b681cc25fc4210c8498bfd0ea403995",
                "sha256": "ea75bd03526f8db9d1e20b1a33ba6a1733115ab83d5904f0a98f9f8a973ef289"
            },
            "downloads": -1,
            "filename": "pandastic-0.0.2-py3-none-any.whl",
            "has_sig": false,
            "md5_digest": "8b681cc25fc4210c8498bfd0ea403995",
            "packagetype": "bdist_wheel",
            "python_version": "py3",
            "requires_python": ">=3.10",
            "size": 39900,
            "upload_time": "2024-08-26T19:35:57",
            "upload_time_iso_8601": "2024-08-26T19:35:57.463065Z",
            "url": "https://files.pythonhosted.org/packages/16/97/037163eaab4521393ca250527b3ed46e4d812146bf3ec43992041548f3d7/pandastic-0.0.2-py3-none-any.whl",
            "yanked": false,
            "yanked_reason": null
        },
        {
            "comment_text": "",
            "digests": {
                "blake2b_256": "43b17c979b7d8eefd9172684ccc6874db6486af2383a5964161177559335669e",
                "md5": "0859445024662a3cdf73fa5cc54e8503",
                "sha256": "d71b54262a5862fccfdee3ee54348e253885281f50ff11c27978db6380ce1d36"
            },
            "downloads": -1,
            "filename": "pandastic-0.0.2.tar.gz",
            "has_sig": false,
            "md5_digest": "0859445024662a3cdf73fa5cc54e8503",
            "packagetype": "sdist",
            "python_version": "source",
            "requires_python": ">=3.10",
            "size": 33308,
            "upload_time": "2024-08-26T19:35:58",
            "upload_time_iso_8601": "2024-08-26T19:35:58.509290Z",
            "url": "https://files.pythonhosted.org/packages/43/b1/7c979b7d8eefd9172684ccc6874db6486af2383a5964161177559335669e/pandastic-0.0.2.tar.gz",
            "yanked": false,
            "yanked_reason": null
        }
    ],
    "upload_time": "2024-08-26 19:35:58",
    "github": true,
    "gitlab": false,
    "bitbucket": false,
    "codeberg": false,
    "github_user": "MoAly98",
    "github_project": "pandastic",
    "travis_ci": false,
    "coveralls": false,
    "github_actions": false,
    "lcname": "pandastic"
}
        