gen3-util

* Name: gen3-util
* Version: 0.0.12
* Home page: https://github.com/ACED-IDP/gen3_util
* Summary: Commons utilities
* Upload time: 2023-11-27 23:53:59
* Author: Ellrott Lab
* Requires Python: >=3.9, <4
* Keywords: gen3, bioinformatics

# Gen3 Utilities

Utilities to manage Gen3 schemas, projects and submissions.

## Installation

```
# optionally
$ python3 -m venv venv ; source venv/bin/activate

$ pip install gen3_util

$ gen3_util
msg: Version 0.0.4
```

Note: requires the [`magic`](https://github.com/ahupp/python-magic#installation) library (libmagic). If it is not already installed, you will see a warning like this:

```text
Requires libmagic installed on your system to determine mime-types
Error: 'failed to find libmagic.  Check your installation'
For installation instructions see https://github.com/ahupp/python-magic#installation
```
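
As a quick pre-flight check, the sketch below tests whether the Python `magic` bindings can be imported and falls back to the standard library's extension-based `mimetypes` guess when they cannot. It is an illustration only, not how `gen3_util` itself detects MIME types; `guess_mime_type` is a hypothetical helper name.

```python
import mimetypes
import os

try:
    import magic  # python-magic; the import fails when libmagic is missing
    HAVE_MAGIC = hasattr(magic, "from_file")
except ImportError:
    HAVE_MAGIC = False


def guess_mime_type(path):
    """Return a MIME type for path, preferring libmagic when it is usable."""
    if HAVE_MAGIC and os.path.isfile(path):
        # Content-based detection via libmagic.
        return magic.from_file(path, mime=True)
    # Extension-based fallback from the standard library.
    mime, _encoding = mimetypes.guess_type(path)
    return mime or "application/octet-stream"


print("libmagic available:", HAVE_MAGIC)
```

If this prints `libmagic available: False`, follow the python-magic installation instructions linked above before uploading files.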

Note: some environments will raise a `_ctypes` error. If you see this, install Python 3.9.16.

## Use

```
$ gen3_util --help
Usage: gen3_util [OPTIONS] COMMAND [ARGS]...

  Gen3 Management Utilities

Options:
  --config TEXT              Path to config file. GEN3_UTIL_CONFIG
  --format [yaml|json|text]  Result format. GEN3_UTIL_FORMAT  [default: yaml]
  --cred TEXT                See https://uc-cdis.github.io/gen3-user-
                             doc/appendices/api-gen3/#credentials-to-query-
                             the-api. GEN3_API_KEY
  --state_dir TEXT           Directory for file transfer state
                             GEN3_UTIL_STATE_DIR  [default: ~/.gen3/gen3_util]
  --help                     Show this message and exit.

Commands:
  projects  Manage Gen3 projects.
  buckets   Manage Gen3 buckets.
  meta      Manage meta data.
  files     Manage file buckets.
  access    Manage access requests.
  config    Configure this utility.


```
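
The help text names an environment variable next to each option (`GEN3_UTIL_CONFIG`, `GEN3_UTIL_FORMAT`, `GEN3_API_KEY`, `GEN3_UTIL_STATE_DIR`). A minimal sketch of driving the CLI from Python by setting those variables might look like the following. `build_env` and `run_gen3_util` are hypothetical helper names, and the sketch assumes the CLI reads the variables as the help text suggests.

```python
import os
import subprocess


def build_env(fmt=None, config=None, cred=None, state_dir=None, base=None):
    """Copy the environment and set the variables named in `gen3_util --help`."""
    env = dict(os.environ if base is None else base)
    settings = {
        "GEN3_UTIL_FORMAT": fmt,
        "GEN3_UTIL_CONFIG": config,
        "GEN3_API_KEY": cred,
        "GEN3_UTIL_STATE_DIR": state_dir,
    }
    for name, value in settings.items():
        if value is not None:  # leave unset options untouched
            env[name] = str(value)
    return env


def run_gen3_util(args, **options):
    """Run gen3_util with the given arguments; requires gen3_util on PATH."""
    return subprocess.run(
        ["gen3_util", *args],
        env=build_env(**options),
        capture_output=True,
        text=True,
        check=False,
    )


# Example (not executed here): run_gen3_util(["projects", "ping"], fmt="json")
```

Keeping the environment construction separate from the `subprocess` call makes it easy to inspect what will be passed to the CLI without contacting an endpoint.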

## Connectivity

* Leverages Gen3Auth; see the [Gen3 API credentials documentation](https://uc-cdis.github.io/gen3-user-doc/appendices/api-gen3/#credentials-to-query-the-api).
* Store the `credentials.json` file in `~/.gen3/credentials.json`, or specify its location with either the `GEN3_API_KEY` environment variable or the `--cred` parameter.
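
The lookup described above can be sketched as a small function. The precedence shown here (`--cred` first, then `GEN3_API_KEY`, then the default path) is an assumption made for illustration and is not confirmed by the source; `resolve_credentials_path` is a hypothetical name.

```python
import os
from pathlib import Path

DEFAULT_CREDENTIALS = "~/.gen3/credentials.json"


def resolve_credentials_path(cred_option=None, environ=None):
    """Pick the credentials file location.

    Assumed order: explicit --cred value, then GEN3_API_KEY, then the default.
    """
    environ = os.environ if environ is None else environ
    candidate = cred_option or environ.get("GEN3_API_KEY") or DEFAULT_CREDENTIALS
    return Path(candidate).expanduser()


print(resolve_credentials_path())
```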

## Use cases

> I need to verify connectivity.

```
$ gen3_util projects ping
msg: OK connected to endpoint https://aced-training.compbio.ohsu.edu
```

> I need to see what projects exist

```
$ gen3_util projects ls

endpoint: https://aced-training.compbio.ohsu.edu
msg: OK
projects:
- /programs
- /programs/aced
- /programs/aced/project
- /programs/aced/project/MCF10A
- /programs/aced/projects
- /programs/aced/projects/Alcoholism
- /programs/aced/projects/Alzheimers
- /programs/aced/projects/Breast_Cancer
- /programs/aced/projects/Colon_Cancer
- /programs/aced/projects/Diabetes
- /programs/aced/projects/HOP
- /programs/aced/projects/Lung_Cancer
- /programs/aced/projects/MCF10A
- /programs/aced/projects/NVIDIA
- /programs/aced/projects/Prostate_Cancer
- /programs/aced/projects/ohsu_download_testing
```
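
The listing mixes container paths (such as `/programs` and `/programs/aced/projects`) with the projects themselves. As a rough sketch, the entries shaped like `/programs/<program>/projects/<project>` can be filtered out of the list as shown below. `project_ids` is a hypothetical helper, and the path shape is inferred from the sample output only.

```python
import re

PROJECT_PATH = re.compile(
    r"^/programs/(?P<program>[^/]+)/projects/(?P<project>[^/]+)$"
)


def project_ids(paths):
    """Return (program, project) pairs for /programs/<program>/projects/<project> paths."""
    pairs = []
    for path in paths:
        match = PROJECT_PATH.match(path)
        if match:
            pairs.append((match["program"], match["project"]))
    return pairs


listing = [
    "/programs",
    "/programs/aced",
    "/programs/aced/project/MCF10A",
    "/programs/aced/projects",
    "/programs/aced/projects/Alcoholism",
    "/programs/aced/projects/MCF10A",
]
print(project_ids(listing))  # [('aced', 'Alcoholism'), ('aced', 'MCF10A')]
```

Note that the entry under the singular `project` segment is skipped by this pattern; whether that is desirable depends on how the commons is configured.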

> I need to see what buckets are associated with the commons

```
$ gen3_util buckets ls
buckets:
  GS_BUCKETS: {}
  S3_BUCKETS:
    aced-default:
      endpoint_url: https://minio-default.compbio.ohsu.edu
      region: us-east-1
    aced-manchester:
      endpoint_url: https://minio-manchester.compbio.ohsu.edu
      region: us-east-1
    aced-ohsu:
      endpoint_url: https://minio-ohsu.compbio.ohsu.edu
      region: us-east-1
    aced-stanford:
      endpoint_url: https://minio-stanford.compbio.ohsu.edu
      region: us-east-1
    aced-ucl:
      endpoint_url: https://minio-ucl.compbio.ohsu.edu
      region: us-east-1
endpoint: https://aced-training.compbio.ohsu.edu
msg: OK


```
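
Once the result has been parsed into a dictionary (for example with `--format json` and `json.loads`, assuming the JSON output mirrors the YAML keys shown above), the bucket names and endpoints can be flattened with a few lines. `bucket_endpoints` is a hypothetical helper, and the sample data is abbreviated from the output above.

```python
def bucket_endpoints(result):
    """Map bucket name -> endpoint_url across all providers in a `buckets ls` result."""
    endpoints = {}
    for provider_buckets in result.get("buckets", {}).values():
        # A provider with no buckets appears as an empty mapping.
        for name, details in (provider_buckets or {}).items():
            endpoints[name] = details.get("endpoint_url")
    return endpoints


sample = {
    "buckets": {
        "GS_BUCKETS": {},
        "S3_BUCKETS": {
            "aced-default": {
                "endpoint_url": "https://minio-default.compbio.ohsu.edu",
                "region": "us-east-1",
            },
            "aced-ohsu": {
                "endpoint_url": "https://minio-ohsu.compbio.ohsu.edu",
                "region": "us-east-1",
            },
        },
    },
    "endpoint": "https://aced-training.compbio.ohsu.edu",
    "msg": "OK",
}
print(bucket_endpoints(sample))
```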

> I need to create a project

```text
$ gen3_util projects touch aced-MyExperiment
projects:
  aced-MyExperiment:
    exists: true
messages:
- Created program:aced Program is updated!

```
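
The project id in this example, `aced-MyExperiment`, appears to combine a program (`aced`) and a project name (`MyExperiment`). The sketch below splits an id on its first hyphen and builds the matching path in the style of the `projects ls` listing. Both the splitting rule and the helper names are assumptions for illustration, not the tool's documented behavior.

```python
def split_project_id(project_id):
    """Split '<program>-<project>' on the first hyphen (an assumption of this sketch)."""
    program, separator, project = project_id.partition("-")
    if not separator or not program or not project:
        raise ValueError(f"expected '<program>-<project>', got {project_id!r}")
    return program, project


def resource_path(project_id):
    """Build a /programs/<program>/projects/<project> path from a project id."""
    program, project = split_project_id(project_id)
    return f"/programs/{program}/projects/{project}"


print(resource_path("aced-MyExperiment"))  # /programs/aced/projects/MyExperiment
```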

> I need to assign default policies to that project

```text
$ gen3_util projects add policies aced-MyExperiment
msg: Approve these requests to assign default policies to aced-MyExperiment
commands:
- gen3_util access update 24f047d7-0e7c-43c6-bab6-61e2d385c71a SIGNED
- gen3_util access update 293c6cd1-7ab7-420f-bafb-34319589eac4 SIGNED

```

> I need to add a user to that project

```text
$ gen3_util projects add user aced-MyExperiment linus.pauling@osu.edu
msg: Approve these requests to add linus.pauling@osu.edu to aced-MyExperiment
commands:
- gen3_util access update 293c6cd1-7ab7-420f-bafb-34319589eac4 SIGNED

```

> Before proceeding, I need to sign those requests

```text
gen3_util access update xxxxxx SIGNED
```
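
The `commands` lists printed by the two previous steps can overlap (one request id appears in both). A minimal sketch for collecting the distinct request ids from those lines follows. `request_ids` is a hypothetical helper, and it assumes the command lines keep the `gen3_util access update <id> SIGNED` shape shown above.

```python
import re
import shlex

UUID = re.compile(r"^[0-9a-f]{8}(-[0-9a-f]{4}){3}-[0-9a-f]{12}$")


def request_ids(commands):
    """Extract distinct request ids from 'gen3_util access update <id> SIGNED' lines."""
    ids = []
    for command in commands:
        parts = shlex.split(command)
        if len(parts) == 5 and parts[:3] == ["gen3_util", "access", "update"]:
            request_id = parts[3]
            if UUID.match(request_id) and request_id not in ids:
                ids.append(request_id)
    return ids


commands = [
    "gen3_util access update 24f047d7-0e7c-43c6-bab6-61e2d385c71a SIGNED",
    "gen3_util access update 293c6cd1-7ab7-420f-bafb-34319589eac4 SIGNED",
    "gen3_util access update 293c6cd1-7ab7-420f-bafb-34319589eac4 SIGNED",
]
for request_id in request_ids(commands):
    print("gen3_util access update", request_id, "SIGNED")
```

Printing the commands rather than executing them keeps the approval step a deliberate, manual action.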



> I want to create a simple project structure with a set of files

```text
$ gen3_util meta  import dir tests/fixtures/dir_to_study/ tmp/foo --project_id aced-MyExperiment
summary:
  ResearchStudy:
    count: 1
  DocumentReference:
    count: 5
    size: 6013814
msg: OK

```

> I need to do something a bit more complex: for example, I want to create a project structure with a set of files, but I need to specify the `Patient` and `Specimen` based on the path of the file.

```text
gen3_util meta  import dir tests/fixtures/dir_to_study_with_meta/ tmp/foometa --project_id aced-foometa --plugin_path ./tests/unit/plugins

tests/fixtures/dir_to_study_with_meta/
├── file-2.csv
├── p1
│   ├── s1
│   │   └── file-3.pdf
│   ├── s2
│   │   └── file-4.tsv
│   └── s3
│       └── file-5
└── p2
    └── s4
        └── file-1.txt

Will produce the following meta data:

summary:
  ResearchStudy:
    count: 1
  Patient:
    count: 2
  Specimen:
    count: 4
  DocumentReference:
    count: 5
    size: 6013814

```

For more, see [test_meta_plugin](./tests/unit/meta/test_plugins.py).
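
The path-to-identifier idea can be illustrated without the plugin machinery. The sketch below is not the plugin interface (see the linked test for that); it only shows one way a `<patient>/<specimen>/<file>` layout could be mapped to identifiers, and it reproduces the `Patient` and `Specimen` counts from the example tree. `patient_and_specimen` is a hypothetical helper.

```python
from pathlib import PurePosixPath


def patient_and_specimen(relative_path):
    """Derive (patient, specimen) from '<patient>/<specimen>/<file>'.

    A level that is absent from the path yields None.
    """
    parents = PurePosixPath(relative_path).parts[:-1]  # drop the file name
    patient = parents[0] if len(parents) >= 1 else None
    specimen = parents[1] if len(parents) >= 2 else None
    return patient, specimen


paths = [
    "file-2.csv",
    "p1/s1/file-3.pdf",
    "p1/s2/file-4.tsv",
    "p1/s3/file-5",
    "p2/s4/file-1.txt",
]
pairs = [patient_and_specimen(path) for path in paths]
patients = {patient for patient, _ in pairs if patient}
specimens = {(patient, specimen) for patient, specimen in pairs if specimen}
print(len(patients), len(specimens))  # 2 4
```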



> I need to upload those files to the instance

```
$ gen3_util files cp --ignore_state --project_id aced-MyExperiment tmp/foo/DocumentReference.ndjson  bucket://aced-development-ohsu-data-bucket
100%|██████████████████████████████████████████████████████████████████████████████████████████████████████| 5.74M/5.74M [00:03<00:00, 1.71MB/s, elapsed=0:00:02.056022, file=6f8101]
info:
- Wrote state to ~/.gen3/gen3-util-state/state.ndjson
msg: OK
```
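
Both the manifest passed to `files cp` and the state file it writes use the `.ndjson` extension, which conventionally means one JSON object per line. A minimal reader for inspecting such files is sketched below. It makes no assumption about the fields inside each record, and `read_ndjson` is a hypothetical helper.

```python
import json


def read_ndjson(lines):
    """Parse newline-delimited JSON from an iterable of lines, skipping blank lines."""
    return [json.loads(line) for line in lines if line.strip()]


# Works on any iterable of lines, including an open file:
#   with open("tmp/foo/DocumentReference.ndjson") as handle:
#       records = read_ndjson(handle)
records = read_ndjson(['{"a": 1}\n', "\n", '{"b": 2}\n'])
print(len(records))  # 2
```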


> I need to upload the meta data about those files to the instance

```
$ gen3_util meta cp tmp/foo bucket://aced-development-ohsu-data-bucket --project_id aced-MyExperiment
msg: Uploaded /var/folders/2c/hffqqtr94nv64tjy0xrl38r89k1sty/T/tmpacozhhoo/_aced-MyExperiment_meta.zip
```


> I need to request or manage access to a project

```
$ gen3_util access
Usage: gen3_util access [OPTIONS] COMMAND [ARGS]...

  Manage access requests.

Options:
  --help  Show this message and exit.

Commands:
  touch   Create a request for read access.
  update  Update the request's approval workflow.
  ls      List current user's requests.
  cat     Show details of a specific request.

```





## Development Setup

```
python3 -m venv venv
source venv/bin/activate
pip install -r requirements.txt
pip install -r requirements-dev.txt
pip install -e .
```

## Test

* fixtures - data for testing environment

```
tests/fixtures/
└── custom_config
    └── config.yaml  # testing configuration

```

* test parameters

```
tests/
├── integration
│   └── conftest.py
└── unit
    └── conftest.py
```

* running tests

```

$ pytest --cov=gen3_util

 88%


```

* pre-commit tests

The pre-commit hooks run a reasonable set of checks, including the unit tests, prior to each commit. Install the hooks once, then run the checks on demand with:

```
$ pre-commit install

$ pre-commit run --all-files
debug statements (python)................................................Passed
check python ast.........................................................Passed
fix utf-8 byte order marker..............................................Passed
check json...........................................(no files to check)Skipped
detect private key.......................................................Passed
check yaml...............................................................Passed
check for added large files..............................................Passed
check that scripts with shebangs are executable..........................Passed
check for case conflicts.................................................Passed
fix end of files.........................................................Passed
trim trailing whitespace.................................................Passed
mixed line ending........................................................Passed
run our unit tests.......................................................Passed

```

## Distribution

- PyPI

```
# update pypi

# pypi credentials - see https://twine.readthedocs.io/en/stable/#environment-variables

export TWINE_USERNAME=  #  the username to use for authentication to the repository.
export TWINE_PASSWORD=  # the password to use for authentication to the repository.

# this could be maintained as so: export $(cat .env | xargs)

rm -r dist/
python3  setup.py sdist bdist_wheel
twine upload dist/*
```



            

Raw data

{
    "_id": null,
    "home_page": "https://github.com/ACED-IDP/gen3_util",
    "name": "gen3-util",
    "maintainer": "",
    "docs_url": null,
    "requires_python": ">=3.9, <4",
    "maintainer_email": "",
    "keywords": "gen3 bioinformatics",
    "author": "Ellrott Lab",
    "author_email": "",
    "download_url": "https://files.pythonhosted.org/packages/03/db/15a8b22cabb6504f2749f751638156331f8a336ea93d13f488f25a36c4fc/gen3_util-0.0.12.tar.gz",
    "platform": null,
    "bugtrack_url": null,
    "license": "",
    "summary": "Commons utilities",
    "version": "0.0.12",
    "project_urls": {
        "Bug Reports": "https://github.com/ACED-IDP/gen3_util/issues",
        "Homepage": "https://github.com/ACED-IDP/gen3_util",
        "Source": "https://github.com/ACED-IDP/gen3_util"
    },
    "split_keywords": [
        "gen3",
        "bioinformatics"
    ],
    "urls": [
        {
            "comment_text": "",
            "digests": {
                "blake2b_256": "ef787cdcbcb96165b607e247ef8af7dd91c19c8bfe9d09685c619588ba9d3c68",
                "md5": "898881940b982b9fa942b63f81bbb651",
                "sha256": "17282108b276f6b18fd8bb11899be381b0aa92f66d4f1ac9d5e015dad5150484"
            },
            "downloads": -1,
            "filename": "gen3_util-0.0.12-py3-none-any.whl",
            "has_sig": false,
            "md5_digest": "898881940b982b9fa942b63f81bbb651",
            "packagetype": "bdist_wheel",
            "python_version": "py3",
            "requires_python": ">=3.9, <4",
            "size": 70781,
            "upload_time": "2023-11-27T23:53:56",
            "upload_time_iso_8601": "2023-11-27T23:53:56.126290Z",
            "url": "https://files.pythonhosted.org/packages/ef/78/7cdcbcb96165b607e247ef8af7dd91c19c8bfe9d09685c619588ba9d3c68/gen3_util-0.0.12-py3-none-any.whl",
            "yanked": false,
            "yanked_reason": null
        },
        {
            "comment_text": "",
            "digests": {
                "blake2b_256": "03db15a8b22cabb6504f2749f751638156331f8a336ea93d13f488f25a36c4fc",
                "md5": "a47b37b08c7b8fed3f695e5b247fdd27",
                "sha256": "09c33b7dc63623040b99b276496eca97c30e2950dbdde7cfa0f8ad983bea5d01"
            },
            "downloads": -1,
            "filename": "gen3_util-0.0.12.tar.gz",
            "has_sig": false,
            "md5_digest": "a47b37b08c7b8fed3f695e5b247fdd27",
            "packagetype": "sdist",
            "python_version": "source",
            "requires_python": ">=3.9, <4",
            "size": 47777,
            "upload_time": "2023-11-27T23:53:59",
            "upload_time_iso_8601": "2023-11-27T23:53:59.047606Z",
            "url": "https://files.pythonhosted.org/packages/03/db/15a8b22cabb6504f2749f751638156331f8a336ea93d13f488f25a36c4fc/gen3_util-0.0.12.tar.gz",
            "yanked": false,
            "yanked_reason": null
        }
    ],
    "upload_time": "2023-11-27 23:53:59",
    "github": true,
    "gitlab": false,
    "bitbucket": false,
    "codeberg": false,
    "github_user": "ACED-IDP",
    "github_project": "gen3_util",
    "travis_ci": false,
    "coveralls": false,
    "github_actions": false,
    "requirements": [],
    "lcname": "gen3-util"
}
        