# Gen3 Utilities
Utilities to manage Gen3 schemas, projects and submissions.
## Installation
```
# optionally, create and activate a virtual environment
$ python3 -m venv venv ; source venv/bin/activate
$ pip install gen3_util
$ gen3_util
msg: Version 0.0.4
```
Note: requires the [`magic`](https://github.com/ahupp/python-magic#installation) library (libmagic). If it is not already installed, you will see a warning like this:
```text
Requires libmagic installed on your system to determine mime-types
Error: 'failed to find libmagic. Check your installation'
For installation instructions see https://github.com/ahupp/python-magic#installation
```
Note: some environments raise a `_ctypes` error. If you see one, install Python 3.9.16.
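Both notes can be checked up front. The following is a minimal sketch using only the standard library; `preflight` is a hypothetical helper and not part of gen3_util. It assumes that `ctypes.util.find_library("magic")` is a reasonable proxy for "libmagic is installed", which may not hold on every platform.

```python
import sys


def preflight() -> dict:
    """Report whether the prerequisites from the notes above look satisfied.

    Hypothetical helper for illustration; not part of gen3_util.
    """
    report = {
        "python_version": "%d.%d.%d" % sys.version_info[:3],
        "ctypes_available": False,
        "libmagic_found": False,
    }
    try:
        # `_ctypes` is the C extension whose absence causes the error in the note.
        import _ctypes  # noqa: F401
        from ctypes.util import find_library
    except ImportError:
        return report
    report["ctypes_available"] = True
    # find_library returns None when no shared library named "magic" is located.
    report["libmagic_found"] = find_library("magic") is not None
    return report


if __name__ == "__main__":
    for key, value in preflight().items():
        print(f"{key}: {value}")
```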
## Use
```
$ gen3_util --help
Usage: gen3_util [OPTIONS] COMMAND [ARGS]...

  Gen3 Management Utilities

Options:
  --config TEXT              Path to config file. GEN3_UTIL_CONFIG
  --format [yaml|json|text]  Result format. GEN3_UTIL_FORMAT  [default: yaml]
  --cred TEXT                See https://uc-cdis.github.io/gen3-user-
                             doc/appendices/api-gen3/#credentials-to-query-
                             the-api. GEN3_API_KEY
  --state_dir TEXT           Directory for file transfer state
                             GEN3_UTIL_STATE_DIR  [default: ~/.gen3/gen3_util]
  --help                     Show this message and exit.

Commands:
  projects  Manage Gen3 projects.
  buckets   Manage Gen3 buckets.
  meta      Manage meta data.
  files     Manage file buckets.
  access    Manage access requests.
  config    Configure this utility.
```
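If you want to call the utility from a script, a thin wrapper can place the global options before the subcommand, as the usage line requires. This is only a sketch: `build_command` and `run` are hypothetical helpers, and it assumes that `--format json` writes a single JSON document to stdout.

```python
import json
import subprocess
from typing import List, Optional


def build_command(args: List[str], fmt: str = "json",
                  cred: Optional[str] = None) -> List[str]:
    """Build an argv list with global options placed before the subcommand."""
    command = ["gen3_util", "--format", fmt]
    if cred:
        command += ["--cred", cred]
    return command + list(args)


def run(args: List[str], cred: Optional[str] = None) -> dict:
    """Run gen3_util and parse its result.

    Requires gen3_util on PATH. Assumes (not confirmed by this README) that
    `--format json` emits one JSON document on stdout.
    """
    completed = subprocess.run(
        build_command(args, "json", cred),
        capture_output=True, text=True, check=True,
    )
    return json.loads(completed.stdout)
```

For example, `run(["projects", "ping"])` would execute `gen3_util --format json projects ping`.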
## Connectivity
* Leverages Gen3Auth; see [Credentials to query the API](https://uc-cdis.github.io/gen3-user-doc/appendices/api-gen3/#credentials-to-query-the-api).
* Store the `credentials.json` file in `~/.gen3/credentials.json`, or specify its location with either the `GEN3_API_KEY` environment variable or the `--cred` parameter.
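The three credential locations can be thought of as a lookup chain. The sketch below is illustrative only: `resolve_cred_path` is a hypothetical function, and the precedence shown (`--cred`, then `GEN3_API_KEY`, then the default path) is an assumption based on how command-line options usually override environment variables, not something this README states.

```python
import os
from pathlib import Path
from typing import Mapping, Optional

DEFAULT_CRED_PATH = "~/.gen3/credentials.json"


def resolve_cred_path(cli_cred: Optional[str] = None,
                      environ: Optional[Mapping[str, str]] = None) -> Path:
    """Pick the credentials file location.

    Assumed precedence: explicit --cred value, then the GEN3_API_KEY
    environment variable, then the default path.
    """
    environ = os.environ if environ is None else environ
    chosen = cli_cred or environ.get("GEN3_API_KEY") or DEFAULT_CRED_PATH
    return Path(chosen).expanduser()


if __name__ == "__main__":
    path = resolve_cred_path()
    print(f"{path} (exists: {path.is_file()})")
```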
## Use cases
> I need to verify connectivity.
```
$ gen3_util projects ping
msg: OK connected to endpoint https://aced-training.compbio.ohsu.edu
```
> I need to see what projects exist
```
$ gen3_util projects ls
endpoint: https://aced-training.compbio.ohsu.edu
msg: OK
projects:
- /programs
- /programs/aced
- /programs/aced/project
- /programs/aced/project/MCF10A
- /programs/aced/projects
- /programs/aced/projects/Alcoholism
- /programs/aced/projects/Alzheimers
- /programs/aced/projects/Breast_Cancer
- /programs/aced/projects/Colon_Cancer
- /programs/aced/projects/Diabetes
- /programs/aced/projects/HOP
- /programs/aced/projects/Lung_Cancer
- /programs/aced/projects/MCF10A
- /programs/aced/projects/NVIDIA
- /programs/aced/projects/Prostate_Cancer
- /programs/aced/projects/ohsu_download_testing
```
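The listing mixes container paths (such as `/programs/aced/projects`) with the projects themselves. A small filter, sketched here with a hypothetical `project_names` helper, keeps only the entries directly under one prefix; the prefix is taken from the sample output above.

```python
from typing import Iterable, List

# Prefix taken from the sample listing; adjust for other programs.
PROJECT_PREFIX = "/programs/aced/projects/"


def project_names(paths: Iterable[str], prefix: str = PROJECT_PREFIX) -> List[str]:
    """Return the names of entries that sit directly under `prefix`."""
    names = []
    for path in paths:
        if not path.startswith(prefix):
            continue
        name = path[len(prefix):]
        # Skip empty names and anything nested more deeply.
        if name and "/" not in name:
            names.append(name)
    return names


if __name__ == "__main__":
    listing = [
        "/programs",
        "/programs/aced",
        "/programs/aced/projects",
        "/programs/aced/projects/Alcoholism",
        "/programs/aced/projects/HOP",
    ]
    print(project_names(listing))
```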
> I need to see what buckets are associated with the commons
```
$ gen3_util buckets ls
buckets:
  GS_BUCKETS: {}
  S3_BUCKETS:
    aced-default:
      endpoint_url: https://minio-default.compbio.ohsu.edu
      region: us-east-1
    aced-manchester:
      endpoint_url: https://minio-manchester.compbio.ohsu.edu
      region: us-east-1
    aced-ohsu:
      endpoint_url: https://minio-ohsu.compbio.ohsu.edu
      region: us-east-1
    aced-stanford:
      endpoint_url: https://minio-stanford.compbio.ohsu.edu
      region: us-east-1
    aced-ucl:
      endpoint_url: https://minio-ucl.compbio.ohsu.edu
      region: us-east-1
endpoint: https://aced-training.compbio.ohsu.edu
msg: OK
```
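To use this result in a script, you could request JSON and flatten it into a bucket-to-endpoint map. The sketch below assumes the JSON form mirrors the YAML structure shown above; `bucket_endpoints` is a hypothetical helper and the sample is abbreviated from that output.

```python
import json

# Abbreviated sample; assumes `--format json` mirrors the YAML structure above.
SAMPLE = """
{
  "buckets": {
    "GS_BUCKETS": {},
    "S3_BUCKETS": {
      "aced-default": {
        "endpoint_url": "https://minio-default.compbio.ohsu.edu",
        "region": "us-east-1"
      },
      "aced-ohsu": {
        "endpoint_url": "https://minio-ohsu.compbio.ohsu.edu",
        "region": "us-east-1"
      }
    }
  },
  "endpoint": "https://aced-training.compbio.ohsu.edu",
  "msg": "OK"
}
"""


def bucket_endpoints(result: dict) -> dict:
    """Map each bucket name to its endpoint URL, across all providers."""
    endpoints = {}
    for provider_buckets in result.get("buckets", {}).values():
        for name, settings in provider_buckets.items():
            endpoints[name] = settings.get("endpoint_url")
    return endpoints


if __name__ == "__main__":
    for name, url in bucket_endpoints(json.loads(SAMPLE)).items():
        print(f"{name}: {url}")
```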
> I need to create a project
```text
$ gen3_util projects touch aced-MyExperiment
projects:
  aced-MyExperiment:
    exists: true
messages:
- Created program:aced Program is updated!
```
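The output suggests that a project ID such as `aced-MyExperiment` combines a program (`aced`) and a project name. The sketch below makes that reading explicit; it assumes the program is everything before the first hyphen and that the resource path follows the pattern seen in `projects ls`. Both `split_project_id` and `resource_path` are hypothetical helpers, not gen3_util functions.

```python
from typing import Tuple


def split_project_id(project_id: str) -> Tuple[str, str]:
    """Split '<program>-<project>' at the first hyphen (assumed convention)."""
    program, separator, project = project_id.partition("-")
    if not separator or not program or not project:
        raise ValueError(f"expected '<program>-<project>', got {project_id!r}")
    return program, project


def resource_path(project_id: str) -> str:
    """Build a path in the style shown by `gen3_util projects ls`."""
    program, project = split_project_id(project_id)
    return f"/programs/{program}/projects/{project}"


if __name__ == "__main__":
    print(resource_path("aced-MyExperiment"))
```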
> I need to assign default policies to that project
```text
$ gen3_util projects add policies aced-MyExperiment
msg: Approve these requests to assign default policies to aced-MyExperiment
commands:
- gen3_util access update 24f047d7-0e7c-43c6-bab6-61e2d385c71a SIGNED
- gen3_util access update 293c6cd1-7ab7-420f-bafb-34319589eac4 SIGNED
```
> I need to add a user to that project
```text
$ gen3_util projects add user aced-MyExperiment linus.pauling@osu.edu
msg: Approve these requests to add linus.pauling@osu.edu to aced-MyExperiment
commands:
- gen3_util access update 293c6cd1-7ab7-420f-bafb-34319589eac4 SIGNED
```
> Before proceeding, I need to sign those requests
```text
gen3_util access update xxxxxx SIGNED
```
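When several requests are pending, the IDs can be pulled out of the earlier command output instead of being copied by hand. This is a minimal sketch; `request_ids` is a hypothetical helper that relies only on the `access update <id> SIGNED` lines shown in the output above.

```python
import re
from typing import List

# Matches the request ID in lines like "- gen3_util access update <uuid> SIGNED".
REQUEST_PATTERN = re.compile(
    r"access update ([0-9a-f]{8}(?:-[0-9a-f]{4}){3}-[0-9a-f]{12}) SIGNED"
)

SAMPLE_OUTPUT = """\
msg: Approve these requests to assign default policies to aced-MyExperiment
commands:
- gen3_util access update 24f047d7-0e7c-43c6-bab6-61e2d385c71a SIGNED
- gen3_util access update 293c6cd1-7ab7-420f-bafb-34319589eac4 SIGNED
"""


def request_ids(output: str) -> List[str]:
    """Return the unique request IDs in the order they first appear."""
    seen: List[str] = []
    for request_id in REQUEST_PATTERN.findall(output):
        if request_id not in seen:
            seen.append(request_id)
    return seen


if __name__ == "__main__":
    for request_id in request_ids(SAMPLE_OUTPUT):
        print(f"gen3_util access update {request_id} SIGNED")
```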
> I want to create a simple project structure with a set of files
```text
$ gen3_util meta import dir tests/fixtures/dir_to_study/ tmp/foo --project_id aced-MyExperiment
summary:
  ResearchStudy:
    count: 1
  DocumentReference:
    count: 5
    size: 6013814
msg: OK
```
> I need to do something a bit more complex: I want to create a project structure with a set of files, but I need to specify the `Patient` and `Specimen` based on the path of each file.
```text
$ gen3_util meta import dir tests/fixtures/dir_to_study_with_meta/ tmp/foometa --project_id aced-foometa --plugin_path ./tests/unit/plugins

tests/fixtures/dir_to_study_with_meta/
├── file-2.csv
├── p1
│   ├── s1
│   │   └── file-3.pdf
│   ├── s2
│   │   └── file-4.tsv
│   └── s3
│       └── file-5
└── p2
    └── s4
        └── file-1.txt

Will produce the following meta data:

summary:
  ResearchStudy:
    count: 1
  Patient:
    count: 2
  Specimen:
    count: 4
  DocumentReference:
    count: 5
    size: 6013814
```
For more see [test_meta_plugin](./tests/unit/meta/test_plugins.py)
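The path-to-identifier idea can be illustrated independently of the plugin mechanism. The sketch below is not the gen3_util plugin API (see the linked test for that); `patient_and_specimen` is a hypothetical function that assumes the `<patient>/<specimen>/<file>` layout of the fixture directory above.

```python
from pathlib import PurePosixPath
from typing import Optional, Tuple


def patient_and_specimen(relative_path: str) -> Tuple[Optional[str], Optional[str]]:
    """Derive (patient, specimen) from '<patient>/<specimen>/<file>'.

    Files that are not nested two levels deep get no patient or specimen.
    """
    parts = PurePosixPath(relative_path).parts
    if len(parts) >= 3:
        return parts[0], parts[1]
    return None, None


# Relative paths from the fixture directory shown above.
FILES = [
    "file-2.csv",
    "p1/s1/file-3.pdf",
    "p1/s2/file-4.tsv",
    "p1/s3/file-5",
    "p2/s4/file-1.txt",
]

if __name__ == "__main__":
    pairs = [patient_and_specimen(path) for path in FILES]
    patients = {patient for patient, _ in pairs if patient}
    specimens = {(patient, specimen) for patient, specimen in pairs if specimen}
    print(f"Patient: {len(patients)}, Specimen: {len(specimens)}, files: {len(FILES)}")
```

With the fixture layout, this yields two patients, four specimens and five files, which agrees with the counts in the summary.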
> I need to upload those files to the instance
```
$ gen3_util files cp --ignore_state --project_id aced-MyExperiment tmp/foo/DocumentReference.ndjson bucket://aced-development-ohsu-data-bucket
100%|██████████████████████████████████████████████████████████████████████████████████████████████████████| 5.74M/5.74M [00:03<00:00, 1.71MB/s, elapsed=0:00:02.056022, file=6f8101]
info:
- Wrote state to ~/.gen3/gen3-util-state/state.ndjson
msg: OK
```
> I need to upload the meta data about those files to the instance
```
$ gen3_util meta cp tmp/foo bucket://aced-development-ohsu-data-bucket --project_id aced-MyExperiment
msg: Uploaded /var/folders/2c/hffqqtr94nv64tjy0xrl38r89k1sty/T/tmpacozhhoo/_aced-MyExperiment_meta.zip
```
> I need to request or manage access to a project
```
$ gen3_util access
Usage: gen3_util access [OPTIONS] COMMAND [ARGS]...

  Manage access requests.

Options:
  --help  Show this message and exit.

Commands:
  touch   Create a request for read access.
  update  Update the request's approval workflow.
  ls      List current user's requests.
  cat     Show details of a specific request.
```
## Development Setup
```
python3 -m venv venv
source venv/bin/activate
pip install -r requirements.txt
pip install -r requirements-dev.txt
pip install -e .
```
## Test
* fixtures - data for testing environment
```
tests/fixtures/
└── custom_config
    └── config.yaml  # testing configuration
```
* test parameters
```
tests/
├── integration
│   └── conftest.py
└── unit
    └── conftest.py
```
* running tests
```
$ pytest --cov=gen3_util
88%
```
* pre-commit tests
A reasonable set of checks, including the unit tests, runs prior to each commit. You can also run these checks on demand:
```
$ pre-commit install
$ pre-commit run --all-files
debug statements (python)................................................Passed
check python ast.........................................................Passed
fix utf-8 byte order marker..............................................Passed
check json...........................................(no files to check)Skipped
detect private key.......................................................Passed
check yaml...............................................................Passed
check for added large files..............................................Passed
check that scripts with shebangs are executable..........................Passed
check for case conflicts.................................................Passed
fix end of files.........................................................Passed
trim trailing whitespace.................................................Passed
mixed line ending........................................................Passed
run our unit tests.......................................................Passed
```
## Distribution
- PyPI
```
# update pypi
# pypi credentials - see https://twine.readthedocs.io/en/stable/#environment-variables
export TWINE_USERNAME= # the username to use for authentication to the repository.
export TWINE_PASSWORD= # the password to use for authentication to the repository.
# this could be maintained as so: export $(cat .env | xargs)
rm -r dist/
python3 setup.py sdist bdist_wheel
twine upload dist/*
```
Raw data
{
"_id": null,
"home_page": "https://github.com/ACED-IDP/gen3_util",
"name": "gen3-util",
"maintainer": "",
"docs_url": null,
"requires_python": ">=3.9, <4",
"maintainer_email": "",
"keywords": "gen3 bioinformatics",
"author": "Ellrott Lab",
"author_email": "",
"download_url": "https://files.pythonhosted.org/packages/03/db/15a8b22cabb6504f2749f751638156331f8a336ea93d13f488f25a36c4fc/gen3_util-0.0.12.tar.gz",
"platform": null,
"description": "\n# Gen3 Utilities\n\nUtilities to manage Gen3 schemas, projects and submissions.\n\n## Installation\n\n```\n\n# optionally\n$python3 -m venv venv ; source venv/bin/activate\n\npip install gen3_util\n\n$ gen3_util\nmsg: Version 0.0.4\n\n\n```\n\nNote: requires [`magic`](https://github.com/ahupp/python-magic#installation) library. If it is not already installed you will see a warning like this:\n\n```text\nRequires libmagic installed on your system to determine mime-types\nError: 'failed to find libmagic. Check your installation'\nFor installation instructions see https://github.com/ahupp/python-magic#installation\n```\n\nNote: some environments will get a `_ctypes` error. Please install 3.9.16\n\n## Use\n\n```\n$gen3_util --help\nUsage: gen3_util [OPTIONS] COMMAND [ARGS]...\n\n Gen3 Management Utilities\n\nOptions:\n --config TEXT Path to config file. GEN3_UTIL_CONFIG\n --format [yaml|json|text] Result format. GEN3_UTIL_FORMAT [default: yaml]\n --cred TEXT See https://uc-cdis.github.io/gen3-user-\n doc/appendices/api-gen3/#credentials-to-query-\n the-api. 
GEN3_API_KEY\n --state_dir TEXT Directory for file transfer state\n GEN3_UTIL_STATE_DIR [default: ~/.gen3/gen3_util]\n --help Show this message and exit.\n\nCommands:\n projects Manage Gen3 projects.\n buckets Manage Gen3 buckets.\n meta Manage meta data.\n files Manage file buckets.\n access Manage access requests.\n config Configure this utility.\n\n\n```\n\n## Connectivity\n\n* Leverages Gen3Auth [See](https://uc-cdis.github.io/gen3-user-doc/appendices/api-gen3/#credentials-to-query-the-api.)\n* Store the `credentials.json` file in ~/.gen3/credentials.json or specify location with either env[GEN3_API_KEY], or `--cred` parameter\n\n## Use cases\n\n> I need to verify connectivity.\n\n```\n$ gen3_util projects ping\nmsg: OK connected to endpoint https://aced-training.compbio.ohsu.edu\n```\n\n> I need to see what projects exist\n\n```\n$ gen3_util projects ls\n\nendpoint: https://aced-training.compbio.ohsu.edu\nmsg: OK\nprojects:\n- /programs\n- /programs/aced\n- /programs/aced/project\n- /programs/aced/project/MCF10A\n- /programs/aced/projects\n- /programs/aced/projects/Alcoholism\n- /programs/aced/projects/Alzheimers\n- /programs/aced/projects/Breast_Cancer\n- /programs/aced/projects/Colon_Cancer\n- /programs/aced/projects/Diabetes\n- /programs/aced/projects/HOP\n- /programs/aced/projects/Lung_Cancer\n- /programs/aced/projects/MCF10A\n- /programs/aced/projects/NVIDIA\n- /programs/aced/projects/Prostate_Cancer\n- /programs/aced/projects/ohsu_download_testing\n```\n\n> I need to see what buckets are associated with the commons\n\n```\n$ gen3_util buckets ls\nbuckets:\n GS_BUCKETS: {}\n S3_BUCKETS:\n aced-default:\n endpoint_url: https://minio-default.compbio.ohsu.edu\n region: us-east-1\n aced-manchester:\n endpoint_url: https://minio-manchester.compbio.ohsu.edu\n region: us-east-1\n aced-ohsu:\n endpoint_url: https://minio-ohsu.compbio.ohsu.edu\n region: us-east-1\n aced-stanford:\n endpoint_url: https://minio-stanford.compbio.ohsu.edu\n region: us-east-1\n 
aced-ucl:\n endpoint_url: https://minio-ucl.compbio.ohsu.edu\n region: us-east-1\nendpoint: https://aced-training.compbio.ohsu.edu\nmsg: OK\n\n\n```\n\n> I need to create a project\n\n```text\n$ gen3_util projects touch aced-MyExperiment\nprojects:\n aced-MyExperiment:\n exists: true\nmessages:\n- Created program:aced Program is updated!\n\n```\n\n> I need to assign default policies to that project\n\n```text\n$ gen3_util projects add policies aced-MyExperiment\nmsg: Approve these requests to assign default policies to aced-MyExperiment\ncommands:\n- gen3_util access update 24f047d7-0e7c-43c6-bab6-61e2d385c71a SIGNED\n- gen3_util access update 293c6cd1-7ab7-420f-bafb-34319589eac4 SIGNED\n\n```\n\n> I need to add a user to that project\n\n```text\n$ gen3_util projects add user aced-MyExperiment linus.pauling@osu.edu\nmsg: Approve these requests to add linus.pauling@osu.edu to aced-MyExperiment\ncommands:\n- gen3_util access update 293c6cd1-7ab7-420f-bafb-34319589eac4 SIGNED\n\n```\n\n> Before proceeding, I need to sign those equests\n\n```text\ngen3_util access update xxxxxx SIGNED\n```\n\n\n\n> I want to create a simple project structure with a set of files\n\n```text\n$ gen3_util meta import dir tests/fixtures/dir_to_study/ tmp/foo --project_id aced-MyExperiment\nsummary:\n ResearchStudy:\n count: 1\n DocumentReference:\n count: 5\n size: 6013814\nmsg: OK\n\n```\n\n> I want need to do something a bit more complex, for example, I want to create a project structure with a set of files, but I need to specify the `Patient` and `Specimen` based on the path of the file.\n\n```text\ngen3_util meta import dir tests/fixtures/dir_to_study_with_meta/ tmp/foometa --project_id aced-foometa --plugin_path ./tests/unit/plugins\n\ntests/fixtures/dir_to_study_with_meta/\n\u251c\u2500\u2500 file-2.csv\n\u251c\u2500\u2500 p1\n\u2502\u00a0\u00a0 \u251c\u2500\u2500 s1\n\u2502\u00a0\u00a0 \u2502\u00a0\u00a0 \u2514\u2500\u2500 file-3.pdf\n\u2502\u00a0\u00a0 \u251c\u2500\u2500 
s2\n\u2502\u00a0\u00a0 \u2502\u00a0\u00a0 \u2514\u2500\u2500 file-4.tsv\n\u2502\u00a0\u00a0 \u2514\u2500\u2500 s3\n\u2502\u00a0\u00a0 \u2514\u2500\u2500 file-5\n\u2514\u2500\u2500 p2\n \u2514\u2500\u2500 s4\n \u2514\u2500\u2500 file-1.txt\n\nWill produce the following meta data:\n\nsummary:\n ResearchStudy:\n count: 1\n Patient:\n count: 2\n Specimen:\n count: 4\n DocumentReference:\n count: 5\n size: 6013814\n\n```\n\nFor more see [test_meta_plugin](./tests/unit/meta/test_plugins.py)\n\n\n\n> I need to upload those files to the instance\n\n```\n$ gen3_util files cp --ignore_state --project_id aced-MyExperiment tmp/foo/DocumentReference.ndjson bucket://aced-development-ohsu-data-bucket\n100%|\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588| 5.74M/5.74M [00:03<00:00, 1.71MB/s, elapsed=0:00:02.056022, file=6f8101]\ninfo:\n- Wrote state to ~/.gen3/gen3-util-state/state.ndjson\nmsg: OK\n```\n\n\n> I need to upload the meta data about those files to the instance\n\n```\n$gen3_util meta cp tmp/foo bucket://aced-development-ohsu-data-bucket --project_id aced-MyExperiment\nmsg: Uploaded /var/folders/2c/hffqqtr94nv64tjy0xrl38r89k1sty/T/tmpacozhhoo/_aced-MyExperiment_meta.zip\n```\n\n\n> I need to request or manage access to a project\n\n```\n$ gen3_util access\nUsage: gen3_util access [OPTIONS] COMMAND [ARGS]...\n\n Manage access requests.\n\nOptions:\n --help Show this message and exit.\n\nCommands:\n touch Create a 
request for read access.\n update Update the request's approval workflow.\n ls List current user's requests.\n cat Show details of a specific request.\n\n```\n\n\n\n\n\n## Development Setup\n\n```\npython3 -m venv venv\nsource venv/bin/activate\npip install -r requirements.txt\npip install -r requirements-dev.txt\npip install -e .\n```\n\n## Test\n\n* fixtures - data for testing environment\n\n```\ntests/fixtures/\n\u2514\u2500\u2500 custom_config\n \u2514\u2500\u2500 config.yaml # testing configuration\n\n```\n\n* test parameters\n\n```\ntests/\n\u251c\u2500\u2500 integration\n\u2502 \u2514\u2500\u2500 conftest.py\n\u2514\u2500\u2500 unit\n \u2514\u2500\u2500 conftest.py\n```\n\n* running tests\n\n```\n\n$ pytest --cov=gen3_util\n\n 88%\n\n\n```\n\n* pre commit tests\n\nA reasonable set of checks, including running unit tests prior to each commit. You can run these tests on demand by:\n\n```\n$ pre-commit install\n\n$ pre-commit run --all-files\ndebug statements (python)................................................Passed\ncheck python ast.........................................................Passed\nfix utf-8 byte order marker..............................................Passed\ncheck json...........................................(no files to check)Skipped\ndetect private key.......................................................Passed\ncheck yaml...............................................................Passed\ncheck for added large files..............................................Passed\ncheck that scripts with shebangs are executable..........................Passed\ncheck for case conflicts.................................................Passed\nfix end of files.........................................................Passed\ntrim trailing whitespace.................................................Passed\nmixed line ending........................................................Passed\nrun our unit 
tests.......................................................Passed\n\n```\n\n## Distribution\n\n- PyPi\n\n```\n# update pypi\n\n# pypi credentials - see https://twine.readthedocs.io/en/stable/#environment-variables\n\nexport TWINE_USERNAME= # the username to use for authentication to the repository.\nexport TWINE_PASSWORD= # the password to use for authentication to the repository.\n\n# this could be maintained as so: export $(cat .env | xargs)\n\nrm -r dist/\npython3 setup.py sdist bdist_wheel\ntwine upload dist/*\n```\n\n\n",
"bugtrack_url": null,
"license": "",
"summary": "Commons utilities",
"version": "0.0.12",
"project_urls": {
"Bug Reports": "https://github.com/ACED-IDP/gen3_util/issues",
"Homepage": "https://github.com/ACED-IDP/gen3_util",
"Source": "https://github.com/ACED-IDP/gen3_util"
},
"split_keywords": [
"gen3",
"bioinformatics"
],
"urls": [
{
"comment_text": "",
"digests": {
"blake2b_256": "ef787cdcbcb96165b607e247ef8af7dd91c19c8bfe9d09685c619588ba9d3c68",
"md5": "898881940b982b9fa942b63f81bbb651",
"sha256": "17282108b276f6b18fd8bb11899be381b0aa92f66d4f1ac9d5e015dad5150484"
},
"downloads": -1,
"filename": "gen3_util-0.0.12-py3-none-any.whl",
"has_sig": false,
"md5_digest": "898881940b982b9fa942b63f81bbb651",
"packagetype": "bdist_wheel",
"python_version": "py3",
"requires_python": ">=3.9, <4",
"size": 70781,
"upload_time": "2023-11-27T23:53:56",
"upload_time_iso_8601": "2023-11-27T23:53:56.126290Z",
"url": "https://files.pythonhosted.org/packages/ef/78/7cdcbcb96165b607e247ef8af7dd91c19c8bfe9d09685c619588ba9d3c68/gen3_util-0.0.12-py3-none-any.whl",
"yanked": false,
"yanked_reason": null
},
{
"comment_text": "",
"digests": {
"blake2b_256": "03db15a8b22cabb6504f2749f751638156331f8a336ea93d13f488f25a36c4fc",
"md5": "a47b37b08c7b8fed3f695e5b247fdd27",
"sha256": "09c33b7dc63623040b99b276496eca97c30e2950dbdde7cfa0f8ad983bea5d01"
},
"downloads": -1,
"filename": "gen3_util-0.0.12.tar.gz",
"has_sig": false,
"md5_digest": "a47b37b08c7b8fed3f695e5b247fdd27",
"packagetype": "sdist",
"python_version": "source",
"requires_python": ">=3.9, <4",
"size": 47777,
"upload_time": "2023-11-27T23:53:59",
"upload_time_iso_8601": "2023-11-27T23:53:59.047606Z",
"url": "https://files.pythonhosted.org/packages/03/db/15a8b22cabb6504f2749f751638156331f8a336ea93d13f488f25a36c4fc/gen3_util-0.0.12.tar.gz",
"yanked": false,
"yanked_reason": null
}
],
"upload_time": "2023-11-27 23:53:59",
"github": true,
"gitlab": false,
"bitbucket": false,
"codeberg": false,
"github_user": "ACED-IDP",
"github_project": "gen3_util",
"travis_ci": false,
"coveralls": false,
"github_actions": false,
"requirements": [],
"lcname": "gen3-util"
}