ckanext-dcor_depot
==================
|PyPI Version| |Build Status| |Coverage Status|
This plugin manages how data are stored in DCOR. There are two types of
files in DCOR:
1. Resources uploaded by users, imported from figshare, or
imported from a data archive
2. Ancillary files that are generated upon resource creation, such as
condensed DC data, preview images (see
`ckanext-dc_view <https://github.com/DCOR-dev/ckanext-dc_view>`_).
This plugin implements:
- Data storage management. All resources uploaded by a user are moved
to ``/data/users-HOSTNAME/USERNAME-ORGNAME/PK/ID/PKGNAME_RESID_RESNAME``
and symlinks are created in ``/data/ckan-HOSTNAME/resources/RES/OUR/CEID``
via a background job.
CKAN itself will not notice this. The idea is to provide a filesystem-level
overview of each user's datasets.
- A background job that uploads resources to S3 in ``after_resource_create``
if the resources were uploaded via the legacy upload route.
- A background job that backs up resources from S3 to local block storage
if the resources were uploaded via the S3 upload route.
- Import datasets from figshare. Existing datasets from figshare are
downloaded to the ``/data/depots/figshare`` directory and, upon resource
creation, symlinked there from ``/data/ckan-HOSTNAME/resources/RES/OUR/CEID``
(note that this is an exception to the data storage management described
above). When running the following command, the "figshare-import" organization
is created and the datasets listed in ``figshare_dois.txt`` are added to CKAN:
::
ckan import-figshare
- CLI for symlinking datasets that have failed to symlink before:
::
ckan run-jobs-dcor-depot
- CLI for appending a resource to a dataset:
::
ckan append-resource /path/to/file dataset_id --delete-source
Please make sure that the necessary file permissions are given in ``/data``.
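The move-and-symlink step described above can be sketched in a few lines of Python. This is a minimal illustration and not the plugin's actual code: the function names are hypothetical, and it assumes that ``RES/OUR/CEID`` stands for the resource ID split into its first three characters, the next three, and the remainder, which is how CKAN lays out uploaded files below its storage path.

```python
import shutil
from pathlib import Path


def ckan_resource_path(storage_path, resource_id):
    """Return the assumed CKAN location of a resource ("RES/OUR/CEID").

    The resource ID is split into its first three characters, the
    next three characters, and the remainder.
    """
    rid = str(resource_id)
    return Path(storage_path) / "resources" / rid[:3] / rid[3:6] / rid[6:]


def move_and_symlink(storage_path, resource_id, depot_file):
    """Move an uploaded resource to `depot_file` and symlink it back.

    Hypothetical helper: the file is moved out of the CKAN storage
    tree, and a symlink is left at the original location so that
    CKAN still finds the resource there.
    """
    src = ckan_resource_path(storage_path, resource_id)
    depot_file = Path(depot_file)
    depot_file.parent.mkdir(parents=True, exist_ok=True)
    shutil.move(str(src), str(depot_file))
    src.symlink_to(depot_file)
    return src
```

The process running such a job needs write access to both directory trees, which is why the file permissions in ``/data`` matter.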
In 2023, it was decided to replace DCOR's large block storage with an
S3-compatible object store, because block storage does not scale well.
This partially deprecates some of the commands above,
which might be removed or modified to support object storage directly.
- CLI for migrating data from block storage to an S3-compatible object storage
service. For this, the following configuration keys must be specified in
the ``ckan.ini`` file::
dcor_object_store.access_key_id = ACCESS_KEY_ID
dcor_object_store.secret_access_key = SECRET_ACCESS_KEY
dcor_object_store.endpoint_url = S3_ENDPOINT_URL
dcor_object_store.ssl_verify = true
# The bucket name is by default defined by the circle ID. Resources
# are stored in the "RES/OUR/CEID-SCHEME" in that bucket.
dcor_object_store.bucket_name = circle-{organization_id}
Usage::
ckan dcor-migrate-resources-to-object-store
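To illustrate how the configuration keys above fit together, the following sketch reads them with Python's standard ``configparser`` module and derives a bucket name for a circle. It is an assumption-laden example and not the extension's own code: the function names, the ``[app:main]`` section, the example endpoint URL, and the use of ``str.format`` to fill in ``{organization_id}`` are all illustrative.

```python
import configparser

# Example values only; the endpoint URL is a placeholder.
EXAMPLE_INI = """
[app:main]
dcor_object_store.access_key_id = ACCESS_KEY_ID
dcor_object_store.secret_access_key = SECRET_ACCESS_KEY
dcor_object_store.endpoint_url = https://s3.example.org
dcor_object_store.ssl_verify = true
dcor_object_store.bucket_name = circle-{organization_id}
"""


def load_object_store_settings(ini_text, section="app:main"):
    """Collect the dcor_object_store.* keys from an INI string."""
    # Interpolation is disabled so that values are returned verbatim.
    parser = configparser.ConfigParser(interpolation=None)
    parser.read_string(ini_text)
    sec = parser[section]
    prefix = "dcor_object_store."
    return {
        "access_key_id": sec[prefix + "access_key_id"],
        "secret_access_key": sec[prefix + "secret_access_key"],
        "endpoint_url": sec[prefix + "endpoint_url"],
        "ssl_verify": sec.getboolean(prefix + "ssl_verify"),
        "bucket_name": sec[prefix + "bucket_name"],
    }


def bucket_for_circle(settings, organization_id):
    """Fill the circle (organization) ID into the bucket name template."""
    return settings["bucket_name"].format(organization_id=organization_id)


if __name__ == "__main__":
    settings = load_object_store_settings(EXAMPLE_INI)
    print(bucket_for_circle(settings, "my-circle-id"))
```

With the default template, every circle ends up with its own bucket, so the bucket name alone identifies the circle a resource belongs to.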
Installation
------------
::
pip install ckanext-dcor_depot
Add this extension to the plugins and default_views in ``ckan.ini``:
::
ckan.plugins = [...] dcor_depot
ckan.storage_path=/data/ckan-HOSTNAME
ckanext.dcor_depot.depots_path=/data/depots
ckanext.dcor_depot.users_depot_name=users-HOSTNAME
This plugin stores resources in ``/data``:
::
mkdir -p /data/depots/users-$(hostname)
chown -R www-data /data/depots/users-$(hostname)
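After creating the directory, it can be useful to confirm that the web server user can actually write to it. The following check is a small sketch with a hypothetical function name; it only tests the permissions of the process that runs it, so it would have to be executed as the same user that runs CKAN (``www-data`` in the commands above).

```python
import os
from pathlib import Path


def depot_is_writable(path):
    """Return True if `path` is an existing directory in which the
    current process may create and access files."""
    p = Path(path)
    return p.is_dir() and os.access(p, os.W_OK | os.X_OK)
```

For example, ``depot_is_writable("/data/depots")`` should return ``True`` before the first resource is uploaded.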
Testing
-------
If CKAN/DCOR is installed and set up for testing, this extension can
be tested with pytest:
::
pytest ckanext
Testing can also be done via vagrant in a virtual machine using the
`dcor-test <https://app.vagrantup.com/paulmueller/boxes/dcor-test/>`_ image.
Make sure that ``vagrant`` and ``virtualbox`` are installed and run the
following commands in the root of this repository:
::
# Set up the virtual machine using `Vagrantfile`
vagrant up
# Run the tests
vagrant ssh -- sudo bash /testing/vagrant-run-tests.sh
.. |PyPI Version| image:: https://img.shields.io/pypi/v/ckanext.dcor_depot.svg
:target: https://pypi.python.org/pypi/ckanext.dcor_depot
.. |Build Status| image:: https://img.shields.io/github/actions/workflow/status/DCOR-dev/ckanext-dcor_depot/check.yml
:target: https://github.com/DCOR-dev/ckanext-dcor_depot/actions?query=workflow%3AChecks
.. |Coverage Status| image:: https://img.shields.io/codecov/c/github/DCOR-dev/ckanext-dcor_depot
:target: https://codecov.io/gh/DCOR-dev/ckanext-dcor_depot
Raw data
{
"_id": null,
"home_page": null,
"name": "ckanext-dcor-depot",
"maintainer": null,
"docs_url": null,
"requires_python": "<4,>=3.8",
"maintainer_email": "Paul M\u00fcller <dev@craban.de>",
"keywords": "DC, DCOR, deformability, cytometry",
"author": "Paul M\u00fcller",
"author_email": null,
"download_url": "https://files.pythonhosted.org/packages/b8/4f/358d56bc49d539158e45e99b1406c404a22cd38094e040f525accf64fe0c/ckanext_dcor_depot-0.15.3.tar.gz",
"platform": null,
"bugtrack_url": null,
"license": "GNU Affero General Public License v3 or later (AGPLv3+)",
"summary": "Manages data storage for DCOR",
"version": "0.15.3",
"project_urls": {
"changelog": "https://github.com/DCOR-dev/ckanext-dcor_depot/blob/main/CHANGELOG",
"source": "https://github.com/DCOR-dev/ckanext-dcor_depot",
"tracker": "https://github.com/DCOR-dev/ckanext-dcor_depot/issues"
},
"split_keywords": [
"dc",
" dcor",
" deformability",
" cytometry"
],
"urls": [
{
"comment_text": "",
"digests": {
"blake2b_256": "2439a0ceb41684fe24f20725fd2df6afdf7b07f5302774f6223aebcd440ef31b",
"md5": "807cf5a8957c7db13d49f947d100fb7d",
"sha256": "26de9a5039236a83ca353ddeb0a0bb7c3284b75abf20cc9419708f0a33a10b9c"
},
"downloads": -1,
"filename": "ckanext_dcor_depot-0.15.3-py3-none-any.whl",
"has_sig": false,
"md5_digest": "807cf5a8957c7db13d49f947d100fb7d",
"packagetype": "bdist_wheel",
"python_version": "py3",
"requires_python": "<4,>=3.8",
"size": 4298357,
"upload_time": "2024-10-03T08:55:15",
"upload_time_iso_8601": "2024-10-03T08:55:15.505362Z",
"url": "https://files.pythonhosted.org/packages/24/39/a0ceb41684fe24f20725fd2df6afdf7b07f5302774f6223aebcd440ef31b/ckanext_dcor_depot-0.15.3-py3-none-any.whl",
"yanked": false,
"yanked_reason": null
},
{
"comment_text": "",
"digests": {
"blake2b_256": "b84f358d56bc49d539158e45e99b1406c404a22cd38094e040f525accf64fe0c",
"md5": "b9b41c8a166e1d80a1443a14dfaf6d3f",
"sha256": "05e1b3c80846f7837aa48822acfaac8e5c780a125bbb5abf029391ddc09036f8"
},
"downloads": -1,
"filename": "ckanext_dcor_depot-0.15.3.tar.gz",
"has_sig": false,
"md5_digest": "b9b41c8a166e1d80a1443a14dfaf6d3f",
"packagetype": "sdist",
"python_version": "source",
"requires_python": "<4,>=3.8",
"size": 4300131,
"upload_time": "2024-10-03T08:55:17",
"upload_time_iso_8601": "2024-10-03T08:55:17.172181Z",
"url": "https://files.pythonhosted.org/packages/b8/4f/358d56bc49d539158e45e99b1406c404a22cd38094e040f525accf64fe0c/ckanext_dcor_depot-0.15.3.tar.gz",
"yanked": false,
"yanked_reason": null
}
],
"upload_time": "2024-10-03 08:55:17",
"github": true,
"gitlab": false,
"bitbucket": false,
"codeberg": false,
"github_user": "DCOR-dev",
"github_project": "ckanext-dcor_depot",
"travis_ci": false,
"coveralls": false,
"github_actions": true,
"lcname": "ckanext-dcor-depot"
}