ckanext-dcor_depot
==================

|PyPI Version| |Build Status| |Coverage Status|

This plugin manages how data are stored in DCOR. There are two types of
files in DCOR:

1. Resources uploaded by users, imported from figshare, or
   imported from a data archive
2. Ancillary files that are generated upon resource creation, such as
   condensed DC data and preview images (see
   `ckanext-dc_view <https://github.com/DCOR-dev/ckanext-dc_view>`_).

This plugin implements:

- Data storage management. All resources uploaded by a user are moved
  to ``/data/users-HOSTNAME/USERNAME-ORGNAME/PK/ID/PKGNAME_RESID_RESNAME``
  and symlinks are created in ``/data/ckan-HOSTNAME/resources/RES/OUR/CEID``
  via a background job.
  CKAN itself will not notice this. The idea is to have a filesystem overview
  of the datasets of each user.
- A background job that uploads resources to S3 in ``after_resource_create``
  if the resources were uploaded via the legacy upload route.
- A background job that backs up resources from S3 to local block storage
  if the resources were uploaded via the S3 upload route.
- Import datasets from figshare. Existing datasets from figshare are
  downloaded to the ``/data/depots/figshare`` directory and, upon resource
  creation, symlinked there from ``/data/ckan-HOSTNAME/resources/RES/OUR/CEID``
  (note that this is an exception to the data storage management described
  above). When running the following command, the "figshare-import" organization
  is created and the datasets listed in ``figshare_dois.txt`` are added to CKAN:

  ::

     ckan import-figshare

- CLI for symlinking datasets that have failed to symlink before:

  ::

     ckan run-jobs-dcor-depot

- CLI for appending a resource to a dataset:

  ::

     ckan append-resource /path/to/file dataset_id --delete-source

Please make sure that the necessary file permissions are given in ``/data``.

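The storage layout described above can be sketched in a few lines of Python.
This is a minimal illustration and not the plugin's actual code: all function
names are hypothetical, the split of the resource ID into ``RES/OUR/CEID``
follows CKAN's usual three-character scheme, and the split of the dataset ID
into ``PK/ID`` as two two-character chunks is an assumption read off the path
template.

```python
import pathlib
import shutil


def ckan_resource_path(storage_path, resource_id):
    """Location where CKAN looks for a resource (RES/OUR/CEID scheme)."""
    return (pathlib.Path(storage_path) / "resources"
            / resource_id[:3] / resource_id[3:6] / resource_id[6:])


def depot_resource_path(users_depot, user, org, pkg_id, pkg_name,
                        res_id, res_name):
    """Hypothetical depot location of a resource.

    Assumption: PK/ID stands for pkg_id[:2] / pkg_id[2:4].
    """
    return (pathlib.Path(users_depot) / f"{user}-{org}"
            / pkg_id[:2] / pkg_id[2:4]
            / f"{pkg_name}_{res_id}_{res_name}")


def move_and_symlink(ckan_path, depot_path):
    """Move an uploaded file to the depot and leave a symlink behind.

    CKAN keeps reading from `ckan_path`, which now points to the depot.
    """
    depot_path.parent.mkdir(parents=True, exist_ok=True)
    shutil.move(str(ckan_path), str(depot_path))
    ckan_path.symlink_to(depot_path)
```

Because the symlink sits exactly where CKAN expects the file, CKAN does not
need to know about the depot directory at all.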
In 2023, it was decided that the huge block storage of DCOR
should be replaced with an S3-compatible object store, because block storage
does not scale well. This partially deprecates some of the commands above,
which might be removed or modified to support object storage directly.

- CLI for migrating data from block storage to an S3-compatible object storage
  service. For this, the following configuration keys must be specified in
  the ``ckan.ini`` file::

    dcor_object_store.access_key_id = ACCESS_KEY_ID
    dcor_object_store.secret_access_key = SECRET_ACCESS_KEY
    dcor_object_store.endpoint_url = S3_ENDPOINT_URL
    dcor_object_store.ssl_verify = true
    # The bucket name is by default defined by the circle ID. Resources
    # are stored in the "RES/OUR/CEID-SCHEME" in that bucket.
    dcor_object_store.bucket_name = circle-{organization_id}

  Usage::

    ckan dcor-migrate-resources-to-object-store

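Before running the migration, it can help to check that all object store keys
are present. The following is a minimal sketch, assuming the keys live in the
``[app:main]`` section of ``ckan.ini`` and that the bucket name template is
filled in with Python's ``str.format``; the helper names are hypothetical and
not part of this extension.

```python
import configparser

REQUIRED_KEYS = (
    "dcor_object_store.access_key_id",
    "dcor_object_store.secret_access_key",
    "dcor_object_store.endpoint_url",
    "dcor_object_store.ssl_verify",
    "dcor_object_store.bucket_name",
)


def load_object_store_config(ini_text, section="app:main"):
    """Return the object store settings or raise KeyError if one is missing."""
    # Disable interpolation so that "%" in secrets is taken literally.
    parser = configparser.ConfigParser(interpolation=None)
    parser.read_string(ini_text)
    options = parser[section]
    missing = [key for key in REQUIRED_KEYS if key not in options]
    if missing:
        raise KeyError("missing configuration keys: " + ", ".join(missing))
    return {key: options[key] for key in REQUIRED_KEYS}


def bucket_for_circle(config, organization_id):
    """Fill the circle (organization) ID into the bucket name template."""
    template = config["dcor_object_store.bucket_name"]
    return template.format(organization_id=organization_id)
```

With the template ``circle-{organization_id}``, every circle ends up in its
own bucket.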

Installation
------------

::

    pip install ckanext-dcor_depot

Add this extension to ``ckan.plugins`` and set the storage paths in
``ckan.ini``:

::

    ckan.plugins = [...] dcor_depot
    ckan.storage_path=/data/ckan-HOSTNAME
    ckanext.dcor_depot.depots_path=/data/depots
    ckanext.dcor_depot.users_depot_name=users-HOSTNAME

This plugin stores resources in ``/data``. Create the users depot directory
and give the ``www-data`` user ownership of it:

::

    mkdir -p /data/depots/users-$(hostname)
    chown -R www-data /data/depots/users-$(hostname)

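The relation between the two depot settings and the directory created above
can be sketched as follows. This is an illustration under assumptions: the
helper names are hypothetical, and replacing the ``HOSTNAME`` placeholder
programmatically only mirrors what ``$(hostname)`` does in the shell commands.

```python
import os
import socket


def users_depot_dir(depots_path, users_depot_name, hostname=None):
    """Join the two config values, filling in the HOSTNAME placeholder."""
    hostname = hostname or socket.gethostname()
    return os.path.join(depots_path,
                        users_depot_name.replace("HOSTNAME", hostname))


def is_writable_dir(path):
    """Check that `path` is a directory the current user can write into."""
    return os.path.isdir(path) and os.access(path, os.W_OK | os.X_OK)
```

Running ``is_writable_dir(users_depot_dir("/data/depots", "users-HOSTNAME"))``
as the web server user is one way to verify the permissions before the first
upload.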

Testing
-------
If CKAN/DCOR is installed and set up for testing, this extension can
be tested with pytest:

::

    pytest ckanext

Testing can also be done via vagrant in a virtual machine using the
`dcor-test <https://app.vagrantup.com/paulmueller/boxes/dcor-test/>`_ image.
Make sure that ``vagrant`` and ``virtualbox`` are installed and run the
following commands in the root of this repository:

::

    # Setup virtual machine using `Vagrantfile`
    vagrant up
    # Run the tests
    vagrant ssh -- sudo bash /testing/vagrant-run-tests.sh


.. |PyPI Version| image:: https://img.shields.io/pypi/v/ckanext.dcor_depot.svg
   :target: https://pypi.python.org/pypi/ckanext.dcor_depot
.. |Build Status| image:: https://img.shields.io/github/actions/workflow/status/DCOR-dev/ckanext-dcor_depot/check.yml
   :target: https://github.com/DCOR-dev/ckanext-dcor_depot/actions?query=workflow%3AChecks
.. |Coverage Status| image:: https://img.shields.io/codecov/c/github/DCOR-dev/ckanext-dcor_depot
   :target: https://codecov.io/gh/DCOR-dev/ckanext-dcor_depot

Raw data
{
"_id": null,
"home_page": "https://github.com/DCOR-dev/ckanext-dcor_depot",
"name": "ckanext-dcor-depot",
"maintainer": null,
"docs_url": null,
"requires_python": null,
"maintainer_email": null,
"keywords": "CKAN, DCOR, RT-DC",
"author": "Paul M\u00fcller",
"author_email": "dev@craban.de",
"download_url": "https://files.pythonhosted.org/packages/b4/a5/736665afaa324f1d3e0a40bd9dd39229ca3dc4ce5bbb27ebd56fdefdfeba/ckanext-dcor_depot-0.14.0.tar.gz",
"platform": null,
"bugtrack_url": null,
"license": "AGPLv3+",
"summary": "Manages data storage for CKAN/DCOR (import, symlink, etc.)",
"version": "0.14.0",
"project_urls": {
"Homepage": "https://github.com/DCOR-dev/ckanext-dcor_depot"
},
"split_keywords": [
"ckan",
" dcor",
" rt-dc"
],
"urls": [
{
"comment_text": "",
"digests": {
"blake2b_256": "6dda46d340efe94184ff736d889a3e87ee2eb03969fdf4d17371ebe3936b6429",
"md5": "e2bc96992e08f451fc9bc6c8a03f4558",
"sha256": "064a328fea372990b665f8c84c32cdd3441351ea3d2d2fb5d8ed3d3a940f8b73"
},
"downloads": -1,
"filename": "ckanext_dcor_depot-0.14.0-py3-none-any.whl",
"has_sig": false,
"md5_digest": "e2bc96992e08f451fc9bc6c8a03f4558",
"packagetype": "bdist_wheel",
"python_version": "py3",
"requires_python": null,
"size": 37709,
"upload_time": "2024-04-16T10:51:30",
"upload_time_iso_8601": "2024-04-16T10:51:30.607238Z",
"url": "https://files.pythonhosted.org/packages/6d/da/46d340efe94184ff736d889a3e87ee2eb03969fdf4d17371ebe3936b6429/ckanext_dcor_depot-0.14.0-py3-none-any.whl",
"yanked": false,
"yanked_reason": null
},
{
"comment_text": "",
"digests": {
"blake2b_256": "b4a5736665afaa324f1d3e0a40bd9dd39229ca3dc4ce5bbb27ebd56fdefdfeba",
"md5": "f48cf4df03e4fb7457cb15ac4b606c3d",
"sha256": "70b9dbd319f003db7e7e3940ef6ee94667b795f3a29249176ecd77d2dfa19e78"
},
"downloads": -1,
"filename": "ckanext-dcor_depot-0.14.0.tar.gz",
"has_sig": false,
"md5_digest": "f48cf4df03e4fb7457cb15ac4b606c3d",
"packagetype": "sdist",
"python_version": "source",
"requires_python": null,
"size": 35170,
"upload_time": "2024-04-16T10:51:32",
"upload_time_iso_8601": "2024-04-16T10:51:32.827167Z",
"url": "https://files.pythonhosted.org/packages/b4/a5/736665afaa324f1d3e0a40bd9dd39229ca3dc4ce5bbb27ebd56fdefdfeba/ckanext-dcor_depot-0.14.0.tar.gz",
"yanked": false,
"yanked_reason": null
}
],
"upload_time": "2024-04-16 10:51:32",
"github": true,
"gitlab": false,
"bitbucket": false,
"codeberg": false,
"github_user": "DCOR-dev",
"github_project": "ckanext-dcor_depot",
"travis_ci": false,
"coveralls": false,
"github_actions": true,
"lcname": "ckanext-dcor-depot"
}