:Name: ckanext-dcor-depot
:Version: 0.14.0
:Home page: https://github.com/DCOR-dev/ckanext-dcor_depot
:Summary: Manages data storage for CKAN/DCOR (import, symlink, etc.)
:Upload time: 2024-04-16 10:51:32
:Author: Paul Müller
:License: AGPLv3+
:Keywords: ckan, dcor, rt-dc

ckanext-dcor_depot
==================

|PyPI Version| |Build Status| |Coverage Status|

This plugin manages how data are stored in DCOR. There are two types of
files in DCOR:

1. Resources uploaded by users, imported from figshare, or
   imported from a data archive
2. Ancillary files that are generated upon resource creation, such as
   condensed DC data and preview images (see
   `ckanext-dc_view <https://github.com/DCOR-dev/ckanext-dc_view>`_).

This plugin implements:

- Data storage management. All resources uploaded by a user are moved
  to ``/data/users-HOSTNAME/USERNAME-ORGNAME/PK/ID/PKGNAME_RESID_RESNAME``
  and symlinks are created in ``/data/ckan-HOSTNAME/resources/RES/OUR/CEID``
  via a background job.
  CKAN itself will not notice this. The idea is to have a filesystem overview
  of the datasets of each user.
- A background job that uploads resources to S3 in ``after_resource_create``
  if the resources were uploaded via the legacy upload route.
- A background job that backs up resources from S3 to local block storage
  if the resources were uploaded via the S3 upload route.
- Import datasets from figshare. Existing datasets from figshare are
  downloaded to the ``/data/depots/figshare`` directory and, upon resource
  creation, symlinked there from ``/data/ckan-HOSTNAME/resources/RES/OUR/CEID``
  (note that this is an exception to the data storage management described
  above). When running the following command, the "figshare-import" organization
  is created and the datasets listed in ``figshare_dois.txt`` are added to CKAN:

  ::

     ckan import-figshare


- CLI for symlinking datasets that have failed to symlink before:

  ::

     ckan run-jobs-dcor-depot


- CLI for appending a resource to a dataset:

  ::

     ckan append-resource /path/to/file dataset_id --delete-source

Please make sure that the necessary file permissions are given in ``/data``.
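
The symlink layout described above can be sketched in Python. This is only
an illustration, not the plugin's own code: it assumes that ``RES/OUR/CEID``
stands for the resource ID split into its first three characters, the next
three characters, and the remainder (the layout CKAN uses for uploaded
resources). The helper names ``resource_link_path`` and ``link_resource``
are hypothetical.

.. code-block:: python

    import pathlib


    def resource_link_path(storage_path, resource_id):
        """Return the RES/OUR/CEID-style location for a resource ID."""
        rid = str(resource_id)
        # Assumption: 3 + 3 + remaining characters of the resource ID
        return (pathlib.Path(storage_path) / "resources"
                / rid[:3] / rid[3:6] / rid[6:])


    def link_resource(depot_file, storage_path, resource_id):
        """Create a symlink at the CKAN location pointing to the depot file."""
        link = resource_link_path(storage_path, resource_id)
        link.parent.mkdir(parents=True, exist_ok=True)
        # Leave an existing symlink untouched so the call can be repeated
        if not link.is_symlink():
            link.symlink_to(depot_file)
        return link

Under these assumptions,
``resource_link_path("/data/ckan-HOSTNAME", "a1b2c3d4-e5f6")`` yields
``/data/ckan-HOSTNAME/resources/a1b/2c3/d4-e5f6``.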

In 2023, it was decided that the huge block storage of DCOR
should be replaced with an S3-compatible object store, because block storage
does not scale well. This partially deprecates some of the commands above,
which might be removed or modified to support object storage directly.

- CLI for migrating data from block storage to an S3-compatible object storage
  service. For this, the following configuration keys must be specified in
  the ``ckan.ini`` file::

    dcor_object_store.access_key_id = ACCESS_KEY_ID
    dcor_object_store.secret_access_key = SECRET_ACCESS_KEY
    dcor_object_store.endpoint_url = S3_ENDPOINT_URL
    dcor_object_store.ssl_verify = true
    # The bucket name is by default defined by the circle ID. Resources
    # are stored in the "RES/OUR/CEID-SCHEME" in that bucket.
    dcor_object_store.bucket_name = circle-{organization_id}

  Usage::

    ckan dcor-migrate-resources-to-object-store
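
To illustrate how the ``bucket_name`` template relates to a circle, the
following sketch parses the keys listed above with the Python standard
library and expands the template. It rests on assumptions: CKAN reads
``ckan.ini`` through its own configuration machinery, the plugin may expand
the template differently, and the helpers ``load_object_store_config`` and
``bucket_for_circle`` are hypothetical. All values are returned as plain
strings.

.. code-block:: python

    import configparser

    # Example settings as they might appear in the [app:main] section
    EXAMPLE_INI = """
    [app:main]
    dcor_object_store.access_key_id = ACCESS_KEY_ID
    dcor_object_store.secret_access_key = SECRET_ACCESS_KEY
    dcor_object_store.endpoint_url = S3_ENDPOINT_URL
    dcor_object_store.ssl_verify = true
    dcor_object_store.bucket_name = circle-{organization_id}
    """

    PREFIX = "dcor_object_store."


    def load_object_store_config(ini_text):
        """Collect all dcor_object_store.* keys, with the prefix removed."""
        # Interpolation is disabled so that values are taken literally
        parser = configparser.ConfigParser(interpolation=None)
        parser.read_string(ini_text)
        section = parser["app:main"]
        return {key[len(PREFIX):]: value
                for key, value in section.items() if key.startswith(PREFIX)}


    def bucket_for_circle(config, organization_id):
        """Expand the bucket name template for one circle (organization)."""
        return config["bucket_name"].format(organization_id=organization_id)

With the example settings, ``bucket_for_circle(config, "abc123")`` gives
``circle-abc123``.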


Installation
------------

::

    pip install ckanext-dcor_depot


Add this extension to ``ckan.plugins`` and set the storage options in ``ckan.ini``:

::

    ckan.plugins = [...] dcor_depot
    ckan.storage_path=/data/ckan-HOSTNAME
    ckanext.dcor_depot.depots_path=/data/depots
    ckanext.dcor_depot.users_depot_name=users-HOSTNAME

This plugin stores resources in ``/data``:

::

    mkdir -p /data/depots/users-$(hostname)
    chown -R www-data /data/depots/users-$(hostname)
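
Whether the directory created above is usable can be checked with a few lines
of Python. This is a minimal sketch using only the standard library; the
function name ``check_depot_dir`` is hypothetical, and the check only reflects
the permissions of the user who runs it, so it should be run as the web server
user (``www-data`` in the commands above).

.. code-block:: python

    import os
    import pathlib


    def check_depot_dir(path):
        """Return a list of human-readable problems found for `path`."""
        path = pathlib.Path(path)
        if not path.is_dir():
            return [f"{path} does not exist or is not a directory"]
        problems = []
        # Creating files in a directory needs write and search permission
        if not os.access(path, os.W_OK | os.X_OK):
            problems.append(f"{path} is not writable by the current user")
        return problems

An empty list means that no problem was found, for example when calling
``check_depot_dir("/data/depots/users-HOSTNAME")`` with the actual host name
filled in.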


Testing
-------
If CKAN/DCOR is installed and set up for testing, this extension can
be tested with pytest:

::

    pytest ckanext

Testing can also be done via vagrant in a virtual machine using the
`dcor-test <https://app.vagrantup.com/paulmueller/boxes/dcor-test/>`_ image.
Make sure that ``vagrant`` and ``virtualbox`` are installed and run the
following commands in the root of this repository:

::

    # Set up the virtual machine using `Vagrantfile`
    vagrant up
    # Run the tests
    vagrant ssh -- sudo bash /testing/vagrant-run-tests.sh


.. |PyPI Version| image:: https://img.shields.io/pypi/v/ckanext.dcor_depot.svg
   :target: https://pypi.python.org/pypi/ckanext.dcor_depot
.. |Build Status| image:: https://img.shields.io/github/actions/workflow/status/DCOR-dev/ckanext-dcor_depot/check.yml
   :target: https://github.com/DCOR-dev/ckanext-dcor_depot/actions?query=workflow%3AChecks
.. |Coverage Status| image:: https://img.shields.io/codecov/c/github/DCOR-dev/ckanext-dcor_depot
   :target: https://codecov.io/gh/DCOR-dev/ckanext-dcor_depot



            
