ckanext-dcor_depot
==================

:Version: 0.16.0
:Summary: Manages data storage for DCOR
:Author: Paul Müller
:Uploaded: 2024-11-26 19:51:28
:Requires Python: >=3.8, <4
:License: GNU Affero General Public License v3 or later (AGPLv3+)
:Keywords: dc, dcor, deformability, cytometry

|PyPI Version| |Build Status| |Coverage Status|

This plugin manages how data are stored in DCOR. There are two types of
files in DCOR:

1. Resources uploaded by users, imported from figshare, or
   imported from a data archive
2. Ancillary files that are generated upon resource creation, such as
   condensed DC data and preview images (see
   `ckanext-dc_view <https://github.com/DCOR-dev/ckanext-dc_view>`_).

This plugin implements:

- Data storage management. All resources uploaded by a user are moved
  to ``/data/users-HOSTNAME/USERNAME-ORGNAME/PK/ID/PKGNAME_RESID_RESNAME``
  and symlinks are created in ``/data/ckan-HOSTNAME/resources/RES/OUR/CEID``
  via a background job.
  CKAN itself will not notice this. The idea is to have a filesystem overview
  of the datasets of each user.
- A background job that uploads resources to S3 in ``after_resource_create``
  if the resources were uploaded via the legacy upload route.
- A background job that backs up resources from S3 to local block storage
  if the resources were uploaded via the S3 upload route.
- Import datasets from figshare. Existing datasets from figshare are
  downloaded to the ``/data/depots/figshare`` directory and, upon resource
  creation, a symlink pointing there is created at
  ``/data/ckan-HOSTNAME/resources/RES/OUR/CEID`` (note that this is an
  exception to the data storage management described above). When running
  the following command, the "figshare-import" organization is created and
  the datasets listed in ``figshare_dois.txt`` are added to CKAN:

  ::

     ckan import-figshare


- CLI for symlinking datasets that have failed to symlink before:

  ::

     ckan run-jobs-dcor-depot


- CLI for appending a resource to a dataset:

  ::

     ckan append-resource /path/to/file dataset_id --delete-source

Please make sure that the necessary file permissions are given in ``/data``.
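
The symlink layout described above can be illustrated with a short sketch.
It is not the plugin's actual code. It assumes that ``RES/OUR/CEID`` stands
for the resource ID split after its third and sixth characters, which is how
CKAN lays out its resource storage directory. The function names
``resource_link_path`` and ``link_resource`` are hypothetical.

```python
import pathlib
import tempfile


def resource_link_path(storage_path, resource_id):
    """Return STORAGE_PATH/resources/RES/OUR/CEID for a resource ID.

    Assumption: the ID is split after the third and sixth characters.
    """
    rid = str(resource_id)
    return (pathlib.Path(storage_path) / "resources"
            / rid[:3] / rid[3:6] / rid[6:])


def link_resource(depot_file, storage_path, resource_id):
    """Create a symlink at the CKAN resource location pointing to the depot."""
    link = resource_link_path(storage_path, resource_id)
    link.parent.mkdir(parents=True, exist_ok=True)
    # Replace a stale link or file from a previous attempt
    if link.is_symlink() or link.exists():
        link.unlink()
    link.symlink_to(depot_file)
    return link


if __name__ == "__main__":
    # Demonstrate with a temporary directory instead of the real /data
    with tempfile.TemporaryDirectory() as root:
        depot_file = pathlib.Path(root) / "users-HOSTNAME" / "example.rtdc"
        depot_file.parent.mkdir(parents=True)
        depot_file.write_bytes(b"dummy")
        storage = pathlib.Path(root) / "ckan-HOSTNAME"
        link = link_resource(depot_file, storage,
                             "a1b2c3d4-0000-0000-0000-000000000000")
        print(link.relative_to(root), "->", link.resolve().name)
```

The process that creates the link needs write access to both the depot and
the CKAN storage directory, which is why the file permissions in ``/data``
matter.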

In 2023, it was decided that the huge block storage of DCOR
should be replaced with an S3-compatible object store, because block storage
does not scale well. This deprecates some of the commands above,
which may be removed or modified to support object storage directly.

- CLI for migrating data from block storage to an S3-compatible object storage
  service. For this, the following configuration keys must be specified in
  the ``ckan.ini`` file::

    dcor_object_store.access_key_id = ACCESS_KEY_ID
    dcor_object_store.secret_access_key = SECRET_ACCESS_KEY
    dcor_object_store.endpoint_url = S3_ENDPOINT_URL
    dcor_object_store.ssl_verify = true
    # The bucket name is by default defined by the circle ID. Resources
    # are stored in the "RES/OUR/CEID-SCHEME" in that bucket.
    dcor_object_store.bucket_name = circle-{organization_id}

  Usage::

    ckan dcor-migrate-resources-to-object-store
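
As a rough illustration of how the configuration keys above fit together,
the following sketch parses them with Python's ``configparser`` and derives
the bucket name for one circle. It is not how the plugin reads its settings
(CKAN provides its own configuration object). The helper names
``load_object_store_settings`` and ``bucket_for_circle`` and the example
organization ID are hypothetical, and the keys are assumed to live in the
``[app:main]`` section of ``ckan.ini``.

```python
import configparser

# Example configuration mirroring the keys listed above
CKAN_INI = """
[app:main]
dcor_object_store.access_key_id = ACCESS_KEY_ID
dcor_object_store.secret_access_key = SECRET_ACCESS_KEY
dcor_object_store.endpoint_url = S3_ENDPOINT_URL
dcor_object_store.ssl_verify = true
dcor_object_store.bucket_name = circle-{organization_id}
"""

REQUIRED_KEYS = (
    "dcor_object_store.access_key_id",
    "dcor_object_store.secret_access_key",
    "dcor_object_store.endpoint_url",
    "dcor_object_store.ssl_verify",
    "dcor_object_store.bucket_name",
)


def load_object_store_settings(ini_text, section="app:main"):
    """Return the object store settings as a dict, or raise KeyError."""
    # Disable interpolation so that values are taken literally
    parser = configparser.ConfigParser(interpolation=None)
    parser.read_string(ini_text)
    options = parser[section]
    missing = [key for key in REQUIRED_KEYS if key not in options]
    if missing:
        raise KeyError(f"Missing configuration keys: {missing}")
    # Strip the "dcor_object_store." prefix for convenience
    settings = {key.split(".", 1)[1]: options[key] for key in REQUIRED_KEYS}
    settings["ssl_verify"] = options.getboolean("dcor_object_store.ssl_verify")
    return settings


def bucket_for_circle(settings, organization_id):
    """Fill the circle (organization) ID into the bucket name template."""
    return settings["bucket_name"].format(organization_id=organization_id)


if __name__ == "__main__":
    settings = load_object_store_settings(CKAN_INI)
    print(bucket_for_circle(settings, "example-circle-id"))
```

Checking for missing keys up front gives a clear error before a migration
is started.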


Installation
------------

::

    pip install ckanext-dcor_depot


Add this extension to the plugins and default_views in ``ckan.ini``:

::

    ckan.plugins = [...] dcor_depot
    ckan.storage_path=/data/ckan-HOSTNAME
    ckanext.dcor_depot.depots_path=/data/depots
    ckanext.dcor_depot.users_depot_name=users-HOSTNAME

This plugin stores resources in ``/data``. Create the users depot directory
and make ``www-data`` its owner:

::

    mkdir -p /data/depots/users-$(hostname)
    chown -R www-data /data/depots/users-$(hostname)
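
Whether the directories are usable can be checked with a small script such
as the following sketch. The helper ``missing_write_access`` is hypothetical
and not part of the plugin. It tests the permissions of the user that runs
it, so it would have to be run as ``www-data`` to be meaningful, and it
assumes that ``socket.gethostname()`` returns the same name as the
``hostname`` command used above.

```python
import os
import socket


def missing_write_access(paths):
    """Return the paths that are not existing, writable directories."""
    problems = []
    for path in paths:
        # Creating files in a directory requires write and execute permission
        usable = os.path.isdir(path) and os.access(path, os.W_OK | os.X_OK)
        if not usable:
            problems.append(path)
    return problems


if __name__ == "__main__":
    hostname = socket.gethostname()
    candidates = [
        f"/data/ckan-{hostname}",
        f"/data/depots/users-{hostname}",
    ]
    for path in missing_write_access(candidates):
        print(f"Not a writable directory: {path}")
```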


Testing
-------
If CKAN/DCOR is installed and set up for testing, this extension can
be tested with pytest:

::

    pytest ckanext

Testing can also be done via vagrant in a virtual machine using the
`dcor-test <https://app.vagrantup.com/paulmueller/boxes/dcor-test/>`_ image.
Make sure that ``vagrant`` and ``virtualbox`` are installed and run the
following commands in the root of this repository:

::

    # Set up the virtual machine using `Vagrantfile`
    vagrant up
    # Run the tests
    vagrant ssh -- sudo bash /testing/vagrant-run-tests.sh


.. |PyPI Version| image:: https://img.shields.io/pypi/v/ckanext.dcor_depot.svg
   :target: https://pypi.python.org/pypi/ckanext.dcor_depot
.. |Build Status| image:: https://img.shields.io/github/actions/workflow/status/DCOR-dev/ckanext-dcor_depot/check.yml
   :target: https://github.com/DCOR-dev/ckanext-dcor_depot/actions?query=workflow%3AChecks
.. |Coverage Status| image:: https://img.shields.io/codecov/c/github/DCOR-dev/ckanext-dcor_depot
   :target: https://codecov.io/gh/DCOR-dev/ckanext-dcor_depot

            
