<!--header-start-->
<picture>
<source media="(prefers-color-scheme: dark)" srcset="https://data.nhm.ac.uk/images/nhm_logo.svg">
<source media="(prefers-color-scheme: light)" srcset="https://data.nhm.ac.uk/images/nhm_logo_black.svg">
<img alt="The Natural History Museum logo." src="https://data.nhm.ac.uk/images/nhm_logo_black.svg" align="left" width="150px" height="100px" hspace="40">
</picture>
# ckanext-attribution
[![Tests](https://img.shields.io/github/actions/workflow/status/NaturalHistoryMuseum/ckanext-attribution/tests.yml?style=flat-square)](https://github.com/NaturalHistoryMuseum/ckanext-attribution/actions/workflows/tests.yml)
[![Coveralls](https://img.shields.io/coveralls/github/NaturalHistoryMuseum/ckanext-attribution/main?style=flat-square)](https://coveralls.io/github/NaturalHistoryMuseum/ckanext-attribution)
[![CKAN](https://img.shields.io/badge/ckan-2.9.7-orange.svg?style=flat-square)](https://github.com/ckan/ckan)
[![Python](https://img.shields.io/badge/python-3.6%20%7C%203.7%20%7C%203.8-blue.svg?style=flat-square)](https://www.python.org/)
[![Docs](https://img.shields.io/readthedocs/ckanext-attribution?style=flat-square)](https://ckanext-attribution.readthedocs.io)
_A CKAN extension that adds support for complex attribution._
<!--header-end-->
# Overview
<!--overview-start-->
This extension standardises author/contributor attribution for datasets, enabling enhanced metadata
and greater linkage between datasets. It currently integrates with the [ORCID](https://orcid.org)
and [ROR](https://ror.org) APIs; contributors ('agents') can be added directly from these databases,
or manually.
Contributors can be added and edited via actions or via a Vue app that can be inserted into
the `package_metadata_fields.html` template snippet.
![A screenshot of the form for adding contributors when editing a package. At the top is a preview of the citation in APA format, then there are three example agents with their affiliations and contribution activities.](.github/screenshots/form-overview.png)
## Schema
The schema is (partially) based on
the [RDA/TDWG recommendations](https://github.com/tdwg/attribution). Three new models are
added: `Agent` (contributors), `ContributionActivity`, and `Affiliation` (plus small linking models
between these and `Package` records).
### `Agent`
Defines _one_ agent.
| Field | Type | Values | Notes |
|----------------------|--------|--------------------------|-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------|
| `agent_type` | string | 'person', 'org', 'other' | |
| `family_name` | string | | only used for 'person' records |
| `given_names` | string | | only used for 'person' records |
| `given_names_first` | bool | True, False | only used for 'person' records; if the given names should be displayed first according to the person's culture/language (default True) |
| `name` | string | | used for non-'person' records |
| `location` | string | | used for non-person records, optional; a location to display for the organisation to help differentiate between similar names (e.g. 'Natural History Museum (_London_)' and 'Natural History Museum (_Dublin_)')
`external_id` | string | | an identifier from an external service like ORCID or ROR
`external_id_scheme` | string | 'orcid', 'ror', other | the scheme for the `external_id`; currently only 'orcid' and 'ror' are fully supported, though basic support for others can be implemented by adding to the `attribution_controlled_lists` [action](ckanext/attribution/logic/actions/extra.py#L14)
`user_id` | string | `User.id` foreign key | link to a user account on the CKAN instance
### `ContributionActivity`
Defines _one_ activity performed by _one_ agent on _one_ specific dataset.
| Field | Type | Values | Notes |
|------------|----------|-------------------------------|----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------|
| `activity` | string | [controlled vocabulary] | the activity/role the agent is associated with, e.g. 'Editor', 'Methodology'; roles are defined in the `attribution_controlled_lists` [action](ckanext/attribution/logic/actions/extra.py#L14), which currently lists the [Datacite](https://datacite.org) and [CRediT](https://credit.niso.org) role taxonomies (but can be expanded) |
| `scheme` | string | [controlled vocabulary] | name of the defined scheme from [`attribution_controlled_lists`](ckanext/attribution/logic/actions/extra.py#L14) |
| `level` | string | 'Lead', 'Equal', 'Supporting' | optional degree of contribution (from [CRediT](http://credit.niso.org/implementing-credit)) |
| `time` | datetime | | optional date/time of the activity |
| `order` | integer | | order of the agent within all who are associated with the same activity, e.g. 1st Editor, 3rd DataCollector (optional) |
A specialised `ContributionActivity` entry with a '[citation]' activity is used to define the order
in which contributors should be cited (and/or if they should be cited at all).
### `Affiliation`
Defines a relationship between _two_ agents, either as a 'universal' (persistent) affiliation or for
a single package (e.g. a project affiliation).
| Field | Type | Values | Notes |
|--------------------|--------|--------------------------|------------------------------------------------------------------------------------|
| `agent_a_id` | string | `Agent.id` foreign key | one of the two agents (a/b order does not matter) |
| `agent_b_id` | string | `Agent.id` foreign key | one of the two agents (a/b order does not matter) |
| `affiliation_type` | string | | very short description (1 or 2 words) of affiliation, e.g. 'employment' (optional) |
| `description` | string | | longer description of affiliation (optional) |
| `start_date` | date | | date at which the relationship began, e.g. employment start date (optional) |
| `end_date` | date | | date at which the relationship ended (optional) |
| `package_id` | string | `Package.id` foreign key | links affiliation to a specific package/dataset (optional) |
<!--overview-end-->
# Installation
<!--installation-start-->
Path variables used below:
- `$INSTALL_FOLDER` (i.e. where CKAN is installed), e.g. `/usr/lib/ckan/default`
- `$CONFIG_FILE`, e.g. `/etc/ckan/default/development.ini`
## Installing from PyPI
```shell
pip install ckanext-attribution
# to use the CLI as well:
pip install ckanext-attribution[cli]
```
## Installing from source
1. Clone the repository into the `src` folder:
```shell
cd $INSTALL_FOLDER/src
git clone https://github.com/NaturalHistoryMuseum/ckanext-attribution.git
```
2. Activate the virtual env:
```shell
. $INSTALL_FOLDER/bin/activate
```
3. Install via pip:
```shell
pip install $INSTALL_FOLDER/src/ckanext-attribution
# to use the cli as well:
pip install $INSTALL_FOLDER/src/ckanext-attribution[cli]
```
### Installing in editable mode
Installing from a `pyproject.toml` in editable mode (i.e. `pip install -e`) requires `setuptools>=64`; however, CKAN 2.9 requires `setuptools==44.1.0`. See [our CKAN fork](https://github.com/NaturalHistoryMuseum/ckan) for a version of v2.9 that uses an updated setuptools if this functionality is something you need.
## Post-install setup
1. Add 'attribution' to the list of plugins in your `$CONFIG_FILE`:
```ini
ckan.plugins = ... attribution
```
2. Install `lessc` globally:
```shell
npm install -g "less@~4.1"
```
3. Add this block to `package_metadata_fields.html` to show the Vue app:
```jinja2
{% block package_custom_fields_agent %}
{{ super() }}
{% endblock %}
```
4. Change the `authors` field in your SOLR `schema.xml` to set up faceting.
```xml
<schema>
<fields>
<...>
<field name="author" type="string" indexed="true" stored="true" multiValued="true"/>
<...>
</fields>
<...>
<copyField source="author" dest="text"/>
</schema>
```
After making the changes, restart SOLR and reindex (`ckan -c $CONFIG_FILE search-index rebuild`).
You will also have to enable the config option (see below) to see this in the UI.
<!--installation-end-->
# Configuration
<!--configuration-start-->
These are the options that can be specified in your .ini config file. NB:
setting `ckanext.attribution.debug` to `True` means that the API
accesses [sandbox.orcid.org](https://sandbox.orcid.org) instead of [orcid.org](https://orcid.org).
Although both run by the ORCID organisation, these are _different websites_ and you will need a
separate account/set of credentials for each. It is also worth noting that you will not have access
to the full set of authors on the sandbox.
## API credentials [REQUIRED]
| Name | Description | Options |
|------------------------------------|------------------------------|---------|
| `ckanext.attribution.orcid_key` | Your ORCID API client ID/key | |
| `ckanext.attribution.orcid_secret` | Your ORCID API client secret | |
## Optional
| Name | Description | Options | Default |
|---------------------------------------|-----------------------------------------------------------------------|------------|---------|
| `ckanext.attribution.debug` | If true, use sandbox.orcid.org (for testing) | True/False | True |
| `ckanext.attribution.enable_faceting` | Enable filtering by contributor name (requires change to SOLR schema) | True/False | False |
<!--configuration-end-->
# Usage
<!--usage-start-->
## Actions
This extension adds numerous new actions. These are primarily CRUD actions for managing models, with
inline documentation and predictable interactions. It's probably more helpful to only go over the
more "unusual" new actions here.
### `agent_list`
Search for agents by name or external ID, or just list all agents.
```python
data_dict = {
'q': 'QUERY', # optional; searches in name, family_name, given_names, and external_id
}
toolkit.get_action('agent_list')({}, data_dict)
```
### `package_contributions_show`
Show all contribution records for a package, grouped by agent. Optionally provide a limit and offset
for pagination.
```python
data_dict = {
'id': 'PACKAGE_ID',
'limit': 'PAGE_SIZE',
'offset': 'OFFSET'
}
toolkit.get_action('package_contributions_show')({}, data_dict)
```
Returns a dict:
```python
{
'contributions': [
{
'agent': {
# Agent.as_dict()
},
'activities': [
# list of Activity.as_dict()
],
'affiliations': [
{
'affiliation': {
# Affiliation.as_dict()
},
'other_agent': {
# Agent.as_dict()
}
},
# ...
]
},
# ...
],
'total': total,
'offset': offset,
'page_size': limit or total
}
```
### `agent_affiliations`
Show all affiliations for a given agent, optionally limited to a specific dataset/package (plus '
global' affiliations).
```python
data_dict = {
'agent_id': 'AGENT_ID',
'package_id': 'PACKAGE_ID' # optional
}
toolkit.get_action('agent_affiliations')({}, data_dict)
```
Returns a list of records formatted as such:
```python
{
'affiliation': {
# Affiliation.as_dict()
},
'other_agent': {
# Agent.as_dict()
}
}
```
### `attribution_controlled_lists`
Returns collections of defined values (which can be modified by using `@toolkit.chained_action`).
```python
data_dict = {
'lists': ['NAME1', 'NAME2'] # optional; only return these lists
}
toolkit.get_action('attribution_controlled_lists')({}, data_dict)
```
There are four collections:
1. `agent_types` describes valid types for agents and adds additional detail;
2. `contribution_activity_types` contains role/activity taxonomies (i.e. Datacite and CRediT) and
lists the available activity values;
3. `contribution_activity_levels` is a list of contribution levels (i.e. 'lead', 'equal', and '
supporting', from CRediT);
4. `agent_external_id_schemes` describes valid schemes for external IDs (currently, ORCID and ROR).
These collections are useful for validation and frontend connectivity/standardisation. They are
contained within an action to _a._ enable frontend access via AJAX requests, and _b._ allow users to
override values as needed.
### `agent_external_search`
Search external sources (ORCID and ROR) for agent data. Ignores records that already exist in the
database.
```python
data_dict = {
'q': 'QUERY_STRING',
'sources': ['SOURCE1', 'SOURCE2'] # optional; only search these sources
}
toolkit.get_action('agent_external_search')({}, data_dict)
```
Results are returned formatted as such:
```python
{
'SCHEME_NAME': {
'records': [
# list of agent dicts
]
'remaining': 10000 # number of other records found
}
}
```
### `agent_external_read`
Read data from an external source like ORCID or ROR, either from an existing record or a new
external ID.
```python
# EITHER
data_dict_existing = {
'agent_id': 'AGENT_ID',
'diff': False
# optional; only show values that differ from the record's current values (default False)
}
# OR
data_dict_new = {
'external_id': 'EXTERNAL_ID',
'external_id_scheme': 'orcid' # or 'ror', etc.
}
toolkit.get_action('agent_external_read')({}, data_dict)
```
## Commands
**NB**: you will have to install the optional `[cli]` packages to use several of these commands.
### `initdb`
```shell
ckan -c $CONFIG_FILE attribution initdb
```
Initialise database tables.
### `sync`
```shell
ckan -c $CONFIG_FILE attribution sync $OPTIONAL_ID $ANOTHER_OPTIONAL_ID
```
Retrieve up-to-date information from external APIs for contributors with an external ID set.
### `refresh-packages`
```shell
ckan -c $CONFIG_FILE attribution refresh-packages $OPTIONAL_ID $ANOTHER_OPTIONAL_ID
```
Update the author string for all (or the specified) packages.
### `agent-external-search`
```shell
ckan -c $CONFIG_FILE attribution agent-external-search --limit 10 $OPTIONAL_ID $ANOTHER_OPTIONAL_ID
```
Search external APIs for contributors without an external ID set. Run `refresh-packages` and rebuild the search index after this command.
### `merge-agents`
```shell
ckan -c $CONFIG_FILE attribution merge-agents --q $SEARCH_QUERY --match-threshold 75
```
Find agents with similar names (optionally matching the search query) and merge them. Run `refresh-packages` and rebuild the search index after this command.
### `migratedb`
```shell
ckan -c $CONFIG_FILE attribution migratedb --limit 10 --dry-run --no-search-api
```
Attempt to extract names of contributors from author fields and convert them to the new format.
- `--limit` will only convert a certain number of packages at a time.
- `--dry-run` prevents saving to the database.
- `--no-search-api` just extracts the names, without searching external APIs for contributors after.
It is recommended to run `merge-agents`, `refresh-packages`, and rebuild the search index after running this command.
<!--usage-end-->
# Testing
<!--testing-start-->
There is a Docker compose configuration available in this repository to make it easier to run tests. The ckan image uses the Dockerfile in the `docker/` folder.
To run the tests against ckan 2.9.x on Python3:
1. Build the required images:
```shell
docker compose build
```
2. Then run the tests.
The root of the repository is mounted into the ckan container as a volume by the Docker compose
configuration, so you should only need to rebuild the ckan image if you change the extension's
dependencies.
```shell
docker compose run ckan
```
<!--testing-end-->
Raw data
{
"_id": null,
"home_page": null,
"name": "ckanext-attribution",
"maintainer": null,
"docs_url": null,
"requires_python": ">=3.6",
"maintainer_email": null,
"keywords": "CKAN, data, attribution",
"author": null,
"author_email": "Natural History Museum <data@nhm.ac.uk>",
"download_url": "https://files.pythonhosted.org/packages/e0/ba/cbd1f0b34c451a36f1a87e83f6dbadf064d8e9fb8298bbf5d964539e96d7/ckanext_attribution-1.2.13.tar.gz",
"platform": null,
"description": "<!--header-start-->\n<picture>\n <source media=\"(prefers-color-scheme: dark)\" srcset=\"https://data.nhm.ac.uk/images/nhm_logo.svg\">\n <source media=\"(prefers-color-scheme: light)\" srcset=\"https://data.nhm.ac.uk/images/nhm_logo_black.svg\">\n <img alt=\"The Natural History Museum logo.\" src=\"https://data.nhm.ac.uk/images/nhm_logo_black.svg\" align=\"left\" width=\"150px\" height=\"100px\" hspace=\"40\">\n</picture>\n\n# ckanext-attribution\n\n[![Tests](https://img.shields.io/github/actions/workflow/status/NaturalHistoryMuseum/ckanext-attribution/tests.yml?style=flat-square)](https://github.com/NaturalHistoryMuseum/ckanext-attribution/actions/workflows/tests.yml)\n[![Coveralls](https://img.shields.io/coveralls/github/NaturalHistoryMuseum/ckanext-attribution/main?style=flat-square)](https://coveralls.io/github/NaturalHistoryMuseum/ckanext-attribution)\n[![CKAN](https://img.shields.io/badge/ckan-2.9.7-orange.svg?style=flat-square)](https://github.com/ckan/ckan)\n[![Python](https://img.shields.io/badge/python-3.6%20%7C%203.7%20%7C%203.8-blue.svg?style=flat-square)](https://www.python.org/)\n[![Docs](https://img.shields.io/readthedocs/ckanext-attribution?style=flat-square)](https://ckanext-attribution.readthedocs.io)\n\n_A CKAN extension that adds support for complex attribution._\n\n<!--header-end-->\n\n# Overview\n\n<!--overview-start-->\nThis extension standardises author/contributor attribution for datasets, enabling enhanced metadata\nand greater linkage between datasets. It currently integrates with the [ORCID](https://orcid.org)\nand [ROR](https://ror.org) APIs; contributors ('agents') can be added directly from these databases,\nor manually.\n\nContributors can be added and edited via actions or via a Vue app that can be inserted into\nthe `package_metadata_fields.html` template snippet.\n\n![A screenshot of the form for adding contributors when editing a package. At the top is a preview of the citation in APA format, then there are three example agents with their affiliations and contribution activities.](.github/screenshots/form-overview.png)\n\n## Schema\n\nThe schema is (partially) based on\nthe [RDA/TDWG recommendations](https://github.com/tdwg/attribution). Three new models are\nadded: `Agent` (contributors), `ContributionActivity`, and `Affiliation` (plus small linking models\nbetween these and `Package` records).\n\n### `Agent`\n\nDefines _one_ agent.\n\n| Field | Type | Values | Notes |\n|----------------------|--------|--------------------------|-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------|\n| `agent_type` | string | 'person', 'org', 'other' | |\n| `family_name` | string | | only used for 'person' records |\n| `given_names` | string | | only used for 'person' records |\n| `given_names_first` | bool | True, False | only used for 'person' records; if the given names should be displayed first according to the person's culture/language (default True) |\n| `name` | string | | used for non-'person' records |\n| `location` | string | | used for non-person records, optional; a location to display for the organisation to help differentiate between similar names (e.g. 'Natural History Museum (_London_)' and 'Natural History Museum (_Dublin_)')\n `external_id` | string | | an identifier from an external service like ORCID or ROR\n `external_id_scheme` | string | 'orcid', 'ror', other | the scheme for the `external_id`; currently only 'orcid' and 'ror' are fully supported, though basic support for others can be implemented by adding to the `attribution_controlled_lists` [action](ckanext/attribution/logic/actions/extra.py#L14)\n `user_id` | string | `User.id` foreign key | link to a user account on the CKAN instance\n\n### `ContributionActivity`\n\nDefines _one_ activity performed by _one_ agent on _one_ specific dataset.\n\n| Field | Type | Values | Notes |\n|------------|----------|-------------------------------|----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------|\n| `activity` | string | [controlled vocabulary] | the activity/role the agent is associated with, e.g. 'Editor', 'Methodology'; roles are defined in the `attribution_controlled_lists` [action](ckanext/attribution/logic/actions/extra.py#L14), which currently lists the [Datacite](https://datacite.org) and [CRediT](https://credit.niso.org) role taxonomies (but can be expanded) |\n| `scheme` | string | [controlled vocabulary] | name of the defined scheme from [`attribution_controlled_lists`](ckanext/attribution/logic/actions/extra.py#L14) |\n| `level` | string | 'Lead', 'Equal', 'Supporting' | optional degree of contribution (from [CRediT](http://credit.niso.org/implementing-credit)) |\n| `time` | datetime | | optional date/time of the activity |\n| `order` | integer | | order of the agent within all who are associated with the same activity, e.g. 1st Editor, 3rd DataCollector (optional) |\n\nA specialised `ContributionActivity` entry with a '[citation]' activity is used to define the order\nin which contributors should be cited (and/or if they should be cited at all).\n\n### `Affiliation`\n\nDefines a relationship between _two_ agents, either as a 'universal' (persistent) affiliation or for\na single package (e.g. a project affiliation).\n\n| Field | Type | Values | Notes |\n|--------------------|--------|--------------------------|------------------------------------------------------------------------------------|\n| `agent_a_id` | string | `Agent.id` foreign key | one of the two agents (a/b order does not matter) |\n| `agent_b_id` | string | `Agent.id` foreign key | one of the two agents (a/b order does not matter) |\n| `affiliation_type` | string | | very short description (1 or 2 words) of affiliation, e.g. 'employment' (optional) |\n| `description` | string | | longer description of affiliation (optional) |\n| `start_date` | date | | date at which the relationship began, e.g. employment start date (optional) |\n| `end_date` | date | | date at which the relationship ended (optional) |\n| `package_id` | string | `Package.id` foreign key | links affiliation to a specific package/dataset (optional) |\n\n<!--overview-end-->\n\n# Installation\n\n<!--installation-start-->\nPath variables used below:\n- `$INSTALL_FOLDER` (i.e. where CKAN is installed), e.g. `/usr/lib/ckan/default`\n- `$CONFIG_FILE`, e.g. `/etc/ckan/default/development.ini`\n\n## Installing from PyPI\n\n```shell\npip install ckanext-attribution\n# to use the CLI as well:\npip install ckanext-attribution[cli]\n```\n\n## Installing from source\n\n1. Clone the repository into the `src` folder:\n ```shell\n cd $INSTALL_FOLDER/src\n git clone https://github.com/NaturalHistoryMuseum/ckanext-attribution.git\n ```\n\n2. Activate the virtual env:\n ```shell\n . $INSTALL_FOLDER/bin/activate\n ```\n\n3. Install via pip:\n ```shell\n pip install $INSTALL_FOLDER/src/ckanext-attribution\n # to use the cli as well:\n pip install $INSTALL_FOLDER/src/ckanext-attribution[cli]\n ```\n\n### Installing in editable mode\n\nInstalling from a `pyproject.toml` in editable mode (i.e. `pip install -e`) requires `setuptools>=64`; however, CKAN 2.9 requires `setuptools==44.1.0`. See [our CKAN fork](https://github.com/NaturalHistoryMuseum/ckan) for a version of v2.9 that uses an updated setuptools if this functionality is something you need.\n\n## Post-install setup\n\n1. Add 'attribution' to the list of plugins in your `$CONFIG_FILE`:\n ```ini\n ckan.plugins = ... attribution\n ```\n\n2. Install `lessc` globally:\n ```shell\n npm install -g \"less@~4.1\"\n ```\n\n3. Add this block to `package_metadata_fields.html` to show the Vue app:\n ```jinja2\n {% block package_custom_fields_agent %}\n {{ super() }}\n {% endblock %}\n ```\n\n4. Change the `authors` field in your SOLR `schema.xml` to set up faceting.\n ```xml\n <schema>\n <fields>\n <...>\n <field name=\"author\" type=\"string\" indexed=\"true\" stored=\"true\" multiValued=\"true\"/>\n <...>\n </fields>\n <...>\n <copyField source=\"author\" dest=\"text\"/>\n </schema>\n ```\n\n After making the changes, restart SOLR and reindex (`ckan -c $CONFIG_FILE search-index rebuild`).\n You will also have to enable the config option (see below) to see this in the UI.\n\n<!--installation-end-->\n\n# Configuration\n\n<!--configuration-start-->\nThese are the options that can be specified in your .ini config file. NB:\nsetting `ckanext.attribution.debug` to `True` means that the API\naccesses [sandbox.orcid.org](https://sandbox.orcid.org) instead of [orcid.org](https://orcid.org).\nAlthough both run by the ORCID organisation, these are _different websites_ and you will need a\nseparate account/set of credentials for each. It is also worth noting that you will not have access\nto the full set of authors on the sandbox.\n\n## API credentials [REQUIRED]\n\n| Name | Description | Options |\n|------------------------------------|------------------------------|---------|\n| `ckanext.attribution.orcid_key` | Your ORCID API client ID/key | |\n| `ckanext.attribution.orcid_secret` | Your ORCID API client secret | |\n\n## Optional\n\n| Name | Description | Options | Default |\n|---------------------------------------|-----------------------------------------------------------------------|------------|---------|\n| `ckanext.attribution.debug` | If true, use sandbox.orcid.org (for testing) | True/False | True |\n| `ckanext.attribution.enable_faceting` | Enable filtering by contributor name (requires change to SOLR schema) | True/False | False |\n\n<!--configuration-end-->\n\n# Usage\n\n<!--usage-start-->\n## Actions\n\nThis extension adds numerous new actions. These are primarily CRUD actions for managing models, with\ninline documentation and predictable interactions. It's probably more helpful to only go over the\nmore \"unusual\" new actions here.\n\n### `agent_list`\n\nSearch for agents by name or external ID, or just list all agents.\n\n```python\ndata_dict = {\n 'q': 'QUERY', # optional; searches in name, family_name, given_names, and external_id\n}\n\ntoolkit.get_action('agent_list')({}, data_dict)\n```\n\n### `package_contributions_show`\n\nShow all contribution records for a package, grouped by agent. Optionally provide a limit and offset\nfor pagination.\n\n```python\ndata_dict = {\n 'id': 'PACKAGE_ID',\n 'limit': 'PAGE_SIZE',\n 'offset': 'OFFSET'\n}\n\ntoolkit.get_action('package_contributions_show')({}, data_dict)\n```\n\nReturns a dict:\n\n```python\n{\n 'contributions': [\n {\n 'agent': {\n # Agent.as_dict()\n },\n 'activities': [\n # list of Activity.as_dict()\n ],\n 'affiliations': [\n {\n 'affiliation': {\n # Affiliation.as_dict()\n },\n 'other_agent': {\n # Agent.as_dict()\n }\n },\n # ...\n ]\n },\n # ...\n ],\n 'total': total,\n 'offset': offset,\n 'page_size': limit or total\n}\n```\n\n### `agent_affiliations`\n\nShow all affiliations for a given agent, optionally limited to a specific dataset/package (plus '\nglobal' affiliations).\n\n```python\ndata_dict = {\n 'agent_id': 'AGENT_ID',\n 'package_id': 'PACKAGE_ID' # optional\n}\n\ntoolkit.get_action('agent_affiliations')({}, data_dict)\n```\n\nReturns a list of records formatted as such:\n\n```python\n{\n 'affiliation': {\n # Affiliation.as_dict()\n },\n 'other_agent': {\n # Agent.as_dict()\n }\n}\n```\n\n### `attribution_controlled_lists`\n\nReturns collections of defined values (which can be modified by using `@toolkit.chained_action`).\n\n```python\ndata_dict = {\n 'lists': ['NAME1', 'NAME2'] # optional; only return these lists\n}\n\ntoolkit.get_action('attribution_controlled_lists')({}, data_dict)\n```\n\nThere are four collections:\n\n1. `agent_types` describes valid types for agents and adds additional detail;\n2. `contribution_activity_types` contains role/activity taxonomies (i.e. Datacite and CRediT) and\n lists the available activity values;\n3. `contribution_activity_levels` is a list of contribution levels (i.e. 'lead', 'equal', and '\n supporting', from CRediT);\n4. `agent_external_id_schemes` describes valid schemes for external IDs (currently, ORCID and ROR).\n\nThese collections are useful for validation and frontend connectivity/standardisation. They are\ncontained within an action to _a._ enable frontend access via AJAX requests, and _b._ allow users to\noverride values as needed.\n\n### `agent_external_search`\n\nSearch external sources (ORCID and ROR) for agent data. Ignores records that already exist in the\ndatabase.\n\n```python\ndata_dict = {\n 'q': 'QUERY_STRING',\n 'sources': ['SOURCE1', 'SOURCE2'] # optional; only search these sources\n}\n\ntoolkit.get_action('agent_external_search')({}, data_dict)\n```\n\nResults are returned formatted as such:\n\n```python\n{\n 'SCHEME_NAME': {\n 'records': [\n # list of agent dicts\n ]\n 'remaining': 10000 # number of other records found\n }\n}\n```\n\n### `agent_external_read`\n\nRead data from an external source like ORCID or ROR, either from an existing record or a new\nexternal ID.\n\n```python\n# EITHER\ndata_dict_existing = {\n 'agent_id': 'AGENT_ID',\n 'diff': False\n # optional; only show values that differ from the record's current values (default False)\n}\n\n# OR\ndata_dict_new = {\n 'external_id': 'EXTERNAL_ID',\n 'external_id_scheme': 'orcid' # or 'ror', etc.\n}\n\ntoolkit.get_action('agent_external_read')({}, data_dict)\n```\n\n## Commands\n\n**NB**: you will have to install the optional `[cli]` packages to use several of these commands.\n\n### `initdb`\n```shell\nckan -c $CONFIG_FILE attribution initdb\n```\nInitialise database tables.\n\n### `sync`\n```shell\nckan -c $CONFIG_FILE attribution sync $OPTIONAL_ID $ANOTHER_OPTIONAL_ID\n```\nRetrieve up-to-date information from external APIs for contributors with an external ID set.\n\n### `refresh-packages`\n```shell\nckan -c $CONFIG_FILE attribution refresh-packages $OPTIONAL_ID $ANOTHER_OPTIONAL_ID\n```\nUpdate the author string for all (or the specified) packages.\n\n### `agent-external-search`\n```shell\nckan -c $CONFIG_FILE attribution agent-external-search --limit 10 $OPTIONAL_ID $ANOTHER_OPTIONAL_ID\n```\nSearch external APIs for contributors without an external ID set. Run `refresh-packages` and rebuild the search index after this command.\n\n### `merge-agents`\n```shell\nckan -c $CONFIG_FILE attribution merge-agents --q $SEARCH_QUERY --match-threshold 75\n```\nFind agents with similar names (optionally matching the search query) and merge them. Run `refresh-packages` and rebuild the search index after this command.\n\n### `migratedb`\n```shell\nckan -c $CONFIG_FILE attribution migratedb --limit 10 --dry-run --no-search-api\n```\nAttempt to extract names of contributors from author fields and convert them to the new format.\n- `--limit` will only convert a certain number of packages at a time.\n- `--dry-run` prevents saving to the database.\n- `--no-search-api` just extracts the names, without searching external APIs for contributors after.\n\nIt is recommended to run `merge-agents`, `refresh-packages`, and rebuild the search index after running this command.\n\n<!--usage-end-->\n\n# Testing\n\n<!--testing-start-->\nThere is a Docker compose configuration available in this repository to make it easier to run tests. The ckan image uses the Dockerfile in the `docker/` folder.\n\nTo run the tests against ckan 2.9.x on Python3:\n\n1. Build the required images:\n ```shell\n docker compose build\n ```\n\n2. Then run the tests.\n The root of the repository is mounted into the ckan container as a volume by the Docker compose\n configuration, so you should only need to rebuild the ckan image if you change the extension's\n dependencies.\n ```shell\n docker compose run ckan\n ```\n\n<!--testing-end-->\n",
"bugtrack_url": null,
"license": "GPL-3.0-or-later",
"summary": "A CKAN extension that adds support for complex attribution.",
"version": "1.2.13",
"project_urls": {
"changelog": "https://github.com/NaturalHistoryMuseum/ckanext-attribution/blob/main/CHANGELOG.md",
"repository": "https://github.com/NaturalHistoryMuseum/ckanext-attribution"
},
"split_keywords": [
"ckan",
" data",
" attribution"
],
"urls": [
{
"comment_text": "",
"digests": {
"blake2b_256": "6a2b90f4268cf4ac756a218937252c4d7bb44b0015fc483b03e40d6db5751901",
"md5": "3506d07875e48e777659c13f93c3c79b",
"sha256": "267ec9df94bf54e8b6564ae52defc4dfebbbbfae67223b63fe8ec70b3e462b51"
},
"downloads": -1,
"filename": "ckanext_attribution-1.2.13-py3-none-any.whl",
"has_sig": false,
"md5_digest": "3506d07875e48e777659c13f93c3c79b",
"packagetype": "bdist_wheel",
"python_version": "py3",
"requires_python": ">=3.6",
"size": 569783,
"upload_time": "2024-11-04T11:36:37",
"upload_time_iso_8601": "2024-11-04T11:36:37.484143Z",
"url": "https://files.pythonhosted.org/packages/6a/2b/90f4268cf4ac756a218937252c4d7bb44b0015fc483b03e40d6db5751901/ckanext_attribution-1.2.13-py3-none-any.whl",
"yanked": false,
"yanked_reason": null
},
{
"comment_text": "",
"digests": {
"blake2b_256": "e0bacbd1f0b34c451a36f1a87e83f6dbadf064d8e9fb8298bbf5d964539e96d7",
"md5": "cb07a1967056c3af3b0eefbcb9901d60",
"sha256": "f1852e2595c18d29fc430e9cfdd64e2fc5af51da03bc35128ebc30a040402d25"
},
"downloads": -1,
"filename": "ckanext_attribution-1.2.13.tar.gz",
"has_sig": false,
"md5_digest": "cb07a1967056c3af3b0eefbcb9901d60",
"packagetype": "sdist",
"python_version": "source",
"requires_python": ">=3.6",
"size": 527236,
"upload_time": "2024-11-04T11:36:38",
"upload_time_iso_8601": "2024-11-04T11:36:38.839226Z",
"url": "https://files.pythonhosted.org/packages/e0/ba/cbd1f0b34c451a36f1a87e83f6dbadf064d8e9fb8298bbf5d964539e96d7/ckanext_attribution-1.2.13.tar.gz",
"yanked": false,
"yanked_reason": null
}
],
"upload_time": "2024-11-04 11:36:38",
"github": true,
"gitlab": false,
"bitbucket": false,
"codeberg": false,
"github_user": "NaturalHistoryMuseum",
"github_project": "ckanext-attribution",
"travis_ci": false,
"coveralls": true,
"github_actions": true,
"lcname": "ckanext-attribution"
}