# oaipmh-scythe: A Scythe for harvesting OAI-PMH repositories.
Welcome to `oaipmh-scythe`, an updated and modernized version of the original
[sickle](https://github.com/mloesch/sickle), now with additional features and ongoing maintenance.
| __CI__ | [![pre-commit.ci status][pre-commit-ci-badge]][pre-commit-ci-status] [![ci][ci-badge]][ci-workflow] [![coverage][coverage-badge]][ci-workflow] [![codeql][codeql-badge]][codeql-workflow] |
| :---------- | :------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------ |
| __Docs__ | [![docs][docs-badge]][docs-workflow] |
| __Package__ | [![pypi-version][pypi-version-badge]][pypi-url] [![pypi-python-versions][pypi-python-versions-badge]][pypi-url] [![all-downloads][all-downloads-badge]][pepy-tech-url] [![monthly-downloads][monthly-downloads-badge]][pepy-tech-url] |
| __Meta__ | [![OpenSSF Scorecard][scorecard-badge]][scorecard-url] [![hatch][hatch-badge]][hatch] [![ruff][ruff-badge]][ruff] [![mypy][mypy-badge]][mypy] [![License][license-badge]][license] |
`oaipmh-scythe` is a lightweight [OAI-PMH](http://www.openarchives.org/OAI/openarchivesprotocol.html) client library
written in Python. It has been designed for conveniently retrieving data from OAI interfaces the Pythonic way:
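```python
from oaipmh_scythe import Scythe

# The with block closes the underlying HTTP client on exit.
with Scythe("https://zenodo.org/oai2d") as scythe:
    records = scythe.list_records()  # iterator over the repository's records
    next(records)
# <Record oai:zenodo.org:4574771>
```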
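At the protocol level, an OAI-PMH harvest is a series of plain HTTP requests whose `verb` query parameter selects the operation. The sketch below uses only the standard library to show what such a request URL looks like. The helper `build_request_url` is hypothetical and not part of `oaipmh-scythe`, and the `oai_dc` metadata prefix is an assumed choice (it is the one format every OAI-PMH repository has to support).

```python
from urllib.parse import urlencode

# The six verbs defined by OAI-PMH 2.0.
OAI_VERBS = (
    "Identify",
    "ListMetadataFormats",
    "ListSets",
    "ListIdentifiers",
    "ListRecords",
    "GetRecord",
)


def build_request_url(endpoint: str, verb: str, **arguments: str) -> str:
    """Return the GET URL for an OAI-PMH request (hypothetical helper)."""
    if verb not in OAI_VERBS:
        raise ValueError(f"unknown OAI-PMH verb: {verb!r}")
    # The verb always comes first, followed by any verb-specific arguments.
    query = urlencode({"verb": verb, **arguments})
    return f"{endpoint}?{query}"


url = build_request_url(
    "https://zenodo.org/oai2d", "ListRecords", metadataPrefix="oai_dc"
)
print(url)
```

A client library wraps requests like this one, follows resumption tokens across pages, and turns the XML responses into Python objects.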
## Features
- Easy harvesting of OAI-compliant interfaces
- Support for all six OAI verbs
- Convenient object representations of OAI items (records, headers, sets, ...)
- Automatic de-serialization of Dublin Core-encoded metadata payloads to Python dictionaries
- Option for ignoring deleted items
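To make the last two features more concrete, here is a minimal standard-library sketch of the idea behind them: flattening a Dublin Core payload into a dictionary and skipping records whose header is marked as deleted. It is not the implementation used by `oaipmh-scythe`; the functions `dc_to_dict` and `is_deleted` and both sample records are made up for illustration, and the exact dictionary shape the library returns may differ.

```python
import xml.etree.ElementTree as ET
from collections import defaultdict

OAI_NS = "{http://www.openarchives.org/OAI/2.0/}"

# Hand-written sample records, not real repository output.
SAMPLE_RECORD = """\
<record xmlns="http://www.openarchives.org/OAI/2.0/">
  <header>
    <identifier>oai:example.org:1</identifier>
    <datestamp>2024-05-07</datestamp>
  </header>
  <metadata>
    <oai_dc:dc xmlns:oai_dc="http://www.openarchives.org/OAI/2.0/oai_dc/"
               xmlns:dc="http://purl.org/dc/elements/1.1/">
      <dc:title>An example title</dc:title>
      <dc:creator>Doe, Jane</dc:creator>
      <dc:creator>Roe, Richard</dc:creator>
    </oai_dc:dc>
  </metadata>
</record>
"""

DELETED_RECORD = """\
<record xmlns="http://www.openarchives.org/OAI/2.0/">
  <header status="deleted">
    <identifier>oai:example.org:2</identifier>
    <datestamp>2024-05-07</datestamp>
  </header>
</record>
"""


def is_deleted(record: ET.Element) -> bool:
    """A record is deleted when its header carries status="deleted"."""
    header = record.find(f"{OAI_NS}header")
    return header is not None and header.get("status") == "deleted"


def dc_to_dict(record: ET.Element) -> dict[str, list[str]]:
    """Map each leaf element under <metadata> to {local tag name: [values]}."""
    metadata = record.find(f"{OAI_NS}metadata")
    if metadata is None:
        return {}
    result = defaultdict(list)
    for element in metadata.iter():
        # Skip container elements and elements without text content.
        if len(element) == 0 and element.text and element.text.strip():
            local_name = element.tag.rsplit("}", 1)[-1]  # drop the namespace
            result[local_name].append(element.text.strip())
    return dict(result)


records = [ET.fromstring(SAMPLE_RECORD), ET.fromstring(DELETED_RECORD)]
kept = [dc_to_dict(record) for record in records if not is_deleted(record)]
print(kept)
# [{'title': ['An example title'], 'creator': ['Doe, Jane', 'Roe, Richard']}]
```

Values are collected in lists because Dublin Core elements such as `creator` may repeat within one record.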
## Requirements
[Python](https://www.python.org/downloads/) >= 3.10
`oaipmh-scythe` is built with:
- [httpx](https://github.com/encode/httpx) for issuing HTTP requests
- [lxml](https://github.com/lxml/lxml) for parsing XML responses
## Installation
You can install `oaipmh-scythe` via pip from [PyPI][pypi-url]:
```console
python -m pip install oaipmh-scythe
```
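After installing, one way to confirm that the interpreter meets the version requirement and that the distribution is visible to it is a short standard-library check. This is only a sketch; the helper name `installed_version` is an assumption made for this example.

```python
import sys
from importlib.metadata import PackageNotFoundError, version
from typing import Optional


def installed_version(distribution: str) -> Optional[str]:
    """Return the installed version of a distribution, or None if it is missing."""
    try:
        return version(distribution)
    except PackageNotFoundError:
        return None


# oaipmh-scythe requires Python >= 3.10.
print("Python version OK:", sys.version_info >= (3, 10))
print("oaipmh-scythe:", installed_version("oaipmh-scythe") or "not installed")
```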
## Documentation
The [documentation][docs-url] is made with [Material for MkDocs](https://github.com/squidfunk/mkdocs-material) and is
hosted by [GitHub Pages](https://docs.github.com/en/pages).
## Similar Projects
Several similar projects are available on [PyPI](https://pypi.org/search/?q=oai-pmh) and GitHub, e.g. via the
topics [oai-pmh](https://github.com/topics/oai-pmh) and [oai-pmh-client](https://github.com/topics/oai-pmh-client).
Among them are these implementations in Python:
| Project | Description | Last commit |
| -------------------------------------------------------------------------------- | ------------------------------------- | ----------------------------------------------------------------------------------------------------- |
| [sickle](https://github.com/mloesch/sickle) | `oaipmh-scythe` is a fork of `sickle` | ![last-commit](https://img.shields.io/github/last-commit/mloesch/sickle) |
| [pyoai](https://github.com/infrae/pyoai) | `sickle` was inspired by `pyoai` | ![last-commit](https://img.shields.io/github/last-commit/infrae/pyoai) |
| [pyoaiharvester](https://github.com/vphill/pyoaiharvester) | oai-pmh harvester CLI | ![last-commit](https://img.shields.io/github/last-commit/vphill/pyoaiharvester) |
| [ddblabs-ometha](https://github.com/Deutsche-Digitale-Bibliothek/ddblabs-ometha) | oai-pmh harvester with CLI and TUI | ![last-commit](https://img.shields.io/github/last-commit/Deutsche-Digitale-Bibliothek/ddblabs-ometha) |
| [oai-harvest](https://github.com/bloomonkey/oai-harvest) | uses `pyoai` internally | ![last-commit](https://img.shields.io/github/last-commit/bloomonkey/oai-harvest) |
| [oai-pmh-harvester](https://github.com/MITLibraries/oai-pmh-harvester) | uses `sickle` internally | ![last-commit](https://img.shields.io/github/last-commit/MITLibraries/oai-pmh-harvester) |
There are also similar projects available in [Java](https://github.com/topics/oai-pmh-client?l=java) and
[PHP](https://github.com/topics/oai-pmh-client?l=php).
## Acknowledgments
This is a fork of [sickle](https://github.com/mloesch/sickle), which was originally written by Mathias Loesch.
## License
`oaipmh-scythe` is distributed under the terms of the [BSD 3-Clause license](https://spdx.org/licenses/BSD-3-Clause.html).
<!-- Refs -->
[all-downloads-badge]: https://static.pepy.tech/badge/oaipmh-scythe
[ci-badge]: https://github.com/afuetterer/oaipmh-scythe/actions/workflows/main.yml/badge.svg
[ci-workflow]: https://github.com/afuetterer/oaipmh-scythe/actions/workflows/main.yml
[codeql-badge]: https://github.com/afuetterer/oaipmh-scythe/actions/workflows/codeql.yml/badge.svg
[codeql-workflow]: https://github.com/afuetterer/oaipmh-scythe/actions/workflows/codeql.yml
[coverage-badge]: https://img.shields.io/endpoint?url=https://gist.githubusercontent.com/afuetterer/fcb87d45f4d7defdfeffa65eb1d65f63/raw/coverage-badge.json
[docs-badge]: https://github.com/afuetterer/oaipmh-scythe/actions/workflows/docs.yml/badge.svg
[docs-url]: https://afuetterer.github.io/oaipmh-scythe
[docs-workflow]: https://github.com/afuetterer/oaipmh-scythe/actions/workflows/docs.yml
[hatch]: https://github.com/pypa/hatch
[hatch-badge]: https://img.shields.io/badge/%F0%9F%A5%9A-Hatch-4051b5.svg
[license]: https://spdx.org/licenses/BSD-3-Clause.html
[license-badge]: https://img.shields.io/badge/License-BSD_3--Clause-blue.svg
[monthly-downloads-badge]: https://static.pepy.tech/badge/oaipmh-scythe/month
[mypy]: https://mypy-lang.org
[mypy-badge]: https://img.shields.io/badge/types-mypy-blue.svg
[pepy-tech-url]: https://pepy.tech/project/oaipmh-scythe
[pre-commit-ci-badge]: https://results.pre-commit.ci/badge/github/afuetterer/oaipmh-scythe/main.svg
[pre-commit-ci-status]: https://results.pre-commit.ci/latest/github/afuetterer/oaipmh-scythe/main
[pypi-python-versions-badge]: https://img.shields.io/pypi/pyversions/oaipmh-scythe.svg?logo=python&label=Python
[pypi-url]: https://pypi.org/project/oaipmh-scythe/
[pypi-version-badge]: https://img.shields.io/pypi/v/oaipmh-scythe.svg?logo=pypi&label=PyPI
[ruff]: https://github.com/astral-sh/ruff
[ruff-badge]: https://img.shields.io/endpoint?url=https://raw.githubusercontent.com/charliermarsh/ruff/main/assets/badge/v2.json
[scorecard-badge]: https://api.securityscorecards.dev/projects/github.com/afuetterer/oaipmh-scythe/badge
[scorecard-url]: https://securityscorecards.dev/viewer/?uri=github.com/afuetterer/oaipmh-scythe
Raw data
{
"_id": null,
"home_page": null,
"name": "oaipmh-scythe",
"maintainer": null,
"docs_url": null,
"requires_python": ">=3.10",
"maintainer_email": null,
"keywords": "metadata, oai-pmh, oai-pmh-client",
"author": "Heinz-Alexander F\u00fctterer",
"author_email": null,
"download_url": "https://files.pythonhosted.org/packages/65/b2/c02f41ec7b876dc0abe214d47c37e5070e718f9ffe216de83117a303d6ee/oaipmh_scythe-0.13.0.tar.gz",
"platform": null,
"bugtrack_url": null,
"license": "BSD-3-Clause",
"summary": "A Scythe for harvesting OAI-PMH repositories.",
"version": "0.13.0",
"project_urls": {
"Changelog": "https://github.com/afuetterer/oaipmh-scythe/blob/main/CHANGELOG.md",
"Documentation": "https://afuetterer.github.io/oaipmh-scythe",
"Issues": "https://github.com/afuetterer/oaipmh-scythe/issues",
"Repository": "https://github.com/afuetterer/oaipmh-scythe.git"
},
"split_keywords": [
"metadata",
" oai-pmh",
" oai-pmh-client"
],
"urls": [
{
"comment_text": "",
"digests": {
"blake2b_256": "6b1a2ffc6da5b5a7e3e67788911438b82eca8a46e07d3406fde97a7585a612f2",
"md5": "3c05aa8ede96ec9f07f95d8c386430bc",
"sha256": "6630ce9ddbac774b8983cac57d1ce6403f6a9b6a4be0e5495160be23e44ecceb"
},
"downloads": -1,
"filename": "oaipmh_scythe-0.13.0-py3-none-any.whl",
"has_sig": false,
"md5_digest": "3c05aa8ede96ec9f07f95d8c386430bc",
"packagetype": "bdist_wheel",
"python_version": "py3",
"requires_python": ">=3.10",
"size": 21732,
"upload_time": "2024-05-07T09:35:16",
"upload_time_iso_8601": "2024-05-07T09:35:16.008790Z",
"url": "https://files.pythonhosted.org/packages/6b/1a/2ffc6da5b5a7e3e67788911438b82eca8a46e07d3406fde97a7585a612f2/oaipmh_scythe-0.13.0-py3-none-any.whl",
"yanked": false,
"yanked_reason": null
},
{
"comment_text": "",
"digests": {
"blake2b_256": "65b2c02f41ec7b876dc0abe214d47c37e5070e718f9ffe216de83117a303d6ee",
"md5": "c35928d9accba206b5602f0104030b4c",
"sha256": "19ad43b0a223d87036af17c3919f1132db60bc561eb8d1713a2f85e736ad208f"
},
"downloads": -1,
"filename": "oaipmh_scythe-0.13.0.tar.gz",
"has_sig": false,
"md5_digest": "c35928d9accba206b5602f0104030b4c",
"packagetype": "sdist",
"python_version": "source",
"requires_python": ">=3.10",
"size": 28796,
"upload_time": "2024-05-07T09:35:17",
"upload_time_iso_8601": "2024-05-07T09:35:17.821842Z",
"url": "https://files.pythonhosted.org/packages/65/b2/c02f41ec7b876dc0abe214d47c37e5070e718f9ffe216de83117a303d6ee/oaipmh_scythe-0.13.0.tar.gz",
"yanked": false,
"yanked_reason": null
}
],
"upload_time": "2024-05-07 09:35:17",
"github": true,
"gitlab": false,
"bitbucket": false,
"codeberg": false,
"github_user": "afuetterer",
"github_project": "oaipmh-scythe",
"travis_ci": false,
"coveralls": false,
"github_actions": true,
"lcname": "oaipmh-scythe"
}