| Field | Value |
| --- | --- |
| Name | oaklib |
| Version | 0.6.6 |
| home_page | None |
| Summary | Ontology Access Kit: Python library for common ontology operations over a variety of backends |
| upload_time | 2024-05-09 23:40:03 |
| maintainer | None |
| docs_url | None |
| author | cmungall |
| requires_python | <4.0.0,>=3.9 |
| license | None |
| keywords | |
| VCS | |
| bugtrack_url | |
| requirements | No requirements were recorded. |
| Travis-CI | No Travis. |
| coveralls test coverage | No coveralls. |

# Ontology Access Kit (OAK)
A Python library for common ontology operations over a variety of backends.
<img src="docs/logos/oak-logo_black-icon.png" width="20%">
[![PyPI version](https://badge.fury.io/py/oaklib.svg)](https://badge.fury.io/py/oaklib)
![](https://github.com/incatools/ontology-access-kit/workflows/Build/badge.svg)
[![badge](https://img.shields.io/badge/launch-binder-579ACA.svg)](https://mybinder.org/v2/gh/incatools/ontology-access-kit/main?filepath=notebooks)
[![Downloads](https://pepy.tech/badge/oaklib/week)](https://pepy.tech/project/oaklib)
[![DOI](https://zenodo.org/badge/DOI/10.5281/zenodo.6456239.svg)](https://doi.org/10.5281/zenodo.6456239)
[![Contributor Covenant](https://img.shields.io/badge/Contributor%20Covenant-2.1-4baaaa.svg)](.github/CODE_OF_CONDUCT.md)
OAK provides a collection of [interfaces](https://incatools.github.io/ontology-access-kit/packages/interfaces/index.html#interfaces) for various ontology operations, including:
- [look up basic features](https://incatools.github.io/ontology-access-kit/guide/basics.html) of an ontology element, such as its label, definition, relationships, or aliases
- search an ontology for a term
- validate an ontology
- modify or delete terms
- generate and visualize subgraphs
- identify lexical matches and export as SSSOM mapping tables
- perform more advanced operations, such as graph traversal, OWL axiom processing, or text annotation
These interfaces are *separated* from any particular backend, for which there are a number of different [adapters](https://incatools.github.io/ontology-access-kit/implementations/index.html).
This means the same Python API and command line can be used regardless of whether the ontology:
- is served by a remote API such as OLS or BioPortal
- is present locally on the filesystem in owl, obo, obojson, or sqlite formats
- is to be downloaded from a remote repository such as the OBO library
- is queried from a remote database, including SPARQL endpoints (Ontobee/Ubergraph), a SQL database, or a Solr/ES endpoint
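
To make the separation between interface and backend concrete, here is a minimal sketch in plain Python. It does not use OAK itself: `LabelLookup`, `InMemoryAdapter`, `FakeRemoteAdapter`, and `describe` are hypothetical names invented for this illustration, and OAK's real interfaces are considerably richer.

```python
from typing import Dict, List, Optional, Protocol, Tuple


class LabelLookup(Protocol):
    """Hypothetical interface: anything that can map a term ID to a label."""

    def label(self, curie: str) -> Optional[str]: ...


class InMemoryAdapter:
    """Hypothetical backend standing in for a local file."""

    def __init__(self, labels: Dict[str, str]) -> None:
        self._labels = labels

    def label(self, curie: str) -> Optional[str]:
        return self._labels.get(curie)


class FakeRemoteAdapter:
    """Hypothetical backend standing in for a remote API (no network is used)."""

    def __init__(self, records: List[Tuple[str, str]]) -> None:
        self._records = records

    def label(self, curie: str) -> Optional[str]:
        # Scan the records the way a client might scan a service response.
        for record_id, record_label in self._records:
            if record_id == curie:
                return record_label
        return None


def describe(adapter: LabelLookup, curie: str) -> str:
    """Client code depends only on the interface, never on a specific backend."""
    label = adapter.label(curie)
    return f"{curie} ! {label}" if label is not None else f"{curie} ! (no label)"


local = InMemoryAdapter({"CL:0000540": "neuron"})
remote = FakeRemoteAdapter([("CL:0000540", "neuron")])
print(describe(local, "CL:0000540"))
print(describe(remote, "CL:0000540"))
```

Because `describe` only relies on the `label` method, either backend can be swapped in without changing the calling code.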
## Documentation:
- [incatools.github.io/ontology-access-kit](https://incatools.github.io/ontology-access-kit)
- Presentations:
    - [Using the OAK command line](https://doi.org/10.5281/zenodo.7708962) *OBO Academy 2023*
    - [Introduction to OAK](https://doi.org/10.5281/zenodo.7765088) *OAK workshop 2022*
## Contributing
See the contribution guidelines at [CONTRIBUTING.md](.github/CONTRIBUTING.md).
All contributors are expected to uphold our [Code of Conduct](.github/CODE_OF_CONDUCT.md).
## Usage
```python
from oaklib import get_adapter
from oaklib.datamodels.vocabulary import IS_A, PART_OF
from oaklib.interfaces import OboGraphInterface

# connect to the CL sqlite database adapter
# (will first download if not already downloaded)
adapter = get_adapter("sqlite:obo:cl")

NEURON = "CL:0000540"

print('## Basic info')
print(f'ID: {NEURON}')
print(f'Label: {adapter.label(NEURON)}')

for alias in adapter.entity_aliases(NEURON):
    print(f'Alias: {alias}')

print('## Relationships (direct)')
# relationships() yields (subject, predicate, object) tuples
for _subject, predicate, obj in adapter.relationships([NEURON]):
    print(f' * {predicate} -> {obj} "{adapter.label(obj)}"')

print('## Ancestors (over IS_A and PART_OF)')
if not isinstance(adapter, OboGraphInterface):
    raise ValueError('This adapter does not support graph operations')

for ancestor in adapter.ancestors(NEURON, predicates=[IS_A, PART_OF]):
    print(f' * ANCESTOR: "{adapter.label(ancestor)}"')
```
For more examples, see
- [demo notebook](https://github.com/incatools/ontology-access-kit/blob/main/notebooks/basic-demo.ipynb)
- [tutorial part 2](https://incatools.github.io/ontology-access-kit/intro/tutorial02.html)
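
An ancestor query restricted to a set of predicates can be pictured as a graph walk that follows only edges of the chosen types. The sketch below shows that idea on a tiny hand-written edge list. It is not OAK's implementation: the `EX:` identifiers, the `EX:develops_from` predicate, and the `ancestors` function are illustrative assumptions, and this version deliberately leaves the starting term out of the result.

```python
from collections import deque
from typing import Iterable, List, Set, Tuple

Edge = Tuple[str, str, str]  # (subject, predicate, object)

# Predicate identifiers used for illustration.
IS_A = "rdfs:subClassOf"
PART_OF = "BFO:0000050"

# Made-up edges; these are not real ontology content.
EDGES: List[Edge] = [
    ("EX:1", IS_A, "EX:2"),
    ("EX:2", PART_OF, "EX:3"),
    ("EX:2", "EX:develops_from", "EX:4"),
    ("EX:3", IS_A, "EX:5"),
]


def ancestors(start: str, edges: Iterable[Edge], predicates: Set[str]) -> Set[str]:
    """Breadth-first walk upward from `start`, following only `predicates`."""
    edge_list = list(edges)
    seen: Set[str] = set()
    queue = deque([start])
    while queue:
        node = queue.popleft()
        for subj, pred, obj in edge_list:
            if subj == node and pred in predicates and obj not in seen:
                seen.add(obj)
                queue.append(obj)
    return seen


print(sorted(ancestors("EX:1", EDGES, {IS_A, PART_OF})))
print(sorted(ancestors("EX:1", EDGES, {IS_A})))
```

Narrowing the predicate set shrinks the result: with only `IS_A`, the walk stops at the first `PART_OF` edge.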
## Command Line
See:
- [CLI docs](https://incatools.github.io/ontology-access-kit/cli.html)
- [Example notebooks](https://github.com/INCATools/ontology-access-kit/tree/main/notebooks/Commands)
## Search
Use the pronto backend to fetch and parse an ontology from the OBO library, then use the `search` command:
```bash
runoak -i obolibrary:pato.obo search osmol
```
Returns:
```
PATO:0001655 ! osmolarity
PATO:0001656 ! decreased osmolarity
PATO:0001657 ! increased osmolarity
PATO:0002027 ! osmolality
PATO:0002028 ! decreased osmolality
PATO:0002029 ! increased osmolality
PATO:0045034 ! normal osmolality
PATO:0045035 ! normal osmolarity
```
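
If you want to post-process results like these in a script, the `ID ! label` lines are easy to split. The following is a small sketch that assumes the output keeps the exact `ID ! label` shape shown above; `parse_search_output` is a hypothetical helper, not part of OAK.

```python
from typing import List, Tuple


def parse_search_output(text: str) -> List[Tuple[str, str]]:
    """Split 'ID ! label' lines into (id, label) pairs, skipping other lines."""
    results: List[Tuple[str, str]] = []
    for line in text.splitlines():
        line = line.strip()
        if " ! " not in line:
            continue  # blank lines, '...' markers, or anything unexpected
        term_id, label = line.split(" ! ", 1)
        results.append((term_id.strip(), label.strip()))
    return results


sample = """\
PATO:0001655 ! osmolarity
PATO:0002027 ! osmolality
"""

for term_id, label in parse_search_output(sample):
    print(term_id, label)
```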
### QC and Validation
Validate the Protein Ontology (PR) using a local sqlite/rdftab instance:
```bash
runoak -i sqlite:../semantic-sql/db/pr.db validate
```
### List all terms
List all terms in Mondo, fetched from the OBO library:
```bash
runoak -i obolibrary:mondo.obo terms
```
### Lexical index
Make a lexical index of all terms in Mondo:
```bash
runoak -i obolibrary:mondo.obo lexmatch -L mondo.index.yaml
```
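
Conceptually, a lexical index maps normalized strings to the terms that carry them, so that terms sharing a string can be found quickly. The sketch below illustrates that idea only. It makes no claim about the contents of `mondo.index.yaml` or about the normalization rules OAK applies; the `EX:` terms and both function names are made up for the example.

```python
from collections import defaultdict
from typing import Dict, Iterable, List, Tuple


def normalize(text: str) -> str:
    """Lowercase and collapse whitespace (a deliberately simple rule)."""
    return " ".join(text.lower().split())


def build_lexical_index(terms: Iterable[Tuple[str, str]]) -> Dict[str, List[str]]:
    """Map each normalized label to the list of term IDs that use it."""
    index: Dict[str, List[str]] = defaultdict(list)
    for term_id, label in terms:
        index[normalize(label)].append(term_id)
    return dict(index)


# Made-up terms for illustration only.
TERMS = [
    ("EX:0000001", "Tentacle"),
    ("EX:0000002", "tentacle"),
    ("EX:0000003", "tentacle  pad"),
]

index = build_lexical_index(TERMS)
# Keys shared by more than one term are candidate lexical matches.
matches = {key: ids for key, ids in index.items() if len(ids) > 1}
print(matches)
```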
### Search
Searching over OBO using ontobee:
```bash
runoak -i ontobee: search tentacle
```
yields:
```
http://purl.obolibrary.org/obo/CEPH_0000256 ! tentacle
http://purl.obolibrary.org/obo/CEPH_0000257 ! tentacle absence
http://purl.obolibrary.org/obo/CEPH_0000258 ! tentacle pad
...
```
Searching over a broader set of ontologies in BioPortal (requires an [API key](https://www.bioontology.org/wiki/BioPortal_Help#Getting_an_API_key)):
```bash
runoak set-apikey bioportal YOUR-KEY-HERE
runoak -i bioportal: search tentacle
```
yields:
```
BTO:0001357 ! tentacle
http://purl.jp/bio/4/id/200906071014668510 ! tentacle
CEPH:0000256 ! tentacle
http://www.projecthalo.com/aura#Tentacle ! Tentacle
CEPH:0000256 ! tentacle
...
```
Alternatively, you can add "BIOPORTAL_API_KEY" to your environment variables.
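
When driving BioPortal queries from a script, it can help to fail early if the key is missing. This is a minimal sketch: `bioportal_key` is a hypothetical helper, and the only name taken from the text above is the `BIOPORTAL_API_KEY` variable.

```python
import os
from typing import Mapping, Optional


def bioportal_key(env: Optional[Mapping[str, str]] = None) -> str:
    """Return the BioPortal key from the environment, or raise if it is absent."""
    env = os.environ if env is None else env
    key = env.get("BIOPORTAL_API_KEY", "").strip()
    if not key:
        raise RuntimeError("BIOPORTAL_API_KEY is not set")
    return key


# Demonstrated with an explicit mapping so the example runs without a real key.
print(bioportal_key({"BIOPORTAL_API_KEY": "YOUR-KEY-HERE"}))
```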
Searching over a more limited set of ontologies in Ubergraph:
```bash
runoak -v -i ubergraph: search tentacle
```
yields:
```
UBERON:0013206 ! nasal tentacle
```
### Annotating Texts
```bash
runoak -i bioportal: annotate neuron from CA4 region of hippocampus of mouse
```
yields:
```yaml
object_id: CL:0000540
object_label: neuron
object_source: https://data.bioontology.org/ontologies/NIFDYS
match_type: PREF
subject_start: 1
subject_end: 6
subject_label: NEURON

object_id: http://www.co-ode.org/ontologies/galen#Neuron
object_label: Neuron
object_source: https://data.bioontology.org/ontologies/GALEN
match_type: PREF
subject_start: 1
subject_end: 6
subject_label: NEURON

...
```
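
In the output above, `subject_start: 1` and `subject_end: 6` cover the six letters of "neuron", which suggests 1-based, inclusive character offsets. Under that assumption (it is inferred from this one example, not from a specification), the matched span can be recovered from the input text as follows; `span_text` is a hypothetical helper.

```python
def span_text(text: str, subject_start: int, subject_end: int) -> str:
    """Return the substring for 1-based, inclusive offsets (assumed convention)."""
    if subject_start < 1 or subject_end < subject_start:
        raise ValueError("offsets must satisfy 1 <= start <= end")
    # Convert to Python's 0-based, end-exclusive slicing.
    return text[subject_start - 1:subject_end]


TEXT = "neuron from CA4 region of hippocampus of mouse"
print(span_text(TEXT, 1, 6))
```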
### Mapping
Create a SSSOM mapping file for a set of ontologies:
```bash
robot merge -I http://purl.obolibrary.org/obo/hp.owl -I http://purl.obolibrary.org/obo/mp.owl convert --check false -o hp-mp.obo
runoak lexmatch -i hp-mp.obo -o hp-mp.sssom.tsv
```
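
SSSOM TSV files typically start with a block of `#`-prefixed metadata lines, followed by a tab-separated table with columns such as `subject_id`, `predicate_id`, and `object_id`. A minimal way to read such a file with the standard library might look like the sketch below. The sample content is made up for illustration and is not the output of the command above; `read_sssom_rows` is a hypothetical helper, and dedicated SSSOM tooling is a better choice for real work.

```python
import csv
from typing import Dict, List


def read_sssom_rows(text: str) -> List[Dict[str, str]]:
    """Parse SSSOM-style TSV text, ignoring '#' metadata and blank lines."""
    data_lines = [
        line for line in text.splitlines()
        if line.strip() and not line.startswith("#")
    ]
    return list(csv.DictReader(data_lines, delimiter="\t"))


# Made-up sample; identifiers are placeholders.
SAMPLE = (
    "# mapping_set_id: https://example.org/illustrative\n"
    "subject_id\tpredicate_id\tobject_id\tmapping_justification\n"
    "EX:0000001\tskos:exactMatch\tEX:0000002\tsemapv:LexicalMatching\n"
)

for row in read_sssom_rows(SAMPLE):
    print(row["subject_id"], row["predicate_id"], row["object_id"])
```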
### Visualization of ancestor graphs
Use the sqlite backend to visualize the ancestor graph of 'vacuole' (GO:0005773), using the test ontology database:
```bash
runoak -i sqlite:tests/input/go-nucleus.db viz GO:0005773
```
![img](notebooks/output/vacuole.png)
The same using Ubergraph, restricting to is-a and part-of:
```bash
runoak -i ubergraph: viz GO:0005773 -p i,BFO:0000050
```
The same using pronto, fetching the ontology from the OBO library:
```bash
runoak -i obolibrary:go.obo viz GO:0005773
```
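
The three commands above differ only in the input selector and the optional predicate list, so a script can assemble them from parts. The sketch below builds the argument lists using only the options shown above (`-i`, `viz`, `-p`). `viz_command` is a hypothetical helper, and nothing is executed here.

```python
import shlex
from typing import List, Optional, Sequence


def viz_command(
    selector: str, term: str, predicates: Optional[Sequence[str]] = None
) -> List[str]:
    """Build a `runoak ... viz` argument list for the given input selector."""
    args = ["runoak", "-i", selector, "viz", term]
    if predicates:
        args += ["-p", ",".join(predicates)]
    return args


for selector in ("sqlite:tests/input/go-nucleus.db", "ubergraph:", "obolibrary:go.obo"):
    # shlex.join renders the list as a shell-safe command line for display.
    print(shlex.join(viz_command(selector, "GO:0005773")))
```

A list built this way can be passed to `subprocess.run` without going through a shell.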
## Configuration
OAK uses [`pystow`](https://github.com/cthoyt/pystow) for caching. By default,
cached files are stored under `~/.data/`; this location can be changed by following
[these instructions](https://github.com/cthoyt/pystow#%EF%B8%8F%EF%B8%8F-configuration).
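
As a rough guide to where cached files will land, the sketch below mirrors the lookup order described in the pystow documentation as we understand it: a `PYSTOW_HOME` environment variable overrides the base directory, and otherwise `PYSTOW_NAME` (default `.data`) names a folder in the home directory. Treat this as an approximation and check the linked instructions for the authoritative rules; `expected_cache_root` is a hypothetical helper that does not call pystow.

```python
from pathlib import Path
from typing import Mapping


def expected_cache_root(env: Mapping[str, str], home: Path) -> Path:
    """Approximate pystow's base directory from environment settings."""
    if env.get("PYSTOW_HOME"):
        # An explicit base directory takes precedence.
        return Path(env["PYSTOW_HOME"])
    # Otherwise use a named folder inside the home directory.
    return home / env.get("PYSTOW_NAME", ".data")


print(expected_cache_root({}, Path.home()))
```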
Raw data
{
"_id": null,
"home_page": null,
"name": "oaklib",
"maintainer": null,
"docs_url": null,
"requires_python": "<4.0.0,>=3.9",
"maintainer_email": null,
"keywords": null,
"author": "cmungall",
"author_email": "cjm@berkeleybop.org",
"download_url": "https://files.pythonhosted.org/packages/d2/04/ccad06a0ed69ad6285cef0c52dc091c520fc5df409943f50c00a05ed300c/oaklib-0.6.6.tar.gz",
"platform": null,
"bugtrack_url": null,
"license": null,
"summary": "Ontology Access Kit: Python library for common ontology operations over a variety of backends",
"version": "0.6.6",
"project_urls": null,
"split_keywords": [],
"urls": [
{
"comment_text": "",
"digests": {
"blake2b_256": "b658d0f49c56fed2af7d7e973bc614156579ae2094286db59ccc9ee14f4bb2df",
"md5": "c4804ec56875ecf17db0d829bfcc8c7e",
"sha256": "416cd1625c741a555b9109167aba6aacac49b75600882185c412d73ea9b1034e"
},
"downloads": -1,
"filename": "oaklib-0.6.6-py3-none-any.whl",
"has_sig": false,
"md5_digest": "c4804ec56875ecf17db0d829bfcc8c7e",
"packagetype": "bdist_wheel",
"python_version": "py3",
"requires_python": "<4.0.0,>=3.9",
"size": 638550,
"upload_time": "2024-05-09T23:39:59",
"upload_time_iso_8601": "2024-05-09T23:39:59.285409Z",
"url": "https://files.pythonhosted.org/packages/b6/58/d0f49c56fed2af7d7e973bc614156579ae2094286db59ccc9ee14f4bb2df/oaklib-0.6.6-py3-none-any.whl",
"yanked": false,
"yanked_reason": null
},
{
"comment_text": "",
"digests": {
"blake2b_256": "d204ccad06a0ed69ad6285cef0c52dc091c520fc5df409943f50c00a05ed300c",
"md5": "07c2fb6fa464baf094157b86844fbdbc",
"sha256": "e5cc7afa337695cbdbc554115e67c817a959179faa94fb64fb4dbf00c04f147c"
},
"downloads": -1,
"filename": "oaklib-0.6.6.tar.gz",
"has_sig": false,
"md5_digest": "07c2fb6fa464baf094157b86844fbdbc",
"packagetype": "sdist",
"python_version": "source",
"requires_python": "<4.0.0,>=3.9",
"size": 509273,
"upload_time": "2024-05-09T23:40:03",
"upload_time_iso_8601": "2024-05-09T23:40:03.764918Z",
"url": "https://files.pythonhosted.org/packages/d2/04/ccad06a0ed69ad6285cef0c52dc091c520fc5df409943f50c00a05ed300c/oaklib-0.6.6.tar.gz",
"yanked": false,
"yanked_reason": null
}
],
"upload_time": "2024-05-09 23:40:03",
"github": false,
"gitlab": false,
"bitbucket": false,
"codeberg": false,
"lcname": "oaklib"
}