# obonet: load OBO-formatted ontologies into networkx
[![GitHub Actions CI Build Status](https://img.shields.io/github/workflow/status/dhimmel/obonet/Build?label=actions&style=for-the-badge&logo=github&logoColor=white)](https://github.com/dhimmel/obonet/actions)
[![Software License](https://img.shields.io/pypi/l/obonet?style=for-the-badge&logo=FreeBSD&logoColor=white)](https://github.com/dhimmel/obonet/blob/main/LICENSE)
[![Code style: black](https://img.shields.io/badge/code%20style-black-000000.svg?style=for-the-badge&logo=Python&logoColor=white)](https://github.com/psf/black)
[![PyPI](https://img.shields.io/pypi/v/obonet.svg?style=for-the-badge&logo=PyPI&logoColor=white)](https://pypi.org/project/obonet/)
Read OBO-formatted ontologies in Python.
`obonet` is
+ user friendly
+ no nonsense
+ pythonic
+ modern
+ simple and tested
+ lightweight
+ [`networkx`](https://networkx.readthedocs.io/en/stable/overview.html) leveraging

This Python package loads OBO-serialized ontologies into networks.
The function `obonet.read_obo()` takes an `.obo` file and returns a [`networkx.MultiDiGraph`](https://networkx.github.io/documentation/stable/reference/classes/multigraph.html) representation of the ontology.
The parser was designed for versions [1.2](https://owlcollab.github.io/oboformat/doc/GO.format.obo-1_2.html) and [1.4](https://owlcollab.github.io/oboformat/doc/GO.format.obo-1_4.html) of the OBO specification.
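
To make the file-to-graph mapping concrete, the sketch below reads a tiny, made-up OBO snippet with only the standard library and collects the nodes and edges that a graph of the ontology would hold. It illustrates the data model and is not obonet's parser: the helper name `parse_terms` and the `EX:` terms are assumptions, and only the `id`, `name`, and `is_a` tags are handled. Edges are recorded from subterm to superterm and labeled with the relationship type.

```python
# Illustrative only: a toy reader for [Term] stanzas, not obonet's parser.
SAMPLE_OBO = """\
format-version: 1.2

[Term]
id: EX:0000001
name: root rank

[Term]
id: EX:0000002
name: species
is_a: EX:0000001 ! root rank
"""


def parse_terms(text):
    """Return (nodes, edges) built from the [Term] stanzas of an OBO string."""
    nodes = {}  # term ID -> attribute dict
    edges = []  # (subterm, superterm, relationship type)
    term_id = None
    in_term = False
    for raw in text.splitlines():
        line = raw.strip()
        if line.startswith("["):
            # A new stanza begins; only [Term] stanzas are tracked here.
            in_term = line == "[Term]"
            term_id = None
            continue
        if not in_term or ": " not in line:
            continue
        tag, value = line.split(": ", 1)
        value = value.split(" ! ")[0].strip()  # drop a trailing "! comment"
        if tag == "id":
            term_id = value
            nodes[term_id] = {}
        elif term_id is None:
            continue
        elif tag == "name":
            nodes[term_id]["name"] = value
        elif tag == "is_a":
            edges.append((term_id, value, "is_a"))
    return nodes, edges


nodes, edges = parse_terms(SAMPLE_OBO)
print(nodes)
print(edges)
```

In practice `obonet.read_obo()` does this work and retains far more metadata; the sketch only shows why a subterm's outgoing edges lead toward more general terms.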
## Usage
See [`setup.cfg`](setup.cfg) for the minimum Python version required and the dependencies.
OBO files can be read from a path, URL, or open file handle.
Compression is inferred from the path's extension.
See example usage below:
```python
import networkx
import obonet
# Read the taxrank ontology
url = 'https://github.com/dhimmel/obonet/raw/main/tests/data/taxrank.obo'
graph = obonet.read_obo(url)
# Or read the xz-compressed taxrank ontology
url = 'https://github.com/dhimmel/obonet/raw/main/tests/data/taxrank.obo.xz'
graph = obonet.read_obo(url)
# Number of nodes
len(graph)
# Number of edges
graph.number_of_edges()
# Check if the ontology is a DAG
networkx.is_directed_acyclic_graph(graph)
# Mapping from term ID to name
id_to_name = {id_: data.get('name') for id_, data in graph.nodes(data=True)}
id_to_name['TAXRANK:0000006'] # TAXRANK:0000006 is species
# Find all superterms of species. Note that networkx.descendants gets
# superterms, while networkx.ancestors returns subterms.
networkx.descendants(graph, 'TAXRANK:0000006')
```
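
The idea of inferring compression from a path's extension can be illustrated with a standard-library sketch. The helper `open_by_extension` and its suffix table are hypothetical and are not taken from obonet's source, so the extensions obonet actually recognizes may differ. The sketch only shows the general technique of choosing an opener from the file suffix.

```python
import bz2
import gzip
import lzma
import tempfile
from pathlib import Path

# Hypothetical suffix table; obonet's own list of extensions may differ.
OPENERS = {".gz": gzip.open, ".bz2": bz2.open, ".xz": lzma.open}


def open_by_extension(path):
    """Open a text file, picking a decompressor from the path's suffix."""
    opener = OPENERS.get(Path(path).suffix, open)
    return opener(path, mode="rt", encoding="utf-8")


with tempfile.TemporaryDirectory() as tmp:
    plain = Path(tmp) / "example.obo"
    packed = Path(tmp) / "example.obo.xz"
    text = "format-version: 1.2\n"
    plain.write_text(text, encoding="utf-8")
    with lzma.open(packed, mode="wt", encoding="utf-8") as handle:
        handle.write(text)

    # Both files are read through the same helper.
    contents = []
    for path in (plain, packed):
        with open_by_extension(path) as handle:
            contents.append(handle.read())

print(contents[0] == contents[1])
```

With obonet itself, passing a path or URL that ends in a compression extension to `obonet.read_obo()` is enough, as the `.xz` example above shows.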
For a more detailed tutorial, see the [**Gene Ontology example notebook**](https://github.com/dhimmel/obonet/blob/main/examples/go-obonet.ipynb).
## Comparison
This package specializes in reading OBO files into a `networkx.MultiDiGraph`.
A more general ontology-to-NetworkX reader is available in the Python [nxontology package](https://github.com/related-sciences/nxontology) via the `nxontology.imports.pronto_to_multidigraph` function.
This function takes a `pronto.Ontology` object,
which can be loaded from an OBO file, OBO Graphs JSON file, or Ontology Web Language 2 RDF/XML file (OWL).
Using `pronto_to_multidigraph` allows creating a MultiDiGraph similar to the one created by `obonet`,
with some differences in the amount of metadata retained.
The primary focus of the `nxontology` package is to provide an `NXOntology` class for representing ontologies based around a `networkx.DiGraph`.
NXOntology provides optimized implementations for computing node similarity and other intrinsic ontology metrics.
There are two important differences between a DiGraph for NXOntology and the MultiDiGraph produced by obonet:
1. NXOntology is based on a DiGraph that does not allow multiple edges between the same two nodes.
Multiple edges between the same two nodes must therefore be collapsed.
By default, it only considers _is a_ / `rdfs:subClassOf` relationships,
but using `pronto_to_multidigraph` to create the NXOntology allows for retaining additional relationship types, like _part of_ in the case of the Gene Ontology.
2. NXOntology reverses the direction of relationships so edges go from superterm to subterm.
Traditionally in ontologies, the _is a_ relationships go from subterm to superterm,
but this orientation can be confusing because graph functions such as _descendants_ then return more general concepts.
NXOntology reverses edges so functions such as _ancestors_ refer to more general concepts and _descendants_ refer to more specific concepts.

The `nxontology.imports.multidigraph_to_digraph` function converts from a MultiDiGraph, like the one produced by obonet, to a DiGraph by filtering to the desired relationship types, reversing edges, and collapsing parallel edges.
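
The three steps of that conversion can be sketched with plain networkx calls. This is a simplified illustration and not the `nxontology` implementation: the function name `to_reversed_digraph`, the toy node names, and the `part_of` edge are assumptions made for the example.

```python
import networkx as nx

# Toy MultiDiGraph shaped like obonet output: edges run from subterm to
# superterm and are keyed by relationship type.
multi = nx.MultiDiGraph()
multi.add_edge("species", "rank", key="is_a")
multi.add_edge("subspecies", "species", key="is_a")
multi.add_edge("subspecies", "species", key="part_of")  # parallel edge


def to_reversed_digraph(graph, keep=("is_a",)):
    """Filter by relationship type, reverse edges, collapse parallel edges."""
    digraph = nx.DiGraph()
    digraph.add_nodes_from(graph.nodes(data=True))
    for subterm, superterm, key in graph.edges(keys=True):
        if key in keep:
            # A DiGraph stores at most one edge per ordered node pair, so
            # repeated additions collapse into a single edge.
            digraph.add_edge(superterm, subterm)
    return digraph


digraph = to_reversed_digraph(multi, keep=("is_a", "part_of"))
print(digraph.number_of_edges())
print(sorted(nx.descendants(digraph, "rank")))
print(sorted(nx.ancestors(digraph, "subspecies")))
```

After the reversal, the descendants of `rank` are the more specific terms and the ancestors of `subspecies` are the more general ones.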
## Installation
The recommended approach is to install the latest release from [PyPI](https://pypi.org/project/obonet/) using:
```sh
pip install obonet
```
However, if you'd like to install the most recent version from GitHub, use:
```sh
pip install git+https://github.com/dhimmel/obonet.git#egg=obonet
```
## Contributing
[![GitHub issues](https://img.shields.io/github/issues/dhimmel/obonet.svg?style=for-the-badge)](https://github.com/dhimmel/obonet/issues)
We welcome feature suggestions and community contributions.
Currently, only reading OBO files is supported.
Please open an issue if you're interested in writing OBO files in Python.
## Develop
Some development commands:
```bash
# create virtual environment
python3 -m venv ./env
# activate virtual environment
source env/bin/activate
# editable installation for development
pip install --editable ".[dev]"
# install pre-commit hooks
pre-commit install
# run all pre-commit checks
pre-commit run --all-files
# run tests
pytest
# generate changelog for release notes
git fetch --tags origin main
OLD_TAG=$(git describe --tags --abbrev=0)
git log --oneline --decorate=no --reverse $OLD_TAG..HEAD
```
Maintainers can make a new release at <https://github.com/dhimmel/obonet/releases/new>.
Raw data
{
"_id": null,
"home_page": "https://github.com/dhimmel/obonet",
"name": "obonet",
"maintainer": "",
"docs_url": null,
"requires_python": ">=3.7",
"maintainer_email": "",
"keywords": "obo,ontology,networkx,parser,network",
"author": "Daniel Himmelstein",
"author_email": "daniel.himmelstein@gmail.com",
"download_url": "https://files.pythonhosted.org/packages/3f/8c/5a35474cd573d658ebe588211013b237b1b3089a9d2b9617969b7e9d4b86/obonet-0.3.1.tar.gz",
"platform": null,
"bugtrack_url": null,
"license": "BSD-2-Clause-Patent",
"summary": "Parse OBO formatted ontologies into networkx",
"version": "0.3.1",
"split_keywords": [
"obo",
"ontology",
"networkx",
"parser",
"network"
],
"urls": [
{
"comment_text": "",
"digests": {
"md5": "7e64c3507ee9b94935bd30c6f96300d4",
"sha256": "a4fe5ee83cc165dfb613153abf08b469732bcb24a65a00067d3686864ed4a8f4"
},
"downloads": -1,
"filename": "obonet-0.3.1-py3-none-any.whl",
"has_sig": false,
"md5_digest": "7e64c3507ee9b94935bd30c6f96300d4",
"packagetype": "bdist_wheel",
"python_version": "py3",
"requires_python": ">=3.7",
"size": 8589,
"upload_time": "2022-11-10T12:42:51",
"upload_time_iso_8601": "2022-11-10T12:42:51.349378Z",
"url": "https://files.pythonhosted.org/packages/9a/14/1bcf986000f46c53619c5eba489cb7946f12c7469f85bc0c72add962c245/obonet-0.3.1-py3-none-any.whl",
"yanked": false,
"yanked_reason": null
},
{
"comment_text": "",
"digests": {
"md5": "b2f1e69de56b93024f1f9a68e4bce0c6",
"sha256": "442dc810f3f914858457006030d7da3880b2566b1e03278d5e7802b6ea3ed27c"
},
"downloads": -1,
"filename": "obonet-0.3.1.tar.gz",
"has_sig": false,
"md5_digest": "b2f1e69de56b93024f1f9a68e4bce0c6",
"packagetype": "sdist",
"python_version": "source",
"requires_python": ">=3.7",
"size": 24654,
"upload_time": "2022-11-10T12:42:53",
"upload_time_iso_8601": "2022-11-10T12:42:53.048359Z",
"url": "https://files.pythonhosted.org/packages/3f/8c/5a35474cd573d658ebe588211013b237b1b3089a9d2b9617969b7e9d4b86/obonet-0.3.1.tar.gz",
"yanked": false,
"yanked_reason": null
}
],
"upload_time": "2022-11-10 12:42:53",
"github": true,
"gitlab": false,
"bitbucket": false,
"github_user": "dhimmel",
"github_project": "obonet",
"travis_ci": false,
"coveralls": false,
"github_actions": true,
"lcname": "obonet"
}