nxontology


- Name: nxontology
- Version: 0.5.0
- Summary: NetworkX for ontologies
- Upload time: 2023-02-28 20:02:30
- Requires Python: >=3.8
- License: Apache
- Keywords: networkx, ontologies, similarity, graphs, networks, digraph, information-content, semantic-similarity

# NetworkX-based Python library for representing ontologies

[![GitHub Actions CI Build Status](https://img.shields.io/github/actions/workflow/status/related-sciences/nxontology/build.yaml?branch=main&label=actions&style=for-the-badge&logo=github&logoColor=white)](https://github.com/related-sciences/nxontology/actions)  
[![Software License](https://img.shields.io/github/license/related-sciences/nxontology?style=for-the-badge&logo=Apache&logoColor=white)](https://github.com/related-sciences/nxontology/blob/main/LICENSE)  
[![Code style: black](https://img.shields.io/badge/code%20style-black-000000.svg?style=for-the-badge&logo=Python&logoColor=white)](https://github.com/psf/black)  
[![PyPI](https://img.shields.io/pypi/v/nxontology.svg?style=for-the-badge&logo=PyPI&logoColor=white)](https://pypi.org/project/nxontology/)  

## Summary

nxontology is a Python library for representing ontologies using a NetworkX graph.
Currently, the main area of functionality is computing similarity measures between pairs of nodes.

## Usage

Here, we'll use the example [metals ontology](https://jbiomedsem.biomedcentral.com/articles/10.1186/2041-1480-2-5/figures/1 "From Figure 1 of Disjunctive shared information between ontology concepts: application to Gene Ontology. Couto & Silva. 2011. Released under CC BY 2.0."):

![Metals ontology from Couto & Silva (2011)](https://raw.githubusercontent.com/related-sciences/nxontology/13de9d63ac9d08ffc1e25ee80e912c611b990473/media/metals.svg?sanitize=true)
<!-- use absolute URL instead of media/metals.svg for PyPI long_description -->

Note that `NXOntology` represents the ontology as a [`networkx.DiGraph`](https://networkx.org/documentation/stable/reference/classes/digraph.html), where edge direction goes from superterm to subterm.
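
To make the edge direction concrete, here is a minimal sketch that uses only the standard library rather than `networkx` or `NXOntology`. The edge list is our reading of the metals figure and is an assumption, as are the helper names. Because edges run from superterm to subterm, the ancestors of a node are found by walking the edges backwards.

```python
# Edges point from superterm to subterm.
# Assumed edge list for the metals example (not spelled out in this README).
EDGES = [
    ("metal", "precious"),
    ("metal", "coinage"),
    ("precious", "gold"),
    ("precious", "silver"),
    ("precious", "platinum"),
    ("precious", "palladium"),
    ("coinage", "gold"),
    ("coinage", "silver"),
    ("coinage", "copper"),
]

# Map each subterm to its direct superterms by reading the edges in reverse.
parents = {}
for superterm, subterm in EDGES:
    parents.setdefault(subterm, set()).add(superterm)


def ancestors(node):
    """Return the node itself plus every superterm reachable from it."""
    found = {node}
    stack = [node]
    while stack:
        for parent in parents.get(stack.pop(), ()):
            if parent not in found:
                found.add(parent)
                stack.append(parent)
    return found


common = ancestors("gold") & ancestors("silver")
union = ancestors("gold") | ancestors("silver")
print(sorted(common))  # ['coinage', 'metal', 'precious']
print(len(common), len(union))  # 3 5
```

Under this reading, gold and silver share three ancestors out of five in total, which agrees with the `n_common_ancestors` and `n_union_ancestors` values in the example output further down.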

Given an `NXOntology` instance, here's how to compute intrinsic similarity metrics.

```python
from nxontology.examples import create_metal_nxo
metals = create_metal_nxo()
# Freezing the ontology prevents adding or removing nodes or edges.
# Frozen ontologies cache expensive computations.
metals.freeze()
# Get an object for computing similarity, using the Sánchez et al. metric for information content.
similarity = metals.similarity("gold", "silver", ic_metric="intrinsic_ic_sanchez")
# Access a single similarity metric
similarity.lin
# Access all similarity metrics
similarity.results()
```

The final line outputs a dictionary like:

```python
{
    'node_0': 'gold',
    'node_1': 'silver',
    'node_0_subsumes_1': False,
    'node_1_subsumes_0': False,
    'n_common_ancestors': 3,
    'n_union_ancestors': 5,
    'batet': 0.6,
    'batet_log': 0.5693234419266069,
    'ic_metric': 'intrinsic_ic_sanchez',
    'mica': 'coinage',
    'resnik': 0.8754687373538999,
    'resnik_scaled': 0.48860840553061435,
    'lin': 0.5581154235118403, 
    'jiang': 0.41905978419640516,
    'jiang_seco': 0.6131471927654584,
}
```
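
The numeric values in this dictionary are related to one another. The sketch below recomputes them from three information content (IC) values using the textbook forms of the Resnik, Lin, and Jiang measures and the ancestor-count ratio. The IC inputs (`ln 4.8` for each query node, `ln 2.4` for the most informative common ancestor, `ln 6` as the maximum) are assumptions back-derived from the reported output, and the formulas are not guaranteed to be exactly what nxontology implements, although the numbers agree.

```python
import math

# Values copied from the results dictionary above.
reported = {
    "resnik": 0.8754687373538999,
    "resnik_scaled": 0.48860840553061435,
    "lin": 0.5581154235118403,
    "jiang": 0.41905978419640516,
    "jiang_seco": 0.6131471927654584,
    "batet": 0.6,
    "batet_log": 0.5693234419266069,
}

# Assumed IC inputs for this toy ontology (not stated in the README).
ic_gold = ic_silver = math.log(4.8)  # IC of each query node
ic_mica = math.log(2.4)  # IC of "coinage", the most informative common ancestor
ic_max = math.log(6)  # maximum IC, used for scaling
n_common, n_union = 3, 5  # ancestor counts from the dictionary above

ic_gap = ic_gold + ic_silver - 2 * ic_mica  # IC not shared by the two nodes

derived = {
    "resnik": ic_mica,
    "resnik_scaled": ic_mica / ic_max,
    "lin": 2 * ic_mica / (ic_gold + ic_silver),
    "jiang": 1 / (1 + ic_gap),
    "jiang_seco": 1 - ic_gap / (2 * ic_max),
    "batet": n_common / n_union,
    "batet_log": 1 - math.log(n_union - n_common) / math.log(n_union),
}

for name, value in derived.items():
    print(f"{name}: {value:.6f} (reported {reported[name]:.6f})")
```

Each derived value should match the reported one to within floating-point error, which is a quick way to sanity-check which definition of a metric is in play.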

It's also possible to visualize the similarity between two nodes like:

```python
from nxontology.viz import create_similarity_graphviz
gviz = create_similarity_graphviz(
    # similarity instance from above
    similarity,
    # show all nodes (defaults to union of ancestors)
    nodes=list(metals.graph),
)
# draw to PNG file
gviz.draw("metals-sim-gold-silver-all.png")
```

This results in the following figure:
<!-- from test output: cp nxontology/tests/viz_outputs/metals-sim-gold-silver-all.png media/ -->

![Metals ontology from Couto & Silva (2011) showing similarity between gold and silver](https://raw.githubusercontent.com/related-sciences/nxontology/13de9d63ac9d08ffc1e25ee80e912c611b990473/media/metals-sim-gold-silver-all.png)

The two query nodes (gold & silver) are outlined with a bold dashed line.
Node fill color corresponds to the Sánchez information content, such that darker nodes have higher IC.
The most informative common ancestor (coinage) is outlined with a bold solid line.
Nodes that are not an ancestor of gold or silver have an invisible outline.
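
For readers curious how the information content behind the fill colors can be computed, here is a rough standard-library sketch of an intrinsic IC in the style of Sánchez et al. (2011). The edge list, the helper names, and the convention that a leaf counts itself as one leaf are assumptions. They were chosen because they reproduce the `resnik` value and the `mica` shown earlier, not because they are taken from the nxontology source.

```python
import math

# Assumed edge list for the metals example, superterm -> subterm.
EDGES = [
    ("metal", "precious"),
    ("metal", "coinage"),
    ("precious", "gold"),
    ("precious", "silver"),
    ("precious", "platinum"),
    ("precious", "palladium"),
    ("coinage", "gold"),
    ("coinage", "silver"),
    ("coinage", "copper"),
]

children, parents = {}, {}
for superterm, subterm in EDGES:
    children.setdefault(superterm, set()).add(subterm)
    parents.setdefault(subterm, set()).add(superterm)
nodes = set(children) | set(parents)
leaves = {node for node in nodes if node not in children}


def reachable(start, neighbors):
    """Return start plus all nodes reachable through the neighbors mapping."""
    found, stack = {start}, [start]
    while stack:
        for nxt in neighbors.get(stack.pop(), ()):
            if nxt not in found:
                found.add(nxt)
                stack.append(nxt)
    return found


def sanchez_ic(node):
    """Intrinsic IC: -log((leaves(node) / ancestors(node) + 1) / (all leaves + 1))."""
    n_leaves = len(reachable(node, children) & leaves)  # a leaf counts itself
    n_ancestors = len(reachable(node, parents))  # includes the node itself
    return -math.log((n_leaves / n_ancestors + 1) / (len(leaves) + 1))


# The most informative common ancestor is the shared ancestor with the highest IC.
common = reachable("gold", parents) & reachable("silver", parents)
mica = max(common, key=sanchez_ic)
print(mica, round(sanchez_ic(mica), 4))  # coinage 0.8755
```

The root (metal) gets an IC of zero, while specific terms such as gold get the highest values, which is why they appear darkest in the figure.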

### Loading ontologies

nxontology loads ontology files with the pronto library, which supports reading ontologies from the following file formats:

1. [Open Biomedical Ontologies 1.4](http://owlcollab.github.io/oboformat/doc/GO.format.obo-1_4.html): `.obo` extension, uses the [fastobo](https://github.com/fastobo/fastobo-py) parser.
2. [OBO Graphs JSON](https://github.com/geneontology/obographs): `.json` extension, uses the fastobo parser.
3. [Web Ontology Language 2 (OWL 2) RDF/XML](https://www.w3.org/TR/owl2-overview/): `.owl` extension, uses the pronto `RdfXMLParser`.

The files can be local or at a network location (URL starting with https, http, or ftp).
Pronto detects and handles gzip, bzip2, and xz compression.

Here are example operations on the Gene Ontology,
using pronto to load the ontology:

```python
>>> from nxontology.imports import from_file
>>> # versioned URL for the Gene Ontology
>>> url = "http://release.geneontology.org/2021-02-01/ontology/go-basic.json.gz"
>>> nxo = from_file(url)
>>> nxo.n_nodes
44085
>>> # similarity between "myelination" and "neurogenesis"
>>> sim = nxo.similarity("GO:0042552", "GO:0022008")
>>> round(sim.lin, 2)
0.21
>>> import networkx as nx
>>> # Gene Ontology domains are disconnected, expect 3 components
>>> nx.number_weakly_connected_components(nxo.graph)
3
>>> # Note however that the default from_file reader only uses "is a" relationships.
>>> # We can preserve all GO relationship types as follows
>>> from collections import Counter
>>> import pronto
>>> from nxontology import NXOntology
>>> from nxontology.imports import pronto_to_multidigraph, multidigraph_to_digraph
>>> go_pronto = pronto.Ontology(handle=url)
>>> go_multidigraph = pronto_to_multidigraph(go_pronto)
>>> Counter(key for _, _, key in go_multidigraph.edges(keys=True))
Counter({'is a': 71509,
         'part of': 7187,
         'regulates': 3216,
         'negatively regulates': 2768,
         'positively regulates': 2756})
>>> go_digraph = multidigraph_to_digraph(go_multidigraph, reduce=True)
>>> go_nxo = NXOntology(go_digraph)
>>> # Notice the similarity increases due to the full set of edges
>>> round(go_nxo.similarity("GO:0042552", "GO:0022008").lin, 3)
0.699
>>> # Note that there is also a dedicated reader for the Gene Ontology
>>> from nxontology.imports import read_gene_ontology
>>> read_gene_ontology(release="2021-02-01")
```

Users can also create their own `networkx.DiGraph` to use this package.

### Prebuilt Ontologies

The [nxontology-data](https://github.com/related-sciences/nxontology-data) repository creates NXOntology objects for many popular ontologies / taxonomies.

## Installation

nxontology can be installed with `pip` from [PyPI](https://pypi.org/project/nxontology/) like:

```shell
# standard installation
pip install nxontology

# installation with viz extras
pip install nxontology[viz]
```

The extra `viz` dependencies are required for the `nxontology.viz` module.
This includes [pygraphviz](https://pygraphviz.github.io/), which requires a pre-existing [graphviz](https://graphviz.org/) installation.

## Development

Some helpful development commands:

```shell
# create a virtual environment for development
python3 -m venv .venv

# activate virtual environment
source .venv/bin/activate

# install package for development
pip install --editable ".[dev,viz]"

# Set up the git pre-commit hooks.
# `git commit` will now trigger automatic checks including linting.
pre-commit install

# Run all pre-commit checks (CI will also run this).
pre-commit run --all

# run tests
pytest
```

Releases are created on [GitHub](https://github.com/related-sciences/nxontology/releases).
The [release action](https://github.com/related-sciences/nxontology/actions/workflows/release.yaml) defined by [`release.yaml`](https://github.com/related-sciences/nxontology/blob/main/.github/workflows/release.yaml) will build the distribution and upload to [PyPI](https://pypi.org/project/nxontology/).
The package version is automatically generated from the git tag by [`setuptools_scm`](https://github.com/pypa/setuptools_scm).

## Bibliography

Here's a list of alternative projects with code for computing semantic similarity measures on ontologies:

- [Semantic Measures Library & ToolKit](https://www.semantic-measures-library.org/sml/) at [sharispe/slib](https://github.com/sharispe/slib) in Java.
- [DiShIn](http://labs.rd.ciencias.ulisboa.pt/dishin/) at [lasigeBioTM/DiShIn](https://github.com/lasigeBioTM/DiShIn) in Python.
- [Sematch](http://sematch.gsi.upm.es/) at [gsi-upm/sematch](https://github.com/gsi-upm/sematch) in Python.
- [ontologySimilarity](https://rdrr.io/cran/ontologySimilarity/) mirrored at [cran/ontologySimilarity](https://github.com/cran/ontologySimilarity). Part of the [ontologyX](https://doi.org/10.1093/bioinformatics/btw763 "ontologyX: a suite of R packages for working with ontological data") suite of R packages. 
- Materials for Machine Learning with Ontologies at [bio-ontology-research-group/machine-learning-with-ontologies](https://github.com/bio-ontology-research-group/machine-learning-with-ontologies) (compilation)

Below is a list of references related to ontology-derived measures of similarity.
Feel free to add any reference that provides useful context and details for algorithms supported by this package.
Metadata for a reference can be generated like `manubot cite --yml doi:10.1016/j.jbi.2011.03.013`.
Adding CSL YAML output to `media/bibliography.yaml` will cache the metadata and allow manual edits in case of errors.

<!--
# code to generate references (uses cached metadata in bibliography.yaml if available)
manubot cite \
  --md \
  --bibliography=media/bibliography.yaml \
  doi:10.1371/journal.pcbi.1000443 \
  url:https://citeseerx.ist.psu.edu/viewdoc/summary?doi=10.1.1.1065.1695 \
  doi:10.1186/1471-2105-9-S5-S4 \
  doi:10.1093/bib/bbaa199 \
  doi:10.1613/jair.514 \
  https://api.semanticscholar.org/CorpusID:5659557 \
  doi:10.1093/bioinformatics/btw763 \
  https://dl.acm.org/doi/10.5555/1862330.1862343 \
  doi:10.1186/2041-1480-2-5 \
  doi:10.1016/j.jbi.2013.11.006 \
  doi:10.5772/intechopen.89032 \
  doi:10.1016/j.jbi.2010.09.002 \
  doi:10.1186/1471-2105-13-261 \
  doi:10.1016/j.jbi.2011.03.013 \
  doi:10.1016/j.knosys.2010.10.001 \
  doi:10.1007/s10462-019-09725-4 \
  doi:10.1002/asi.24021


```bash
# future code to render references with pandoc > 2.11
pandoc \
  --citeproc \
  --metadata=nocite:\'@*\' \
  --csl=https://citation-style.manubot.org \
  --bibliography=media/bibliography.yaml \
  --wrap=none \
  --to=markdown_strict-raw_html <<< ""
```
-->

1. **Semantic Similarity in Biomedical Ontologies**   
Catia Pesquita, Daniel Faria, André O. Falcão, Phillip Lord, Francisco M. Couto  
*PLoS Computational Biology* (2009-07-31) <https://doi.org/cx8h87>   
DOI: [10.1371/journal.pcbi.1000443](https://doi.org/10.1371/journal.pcbi.1000443) · PMID: [19649320](https://www.ncbi.nlm.nih.gov/pubmed/19649320) · PMCID: [PMC2712090](https://www.ncbi.nlm.nih.gov/pmc/articles/PMC2712090)

2. **An Intrinsic Information Content Metric for Semantic Similarity in WordNet.**   
Nuno Seco, Tony Veale, Jer Hayes  
*In Proceedings of the 16th European Conference on Artificial Intelligence (ECAI-04),* (2004) <https://citeseerx.ist.psu.edu/viewdoc/summary?doi=10.1.1.1065.1695>

3. **Metrics for GO based protein semantic similarity: a systematic evaluation**   
Catia Pesquita, Daniel Faria, Hugo Bastos, António EN Ferreira, André O Falcão, Francisco M Couto  
*BMC Bioinformatics* (2008-04-29) <https://doi.org/cmcgw6>   
DOI: [10.1186/1471-2105-9-s5-s4](https://doi.org/10.1186/1471-2105-9-s5-s4) · PMID: [18460186](https://www.ncbi.nlm.nih.gov/pubmed/18460186) · PMCID: [PMC2367622](https://www.ncbi.nlm.nih.gov/pmc/articles/PMC2367622)

4. **Semantic similarity and machine learning with ontologies**   
Maxat Kulmanov, Fatima Zohra Smaili, Xin Gao, Robert Hoehndorf  
*Briefings in Bioinformatics* (2020-10-13) <https://doi.org/ghfqkt>   
DOI: [10.1093/bib/bbaa199](https://doi.org/10.1093/bib/bbaa199) · PMID: [33049044](https://www.ncbi.nlm.nih.gov/pubmed/33049044)

5. **Semantic Similarity in a Taxonomy: An Information-Based Measure and its Application to Problems of Ambiguity in Natural Language**   
P. Resnik  
*Journal of Artificial Intelligence Research* (1999-07-01) <https://doi.org/gftcpz>   
DOI: [10.1613/jair.514](https://doi.org/10.1613/jair.514)

6. **An Information-Theoretic Definition of Similarity**   
Dekang Lin  
*ICML* (1998) <https://api.semanticscholar.org/CorpusID:5659557>

7. **ontologyX: a suite of R packages for working with ontological data**   
Daniel Greene, Sylvia Richardson, Ernest Turro  
*Bioinformatics* (2017-01-05) <https://doi.org/f9k7sx>   
DOI: [10.1093/bioinformatics/btw763](https://doi.org/10.1093/bioinformatics/btw763) · PMID: [28062448](https://www.ncbi.nlm.nih.gov/pubmed/28062448) · PMCID: [PMC5386138](https://www.ncbi.nlm.nih.gov/pmc/articles/PMC5386138)

8. **Metric of intrinsic information content for measuring semantic similarity in an ontology**   
Md. Hanif Seddiqui, Masaki Aono  
*Proceedings of the Seventh Asia-Pacific Conference on Conceptual Modelling - Volume 110* (2010-01-01) <https://dl.acm.org/doi/10.5555/1862330.1862343>   
ISBN: [9781920682927](https://worldcat.org/isbn/9781920682927)

9. **Disjunctive shared information between ontology concepts: application to Gene Ontology**   
Francisco M Couto, Mário J Silva  
*Journal of Biomedical Semantics* (2011) <https://doi.org/fnb73v>   
DOI: [10.1186/2041-1480-2-5](https://doi.org/10.1186/2041-1480-2-5) · PMID: [21884591](https://www.ncbi.nlm.nih.gov/pubmed/21884591) · PMCID: [PMC3200982](https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3200982)

10. **A framework for unifying ontology-based semantic similarity measures: A study in the biomedical domain**   
Sébastien Harispe, David Sánchez, Sylvie Ranwez, Stefan Janaqi, Jacky Montmain  
*Journal of Biomedical Informatics* (2014-04) <https://doi.org/f52557>   
DOI: [10.1016/j.jbi.2013.11.006](https://doi.org/10.1016/j.jbi.2013.11.006) · PMID: [24269894](https://www.ncbi.nlm.nih.gov/pubmed/24269894)

11. **Semantic Similarity in Cheminformatics**   
João D. Ferreira, Francisco M. Couto  
*IntechOpen* (2020-07-15) <https://doi.org/ghh2d4>   
DOI: [10.5772/intechopen.89032](https://doi.org/10.5772/intechopen.89032)

12. **An ontology-based measure to compute semantic similarity in biomedicine**   
Montserrat Batet, David Sánchez, Aida Valls  
*Journal of Biomedical Informatics* (2011-02) <https://doi.org/dfhkjv>   
DOI: [10.1016/j.jbi.2010.09.002](https://doi.org/10.1016/j.jbi.2010.09.002) · PMID: [20837160](https://www.ncbi.nlm.nih.gov/pubmed/20837160)

13. **Semantic similarity in the biomedical domain: an evaluation across knowledge sources**   
Vijay N Garla, Cynthia Brandt  
*BMC Bioinformatics* (2012-10-10) <https://doi.org/gb8vpn>   
DOI: [10.1186/1471-2105-13-261](https://doi.org/10.1186/1471-2105-13-261) · PMID: [23046094](https://www.ncbi.nlm.nih.gov/pubmed/23046094) · PMCID: [PMC3533586](https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3533586)

14. **Semantic similarity estimation in the biomedical domain: An ontology-based information-theoretic perspective**   
David Sánchez, Montserrat Batet  
*Journal of Biomedical Informatics* (2011-10) <https://doi.org/d2436q>   
DOI: [10.1016/j.jbi.2011.03.013](https://doi.org/10.1016/j.jbi.2011.03.013) · PMID: [21463704](https://www.ncbi.nlm.nih.gov/pubmed/21463704)

15. **Ontology-based information content computation**   
David Sánchez, Montserrat Batet, David Isern  
*Knowledge-Based Systems* (2011-03) <https://doi.org/cwzw4r>   
DOI: [10.1016/j.knosys.2010.10.001](https://doi.org/10.1016/j.knosys.2010.10.001)

16. **Leveraging synonymy and polysemy to improve semantic similarity assessments based on intrinsic information content**   
Montserrat Batet, David Sánchez  
*Artificial Intelligence Review* (2019-06-03) <https://doi.org/ghnfmt>   
DOI: [10.1007/s10462-019-09725-4](https://doi.org/10.1007/s10462-019-09725-4)

17. **An intrinsic information content-based semantic similarity measure considering the disjoint common subsumers of concepts of an ontology**   
Abhijit Adhikari, Biswanath Dutta, Animesh Dutta, Deepjyoti Mondal, Shivang Singh  
*Journal of the Association for Information Science and Technology* (2018-08) <https://doi.org/gd2j5b>   
DOI: [10.1002/asi.24021](https://doi.org/10.1002/asi.24021)

            

Raw data

            {
    "_id": null,
    "home_page": "",
    "name": "nxontology",
    "maintainer": "",
    "docs_url": null,
    "requires_python": ">=3.8",
    "maintainer_email": "Daniel Himmelstein <dhimmel@related.vc>",
    "keywords": "networkx,ontologies,similarity,graphs,networks,digraph,information-content,semantic-similarity",
    "author": "",
    "author_email": "",
    "download_url": "https://files.pythonhosted.org/packages/d5/44/a5f52f698a580982b3afc53f6c14edb64d27acc07e78ad12aa1d9ee8be6e/nxontology-0.5.0.tar.gz",
    "platform": null,
    "description": "# NetworkX-based Python library for representing ontologies\n\n[![GitHub Actions CI Build Status](https://img.shields.io/github/actions/workflow/status/related-sciences/nxontology/build.yaml?branch=main&label=actions&style=for-the-badge&logo=github&logoColor=white)](https://github.com/related-sciences/nxontology/actions)  \n[![Software License](https://img.shields.io/github/license/related-sciences/nxontology?style=for-the-badge&logo=Apache&logoColor=white)](https://github.com/related-sciences/nxontology/blob/main/LICENSE)  \n[![Code style: black](https://img.shields.io/badge/code%20style-black-000000.svg?style=for-the-badge&logo=Python&logoColor=white)](https://github.com/psf/black)  \n[![PyPI](https://img.shields.io/pypi/v/nxontology.svg?style=for-the-badge&logo=PyPI&logoColor=white)](https://pypi.org/project/nxontology/)  \n\n## Summary\n\nnxontology is a Python library for representing ontologies using a NetworkX graph.\nCurrently, the main area of functionality is computing similarity measures between pairs of nodes.\n\n## Usage\n\nHere, we'll use the example [metals ontology](https://jbiomedsem.biomedcentral.com/articles/10.1186/2041-1480-2-5/figures/1 \"From Figure 1 of Disjunctive shared information between ontology concepts: application to Gene Ontology. Couto & Silva. 2011. 
Released under CC BY 2.0.\"):\n\n![Metals ontology from Couto & Silva (2011)](https://raw.githubusercontent.com/related-sciences/nxontology/13de9d63ac9d08ffc1e25ee80e912c611b990473/media/metals.svg?sanitize=true)\n<!-- use absolute URL instead of media/metals.svg for PyPI long_description -->\n\nNote that `NXOntology` represents the ontology as a [`networkx.DiGraph`](https://networkx.org/documentation/stable/reference/classes/digraph.html), where edge direction goes from superterm to subterm.\n\nGiven an `NXOntology` instance, here how to compute intrinsic similarity metrics.\n\n```python\nfrom nxontology.examples import create_metal_nxo\nmetals = create_metal_nxo()\n# Freezing the ontology prevents adding or removing nodes or edges.\n# Frozen ontologies cache expensive computations.\nmetals.freeze()\n# Get object for computing similarity, using the Sanchez et al metric for information content.\nsimilarity = metals.similarity(\"gold\", \"silver\", ic_metric=\"intrinsic_ic_sanchez\")\n# Access a single similarity metric\nsimilarity.lin\n# Access all similarity metrics\nsimilarity.results()\n```\n\nThe final line outputs a dictionary like:\n\n```python\n{\n    'node_0': 'gold',\n    'node_1': 'silver',\n    'node_0_subsumes_1': False,\n    'node_1_subsumes_0': False,\n    'n_common_ancestors': 3,\n    'n_union_ancestors': 5,\n    'batet': 0.6,\n    'batet_log': 0.5693234419266069,\n    'ic_metric': 'intrinsic_ic_sanchez',\n    'mica': 'coinage',\n    'resnik': 0.8754687373538999,\n    'resnik_scaled': 0.48860840553061435,\n    'lin': 0.5581154235118403, \n    'jiang': 0.41905978419640516,\n    'jiang_seco': 0.6131471927654584,\n}\n```\n\nIt's also possible to visualize the similarity between two nodes like:\n\n```python\nfrom nxontology.viz import create_similarity_graphviz\ngviz = create_similarity_graphviz(\n    # similarity instance from above\n    similarity,\n    # show all nodes (defaults to union of ancestors)\n    nodes=list(metals.graph),\n)\n# draw to PNG 
file\ngviz.draw(\"metals-sim-gold-silver-all.png\"))\n```\n\nResulting in the following figure:\n<!-- from test output: cp nxontology/tests/viz_outputs/metals-sim-gold-silver-all.png media/ -->\n\n![Metals ontology from Couto & Silva (2011) showing similarity between gold and silver](https://raw.githubusercontent.com/related-sciences/nxontology/13de9d63ac9d08ffc1e25ee80e912c611b990473/media/metals-sim-gold-silver-all.png)\n\nThe two query nodes (gold & silver) are outlined with a bold dashed line.\nNode fill color corresponds to the S\u00e1nchez information content, such that darker nodes have higher IC.\nThe most informative common ancestor (coinage) is outlined with a bold solid line.\nNodes that are not an ancestor of gold or silver have an invisible outline.\n\n### Loading ontologies\n\nPronto supports reading ontologies from the following file formats:\n\n1. [Open Biomedical Ontologies 1.4](http://owlcollab.github.io/oboformat/doc/GO.format.obo-1_4.html): `.obo` extension, uses the [fastobo](https://github.com/fastobo/fastobo-py) parser.\n2. [OBO Graphs JSON](https://github.com/geneontology/obographs): `.json` extension, uses the fastobo parser.\n3. 
[Ontology Web Language 2 RDF/XML](https://www.w3.org/TR/owl2-overview/): `.owl` extension, uses the pronto `RdfXMLParser`.\n\nThe files can be local or at a network location (URL starting with https, http, or ftp).\nPronto detects and handles gzip, bzip2, and xz compression.\n\nHere are examples operations on the Gene Ontology,\nusing pronto to load the ontology:\n\n```python\n>>> from nxontology.imports import from_file\n>>> # versioned URL for the Gene Ontology\n>>> url = \"http://release.geneontology.org/2021-02-01/ontology/go-basic.json.gz\"\n>>> nxo = from_file(url)\n>>> nxo.n_nodes\n44085\n>>> # similarity between \"myelination\" and \"neurogenesis\"\n>>> sim = nxo.similarity(\"GO:0042552\", \"GO:0022008\")\n>>> round(sim.lin, 2)\n0.21\n>>> import networkx as nx\n>>> # Gene Ontology domains are disconnected, expect 3 components\n>>> nx.number_weakly_connected_components(nxo.graph)\n3\n>>> # Note however that the default from_file reader only uses \"is a\" relationships.\n>>> # We can preserve all GO relationship types as follows\n>>> from collections import Counter\n>>> import pronto\n>>> from nxontology import NXOntology\n>>> from nxontology.imports import pronto_to_multidigraph, multidigraph_to_digraph\n>>> go_pronto = pronto.Ontology(handle=url)\n>>> go_multidigraph = pronto_to_multidigraph(go_pronto)\n>>> Counter(key for _, _, key in go_multidigraph.edges(keys=True))\nCounter({'is a': 71509,\n         'part of': 7187,\n         'regulates': 3216,\n         'negatively regulates': 2768,\n         'positively regulates': 2756})\n>>> go_digraph = multidigraph_to_digraph(go_multidigraph, reduce=True)\n>>> go_nxo = NXOntology(go_digraph)\n>>> # Notice the similarity increases due to the full set of edges\n>>> round(go_nxo.similarity(\"GO:0042552\", \"GO:0022008\").lin, 3)\n0.699\n>>> # Note that there is also a dedicated reader for the Gene Ontology\n>>> from nxontology.imports import read_gene_ontology\n>>> 
read_gene_ontology(release=\"2021-02-01\")\n```\n\nUsers can also create their own `networkx.DiGraph` to use this package.\n\n### Prebuilt Ontologies\n\nThe [nxontology-data](https://github.com/related-sciences/nxontology-data) repository creates NXOntology objects for many popular ontologies / taxonomies.\n\n## Installation\n\nnxontology can be installed with `pip` from [PyPI](https://pypi.org/project/nxontology/) like:\n\n```shell\n# standard installation\npip install nxontology\n\n# installation with viz extras\npip install nxontology[viz]\n```\n\nThe extra `viz` dependencies are required for the `nxontology.viz` module.\nThis includes [pygraphviz](https://pygraphviz.github.io/), which requires a pre-existing [graphviz](https://graphviz.org/) installation.\n\n## Development\n\nSome helpful development commands:\n\n```shell\n# create a virtual environment for development\npython3 -m venv .venv\n\n# activate virtual environment\nsource .venv/bin/activate\n\n# install package for development\npip install --editable \".[dev,viz]\"\n\n# Set up the git pre-commit hooks.\n# `git commit` will now trigger automatic checks including linting.\npre-commit install\n\n# Run all pre-commit checks (CI will also run this).\npre-commit run --all\n\n# run tests\npytest\n```\n\nReleases are created on [GitHub](https://github.com/related-sciences/nxontology/releases).\nThe [release action](https://github.com/related-sciences/nxontology/actions/workflows/release.yaml) defined by [`release.yaml`](https://github.com/related-sciences/nxontology/blob/main/.github/workflows/release.yaml) will build the distribution and upload to [PyPI](https://pypi.org/project/nxontology/).\nThe package version is automatically generated from the git tag by [`setuptools_scm`](https://github.com/pypa/setuptools_scm).\n\n## Bibliography\n\nHere's a list of alternative projects with code for computing semantic similarity measures on ontologies:\n\n- [Semantic Measures Library & 
ToolKit](https://www.semantic-measures-library.org/sml/) at [sharispe/slib](https://github.com/sharispe/slib) in Java.\n- [DiShIn](http://labs.rd.ciencias.ulisboa.pt/dishin/) at [lasigeBioTM/DiShIn](https://github.com/lasigeBioTM/DiShIn) in Python.\n- [Sematch](http://sematch.gsi.upm.es/) at [gsi-upm/sematch](https://github.com/gsi-upm/sematch) in Python.\n- [ontologySimilarity](https://rdrr.io/cran/ontologySimilarity/) mirrored at [cran/ontologySimilarity](https://github.com/cran/ontologySimilarity). Part of the [ontologyX](https://doi.org/10.1093/bioinformatics/btw763 \"ontologyX: a suite of R packages for working with ontological data\") suite of R packages. \n- Materials for Machine Learning with Ontologies at [bio-ontology-research-group/machine-learning-with-ontologies](https://github.com/bio-ontology-research-group/machine-learning-with-ontologies) (compilation)\n\nBelow are a list of references related to ontology-derived measures of similarity.\nFeel free to add any reference that provides useful context and details for algorithms supported by this package.\nMetadata for a reference can be generated like `manubot cite --yml doi:10.1016/j.jbi.2011.03.013`.\nAdding CSL YAML output to `media/bibliography.yaml` will cache the metadata and allow manual edits in case of errors.\n\n<!--\n# code to generate references (uses cached metadata in bibliography.yaml if available)\nmanubot cite \\\n  --md \\\n  --bibliography=media/bibliography.yaml \\\n  doi:10.1371/journal.pcbi.1000443 \\\n  url:https://citeseerx.ist.psu.edu/viewdoc/summary?doi=10.1.1.1065.1695 \\\n  doi:10.1186/1471-2105-9-S5-S4 \\\n  doi:10.1093/bib/bbaa199 \\\n  doi:10.1613/jair.514 \\\n  https://api.semanticscholar.org/CorpusID:5659557 \\\n  doi:10.1093/bioinformatics/btw763 \\\n  https://dl.acm.org/doi/10.5555/1862330.1862343 \\\n  doi:10.1186/2041-1480-2-5 \\\n  doi:10.1016/j.jbi.2013.11.006 \\\n  doi:10.5772/intechopen.89032 \\\n  doi:10.1016/j.jbi.2010.09.002 \\\n  doi:10.1186/1471-2105-13-261 
\\\n  doi:10.1016/j.jbi.2011.03.013 \\\n  doi:10.1016/j.knosys.2010.10.001 \\\n  doi:10.1007/s10462-019-09725-4 \\\n  doi:10.1002/asi.24021\n\n\n```bash\n# future code to render references with pandoc > 2.11\npandoc \\\n  --citeproc \\\n  --metadata=nocite:\\'@*\\' \\\n  --csl=https://citation-style.manubot.org \\\n  --bibliography=media/bibliography.yaml \\\n  --wrap=none \\\n  --to=markdown_strict-raw_html <<< \"\"\n```\n-->\n\n1. **Semantic Similarity in Biomedical Ontologies**   \nCatia Pesquita, Daniel Faria, Andr\u00e9 O. Falc\u00e3o, Phillip Lord, Francisco M. Couto  \n*PLoS Computational Biology* (2009-07-31) <https://doi.org/cx8h87>   \nDOI: [10.1371/journal.pcbi.1000443](https://doi.org/10.1371/journal.pcbi.1000443) \u00b7 PMID: [19649320](https://www.ncbi.nlm.nih.gov/pubmed/19649320) \u00b7 PMCID: [PMC2712090](https://www.ncbi.nlm.nih.gov/pmc/articles/PMC2712090)\n\n2. **An Intrinsic Information Content Metric for Semantic Similarity in WordNet.**   \nNuno Seco, Tony Veale, Jer Hayes  \n*In Proceedings of the 16th European Conference on Artificial Intelligence (ECAI-04),* (2004) <https://citeseerx.ist.psu.edu/viewdoc/summary?doi=10.1.1.1065.1695>\n\n3. **Metrics for GO based protein semantic similarity: a systematic evaluation**   \nCatia Pesquita, Daniel Faria, Hugo Bastos, Ant\u00f3nio EN Ferreira, Andr\u00e9 O Falc\u00e3o, Francisco M Couto  \n*BMC Bioinformatics* (2008-04-29) <https://doi.org/cmcgw6>   \nDOI: [10.1186/1471-2105-9-s5-s4](https://doi.org/10.1186/1471-2105-9-s5-s4) \u00b7 PMID: [18460186](https://www.ncbi.nlm.nih.gov/pubmed/18460186) \u00b7 PMCID: [PMC2367622](https://www.ncbi.nlm.nih.gov/pmc/articles/PMC2367622)\n\n4. 
**Semantic similarity and machine learning with ontologies**   \nMaxat Kulmanov, Fatima Zohra Smaili, Xin Gao, Robert Hoehndorf  \n*Briefings in Bioinformatics* (2020-10-13) <https://doi.org/ghfqkt>   \nDOI: [10.1093/bib/bbaa199](https://doi.org/10.1093/bib/bbaa199) \u00b7 PMID: [33049044](https://www.ncbi.nlm.nih.gov/pubmed/33049044)\n\n5. **Semantic Similarity in a Taxonomy: An Information-Based Measure and its Application to Problems of Ambiguity in Natural Language**   \nP. Resnik  \n*Journal of Artificial Intelligence Research* (1999-07-01) <https://doi.org/gftcpz>   \nDOI: [10.1613/jair.514](https://doi.org/10.1613/jair.514)\n\n6. **An Information-Theoretic Definition of Similarity**   \nDekang Lin  \n*ICML* (1998) <https://api.semanticscholar.org/CorpusID:5659557>\n\n7. **ontologyX: a suite of R packages for working with ontological data**   \nDaniel Greene, Sylvia Richardson, Ernest Turro  \n*Bioinformatics* (2017-01-05) <https://doi.org/f9k7sx>   \nDOI: [10.1093/bioinformatics/btw763](https://doi.org/10.1093/bioinformatics/btw763) \u00b7 PMID: [28062448](https://www.ncbi.nlm.nih.gov/pubmed/28062448) \u00b7 PMCID: [PMC5386138](https://www.ncbi.nlm.nih.gov/pmc/articles/PMC5386138)\n\n8. **Metric of intrinsic information content for measuring semantic similarity in an ontology**   \nMd. Hanif Seddiqui, Masaki Aono  \n*Proceedings of the Seventh Asia-Pacific Conference on Conceptual Modelling - Volume 110* (2010-01-01) <https://dl.acm.org/doi/10.5555/1862330.1862343>   \nISBN: [9781920682927](https://worldcat.org/isbn/9781920682927)\n\n9. 
**Disjunctive shared information between ontology concepts: application to Gene Ontology**   \nFrancisco M Couto, M\u00e1rio J Silva  \n*Journal of Biomedical Semantics* (2011) <https://doi.org/fnb73v>   \nDOI: [10.1186/2041-1480-2-5](https://doi.org/10.1186/2041-1480-2-5) \u00b7 PMID: [21884591](https://www.ncbi.nlm.nih.gov/pubmed/21884591) \u00b7 PMCID: [PMC3200982](https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3200982)\n\n10. **A framework for unifying ontology-based semantic similarity measures: A study in the biomedical domain**   \nS\u00e9bastien Harispe, David S\u00e1nchez, Sylvie Ranwez, Stefan Janaqi, Jacky Montmain  \n*Journal of Biomedical Informatics* (2014-04) <https://doi.org/f52557>   \nDOI: [10.1016/j.jbi.2013.11.006](https://doi.org/10.1016/j.jbi.2013.11.006) \u00b7 PMID: [24269894](https://www.ncbi.nlm.nih.gov/pubmed/24269894)\n\n11. **Semantic Similarity in Cheminformatics**   \nJo\u00e3o D. Ferreira, Francisco M. Couto  \n*IntechOpen* (2020-07-15) <https://doi.org/ghh2d4>   \nDOI: [10.5772/intechopen.89032](https://doi.org/10.5772/intechopen.89032)\n\n12. **An ontology-based measure to compute semantic similarity in biomedicine**   \nMontserrat Batet, David S\u00e1nchez, Aida Valls  \n*Journal of Biomedical Informatics* (2011-02) <https://doi.org/dfhkjv>   \nDOI: [10.1016/j.jbi.2010.09.002](https://doi.org/10.1016/j.jbi.2010.09.002) \u00b7 PMID: [20837160](https://www.ncbi.nlm.nih.gov/pubmed/20837160)\n\n13. **Semantic similarity in the biomedical domain: an evaluation across knowledge sources**   \nVijay N Garla, Cynthia Brandt  \n*BMC Bioinformatics* (2012-10-10) <https://doi.org/gb8vpn>   \nDOI: [10.1186/1471-2105-13-261](https://doi.org/10.1186/1471-2105-13-261) \u00b7 PMID: [23046094](https://www.ncbi.nlm.nih.gov/pubmed/23046094) \u00b7 PMCID: [PMC3533586](https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3533586)\n\n14. 
**Semantic similarity estimation in the biomedical domain: An ontology-based information-theoretic perspective**   \nDavid S\u00e1nchez, Montserrat Batet  \n*Journal of Biomedical Informatics* (2011-10) <https://doi.org/d2436q>   \nDOI: [10.1016/j.jbi.2011.03.013](https://doi.org/10.1016/j.jbi.2011.03.013) \u00b7 PMID: [21463704](https://www.ncbi.nlm.nih.gov/pubmed/21463704)\n\n15. **Ontology-based information content computation**   \nDavid S\u00e1nchez, Montserrat Batet, David Isern  \n*Knowledge-Based Systems* (2011-03) <https://doi.org/cwzw4r>   \nDOI: [10.1016/j.knosys.2010.10.001](https://doi.org/10.1016/j.knosys.2010.10.001)\n\n16. **Leveraging synonymy and polysemy to improve semantic similarity assessments based on intrinsic information content**   \nMontserrat Batet, David S\u00e1nchez  \n*Artificial Intelligence Review* (2019-06-03) <https://doi.org/ghnfmt>   \nDOI: [10.1007/s10462-019-09725-4](https://doi.org/10.1007/s10462-019-09725-4)\n\n17. **An intrinsic information content-based semantic similarity measure considering the disjoint common subsumers of concepts of an ontology**   \nAbhijit Adhikari, Biswanath Dutta, Animesh Dutta, Deepjyoti Mondal, Shivang Singh  \n*Journal of the Association for Information Science and Technology* (2018-08) <https://doi.org/gd2j5b>   \nDOI: [10.1002/asi.24021](https://doi.org/10.1002/asi.24021)\n",
    "bugtrack_url": null,
    "license": "Apache",
    "summary": "NetworkX for ontologies",
    "version": "0.5.0",
    "split_keywords": [
        "networkx",
        "ontologies",
        "similarity",
        "graphs",
        "networks",
        "digraph",
        "information-content",
        "semantic-similarity"
    ],
    "urls": [
        {
            "comment_text": "",
            "digests": {
                "blake2b_256": "00c0a7e608c1e86c4164468e6e969e330a5c47f0f7393ef7dabdfc91cc8b36cd",
                "md5": "b6321c129b7338146a4cee09f0c30971",
                "sha256": "c95d2d47c2ce8c0bb48e839a3bdc70e489c13f4579b6e88b8d93a2be87c2c07f"
            },
            "downloads": -1,
            "filename": "nxontology-0.5.0-py3-none-any.whl",
            "has_sig": false,
            "md5_digest": "b6321c129b7338146a4cee09f0c30971",
            "packagetype": "bdist_wheel",
            "python_version": "py3",
            "requires_python": ">=3.8",
            "size": 38698,
            "upload_time": "2023-02-28T20:02:28",
            "upload_time_iso_8601": "2023-02-28T20:02:28.407884Z",
            "url": "https://files.pythonhosted.org/packages/00/c0/a7e608c1e86c4164468e6e969e330a5c47f0f7393ef7dabdfc91cc8b36cd/nxontology-0.5.0-py3-none-any.whl",
            "yanked": false,
            "yanked_reason": null
        },
        {
            "comment_text": "",
            "digests": {
                "blake2b_256": "d544a5f52f698a580982b3afc53f6c14edb64d27acc07e78ad12aa1d9ee8be6e",
                "md5": "72c6a8d6b594fd86bccbdd41d998e968",
                "sha256": "727712c2bdbcaf7113f0012aa66bb9097ddef375f604939fb279b0871aed2c2f"
            },
            "downloads": -1,
            "filename": "nxontology-0.5.0.tar.gz",
            "has_sig": false,
            "md5_digest": "72c6a8d6b594fd86bccbdd41d998e968",
            "packagetype": "sdist",
            "python_version": "source",
            "requires_python": ">=3.8",
            "size": 135145,
            "upload_time": "2023-02-28T20:02:30",
            "upload_time_iso_8601": "2023-02-28T20:02:30.825411Z",
            "url": "https://files.pythonhosted.org/packages/d5/44/a5f52f698a580982b3afc53f6c14edb64d27acc07e78ad12aa1d9ee8be6e/nxontology-0.5.0.tar.gz",
            "yanked": false,
            "yanked_reason": null
        }
    ],
    "upload_time": "2023-02-28 20:02:30",
    "github": false,
    "gitlab": false,
    "bitbucket": false,
    "lcname": "nxontology"
}
        