Name | adeft |
Version | 0.12.2 |
home_page | None |
Summary | Acromine based Disambiguation of Entities From Text |
upload_time | 2024-05-10 14:34:34 |
maintainer | None |
docs_url | None |
author | None |
requires_python | >=3.7 |
license | BSD 2-Clause License Copyright (c) 2018, INDRA Labs All rights reserved. Redistribution and use in source and binary forms, with or without modification, are permitted provided that the following conditions are met: * Redistributions of source code must retain the above copyright notice, this list of conditions and the following disclaimer. * Redistributions in binary form must reproduce the above copyright notice, this list of conditions and the following disclaimer in the documentation and/or other materials provided with the distribution. THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT HOLDER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE. |
keywords | nlp, biology, disambiguation |
requirements |
No requirements were recorded.
|
# Adeft
[![DOI](https://joss.theoj.org/papers/10.21105/joss.01708/status.svg)](https://doi.org/10.21105/joss.01708)
[![DOI](https://zenodo.org/badge/156276061.svg)](https://zenodo.org/badge/latestdoi/156276061)
[![License](https://img.shields.io/badge/License-BSD%202--Clause-orange.svg)](https://opensource.org/licenses/BSD-2-Clause)
[![Tests](https://github.com/indralab/adeft/actions/workflows/tests.yml/badge.svg)](https://github.com/indralab/adeft/actions/workflows/tests.yml)
[![Documentation](https://readthedocs.org/projects/adeft/badge/?version=latest)](https://adeft.readthedocs.io/en/latest/?badge=latest)
[![PyPI version](https://badge.fury.io/py/adeft.svg)](https://badge.fury.io/py/adeft)
[![Python 3](https://img.shields.io/pypi/pyversions/adeft.svg)](https://www.python.org/downloads/release/python-357/)
Adeft (Acromine based Disambiguation of Entities From Text context) is a
utility for building models to disambiguate acronyms and other abbreviations of
biological terms in the scientific literature. It makes use of an
implementation of the [Acromine](http://www.chokkan.org/research/acromine/)
algorithm developed by the [NaCTeM](http://www.nactem.ac.uk/index.php) at the
University of Manchester to identify possible longform expansions for
shortforms in a text corpus. It allows users to build disambiguation models to
disambiguate shortforms based on their text context. A growing number of
pretrained disambiguation models are publicly available to download through
Adeft.
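To make the idea of finding longform expansions for a shortform concrete, here is a minimal sketch. It is a toy heuristic, not the Acromine algorithm and not Adeft's implementation: it only counts the word sequences that directly precede a parenthesized shortform. The function name `candidate_longforms`, its `max_words` parameter, and the sample sentences are all made up for illustration.

```python
import re
from collections import Counter


def candidate_longforms(texts, shortform, max_words=4):
    """Count word sequences that directly precede "(shortform)" in the texts.

    Toy heuristic for illustration only; this is not the Acromine algorithm.
    """
    pattern = re.compile(r"\(\s*" + re.escape(shortform) + r"\s*\)")
    counts = Counter()
    for text in texts:
        for match in pattern.finditer(text):
            # Words that appear before the parenthesized shortform.
            preceding = re.findall(r"[\w-]+", text[:match.start()])
            # Every suffix of up to max_words preceding words is a candidate.
            for n in range(1, min(max_words, len(preceding)) + 1):
                counts[" ".join(preceding[-n:]).lower()] += 1
    return counts


# Invented example sentences.
corpus = [
    "Stress in the endoplasmic reticulum (ER) activates this pathway.",
    "Expression of the estrogen receptor (ER) was measured.",
    "Proteins fold in the endoplasmic reticulum (ER).",
]
counts = candidate_longforms(corpus, "ER")
print(counts.most_common(5))
```

Candidates that recur across many texts are more plausible longforms than one-off word sequences. A real system still has to rank these candidates and discard fragments such as "the endoplasmic reticulum".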
#### Citation
If you use Adeft in your research, please cite the paper in the Journal of
Open Source Software:
Steppi A, Gyori BM, Bachman JA (2020). Adeft: Acromine-based Disambiguation of
Entities from Text with applications to the biomedical literature. *Journal of
Open Source Software,* 5(45), 1708, https://doi.org/10.21105/joss.01708
## Installation
Adeft requires Python 3.7 or above. It is available on PyPI and can be installed with the command

    $ pip install adeft

Adeft's pretrained machine learning models can then be downloaded with the command

    $ python -m adeft.download

If you choose to install by cloning this repository

    $ git clone https://github.com/indralab/adeft.git

you should also run

    $ python setup.py build_ext --inplace

at the top level of your local repository in order to build the extension module
for alignment-based longform detection and scoring.
## Using Adeft
A dictionary of available models can be imported with `from adeft import available_models`.
The dictionary maps shortforms to model names. Multiple equivalent
shortforms can map to the same model.
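Because several shortforms can share one model, it can be handy to invert the mapping and see which shortforms each model covers. The sketch below uses a small stand-in dictionary with invented entries so that it runs without Adeft installed; `example_models` and `shortforms_by_model` are hypothetical names, not part of Adeft.

```python
from collections import defaultdict

# Hypothetical stand-in with the same shape as `available_models`
# (shortform -> model name). The entries are invented for illustration.
example_models = {
    "ER": "ER",
    "PCS": "PCS",
    "PC": "PCS",
}


def shortforms_by_model(models):
    """Invert a shortform -> model name mapping into model name -> shortforms."""
    grouped = defaultdict(list)
    for shortform, model_name in models.items():
        grouped[model_name].append(shortform)
    # Sort for a stable, readable result.
    return {name: sorted(sfs) for name, sfs in grouped.items()}


grouped = shortforms_by_model(example_models)
print(grouped)
```

With Adeft installed, passing `available_models` in place of `example_models` should behave the same way, assuming it is a plain dictionary as described above.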
Here's an example of running a disambiguator for ER on a list of texts:
```python
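from adeft.disambiguate import load_disambiguator

# Load the pretrained model for the shortform ER
er_dd = load_disambiguator('ER')

...  # define `texts`, a list of strings that contain the shortform

er_dd.disambiguate(texts)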
```
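The following sketch shows one way the results might be post-processed. It rests on an assumption that should be checked against the Adeft documentation: that `disambiguate` returns one tuple per text of the form (grounding, standard name, dictionary of probabilities per grounding). `StubDisambiguator`, `confident_groundings`, the `threshold` parameter, and the `EX:` identifiers are hypothetical and exist only so that the example runs without Adeft or its models.

```python
class StubDisambiguator:
    """Hypothetical stand-in so the sketch runs without Adeft or its models."""

    def disambiguate(self, texts):
        # Assumed result shape: (grounding, standard name, probabilities).
        # The identifiers and probabilities below are invented.
        results = []
        for text in texts:
            if "reticulum" in text:
                probs = {"EX:0001": 0.95, "EX:0002": 0.05}
            else:
                probs = {"EX:0001": 0.55, "EX:0002": 0.45}
            results.append(("EX:0001", "example entity", probs))
        return results


def confident_groundings(disambiguator, texts, threshold=0.8):
    """Return one grounding per text, or None when the top probability is low."""
    picks = []
    for grounding, _name, probs in disambiguator.disambiguate(texts):
        confident = probs.get(grounding, 0.0) >= threshold
        picks.append(grounding if confident else None)
    return picks


texts = [
    "Proteins fold in the endoplasmic reticulum (ER).",
    "ER levels were measured in all samples.",
]
picks = confident_groundings(StubDisambiguator(), texts)
print(picks)
```

If the assumed result shape holds, replacing `StubDisambiguator()` with the object returned by `load_disambiguator('ER')` should work unchanged. Filtering on a probability threshold is a design choice for cases where a wrong grounding is costlier than a missing one.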
Users may also build and train their own disambiguators. See the documentation
for more info.
## Documentation
Documentation is available at
[https://adeft.readthedocs.io](http://adeft.readthedocs.io)
Jupyter notebooks illustrating Adeft workflows are available under `notebooks`:
- [Introduction](notebooks/introduction.ipynb)
- [Model building](notebooks/model_building.ipynb)
## Testing
Adeft uses `pytest` for unit testing and GitHub Actions for
continuous integration. To run tests locally, first install
the test-specific requirements listed in `setup.py` with
```bash
pip install adeft[test]
```
and download all pre-trained models as shown above.
Then run `pytest` in the top-level `adeft` folder.
## Funding
Development of this software was supported by the Defense Advanced Research
Projects Agency under awards W911NF-18-1-0124 and W911NF-15-1-0544, and the
National Cancer Institute under award U54-CA225088.
Raw data
{
"_id": null,
"home_page": null,
"name": "adeft",
"maintainer": null,
"docs_url": null,
"requires_python": ">=3.7",
"maintainer_email": "adeft developers <albert.steppi@gmail.com>",
"keywords": "nlp, biology, disambiguation",
"author": null,
"author_email": null,
"download_url": "https://files.pythonhosted.org/packages/f0/6a/b89a90bbbb7089070c8eb86ab22af1d41f07d36c0e064a0f395c640e7bbb/adeft-0.12.2.tar.gz",
"platform": null,
"bugtrack_url": null,
"license": "BSD 2-Clause License Copyright (c) 2018, INDRA Labs All rights reserved. Redistribution and use in source and binary forms, with or without modification, are permitted provided that the following conditions are met: * Redistributions of source code must retain the above copyright notice, this list of conditions and the following disclaimer. * Redistributions in binary form must reproduce the above copyright notice, this list of conditions and the following disclaimer in the documentation and/or other materials provided with the distribution. THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS \"AS IS\" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT HOLDER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.",
"summary": "Acromine based Disambiguation of Entities From Text",
"version": "0.12.2",
"project_urls": null,
"split_keywords": [
"nlp",
" biology",
" disambiguation"
],
"urls": [
{
"comment_text": "",
"digests": {
"blake2b_256": "f06ab89a90bbbb7089070c8eb86ab22af1d41f07d36c0e064a0f395c640e7bbb",
"md5": "3af6742dafe33ca034b5a08f14e26bae",
"sha256": "b7e5cde8687f3787ad8f5d8d7d4dc8b57f25121bfa8bb2826efcfbb7e4853731"
},
"downloads": -1,
"filename": "adeft-0.12.2.tar.gz",
"has_sig": false,
"md5_digest": "3af6742dafe33ca034b5a08f14e26bae",
"packagetype": "sdist",
"python_version": "source",
"requires_python": ">=3.7",
"size": 176000,
"upload_time": "2024-05-10T14:34:34",
"upload_time_iso_8601": "2024-05-10T14:34:34.259376Z",
"url": "https://files.pythonhosted.org/packages/f0/6a/b89a90bbbb7089070c8eb86ab22af1d41f07d36c0e064a0f395c640e7bbb/adeft-0.12.2.tar.gz",
"yanked": false,
"yanked_reason": null
}
],
"upload_time": "2024-05-10 14:34:34",
"github": false,
"gitlab": false,
"bitbucket": false,
"codeberg": false,
"lcname": "adeft"
}