# pyGNparser
[![PyPI version](https://img.shields.io/pypi/v/pygnparser.svg)](https://pypi.python.org/pypi/pygnparser) [![Python workflow status](https://github.com/gnames/pygnparser/workflows/Python/badge.svg)](https://github.com/gnames/pygnparser/actions?query=workflow%3APython)
This is a Python wrapper for the [GNparser](https://parser.globalnames.org/) API. The code follows the spirit and approach of the [pygbif](https://github.com/gbif/pygbif) package, and much of the wrapping utility is copied 1:1 from that repo. Thanks to [@sckott](https://github.com/sckott) and the other [contributors](https://github.com/gbif/pygbif/graphs/contributors).
## Installation
Add this line to your application's requirements.txt:
```
pygnparser
```
And then execute:

    $ pip install -r requirements.txt

Or install it yourself as:

    $ pip install pygnparser

## Usage
Import the library:
```python
from pygnparser import gnparser
```
If you have a local installation of gnparser, set the `GNPARSER_BASE_URL` environment variable to the URL (host and port) of the running service. For example, if it is running locally on port 8787:
```python
GNPARSER_BASE_URL = "http://localhost:8787/"
```
If the `GNPARSER_BASE_URL` environment variable is not set, the wrapper defaults to the remote API at https://parser.globalnames.org/, which is slower than a local service.
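Because the text above describes `GNPARSER_BASE_URL` as an environment variable, one way to set it from inside a Python script is through `os.environ`. This is a minimal sketch; whether the wrapper reads the variable at import time or on each call is an assumption not confirmed here, so the variable is set before the import to cover both cases.

```python
import os

# Point the wrapper at a local gnparser service (port 8787 is the example
# used above). Setting it before importing pygnparser is a precaution:
# when the wrapper reads the variable is an assumption, not documented here.
os.environ["GNPARSER_BASE_URL"] = "http://localhost:8787/"

# Import only after the variable is set:
# from pygnparser import gnparser
```

Exporting the variable in the shell before starting Python should have the same effect.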
---
### Parse a scientific name
Parse a scientific name:
```python
>>> result = gnparser('Ursus arctos Linnaeus, 1758') # => Dictionary
```
Check if parsed:
```python
>>> result.parsed() # => Boolean
True
```
Get [parsed quality](https://github.com/gnames/gnparser#figuring-out-if-names-are-well-formed):
```python
>>> result.quality() # => Integer
1
```
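The integer returned by `quality()` is easier to act on with a label attached. The sketch below maps the values to short descriptions; the labels paraphrase the quality scale in the linked GNparser documentation (0 for unparsed, 1 for no problems, higher numbers for increasingly serious problems) and should be checked against it. The helper `describe_quality` is hypothetical and not part of pyGNparser.

```python
# Hypothetical helper, not part of pyGNparser. The labels are an
# assumption paraphrased from the GNparser quality scale.
QUALITY_LABELS = {
    0: "not parsed",
    1: "clean",
    2: "minor problems",
    3: "serious problems",
    4: "severe problems",
}


def describe_quality(quality: int) -> str:
    """Return a short label for a GNparser quality value."""
    return QUALITY_LABELS.get(quality, "unknown")


# Example with the value shown above for 'Ursus arctos Linnaeus, 1758'.
label = describe_quality(1)
```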
Get the genus name:
```python
>>> result.genus() # => String
'Ursus'
```
Get the species name:
```python
>>> result.species() # => String
'arctos'
```
Get the year:
```python
>>> result.year() # => String
'1758'
```
Get the authorship:
```python
>>> result.authorship() # => String
'Linnaeus, 1758'
```
Get the stemmed canonical name (the scientific name with the Latin gender suffix removed):
```python
>>> result.canonical_stemmed() # => String
'Ursus arct'
```
Get the parsed name components for a hybrid formula:
```python
>>> result = gnparser('Isoetes lacustris × stricta Gay') # => Dictionary
>>> result.is_hybrid() # => Boolean
True
>>> result.hybrid() # => String
'HYBRID_FORMULA'
>>> result.normalized() # => String
'Isoetes lacustris × Isoetes stricta Gay'
>>> result.hybrid_formula_ranks() # => Array
['species', 'species']
>>> result.hybrid_formula_genera() # => Array
['Isoetes', 'Isoetes']
>>> result.hybrid_formula_species() # => Array
['lacustris', 'stricta']
>>> result.hybrid_formula_authorship() # => Array
['', 'Gay']
```
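The three `hybrid_formula_*` lists are parallel, with one entry per parent in the formula, so they can be zipped into one record per parent. This is a small sketch using the literal lists from the output above; the helper name `hybrid_parents` is hypothetical and not part of pyGNparser.

```python
def hybrid_parents(genera, species, authorships):
    """Combine the parallel hybrid-formula lists into one dict per parent."""
    return [
        {"genus": g, "species": s, "authorship": a}
        for g, s, a in zip(genera, species, authorships)
    ]


# Lists copied from the example output for 'Isoetes lacustris × stricta Gay'.
parents = hybrid_parents(
    ["Isoetes", "Isoetes"],
    ["lacustris", "stricta"],
    ["", "Gay"],
)
```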
Parse a scientific name under a specified nomenclatural code:
```python
>>> result = gnparser('Malus domestica \'Fuji\'', code='cultivar')
>>> result.is_cultivar() # => Boolean
True
>>> result.genus() # => String
'Malus'
>>> result.species() # => String
'domestica'
>>> result.cultivar() # => String
'‘Fuji’'
>>> result.nomenclatural_code() # => String
'ICNCP'
```
---
### Parse multiple scientific names
Parse multiple scientific names by separating them with `\r\n`:
```python
>>> results = gnparser('Ursus arctos Linnaeus, 1758\r\nAlces alces (Linnaeus, 1758)\r\nRangifer tarandus (Linnaeus, 1758)\r\nUrsus maritimus (Phipps, 1774)') # => Array
```
Get the genus of the 1st parsed name in the list:
```python
>>> results[0].genus() # => String
'Ursus'
```
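When the names already live in a Python list, the `\r\n`-separated input string can be built with `str.join`. A minimal sketch follows; the helper `join_names` is hypothetical, and dropping blank entries is a design choice of this sketch rather than a requirement of the API.

```python
def join_names(names):
    """Build a single input string with one name per CRLF-separated line."""
    # Strip surrounding whitespace and skip empty entries so the batch
    # contains only real name strings.
    cleaned = [name.strip() for name in names if name.strip()]
    return "\r\n".join(cleaned)


batch = join_names([
    "Ursus arctos Linnaeus, 1758",
    "Alces alces (Linnaeus, 1758)",
    "",
    "Rangifer tarandus (Linnaeus, 1758)",
])
# The resulting string could then be passed to gnparser(batch).
```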
---
## Deviations
Some extra helpers are included that extend the functionality of GNparser.
1) The page() method gets the page number out of the unparsed tail:
```python
>>> result = gnparser('Ursus arctos Linnaeus, 1758: 81')
>>> result.page() # => String
'81'
```
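To illustrate the idea of reading a page number out of an unparsed tail, here is a rough sketch with a regular expression. It is not the wrapper's actual implementation; the function name `extract_page` and the pattern (a colon followed by digits) are assumptions made for this example.

```python
import re


def extract_page(tail):
    """Return the digits following a colon in the tail, or None if absent."""
    # Assumed tail shape: ': 81' left over after 'Linnaeus, 1758'.
    match = re.search(r":\s*(\d+)", tail)
    return match.group(1) if match else None


page = extract_page(": 81")
```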
2) The authorship() method returns a formatted authorship string that depends on the number of authors. One author with a year is returned as `Smith, 1970`; two authors and a year as `Smith & Johnson, 1970`; three authors as `Smith, Johnson & Jones, 1970`. With more than three authors, the names are comma-separated and the last author is joined with an ampersand.
```python
>>> result = gnparser('Aus bus cus Smith, Johnson, & Jones, 1970')
>>> result.authorship() # => String
'Smith, Johnson & Jones, 1970'
```
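The formatting rule for authorship strings can be written out in a few lines of plain Python. This is a sketch of the rule as described in the text, not the wrapper's own code; the function `format_authorship` and its signature are hypothetical.

```python
def format_authorship(authors, year=None):
    """Join authors with commas, an ampersand before the last, then the year."""
    if not authors:
        names = ""
    elif len(authors) == 1:
        names = authors[0]
    else:
        # All but the last author are comma-separated; the last one is
        # attached with an ampersand.
        names = ", ".join(authors[:-1]) + " & " + authors[-1]
    if year:
        return f"{names}, {year}" if names else str(year)
    return names


three = format_authorship(["Smith", "Johnson", "Jones"], "1970")
four = format_authorship(["Smith", "Johnson", "Jones", "Lee"], "1970")
```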
3) The infraspecies() method returns the infraspecies name. Currently there are no special methods for ranks below the trinomial level, but you can access them with the infraspecies_details() method. Please [open an issue](https://github.com/gnames/pygnparser/issues/new) if you need one added.
```python
>>> result = gnparser('Aus bus cus')
>>> result.infraspecies() # => String
'cus'
```
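4) Older versions of GNparser normalized authorships like `Smith in Jones, 1999` to `Smith ex Jones 1999`. In the Python wrapper, that behavior could be overridden by setting the `preserve_in_authorship` parameter to `True` when calling the `authorship()`, `authorship_normalized()`, `combination_authorship()`, `original_authorship()`, or `normalized()` functions. Note that the changelog entry for 0.0.5 records the parameter as removed from `authorship()` because GNparser no longer normalizes `in` to `ex`, so the example here reflects the earlier behavior.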
```python
>>> result = gnparser('Aus bus Smith in Jones, 1999')
>>> result.normalized() # => String
'Aus bus Smith ex Jones 1999'
>>> result.normalized(preserve_in_authorship=True) # => String
'Aus bus Smith in Jones 1999'
```
* If the verbatim authorship contains `ex`, setting `preserve_in_authorship` to `True` will not change `ex` to `in`:
```python
>>> result = gnparser('Aus bus Smith ex Jones, 1999')
>>> result.normalized(preserve_in_authorship=True) # => String
'Aus bus Smith ex Jones 1999'
```
---
## Other GNparser Libraries
* Node.js: [node-gnparser](https://github.com/amazingplants/node-gnparser)
* R: [rgnparser](https://github.com/ropensci/rgnparser)
* Ruby Gem: [biodiversity](https://github.com/GlobalNamesArchitecture/biodiversity)
---
## Contributing
Bug reports and pull requests are welcome on GitHub at https://github.com/gnames/pygnparser. This project is intended to be a safe, welcoming space for collaboration, and contributors are expected to adhere to the [code of conduct](https://github.com/gnames/pygnparser/blob/main/CODE_OF_CONDUCT.md).
---
## Development
After checking out the repo, change into the package directory (`cd pygnparser`), run `pip install .` to install the package, and run `pip install -r requirements.txt` to install the dependencies. Then run `pytest` to execute the tests. You can also run `bin/console` for an interactive Python prompt that lets you experiment with the example commands above.
---
## License
The package is available as open source under the terms of the [MIT](https://github.com/gnames/pygnparser/blob/main/LICENSE.txt) license. You can learn more about the MIT license on [Wikipedia](https://en.wikipedia.org/wiki/MIT_License) and compare it with other open source licenses at the [Open Source Initiative](https://opensource.org/license/mit/).
---
## Code of Conduct
Everyone interacting in the pyGNparser project's codebases, issue trackers, chat rooms and mailing lists is expected to follow the [code of conduct](https://github.com/gnames/pygnparser/blob/main/CODE_OF_CONDUCT.md).
## [Unreleased]
## [0.0.5] - 2024-11-11
- Added nomenclatural code parameter
- Added result methods for cultivars
- Disabled et_al_cutoff formatting by default
- Removed preserve_in_authorship parameter from authorship() because GNparser no longer normalizes `in` to `ex`
## [0.0.4] - 2024-10-15
- Added preserve_in_authorship parameter to authorship() to optionally override normalization of `in` to `ex`
## [0.0.3] - 2024-10-14
- Added uninomial method
## [0.0.2] - 2024-10-10
- Added named hybrid and hybrid formula handling
- Added original authorship and combination authorship handling
- Added et_al_cutoff parameter to authorship formatting
## [0.0.1] - 2024-03-27
- Initial release
Raw data
{
"_id": null,
"home_page": "http://github.com/gnames/pyGNparser",
"name": "pyGNparser",
"maintainer": null,
"docs_url": null,
"requires_python": null,
"maintainer_email": null,
"keywords": "biodiversity, scientific names, parser, nomenclature, taxonomy, API, web-services, species, natural history, taxonomists, biologists, Global Names",
"author": "Geoff Ower",
"author_email": "gdower@illinois.edu",
"download_url": "https://files.pythonhosted.org/packages/05/f5/db74f0d9053d3335aa7635e327234b11bee98b352864cf5181bcbfcf6787/pygnparser-0.0.5.tar.gz",
"platform": null,
"bugtrack_url": null,
"license": "MIT",
"summary": "Python client for GNparser",
"version": "0.0.5",
"project_urls": {
"Download": "https://github.com/gnames/pyGNparser/archive/refs/tags/v0.0.5.tar.gz",
"Homepage": "http://github.com/gnames/pyGNparser"
},
"split_keywords": [
"biodiversity",
" scientific names",
" parser",
" nomenclature",
" taxonomy",
" api",
" web-services",
" species",
" natural history",
" taxonomists",
" biologists",
" global names"
],
"urls": [
{
"comment_text": "",
"digests": {
"blake2b_256": "3f42d0a676205883722635a5dbac734ac61f88bd86a554426985c1387771634f",
"md5": "5101cf75ebfa7eefed28e36ec8344555",
"sha256": "4789c815544d1d5b4c3354d65a662b3aee0d9608136915eab48b00e6fb1c6a17"
},
"downloads": -1,
"filename": "pyGNparser-0.0.5-py2.py3-none-any.whl",
"has_sig": false,
"md5_digest": "5101cf75ebfa7eefed28e36ec8344555",
"packagetype": "bdist_wheel",
"python_version": "py2.py3",
"requires_python": null,
"size": 12653,
"upload_time": "2024-11-12T17:37:42",
"upload_time_iso_8601": "2024-11-12T17:37:42.667099Z",
"url": "https://files.pythonhosted.org/packages/3f/42/d0a676205883722635a5dbac734ac61f88bd86a554426985c1387771634f/pyGNparser-0.0.5-py2.py3-none-any.whl",
"yanked": false,
"yanked_reason": null
},
{
"comment_text": "",
"digests": {
"blake2b_256": "05f5db74f0d9053d3335aa7635e327234b11bee98b352864cf5181bcbfcf6787",
"md5": "2d0373d603d90ec1cb489addd0a3cfd6",
"sha256": "679e4be4fa9a2478ad90ddc83c44551b24485eff9e1f91f7d9e41159db016435"
},
"downloads": -1,
"filename": "pygnparser-0.0.5.tar.gz",
"has_sig": false,
"md5_digest": "2d0373d603d90ec1cb489addd0a3cfd6",
"packagetype": "sdist",
"python_version": "source",
"requires_python": null,
"size": 11883,
"upload_time": "2024-11-12T17:37:44",
"upload_time_iso_8601": "2024-11-12T17:37:44.670327Z",
"url": "https://files.pythonhosted.org/packages/05/f5/db74f0d9053d3335aa7635e327234b11bee98b352864cf5181bcbfcf6787/pygnparser-0.0.5.tar.gz",
"yanked": false,
"yanked_reason": null
}
],
"upload_time": "2024-11-12 17:37:44",
"github": true,
"gitlab": false,
"bitbucket": false,
"codeberg": false,
"github_user": "gnames",
"github_project": "pyGNparser",
"travis_ci": false,
"coveralls": false,
"github_actions": true,
"requirements": [],
"tox": true,
"lcname": "pygnparser"
}