pyGNparser


* Name: pyGNparser
* Version: 0.0.5
* Home page: http://github.com/gnames/pyGNparser
* Summary: Python client for GNparser
* Upload time: 2024-11-12 17:37:44
* Author: Geoff Ower
* License: MIT
* Keywords: biodiversity, scientific names, parser, nomenclature, taxonomy, api, web-services, species, natural history, taxonomists, biologists, global names
* Requirements: none recorded

# pyGNparser

[![PyPI version](https://img.shields.io/pypi/v/pygnparser.svg)](https://pypi.python.org/pypi/pygnparser) [![Python workflow status](https://github.com/gnames/pygnparser/workflows/Python/badge.svg)](https://github.com/gnames/pygnparser/actions?query=workflow%3APython)

This is a Python wrapper for the [GNparser](https://parser.globalnames.org/) API. The code follows the spirit and approach of the [pygbif](https://github.com/gbif/pygbif) package, and indeed much of the wrapping utility is copied 1:1 from that repo. Thanks to [@sckott](https://github.com/sckott) and the other [contributors](https://github.com/gbif/pygbif/graphs/contributors).

## Installation

Add this line to your application's requirements.txt:

```python
pygnparser
```

And then execute:

    $ pip install -r requirements.txt

Or install it yourself as:

    $ pip install pygnparser

## Usage


Import the library:
```python
from pygnparser import gnparser
```

If you have a local installation of gnparser, set the GNPARSER_BASE_URL environment variable to the host and port that the service is running on. For example, if it is running locally on port 8787:

```python
GNPARSER_BASE_URL = "http://localhost:8787/"
```

Without the GNPARSER_BASE_URL environment variable set, the wrapper defaults to the remote API at https://parser.globalnames.org/, which is slower than a local service.
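
A minimal sketch of setting the variable from Python rather than from the shell, assuming the wrapper looks up `GNPARSER_BASE_URL` in the process environment as described above. The helper name `use_local_gnparser` is hypothetical and not part of pygnparser. Whether the value is read at import time or at call time is not stated here, so the sketch sets it before the library is imported:

```python
import os


def use_local_gnparser(base_url="http://localhost:8787/"):
    """Point the wrapper at a local GNparser service (hypothetical helper)."""
    # Assumption: pygnparser reads GNPARSER_BASE_URL from os.environ.
    os.environ["GNPARSER_BASE_URL"] = base_url
    return os.environ["GNPARSER_BASE_URL"]


use_local_gnparser()
# Import only after the variable is set, in case it is read at import time:
# from pygnparser import gnparser
```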


---
### Parse a scientific name
Parse a scientific name:
```python
>>> result = gnparser('Ursus arctos Linnaeus, 1758') #  => Dictionary
```

Check if parsed:
```python
>>> result.parsed() #  => Boolean
True
```

Get [parsed quality](https://github.com/gnames/gnparser#figuring-out-if-names-are-well-formed):
```python
>>> result.quality() #  => Integer
1
```

Get the genus name:
```python
>>> result.genus() #  => String
'Ursus'
```

Get the species name:
```python
>>> result.species() #  => String
'arctos'
```

Get the year:
```python
>>> result.year() #  => String
'1758'
```

Get the authorship:
```python
>>> result.authorship() #  => String
'Linnaeus, 1758'
```

Get the stemmed canonical name, with the Latin gender ending removed:
```python
>>> result.canonical_stemmed() #  => String
'Ursus arct'
```

Get the parsed name components for a hybrid formula:
```python
>>> result = gnparser('Isoetes lacustris × stricta Gay') #  => Dictionary
>>> result.is_hybrid() #  => Boolean
True
>>> result.hybrid() #  => String
'HYBRID_FORMULA'
>>> result.normalized() #  => String
'Isoetes lacustris × Isoetes stricta Gay'
>>> result.hybrid_formula_ranks() #  => Array
['species', 'species']
>>> result.hybrid_formula_genera() #  => Array
['Isoetes', 'Isoetes']
>>> result.hybrid_formula_species() #  => Array
['lacustris', 'stricta']
>>> result.hybrid_formula_authorship() #  => Array
['', 'Gay']
```
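
The `hybrid_formula_*` lists above are parallel, with one entry per parent, so they can be combined into one record per parent. A minimal sketch that works on plain lists like the ones shown above; the helper name `hybrid_parents` is hypothetical and not part of pygnparser:

```python
def hybrid_parents(genera, species, authorships):
    """Combine parallel hybrid-formula lists into one dict per parent."""
    if not (len(genera) == len(species) == len(authorships)):
        raise ValueError("hybrid formula lists must have the same length")
    return [
        {"genus": genus, "species": epithet, "authorship": author}
        for genus, epithet, author in zip(genera, species, authorships)
    ]


# Values copied from the example output above.
parents = hybrid_parents(
    ["Isoetes", "Isoetes"],
    ["lacustris", "stricta"],
    ["", "Gay"],
)
for parent in parents:
    print(parent)
```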

Parse a scientific name under a specified nomenclatural code:
```python
>>> result = gnparser('Malus domestica \'Fuji\'', code='cultivar')
>>> result.is_cultivar() #  => Boolean
True
>>> result.genus() #  => String
'Malus'
>>> result.species() #  => String
'domestica'
>>> result.cultivar() #  => String
'‘Fuji’'
>>> result.nomenclatural_code() #  => String
'ICNCP'
```

---
### Parse multiple scientific names
Parse multiple scientific names by separating them with `\r\n`:
```python
>>> results = gnparser('Ursus arctos Linnaeus, 1758\r\nAlces alces (Linnaeus, 1758)\r\nRangifer tarandus (Linnaeus, 1758)\r\nUrsus maritimus (Phipps, 1774)') #  => Array
```

Get the genus of the 1st parsed name in the list:
```python
>>> results[0].genus() #  => String
'Ursus'
```
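
When the names are already in a Python list, the `\r\n`-separated input can be built with `str.join`. A minimal sketch; the helper name `join_names` is hypothetical, and the final `gnparser` call is left commented out because it needs pygnparser and a reachable GNparser service:

```python
def join_names(names):
    """Join name strings with CRLF, skipping blank entries (hypothetical helper)."""
    return "\r\n".join(name.strip() for name in names if name.strip())


names = [
    "Ursus arctos Linnaeus, 1758",
    "Alces alces (Linnaeus, 1758)",
    "Rangifer tarandus (Linnaeus, 1758)",
    "Ursus maritimus (Phipps, 1774)",
]
payload = join_names(names)
# results = gnparser(payload)
```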

---
## Deviations

Some extra helpers are included that extend the functionality of GNparser.

1) The page() method gets the page number out of the unparsed tail:
```python
>>> result = gnparser('Ursus arctos Linnaeus, 1758: 81')
>>> result.page()  # => String
'81'
```
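
For illustration only, one way a helper like this could work is a regular expression applied to the unparsed tail. This sketch is an assumption about the general approach, not pygnparser's actual implementation. The name `extract_page` is hypothetical, and the tail format (`: 81`) is assumed from the example above:

```python
import re

# Assumed tail format: optional whitespace, a colon, then the page digits.
_PAGE_PATTERN = re.compile(r"^\s*:\s*(\d+)")


def extract_page(tail):
    """Return the page number found at the start of an unparsed tail, or None."""
    match = _PAGE_PATTERN.match(tail)
    return match.group(1) if match else None


print(extract_page(": 81"))  # prints 81
```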

2) The authorship() method returns a formatted authorship string that depends on the number of authors. One author with a year is returned as `Smith, 1970`, two authors with a year as `Smith & Johnson, 1970`, and three authors as `Smith, Johnson & Jones, 1970`. With more than three authors, the names are comma-separated and the last author is joined with an ampersand.
```python
>>> result = gnparser('Aus bus cus Smith, Johnson, & Jones, 1970')
>>> result.authorship()  # => String
'Smith, Johnson & Jones, 1970'
```
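
The formatting rule described in item 2 can be sketched as a standalone function. This is an illustration of the rule as stated, not the wrapper's actual code, and the name `format_authorship` is hypothetical:

```python
def format_authorship(authors, year=None):
    """Format authors as 'A', 'A & B', or 'A, B & C', with an optional year."""
    if not authors:
        names = ""
    elif len(authors) == 1:
        names = authors[0]
    else:
        # Commas between all but the last author, ampersand before the last.
        names = ", ".join(authors[:-1]) + " & " + authors[-1]
    # Skip empty parts so a missing year or author list leaves no stray comma.
    return ", ".join(part for part in (names, year) if part)


print(format_authorship(["Smith"], "1970"))
print(format_authorship(["Smith", "Johnson"], "1970"))
print(format_authorship(["Smith", "Johnson", "Jones"], "1970"))
```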

3) The infraspecies() method returns the infraspecies name. Currently there are no dedicated methods for ranks below trinomials, but you can access them with the infraspecies_details() method. Please [open an issue](https://github.com/gnames/pygnparser/issues/new) if you need one added.
```python
>>> result = gnparser('Aus bus cus')
>>> result.infraspecies()  # => String
'cus'
```

4) At present, GNparser normalizes authorships like `Smith in Jones, 1999` to `Smith ex Jones 1999`. In the Python wrapper, you can override that behavior by setting the `preserve_in_authorship` parameter to `True` when calling the `authorship()`, `authorship_normalized()`, `combination_authorship()`, `original_authorship()`, or `normalized()` methods.
```python
>>> result = gnparser('Aus bus Smith in Jones, 1999')
>>> result.normalized()  # => String
'Aus bus Smith ex Jones 1999'
>>> result.normalized(preserve_in_authorship=True)  # => String
'Aus bus Smith in Jones 1999'
```
* If the verbatim authorship contains `ex`, setting `preserve_in_authorship` to `True` will not change `ex` to `in`:
```python
>>> result = gnparser('Aus bus Smith ex Jones, 1999')
>>> result.normalized(preserve_in_authorship=True)  # => String
'Aus bus Smith ex Jones 1999'
```

---
## Other GNparser Libraries

* Node.js: [node-gnparser](https://github.com/amazingplants/node-gnparser)
* R: [rgnparser](https://github.com/ropensci/rgnparser)
* Ruby Gem: [biodiversity](https://github.com/GlobalNamesArchitecture/biodiversity)

---
## Contributing

Bug reports and pull requests are welcome on GitHub at https://github.com/gnames/pygnparser. This project is intended to be a safe, welcoming space for collaboration, and contributors are expected to adhere to the [code of conduct](https://github.com/gnames/pygnparser/blob/main/CODE_OF_CONDUCT.md).

---
## Development

After checking out the repo, change into the package directory with `cd pygnparser`, run `pip install .` to install the package, and run `pip install -r requirements.txt` to install the dependencies. Then run `pytest` to run the tests. You can also run `bin/console` for an interactive Python prompt where you can experiment with the example commands above.

---
## License

The package is available as open source under the terms of the [MIT](https://github.com/gnames/pygnparser/blob/main/LICENSE.txt) license. You can learn more about the MIT license on [Wikipedia](https://en.wikipedia.org/wiki/MIT_License) and compare it with other open source licenses at the [Open Source Initiative](https://opensource.org/license/mit/).

---
## Code of Conduct

Everyone interacting in the pyGNparser project's codebases, issue trackers, chat rooms and mailing lists is expected to follow the [code of conduct](https://github.com/gnames/pygnparser/blob/main/CODE_OF_CONDUCT.md).


## [Unreleased]

## [0.0.5] - 2024-11-11
- Added nomenclatural code parameter
- Added result methods for cultivars
- Disabled et_al_cutoff formatting by default
- Removed preserve_in_authorship parameter from authorship() because GNparser no longer normalizes `in` to `ex`

## [0.0.4] - 2024-10-15

- Added preserve_in_authorship parameter to authorship() to optionally override normalization of `in` to `ex`

## [0.0.3] - 2024-10-14

- Added uninomial method

## [0.0.2] - 2024-10-10

- Added named hybrid and hybrid formula handling
- Added original authorship and combination authorship handling
- Added et_al_cutoff parameter to authorship formatting

## [0.0.1] - 2024-03-27

- Initial release

            

Raw data

{
    "_id": null,
    "home_page": "http://github.com/gnames/pyGNparser",
    "name": "pyGNparser",
    "maintainer": null,
    "docs_url": null,
    "requires_python": null,
    "maintainer_email": null,
    "keywords": "biodiversity, scientific names, parser, nomenclature, taxonomy, API, web-services, species, natural history, taxonomists, biologists, Global Names",
    "author": "Geoff Ower",
    "author_email": "gdower@illinois.edu",
    "download_url": "https://files.pythonhosted.org/packages/05/f5/db74f0d9053d3335aa7635e327234b11bee98b352864cf5181bcbfcf6787/pygnparser-0.0.5.tar.gz",
    "platform": null,
    "bugtrack_url": null,
    "license": "MIT",
    "summary": "Python client for GNparser",
    "version": "0.0.5",
    "project_urls": {
        "Download": "https://github.com/gnames/pyGNparser/archive/refs/tags/v0.0.5.tar.gz",
        "Homepage": "http://github.com/gnames/pyGNparser"
    },
    "split_keywords": [
        "biodiversity",
        " scientific names",
        " parser",
        " nomenclature",
        " taxonomy",
        " api",
        " web-services",
        " species",
        " natural history",
        " taxonomists",
        " biologists",
        " global names"
    ],
    "urls": [
        {
            "comment_text": "",
            "digests": {
                "blake2b_256": "3f42d0a676205883722635a5dbac734ac61f88bd86a554426985c1387771634f",
                "md5": "5101cf75ebfa7eefed28e36ec8344555",
                "sha256": "4789c815544d1d5b4c3354d65a662b3aee0d9608136915eab48b00e6fb1c6a17"
            },
            "downloads": -1,
            "filename": "pyGNparser-0.0.5-py2.py3-none-any.whl",
            "has_sig": false,
            "md5_digest": "5101cf75ebfa7eefed28e36ec8344555",
            "packagetype": "bdist_wheel",
            "python_version": "py2.py3",
            "requires_python": null,
            "size": 12653,
            "upload_time": "2024-11-12T17:37:42",
            "upload_time_iso_8601": "2024-11-12T17:37:42.667099Z",
            "url": "https://files.pythonhosted.org/packages/3f/42/d0a676205883722635a5dbac734ac61f88bd86a554426985c1387771634f/pyGNparser-0.0.5-py2.py3-none-any.whl",
            "yanked": false,
            "yanked_reason": null
        },
        {
            "comment_text": "",
            "digests": {
                "blake2b_256": "05f5db74f0d9053d3335aa7635e327234b11bee98b352864cf5181bcbfcf6787",
                "md5": "2d0373d603d90ec1cb489addd0a3cfd6",
                "sha256": "679e4be4fa9a2478ad90ddc83c44551b24485eff9e1f91f7d9e41159db016435"
            },
            "downloads": -1,
            "filename": "pygnparser-0.0.5.tar.gz",
            "has_sig": false,
            "md5_digest": "2d0373d603d90ec1cb489addd0a3cfd6",
            "packagetype": "sdist",
            "python_version": "source",
            "requires_python": null,
            "size": 11883,
            "upload_time": "2024-11-12T17:37:44",
            "upload_time_iso_8601": "2024-11-12T17:37:44.670327Z",
            "url": "https://files.pythonhosted.org/packages/05/f5/db74f0d9053d3335aa7635e327234b11bee98b352864cf5181bcbfcf6787/pygnparser-0.0.5.tar.gz",
            "yanked": false,
            "yanked_reason": null
        }
    ],
    "upload_time": "2024-11-12 17:37:44",
    "github": true,
    "gitlab": false,
    "bitbucket": false,
    "codeberg": false,
    "github_user": "gnames",
    "github_project": "pyGNparser",
    "travis_ci": false,
    "coveralls": false,
    "github_actions": true,
    "requirements": [],
    "tox": true,
    "lcname": "pygnparser"
}
        