EcoNameTranslator


- Name: EcoNameTranslator
- Version: 2.1
- Home page: https://github.com/Daniel-Davies/MedeinaTranslator
- Summary: A lightweight but powerful package for full management and translation of ecological names
- Upload time: 2020-07-05 15:16:31
- Author: Daniel Davies
- Requirements: No requirements were recorded.

# The Ecological Name Translator

### What is it?

A lightweight Python package containing everything you need for translation and management of ecological names. The package takes inspiration from the "taxize" package in R, and currently provides all of its functionality. On top of this, however, EcoNameTranslator aims to be far more powerful: rather than being a thin wrapper around specific ecological name data-stores, it leverages multiple data-stores together, alongside statistical inference, to provide more coherent output and to correct failures in the underlying APIs and in user input. API calls are made concurrently for increased performance.
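
To illustrate what concurrent API calls can look like, here is a minimal sketch that fans one name out to several data sources in parallel using only the standard library. The `query_source_a` and `query_source_b` functions are hypothetical stand-ins that return canned data; they are not the package's internals and make no network requests.

```python
from concurrent.futures import ThreadPoolExecutor

# Hypothetical stand-ins for real data-store clients; each returns canned data.
def query_source_a(name):
    return {"source": "a", "name": name.lower()}

def query_source_b(name):
    return {"source": "b", "name": name.lower()}

def query_all(name, sources):
    """Call every source for `name` concurrently; results keep the order of `sources`."""
    with ThreadPoolExecutor(max_workers=len(sources)) as pool:
        futures = [pool.submit(source, name) for source in sources]
        return [future.result() for future in futures]

results = query_all("Vulpes vulpes", [query_source_a, query_source_b])
```

Threads suit this kind of work because each call spends most of its time waiting on a remote service rather than computing.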

### Functionality

##### Get Taxonomy For A Scientific Name

Given a list of scientific names (at any taxonomic rank) this function will standardise and spell check your input names, before returning a taxonomic profile for each name:

```python
from EcoNameTranslator import classify
scientific_names = classify(['Vulpes vulpes','Delphinidae']) 
# {
#   'vulpes vulpes': {'species': 'vulpes vulpes','genus':'vulpes','family':'canidae'...},
#   'delphinidae': {'family': 'delphinidae','order': 'cetacea','class': 'mammalia'...}
# } 
```

The classify function takes the output of multiple databases and runs a consensus protocol to determine the most likely true taxonomic ranking. This guards against the inconsistencies (or false entries) that arise in some databases from time to time, especially for lesser-known species.
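
The package's actual consensus protocol is not described here, so the following is only a sketch of one plausible approach: a simple majority vote per taxonomic rank. The function name `consensus` and the sample profiles are illustrative assumptions.

```python
from collections import Counter

def consensus(profiles):
    """Pick, for each rank, the value reported by the most profiles.

    Assumption: a plain majority vote; ties go to the value seen first.
    """
    ranks = {rank for profile in profiles for rank in profile}
    result = {}
    for rank in ranks:
        votes = Counter(p[rank] for p in profiles if rank in p)
        result[rank] = votes.most_common(1)[0][0]
    return result

# Three made-up database answers; the last one has a deliberately wrong family.
profiles = [
    {"genus": "vulpes", "family": "canidae"},
    {"genus": "vulpes", "family": "canidae"},
    {"genus": "vulpes", "family": "felidae"},
]
agreed = consensus(profiles)
```

With these inputs the outlier is outvoted and `agreed` maps `family` to `canidae`.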


##### Common Name To Scientific Name

A list of common names is accepted as input, and each name is mapped to the scientific species names it can refer to:

```python
from EcoNameTranslator import to_scientific
scientific_names = to_scientific(['blackbird']) 
# {
#   'blackbird': ['Turdus merula', 'Chrysomus icterocephalus', 'Agelaius assimilis'...],
# } 
```

This basic version should suit most applications, but some common names can match more than you mean; for example, suppose we want to obtain various species of crocodile:

```python
scientific_names = to_scientific(['crocodile']) 
# {
#   'crocodile': ['Crocodylus novaeguineae', 'Crocodylus johnsoni','Pseudocarcharias kamoharai'...]
# } 
```

But oh no! Pseudocarcharias kamoharai isn't a crocodile...it's the "crocodile shark". If you would like to guard against these natural language issues, you can use the sanityCheck parameter in the to_scientific function, as follows:

```python
scientific_names = to_scientific(['crocodile'],sanityCheck=True) 
# {
#   'crocodile': ['crocodylus acutus', 'crocodylus moreletii', 'crocodylus novaeguineae'...]
# } 
```

Now, only the species that we commonly know as crocodiles will be returned. (Note, however, that as a side effect this will also remove any additional specifics in the name; for example, "Osteolaemus tetraspis tetraspis" will become simply "Osteolaemus tetraspis".)
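
The side effect noted above amounts to reducing a name to its genus and species. A minimal sketch of that reduction follows; `to_binomial` and `unique_binomials` are hypothetical helpers written for illustration, not functions exported by the package.

```python
def to_binomial(name):
    """Keep only the first two words (genus and species) of a name."""
    return " ".join(name.split()[:2])

def unique_binomials(names):
    """Reduce each name to a binomial and drop duplicates, keeping first-seen order."""
    seen = []
    for name in names:
        binomial = to_binomial(name)
        if binomial not in seen:
            seen.append(binomial)
    return seen

reduced = unique_binomials([
    "Osteolaemus tetraspis tetraspis",
    "Osteolaemus tetraspis",
    "Crocodylus acutus",
])
```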

##### Scientific Name To Common

Given a list of scientific names (at any taxonomic rank), this function will standardise and spell check your input names, before returning the common English names that can describe each taxonomic input name:

```python
from EcoNameTranslator import to_common
common_names = to_common(['vulpes vulpes','ursus']) 
# {
#   'vulpes vulpes': ['Red Fox','Renard Roux'],
#   'ursus': ['Asiatic Black Bear', 'Mexican Grizzly Bear', 'American Black Bear', ...]
# } 
```
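
How the package spell checks names is not specified here. As a rough illustration of the idea, the sketch below matches a misspelt name against a small reference list with the standard library's `difflib`. The `KNOWN_NAMES` list, the `correct_spelling` helper, and the 0.8 cutoff are all assumptions made for the example.

```python
import difflib

# Toy reference list; a real service would consult a name database instead.
KNOWN_NAMES = ["panthera tigris", "vulpes vulpes", "ursus arctos"]

def correct_spelling(name, known=KNOWN_NAMES, cutoff=0.8):
    """Return the closest known name, or None if nothing is similar enough."""
    normalised = " ".join(name.lower().split())  # lower-case, collapse whitespace
    matches = difflib.get_close_matches(normalised, known, n=1, cutoff=cutoff)
    return matches[0] if matches else None

fixed = correct_spelling("Panhera tigris")
unknown = correct_spelling("blackbird")
```

Here `fixed` is `'panthera tigris'`, while `unknown` is `None` because no reference name is close enough to "blackbird".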

##### Any Unstandardised Names To Scientific Species

A list of ecological names, in any format, is accepted as input. This undergoes a data-cleaning procedure (namely, removing nomenclature flags and other redundant information), after which the following actions are taken:

- Names that are already in a standard species format (that is, genus + species) have any spelling errors corrected and are passed back

- Names at higher levels of taxonomy again have any spelling mistakes corrected, and are then mapped to a list of specific species names

- Common names (currently, English only) are mapped to all of the scientific species names that can be described by the common name

```python
from EcoNameTranslator import to_species
#Should be "Panthera tigris"  
wrong_spelling = to_species(['Panhera tigris'])      
# {'Panhera tigris': ['panthera tigris']}
```
```python
#Higher taxa    
higher_taxa = to_species(['Vulpes']) 
# {'Vulpes': ['Vulpes lagopus', 'Vulpes ferrilata', 'Vulpes zerda', 'Vulpes vulpes'...]}
```
```python
#Common English name
common_name = to_species(['blackbird']) 
# {'blackbird':['Turdus merula', 'Chrysomus icterocephalus', 'Agelaius assimilis', 'Turdus albocinctus'...]}    
```

This function is especially useful if you're working with large datasets of names that come from multiple sources. Authors use very different formats and conventions, and this function helps you map them all to one standard.
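
The data-cleaning step described at the start of this section can be pictured with a small regex-based sketch. The list of flags (`sp.`, `spp.`, `cf.`, `aff.`, `indet.`) and the `clean_name` helper are assumptions chosen for illustration; the package's own cleaning rules may differ.

```python
import re

# Assumed examples of nomenclature flags; the package's real list may differ.
FLAGS = re.compile(r"\b(?:sp|spp|cf|aff|indet)\.?(?=\s|$)", re.IGNORECASE)

def clean_name(raw):
    """Strip parenthesised notes and nomenclature flags, then tidy whitespace."""
    no_brackets = re.sub(r"\([^)]*\)", " ", raw)  # drop notes such as "(juvenile)"
    no_flags = FLAGS.sub(" ", no_brackets)        # drop flags such as "cf." or "sp."
    return " ".join(no_flags.split())             # collapse runs of whitespace

cleaned = [clean_name(n) for n in ["Vulpes cf. vulpes (juvenile)", "Vulpes sp."]]
```

The cleaned names can then be routed down one of the three paths listed above, depending on whether they look like a species, a higher taxon, or a common name.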


##### Synonyms

Given a list of scientific names (at any taxonomic rank), this function will return the synonyms of each name:

```python
from EcoNameTranslator import synonyms
scientific_names = synonyms(['Myodes']) 
# {
#   'Myodes': ['Clethrionomys', 'Phaulomys', 'Craseomys', 'Evotomys', 'Glareomys', 'Neoaschizomys']
# } 
```

##### Children

Given a list of names (at any taxonomic rank), this function will return the immediate children under each name:

```python
from EcoNameTranslator import children
scientific_names = children(['Vulpes','Felidae','Carnivora']) 
# {
#  'Vulpes': ['Vulpes vulpes', 'Vulpes macrotis', 'Vulpes velox'...],
#  'Felidae': ['Lynx', 'Felis', 'Acinonyx', 'Leopardus'...],
#  'Carnivora': ['Ursidae', 'Mustelidae', 'Procyonidae'...]
# }
```

##### Generalised Downstream Species

Given a list of names (at any taxonomic rank) and a target rank, this function will return the list of children at the specified taxonomic rank for each input name:

```python
from EcoNameTranslator import downstream
scientific_names = downstream(['Felidae','Vulpes'],'species')
# {
#  'Vulpes': ['Vulpes vulpes', 'Vulpes macrotis', 'Vulpes velox'...],
#  'Felidae': ['Lynx rufus', 'Lynx lynx', 'Lynx canadensis'...]
# } 
```
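
Conceptually, a downstream lookup walks the taxonomy from the input name until it reaches the target rank. The sketch below shows that traversal over a tiny hand-written tree; `TAXONOMY` and `downstream_sketch` are illustrative assumptions and do not reflect how the package queries its data sources.

```python
# Toy taxonomy: each taxon maps to (rank, immediate children). Illustrative data only.
TAXONOMY = {
    "Felidae": ("family", ["Lynx", "Felis"]),
    "Lynx": ("genus", ["Lynx rufus", "Lynx lynx"]),
    "Felis": ("genus", ["Felis catus"]),
    "Lynx rufus": ("species", []),
    "Lynx lynx": ("species", []),
    "Felis catus": ("species", []),
}

def downstream_sketch(name, target_rank, taxonomy=TAXONOMY):
    """Collect every descendant of `name` that sits at `target_rank`."""
    rank, kids = taxonomy[name]
    if rank == target_rank:
        return [name]
    found = []
    for kid in kids:
        found.extend(downstream_sketch(kid, target_rank, taxonomy))
    return found

species = downstream_sketch("Felidae", "species")
genera = downstream_sketch("Felidae", "genus")
```

Repeatedly asking for immediate children, as the `children` function does, is one way such a walk could be assembled from single-level lookups.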

##### Generalised Upstream Species

Given a list of names (at any taxonomic rank) and a target rank, this function will return the list of taxa above each given name:

```python
from EcoNameTranslator import upstream
scientific_names = upstream(['Ursus Arctos','Vulpes Vulpes'],'genus')
# {
#  'Ursus Arctos': ['Ursus', 'Ailuropoda', 'Helarctos'...],
#  'Vulpes Vulpes': ['Canis', 'Vulpes', 'Urocyon'...]
# } 
```

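##### Lowest intersecting taxonomic rank

Given a list of scientific names, this function will return the lowest taxonomic rank that all of the names share, together with the taxon at that rank: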

```python
from EcoNameTranslator import lowest_rank_intersection
intersection = lowest_rank_intersection(['vulpes vulpes','ursus arctos','panthera tigris','turdus merula'])
# ('phylum', 'chordata')
intersection = lowest_rank_intersection(['felis catus','panthera tigris'])
# ('family', 'felidae')
```
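
One way to picture this computation is to compare taxonomic profiles rank by rank, starting from the most specific. The sketch below assumes profiles shaped like the `classify` output shown earlier; the `RANKS` ordering and the `lowest_shared_rank` helper are illustrative assumptions rather than the package's implementation.

```python
# Ranks ordered from most to least specific (an assumed, simplified ordering).
RANKS = ["species", "genus", "family", "order", "class", "phylum", "kingdom"]

def lowest_shared_rank(profiles, ranks=RANKS):
    """Return (rank, taxon) for the most specific rank on which all profiles agree."""
    for rank in ranks:
        values = {profile.get(rank) for profile in profiles}
        if len(values) == 1 and None not in values:
            return (rank, values.pop())
    return None  # no common rank found in the supplied profiles

cat = {"species": "felis catus", "genus": "felis",
       "family": "felidae", "order": "carnivora"}
tiger = {"species": "panthera tigris", "genus": "panthera",
         "family": "felidae", "order": "carnivora"}
shared = lowest_shared_rank([cat, tiger])
```

For these two hand-written profiles `shared` is `('family', 'felidae')`, matching the second example above.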

### Contributing

See the GitHub page [here](https://github.com/Daniel-Davies/MedeinaTranslator). Pull requests are welcome!

### Coming Soon

- Only some of the functions can currently use multiple databases for running consensus on their output. We will soon add a consensus service layer that enables all functions to use this feature.
- We will gradually build out functionality that combines APIs to achieve results that cannot be obtained from one database alone


### Credit 

The package uses various APIs for conversions of names. These are:

- [The Global Names Resolver](https://resolver.globalnames.org/)
- [The Integrated Taxonomic Information System](https://www.itis.gov/)
- [The Global Biodiversity Information Facility](https://www.gbif.org/)



            

Raw data

            {
    "_id": null,
    "home_page": "https://github.com/Daniel-Davies/MedeinaTranslator",
    "name": "EcoNameTranslator",
    "maintainer": "",
    "docs_url": null,
    "requires_python": "",
    "maintainer_email": "",
    "keywords": "",
    "author": "Daniel Davies",
    "author_email": "dd16785@bristol.ac.uk",
    "download_url": "https://files.pythonhosted.org/packages/bc/6a/89694fa60516b943c044bd4865f3763f332d1c949bb56e4fd5b10d8ad063/EcoNameTranslator-2.1.tar.gz",
    "platform": "",
    "bugtrack_url": null,
    "license": "",
    "summary": "A lightweight but powerful package for full management and translation of ecological names",
    "version": "2.1",
    "project_urls": {
        "Homepage": "https://github.com/Daniel-Davies/MedeinaTranslator"
    },
    "split_keywords": [],
    "urls": [
        {
            "comment_text": "",
            "digests": {
                "blake2b_256": "9be2246cb5f3bd3b8a5d65b0d2319e24c191395f6ad5c8364b5dd7fe42ff689e",
                "md5": "b12c9dab9b5fec647a46c07533addf1a",
                "sha256": "81f210c2b1f469ba82eb0006a141092e39daf3e6a6280481d08644ceae2e8523"
            },
            "downloads": -1,
            "filename": "EcoNameTranslator-2.1-py3-none-any.whl",
            "has_sig": false,
            "md5_digest": "b12c9dab9b5fec647a46c07533addf1a",
            "packagetype": "bdist_wheel",
            "python_version": "py3",
            "requires_python": null,
            "size": 26299,
            "upload_time": "2020-07-05T15:16:30",
            "upload_time_iso_8601": "2020-07-05T15:16:30.679727Z",
            "url": "https://files.pythonhosted.org/packages/9b/e2/246cb5f3bd3b8a5d65b0d2319e24c191395f6ad5c8364b5dd7fe42ff689e/EcoNameTranslator-2.1-py3-none-any.whl",
            "yanked": false,
            "yanked_reason": null
        },
        {
            "comment_text": "",
            "digests": {
                "blake2b_256": "bc6a89694fa60516b943c044bd4865f3763f332d1c949bb56e4fd5b10d8ad063",
                "md5": "5d93a5176833afcc54f9ae088937185c",
                "sha256": "f940af229bbfd57ff1a91a2e1d483d3e2b1202d49016d189a6c54745c30f1143"
            },
            "downloads": -1,
            "filename": "EcoNameTranslator-2.1.tar.gz",
            "has_sig": false,
            "md5_digest": "5d93a5176833afcc54f9ae088937185c",
            "packagetype": "sdist",
            "python_version": "source",
            "requires_python": null,
            "size": 13658,
            "upload_time": "2020-07-05T15:16:31",
            "upload_time_iso_8601": "2020-07-05T15:16:31.676340Z",
            "url": "https://files.pythonhosted.org/packages/bc/6a/89694fa60516b943c044bd4865f3763f332d1c949bb56e4fd5b10d8ad063/EcoNameTranslator-2.1.tar.gz",
            "yanked": false,
            "yanked_reason": null
        }
    ],
    "upload_time": "2020-07-05 15:16:31",
    "github": true,
    "gitlab": false,
    "bitbucket": false,
    "codeberg": false,
    "github_user": "Daniel-Davies",
    "github_project": "MedeinaTranslator",
    "github_not_found": true,
    "lcname": "econametranslator"
}
        