uniprot-id-mapper


| Field | Value |
|:------|:------|
| Name | uniprot-id-mapper |
| Version | 1.1.2 |
| Summary | A Python wrapper for the UniProt Mapping RESTful API. |
| Upload time | 2024-07-19 08:34:47 |
| Requires Python | >=3.7 |
| License | MIT License. Copyright (c) 2023 David Araripe. Permission is hereby granted, free of charge, to any person obtaining a copy of this software and associated documentation files (the "Software"), to deal in the Software without restriction, including without limitation the rights to use, copy, modify, merge, publish, distribute, sublicense, and/or sell copies of the Software, and to permit persons to whom the Software is furnished to do so, subject to the following conditions: The above copyright notice and this permission notice shall be included in all copies or substantial portions of the Software. THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE SOFTWARE. |
| Keywords | uniprot, database, protein id, gene id, parser |
| Requirements | No requirements were recorded. |

[![License: MIT](https://img.shields.io/badge/License-MIT-purple?style=flat-square)](https://opensource.org/licenses/MIT)
[![Ruff](https://img.shields.io/endpoint?url=https://raw.githubusercontent.com/astral-sh/ruff/main/assets/badge/v2.json)](https://github.com/astral-sh/ruff)
[![Code style: black](https://img.shields.io/badge/code%20style-black-black?style=flat-square)](https://github.com/psf/black)
[![Imports: isort](https://img.shields.io/badge/%20imports-isort-%231674b1?style=flat-square&labelColor=ef8336)](https://pycqa.github.io/isort/)
[![GitHub Actions](https://img.shields.io/endpoint.svg?url=https%3A%2F%2Factions-badge.atrox.dev%2FDavid-Araripe%2FUniProtMapper%2Fbadge%3Fref%3Dmaster&style=flat-square)](https://actions-badge.atrox.dev/David-Araripe/UniProtMapper/goto?ref=master)
[![Downloads:PyPI](https://img.shields.io/pypi/dm/uniprot-id-mapper?style=flat-square)](https://pypi.org/project/uniprot-id-mapper/)

# UniProtMapper <img align="left" width="40" height="40" src="https://raw.githubusercontent.com/whitead/protein-emoji/main/src/protein-72-color.svg">

A Python wrapper for UniProt's [Retrieve/ID Mapping](https://www.uniprot.org/id-mapping) RESTful API. This package supports the following functionalities:

1. Map (almost) any UniProt [cross-referenced IDs](https://github.com/David-Araripe/UniProtMapper/blob/master/src/UniProtMapper/resources/uniprot_mapping_dbs.json) to other identifiers and vice versa;
2. Programmatically retrieve any of the supported [return](https://www.uniprot.org/help/return_fields) and [cross-reference fields](https://www.uniprot.org/help/return_fields_databases) from both the UniProt-SwissProt (reviewed) and UniProt-TrEMBL (unreviewed) databases.

For these, check [Example 1](#example-1-mapping-ids) and [Example 2](#example-2-retrieving-information) below. Both functionalities can also be accessed through the CLI. For more information, check [CLI](#-cli).

## 📦 Installation

From PyPI:
``` Shell
python -m pip install uniprot-id-mapper
```

Directly from GitHub:
``` Shell
python -m pip install git+https://github.com/David-Araripe/UniProtMapper.git
```

From source:
``` Shell
git clone https://github.com/David-Araripe/UniProtMapper
cd UniProtMapper
python -m pip install .
```
# 🛠️ Usage
## Example 1: Mapping IDs
To map IDs, either call the `ProtMapper` object directly or use its `get` method; both return the same response. The source and target identifier types are set with the `from_db` and `to_db` parameters. For example:

``` python
from UniProtMapper import ProtMapper

mapper = ProtMapper()

# Option 1: use the `get` method
result, failed = mapper.get(
    ids=["P30542", "Q16678", "Q02880"], from_db="UniProtKB_AC-ID", to_db="Ensembl"
)

# Option 2: call the object directly (same arguments, same return values)
result, failed = mapper(
    ids=["P30542", "Q16678", "Q02880"], from_db="UniProtKB_AC-ID", to_db="Ensembl"
)
```
Here, `failed` is a list of the identifiers that could not be mapped, and `result` is the following pandas DataFrame:

|    | UniProtKB_AC-ID   | Ensembl            |
|---:|:------------------|:-------------------|
|  0 | P30542            | ENSG00000163485.17 |
|  1 | Q16678            | ENSG00000138061.12 |
|  2 | Q02880            | ENSG00000077097.17 |
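
The Ensembl identifiers in the result above carry a version suffix (for example `.17`). If downstream tools expect unversioned gene IDs, the suffix can be stripped after mapping. The following is a minimal sketch in plain Python using the values from the table; the helper name `strip_ensembl_version` is hypothetical and not part of UniProtMapper.

```python
def strip_ensembl_version(ensembl_id: str) -> str:
    """Return the Ensembl ID without its trailing ".<version>" suffix, if any."""
    base, sep, version = ensembl_id.rpartition(".")
    # Only strip when the part after the last dot is a plain number.
    if sep and version.isdigit():
        return base
    return ensembl_id


# Values copied from the mapping result shown above.
mapped = {
    "P30542": "ENSG00000163485.17",
    "Q16678": "ENSG00000138061.12",
    "Q02880": "ENSG00000077097.17",
}

unversioned = {acc: strip_ensembl_version(ens) for acc, ens in mapped.items()}
print(unversioned["P30542"])  # ENSG00000163485
```

With the DataFrame returned by the mapper, the same helper could be applied column-wise, for instance with `result["Ensembl"].map(strip_ensembl_version)`.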

## Example 2: Retrieving information

The supported [return](https://www.uniprot.org/help/return_fields) and [cross-reference fields](https://www.uniprot.org/help/return_fields_databases) are listed on UniProt's website and are also available through the attribute `ProtMapper.fields_table`. For example:

```Python
from UniProtMapper import ProtMapper

mapper = ProtMapper()
df = mapper.fields_table
df.head()
```
|    | label                | returned_field   | field_type       | has_full_version   | type          |
|---:|:---------------------|:-----------------|:-----------------|:-------------------|:--------------|
|  0 | Entry                | accession        | Names & Taxonomy | yes                | uniprot_field |
|  1 | Entry Name           | id               | Names & Taxonomy | yes                | uniprot_field |
|  2 | Gene Names           | gene_names       | Names & Taxonomy | yes                | uniprot_field |
|  3 | Gene Names (primary) | gene_primary     | Names & Taxonomy | yes                | uniprot_field |
|  4 | Gene Names (synonym) | gene_synonym     | Names & Taxonomy | yes                | uniprot_field |

To retrieve information, either call the object directly or use the `get` method. When `from_db` and `to_db` are omitted, as below, the call retrieves fields for the given UniProt IDs. For example:

```Python
result, failed = mapper.get(["Q02880"])
# prints: Fetched: 1 / 1

result, failed = mapper(["Q02880"])
# prints: Fetched: 1 / 1
```

To retrieve a custom set of fields, pass a list of field names to the `fields` parameter. Each name must appear in `ProtMapper.fields_table["returned_field"]`, and the resulting columns are named after the corresponding `label`.

The object also has a list of default fields under `self.default_fields`; these are ignored whenever the `fields` parameter is passed.

```Python
fields = ["accession", "organism_name", "structure_3d"]
result, failed = mapper.get(["Q02880"], fields=fields)
```
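
Because `fields` must match entries in the `returned_field` column, it can help to check a requested list before sending a query. A minimal sketch follows, assuming the valid names have been collected into a plain set. Here the set is hard-coded from names that appear in this README; in practice it could be built from `mapper.fields_table["returned_field"]`. The helper `split_fields` is hypothetical and not part of the package.

```python
from typing import Iterable, List, Tuple

# Field names taken from this README; the real table contains many more.
KNOWN_FIELDS = {
    "accession", "id", "gene_names", "gene_primary", "gene_synonym",
    "organism_name", "structure_3d",
}


def split_fields(
    requested: Iterable[str], known=KNOWN_FIELDS
) -> Tuple[List[str], List[str]]:
    """Split requested names into (valid, unknown), keeping order, dropping duplicates."""
    valid, unknown, seen = [], [], set()
    for name in requested:
        if name in seen:
            continue
        seen.add(name)
        (valid if name in known else unknown).append(name)
    return valid, unknown


valid, unknown = split_fields(
    ["accession", "organism_name", "not_a_field", "accession"]
)
print(valid)    # ['accession', 'organism_name']
print(unknown)  # ['not_a_field']
```

Only the `valid` list would then be passed as `fields=valid`, while `unknown` can be reported back to the user.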

# 💻 CLI

The package also ships with a CLI for mapping IDs and retrieving information. Both are available through the `protmap` command, which is accessible after installation. The available arguments, as shown by `protmap -h`, are:

```text
usage: UniProtMapper [-h] -i [IDS ...] [-r [RETURN_FIELDS ...]] [--default-fields] [-o OUTPUT]
                     [-from FROM_DB] [-to TO_DB] [-over] [-pf]

Retrieve data from UniProt using UniProt's RESTful API. For a list of all available fields, see: https://www.uniprot.org/help/return_fields 

Alternatively, use the --print-fields argument to print the available fields and exit the program.

optional arguments:
  -h, --help            show this help message and exit
  -i [IDS ...], --ids [IDS ...]
                        List of UniProt IDs to retrieve information from. Values must be
                        separated by spaces.
  -r [RETURN_FIELDS ...], --return-fields [RETURN_FIELDS ...]
                        If not defined, will pass `None`, returning all available fields.
                        Else, values should be fields to be returned separated by spaces. See
                        --print-fields for available options.
  --default-fields, -def
                        This option will override the --return-fields option. Returns only the
                        default fields stored in: <pkg_path>/resources/cli_return_fields.txt
  -o OUTPUT, --output OUTPUT
                        Path to the output file to write the returned fields. If not provided,
                        will write to stdout.
  -from FROM_DB, --from-db FROM_DB
                        The database from which the IDs are. For the available cross
                        references, see: <pkg_path>/resources/uniprot_mapping_dbs.json
  -to TO_DB, --to-db TO_DB
                        The database to which the IDs will be mapped. For the available cross
                        references, see: <pkg_path>/resources/uniprot_mapping_dbs.json
  -over, --overwrite    If desired to overwrite an existing file when using -o/--output
  -pf, --print-fields   Prints the available return fields and exits the program.
```

Usage example, retrieving default fields from `<pkg_path>/resources/cli_return_fields.txt`:
<p align="center">
    <img src="https://github.com/David-Araripe/UniProtMapper/blob/master/figures/cli_example_fig.png?raw=true" alt="Image displaying the output of UniProtMapper's CLI, protmap"/>
</p>
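
When `protmap` is driven from a script, it can be convenient to assemble the argument list programmatically. The sketch below only uses the long flags listed in the help text above and prints the resulting command without running it. The function `build_protmap_command` and the output file name `mapped.txt` are hypothetical, chosen for illustration.

```python
import shlex
from typing import List, Optional, Sequence


def build_protmap_command(
    ids: Sequence[str],
    from_db: Optional[str] = None,
    to_db: Optional[str] = None,
    return_fields: Optional[Sequence[str]] = None,
    output: Optional[str] = None,
    overwrite: bool = False,
) -> List[str]:
    """Assemble an argument list for `protmap` from the flags in its help text."""
    if not ids:
        raise ValueError("at least one ID is required (-i/--ids)")
    cmd = ["protmap", "--ids", *ids]
    if from_db:
        cmd += ["--from-db", from_db]
    if to_db:
        cmd += ["--to-db", to_db]
    if return_fields:
        cmd += ["--return-fields", *return_fields]
    if output:
        cmd += ["--output", output]
        # --overwrite only matters when an output file is given.
        if overwrite:
            cmd.append("--overwrite")
    return cmd


cmd = build_protmap_command(
    ["P30542", "Q16678"],
    from_db="UniProtKB_AC-ID",
    to_db="Ensembl",
    output="mapped.txt",  # hypothetical file name
)
# Print a shell-safe version of the command for inspection.
print(" ".join(shlex.quote(arg) for arg in cmd))
```

Passing the list to `subprocess.run(cmd, check=True)` would execute it, provided `protmap` is installed and on the `PATH`.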

# 👏🏼 Credits:

- [UniProt](https://www.uniprot.org/) for providing the API and the amazing database;
- [Andrew White and the University of Rochester](https://github.com/whitead/protein-emoji) for the protein emoji.

            

Raw data

            {
    "_id": null,
    "home_page": null,
    "name": "uniprot-id-mapper",
    "maintainer": null,
    "docs_url": null,
    "requires_python": ">=3.7",
    "maintainer_email": "David Araripe <david.araripe17@gmail.com>",
    "keywords": "uniprot, database, protein ID, gene ID, parser",
    "author": null,
    "author_email": "David Araripe <david.araripe17@gmail.com>",
    "download_url": "https://files.pythonhosted.org/packages/91/15/cc5cfd32f822afc15c33b8db924467e97000d0b84dc16ab16db00219c3ba/uniprot_id_mapper-1.1.2.tar.gz",
    "platform": null,
    "bugtrack_url": null,
    "license": "MIT License  Copyright (c) 2023 David Araripe  Permission is hereby granted, free of charge, to any person obtaining a copy of this software and associated documentation files (the \"Software\"), to deal in the Software without restriction, including without limitation the rights to use, copy, modify, merge, publish, distribute, sublicense, and/or sell copies of the Software, and to permit persons to whom the Software is furnished to do so, subject to the following conditions:  The above copyright notice and this permission notice shall be included in all copies or substantial portions of the Software.  THE SOFTWARE IS PROVIDED \"AS IS\", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE SOFTWARE. ",
    "summary": "A Python wrapper for the UniProt Mapping RESTful API.",
    "version": "1.1.2",
    "project_urls": {
        "homepage": "https://github.com/David-Araripe/UniProtMapper",
        "repository": "https://github.com/David-Araripe/UniProtMapper"
    },
    "split_keywords": [
        "uniprot",
        " database",
        " protein id",
        " gene id",
        " parser"
    ],
    "urls": [
        {
            "comment_text": "",
            "digests": {
                "blake2b_256": "ad8fa77bd6f0f115fd52f99a4acd4098d2f5f1aa71f67f3b4f50b379c0fc6ded",
                "md5": "2f587ad4b637f191061145c16df8466a",
                "sha256": "d58bacc43dcb7f4ae13e744f4fa5d3ea59d899f3b34748dc694d4ca8f6ade9c3"
            },
            "downloads": -1,
            "filename": "uniprot_id_mapper-1.1.2-py3-none-any.whl",
            "has_sig": false,
            "md5_digest": "2f587ad4b637f191061145c16df8466a",
            "packagetype": "bdist_wheel",
            "python_version": "py3",
            "requires_python": ">=3.7",
            "size": 36201,
            "upload_time": "2024-07-19T08:34:45",
            "upload_time_iso_8601": "2024-07-19T08:34:45.353679Z",
            "url": "https://files.pythonhosted.org/packages/ad/8f/a77bd6f0f115fd52f99a4acd4098d2f5f1aa71f67f3b4f50b379c0fc6ded/uniprot_id_mapper-1.1.2-py3-none-any.whl",
            "yanked": false,
            "yanked_reason": null
        },
        {
            "comment_text": "",
            "digests": {
                "blake2b_256": "9115cc5cfd32f822afc15c33b8db924467e97000d0b84dc16ab16db00219c3ba",
                "md5": "dc92a1d2c5de2c611a320650f0d4f20e",
                "sha256": "5cc0bf5ca91cd123749b631cc1ea807052676085239cc4e4409e2036a367003b"
            },
            "downloads": -1,
            "filename": "uniprot_id_mapper-1.1.2.tar.gz",
            "has_sig": false,
            "md5_digest": "dc92a1d2c5de2c611a320650f0d4f20e",
            "packagetype": "sdist",
            "python_version": "source",
            "requires_python": ">=3.7",
            "size": 789217,
            "upload_time": "2024-07-19T08:34:47",
            "upload_time_iso_8601": "2024-07-19T08:34:47.513983Z",
            "url": "https://files.pythonhosted.org/packages/91/15/cc5cfd32f822afc15c33b8db924467e97000d0b84dc16ab16db00219c3ba/uniprot_id_mapper-1.1.2.tar.gz",
            "yanked": false,
            "yanked_reason": null
        }
    ],
    "upload_time": "2024-07-19 08:34:47",
    "github": true,
    "gitlab": false,
    "bitbucket": false,
    "codeberg": false,
    "github_user": "David-Araripe",
    "github_project": "UniProtMapper",
    "travis_ci": false,
    "coveralls": false,
    "github_actions": true,
    "lcname": "uniprot-id-mapper"
}
        