.. image:: https://pepy.tech/badge/mygene
:target: https://pepy.tech/project/mygene
.. image:: https://img.shields.io/pypi/dm/mygene.svg
:target: https://pypistats.org/packages/mygene
.. image:: https://badge.fury.io/py/mygene.svg
:target: https://pypi.org/project/mygene/
.. image:: https://img.shields.io/pypi/pyversions/mygene.svg
:target: https://pypi.org/project/mygene/
.. image:: https://img.shields.io/pypi/format/mygene.svg
:target: https://pypi.org/project/mygene/
.. image:: https://img.shields.io/pypi/status/mygene.svg
:target: https://pypi.org/project/mygene/
Intro
=====
MyGene.Info_ provides simple-to-use REST web services to query and retrieve gene annotation data.
It is designed for simplicity and performance. ``mygene`` is an easy-to-use Python
wrapper for accessing MyGene.Info_ services.
.. _MyGene.Info: http://mygene.info
.. _biothings_client: https://pypi.org/project/biothings-client/
.. _mygene: https://pypi.org/project/mygene/
Since v3.1.0, the mygene_ Python package has been a thin wrapper around the underlying biothings_client_ package,
a universal Python client for all `BioThings APIs <http://biothings.io>`_, including MyGene.info_.
Installing mygene_ installs biothings_client_ automatically. The following code snippets
are essentially equivalent:
* Continue using the mygene_ package
.. code-block:: python
In [1]: import mygene
In [2]: mg = mygene.MyGeneInfo()
* Use the biothings_client_ package directly
.. code-block:: python
In [1]: from biothings_client import get_client
In [2]: mg = get_client('gene')
After that, the ``mg`` instance is used in exactly the same way, as in the usage examples below.
Requirements
============
Python >=2.7 (including Python 3)
(Python 2.6 might still work, but it has not been supported since v3.1.0.)
biothings_client_ (>=0.2.0, install using "pip install biothings_client")
Optional dependencies
======================
`pandas <http://pandas.pydata.org>`_ (install using "pip install pandas") is required for
returning a list of gene objects as `DataFrame <http://pandas.pydata.org/pandas-docs/stable/dsintro.html#dataframe>`_.
Installation
=============
Option 1
install from PyPI with pip::
pip install mygene
Option 2
download/extract the source code and run::
python setup.py install
Option 3
install the latest code directly from the repository::
pip install -e git+https://github.com/biothings/mygene.py#egg=mygene
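After installing, it can be useful to confirm that both ``mygene`` and its ``biothings_client`` dependency are present, and which versions were picked up. The following is a minimal sketch, not part of ``mygene`` itself: ``installed_version`` is a hypothetical helper name, it relies on the standard-library ``importlib.metadata`` module (Python 3.8 or later), and the distribution names passed to it are assumptions based on the PyPI project names linked above.

```python
from importlib.metadata import PackageNotFoundError, version


def installed_version(dist_name):
    """Return the installed version string of a distribution, or None if absent.

    Hypothetical helper for illustration; it is not provided by mygene.
    """
    try:
        return version(dist_name)
    except PackageNotFoundError:
        return None


if __name__ == "__main__":
    # Distribution names are assumed from the PyPI project pages.
    for dist in ("mygene", "biothings-client"):
        found = installed_version(dist)
        print(dist, found if found is not None else "not installed")
```

The reported ``biothings_client`` version can then be compared by hand against the ``>=0.2.0`` requirement listed under Requirements.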
Version history
===============
`CHANGES.txt <https://raw.githubusercontent.com/SuLab/mygene.py/master/CHANGES.txt>`_
Tutorial
=========
* `ID mapping using mygene module in Python <http://nbviewer.ipython.org/6771106>`_
Documentation
=============
http://mygene-py.readthedocs.org/
Usage
=====
.. code-block:: python
In [1]: import mygene
In [2]: mg = mygene.MyGeneInfo()
In [3]: mg.getgene(1017)
Out[3]:
{'_id': '1017',
'entrezgene': 1017,
'name': 'cyclin-dependent kinase 2',
'symbol': 'CDK2',
'taxid': 9606,
...
}
# use "fields" parameter to return a subset of fields
In [4]: mg.getgene(1017, fields='name,symbol,refseq')
Out[4]:
{'_id': '1017',
'name': 'cyclin-dependent kinase 2',
'refseq': {'genomic': ['AC_000144.1',
'NC_000012.11',
'NG_028086.1',
'NT_029419.12',
'NW_001838059.1'],
'protein': ['NP_001789.2', 'NP_439892.2'],
'rna': ['NM_001798.3', 'NM_052827.2']},
'symbol': 'CDK2'}
In [5]: mg.getgene(1017, fields=['name', 'symbol', 'refseq.rna'])
Out[5]:
{'_id': '1017',
'name': 'cyclin-dependent kinase 2',
'refseq': {'rna': ['NM_001798.5', 'NM_052827.3']},
'symbol': 'CDK2'}
In [6]: mg.getgenes([1017,1018,'ENSG00000148795'], fields='name,symbol,entrezgene,taxid')
Out[6]:
[{'_id': '1017',
'entrezgene': 1017,
'name': 'cyclin-dependent kinase 2',
'query': '1017',
'symbol': 'CDK2',
'taxid': 9606},
{'_id': '1018',
'entrezgene': 1018,
'name': 'cyclin-dependent kinase 3',
'query': '1018',
'symbol': 'CDK3',
'taxid': 9606},
{'_id': '1586',
'entrezgene': 1586,
'name': 'cytochrome P450, family 17, subfamily A, polypeptide 1',
'query': 'ENSG00000148795',
'symbol': 'CYP17A1',
'taxid': 9606}]
# return results in Pandas DataFrame
In [7]: mg.getgenes([1017,1018,'ENSG00000148795'], fields='name,symbol,entrezgene,taxid', as_dataframe=True)
Out[7]:
                  _id  entrezgene  \
query
1017             1017        1017
1018             1018        1018
ENSG00000148795  1586        1586

                                                              name   symbol  \
query
1017                                     cyclin-dependent kinase 2     CDK2
1018                                     cyclin-dependent kinase 3     CDK3
ENSG00000148795  cytochrome P450, family 17, subfamily A, polyp...  CYP17A1

                 taxid
query
1017              9606
1018              9606
ENSG00000148795   9606

[3 rows x 5 columns]
In [8]: mg.query('cdk2', size=5)
Out[8]:
{'hits': [{'_id': '1017',
'_score': 373.24667,
'entrezgene': 1017,
'name': 'cyclin-dependent kinase 2',
'symbol': 'CDK2',
'taxid': 9606},
{'_id': '12566',
'_score': 353.90176,
'entrezgene': 12566,
'name': 'cyclin-dependent kinase 2',
'symbol': 'Cdk2',
'taxid': 10090},
{'_id': '362817',
'_score': 264.88477,
'entrezgene': 362817,
'name': 'cyclin dependent kinase 2',
'symbol': 'Cdk2',
'taxid': 10116},
{'_id': '52004',
'_score': 21.221401,
'entrezgene': 52004,
'name': 'CDK2-associated protein 2',
'symbol': 'Cdk2ap2',
'taxid': 10090},
{'_id': '143384',
'_score': 18.617256,
'entrezgene': 143384,
'name': 'CDK2-associated, cullin domain 1',
'symbol': 'CACUL1',
'taxid': 9606}],
'max_score': 373.24667,
'took': 10,
'total': 28}
In [9]: mg.query('reporter:1000_at')
Out[9]:
{'hits': [{'_id': '5595',
'_score': 11.163337,
'entrezgene': 5595,
'name': 'mitogen-activated protein kinase 3',
'symbol': 'MAPK3',
'taxid': 9606}],
'max_score': 11.163337,
'took': 6,
'total': 1}
In [10]: mg.query('symbol:cdk2', species='human')
Out[10]:
{'hits': [{'_id': '1017',
'_score': 84.17707,
'entrezgene': 1017,
'name': 'cyclin-dependent kinase 2',
'symbol': 'CDK2',
'taxid': 9606}],
'max_score': 84.17707,
'took': 27,
'total': 1}
In [11]: mg.querymany([1017, '695'], scopes='entrezgene', species='human')
Finished.
Out[11]:
[{'_id': '1017',
'entrezgene': 1017,
'name': 'cyclin-dependent kinase 2',
'query': '1017',
'symbol': 'CDK2',
'taxid': 9606},
{'_id': '695',
'entrezgene': 695,
'name': 'Bruton agammaglobulinemia tyrosine kinase',
'query': '695',
'symbol': 'BTK',
'taxid': 9606}]
In [12]: mg.querymany([1017, '695'], scopes='entrezgene', species=9606)
Finished.
Out[12]:
[{'_id': '1017',
'entrezgene': 1017,
'name': 'cyclin-dependent kinase 2',
'query': '1017',
'symbol': 'CDK2',
'taxid': 9606},
{'_id': '695',
'entrezgene': 695,
'name': 'Bruton agammaglobulinemia tyrosine kinase',
'query': '695',
'symbol': 'BTK',
'taxid': 9606}]
In [13]: mg.querymany([1017, '695'], scopes='entrezgene', species=9606, as_dataframe=True)
Finished.
Out[13]:
        _id  entrezgene                                       name  symbol  \
query
1017   1017        1017                  cyclin-dependent kinase 2    CDK2
695     695         695  Bruton agammaglobulinemia tyrosine kinase     BTK

       taxid
query
1017    9606
695     9606

[2 rows x 5 columns]
In [14]: mg.querymany([1017, '695', 'NA_TEST'], scopes='entrezgene', species='human')
Finished.
Out[14]:
[{'_id': '1017',
'entrezgene': 1017,
'name': 'cyclin-dependent kinase 2',
'query': '1017',
'symbol': 'CDK2',
'taxid': 9606},
{'_id': '695',
'entrezgene': 695,
'name': 'Bruton agammaglobulinemia tyrosine kinase',
'query': '695',
'symbol': 'BTK',
'taxid': 9606},
{'notfound': True, 'query': 'NA_TEST'}]
# query all human kinases using fetch_all parameter:
In [15]: kinases = mg.query('name:kinase', species='human', fetch_all=True)
In [16]: kinases
Out[16]: <generator object _fetch_all at 0x7fec027d2eb0>
# kinases is a Python generator; loop through it to get all 1073 hits:
In [17]: for gene in kinases:
   ....:     print(gene['_id'], gene['symbol'])
# <output omitted here>
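The ``querymany`` output shown in ``Out[14]`` above marks a query with no match using ``'notfound': True``. A minimal sketch of separating matched records from unmatched queries follows. ``split_hits`` is a hypothetical helper, not part of ``mygene``, and the sample list is copied from the output above so that the snippet runs without a network call.

```python
def split_hits(results):
    """Split querymany-style results into matched records and unmatched queries.

    Hypothetical helper for illustration; it is not provided by mygene.
    Returns (found, missing): found maps each query term to a list of hits,
    missing lists the query terms flagged with 'notfound'.
    """
    found, missing = {}, []
    for hit in results:
        if hit.get("notfound"):
            missing.append(hit["query"])
        else:
            # Keep a list per query, since a term can match more than one gene.
            found.setdefault(hit["query"], []).append(hit)
    return found, missing


# Sample data copied from the Out[14] listing above.
results = [
    {"_id": "1017", "entrezgene": 1017, "name": "cyclin-dependent kinase 2",
     "query": "1017", "symbol": "CDK2", "taxid": 9606},
    {"_id": "695", "entrezgene": 695,
     "name": "Bruton agammaglobulinemia tyrosine kinase",
     "query": "695", "symbol": "BTK", "taxid": 9606},
    {"notfound": True, "query": "NA_TEST"},
]

found, missing = split_hits(results)
print(sorted(found))  # ['1017', '695']
print(missing)        # ['NA_TEST']
```

With a live client, ``results`` would instead come from a call such as the ``mg.querymany(...)`` line shown in ``In [14]``.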
Contact
========
Send us any questions or feedback:
* biothings@googlegroups.com (public discussion)
* help@mygene.info (reach devs privately)
* `GitHub issues <https://github.com/biothings/mygene.info/issues>`_
* On Twitter: `@mygeneinfo <https://twitter.com/mygeneinfo>`_
* Post a question on `BioStars.org <https://www.biostars.org/p/new/post/?tag_val=mygene>`_ with the tag #mygene.
Raw data
{
"_id": null,
"home_page": "https://github.com/biothings/mygene.py",
"name": "mygene",
"maintainer": "",
"docs_url": null,
"requires_python": "",
"maintainer_email": "",
"keywords": "biology gene annotation web service client api",
"author": "Chunlei Wu, Cyrus Afrasiabi, Sebastien Lelong",
"author_email": "cwu@scripps.edu",
"download_url": "https://files.pythonhosted.org/packages/0a/ec/a256003f84196aa3fdd65a7c6f5adfc0688398fb66442eba75b39c9b7627/mygene-3.2.2.tar.gz",
"platform": "",
"bugtrack_url": null,
"license": "BSD",
"summary": "Python Client for MyGene.Info services.",
"version": "3.2.2",
"split_keywords": [
"biology",
"gene",
"annotation",
"web",
"service",
"client",
"api"
],
"urls": [
{
"comment_text": "",
"digests": {
"md5": "1d18c51963a35f89aa30ef674ca186e1",
"sha256": "18d85d1b28ecee2be31d844607fb0c5f7d7c58573278432df819ee2a5e88fe46"
},
"downloads": -1,
"filename": "mygene-3.2.2-py2.py3-none-any.whl",
"has_sig": false,
"md5_digest": "1d18c51963a35f89aa30ef674ca186e1",
"packagetype": "bdist_wheel",
"python_version": "py2.py3",
"requires_python": null,
"size": 5357,
"upload_time": "2021-04-05T21:24:29",
"upload_time_iso_8601": "2021-04-05T21:24:29.070886Z",
"url": "https://files.pythonhosted.org/packages/a7/b7/132b1673c0ec00881d49d56c09624942fa0ebd2fc21d73d80647efa082e9/mygene-3.2.2-py2.py3-none-any.whl",
"yanked": false,
"yanked_reason": null
},
{
"comment_text": "",
"digests": {
"md5": "c52d987c25c68e15b35efebf636de339",
"sha256": "e729cabbc28cf5afb221bca1ab637883b375cb1a3e2f067587ec79f71affdaea"
},
"downloads": -1,
"filename": "mygene-3.2.2.tar.gz",
"has_sig": false,
"md5_digest": "c52d987c25c68e15b35efebf636de339",
"packagetype": "sdist",
"python_version": "source",
"requires_python": null,
"size": 5399,
"upload_time": "2021-04-05T21:24:30",
"upload_time_iso_8601": "2021-04-05T21:24:30.934280Z",
"url": "https://files.pythonhosted.org/packages/0a/ec/a256003f84196aa3fdd65a7c6f5adfc0688398fb66442eba75b39c9b7627/mygene-3.2.2.tar.gz",
"yanked": false,
"yanked_reason": null
}
],
"upload_time": "2021-04-05 21:24:30",
"github": true,
"gitlab": false,
"bitbucket": false,
"github_user": "biothings",
"github_project": "mygene.py",
"travis_ci": false,
"coveralls": false,
"github_actions": false,
"tox": true,
"lcname": "mygene"
}