# pyEnsemblRest
`pyEnsemblRest` is a simple Python client for the Ensembl REST API.
## License
pyEnsemblRest - A client for the Ensembl REST API written in the Python
programming language
Copyright (C) 2013-2024, Steve Moss
pyEnsemblRest is free software: you can redistribute it and/or modify it
under the terms of the GNU General Public License as published by the
Free Software Foundation, either version 3 of the License, or (at your
option) any later version.
pyEnsemblRest is distributed in the hope that it will be useful, but
WITHOUT ANY WARRANTY; without even the implied warranty of
MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU General
Public License for more details.
You should have received a copy of the GNU General Public License along
with pyEnsemblRest. If not, see <http://www.gnu.org/licenses/>.
## Installation
### Using pip
Simply type:
``` bash
pip install pyensemblrest
```
### From source
Clone the pyEnsemblRest repository then install the package from source:
``` bash
git clone https://github.com/gawbul/pyEnsemblRest.git
cd pyEnsemblRest
make install
```
## Usage
To import and set up a new `EnsemblRest` object, do the following:
``` python
from pyensemblrest import EnsemblRest
ensRest = EnsemblRest()
```
The `EnsemblRest()` instance points to <https://rest.ensembl.org/> by default.
To use a custom Ensembl REST server, set up the `EnsemblRest()` instance as follows:
``` python
from pyensemblrest import EnsemblRest
# set up the REST object to point to a localhost server; 3000 is the default REST port
ensRest = EnsemblRest(base_url='http://localhost:3000')
```
You may also provide proxy server settings in the form of a dict, as follows:
``` python
from pyensemblrest import EnsemblRest
# set up the REST object to use a proxy server
ensRest = EnsemblRest(proxies={'http':'proxy.address.com:3128', 'https':'proxy.address.com:3128'})
```
Ensembl enforces a rate limit on requests: you can make up to
15 requests per second. One option is to pause between your requests:
``` python
from time import sleep
# sleep for a second so we don't get rate-limited
sleep(1)
```
Alternatively, this library checks and limits your requests to 15
requests per second. Avoid running several Python processes to fetch your
data, otherwise you will be blacklisted by the Ensembl team. If you have to
make a lot of requests, consider using endpoints that support POST, or
contact the Ensembl team to ask for POST support on the endpoints you need.
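As a rough illustration of client-side throttling, the sketch below blocks until fewer than 15 calls have started within the last second. `Throttle` is a hypothetical helper written for this README, not part of pyEnsemblRest, and it says nothing about how the library implements its own limit internally.

```python
import time
from collections import deque


class Throttle:
    """Hypothetical helper: allow at most `max_calls` per `period` seconds."""

    def __init__(self, max_calls=15, period=1.0, clock=time.monotonic, sleep=time.sleep):
        self.max_calls = max_calls
        self.period = period
        self.clock = clock
        self.sleep = sleep
        self.calls = deque()  # start times of recent calls

    def wait(self):
        """Block until another call is allowed, then record it."""
        now = self.clock()
        # forget calls that have already left the sliding window
        while self.calls and now - self.calls[0] >= self.period:
            self.calls.popleft()
        if len(self.calls) >= self.max_calls:
            # sleep until the oldest recorded call leaves the window
            self.sleep(self.period - (now - self.calls[0]))
            now = self.clock()
            self.calls.popleft()
        self.calls.append(now)
```

Calling `throttle.wait()` before each request keeps a single process under the limit. It does not coordinate several processes, which is why running multiple clients is still discouraged.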
### GET endpoints
`EnsemblRest` class methods are not defined in the library itself, so you
cannot see docstrings using `help()` in the Python or
IPython terminal. However, you can see all methods available for
the [Ensembl](https://rest.ensembl.org/) REST server once the class
is instantiated. For help on a particular method, refer to the
endpoint documentation of the
[Ensembl](https://rest.ensembl.org/) REST service. For example, the
[sequence](https://rest.ensembl.org/documentation/info/sequence_id)
endpoint documentation lists its optional and required parameters.
Required parameters must be specified,
otherwise you will get an exception. Optional parameters may be
specified or not, depending on your request. In all cases, parameter names
are the same as those used in the documentation. For example, to get data using
the [sequence](http://rest.ensembl.org/documentation/info/sequence_id)
endpoint, you must specify at least the required parameters:
``` python
seq = ensRest.getSequenceById(id='ENSG00000157764')
```
To mask the sequence and expand the 5' UTR, set the optional
parameters using the same names as in the documentation:
``` python
seq = ensRest.getSequenceById(id='ENSG00000157764', mask="soft", expand_5prime=1000)
```
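Since `help()` is of little use here, one way to discover the available calls is to list the public callables of an instance. The sketch below uses a hypothetical helper, `list_methods`, and a stand-in class so that it runs without network access. It assumes, as described above, that the endpoint methods show up as attributes of the instantiated object.

```python
def list_methods(obj, prefix="get"):
    """Return sorted names of callable attributes of `obj` starting with `prefix`."""
    return sorted(
        name
        for name in dir(obj)
        if name.startswith(prefix) and callable(getattr(obj, name))
    )


class FakeClient:
    """Stand-in for an EnsemblRest instance (illustration only)."""

    def getSequenceById(self, **kwargs):
        return kwargs

    def getLookupById(self, **kwargs):
        return kwargs


print(list_methods(FakeClient()))
```

With a real client, the equivalent call would be `list_methods(ensRest)`.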
Multiple values for certain parameters (for GET methods) can be
submitted in a list. For example, to get the same result as
``` bash
curl 'http://rest.ensembl.org/overlap/region/human/7:140424943-140624564?feature=gene;feature=transcript;feature=cds;feature=exon' -H 'Content-type:application/json'
```
as described in the [overlap region](https://rest.ensembl.org/documentation/info/overlap_region)
GET endpoint documentation, use the following method and parameters:
``` python
data = ensRest.getOverlapByRegion(species="human", region="7:140424943-140624564", feature=["gene", "transcript", "cds", "exon"])
```
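To make the list-to-query-string relationship concrete, the standard library can expand a list into repeated `key=value` pairs. This is only an illustration of the idea: how pyEnsemblRest builds its URLs internally is not shown in this README. Note that `urlencode` joins pairs with `&`, whereas the curl example above uses `;`.

```python
from urllib.parse import urlencode

params = {"feature": ["gene", "transcript", "cds", "exon"]}

# doseq=True expands each list item into its own key=value pair
query = urlencode(params, doseq=True)
print(query)  # feature=gene&feature=transcript&feature=cds&feature=exon
```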
### POST endpoints
POST endpoints are used like the GET endpoints. The only difference is
that they accept a Python list of values in order to perform multiple
queries on the same Ensembl endpoint. The parameter names are the same as those
used in the documentation. For example, we can use the [POST sequence](https://rest.ensembl.org/documentation/info/sequence_id_post)
endpoint as follows:
``` python
seqs = ensRest.getSequenceByMultipleIds(ids=["ENSG00000157764", "ENSG00000248378"])
```
where the example values
`{ "ids": ["ENSG00000157764", "ENSG00000248378"] }` are converted to
the keyword argument `ids=["ENSG00000157764", "ENSG00000248378"]`.
As with the previous example, we can add optional parameters:
``` python
seqs = ensRest.getSequenceByMultipleIds(ids=["ENSG00000157764", "ENSG00000248378"], mask="soft")
```
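When the list of identifiers is long, it can help to split it into batches before calling a POST method. The sketch below shows the JSON body that corresponds to the keyword argument, plus a hypothetical `chunked` helper. The batch size of 50 is a placeholder assumption, so check the endpoint documentation for the actual maximum POST size.

```python
import json


def chunked(items, size):
    """Yield successive slices of `items` with at most `size` elements."""
    for start in range(0, len(items), size):
        yield items[start:start + size]


ids = ["ENSG00000157764", "ENSG00000248378"]

# JSON body equivalent to the keyword argument ids=[...]
payload = json.dumps({"ids": ids})
print(payload)

# With a real client, each batch could be sent in turn (50 is an assumed size):
# for batch in chunked(all_ids, 50):
#     seqs = ensRest.getSequenceByMultipleIds(ids=batch)
```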
### Change the default output format
You can change the default output format by passing a supported
`Content-type` using the `content_type` parameter, for example:
``` python
plain_xml = ensRest.getArchiveById(id='ENSG00000157764', content_type="text/xml")
```
For a complete list of supported `Content-type` values, see [Supported MIME
Types](https://github.com/Ensembl/ensembl-rest/wiki/Output-formats#supported-mime-types)
in the Ensembl REST documentation. You also need to check that the same
`Content-type` is supported in the Ensembl endpoint description.
### Rate limiting
Sometimes you can be rate-limited if you are querying the Ensembl REST
services with more than one concurrent process, or by [sharing IP
addresses](https://github.com/Ensembl/ensembl-rest/wiki#example-clients).
In that case, you may receive a message like this:
``` bash
ensemblrest.exceptions.EnsemblRestRateLimitError: EnsEMBL REST API returned a 429 (Too Many Requests): You have been rate-limited; wait and retry. The headers X-RateLimit-Reset, X-RateLimit-Limit and X-RateLimit-Remaining will inform you of how long you have until your limit is reset and what that limit was. If you get this response and have not exceeded your limit then check if you have made too many requests per second. (Rate limit hit: Retry after 2 seconds)
```
Even though this library tries to stay within 15 requests per second, you should
avoid running multiple Ensembl REST clients. To deal with such problems
without interrupting your code, handle the exception. For
example:
``` python
# import required modules
import os
import sys
import time
import logging
# get ensembl REST modules and exception
from pyensemblrest import EnsemblRest
from pyensemblrest import EnsemblRestRateLimitError
# A useful way to define a logger level, handler, and formatter
logging.basicConfig(format='%(asctime)s - %(name)s - %(levelname)s - %(message)s', level=logging.INFO)
logger = logging.getLogger(os.path.basename(sys.argv[0]))

# set up a new EnsemblRest object
ensRest = EnsemblRest()

# Make a request and deal with retry_after. Set a maximum number of retries (don't
# retry the same request forever or you will be banned from Ensembl!)
attempt = 0
max_attempts = 3

while attempt < max_attempts:
    # update attempt count
    attempt += 1

    try:
        result = ensRest.getLookupById(id='ENSG00000157764')
        # exit while on success
        break

    # log exception and sleep a certain amount of time (sleeping time increases at each step)
    except EnsemblRestRateLimitError as message:
        logger.warning(message)
        time.sleep(ensRest.retry_after * attempt)
else:
    # the loop ended without a break, so every attempt was rate-limited
    raise Exception("max attempts exceeded (%s)" % (max_attempts))

sys.stdout.write("%s\n" % (result))
sys.stdout.flush()
```
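The same retry pattern can be wrapped in a reusable function. The sketch below is a generic helper, not part of pyEnsemblRest: `call_with_retries` and its parameters are hypothetical names, and the default delay (one second times the attempt number) is an arbitrary choice.

```python
import time


def call_with_retries(func, retry_on, max_attempts=3,
                      delay=lambda attempt: attempt, sleep=time.sleep):
    """Call `func()` and retry when it raises `retry_on`.

    `delay(attempt)` gives the seconds to wait after a failed attempt.
    The last failure is re-raised once `max_attempts` is reached.
    """
    for attempt in range(1, max_attempts + 1):
        try:
            return func()
        except retry_on:
            if attempt == max_attempts:
                raise
            sleep(delay(attempt))
```

With the objects from the example above, a call might look like `call_with_retries(lambda: ensRest.getLookupById(id='ENSG00000157764'), EnsemblRestRateLimitError, delay=lambda n: ensRest.retry_after * n)`.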
### Methods list
Here is a list of all methods defined. The methods called on the `ensRest`
instance are specific to the [Ensembl](https://rest.ensembl.org/)
REST server.
To access the *Archive* endpoints you can use the following methods:
``` python
print(ensRest.getArchiveById(id="ENSG00000157764"))
print(ensRest.getArchiveByMultipleIds(id=["ENSG00000157764", "ENSG00000248378"]))
```
To access the *Comparative Genomics* endpoints you can use the following
methods:
``` python
print(ensRest.getCafeGeneTreeById(id="ENSGT00390000003602"))
print(ensRest.getCafeGeneTreeMemberBySymbol(species="human", symbol="BRCA2"))
print(ensRest.getCafeGeneTreeMemberById(species="human", id="ENSG00000167664"))
print(ensRest.getGeneTreeById(id="ENSGT00390000003602"))
print(ensRest.getGeneTreeMemberBySymbol(species="human", symbol="BRCA2"))
print(ensRest.getGeneTreeMemberById(species="human", id="ENSG00000167664"))
print(
    ensRest.getAlignmentByRegion(
        species="human",
        region="X:1000000..1000100:1",
        species_set_group="mammals",
    )
)
print(ensRest.getHomologyById(species="human", id="ENSG00000157764"))
print(ensRest.getHomologyBySymbol(species="human", symbol="BRCA2"))
```
To access the *Cross References* endpoints you can use the following
methods:
``` python
print(ensRest.getXrefsBySymbol(species="human", symbol="BRCA2"))
print(ensRest.getXrefsById(id="ENSG00000157764"))
print(ensRest.getXrefsByName(species="human", name="BRCA2"))
```
To access the *Information* endpoints you can use the following methods:
``` python
print(ensRest.getInfoAnalysis(species="homo_sapiens"))
print(
    ensRest.getInfoAssembly(species="homo_sapiens", bands=1)
)  # bands is an optional parameter
print(ensRest.getInfoAssemblyRegion(species="homo_sapiens", region_name="X"))
ensRest.timeout = 300
print(ensRest.getInfoBiotypes(species="homo_sapiens")) # this keeps timing out
ensRest.timeout = 60
print(ensRest.getInfoBiotypesByGroup(group="coding", object_type="gene"))
print(ensRest.getInfoBiotypesByName(name="protein_coding", object_type="gene"))
print(ensRest.getInfoComparaMethods())
print(ensRest.getInfoComparaSpeciesSets(methods="EPO"))
print(ensRest.getInfoComparas())
print(ensRest.getInfoData())
print(ensRest.getInfoEgVersion())
print(ensRest.getInfoExternalDbs(species="homo_sapiens"))
print(ensRest.getInfoDivisions())
print(ensRest.getInfoGenomesByName(name="arabidopsis_thaliana"))
print(ensRest.getInfoGenomesByAccession(accession="U00096"))
print(ensRest.getInfoGenomesByAssembly(assembly_id="GCA_902167145.1"))
print(ensRest.getInfoGenomesByDivision(division="EnsemblPlants"))
print(ensRest.getInfoGenomesByTaxonomy(taxon_name="Homo sapiens"))
print(ensRest.getInfoPing())
print(ensRest.getInfoRest())
print(ensRest.getInfoSoftware())
print(ensRest.getInfoSpecies())
print(ensRest.getInfoVariationBySpecies(species="homo_sapiens"))
print(ensRest.getInfoVariationConsequenceTypes())
print(
    ensRest.getInfoVariationPopulationIndividuals(
        species="human", population_name="1000GENOMES:phase_3:ASW"
    )
)
# Restrict populations returned to e.g. only populations with LD data. It is highly recommended
# to set a filter and to avoid loading the complete list of populations.
print(ensRest.getInfoVariationPopulations(species="homo_sapiens", filter="LD"))
```
To access the *Linkage Disequilibrium* endpoints you can use the
following methods:
``` python
print(
    ensRest.getLdId(
        species="homo_sapiens",
        id="rs56116432",
        population_name="1000GENOMES:phase_3:KHV",
        window_size=500,
        d_prime=1.0,
    )
)
print(ensRest.getLdPairwise(species="homo_sapiens", id1="rs6792369", id2="rs1042779"))
print(
    ensRest.getLdRegion(
        species="homo_sapiens",
        region="6:25837556..25843455",
        population_name="1000GENOMES:phase_3:KHV",
    )
)
```
To access the *Lookup* endpoints you can use the following methods:
``` python
print(ensRest.getLookupById(id="ENSG00000157764"))
print(ensRest.getLookupByMultipleIds(ids=["ENSG00000157764", "ENSG00000248378"]))
print(ensRest.getLookupBySymbol(species="homo_sapiens", symbol="BRCA2", expand=1))
print(
    ensRest.getLookupByMultipleSymbols(
        species="homo_sapiens", symbols=["BRCA2", "BRAF"]
    )
)
```
To access the *Mapping* endpoints you can use the following methods:
``` python
print(ensRest.getMapCdnaToRegion(id="ENST00000288602", region="100..300"))
print(ensRest.getMapCdsToRegion(id="ENST00000288602", region="1..1000"))
print(
    ensRest.getMapAssemblyOneToTwo(
        species="homo_sapiens",
        asm_one="GRCh37",
        region="X:1000000..1000100:1",
        asm_two="GRCh38",
    )
)
print(ensRest.getMapTranslationToRegion(id="ENSP00000288602", region="100..300"))
```
To access the *Ontologies and Taxonomy* endpoints you can use the
following methods:
``` python
print(ensRest.getAncestorsById(id="GO:0005667"))
print(ensRest.getAncestorsChartById(id="GO:0005667"))
print(ensRest.getDescendantsById(id="GO:0005667"))
print(ensRest.getOntologyById(id="GO:0005667"))
print(ensRest.getOntologyByName(name="transcription factor complex"))
print(ensRest.getTaxonomyClassificationById(id="9606"))
print(ensRest.getTaxonomyById(id="9606"))
print(ensRest.getTaxonomyByName(name="Homo%25"))
```
To access the *Overlap* endpoints you can use the following methods:
``` python
print(ensRest.getOverlapById(id="ENSG00000157764", feature="gene"))
print(
    ensRest.getOverlapByRegion(
        species="homo_sapiens", region="X:1..1000:1", feature="gene"
    )
)
print(ensRest.getOverlapByTranslation(id="ENSP00000288602"))
```
To access the *Phenotype annotations* endpoints you can use the following methods:
``` python
print(ensRest.getPhenotypeByAccession(species="homo_sapiens", accession="EFO:0003900"))
print(ensRest.getPhenotypeByGene(species="homo_sapiens", gene="ENSG00000157764"))
print(
    ensRest.getPhenotypeByRegion(species="homo_sapiens", region="9:22125500-22136000:1")
)
print(ensRest.getPhenotypeByTerm(species="homo_sapiens", term="coffee consumption"))
```
To access the *Regulation* endpoints you can use the following method:
``` python
print(
    ensRest.getRegulationBindingMatrix(
        species="homo_sapiens", binding_matrix="ENSPFM0001"
    )
)
```
To access the *Sequences* endpoints you can use the following methods:
``` python
print(ensRest.getSequenceById(id="ENSG00000157764"))
print(ensRest.getSequenceByMultipleIds(ids=["ENSG00000157764", "ENSG00000248378"]))
print(
    ensRest.getSequenceByRegion(species="homo_sapiens", region="X:1000000..1000100:1")
)
print(
    ensRest.getSequenceByMultipleRegions(
        species="homo_sapiens",
        regions=["X:1000000..1000100:1", "ABBA01004489.1:1..100"],
    )
)
```
To access the *Transcript Haplotypes* endpoints you can use the
following method:
``` python
print(ensRest.getTranscriptHaplotypes(species="homo_sapiens", id="ENST00000288602"))
```
To access the *VEP* endpoints you can use the following methods:
``` python
print(
    ensRest.getVariantConsequencesByHGVSNotation(
        species="homo_sapiens", hgvs_notation="ENST00000366667:c.803C>T"
    )
)
print(
    ensRest.getVariantConsequencesByMultipleHGVSNotations(
        species="homo_sapiens",
        hgvs_notations=["ENST00000366667:c.803C>T", "9:g.22125504G>C"],
    )
)
print(ensRest.getVariantConsequencesById(species="homo_sapiens", id="rs56116432"))
print(
    ensRest.getVariantConsequencesByMultipleIds(
        species="homo_sapiens", ids=["rs56116432", "COSM476", "__VAR(sv_id)__"]
    )
)
print(
    ensRest.getVariantConsequencesByRegion(
        species="homo_sapiens", region="9:22125503-22125502:1", allele="C"
    )
)
print(
    ensRest.getVariantConsequencesByMultipleRegions(
        species="homo_sapiens",
        variants=[
            "21 26960070 rs116645811 G A . . .",
            "21 26965148 rs1135638 G A . . .",
        ],
    )
)
```
To access the *Variation* endpoints you can use the following methods:
``` python
print(ensRest.getVariationRecoderById(species="homo_sapiens", id="rs56116432"))
print(
    ensRest.getVariationRecoderByMultipleIds(
        species="homo_sapiens", ids=["rs56116432", "rs1042779"]
    )
)
print(ensRest.getVariationById(species="homo_sapiens", id="rs56116432"))
print(ensRest.getVariationByPMCID(species="homo_sapiens", pmcid="PMC5002951"))
print(ensRest.getVariationByPMID(species="homo_sapiens", pmid="26318936"))
print(
    ensRest.getVariationByMultipleIds(
        species="homo_sapiens", ids=["rs56116432", "COSM476", "__VAR(sv_id)__"]
    )
)
```
To access the *Variation GA4GH* endpoints you can use the following
methods:
``` python
print(ensRest.getGA4GHBeacon())
print(
    ensRest.getGA4GHBeaconQuery(
        alternateBases="C",
        assemblyId="GRCh38",
        end="23125503",
        referenceBases="G",
        referenceName="9",
        start="22125503",
        variantType="DUP",
    )
)
print(
    ensRest.postGA4GHBeaconQuery(
        alternateBases="C",
        assemblyId="GRCh38",
        end="23125503",
        referenceBases="G",
        referenceName="9",
        start="22125503",
        variantType="DUP",
    )
)
print(ensRest.getGA4GHFeaturesById(id="ENST00000408937.7"))
ensRest.timeout = 180
print(
    ensRest.searchGA4GHFeatures(
        parentId="ENST00000408937.7",
        featureSetId="",
        featureTypes=["cds"],
        end=220023,
        referenceName="X",
        start=197859,
        pageSize=1,
    )
)  # this keeps timing out
ensRest.timeout = 60
print(ensRest.searchGA4GHCallset(variantSetId=1, pageSize=2))
print(ensRest.getGA4GHCallsetById(id="1"))
print(ensRest.searchGA4GHDatasets(pageSize=3))
print(ensRest.getGA4GHDatasetsById(id="6e340c4d1e333c7a676b1710d2e3953c"))
print(ensRest.searchGA4GHFeaturesets(datasetId="Ensembl"))
print(ensRest.getGA4GHFeaturesetsById(id="Ensembl"))
print(ensRest.getGA4GHVariantsById(id="1:rs1333049"))
print(
    ensRest.searchGA4GHVariantAnnotations(
        variantAnnotationSetId="Ensembl",
        referenceId="9489ae7581e14efcad134f02afafe26c",
        start=25221400,
        end=25221500,
        pageSize=1,
    )
)
print(
    ensRest.searchGA4GHVariants(
        variantSetId=1,
        referenceName=22,
        start=25455086,
        end=25455087,
        pageToken="",
        pageSize=1,
    )
)
print(
    ensRest.searchGA4GHVariantsets(
        datasetId="6e340c4d1e333c7a676b1710d2e3953c", pageToken="", pageSize=2
    )
)
print(ensRest.getGA4GHVariantsetsById(id=1))
print(ensRest.searchGA4GHReferences(referenceSetId="GRCh38", pageSize=10))
print(ensRest.getGA4GHReferencesById(id="9489ae7581e14efcad134f02afafe26c"))
print(ensRest.searchGA4GHReferencesets())
print(ensRest.getGA4GHReferencesetsById(id="GRCh38"))
print(ensRest.searchGA4GHVariantAnnotationsets(variantSetId="Ensembl"))
print(ensRest.getGA4GHVariantAnnotationsetsById(id="Ensembl"))
```
referenceBases=\"G\",\n referenceName=\"9\",\n start=\"22125503\",\n variantType=\"DUP\",\n )\n)\nprint(ensRest.getGA4GHFeaturesById(id=\"ENST00000408937.7\"))\nensRest.timeout = 180\nprint(\n ensRest.searchGA4GHFeatures(\n parentId=\"ENST00000408937.7\",\n featureSetId=\"\",\n featureTypes=[\"cds\"],\n end=220023,\n referenceName=\"X\",\n start=197859,\n pageSize=1,\n )\n) # this keeps timing out\nensRest.timeout = 60\nprint(ensRest.searchGA4GHCallset(variantSetId=1, pageSize=2))\nprint(ensRest.getGA4GHCallsetById(id=\"1\"))\nprint(ensRest.searchGA4GHDatasets(pageSize=3))\nprint(ensRest.getGA4GHDatasetsById(id=\"6e340c4d1e333c7a676b1710d2e3953c\"))\nprint(ensRest.searchGA4GHFeaturesets(datasetId=\"Ensembl\"))\nprint(ensRest.getGA4GHFeaturesetsById(id=\"Ensembl\"))\nprint(ensRest.getGA4GHVariantsById(id=\"1:rs1333049\"))\nprint(\n ensRest.searchGA4GHVariantAnnotations(\n variantAnnotationSetId=\"Ensembl\",\n referenceId=\"9489ae7581e14efcad134f02afafe26c\",\n start=25221400,\n end=25221500,\n pageSize=1,\n )\n)\nprint(\n ensRest.searchGA4GHVariants(\n variantSetId=1,\n referenceName=22,\n start=25455086,\n end=25455087,\n pageToken=\"\",\n pageSize=1,\n )\n)\nprint(\n ensRest.searchGA4GHVariantsets(\n datasetId=\"6e340c4d1e333c7a676b1710d2e3953c\", pageToken=\"\", pageSize=2\n )\n)\nprint(ensRest.getGA4GHVariantsetsById(id=1))\nprint(ensRest.searchGA4GHReferences(referenceSetId=\"GRCh38\", pageSize=10))\nprint(ensRest.getGA4GHReferencesById(id=\"9489ae7581e14efcad134f02afafe26c\"))\nprint(ensRest.searchGA4GHReferencesets())\nprint(ensRest.getGA4GHReferencesetsById(id=\"GRCh38\"))\nprint(ensRest.searchGA4GHVariantAnnotationsets(variantSetId=\"Ensembl\"))\nprint(ensRest.getGA4GHVariantAnnotationsetsById(id=\"Ensembl\"))\n```\n",
"bugtrack_url": null,
"license": "GPL-3.0-or-later",
"summary": "A Python Ensembl REST API client",
"version": "0.3.2",
"project_urls": {
"Documentation": "https://github.com/gawbul/pyEnsemblRest?tab=readme-ov-file",
"Homepage": "https://github.com/gawbul/pyEnsemblRest",
"Repository": "https://github.com/gawbul/pyEnsemblRest"
},
"split_keywords": [
"ensembl",
"python",
"rest",
"api"
],
"urls": [
{
"comment_text": "",
"digests": {
"blake2b_256": "50616508a31e3c47e5b9479f30286c0944a3eaee294a5843b5082fe831443149",
"md5": "c54b5c01d2e65d88f88641195b6e3f00",
"sha256": "9aaea30c8a6a0555115650c7ae8211c38b64468573e651757030ed137c6e324d"
},
"downloads": -1,
"filename": "pyensemblrest-0.3.2-py3-none-any.whl",
"has_sig": false,
"md5_digest": "c54b5c01d2e65d88f88641195b6e3f00",
"packagetype": "bdist_wheel",
"python_version": "py3",
"requires_python": "<3.13,>=3.10",
"size": 30697,
"upload_time": "2024-12-03T09:24:04",
"upload_time_iso_8601": "2024-12-03T09:24:04.515128Z",
"url": "https://files.pythonhosted.org/packages/50/61/6508a31e3c47e5b9479f30286c0944a3eaee294a5843b5082fe831443149/pyensemblrest-0.3.2-py3-none-any.whl",
"yanked": false,
"yanked_reason": null
},
{
"comment_text": "",
"digests": {
"blake2b_256": "8c946eb60d9dfc97a8d7548271450b52d1923ca830e87027662df88f21c75191",
"md5": "8665df721950b897aa5c06ef16700fd2",
"sha256": "233988ae2de6d56cb7d4200fbfd3c19917f02b77b00a3e1060b244423dfc5571"
},
"downloads": -1,
"filename": "pyensemblrest-0.3.2.tar.gz",
"has_sig": false,
"md5_digest": "8665df721950b897aa5c06ef16700fd2",
"packagetype": "sdist",
"python_version": "source",
"requires_python": "<3.13,>=3.10",
"size": 34610,
"upload_time": "2024-12-03T09:24:06",
"upload_time_iso_8601": "2024-12-03T09:24:06.363998Z",
"url": "https://files.pythonhosted.org/packages/8c/94/6eb60d9dfc97a8d7548271450b52d1923ca830e87027662df88f21c75191/pyensemblrest-0.3.2.tar.gz",
"yanked": false,
"yanked_reason": null
}
],
"upload_time": "2024-12-03 09:24:06",
"github": true,
"gitlab": false,
"bitbucket": false,
"codeberg": false,
"github_user": "gawbul",
"github_project": "pyEnsemblRest",
"travis_ci": false,
"coveralls": false,
"github_actions": true,
"lcname": "pyensemblrest"
}