[PyPI](https://pypi.org/project/rcsbsearchapi/)
[Build Status](https://dev.azure.com/rcsb/RCSB%20PDB%20Python%20Projects/_build/latest?definitionId=39&branchName=master)
[Documentation](https://rcsbsearchapi.readthedocs.io/en/latest/?badge=latest)
<a href="https://colab.research.google.com/github/rcsb/py-rcsbsearchapi/blob/master/notebooks/quickstart.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>
# rcsbsearchapi
Python interface for the RCSB PDB Search API.
This package requires Python 3.7 or later.
# Quickstart
## Installation
Get it from PyPI:

    pip install rcsbsearchapi

Or download it from [GitHub](https://github.com/rcsb/py-rcsbsearchapi).
## Getting Started
Full documentation is available on [Read the Docs](https://rcsbsearchapi.readthedocs.io/en/latest/index.html).
### Basic Query Construction
#### Full-text search
To perform a "full-text" search for structures associated with the term "Hemoglobin", you can create a `TextQuery`:
```python
from rcsbsearchapi import TextQuery
# Search for structures associated with the phrase "Hemoglobin"
query = TextQuery(value="Hemoglobin")
# Execute the query by running it as a function
results = query()
# Results are returned as an iterator of result identifiers.
for rid in results:
    print(rid)
```
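Because results come back as an iterator, you do not need to collect every identifier just to look at the first few. The sketch below shows the pattern with `itertools.islice`. Note that `fake_results` is a hypothetical stand-in generator with placeholder identifiers (it is not part of `rcsbsearchapi`), used only so the snippet runs without network access.

```python
from itertools import islice


def fake_results():
    """Hypothetical stand-in for the iterator returned by query()."""
    # Placeholder identifiers, not real search output.
    for rid in ("4HHB", "1A3N", "2HHB", "1HHO", "3HHB"):
        yield rid


# Take only the first three identifiers without consuming the rest.
first_three = list(islice(fake_results(), 3))
print(first_three)
```

With a real query, the same pattern would presumably be `islice(query(), 3)`.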
#### Attribute search
To perform a search for specific structure or chemical attributes, you can create an `AttributeQuery`.
```python
from rcsbsearchapi import AttributeQuery
# Construct a query searching for structures from humans
query = AttributeQuery(
    attribute="rcsb_entity_source_organism.scientific_name",
    operator="exact_match",  # Other operators include "contains_phrase", "exists", and more
    value="Homo sapiens"
)
# Execute query and construct a list from results
results = list(query())
print(results)
```
Refer to the [Search Attributes](https://search.rcsb.org/structure-search-attributes.html) and [Chemical Attributes](https://search.rcsb.org/chemical-search-attributes.html) documentation for a full list of attributes and applicable operators.
Alternatively, you can construct attribute queries with comparison operators using the `rcsb_attributes` object, which also supports tab completion of attribute names:
```python
from rcsbsearchapi import rcsb_attributes as attrs
# Search for structures from humans
query = attrs.rcsb_entity_source_organism.scientific_name == "Homo sapiens"
# Run query and construct a list from results
results = list(query())
print(results)
```
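For orientation, both forms above describe a single attribute comparison. The sketch below builds the kind of JSON request body that the RCSB PDB Search API documents for such a search, using only the standard library. The helper name `attribute_node` is hypothetical, and this is not necessarily how `rcsbsearchapi` assembles requests internally; check the Search API documentation for the authoritative schema.

```python
import json


def attribute_node(attribute, operator, value):
    """Hypothetical helper: build one 'terminal' node for an attribute search."""
    return {
        "type": "terminal",
        "service": "text",  # assumed: the structure-attribute search service
        "parameters": {
            "attribute": attribute,
            "operator": operator,
            "value": value,
        },
    }


# Request body for "structures from humans", returning entry identifiers.
request_body = {
    "query": attribute_node(
        "rcsb_entity_source_organism.scientific_name",
        "exact_match",
        "Homo sapiens",
    ),
    "return_type": "entry",
}

print(json.dumps(request_body, indent=2))
```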
#### Grouping sub-queries
You can combine multiple queries using Python's bitwise operators (`&` for AND, `|` for OR).
```python
from rcsbsearchapi import rcsb_attributes as attrs
# Query for human epidermal growth factor receptor (EGFR) structures (UniProt ID P00533)
# with investigational or experimental drugs bound
q1 = attrs.rcsb_polymer_entity_container_identifiers.reference_sequence_identifiers.database_accession == "P00533"
q2 = attrs.rcsb_entity_source_organism.scientific_name == "Homo sapiens"
q3 = attrs.drugbank_info.drug_groups == "investigational"
q4 = attrs.drugbank_info.drug_groups == "experimental"
# Structures matching UniProt ID P00533 AND from humans
# AND (investigational OR experimental drug group)
query = q1 & q2 & (q3 | q4)
# Execute query and print first 10 ids
results = list(query())
print(results[:10])
```
These examples are in `operator` syntax. You can also make queries in `fluent` syntax. Learn more about both syntaxes and implementation details in [Constructing and Executing Queries](query_construction.md#constructing-and-executing-queries).
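To see why the parentheses in `q1 & q2 & (q3 | q4)` matter, the toy classes below mimic how `&` and `|` can be overloaded to build a query tree. This is an illustrative sketch only: `Node`, `Terminal`, `Group`, and `describe` are hypothetical names and do not reflect the actual `rcsbsearchapi` implementation.

```python
class Node:
    """Toy query node; NOT the rcsbsearchapi implementation."""

    def __and__(self, other):
        return Group("and", [self, other])

    def __or__(self, other):
        return Group("or", [self, other])


class Terminal(Node):
    """A leaf query, identified here only by a label."""

    def __init__(self, label):
        self.label = label

    def describe(self):
        return self.label


class Group(Node):
    """An AND/OR combination of child nodes."""

    def __init__(self, op, nodes):
        self.op = op
        self.nodes = nodes

    def describe(self):
        joiner = f" {self.op.upper()} "
        return "(" + joiner.join(n.describe() for n in self.nodes) + ")"


q1, q2, q3, q4 = (Terminal(name) for name in ("q1", "q2", "q3", "q4"))

grouped = q1 & q2 & (q3 | q4)
ungrouped = q1 & q2 & q3 | q4  # & binds tighter than |

print(grouped.describe())
print(ungrouped.describe())
```

In Python, `&` has higher precedence than `|`, and comparison operators such as `==` have lower precedence than both. When writing comparisons inline, wrap each one in parentheses, for example `(a == "x") & (b == "y")`.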
### Supported Search Services
The supported search service types are listed in the table below. For more details on their usage, see [Search Service Types](query_construction.md#search-service-types).
|Search service |QueryType |
|----------------------------------|--------------------------|
|Full-text |`TextQuery()` |
|Attribute (structure or chemical) |`AttributeQuery()` |
|Sequence similarity |`SequenceQuery()` |
|Sequence motif |`SequenceMotifQuery()` |
|Structure similarity |`StructSimilarityQuery()` |
|Structure motif |`StructMotifQuery()` |
|Chemical similarity |`ChemSimilarityQuery()` |
Learn more about available search services on the [RCSB PDB Search API docs](https://search.rcsb.org/#search-services).
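When reading the Search API documentation alongside this package, it can help to know the `service` identifier that each search type uses in a raw request body. The lookup table below is a hypothetical convenience (`SERVICE_IDS` is not part of `rcsbsearchapi`), and the identifiers are as described in the RCSB PDB Search API documentation, so verify them there before relying on them.

```python
# Hypothetical lookup table (not part of rcsbsearchapi):
# search service description -> "service" identifier in Search API requests.
SERVICE_IDS = {
    "Full-text": "full_text",
    "Attribute (structure)": "text",
    "Attribute (chemical)": "text_chem",
    "Sequence similarity": "sequence",
    "Sequence motif": "seqmotif",
    "Structure similarity": "structure",
    "Structure motif": "strucmotif",
    "Chemical similarity": "chemical",
}

# Print the mapping as two aligned columns.
for name, service_id in SERVICE_IDS.items():
    print(f"{name:<24} {service_id}")
```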
## Jupyter Notebooks
A runnable Jupyter notebook is available in [notebooks/quickstart.ipynb](https://github.com/rcsb/py-rcsbsearchapi/blob/master/notebooks/quickstart.ipynb), and it can also be run online using Google Colab:
<a href="https://colab.research.google.com/github/rcsb/py-rcsbsearchapi/blob/master/notebooks/quickstart.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>
An additional Covid-19 related example is in [notebooks/covid.ipynb](https://github.com/rcsb/py-rcsbsearchapi/blob/master/notebooks/covid.ipynb):
<a href="https://colab.research.google.com/github//rcsb/py-rcsbsearchapi/blob/master/notebooks/covid.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>
## Supported Features
The following checklist shows the status of current and planned features.
- [x] Structure and chemical attribute search
    - [x] Attribute comparison operations
    - [x] Query set operations
    - [x] Attribute `contains`, `in_` (fluent only)
- [x] Option to include computed structure models (CSMs) in search
- [x] Sequence search
- [x] Sequence motif search
- [x] Structure similarity search
- [x] Structure motif search
- [x] Chemical similarity search
- [ ] Rich results using the Data API
Contributions are welcome for unchecked items!
## License
Code is licensed under the BSD 3-clause license. See [LICENSE](LICENSE) for details.
## Citing rcsbsearchapi
Please cite the rcsbsearchapi package by URL:
> https://rcsbsearchapi.readthedocs.io
You should also cite the RCSB PDB service this package utilizes:
> Yana Rose, Jose M. Duarte, Robert Lowe, Joan Segura, Chunxiao Bi, Charmi
> Bhikadiya, Li Chen, Alexander S. Rose, Sebastian Bittrich, Stephen K. Burley,
> John D. Westbrook. RCSB Protein Data Bank: Architectural Advances Towards
> Integrated Searching and Efficient Access to Macromolecular Structure Data
> from the PDB Archive, Journal of Molecular Biology, 2020.
> DOI: [10.1016/j.jmb.2020.11.003](https://doi.org/10.1016/j.jmb.2020.11.003)
## Attributions
The source code for this project was originally written by [Spencer Bliven](https://github.com/sbliven) and forked from [sbliven/rcsbsearch](https://github.com/sbliven/rcsbsearch). We would like to express our tremendous gratitude for his generous efforts in designing such a comprehensive public utility Python package for interacting with the RCSB PDB search API.
## Developers
For information about building and developing `rcsbsearchapi`, see [CONTRIBUTING.md](CONTRIBUTING.md).
Raw data
{
"_id": null,
"home_page": "https://github.com/rcsb/py-rcsbsearchapi",
"name": "rcsbsearchapi",
"maintainer": null,
"docs_url": null,
"requires_python": null,
"maintainer_email": null,
"keywords": null,
"author": "Dennis Piehl",
"author_email": "dennis.piehl@rcsb.org",
"download_url": "https://files.pythonhosted.org/packages/86/91/a18789016a05d76eacc3a3352376a0684dcf11440441d70827a4597339e6/rcsbsearchapi-2.0.0.tar.gz",
"platform": null,
"description": "[](https://pypi.org/project/rcsbsearchapi/)\n[](https://dev.azure.com/rcsb/RCSB%20PDB%20Python%20Projects/_build/latest?definitionId=39&branchName=master)\n[](https://rcsbsearchapi.readthedocs.io/en/latest/?badge=latest)\n<a href=\"https://colab.research.google.com/github/rcsb/py-rcsbsearchapi/blob/master/notebooks/quickstart.ipynb\" target=\"_parent\"><img src=\"https://colab.research.google.com/assets/colab-badge.svg\" alt=\"Open In Colab\"/></a>\n\n# rcsbsearchapi\n\nPython interface for the RCSB PDB Search API.\n\nThis package requires Python 3.7 or later.\n\n# Quickstart\n\n## Quickstart\n\n## Installation\n\nGet it from PyPI:\n\n pip install rcsbsearchapi\n\nOr, download from [GitHub](https://github.com/rcsb/py-rcsbsearchapi)\n\n## Getting Started\nFull documentation available at [readthedocs](https://rcsbsearchapi.readthedocs.io/en/latest/index.html)\n\n### Basic Query Construction\n\n#### Full-text search\nTo perform a \"full-text\" search for structures associated with the term \"Hemoglobin\", you can create a `TextQuery`:\n\n```python\nfrom rcsbsearchapi import TextQuery\n\n# Search for structures associated with the phrase \"Hemoglobin\"\nquery = TextQuery(value=\"Hemoglobin\")\n\n# Execute the query by running it as a function\nresults = query()\n\n# Results are returned as an iterator of result identifiers.\nfor rid in results:\n print(rid)\n```\n\n#### Attribute search\nTo perform a search for specific structure or chemical attributes, you can create an `AttributeQuery`.\n\n```python\nfrom rcsbsearchapi import AttributeQuery\n\n# Construct a query searching for structures from humans\nquery = AttributeQuery(\n attribute=\"rcsb_entity_source_organism.scientific_name\",\n operator=\"exact_match\", # Other operators include \"contains_phrase\", \"exists\", and more\n value=\"Homo sapiens\"\n)\n\n# Execute query and construct a list from results\nresults = list(query())\nprint(results)\n```\n\nRefer to the [Search 
Attributes](https://search.rcsb.org/structure-search-attributes.html) and [Chemical Attributes](https://search.rcsb.org/chemical-search-attributes.html) documentation for a full list of attributes and applicable operators.\n\nAlternatively, you can also construct attribute queries with comparative operators using the `rcsb_attributes` object (which also allows for names to be tab-completed):\n\n```python\nfrom rcsbsearchapi import rcsb_attributes as attrs\n\n# Search for structures from humans\nquery = attrs.rcsb_entity_source_organism.scientific_name == \"Homo sapiens\"\n\n# Run query and construct a list from results\nresults = list(query())\nprint(results)\n```\n\n#### Grouping sub-queries\n\nYou can combine multiple queries using Python bitwise operators. \n\n```python\nfrom rcsbsearchapi import rcsb_attributes as attrs\n\n# Query for human epidermal growth factor receptor (EGFR) structures (UniProt ID P00533)\n# with investigational or experimental drugs bound\nq1 = attrs.rcsb_polymer_entity_container_identifiers.reference_sequence_identifiers.database_accession == \"P00533\"\nq2 = attrs.rcsb_entity_source_organism.scientific_name == \"Homo sapiens\"\nq3 = attrs.drugbank_info.drug_groups == \"investigational\"\nq4 = attrs.drugbank_info.drug_groups == \"experimental\"\n\n# Structures matching UniProt ID P00533 AND from humans\n# AND (investigational OR experimental drug group)\nquery = q1 & q2 & (q3 | q4)\n\n# Execute query and print first 10 ids\nresults = list(query())\nprint(results[:10])\n```\n\nThese examples are in `operator` syntax. You can also make queries in `fluent` syntax. Learn more about both syntaxes and implementation details in [Constructing and Executing Queries](query_construction.md#constructing-and-executing-queries).\n\n### Supported Search Services\nThe list of supported search service types are listed in the table below. 
For more details on their usage, see [Search Service Types](query_construction.md#search-service-types).\n\n|Search service |QueryType |\n|----------------------------------|--------------------------|\n|Full-text |`TextQuery()` |\n|Attribute (structure or chemical) |`AttributeQuery()` |\n|Sequence similarity |`SequenceQuery()` |\n|Sequence motif |`SequenceMotifQuery()` |\n|Structure similarity |`StructSimilarityQuery()` |\n|Structure motif |`StructMotifQuery()` |\n|Chemical similarity |`ChemSimilarityQuery()` |\n\nLearn more about available search services on the [RCSB PDB Search API docs](https://search.rcsb.org/#search-services).\n\n## Jupyter Notebooks\nA runnable jupyter notebook is available in [notebooks/quickstart.ipynb](https://github.com/rcsb/py-rcsbsearchapi/blob/master/notebooks/quickstart.ipynb), or can be run online using Google Colab:\n<a href=\"https://colab.research.google.com/github/rcsb/py-rcsbsearchapi/blob/master/notebooks/quickstart.ipynb\" target=\"_parent\"><img src=\"https://colab.research.google.com/assets/colab-badge.svg\" alt=\"Open In Colab\"/></a>\n\nAn additional Covid-19 related example is in [notebooks/covid.ipynb](https://github.com/rcsb/py-rcsbsearchapi/blob/master/notebooks/covid.ipynb):\n<a href=\"https://colab.research.google.com/github//rcsb/py-rcsbsearchapi/blob/master/notebooks/covid.ipynb\" target=\"_parent\"><img src=\"https://colab.research.google.com/assets/colab-badge.svg\" alt=\"Open In Colab\"/></a>\n\n\n## Supported Features\n\nThe following table lists the status of current and planned features.\n\n- [x] Structure and chemical attribute search\n - [x] Attribute Comparison operations\n - [x] Query set operations\n - [x] Attribute `contains`, `in_` (fluent only)\n- [x] Option to include computed structure models (CSMs) in search\n- [x] Sequence search\n- [x] Sequence motif search\n- [x] Structure similarity search\n- [X] Structure motif search\n- [X] Chemical similarity search\n- [ ] Rich results using the Data 
API\n\nContributions are welcome for unchecked items!\n\n## License\n\nCode is licensed under the BSD 3-clause license. See [LICENSE](LICENSE) for details.\n\n## Citing rcsbsearchapi\n\nPlease cite the rcsbsearchapi package by URL:\n\n> https://rcsbsearchapi.readthedocs.io\n\nYou should also cite the RCSB PDB service this package utilizes:\n\n> Yana Rose, Jose M. Duarte, Robert Lowe, Joan Segura, Chunxiao Bi, Charmi\n> Bhikadiya, Li Chen, Alexander S. Rose, Sebastian Bittrich, Stephen K. Burley,\n> John D. Westbrook. RCSB Protein Data Bank: Architectural Advances Towards\n> Integrated Searching and Efficient Access to Macromolecular Structure Data\n> from the PDB Archive, Journal of Molecular Biology, 2020.\n> DOI: [10.1016/j.jmb.2020.11.003](https://doi.org/10.1016/j.jmb.2020.11.003)\n\n## Attributions\n\nThe source code for this project was originally written by [Spencer Bliven](https://github.com/sbliven) and forked from [sbliven/rcsbsearch](https://github.com/sbliven/rcsbsearch). We would like to express our tremendous gratitude for his generous efforts in designing such a comprehensive public utility Python package for interacting with the RCSB PDB search API.\n\n## Developers\n\nFor information about building and developing `rcsbsearchapi`, see\n[CONTRIBUTING.md](CONTRIBUTING.md)\n",
"bugtrack_url": null,
"license": "BSD 3-Clause",
"summary": "Python package interface for the RCSB PDB search API service",
"version": "2.0.0",
"project_urls": {
"Homepage": "https://github.com/rcsb/py-rcsbsearchapi"
},
"split_keywords": [],
"urls": [
{
"comment_text": "",
"digests": {
"blake2b_256": "8691a18789016a05d76eacc3a3352376a0684dcf11440441d70827a4597339e6",
"md5": "c099fa24a25e45673c3b03755db9d563",
"sha256": "de5b46d2f5b75539860ac65bd9c47ad1b834feb743048acae5a8296a073edcfe"
},
"downloads": -1,
"filename": "rcsbsearchapi-2.0.0.tar.gz",
"has_sig": false,
"md5_digest": "c099fa24a25e45673c3b03755db9d563",
"packagetype": "sdist",
"python_version": "source",
"requires_python": null,
"size": 181655,
"upload_time": "2024-10-04T20:05:18",
"upload_time_iso_8601": "2024-10-04T20:05:18.824125Z",
"url": "https://files.pythonhosted.org/packages/86/91/a18789016a05d76eacc3a3352376a0684dcf11440441d70827a4597339e6/rcsbsearchapi-2.0.0.tar.gz",
"yanked": false,
"yanked_reason": null
}
],
"upload_time": "2024-10-04 20:05:18",
"github": true,
"gitlab": false,
"bitbucket": false,
"codeberg": false,
"github_user": "rcsb",
"github_project": "py-rcsbsearchapi",
"travis_ci": false,
"coveralls": false,
"github_actions": false,
"requirements": [
{
"name": "requests",
"specs": [
[
">=",
"2.0.0"
]
]
},
{
"name": "tqdm",
"specs": []
}
],
"tox": true,
"lcname": "rcsbsearchapi"
}