Name | pubmed-id |
Version | 1.0 |
home_page | None |
Summary | Simple interface to query or scrape IDs from PubMed. |
upload_time | 2025-01-26 23:09:53 |
maintainer | None |
docs_url | None |
author | None |
requires_python | >=3.7 |
license | None |
keywords | pubmed |
VCS | |
bugtrack_url | |
requirements | No requirements were recorded. |
Travis-CI | No Travis. |
coveralls test coverage | No coveralls. |
# pubmed-id
Simple interface to query or scrape IDs from [PubMed](https://pubmed.ncbi.nlm.nih.gov/), a database maintained by the US National Library of Medicine.
> This tool was originally developed to obtain temporal data for the well-known [PubMed graph dataset](https://github.com/nelsonaloysio/pubmed-temporal).
## Usage
### Command line interface
A CLI is included for querying PubMed, either through its API or by web scraping.
```bash
usage: pubmed-id [-h] [-o OUTPUT_FILE] [-m METHOD] [-w WORKERS] [-c SIZE]
                 [--email ADDRESS] [--tool NAME] [--quiet]
                 ID [ID ...]

positional arguments:
  ID                    IDs to query (separated by whitespaces).

options:
  -h, --help            show this help message and exit
  -o OUTPUT_FILE, --output-file OUTPUT_FILE
                        File to write results to (default: 'PubMedAPI.json').
  -m METHOD, --method METHOD
                        Method to obtain data with (default: 'api'). Choices:
                        ('api', 'citedin', 'refs', 'scrape').
  -w WORKERS, --max-workers WORKERS
                        Number of processes to use (optional).
  -c SIZE, --chunksize SIZE
                        Number of objects sent to each worker (optional).
  --email ADDRESS       Your e-mail address (required to query API only).
  --tool NAME           Tool name (optional, used to query API only).
  --quiet               Does not print results (limited to a single item only
                        by default).
```
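To illustrate what `--chunksize` controls, here is a minimal sketch of splitting a list of IDs into fixed-size batches, one batch per worker task. The `chunked` helper is hypothetical and written only for illustration; it is not taken from the package, whose internal batching may differ.

```python
from itertools import islice
from typing import Iterable, Iterator, List


def chunked(ids: Iterable[str], size: int) -> Iterator[List[str]]:
    """Yield successive lists of at most `size` IDs (hypothetical helper)."""
    if size < 1:
        raise ValueError("size must be at least 1")
    iterator = iter(ids)
    while True:
        # Take the next `size` items; an empty list means the input is exhausted.
        chunk = list(islice(iterator, size))
        if not chunk:
            return
        yield chunk


# Three IDs with a chunk size of 2 give one full batch and one partial batch.
batches = list(chunked(["6798965", "7430347", "15356126"], size=2))
```

Larger chunks mean fewer hand-offs between processes, while smaller chunks spread uneven workloads more evenly.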
### Importing as a class
Quick example of how to create an instance for querying the API:
```python
>>> from pubmed_id import PubMedAPI
>>>
>>> api = PubMedAPI(email="myemail@domain.com", tool="MyToolName")
```
For more information on the API, please check the [official documentation](https://www.ncbi.nlm.nih.gov/home/develop/api/).
#### Obtain data from API
By default, the returned data is a dictionary with the PMCID, the PMID, and the DOI of a paper:
```python
>>> api(6798965)
{
"pmcid": "PMC1163140",
"pmid": "6798965",
"doi": "10.1042/bj1970405"
}
```
When calling the class instance directly, the input may be an integer (PMID), a string (PMID or PMCID), or a list.
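If IDs arrive in mixed forms, it can help to bring them into one shape before querying. The sketch below uses a hypothetical `normalize_ids` helper (not part of the package) and assumes that a list holds integers and strings; the README does not say how the class handles this internally.

```python
from typing import List, Sequence, Union

IdLike = Union[int, str]


def normalize_ids(ids: Union[IdLike, Sequence[IdLike]]) -> List[str]:
    """Return the input as a list of ID strings (hypothetical helper)."""
    # A single integer or string is wrapped so that every input becomes a list.
    if isinstance(ids, (int, str)):
        ids = [ids]
    return [str(item).strip() for item in ids]


single = normalize_ids(6798965)
mixed = normalize_ids([6798965, " PMC1163140 "])
```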
**Note:** NCBI recommends that users post no more than three URL requests per second and limit large jobs to either weekends or between 9:00 PM and 5:00 AM Eastern time during weekdays. See more: [Usage Guidelines](https://www.ncbi.nlm.nih.gov/books/NBK25497/#chapter2.Usage_Guidelines_and_Requiremen).
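One way to stay under the three-requests-per-second recommendation is to space calls out on the client side. The following is a minimal sketch under stated assumptions: the `Throttle` class is hypothetical, and the README does not say whether the package already limits its own request rate.

```python
import time
from typing import Callable


class Throttle:
    """Space calls at least `1 / rate` seconds apart (hypothetical helper)."""

    def __init__(
        self,
        rate: float = 3.0,
        clock: Callable[[], float] = time.monotonic,
        sleep: Callable[[float], None] = time.sleep,
    ) -> None:
        self.interval = 1.0 / rate
        self._clock = clock
        self._sleep = sleep
        self._next_allowed = 0.0

    def wait(self) -> float:
        """Block until the next call is allowed; return the seconds slept."""
        now = self._clock()
        delay = max(0.0, self._next_allowed - now)
        if delay:
            self._sleep(delay)
        # Schedule the following slot relative to when this call may proceed.
        self._next_allowed = max(now, self._next_allowed) + self.interval
        return delay


throttle = Throttle(rate=3.0)
# Intended use (assumes `api` is a PubMedAPI instance and `pmids` a list of IDs):
# for pmid in pmids:
#     throttle.wait()
#     result = api(pmid)
```

The clock and sleep functions are injectable so that the spacing logic can be checked without real delays.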
#### Scrape data from website
Scraping by PMID or PMCID instead returns more data (strings shortened for brevity):
```python
>>> api(6798965, method="scrape")
{
"6798965": {
"date": "1981 Aug 1",
"title": "Characterization of N-glycosylated...",
"abstract": "The N epsilon-glycosylation of...",
"author_names": "A Le Pape;J P Muh;A J Bailey",
"author_ids": "6798965;6798965;6798965",
"doi": "PMC1163140",
"pmid": "6798965"
}
}
```
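The scraped record stores authors as a single semicolon-separated string. A minimal sketch for turning such a field into a list follows; the `split_field` helper is hypothetical, and the sample record copies only two fields from the output above.

```python
from typing import List

# Sample record with two fields taken from the scraped output shown above.
record = {
    "author_names": "A Le Pape;J P Muh;A J Bailey",
    "pmid": "6798965",
}


def split_field(value: str, sep: str = ";") -> List[str]:
    """Split a delimited field into trimmed, non-empty parts (hypothetical helper)."""
    return [part.strip() for part in value.split(sep) if part.strip()]


authors = split_field(record["author_names"])
```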
**Note:** Some papers are unavailable from the API but still return data when scraped, e.g., [PMID 15356126](https://pubmed.ncbi.nlm.nih.gov/15356126/).
#### Get paper references
Returns the list of references cited by a paper:
```python
>>> api(6798965, method="refs")
{
"6798965": [
"7430347",
"..."
]
}
```
#### Get citations for a paper
Returns the list of papers that cite a paper:
```python
>>> api(6798965, method="citedin")
{
"15356126": [
"32868408",
"..."
]
}
```
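Results from the `refs` and `citedin` methods can be flattened into directed citation edges, for example when building a graph. This is a sketch under stated assumptions: both helpers are hypothetical, a `refs` result is assumed to be keyed by the citing paper, a `citedin` result is assumed to be keyed by the cited paper, and the sample dictionaries reuse IDs from the outputs above with the `"..."` placeholders left out.

```python
from typing import Dict, List, Tuple

Edge = Tuple[str, str]  # (citing PMID, cited PMID)


def refs_to_edges(refs: Dict[str, List[str]]) -> List[Edge]:
    """Build edges from a result assumed to map a paper to the papers it cites."""
    return [(citing, cited) for citing, targets in refs.items() for cited in targets]


def citedin_to_edges(citedin: Dict[str, List[str]]) -> List[Edge]:
    """Build edges from a result assumed to map a paper to the papers citing it."""
    return [(citing, cited) for cited, sources in citedin.items() for citing in sources]


refs_result = {"6798965": ["7430347"]}
citedin_result = {"15356126": ["32868408"]}

edges = refs_to_edges(refs_result) + citedin_to_edges(citedin_result)
```

Keeping every edge in the same (citing, cited) order lets both result types be merged into one edge list.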
___
### References
* [PubMed API](https://www.ncbi.nlm.nih.gov/home/develop/api/)
Raw data
{
"_id": null,
"home_page": null,
"name": "pubmed-id",
"maintainer": null,
"docs_url": null,
"requires_python": ">=3.7",
"maintainer_email": null,
"keywords": "PubMed",
"author": null,
"author_email": "Nelson Aloysio Reis de Almeida Passos <nelson.reis@phd.unipi.it>",
"download_url": "https://files.pythonhosted.org/packages/a8/06/797269ea9863749a1b681fbe47d1167a38914d878168ebeface5612a1344/pubmed_id-1.0.tar.gz",
"platform": null,
"description": "# pubmed-id\n\nSimple interface to query or scrape IDs from [PubMed](https://pubmed.ncbi.nlm.nih.gov/) (The US National Library of Medicine).\n\n> This tool was originally developed to obtain temporal data for the well-known [PubMed graph dataset](https://github.com/nelsonaloysio/pubmed-temporal).\n\n## Usage\n\n### Command line interface\n\nA CLI is included that allows querying the PubMed via their API or by web scraping.\n\n```bash\nusage: pubmed-id [-h] [-o OUTPUT_FILE] [-m METHOD] [-w WORKERS] [-c SIZE]\n [--email ADDRESS] [--tool NAME] [--quiet]\n ID [ID ...]\n\npositional arguments:\n ID IDs to query (separated by whitespaces).\n\noptions:\n -h, --help show this help message and exit\n -o OUTPUT_FILE, --output-file OUTPUT_FILE\n File to write results to (default: 'PubMedAPI.json').\n -m METHOD, --method METHOD\n Method to obtain data with (default: 'api'). Choices:\n ('api', 'citedin', 'refs', 'scrape').\n -w WORKERS, --max-workers WORKERS\n Number of processes to use (optional).\n -c SIZE, --chunksize SIZE\n Number of objects sent to each worker (optional).\n --email ADDRESS Your e-mail address (required to query API only).\n --tool NAME Tool name (optional, used to query API only).\n --quiet Does not print results (limited to a single item only\n by default).\n```\n\n### Importing as a class\n\nQuick example on how to obtain data from the API:\n\n```python\n>>> from pubmed_id import PubMedAPI\n>>>\n>>> api = PubMedAPI(email=\"myemail@domain.com\", tool=\"MyToolName\")\n```\n\nFor more information on the API, please check the [official documentation](https://www.ncbi.nlm.nih.gov/home/develop/api/).\n\n\n#### Obtain data from API\n\nBy default, the returned data is a dictionary with the PMCID, the PMID, and the DOI of a paper:\n\n```python\n>>> api(6798965)\n\n{\n \"pmcid\": \"PMC1163140\",\n \"pmid\": \"6798965\",\n \"doi\": \"10.1042/bj1970405\"\n}\n```\n\nEither an integer (PMID), a string (PMID or PMCID), or a list is accepted as input 
when calling the class directly.\n\n**Note:** NCBI recommends that users post no more than three URL requests per second and limit large jobs to either weekends or between 9:00 PM and 5:00 AM Eastern time during weekdays. See more: [Usage Guidelines](https://www.ncbi.nlm.nih.gov/books/NBK25497/#chapter2.Usage_Guidelines_and_Requiremen).\n\n#### Scrape data from website\n\nScraping the PMID or PMICD instead returns more data (strings shortened for brevity):\n\n```python\n>>> api(6798965, method=\"scrape\")\n\n{\n \"6798965\": {\n \"date\": \"1981 Aug 1\",\n \"title\": \"Characterization of N-glycosylated...\",\n \"abstract\": \"The N epsilon-glycosylation of...\",\n \"author_names\": \"A Le Pape;J P Muh;A J Bailey\",\n \"author_ids\": \"6798965;6798965;6798965\",\n \"doi\": \"PMC1163140\",\n \"pmid\": \"6798965\"\n }\n}\n```\n\n**Note**: some papers are unavailable from the API, but still return data when scraped, e.g., [PMID 15356126](https://pubmed.ncbi.nlm.nih.gov/15356126/).\n\n#### Get paper references\n\nReturns list of references from a paper:\n\n```python\n>>> api(6798965, method=\"refs\")\n\n{\n \"6798965\": [\n \"7430347\",\n \"...\"\n ]\n}\n```\n\n#### Get citations for a paper\n\nReturns list of citations to a paper:\n\n```python\n>>> api(6798965, method=\"citedin\")\n\n{\n \"15356126\": [\n \"32868408\",\n \"...\"\n ]\n}\n```\n\n___\n\n### References\n\n* [PubMed API](https://www.ncbi.nlm.nih.gov/home/develop/api/)\n",
"bugtrack_url": null,
"license": null,
"summary": "Simple interface to query or scrape IDs from PubMed.",
"version": "1.0",
"project_urls": {
"Changelog": "https://github.com/nelsonaloysio/pubmed-id/blob/main/CHANGELOG.md",
"Homepage": "https://pypi.org/p/pubmed-id/",
"Issues": "https://github.com/nelsonaloysio/pubmed-id/issues",
"Repository": "https://github.com/nelsonaloysio/pubmed-id"
},
"split_keywords": [
"pubmed"
],
"urls": [
{
"comment_text": null,
"digests": {
"blake2b_256": "d18247f82bad8ade91268eb72693ff85fd848b5532ae4b1520a352b2dc82e4d8",
"md5": "13ee96c002b534af37955a4580d3a0f0",
"sha256": "4db1ddcdac03b29a89bcc676a4d1dbefa28f54b69ad44fa438e8145062bf52db"
},
"downloads": -1,
"filename": "pubmed_id-1.0-py3-none-any.whl",
"has_sig": false,
"md5_digest": "13ee96c002b534af37955a4580d3a0f0",
"packagetype": "bdist_wheel",
"python_version": "py3",
"requires_python": ">=3.7",
"size": 7584,
"upload_time": "2025-01-26T23:09:52",
"upload_time_iso_8601": "2025-01-26T23:09:52.416247Z",
"url": "https://files.pythonhosted.org/packages/d1/82/47f82bad8ade91268eb72693ff85fd848b5532ae4b1520a352b2dc82e4d8/pubmed_id-1.0-py3-none-any.whl",
"yanked": false,
"yanked_reason": null
},
{
"comment_text": null,
"digests": {
"blake2b_256": "a806797269ea9863749a1b681fbe47d1167a38914d878168ebeface5612a1344",
"md5": "8729d4d668c9690aaafe492eba3dd54b",
"sha256": "2ffc9df66a5429484420c9327a6816a47b9fe48639c3be200128625b394cd2a5"
},
"downloads": -1,
"filename": "pubmed_id-1.0.tar.gz",
"has_sig": false,
"md5_digest": "8729d4d668c9690aaafe492eba3dd54b",
"packagetype": "sdist",
"python_version": "source",
"requires_python": ">=3.7",
"size": 8348,
"upload_time": "2025-01-26T23:09:53",
"upload_time_iso_8601": "2025-01-26T23:09:53.761784Z",
"url": "https://files.pythonhosted.org/packages/a8/06/797269ea9863749a1b681fbe47d1167a38914d878168ebeface5612a1344/pubmed_id-1.0.tar.gz",
"yanked": false,
"yanked_reason": null
}
],
"upload_time": "2025-01-26 23:09:53",
"github": true,
"gitlab": false,
"bitbucket": false,
"codeberg": false,
"github_user": "nelsonaloysio",
"github_project": "pubmed-id",
"travis_ci": false,
"coveralls": false,
"github_actions": false,
"lcname": "pubmed-id"
}