# pubmed-id

A simple interface to query or scrape IDs from [PubMed](https://pubmed.ncbi.nlm.nih.gov/), the literature database of the US National Library of Medicine.

> This tool was originally developed to obtain temporal data for the well-known [PubMed graph dataset](https://github.com/nelsonaloysio/pubmed-temporal).

## Usage

### Command line interface

A CLI is included for querying PubMed through its API or by web scraping.

```bash
usage: pubmed-id [-h] [-o OUTPUT_FILE] [-m METHOD] [-w WORKERS] [-c SIZE]
                 [--email ADDRESS] [--tool NAME] [--quiet]
                 ID [ID ...]

positional arguments:
  ID                    IDs to query (separated by whitespaces).

options:
  -h, --help            show this help message and exit
  -o OUTPUT_FILE, --output-file OUTPUT_FILE
                        File to write results to (default: 'PubMedAPI.json').
  -m METHOD, --method METHOD
                        Method to obtain data with (default: 'api'). Choices:
                        ('api', 'citedin', 'refs', 'scrape').
  -w WORKERS, --max-workers WORKERS
                        Number of processes to use (optional).
  -c SIZE, --chunksize SIZE
                        Number of objects sent to each worker (optional).
  --email ADDRESS       Your e-mail address (required to query API only).
  --tool NAME           Tool name (optional, used to query API only).
  --quiet               Does not print results (limited to a single item only
                        by default).
```
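When the CLI is driven from a script, the argument list can be assembled programmatically. The sketch below only uses flags listed in the help text above; `build_command` is a hypothetical helper, not part of pubmed-id, and the file name `refs.json` is an arbitrary example.

```python
import shlex


def build_command(ids, method="api", email=None, output_file=None):
    """Assemble a pubmed-id argument list (hypothetical helper, not part of pubmed-id)."""
    argv = ["pubmed-id", "--method", method]
    if email:
        argv += ["--email", email]
    if output_file:
        argv += ["--output-file", output_file]
    # IDs are positional and go last, separated by whitespace on the command line.
    argv += [str(i) for i in ids]
    return argv


argv = build_command([6798965, 15356126], method="refs", output_file="refs.json")
print(" ".join(shlex.quote(arg) for arg in argv))
# pubmed-id --method refs --output-file refs.json 6798965 15356126
```

The resulting list can be passed to `subprocess.run`, assuming the package is installed and `pubmed-id` is on the `PATH`.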

### Importing as a class

A quick example of setting up the class to obtain data from the API:

```python
>>> from pubmed_id import PubMedAPI
>>>
>>> api = PubMedAPI(email="myemail@domain.com", tool="MyToolName")
```

For more information on the API, please check the [official documentation](https://www.ncbi.nlm.nih.gov/home/develop/api/).


#### Obtain data from API

By default, the returned data is a dictionary with the PMCID, the PMID, and the DOI of a paper:

```python
>>> api(6798965)

{
  "pmcid": "PMC1163140",
  "pmid": "6798965",
  "doi": "10.1042/bj1970405"
}
```

When called directly, the class accepts an integer (PMID), a string (PMID or PMCID), or a list.
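Because several input shapes are accepted, it can help to normalize IDs before batching them. The sketch below is a hypothetical helper (`normalize_ids` is not part of pubmed-id) that turns an integer, a string, or a list into a list of ID strings. It assumes PMCIDs carry the `PMC` prefix, as in the output above.

```python
from typing import Iterable, List, Union

IdLike = Union[int, str]


def normalize_ids(ids: Union[IdLike, Iterable[IdLike]]) -> List[str]:
    """Return a flat list of ID strings (hypothetical helper, not part of pubmed-id)."""
    # A single int or str is wrapped in a list; other iterables are walked as-is.
    if isinstance(ids, (int, str)):
        ids = [ids]
    normalized = []
    for item in ids:
        text = str(item).strip().upper()
        # Accept plain digits (PMID) or "PMC" followed by digits (PMCID).
        digits = text[3:] if text.startswith("PMC") else text
        if not digits.isdigit():
            raise ValueError(f"not a PMID or PMCID: {item!r}")
        normalized.append(text)
    return normalized


print(normalize_ids(6798965))                    # ['6798965']
print(normalize_ids(["6798965", "pmc1163140"]))  # ['6798965', 'PMC1163140']
```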

**Note:** NCBI recommends that users post no more than three URL requests per second and limit large jobs to either weekends or between 9:00 PM and 5:00 AM Eastern time during weekdays. See more: [Usage Guidelines](https://www.ncbi.nlm.nih.gov/books/NBK25497/#chapter2.Usage_Guidelines_and_Requiremen).
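One way to stay under that limit is to space calls out on the client side. The following is a minimal sketch, not something pubmed-id is known to provide: `Throttle` is a hypothetical wrapper that enforces a minimum interval between calls to any function, here set to three calls per second.

```python
import time


class Throttle:
    """Hypothetical wrapper: keep calls at least `1 / rate` seconds apart."""

    def __init__(self, func, rate=3.0, clock=time.monotonic, sleep=time.sleep):
        self.func = func
        self.interval = 1.0 / rate
        self.clock = clock
        self.sleep = sleep
        self._last = None  # time of the previous call, if any

    def __call__(self, *args, **kwargs):
        if self._last is not None:
            # Sleep only for the part of the interval that has not elapsed yet.
            wait = self.interval - (self.clock() - self._last)
            if wait > 0:
                self.sleep(wait)
        self._last = self.clock()
        return self.func(*args, **kwargs)


# Works with any callable; a PubMedAPI instance would be wrapped the same way.
slow_str = Throttle(str, rate=3.0)
results = [slow_str(pmid) for pmid in (6798965, 15356126)]
```

This throttle is per process. With several workers, each process keeps its own timer, so the per-process rate would need to be lowered accordingly.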

#### Scrape data from website

Scraping by PMID or PMCID instead returns more data (strings shortened for brevity):

```python
>>> api(6798965, method="scrape")

{
  "6798965": {
    "date": "1981 Aug 1",
    "title": "Characterization of N-glycosylated...",
    "abstract": "The N epsilon-glycosylation of...",
    "author_names": "A Le Pape;J P Muh;A J Bailey",
    "author_ids": "6798965;6798965;6798965",
    "doi": "PMC1163140",
    "pmid": "6798965"
  }
}
```
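The scraped record stores authors as a single semicolon-separated string. Assuming that format holds for other papers too, a small hypothetical helper (`split_authors`, not part of pubmed-id) can turn it into a list:

```python
def split_authors(record: dict) -> list:
    """Split the semicolon-separated `author_names` field into a list of names."""
    names = record.get("author_names") or ""
    return [name.strip() for name in names.split(";") if name.strip()]


# Record shaped like the scraped output above (other fields omitted).
record = {"author_names": "A Le Pape;J P Muh;A J Bailey", "pmid": "6798965"}
print(split_authors(record))  # ['A Le Pape', 'J P Muh', 'A J Bailey']
```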

**Note:** Some papers are unavailable from the API but still return data when scraped, e.g., [PMID 15356126](https://pubmed.ncbi.nlm.nih.gov/15356126/).
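A fallback along these lines can be sketched as follows. How pubmed-id signals a missing API record is not stated here, so the sketch assumes either an exception or an empty result; `fetch_with_fallback` and `StubClient` are hypothetical names used for illustration only.

```python
def fetch_with_fallback(client, paper_id):
    """Try the default API lookup, then fall back to scraping.

    Assumption: a missing record shows up as an exception or an empty result.
    """
    try:
        result = client(paper_id)
    except Exception:
        result = None
    if result:
        return result
    return client(paper_id, method="scrape")


class StubClient:
    """Stand-in for a PubMedAPI instance so the sketch runs offline."""

    def __call__(self, paper_id, method="api"):
        if method == "api":
            return {}  # pretend the API has no record for this paper
        return {str(paper_id): {"pmid": str(paper_id)}}


print(fetch_with_fallback(StubClient(), 15356126))
# {'15356126': {'pmid': '15356126'}}
```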

#### Get paper references

Returns the list of references cited by a paper:

```python
>>> api(6798965, method="refs")

{
  "6798965": [
    "7430347",
    "..."
  ]
}
```

#### Get citations for a paper

Returns the list of papers that cite a given paper:

```python
>>> api(15356126, method="citedin")

{
  "15356126": [
    "32868408",
    "..."
  ]
}
```
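Both `refs` and `citedin` return a mapping from a PMID to a list of PMIDs, which converts naturally into directed citation edges. The sketch below assumes the output shapes shown above and uses a hypothetical helper, `to_edges` (not part of pubmed-id); every edge is ordered as (citing paper, cited paper).

```python
def to_edges(result: dict, method: str) -> list:
    """Convert a `refs` or `citedin` result into (citing, cited) pairs."""
    edges = []
    for pmid, others in result.items():
        for other in others:
            if method == "refs":
                edges.append((pmid, other))   # pmid cites each of its references
            elif method == "citedin":
                edges.append((other, pmid))   # each listed paper cites pmid
            else:
                raise ValueError(f"unsupported method: {method!r}")
    return edges


print(to_edges({"6798965": ["7430347"]}, "refs"))       # [('6798965', '7430347')]
print(to_edges({"15356126": ["32868408"]}, "citedin"))  # [('32868408', '15356126')]
```

Keeping one direction for both methods means the two edge lists can be merged into a single graph without further bookkeeping.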

___

### References

* [PubMed API](https://www.ncbi.nlm.nih.gov/home/develop/api/)

            

### Package metadata

* Version: 1.0 (uploaded 2025-01-26 23:09:53)
* Requires Python: >=3.7
* Keywords: pubmed
* Author: Nelson Aloysio Reis de Almeida Passos <nelson.reis@phd.unipi.it>
* [Repository](https://github.com/nelsonaloysio/pubmed-id)
* [Issues](https://github.com/nelsonaloysio/pubmed-id/issues)
* [Changelog](https://github.com/nelsonaloysio/pubmed-id/blob/main/CHANGELOG.md)
* [Homepage](https://pypi.org/p/pubmed-id/)