# unpywall - Interfacing the Unpaywall API with Python

| Field | Value |
| --- | --- |
| Name | unpywall |
| Version | 0.2.3 |
| Home page | https://github.com/unpywall/unpywall |
| Summary | Interfacing the Unpaywall Database with Python |
| Upload time | 2024-02-19 10:21:11 |
| Author | Nick Haupka, bganglia |
| License | MIT |
| Keywords | unpaywall, open access, full text |
| Requirements | No requirements were recorded. |

[![Build Status](https://circleci.com/gh/unpywall/unpywall.svg?style=shield)](https://app.circleci.com/pipelines/github/unpywall/unpywall)
[![codecov.io](https://codecov.io/gh/unpywall/unpywall/branch/master/graph/badge.svg)](https://codecov.io/gh/unpywall/unpywall?branch=master)
[![PyPI - Downloads](https://img.shields.io/pypi/dm/unpywall)](https://pypi.org/project/unpywall/)
[![License](https://img.shields.io/github/license/unpywall/unpywall)](https://github.com/unpywall/unpywall/blob/master/LICENSE.txt)
[![DOI](https://zenodo.org/badge/DOI/10.5281/zenodo.4085414.svg)](https://doi.org/10.5281/zenodo.4085414)
[![PyPI - Version](https://img.shields.io/pypi/v/unpywall)](https://pypi.org/project/unpywall/)
[![PyPI - Python Version](https://img.shields.io/pypi/pyversions/unpywall)](https://pypi.org/project/unpywall/)
[![Documentation Status](https://readthedocs.org/projects/unpywall/badge/?version=latest)](https://unpywall.readthedocs.io/en/latest/?badge=latest)

## Introduction

unpywall is a Python client that uses the [Unpaywall REST API](https://unpaywall.org/products/api) for scholarly analysis with [pandas](https://pandas.pydata.org/). This package is influenced by [roadoi](https://github.com/ropensci/roadoi), an R client that interacts with the Unpaywall API.

You can find more about the Unpaywall service here: https://unpaywall.org/.

The documentation about the Unpaywall REST API is located here: https://unpaywall.org/products/api.


## Install

Install from [PyPI](https://pypi.org/project/unpywall/) using pip:
```bash
pip install unpywall
```

## Use

### Authentication

Authentication is required to use the Unpaywall service. unpywall offers two ways to authorize the client: you can either import `UnpywallCredentials`, which sets an environment variable for you, or you can set the environment variable yourself. Both methods require an email address.

```python
from unpywall.utils import UnpywallCredentials

UnpywallCredentials('nick.haupka@gmail.com')
```

Note that the environment variable for authentication must be called `UNPAYWALL_EMAIL`.

```bash
export UNPAYWALL_EMAIL=nick.haupka@gmail.com
```
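
The same variable can also be set from inside a Python session before any request is made. This is a minimal sketch using only the standard library; the address is a placeholder, and it assumes, as the note above states, that unpywall reads `UNPAYWALL_EMAIL` from the process environment.

```python
import os

# Placeholder address (an assumption for illustration); use your own email.
os.environ['UNPAYWALL_EMAIL'] = 'you@example.org'

# Confirm that the variable is visible to the current process.
print(os.environ.get('UNPAYWALL_EMAIL'))
```

A variable set this way only lasts for the current process, so it suits notebooks and scripts, while the `export` form suits a whole shell session.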

### Query Unpaywall by DOI

To look up articles by DOI, use the method `doi`. It accepts a list of DOIs and returns a [pandas DataFrame](https://pandas.pydata.org/pandas-docs/stable/reference/api/pandas.DataFrame.html).

```python
from unpywall import Unpywall

Unpywall.doi(dois=['10.1038/nature12373', '10.1093/nar/gkr1047'])

#   data_standard  ... best_oa_location.version
#0              2  ...         publishedVersion
#1              2  ...         publishedVersion

#[2 rows x 32 columns]
```
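
Because the result is an ordinary DataFrame, the usual pandas operations apply to it. The sketch below uses a small hand-built stand-in instead of a live API call so that it runs offline; the `doi` and `is_oa` column names are assumptions for illustration, while `best_oa_location.version` is taken from the output shown above.

```python
import pandas as pd

# Stand-in for the DataFrame returned by Unpywall.doi(...). A real result has
# many more columns; the values here are made up for illustration.
df = pd.DataFrame({
    'doi': ['10.1038/nature12373', '10.1093/nar/gkr1047'],
    'is_oa': [True, True],
    'best_oa_location.version': ['publishedVersion', 'publishedVersion'],
})

# Keep only open-access rows, then count them by the version of the best copy.
oa = df[df['is_oa']]
version_counts = oa['best_oa_location.version'].value_counts()
print(version_counts)
```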

You can track the progress of your API call by setting the parameter `progress` to `True`. This is especially useful for estimating the time required for long lists of DOIs.

```python
Unpywall.doi(dois=['10.1038/nature12373', '10.1093/nar/gkr1047'],
             progress=True)

#|=========================                        | 50%
```

This method also offers two options for handling errors, `raise` and `ignore`:

```python
Unpywall.doi(dois=['10.1038/nature12373', '10.1093/nar/gkr1047'],
             errors='ignore')
```
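
Whichever error mode you choose, it can help to clean the input list first. The helper below is hypothetical and not part of unpywall: it strips common URL and `doi:` prefixes and removes duplicates, on the assumption (based on the examples above) that `Unpywall.doi` expects bare DOIs.

```python
def normalize_dois(raw_dois):
    """Return bare, de-duplicated DOIs in their original order."""
    prefixes = (
        'https://doi.org/', 'http://doi.org/',
        'https://dx.doi.org/', 'http://dx.doi.org/',
        'doi:',
    )
    seen = set()
    cleaned = []
    for raw in raw_dois:
        doi = raw.strip()
        lowered = doi.lower()
        for prefix in prefixes:
            if lowered.startswith(prefix):
                doi = doi[len(prefix):].strip()
                break
        key = doi.lower()  # DOIs are case-insensitive, so compare in lowercase
        if doi and key not in seen:
            seen.add(key)
            cleaned.append(doi)
    return cleaned


dois = normalize_dois([
    'https://doi.org/10.1038/nature12373',
    ' 10.1093/nar/gkr1047 ',
    'doi:10.1038/NATURE12373',  # duplicate of the first entry
])
print(dois)
```

The cleaned list can then be passed as the `dois` argument.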

### Query Unpaywall by text search

To search for articles by a given term, use the method `query`. The result is a [pandas DataFrame](https://pandas.pydata.org/pandas-docs/stable/reference/api/pandas.DataFrame.html).

```python
Unpywall.query(query='sea lion',
               is_oa=True)
#   data_standard  ... first_oa_location.version
#0              2  ...          publishedVersion
#1              2  ...          publishedVersion
#2              2  ...          publishedVersion
```

### Conveniently obtain full text

If you are using Unpaywall to obtain full-text copies of papers for literature mining, you may benefit from the following functions:

You can use the `download_pdf_handle` method to return a PDF handle for the given DOI.

```python
Unpywall.download_pdf_handle(doi='10.1038/nature12373')

#<http.client.HTTPResponse object at 0x7fd08ef677c0>
```
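
The returned handle is a binary file-like object, so it can be streamed to disk with the standard library. This is a sketch under that assumption; `save_handle` is a hypothetical helper, and an in-memory `io.BytesIO` stands in for the real HTTP response so the example runs offline.

```python
import io
import shutil
import tempfile
from pathlib import Path


def save_handle(handle, path):
    """Stream a binary file-like object to `path` and return the path."""
    path = Path(path)
    with open(path, 'wb') as out:
        # Copies in chunks, so a large PDF is never held fully in memory.
        shutil.copyfileobj(handle, out)
    return path


# Stand-in for the object returned by Unpywall.download_pdf_handle(...).
fake_handle = io.BytesIO(b'%PDF-1.4 placeholder bytes')
target = Path(tempfile.mkdtemp()) / 'article.pdf'
save_handle(fake_handle, target)
print(target.stat().st_size)
```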

To return a URL to a PDF for the given DOI, use `get_pdf_link`.

```python
Unpywall.get_pdf_link(doi='10.1038/nature12373')

#'https://dash.harvard.edu/bitstream/1/12285462/1/Nanometer-Scale%20Thermometry.pdf'
```

To return a URL to the best available OA copy, regardless of the format, use `get_doc_link`.

```python
Unpywall.get_doc_link(doi='10.1016/j.envint.2020.105730')

#'https://doi.org/10.1016/j.envint.2020.105730'
```

To return a list of all URLs to OA copies, use `get_all_links`.

```python
Unpywall.get_all_links(doi='10.1038/nature12373')

#['https://dash.harvard.edu/bitstream/1/12285462/1/Nanometer-Scale%20Thermometry.pdf']
```

You can also directly access all data provided by Unpaywall in JSON format using `get_json`.

```python
Unpywall.get_json(doi='10.1038/nature12373')

#{'best_oa_location': {'endpoint_id': '8c9d8ba370a84253deb', 'evidence': 'oa repository (via OAI-PMH doi match)', 'host_type': ...
```
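
The result is a nested dictionary, so individual fields can be read with ordinary dictionary access. The sketch below uses hand-built stand-in records with made-up DOIs and values; `summarize_record` is a hypothetical helper, and the `doi` and `is_oa` keys are assumptions about the record layout. It guards against `best_oa_location` being empty, which is assumed to be the case for articles without an OA copy.

```python
def summarize_record(record):
    """Pick a few fields out of an Unpaywall-style record (a plain dict)."""
    # Fall back to an empty dict when there is no best OA location.
    best = record.get('best_oa_location') or {}
    return {
        'doi': record.get('doi'),
        'is_oa': record.get('is_oa'),
        'host_type': best.get('host_type'),
        'evidence': best.get('evidence'),
    }


# Stand-in records; all values are illustrative, not real API output.
open_record = {
    'doi': '10.0000/example-open',
    'is_oa': True,
    'best_oa_location': {
        'evidence': 'oa repository (via OAI-PMH doi match)',
        'host_type': 'repository',
    },
}
closed_record = {
    'doi': '10.0000/example-closed',
    'is_oa': False,
    'best_oa_location': None,
}

print(summarize_record(open_record))
print(summarize_record(closed_record))
```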

## Command-Line Interface

unpywall comes with a command-line interface that can be used to quickly look up a PDF or to download free full-text articles to your device.

### Obtain a PDF URL

Retrieve the URL of a PDF for a given DOI with the following command.

```bash
unpywall link 10.1038/nature12373
```

### View a PDF

If you want to view a PDF in your browser or on your system, use `view`.

```bash
unpywall view 10.1038/nature12373 -m browser
```

### PDF Download

Use `download` if you want to store a PDF on your machine.

```bash
unpywall download 10.1038/nature12373 -f article.pdf -p ./documents
```
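
To download several articles, the same command can be assembled once per DOI from a script. This is a sketch only: `build_download_command` is a hypothetical helper, the file-naming scheme is an assumption, and the `-f` and `-p` flags are used exactly as in the command above. It only prints the commands, so it runs without network access.

```python
import shlex


def build_download_command(doi, directory='./documents'):
    """Return the argument list for one `unpywall download` call."""
    # Assumed naming scheme: replace the slash so the DOI is a valid filename.
    filename = doi.replace('/', '_') + '.pdf'
    return ['unpywall', 'download', doi, '-f', filename, '-p', directory]


for doi in ['10.1038/nature12373', '10.1093/nar/gkr1047']:
    # Print a shell-safe version of each command for review.
    print(shlex.join(build_download_command(doi)))
```

Each argument list can be passed to `subprocess.run` to execute the download.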

### Help

You can always use the help option `-h` to display a description of the provided commands.

```bash
unpywall -h
```

## Documentation

Full documentation is available at https://unpywall.readthedocs.io/.

## Develop

To install unpywall in editable mode along with the development tools, run the following from the root of a local clone of the repository:

```bash
pip install -e '.[dev]'
```

            

        