arxiv-dl


| Field | Value |
| --- | --- |
| Name | arxiv-dl |
| Version | 1.2.2 |
| Home page | None |
| Summary | Command-line Papers Downloader. Citation extraction and PDF naming automation. |
| Upload time | 2025-07-27 08:40:08 |
| Maintainer | None |
| Docs URL | None |
| Author | None |
| Requires Python | >=3.6 |
| License | MIT License Copyright (c) 2021-2026 Mark H. Huang Permission is hereby granted, free of charge, to any person obtaining a copy of this software and associated documentation files (the "Software"), to deal in the Software without restriction, including without limitation the rights to use, copy, modify, merge, publish, distribute, sublicense, and/or sell copies of the Software, and to permit persons to whom the Software is furnished to do so, subject to the following conditions: The above copyright notice and this permission notice shall be included in all copies or substantial portions of the Software. THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE SOFTWARE. |
| Keywords | cvf, cvpr, eccv, iccv, wacv, arxiv, downloader, paper |
| Requirements | beautifulsoup4, pydantic, pymupdf, requests, rich, setuptools |
| Travis-CI | No Travis. |
| Coveralls test coverage | No coveralls. |

# arXiv-dl

Command-line Research Paper Downloader for [`arXiv.org`](https://arxiv.org/), [`ECVA`](https://www.ecva.net/papers.php) & [`CVF Open Access`](https://openaccess.thecvf.com/menu).

[![](https://img.shields.io/pypi/v/arxiv-dl)](https://pypi.org/project/arxiv-dl/)
[![](https://img.shields.io/pypi/dm/Arxiv-dl)](https://pypistats.org/packages/arxiv-dl)
[![](https://img.shields.io/badge/code%20style-black-black)](https://github.com/psf/black)
[![](https://img.shields.io/badge/license-MIT-black)](https://github.com/MarkHershey/arxiv-dl/blob/master/LICENSE)

_Disclaimer: This is a highly opinionated command-line tool for downloading papers. It prioritizes ease of use for researchers. Obviously, this is not an official arXiv project._

![](imgs/demo_v1.2.0.png)

## What does it do?

-   Downloads papers from [arXiv](https://arxiv.org/), [ECCV](https://www.ecva.net/papers.php), [CVPR, ICCV, WACV](https://openaccess.thecvf.com/menu) via a simple CLI.
-   Speeds up downloads by using [aria2](https://aria2.github.io/).
-   Retrieves the paper's metadata, such as:
    -   Title, Abstract, Year
    -   Authors
    -   Comments (Conference acceptance info)
    -   Repository URLs
    -   `BibTeX` Citation
-   Automatically maintains a list of local papers and their metadata in a JSON file.
-   Lets you configure the download destination via an environment variable or a command-line argument.
-   Gives every downloaded paper a standardized filename for easy browsing.

## Why?

-   Save time and effort when downloading and organizing papers on your machine.
-   Speed up downloads by using multiple parallel connections.
-   Keep a local paper list that is handy for quick lookup, note-taking, and citations.

## How to install it?

This is a command-line tool. Use `pip` to install the package globally, and you are good to go!

-   Prerequisite: Python 3.6 or later (the package metadata declares `requires_python >=3.6`)

```bash
python3 -m pip install -U arxiv-dl
```

> [!NOTE]
> After installation, make sure the installation path is included in your `PATH` environment variable (see the tips [here](https://github.com/MarkHershey/arxiv-dl/issues/16#issue-3266539938)). If you have trouble finding or setting the `PATH`, follow the recommended guide for [installing stand-alone command line tools](https://packaging.python.org/en/latest/guides/installing-stand-alone-command-line-tools/) when installing `arxiv-dl`.

Optionally, install [aria2c](https://aria2.github.io/) for multi-connection download speedup.

-   macOS: `brew install aria2`
-   Linux: `sudo snap install aria2c`
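
If the `paper` command is not found after installation, one quick way to check the `PATH` from Python is `shutil.which`. The snippet below is a minimal sketch: the helper name `find_tools` is made up for illustration, and it assumes the executables are named `paper` and `aria2c` as described in this README.

```python
import shutil


def find_tools(names, which=shutil.which):
    """Map each executable name to its resolved path, or None if it is not on PATH."""
    return {name: which(name) for name in names}


if __name__ == "__main__":
    for name, path in find_tools(["paper", "aria2c"]).items():
        print(f"{name}: {path if path else 'not found on PATH'}")
```

A missing `aria2c` is not an error, since aria2 is optional. A missing `paper` usually points to the PATH issue described in the note above.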

## How to use it?

After installation, you may use the command `paper` in your shell to download papers. 
(Legacy commands `arxiv-dl` and `getpaper` are equivalent to the command `paper`.)

```bash
paper [OPTIONS] TARGET(s)
```

### Use in your shell:

```bash
# download a single TARGET
$ paper 1512.03385

# download multiple TARGETs separated by space
$ paper 2103.15538 2304.04415 https://arxiv.org/abs/1512.03385
```

### Supported types of download TARGETs:

<details>
<summary><strong>Click to expand</strong></summary>

✅ Supported, 🚧 Not Yet Supported, ❌ Not Supported

-   **[ArXiv](https://arxiv.org/)** 
    -   ✅ ArXiv ID: `1512.03385` or `arXiv:1512.03385`
    -   ✅ Legacy ArXiv ID: `alg-geom/9708001` or `cs/0002001`, etc.
    -   ✅ ArXiv Abstract Page URL: `https://arxiv.org/abs/1512.03385` 
    -   ✅ ArXiv PDF Page URL: `https://arxiv.org/pdf/1512.03385.pdf`
    -   ✅ ArXiv HTML Page URL: `https://arxiv.org/html/2506.15442`
-   **[CVF Open Access](https://openaccess.thecvf.com/menu) (CVPR, ICCV, WACV)**
    -   ✅ CVF Abstract Page URL: `https://openaccess.thecvf.com/content/**/html/**/*.html`
    -   ✅ CVF PDF Page URL: `https://openaccess.thecvf.com/content/**/papers/**/*.pdf`
-   **[ECVA](https://www.ecva.net/papers.php) (ECCV)** 
    -   ✅ ECVA Abstract Page URL: `https://www.ecva.net/html/**/*.php`
    -   ❌ ECVA PDF Page URL: `https://www.ecva.net/papers/**/*.pdf`
-   **[NeurIPS](https://papers.nips.cc/)**
    -   🚧 NeurIPS Abstract Page URL
    -   🚧 NeurIPS PDF Page URL
-   **[OpenReview](https://openreview.net/)**
    -   🚧 TODO
</details>
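
To make the arXiv TARGET forms above concrete, here is a rough sketch of how such strings could be told apart. The regular expressions and the function name `classify_target` are assumptions derived only from the examples in the list; they are not the tool's actual parsing rules.

```python
import re

# Illustrative patterns based on the examples listed above (assumptions).
NEW_ID = re.compile(r"^(?:arXiv:)?(\d{4}\.\d{4,5})(v\d+)?$", re.IGNORECASE)
LEGACY_ID = re.compile(r"^(?:arXiv:)?([a-z\-]+(?:\.[a-z]{2})?/\d{7})(v\d+)?$", re.IGNORECASE)
ARXIV_URL = re.compile(r"^https?://arxiv\.org/(abs|pdf|html)/(.+?)(?:\.pdf)?$")


def classify_target(target):
    """Return (kind, arxiv_id); arxiv_id is None when no arXiv ID is recognized."""
    target = target.strip()
    url = ARXIV_URL.match(target)
    # For arXiv URLs, classify the trailing path component as an ID.
    page, candidate = (url.group(1), url.group(2)) if url else (None, target)
    for kind, pattern in (("id", NEW_ID), ("legacy_id", LEGACY_ID)):
        match = pattern.match(candidate)
        if match:
            return (f"{page}_url" if page else kind, match.group(1))
    if target.startswith("https://openaccess.thecvf.com/"):
        return ("cvf_url", None)
    if target.startswith("https://www.ecva.net/"):
        return ("ecva_url", None)
    return ("unknown", None)


for example in ["arXiv:1512.03385", "cs/0002001", "https://arxiv.org/pdf/1512.03385.pdf"]:
    print(example, "->", classify_target(example))
```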

### Frequently used OPTIONS:

-   `-v`, `--verbose` (optional): Enable verbose output and print full details.
-   `-d`, `--download-dir` (optional): Specify a one-time download directory. This option overrides both the default download directory and the one specified in the environment variable `ARXIV_DOWNLOAD_FOLDER`.
-   `-n`, `--n-threads` (optional): Specify the number of parallel connections to be used by `aria2`.

> [!TIP]
> More options are available; run `paper -h` to see them all.
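
When calling the CLI from a script, it can help to build the argument list programmatically. The sketch below uses only the three options listed above. The helper name `build_paper_command` is hypothetical, and it assumes `-d` and `-n` each take one value, as their descriptions suggest.

```python
import shlex


def build_paper_command(targets, download_dir=None, n_threads=None, verbose=False):
    """Build an argv list for the `paper` CLI from the options documented above."""
    if not targets:
        raise ValueError("at least one TARGET is required")
    argv = ["paper"]
    if verbose:
        argv.append("-v")
    if download_dir is not None:
        argv += ["-d", str(download_dir)]
    if n_threads is not None:
        argv += ["-n", str(int(n_threads))]
    argv += list(targets)
    return argv


cmd = build_paper_command(["1512.03385", "2103.15538"], download_dir="papers", n_threads=4)
print(shlex.join(cmd))  # prints: paper -d papers -n 4 1512.03385 2103.15538
```

The resulting list can be passed to `subprocess.run(cmd, check=True)` once `paper` is on your `PATH`.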

### Use it in your code:

```python
from arxiv_dl import download_paper

download_paper(target="1512.03385", download_dir=".", set_verbose_level="silent")
```
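
For several papers, the same call can be wrapped in a loop. This is a minimal sketch that uses only the keyword arguments shown above. The helper name `download_many` and the injectable `downloader` parameter are made up for illustration, and it assumes that a failed download surfaces as an exception, which this README does not confirm.

```python
def download_many(targets, download_dir=".", downloader=None):
    """Download each target in turn and return a list of (target, error) failures."""
    if downloader is None:
        # Imported lazily so the helper can be exercised without the package installed.
        from arxiv_dl import download_paper as downloader
    failures = []
    for target in targets:
        try:
            downloader(target=target, download_dir=download_dir, set_verbose_level="silent")
        except Exception as exc:  # assumption: failures are raised, not just logged
            failures.append((target, exc))
    return failures
```

Calling `download_many(["1512.03385", "2103.15538"], download_dir="papers")` then returns an empty list when every download succeeds under that assumption.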


## Configurations

### Default Download Destination

-   Without any configuration, all papers are downloaded to `$HOME/Downloads/ArXiv_Papers`, where `$HOME` is the current user's home directory.

### Set Your Custom Download Destination _(Optional)_

You may configure your preferred download destination once and for all via an environment variable. This will override the default download destination. To do that, include the following line in your `.bashrc` or `.zshrc` file:

```bash
export ARXIV_DOWNLOAD_FOLDER="YOUR/PATH/TO/ANY/FOLDER"
```

-   Every time you use the `paper` command, the download destination is resolved in the following order of priority:
    1.  Command-line option `-d` (highest priority)
    2.  Environment variable `ARXIV_DOWNLOAD_FOLDER`
    3.  Default download destination (lowest priority)
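
The priority order above can be sketched in a few lines of Python. This illustrates the documented behavior and is not the package's own code; the function name `resolve_download_dir` and its parameters are hypothetical.

```python
import os
from pathlib import Path


def resolve_download_dir(cli_dir=None, env=None):
    """Pick the download directory: -d option, then env variable, then default."""
    env = os.environ if env is None else env
    if cli_dir:
        chosen = cli_dir  # 1. command-line option `-d`
    elif env.get("ARXIV_DOWNLOAD_FOLDER"):
        chosen = env["ARXIV_DOWNLOAD_FOLDER"]  # 2. environment variable
    else:
        chosen = Path.home() / "Downloads" / "ArXiv_Papers"  # 3. default
    return Path(chosen).expanduser()


print(resolve_download_dir(cli_dir=None, env={"ARXIV_DOWNLOAD_FOLDER": "/data/papers"}))
```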

### Set Custom Command Alias _(Optional)_

-   You can always set your own preferred alias to rename the command or add more options.
-   Include the following line(s) in your `.bashrc` or `.zshrc` file to set your preferred alias:
    ```bash
    alias dp="paper"
    # use $HOME here: the shell does not expand a quoted ~
    alias dpv='paper -v -d "$HOME/Documents/Papers"'
    ```

## Development

### Set up development environment

```bash
# create a virtual environment
python3 -m venv venv && source venv/bin/activate

# install dependencies
pip install -U -r requirements.txt

# install the package in editable mode & dev dependencies
pip install -e ".[dev]"
```

### Run Tests

```bash
pytest
```

### Build the package

```bash
make
```

### Clean cache & build artifacts

```bash
make clean
```

## License

This project is licensed under the [MIT License](https://github.com/MarkHershey/arxiv-dl/blob/master/LICENSE).  
&copy; Mark H. Huang. All rights reserved.

            

Raw data

            {
    "_id": null,
    "home_page": null,
    "name": "arxiv-dl",
    "maintainer": null,
    "docs_url": null,
    "requires_python": ">=3.6",
    "maintainer_email": null,
    "keywords": "CVF, CVPR, ECCV, ICCV, WACV, arxiv, downloader, paper",
    "author": null,
    "author_email": "Mark He Huang <dev@markhh.com>",
    "download_url": "https://files.pythonhosted.org/packages/1e/e7/efe872ddd800ed7e38126c82462ea4454f903ca5dfc72f83ec16f0e78a0b/arxiv_dl-1.2.2.tar.gz",
    "platform": null,
    "bugtrack_url": null,
    "license": "MIT License\n        \n        Copyright (c) 2021-2026 Mark H. Huang\n        \n        Permission is hereby granted, free of charge, to any person obtaining a copy\n        of this software and associated documentation files (the \"Software\"), to deal\n        in the Software without restriction, including without limitation the rights\n        to use, copy, modify, merge, publish, distribute, sublicense, and/or sell\n        copies of the Software, and to permit persons to whom the Software is\n        furnished to do so, subject to the following conditions:\n        \n        The above copyright notice and this permission notice shall be included in all\n        copies or substantial portions of the Software.\n        \n        THE SOFTWARE IS PROVIDED \"AS IS\", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR\n        IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,\n        FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE\n        AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER\n        LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,\n        OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE\n        SOFTWARE.",
    "summary": "Command-line Papers Downloader. Citation extraction and PDF naming automation.",
    "version": "1.2.2",
    "project_urls": {
        "Homepage": "https://github.com/MarkHershey/arxiv-dl",
        "Issues": "https://github.com/MarkHershey/arxiv-dl/issues"
    },
    "split_keywords": [
        "cvf",
        " cvpr",
        " eccv",
        " iccv",
        " wacv",
        " arxiv",
        " downloader",
        " paper"
    ],
    "urls": [
        {
            "comment_text": null,
            "digests": {
                "blake2b_256": "2b062a2c8894904d43e290849468a6a2612f8c8eec1adca197b4eea340271188",
                "md5": "d69ebcb487d27c01cda72f7dae51c13c",
                "sha256": "943aaa2c5667618f53b48e6e8ee56d5b34a4e913891e15864dc1963c799e8674"
            },
            "downloads": -1,
            "filename": "arxiv_dl-1.2.2-py3-none-any.whl",
            "has_sig": false,
            "md5_digest": "d69ebcb487d27c01cda72f7dae51c13c",
            "packagetype": "bdist_wheel",
            "python_version": "py3",
            "requires_python": ">=3.6",
            "size": 22074,
            "upload_time": "2025-07-27T08:40:00",
            "upload_time_iso_8601": "2025-07-27T08:40:00.811932Z",
            "url": "https://files.pythonhosted.org/packages/2b/06/2a2c8894904d43e290849468a6a2612f8c8eec1adca197b4eea340271188/arxiv_dl-1.2.2-py3-none-any.whl",
            "yanked": false,
            "yanked_reason": null
        },
        {
            "comment_text": null,
            "digests": {
                "blake2b_256": "1ee7efe872ddd800ed7e38126c82462ea4454f903ca5dfc72f83ec16f0e78a0b",
                "md5": "836208bfc352e6106d695e23b6f2b712",
                "sha256": "3d3293c6b58119e2b24029eb947e89712a35c48321f37502bc3c38b2761d7c8e"
            },
            "downloads": -1,
            "filename": "arxiv_dl-1.2.2.tar.gz",
            "has_sig": false,
            "md5_digest": "836208bfc352e6106d695e23b6f2b712",
            "packagetype": "sdist",
            "python_version": "source",
            "requires_python": ">=3.6",
            "size": 908720,
            "upload_time": "2025-07-27T08:40:08",
            "upload_time_iso_8601": "2025-07-27T08:40:08.245558Z",
            "url": "https://files.pythonhosted.org/packages/1e/e7/efe872ddd800ed7e38126c82462ea4454f903ca5dfc72f83ec16f0e78a0b/arxiv_dl-1.2.2.tar.gz",
            "yanked": false,
            "yanked_reason": null
        }
    ],
    "upload_time": "2025-07-27 08:40:08",
    "github": true,
    "gitlab": false,
    "bitbucket": false,
    "codeberg": false,
    "github_user": "MarkHershey",
    "github_project": "arxiv-dl",
    "travis_ci": false,
    "coveralls": false,
    "github_actions": true,
    "requirements": [
        {
            "name": "beautifulsoup4",
            "specs": [
                [
                    ">=",
                    "4.13.4"
                ]
            ]
        },
        {
            "name": "pydantic",
            "specs": [
                [
                    ">=",
                    "2.11.7"
                ]
            ]
        },
        {
            "name": "pymupdf",
            "specs": [
                [
                    ">=",
                    "1.26.1"
                ]
            ]
        },
        {
            "name": "requests",
            "specs": [
                [
                    ">=",
                    "2.32.4"
                ]
            ]
        },
        {
            "name": "rich",
            "specs": [
                [
                    ">=",
                    "14.0.0"
                ]
            ]
        },
        {
            "name": "setuptools",
            "specs": [
                [
                    ">=",
                    "80.9.0"
                ]
            ]
        }
    ],
    "lcname": "arxiv-dl"
}
        