# arxiv.py
[PyPI](https://pypi.org/project/arxiv/) | [Build status](https://github.com/lukasschwab/arxiv.py/actions?query=branch%3Amaster) | [Full package documentation](https://lukasschwab.me/arxiv.py/index.html)
Python wrapper for [the arXiv API](https://arxiv.org/help/api/index).
[arXiv](https://arxiv.org/) is a project by the Cornell University Library that provides open access to 1,000,000+ articles in Physics, Mathematics, Computer Science, Quantitative Biology, Quantitative Finance, and Statistics.
## Usage
### Installation
```bash
$ pip install arxiv
```
In your Python script, include the line
```python
import arxiv
```
### Examples
#### Fetching results
```python
import arxiv
# Construct the default API client.
client = arxiv.Client()
# Search for the 10 most recent articles matching the keyword "quantum."
search = arxiv.Search(
  query = "quantum",
  max_results = 10,
  sort_by = arxiv.SortCriterion.SubmittedDate
)
results = client.results(search)
# `results` is a generator; you can iterate over its elements one by one...
for r in client.results(search):
  print(r.title)
# ...or exhaust it into a list. Careful: this is slow for large result sets.
all_results = list(results)
print([r.title for r in all_results])
# For advanced query syntax documentation, see the arXiv API User Manual:
# https://arxiv.org/help/api/user-manual#query_details
search = arxiv.Search(query = "au:del_maestro AND ti:checkerboard")
first_result = next(client.results(search))
print(first_result)
# Search for the paper with ID "1605.08386v1"
search_by_id = arxiv.Search(id_list=["1605.08386v1"])
# Reuse client to fetch the paper, then print its title.
first_result = next(client.results(search_by_id))
print(first_result.title)
```
#### Fetching results with a custom client
```python
import arxiv
big_slow_client = arxiv.Client(
  page_size = 1000,
  delay_seconds = 10.0,
  num_retries = 5
)
# Prints 1000 titles before needing to make another request.
for result in big_slow_client.results(arxiv.Search(query="quantum")):
  print(result.title)
```
#### Logging
To inspect this package's network behavior and API logic, configure a `DEBUG`-level logger.
```pycon
>>> import logging, arxiv
>>> logging.basicConfig(level=logging.DEBUG)
>>> client = arxiv.Client()
>>> paper = next(client.results(arxiv.Search(id_list=["1605.08386v1"])))
INFO:arxiv.arxiv:Requesting 100 results at offset 0
INFO:arxiv.arxiv:Requesting page (first: False, try: 0): https://export.arxiv.org/api/query?search_query=&id_list=1605.08386v1&sortBy=relevance&sortOrder=descending&start=0&max_results=100
DEBUG:urllib3.connectionpool:Starting new HTTPS connection (1): export.arxiv.org:443
DEBUG:urllib3.connectionpool:https://export.arxiv.org:443 "GET /api/query?search_query=&id_list=1605.08386v1&sortBy=relevance&sortOrder=descending&start=0&max_results=100&user-agent=arxiv.py%2F1.4.8 HTTP/1.1" 200 979
```
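The logged request URL shows which query parameters reach the arXiv API. As a rough illustration, the sketch below rebuilds that URL with the standard library only. `build_query_url` and `API_ENDPOINT` are hypothetical names chosen for this example; they are not part of the package, and the package's own request logic may differ.

```python
from urllib.parse import urlencode

# Endpoint taken from the log output above.
API_ENDPOINT = "https://export.arxiv.org/api/query"


def build_query_url(
    search_query="",
    id_list=(),
    sort_by="relevance",
    sort_order="descending",
    start=0,
    max_results=100,
):
    """Hypothetical helper: assemble a query URL shaped like the logged one."""
    params = {
        "search_query": search_query,
        # The arXiv API expects a comma-delimited list of IDs.
        "id_list": ",".join(id_list),
        "sortBy": sort_by,
        "sortOrder": sort_order,
        "start": start,
        "max_results": max_results,
    }
    return f"{API_ENDPOINT}?{urlencode(params)}"


url = build_query_url(id_list=["1605.08386v1"])
print(url)
```

Comparing a URL built this way against the `INFO` log line is a quick way to check which page offset and page size a given request used.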
## Types
### Client
A `Client` specifies a reusable strategy for fetching results from arXiv's API. For most use cases the default client should suffice.
Client configurations specify pagination and retry logic. *Reusing* a client allows successive API calls to use the same connection pool and ensures they abide by the rate limit you set.
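To make the interplay of page size and delay concrete, here is a minimal sketch of a paginating generator that waits between page requests. It is an illustration of the general idea under stated assumptions, not the package's implementation: `paginate`, `fetch_page`, and the injectable `sleep` argument are all hypothetical names, and retries are left out.

```python
import time
from typing import Callable, Iterator, List


def paginate(
    fetch_page: Callable[[int, int], List[int]],
    page_size: int,
    delay_seconds: float,
    sleep: Callable[[float], None] = time.sleep,
) -> Iterator[int]:
    """Yield items page by page, pausing before every request after the first."""
    offset = 0
    first_request = True
    while True:
        if not first_request:
            sleep(delay_seconds)  # rate limit between successive requests
        page = fetch_page(offset, page_size)
        first_request = False
        if not page:
            return
        yield from page
        if len(page) < page_size:
            return  # a short page means there is nothing left to fetch
        offset += page_size


# Stand-in data source: five items served in slices, as a paged API would.
data = list(range(5))
waits: List[float] = []  # records requested delays instead of really sleeping

items = list(
    paginate(
        fetch_page=lambda start, size: data[start:start + size],
        page_size=2,
        delay_seconds=3.0,
        sleep=waits.append,
    )
)
print(items, waits)
```

Because the generator is lazy, no request or delay happens until a caller asks for an item beyond the current page, which is why a larger page size means fewer pauses for the same number of results.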
### Search
A `Search` specifies a search of arXiv's database. Use `Client.results` to get a generator yielding `Result`s.
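Since results arrive through a generator, a caller can stop early without fetching everything. The sketch below shows one way to take only the first few items with `itertools.islice`. `take` and `fake_results` are hypothetical; `fake_results` stands in for `client.results(search)` so the example runs without network access.

```python
from itertools import islice
from typing import Iterable, Iterator, List, TypeVar

T = TypeVar("T")


def take(results: Iterable[T], n: int) -> List[T]:
    """Return at most the first n items, consuming no more than n from the source."""
    return list(islice(results, n))


def fake_results() -> Iterator[str]:
    """Stand-in for a results generator; yields placeholder titles lazily."""
    for i in range(1000):
        yield f"paper-{i}"


first_three = take(fake_results(), 3)
print(first_three)
```

With the real package, the same helper would be called as `take(client.results(search), 3)`.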
### Result
The `Result` objects yielded by `Client.results` include metadata about each paper.
The meaning of the underlying raw data is documented in the [arXiv API User Manual: Details of Atom Results Returned](https://arxiv.org/help/api/user-manual#_details_of_atom_results_returned).
`Result` also exposes helper methods for downloading papers: `Result.download_pdf` and `Result.download_source`.
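As a hedged sketch of how the download helper might be used: the example below assumes `Result.download_pdf` accepts `dirpath` and `filename` keyword arguments and returns the written path, as in earlier releases of this package, and that `Result.entry_id` is a URL ending in `/abs/<id>`. Check both against the package documentation for your installed version. `pdf_filename` and `download_paper` are hypothetical helpers, not part of the package.

```python
from pathlib import Path


def pdf_filename(entry_id: str) -> str:
    """Hypothetical helper: derive a file name from an entry URL or a bare ID."""
    short_id = entry_id.rstrip("/").rsplit("/abs/", 1)[-1]
    # Old-style IDs such as "hep-th/9901001v1" contain a slash.
    return short_id.replace("/", "_") + ".pdf"


def download_paper(paper_id: str, dirpath: str = "./papers") -> str:
    """Fetch one paper by ID and save its PDF (requires network access)."""
    # Imported here so pdf_filename stays usable without the package installed.
    import arxiv

    Path(dirpath).mkdir(parents=True, exist_ok=True)
    client = arxiv.Client()
    result = next(client.results(arxiv.Search(id_list=[paper_id])))
    # Assumption: dirpath and filename are accepted keyword arguments.
    return result.download_pdf(
        dirpath=dirpath, filename=pdf_filename(result.entry_id)
    )


print(pdf_filename("http://arxiv.org/abs/1605.08386v1"))
```

Calling `download_paper("1605.08386v1")` would then write the PDF under `./papers`, provided the assumptions above hold.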