| Field | Value |
| --- | --- |
| Name | scrapxd |
| Version | 0.1.1 |
| download | https://files.pythonhosted.org/packages/b2/93/3d42c916fc7812834eeebab14380ad6b480171fc8d16c65baf5c7cde640c/scrapxd-0.1.1.tar.gz |
| home_page | None |
| Summary | A Python library for scraping, analysing and exporting Letterboxd data. |
| upload_time | 2025-10-21 14:52:42 |
| maintainer | None |
| docs_url | None |
| author | None |
| requires_python | >=3.10 |
| license | MIT License<br>Copyright (c) 2025 Cauã Santos<br><br>Permission is hereby granted, free of charge, to any person obtaining a copy of this software and associated documentation files (the "Software"), to deal in the Software without restriction, including without limitation the rights to use, copy, modify, merge, publish, distribute, sublicense, and/or sell copies of the Software, and to permit persons to whom the Software is furnished to do so, subject to the following conditions:<br><br>The above copyright notice and this permission notice shall be included in all copies or substantial portions of the Software.<br><br>THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE SOFTWARE. |
| keywords | letterboxd, scraper, web scraping, api, movies |
| VCS | GitHub (https://github.com/cauafsantosdev/scrapxd) |
| bugtrack_url | None |
| requirements | annotated-types==0.7.0, beautifulsoup4==4.13.4, bs4==0.0.2, certifi==2025.4.26, charset-normalizer==3.4.2, coverage==7.10.7, et_xmlfile==2.0.0, fake-useragent==2.2.0, idna==3.10, iniconfig==2.1.0, lxml==5.4.0, numpy==2.3.2, openpyxl==3.1.5, packaging==25.0, pluggy==1.6.0, pydantic==2.11.5, pydantic_core==2.33.2, Pygments==2.19.1, pytest==8.4.0, pytest-cov==7.0.0, pytest-dependency==0.6.0, pytest-mock==3.15.1, requests==2.32.4, scipy==1.16.1, setuptools==80.9.0, six==1.17.0, soupsieve==2.7, style==1.1.0, tenacity==9.1.2, typing-inspection==0.4.1, typing_extensions==4.14.0, update==0.0.1, urllib3==2.4.0 |
| Travis-CI | No Travis. |
| coveralls test coverage | No coveralls. |
# scrapxd: The Library for Letterboxd Data
[PyPI](https://pypi.org/project/scrapxd/) · [License: MIT](https://opensource.org/licenses/MIT) · [GitHub](https://github.com/cauafsantosdev/scrapxd)
`scrapxd` is a Python library designed for web scraping, analyzing, and exporting data from [Letterboxd](https://letterboxd.com/), the social network for cinephiles. With an intuitive, strictly-typed API using Pydantic, `scrapxd` makes it easy to access user profiles, film lists, diaries, and much more.
---
## Key Features
* **Scrape Whatever You Need:** Extract detailed data from user profiles, including watched films, ratings, diary entries, lists, followers, and more.
* **Film Search:** Search for films on Letterboxd based on various filters.
* **Pydantic Data Models:** All returned data is validated and structured into Pydantic models, ensuring consistency and ease of use in your code.
* **Analytics Module:** Perform statistical analysis on the collected data, such as correlations and trends (requires the `[analytics]` extra).
* **Exporting Module:** Export collected data to popular formats like CSV, JSON, and Excel (`.xlsx`) (requires the `[export]` extra).
* **Retry Logic:** Uses `tenacity` for automatic retries on network failures, making the scraping process more reliable (see the sketch after this list).
* **Simple and Intuitive:** Designed with a clean and easy-to-use API, as demonstrated in the examples.
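For reference, the retry behaviour described above follows the standard `tenacity` pattern. The snippet below is a generic illustration of that pattern, not scrapxd's internal code; the `fetch` helper and the specific retry parameters are assumptions chosen for the example.
```python
# Generic tenacity retry pattern (illustrative only; not scrapxd's internal code).
import requests
from tenacity import retry, retry_if_exception_type, stop_after_attempt, wait_exponential

@retry(
    retry=retry_if_exception_type(requests.RequestException),  # only retry network-level failures
    stop=stop_after_attempt(3),                                # give up after three attempts
    wait=wait_exponential(multiplier=1, max=10),               # exponential backoff, capped at 10 s
)
def fetch(url: str) -> str:
    """Fetch a page, raising on HTTP errors so tenacity can retry."""
    response = requests.get(url, timeout=10)
    response.raise_for_status()
    return response.text
```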
---
## Installation
You can install the library directly from PyPI.
**Standard Installation:**
```bash
pip install scrapxd
```
The library has optional dependencies for extra features. You can install them as needed:
**For Data Analytics:**
```bash
pip install "scrapxd[analytics]"
```
**For File Exporting:**
```bash
pip install "scrapxd[export]"
```
**To Install Everything (including testing dependencies):**
```bash
pip install "scrapxd[all]"
```
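If you are unsure which extras ended up in your environment, a quick check of the optional backends can help. The sketch below assumes, based on the requirements listed above, that `[analytics]` relies on `numpy`/`scipy` and `[export]` on `openpyxl`; the actual extras definitions may differ.
```python
# Check whether the packages assumed to back each extra are importable.
# Assumption: [analytics] -> numpy/scipy, [export] -> openpyxl (inferred
# from the requirements list above; not confirmed by the package metadata).
from importlib.util import find_spec

EXTRAS = {"analytics": ["numpy", "scipy"], "export": ["openpyxl"]}

for extra, modules in EXTRAS.items():
    missing = [m for m in modules if find_spec(m) is None]
    if missing:
        print(f'[{extra}] missing {missing}; install with: pip install "scrapxd[{extra}]"')
    else:
        print(f"[{extra}] backends are available")
```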
---
## Quickstart
Using `scrapxd` is very simple. Here is a basic example to get a user's watched films:
```python
from scrapxd import Scrapxd

# 1. Create a client instance
client = Scrapxd()

# 2. Get data for a Letterboxd user
# The client handles searching and pagination automatically
user = client.get_user("your_username_here")
user_films = user.logs

# 3. Access the data
print(f"Total films watched by '{user_films.username}': {user_films.number_of_entries}")

# Each entry is a Pydantic object with structured data
for entry in user_films.entries[:5]:  # Displaying the first 5
    print(f"- {entry.film.title} ({entry.film.year}) - Rating: {entry.rating}")

# 4. (Optional) Export the data to an Excel file
try:
    user_films.to_xlsx(f"{user_films.username}_films")
    print(f"\nData exported to {user_films.username}_films.xlsx")
except ImportError:
    print("\nTo export data, please install with: pip install \"scrapxd[export]\"")
```
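For quick statistics without the `[analytics]` extra, the fields shown in the quickstart are already enough. The sketch below reuses `user_films` from the example above, assumes unrated entries expose `rating` as `None`, and calls `scipy.stats` directly rather than the library's own analytics helpers, whose API is not shown in this README.
```python
# Hand-rolled analysis using only fields shown in the quickstart
# (entry.rating, entry.film.year) plus scipy, which scrapxd depends on.
from scipy import stats

rated = [e for e in user_films.entries if e.rating is not None]  # assumption: unrated -> None
if len(rated) > 1:
    years = [e.film.year for e in rated]
    ratings = [float(e.rating) for e in rated]
    print(f"Rated films: {len(rated)}")
    print(f"Average rating: {sum(ratings) / len(ratings):.2f}")
    # Does this user rate newer or older films higher? (Pearson correlation)
    r, p = stats.pearsonr(years, ratings)
    print(f"Year vs. rating correlation: r={r:.2f} (p={p:.3f})")
```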
---
## Detailed Examples
For more in-depth guides covering features such as profile analysis, profile comparisons, and advanced use cases, explore the Jupyter notebooks in the `/examples` folder:
* **[1. Quickstart Guide](./examples/1_quickstart_guide.ipynb)**
* **[2. Deep Dive Analysis](./examples/2_deep_dive_analysis.ipynb)**
* **[3. Comparing Profiles](./examples/3_comparing_profiles.ipynb)**
* **[4. Advanced Guide](./examples/4_advanced_guide.ipynb)**
---
## Contributing
Contributions are very welcome! If you have an idea for a new feature, find a bug, or want to improve the documentation, please open an [Issue](https://github.com/cauafsantosdev/scrapxd/issues) or submit a [Pull Request](https://github.com/cauafsantosdev/scrapxd/pulls).
---
## License
This project is licensed under the MIT License. See the [LICENSE](./LICENSE) file for more details.
---
## Contact
Cauã Santos - [My LinkedIn Profile](https://www.linkedin.com/in/cauafsantosdev/) - cauafsantosdev@gmail.com
GitHub URL: [https://github.com/cauafsantosdev/scrapxd](https://github.com/cauafsantosdev/scrapxd)