Name | ao3-parser |
Version | 1.0.0 |
home_page | None |
Summary | Package for parsing AO3 pages and creating urls based on requirements. |
upload_time | 2024-08-06 09:28:56 |
maintainer | None |
docs_url | None |
author | petak33 |
requires_python | >=3.8 |
license | MIT License Copyright (c) 2024 petak33 Permission is hereby granted, free of charge, to any person obtaining a copy of this software and associated documentation files (the "Software"), to deal in the Software without restriction, including without limitation the rights to use, copy, modify, merge, publish, distribute, sublicense, and/or sell copies of the Software, and to permit persons to whom the Software is furnished to do so, subject to the following conditions: The above copyright notice and this permission notice shall be included in all copies or substantial portions of the Software. THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE SOFTWARE. |
keywords | ao3, archiveofourown, archive of our own |
VCS | |
bugtrack_url | |
requirements | BeautifulSoup4 |
Travis-CI | No Travis. |
coveralls test coverage | No coveralls. |
# AO3 Parser
Tools for parsing AO3 pages and creating URLs based on requirements.
Its main advantage over similar packages is that it gives you complete control over requests to AO3.
Instead of making requests itself, it leaves them to the user, which gives more room for optimization.
Requests are the main bottleneck for anyone who needs to collect larger amounts of data.
(Scraping data for AI training is discouraged.)
If this is not what you're looking for, I'd recommend [ao3_api](https://github.com/ArmindoFlores/ao3_api), which handles requests on its own.
## Installation
```bash
pip install ao3-parser
```
# Usage
Most users will mainly work with two modules, `Search` and `Page`.
## Search
A common use of `Search` looks like this.
Just like on AO3, pages are numbered starting from 1.
```python
import AO3Parser as AO3P
from AO3Parser import Params
from datetime import datetime
search = AO3P.Search("Original Work", Sort_by=Params.Sort.Kudos,
                     Include_Ratings=[Params.Rating.General_Audiences],
                     Words_From=1000, Words_To=1500,
                     Date_From=datetime(2024, 6, 30))
url = search.GetUrl(page=1)
print(f"URL: {url}")
```
```
URL: https://archiveofourown.org/works?commit=Sort+and+Filter&page=1&work_search%5Bsort_colum%5D=Kudos&tag_id=Original+Work&include_work_search%5Brating_ids%5D%5B%5D=10&work_search%5Bwords_from%5D=1000&work_search%5Bwords_to%5D=1500&work_search%5Bdate_from%5D=2024-06-30
```
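To see how the search options end up in the query string, the URL printed above can be decoded with Python's standard `urllib.parse` module. This is only an inspection sketch and does not use the package itself; the URL is copied verbatim from the example output.

```python
from urllib.parse import urlsplit, parse_qs

# URL copied from the example output above.
url = ("https://archiveofourown.org/works?commit=Sort+and+Filter&page=1"
       "&work_search%5Bsort_colum%5D=Kudos&tag_id=Original+Work"
       "&include_work_search%5Brating_ids%5D%5B%5D=10"
       "&work_search%5Bwords_from%5D=1000&work_search%5Bwords_to%5D=1500"
       "&work_search%5Bdate_from%5D=2024-06-30")

# parse_qs decodes percent-escapes and "+" and maps each key to a list of values.
params = parse_qs(urlsplit(url).query)
for key, values in params.items():
    print(f"{key} = {values}")
```

Each keyword argument passed to `Search` shows up as one decoded key, for example `tag_id` holds `Original Work` and `work_search[words_from]` holds `1000`.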
## Page
```python
import AO3Parser as AO3P
import requests
search = AO3P.Search("Original Work")
url = search.GetUrl()
page_data = requests.get(url).content
page = AO3P.Page(page_data)
print(f"Total works: {page.Total_Works}")
print(f"Works on page: {len(page.Works)}")
print(f"Title of the first work: [{page.Works[0].Title}]")
```
```
Total works: 282069
Works on page: 20
Title of the first work: [Title Of This Work]
```
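Because the package leaves requests to the user, fetching several result pages is something you write yourself. The following is a minimal sketch of one way to do it, not part of the package: `iter_pages` and its parameters are hypothetical names, and the 5 second default delay is an arbitrary assumption rather than a documented AO3 limit. The URL builder and the fetcher are passed in as callables, so the loop itself depends only on the standard library.

```python
import time
from typing import Callable, Iterator


def iter_pages(get_url: Callable[[int], str],
               fetch: Callable[[str], bytes],
               first_page: int = 1,
               last_page: int = 3,
               delay: float = 5.0) -> Iterator[bytes]:
    """Yield the raw content of each results page, pausing between requests.

    Hypothetical helper: get_url builds the URL for a page number,
    fetch downloads it. The delay value is an assumption, adjust as needed.
    """
    for number in range(first_page, last_page + 1):
        if number > first_page:
            time.sleep(delay)  # wait before every request after the first
        yield fetch(get_url(number))


# With ao3-parser and requests installed, it could be wired up like this:
#   search = AO3P.Search("Original Work")
#   for content in iter_pages(lambda n: search.GetUrl(page=n),
#                             lambda url: requests.get(url).content):
#       page = AO3P.Page(content)
#       print(len(page.Works))
```

Keeping the fetcher as a parameter means you can swap in a shared session, retries, or caching without touching the loop.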
### Notes
`Params.Category.No_Category` is not recognized as a valid ID on AO3 and should not be used with `Search`.
Raw data
{
"_id": null,
"home_page": null,
"name": "ao3-parser",
"maintainer": null,
"docs_url": null,
"requires_python": ">=3.8",
"maintainer_email": null,
"keywords": "ao3, archiveofourown, archive of our own",
"author": "petak33",
"author_email": null,
"download_url": "https://files.pythonhosted.org/packages/bc/73/bd75ee3a1774a74bca6346346aa792d2406849692cf3ec85aeac2be71f37/ao3_parser-1.0.0.tar.gz",
"platform": null,
"description": "# AO3 Parser\r\nTools for parsing AO3 pages and creating urls based on requirements.\r\n\r\nMain advantage over similar packages is it's complete control over requests to AO3.\r\nInstead of handling requests on it's own, it shifts this to the user, giving more room for optimization.\r\nThe main bottleneck for anyone in need of collecting larger amounts of data.\r\n(Scraping data for AI training is discouraged)\r\n\r\nIf this is not what you're looking for, I'd recommend [ao3_api](https://github.com/ArmindoFlores/ao3_api) that handles requests on it's own.\r\n\r\n## Installation\r\n```bash\r\npip install ao3-parser\r\n```\r\n\r\n# Usage\r\nAn average user will find themselves using two main modules the most, `Search` and `Page`. \r\n\r\n## Search\r\nCommon example of using `Search` would look like this.\r\nJust like on AO3, pages are numbered from 1 and up.\r\n\r\n```python\r\nimport AO3Parser as AO3P\r\nfrom AO3Parser import Params\r\nfrom datetime import datetime\r\n\r\nsearch = AO3P.Search(\"Original Work\", Sort_by=Params.Sort.Kudos,\r\n Include_Ratings=[Params.Rating.General_Audiences],\r\n Words_From=1000, Words_To=1500,\r\n Date_From=datetime(2024, 6, 30))\r\nurl = search.GetUrl(page=1)\r\nprint(f\"URL: {url}\")\r\n```\r\n```\r\nURL: https://archiveofourown.org/works?commit=Sort+and+Filter&page=1&work_search%5Bsort_colum%5D=Kudos&tag_id=Original+Work&include_work_search%5Brating_ids%5D%5B%5D=10&work_search%5Bwords_from%5D=1000&work_search%5Bwords_to%5D=1500&work_search%5Bdate_from%5D=2024-06-30\r\n```\r\n\r\n## Page\r\n\r\n```python\r\nimport AO3Parser as AO3P\r\nimport requests\r\n\r\nsearch = AO3P.Search(\"Original Work\")\r\nurl = search.GetUrl()\r\npage_data = requests.get(url).content\r\n\r\npage = AO3P.Page(page_data)\r\nprint(f\"Total works: {page.Total_Works}\")\r\nprint(f\"Works on page: {len(page.Works)}\")\r\nprint(f\"Title of the first work: [{page.Works[0].Title}]\")\r\n```\r\n```\r\nTotal works: 282069\r\nWorks on page: 
20\r\nTitle of the first work: [Title Of This Work]\r\n```\r\n\r\n### Notes\r\n`Params.Category.No_Category` is not recognized as a valid ID on AO3 and should not be used with `Search`.\r\n",
"bugtrack_url": null,
"license": "MIT License Copyright (c) 2024 petak33 Permission is hereby granted, free of charge, to any person obtaining a copy of this software and associated documentation files (the \"Software\"), to deal in the Software without restriction, including without limitation the rights to use, copy, modify, merge, publish, distribute, sublicense, and/or sell copies of the Software, and to permit persons to whom the Software is furnished to do so, subject to the following conditions: The above copyright notice and this permission notice shall be included in all copies or substantial portions of the Software. THE SOFTWARE IS PROVIDED \"AS IS\", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE SOFTWARE.",
"summary": "Package for parsing AO3 pages and creating urls based on requirements.",
"version": "1.0.0",
"project_urls": {
"Homepage": "https://github.com/petak33/ao3-parser",
"Issues": "https://github.com/petak33/ao3-parser/issues"
},
"split_keywords": [
"ao3",
" archiveofourown",
" archive of our own"
],
"urls": [
{
"comment_text": "",
"digests": {
"blake2b_256": "fa37c297bc3758cdaf81f233bbd81199393f760b7eef1f8250444f347713a20f",
"md5": "a1496572ba6e0701235999b78d92f509",
"sha256": "9185589125e589a8b1c942926fd33e39810ab0a2fb737637f820131da00e0982"
},
"downloads": -1,
"filename": "ao3_parser-1.0.0-py3-none-any.whl",
"has_sig": false,
"md5_digest": "a1496572ba6e0701235999b78d92f509",
"packagetype": "bdist_wheel",
"python_version": "py3",
"requires_python": ">=3.8",
"size": 8312,
"upload_time": "2024-08-06T09:28:54",
"upload_time_iso_8601": "2024-08-06T09:28:54.440648Z",
"url": "https://files.pythonhosted.org/packages/fa/37/c297bc3758cdaf81f233bbd81199393f760b7eef1f8250444f347713a20f/ao3_parser-1.0.0-py3-none-any.whl",
"yanked": false,
"yanked_reason": null
},
{
"comment_text": "",
"digests": {
"blake2b_256": "bc73bd75ee3a1774a74bca6346346aa792d2406849692cf3ec85aeac2be71f37",
"md5": "c42a8ade39eb03d2d87072a5b862b186",
"sha256": "b9d71fc6e267bb8359fce1b868db482939cfb1cd092f33535d6babf269592578"
},
"downloads": -1,
"filename": "ao3_parser-1.0.0.tar.gz",
"has_sig": false,
"md5_digest": "c42a8ade39eb03d2d87072a5b862b186",
"packagetype": "sdist",
"python_version": "source",
"requires_python": ">=3.8",
"size": 7065,
"upload_time": "2024-08-06T09:28:56",
"upload_time_iso_8601": "2024-08-06T09:28:56.535310Z",
"url": "https://files.pythonhosted.org/packages/bc/73/bd75ee3a1774a74bca6346346aa792d2406849692cf3ec85aeac2be71f37/ao3_parser-1.0.0.tar.gz",
"yanked": false,
"yanked_reason": null
}
],
"upload_time": "2024-08-06 09:28:56",
"github": true,
"gitlab": false,
"bitbucket": false,
"codeberg": false,
"github_user": "petak33",
"github_project": "ao3-parser",
"travis_ci": false,
"coveralls": false,
"github_actions": false,
"requirements": [
{
"name": "BeautifulSoup4",
"specs": []
}
],
"lcname": "ao3-parser"
}