Name | soupsorcery |
Version | 0.1.5 |
home_page | |
Summary | Package for advanced web scraping with BeautifulSoup |
upload_time | 2024-02-12 11:18:34 |
maintainer | |
docs_url | None |
author | sewcio543 |
requires_python | >=3.9 |
license | MIT License Copyright (c) 2024 Wojciech Seweryn Permission is hereby granted, free of charge, to any person obtaining a copy of this software and associated documentation files (the "Software"), to deal in the Software without restriction, including without limitation the rights to use, copy, modify, merge, publish, distribute, sublicense, and/or sell copies of the Software, and to permit persons to whom the Software is furnished to do so, subject to the following conditions: The above copyright notice and this permission notice shall be included in all copies or substantial portions of the Software. THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE SOFTWARE. |
keywords | web-scraping, html, soup, bs4, markup |
VCS | |
bugtrack_url | |
requirements | No requirements were recorded. |
Travis-CI | No Travis. |
coveralls test coverage | No coveralls. |
[![!pypi](https://img.shields.io/pypi/v/soupsavvy?color=orange)](https://pypi.org/project/soupsavvy/)
[![!python-versions](https://img.shields.io/pypi/pyversions/soupsavvy)](https://www.python.org/)
## Testing
![example workflow](https://github.com/sewcio543/test/actions/workflows/tests.yml/badge.svg)
## Code Quality
![Build](https://github.com/sewcio543/test/actions/workflows/build_package.yml/badge.svg)
![example workflow](https://github.com/sewcio543/test/actions/workflows/linting.yml/badge.svg)
[![Linter: flake8](https://img.shields.io/badge/flake8-checked-blueviolet)](https://github.com/PyCQA/flake8)
![example workflow](https://github.com/sewcio543/test/actions/workflows/formatting.yml/badge.svg)
[![Code style: black](https://img.shields.io/badge/code%20style-black-000000.svg)](https://github.com/psf/black)
![example workflow](https://github.com/sewcio543/test/actions/workflows/type_checking.yml/badge.svg)
[![Checked with mypy](http://www.mypy-lang.org/static/mypy_badge.svg)](http://mypy-lang.org/)
## SoupSavvy
SoupSavvy is a library designed to make web scraping tasks more efficient and manageable. Automating web scraping can be a thankless and time-consuming job. SoupSavvy builds on the <a href="https://www.crummy.com/software/BeautifulSoup/">BeautifulSoup</a> library, enabling developers to create more complex workflows and advanced searches with ease.
## Key Features
- **Automated Web Scraping**: SoupSavvy simplifies the process of web scraping by providing intuitive interfaces and tools for automating tasks.
- **Complex Workflows**: With SoupSavvy, developers can create complex scraping workflows effortlessly, allowing for more intricate data extraction.
- **Advanced Searches**: SoupSavvy extends BeautifulSoup's capabilities by offering advanced search options, enabling users to find and extract specific elements from HTML markup with precision.
- **Clear Type Hinting**: The library offers clear and concise type hinting throughout its API, enhancing code readability and maintainability.
- **Productionalize Scraping Code**: By providing structured workflows and error handling mechanisms, SoupSavvy facilitates the productionalization of scraping code, making it easier to integrate into larger projects and pipelines.
## Getting Started
### Installation
SoupSavvy is published on PyPI, and the latest stable version can be installed with pip:
```bash
pip install soupsavvy
```
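```python
from soupsavvy import ElementTag, AttributeTag, PatternElementTag
from bs4 import BeautifulSoup

# Sample markup to search
text = """
    <div href="github">
        <a class="github/settings" href="github.com"></a>
        <a id="github pages"></a>
        <a href="github "></a>
    </div>
"""
# Name the parser explicitly so bs4 does not have to guess one
markup = BeautifulSoup(text, "html.parser")

# Describe an <a> element by tag name and attribute conditions
tag = ElementTag(
    tag="a",
    attributes=[
        AttributeTag(name="href", value="github", re=True),
        AttributeTag(name="class", value="settings"),
    ],
)
# Search the markup for the first match, then for all matches
tag.find(markup)
tag.find_all(markup)
```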
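For comparison, here is a minimal sketch of a similar search written against plain BeautifulSoup. It does not show how SoupSavvy works internally. It assumes, without confirmation from this README, that all listed attribute conditions must hold at once and that `re=True` means a regular-expression search. The variable names are illustrative only.

```python
import re

from bs4 import BeautifulSoup

html = """
<div href="github">
    <a class="github/settings" href="github.com"></a>
    <a id="github pages"></a>
    <a href="github "></a>
</div>
"""
markup = BeautifulSoup(html, "html.parser")

# Regex conditions on both attributes: an <a> must satisfy all of them.
regex_matches = markup.find_all(
    "a",
    attrs={"href": re.compile("github"), "class": re.compile("settings")},
)

# A plain string for class must equal a whole class token, so
# "settings" does not match the token "github/settings".
exact_matches = markup.find_all(
    "a",
    attrs={"href": re.compile("github"), "class": "settings"},
)

print(len(regex_matches))  # 1
print(len(exact_matches))  # 0
```

In plain BeautifulSoup, then, whether a value is given as a string or as a compiled pattern changes the result for the same markup.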
## Contributing
If you'd like to contribute to SoupSavvy, feel free to check out the [GitHub repository](https://github.com/sewcio543/soupsavvy) and submit pull requests. Any feedback, bug reports, or feature requests are welcome!
## License
SoupSavvy is licensed under the [MIT License](https://opensource.org/licenses/MIT), allowing for both personal and commercial use. See the `LICENSE` file for more information.
## Acknowledgements
SoupSavvy is built upon the foundation of the excellent BeautifulSoup. We extend our gratitude to the developers and contributors of this project for their invaluable contributions to the Python community and for making our lives a lot easier!
---
Make your soup even more beautiful and savvier!
Happy scraping! 🍲✨
from soup to nuts
soup sandwich
be duck soup
Raw data
{
"_id": null,
"home_page": "",
"name": "soupsorcery",
"maintainer": "",
"docs_url": null,
"requires_python": ">=3.9",
"maintainer_email": "",
"keywords": "web-scraping,html,soup,bs4,markup",
"author": "sewcio543",
"author_email": "",
"download_url": "https://files.pythonhosted.org/packages/74/e4/1a4e18049da5f0017b77ddf73e2373a9fe057e15040c79349c1dabf81720/soupsorcery-0.1.5.tar.gz",
"platform": null,
"description": "[![!pypi](https://img.shields.io/pypi/v/soupsavvy?color=orange)](https://pypi.org/project/soupsavvy/)\n[![!python-versions](https://img.shields.io/pypi/pyversions/soupsavvy)](https://www.python.org/)\n\n## Testing\n\n![example workflow](https://github.com/sewcio543/test/actions/workflows/tests.yml/badge.svg)\n\n## Code Quality\n\n![Build](https://github.com/sewcio543/test/actions/workflows/build_package.yml/badge.svg)\n\n![example workflow](https://github.com/sewcio543/test/actions/workflows/linting.yml/badge.svg)\n[![Linter: flake8](https://img.shields.io/badge/flake8-checked-blueviolet)](https://github.com/PyCQA/flake8)\n\n![example workflow](https://github.com/sewcio543/test/actions/workflows/formatting.yml/badge.svg)\n[![Code style: black](https://img.shields.io/badge/code%20style-black-000000.svg)](https://github.com/psf/black)\n\n![example workflow](https://github.com/sewcio543/test/actions/workflows/type_checking.yml/badge.svg)\n[![Checked with mypy](http://www.mypy-lang.org/static/mypy_badge.svg)](http://mypy-lang.org/)\n\n## SoupSavvy\n\nSoupSavvy is a library designed to make web scraping tasks more efficient and manageable. Automating web scraping can be a thankless and time-consuming job. 
SoupSavvy builds around <a href=\"https://www.crummy.com/software/BeautifulSoup/\">BeautifulSoup</a> library enabling developers to create more complex workflows and advanced searches with ease.\n\n## Key Features\n\n- **Automated Web Scraping**: SoupSavvy simplifies the process of web scraping by providing intuitive interfaces and tools for automating tasks.\n\n- **Complex Workflows**: With SoupSavvy, developers can create complex scraping workflows effortlessly, allowing for more intricate data extraction.\n\n- **Advanced Searches**: SoupSavvy extends BeautifulSoup's capabilities by offering advanced search options, enabling users to find and extract specific elements from HTML markup with precision.\n\n- **Clear Type Hinting**: The library offers clear and concise type hinting throughout its API, enhancing code readability and maintainability.\n\n- **Productionalize Scraping Code**: By providing structured workflows and error handling mechanisms, SoupSavvy facilitates the productionalization of scraping code, making it easier to integrate into larger projects and pipelines.\n\n## Getting Started\n\n### Installation\n\n SoupSavvy is published on PyPi and latest stable package version can be installed via pip, simply using the following command:\n\n```bash\npip install soupsavvy\n```\n\n```python\nfrom soupsavvy import ElementTag, AttributeTag, PatternElementTag\nfrom bs4 import BeautifulSoup\n\ntext = \"\"\"\n <div href=\"github\">\n <a class=\"github/settings\", href=\"github.com\"></a>\n <a id=\"github pages\"></a>\n <a href=\"github \"></a>\n </div>\n\"\"\"\nmarkup = BeautifulSoup(text)\ntag = ElementTag(\n tag=\"a\",\n attributes=[\n AttributeTag(name=\"href\", value=\"github\", re=True),\n AttributeTag(name=\"class\", value=\"settings\")\n ]\n)\ntag.find(markup)\ntag.find_all(markup)\n```\n\n## Contributing\n\nIf you'd like to contribute to SoupSavvy, feel free to check out the [GitHub repository](https://github.com/sewcio543/soupsavvy) and submit pull 
requests. Any feedback, bug reports, or feature requests are welcome!\n\n## License\n\nSoupSavvy is licensed under the [MIT License](https://opensource.org/licenses/MIT), allowing for both personal and commercial use. See the `LICENSE` file for more information.\n\n## Acknowledgements\n\nSoupSavvy is built upon the foundation of excellent BeautifulSoup. We extend our gratitude to the developers and contributors of this projects for their invaluable contributions to the Python community and making our life a lot easier!\n\n---\n\nMake your soup even more beautiful and savvier!\nHappy scraping! \ud83c\udf72\u2728\n\nfrom soup to nuts\nsoup sandwich\nbe duck soup\n",
"bugtrack_url": null,
"license": "MIT License Copyright (c) 2024 Wojciech Seweryn Permission is hereby granted, free of charge, to any person obtaining a copy of this software and associated documentation files (the \"Software\"), to deal in the Software without restriction, including without limitation the rights to use, copy, modify, merge, publish, distribute, sublicense, and/or sell copies of the Software, and to permit persons to whom the Software is furnished to do so, subject to the following conditions: The above copyright notice and this permission notice shall be included in all copies or substantial portions of the Software. THE SOFTWARE IS PROVIDED \"AS IS\", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE SOFTWARE. ",
"summary": "Package for advanced web scraping with BeautifulSoup",
"version": "0.1.5",
"project_urls": {
"source": "https://github.com/sewcio543/soupsaavy"
},
"split_keywords": [
"web-scraping",
"html",
"soup",
"bs4",
"markup"
],
"urls": [
{
"comment_text": "",
"digests": {
"blake2b_256": "bf09643bab1db2942312dce96f954811798d500001f45b4bdf66a585276e9864",
"md5": "710b219066df45ecb24eebb3ba590ea5",
"sha256": "2941b489bc28093f568094b460382f4bbeecfb0cc8a6ab08d84422394c607c90"
},
"downloads": -1,
"filename": "soupsorcery-0.1.5-py3-none-any.whl",
"has_sig": false,
"md5_digest": "710b219066df45ecb24eebb3ba590ea5",
"packagetype": "bdist_wheel",
"python_version": "py3",
"requires_python": ">=3.9",
"size": 12480,
"upload_time": "2024-02-12T11:18:33",
"upload_time_iso_8601": "2024-02-12T11:18:33.496232Z",
"url": "https://files.pythonhosted.org/packages/bf/09/643bab1db2942312dce96f954811798d500001f45b4bdf66a585276e9864/soupsorcery-0.1.5-py3-none-any.whl",
"yanked": false,
"yanked_reason": null
},
{
"comment_text": "",
"digests": {
"blake2b_256": "74e41a4e18049da5f0017b77ddf73e2373a9fe057e15040c79349c1dabf81720",
"md5": "1c078d24e010f827154b58f2404fbed7",
"sha256": "394a3f59da41a9a8d21c03153e4d666a55eacccde5f9f41e5898bbc8686c0af8"
},
"downloads": -1,
"filename": "soupsorcery-0.1.5.tar.gz",
"has_sig": false,
"md5_digest": "1c078d24e010f827154b58f2404fbed7",
"packagetype": "sdist",
"python_version": "source",
"requires_python": ">=3.9",
"size": 12674,
"upload_time": "2024-02-12T11:18:34",
"upload_time_iso_8601": "2024-02-12T11:18:34.654469Z",
"url": "https://files.pythonhosted.org/packages/74/e4/1a4e18049da5f0017b77ddf73e2373a9fe057e15040c79349c1dabf81720/soupsorcery-0.1.5.tar.gz",
"yanked": false,
"yanked_reason": null
}
],
"upload_time": "2024-02-12 11:18:34",
"github": true,
"gitlab": false,
"bitbucket": false,
"codeberg": false,
"github_user": "sewcio543",
"github_project": "soupsaavy",
"github_not_found": true,
"lcname": "soupsorcery"
}
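The `digests` entries in the record above can be used to check a downloaded file. The following is a minimal sketch using only the standard library. The helper names `sha256_of` and `matches_record` are hypothetical and not part of any package mentioned here. The local file path is an assumption about where the archive was saved.

```python
import hashlib


def sha256_of(path, chunk_size=65536):
    """Return the hex SHA-256 digest of the file at `path`, read in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as fh:
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_record(path, expected_sha256):
    """Compare a file's digest with the `sha256` value from the record."""
    return sha256_of(path) == expected_sha256.lower()
```

For example, `matches_record("soupsorcery-0.1.5.tar.gz", "394a3f59da41a9a8d21c03153e4d666a55eacccde5f9f41e5898bbc8686c0af8")` should return `True` only if the local file is byte-for-byte identical to the uploaded sdist.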