Name | spider4 |
Version | 4.0.1 |
home_page | None |
Summary | Screen-scraping library |
upload_time | 2024-06-12 14:27:25 |
maintainer | None |
docs_url | None |
author | None |
requires_python | >=3.6.0 |
license | MIT License |
keywords | html, xml, parse, soup |
VCS | |
bugtrack_url | |
requirements | No requirements were recorded. |
Travis-CI | No Travis. |
coveralls test coverage | No coveralls. |
Spider is a library that makes it easy to scrape information
from web pages. It sits atop an HTML or XML parser, providing Pythonic
idioms for iterating, searching, and modifying the parse tree.
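
To make "sits atop an HTML or XML parser" concrete, here is a minimal sketch of a thin layer over the standard library's `html.parser`. It is not Spider's implementation, and `TextCollector` is a hypothetical name used only for illustration. It shows the general idea: the parser emits events, and the layer above turns them into something that can be searched.

```python
from html.parser import HTMLParser


class TextCollector(HTMLParser):
    """Hypothetical helper: records each text node with its chain of open tags."""

    def __init__(self):
        super().__init__()
        self.open_tags = []  # tags opened and not yet closed
        self.texts = []      # (tag path, text) pairs in document order

    def handle_starttag(self, tag, attrs):
        self.open_tags.append(tag)

    def handle_endtag(self, tag):
        # Pop back to the matching open tag, tolerating unclosed children.
        if tag in self.open_tags:
            while self.open_tags.pop() != tag:
                pass

    def handle_data(self, data):
        text = data.strip()
        if text:
            self.texts.append(("/".join(self.open_tags), text))


collector = TextCollector()
collector.feed("<p>Some<b>bad<i>HTML")
collector.close()
print(collector.texts)
# [('p', 'Some'), ('p/b', 'bad'), ('p/b/i', 'HTML')]
```

A full library would build a navigable tree from these events rather than a flat list.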
# Quick start
```
>>> from spider import Spider
>>> sp = Spider("<p>Some<b>bad<i>HTML")
>>> print(sp.prettify())
<html>
 <body>
  <p>
   Some
   <b>
    bad
    <i>
     HTML
    </i>
   </b>
  </p>
 </body>
</html>
>>> sp.find(text="bad")
'bad'
>>> sp.i
<i>HTML</i>
#
>>> sp = Spider("<tag1>Some<tag2/>bad<tag3>XML", "xml")
#
>>> print(sp.prettify())
<?xml version="1.0" encoding="utf-8"?>
<tag1>
 Some
 <tag2/>
 bad
 <tag3>
  XML
 </tag3>
</tag1>
```
# Running the unit tests
Spider supports unit test discovery with pytest:
```
$ pytest
```
Raw data
{
"_id": null,
"home_page": null,
"name": "spider4",
"maintainer": null,
"docs_url": null,
"requires_python": ">=3.6.0",
"maintainer_email": null,
"keywords": "HTML, XML, parse, soup",
"author": null,
"author_email": "Shariful Alam <2ashariful@gmail.com>",
"download_url": "https://files.pythonhosted.org/packages/70/6b/3e13a72e4cb1e8aa9c478e4878200622720f3d4aa830f8b4d1b7bbcbdafe/spider4-4.0.1.tar.gz",
"platform": null,
"description": "Spider is a library that makes it easy to scrape information\nfrom web pages. It sits atop an HTML or XML parser, providing Pythonic\nidioms for iterating, searching, and modifying the parse tree.\n\n# Quick start\n\n```\n>>> from spider import Spider\n>>> sp = Spider(\"<p>Some<b>bad<i>HTML\")\n>>> print(sp.prettify())\n<html>\n <body>\n <p>\n Some\n <b>\n bad\n <i>\n HTML\n </i>\n </b>\n </p>\n </body>\n</html>\n>>> sp.find(text=\"bad\")\n'bad'\n>>> sp.i\n<i>HTML</i>\n#\n>>> sp = Spider(\"<tag1>Some<tag2/>bad<tag3>XML\", \"xml\")\n#\n>>> print(sp.prettify())\n<?xml version=\"1.0\" encoding=\"utf-8\"?>\n<tag1>\n Some\n <tag2/>\n bad\n <tag3>\n XML\n </tag3>\n</tag1>\n```\n\n\n# Running the unit tests\n\nSpider supports unit test discovery using Pytest:\n\n```\n$ pytest\n```\n\n",
"bugtrack_url": null,
"license": "MIT License",
"summary": "Screen-scraping library",
"version": "4.0.1",
"project_urls": {
"Download": "https://github.com/shari-ful/spider.git",
"Homepage": "https://github.com/shari-ful/spider.git"
},
"split_keywords": [
"html",
" xml",
" parse",
" soup"
],
"urls": [
{
"comment_text": "",
"digests": {
"blake2b_256": "4574d823a2dc1c758c1c6376fd241ec0d2f415879a598b028ad67d616795677f",
"md5": "f7228789ffeb5901d3f0b0748725f5ff",
"sha256": "5f48e253db9c1bc5fd69aeca1eb36fa0eee4fb38e519baa38a46dd8cb5cb38aa"
},
"downloads": -1,
"filename": "spider4-4.0.1-py3-none-any.whl",
"has_sig": false,
"md5_digest": "f7228789ffeb5901d3f0b0748725f5ff",
"packagetype": "bdist_wheel",
"python_version": "py3",
"requires_python": ">=3.6.0",
"size": 145656,
"upload_time": "2024-06-12T14:27:22",
"upload_time_iso_8601": "2024-06-12T14:27:22.637714Z",
"url": "https://files.pythonhosted.org/packages/45/74/d823a2dc1c758c1c6376fd241ec0d2f415879a598b028ad67d616795677f/spider4-4.0.1-py3-none-any.whl",
"yanked": false,
"yanked_reason": null
},
{
"comment_text": "",
"digests": {
"blake2b_256": "706b3e13a72e4cb1e8aa9c478e4878200622720f3d4aa830f8b4d1b7bbcbdafe",
"md5": "1194aafa09650196b39cbd1ef4b69796",
"sha256": "230da9e2aa587da1d007fcef045da5db729f8233247bcb0c183ee43f044ed077"
},
"downloads": -1,
"filename": "spider4-4.0.1.tar.gz",
"has_sig": false,
"md5_digest": "1194aafa09650196b39cbd1ef4b69796",
"packagetype": "sdist",
"python_version": "source",
"requires_python": ">=3.6.0",
"size": 124094,
"upload_time": "2024-06-12T14:27:25",
"upload_time_iso_8601": "2024-06-12T14:27:25.709567Z",
"url": "https://files.pythonhosted.org/packages/70/6b/3e13a72e4cb1e8aa9c478e4878200622720f3d4aa830f8b4d1b7bbcbdafe/spider4-4.0.1.tar.gz",
"yanked": false,
"yanked_reason": null
}
],
"upload_time": "2024-06-12 14:27:25",
"github": true,
"gitlab": false,
"bitbucket": false,
"codeberg": false,
"github_user": "shari-ful",
"github_project": "spider",
"travis_ci": false,
"coveralls": false,
"github_actions": false,
"lcname": "spider4"
}
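
The `digests` entries in the raw data can be used to check a downloaded archive before installing it. The following is a minimal sketch using only the standard library. The helper names `sha256_of` and `matches_digest` are hypothetical, and the local file path is an assumption about where the archive was saved.

```python
import hashlib
from pathlib import Path


def sha256_of(path, chunk_size=65536):
    """Return the hex SHA-256 digest of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as fh:
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_digest(path, expected_hex):
    """Compare a file's SHA-256 digest against an expected hex string."""
    return sha256_of(path) == expected_hex.strip().lower()


# sha256 of the sdist, copied from the "digests" entry in the raw data above.
EXPECTED_SDIST_SHA256 = (
    "230da9e2aa587da1d007fcef045da5db729f8233247bcb0c183ee43f044ed077"
)

# Assumed location of the downloaded archive; adjust as needed.
archive = Path("spider4-4.0.1.tar.gz")
if archive.exists():
    print("digest ok:", matches_digest(archive, EXPECTED_SDIST_SHA256))
```

Reading in chunks keeps memory use flat regardless of archive size.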