Name | reliq |
Version | 0.0.26 |
home_page | None |
Summary | Python ctypes bindings for reliq |
upload_time | 2024-10-05 12:32:03 |
maintainer | None |
docs_url | None |
author | None |
requires_python | >=3.7 |
license | GPLv3 |
keywords | ctypes, html, parser, text-processing |
VCS | |
bugtrack_url | |
requirements | No requirements were recorded. |
Travis-CI | No Travis. |
coveralls test coverage | No coveralls. |
# reliq-python
A Python module for the [reliq](https://github.com/TUVIMEN/reliq) library.
## Requirements
- [reliq](https://github.com/TUVIMEN/reliq)
## Installation
    pip install reliq
## Import
    from reliq import reliq
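Before importing, it can help to confirm that the module is actually installed in the current environment. The snippet below is a minimal sketch using only the standard library; the helper name `reliq_available` is hypothetical and not part of reliq.

```python
import importlib.util


def reliq_available():
    """Return True if a top-level module named 'reliq' can be found.

    Hypothetical helper for illustration; it only looks the module up
    and does not import or load it.
    """
    return importlib.util.find_spec("reliq") is not None


if reliq_available():
    print("reliq is importable")
else:
    print("reliq not found; try: pip install reliq")
```

`find_spec` only locates the module, so this check does not tell you whether the bundled C library loads correctly.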
## Usage
```python
import json
import re

from reliq import reliq, ReliqError

html = ""
with open('index.html', 'r') as f:
    html = f.read()

rq = reliq(html)  # parse html
expr = reliq.expr(r"""
    div .user; {
        a href; {
            .name * l@[0] | "%i"
            .link * l@[0] | "%(href)v"
        },
        .score.u span .score,
        .info dl; {
            .key dt | "%i",
            .value dd | "%i"
        },
        .achievements.a li class=b>"achievement-" | "%i\n"
    }
""")  # expressions can be compiled
users = []
links = []
images = []
# filter()
#   returns an object holding a list of results; the object
#   behaves like an array and can be converted to an array with
#       self()        - objects with lvl() 0
#       children()    - objects with lvl() 1
#       descendants() - objects with lvl() > 0
#       full()        - same as indexing filter(), all objects
for i in rq.filter(r'table; tr').self()[:-2]:
    # "i" has a set of methods for getting its properties:
    #   tag()         tag name
    #   insides()     string containing contents inside tag
    #   desc_count()  count of descendants
    #   lvl()         level in html structure
    #   attribsl()    number of attributes
    #   attribs()     returns dictionary of attributes

    if i.child_count() < 3 and i[0].tag() == "div":
        continue

    # objects can be indexed like an array; the result is the same
    # as the array returned by the descendants() method
    link = i[5].attribs()['href']
    if re.match('^https://$', link):
        links.append(link)
        continue

    # search() returns str; here the expression is already compiled
    user = json.loads(i.search(expr))
    users.append(user)
# reliq objects have a __str__ method
# get_data() returns the data from which the html structure has been compiled

# if the second argument of filter() is True, the returned
# object will use independent data, allowing the garbage collector
# to free the previous unused data

# fsearch()
#   executes the expression while parsing, which saves memory; because
#   of that it supports only chain expressions, i.e. use of
#   grouping brackets or separating commas will throw an exception
for i in reliq.fsearch(r'ul; img src | "%(src)v\n"', html).split('\n')[:-1]:
    images.append(i)

try:  # handle errors
    reliq.fsearch('p / /', '<p></p>')
except ReliqError:
    print("error")
```
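Both `search()` and `fsearch()` return plain strings, so the post-processing in the example above is ordinary Python. The sketch below shows that step on its own, without reliq, using hypothetical sample strings. The exact output shape depends on the expression, so the sample values, the field names, and the helper `split_records` are assumptions for illustration only.

```python
import json

# Hypothetical sample of what fsearch() might return for an expression
# whose format ends in "\n": one value per line, each line terminated.
sample_fsearch_output = "a.png\nb.png\n"


def split_records(output):
    """Split newline-terminated output into a list of records.

    Because every record ends with "\n", str.split() yields a trailing
    empty string; the [:-1] slice drops it.
    """
    return output.split("\n")[:-1]


images = split_records(sample_fsearch_output)

# Hypothetical sample of a JSON string such as search() might return
# for an expression with named fields (field names are assumed here).
sample_search_output = '{"name": "alice", "score": 12, "achievements": ["first post"]}'
user = json.loads(sample_search_output)

print(images)
print(user["name"], user["score"])
```

Note that the `[:-1]` slice assumes the output ends with a newline; if the last record is not newline-terminated, the slice would drop real data.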
## Projects using reliq
- [forumscraper](https://github.com/TUVIMEN/forumscraper)
Raw data
{
"_id": null,
"home_page": null,
"name": "reliq",
"maintainer": null,
"docs_url": null,
"requires_python": ">=3.7",
"maintainer_email": null,
"keywords": "ctypes, html, parser, text-processing",
"author": null,
"author_email": "Dominik Stanis\u0142aw Suchora <suchora.dominik7@gmail.com>",
"download_url": "https://files.pythonhosted.org/packages/d8/cf/8757d7e1ebd0f9645c07e67850d5ac7301cfd62de8618a3c653b8b7f0000/reliq-0.0.26.tar.gz",
"platform": null,
"description": "# reliq-python\n\nA python module for [reliq](https://github.com/TUVIMEN/reliq) library.\n\n## Requirements\n\n- [reliq](https://github.com/TUVIMEN/reliq)\n\n## Installation\n\n pip install reliq\n\n## Import\n\n from reliq import reliq\n\n## Usage\n\n```python\nfrom reliq import reliq, ReliqError\n\nhtml = \"\"\nwith open('index.html','r') as f:\n html = f.read()\n\nrq = reliq(html) #parse html\nexpr = reliq.expr(r\"\"\"\n div .user; {\n a href; {\n .name * l@[0] | \"%i\"\n .link * l@[0] | \"%(href)v\"\n },\n .score.u span .score,\n .info dl; {\n .key dt | \"%i\",\n .value dd | \"%i\"\n },\n .achievements.a li class=b>\"achievement-\" | \"%i\\n\"\n }\n\"\"\") #expressions can be compiled\n\nusers = []\nlinks = []\nimages = []\n\n#filter()\n# returns object holding list of results such object\n# behaves like an array, but can be converted to array with\n# self() - objects with lvl() 0\n# children() - objects with lvl() 1\n# descendants() - objects with lvl > 0\n# full() - same as indexing filter(), all objects\n\nfor i in rq.filter(r'table; tr').self()[:-2]:\n #\"i\"\n # It has a set of functions for getting its properties:\n # tag() tag name\n # insides() string containing contents inside tag\n # desc_count() count of descendants\n # lvl() level in html structure\n # attribsl() number of attributes\n # attribs() returns dictionary of attributes\n\n if i.child_count() < 3 and i[0].tag() == \"div\":\n continue\n\n #objects can be accessed as an array which is the same\n #as array returned by descendants() method\n link = i[5].attribs()['href']\n if re.match('^https://$',href):\n links.append(link)\n continue\n\n #search() returns str, in this case expression is already compiled\n user = json.loads(i.search(expr))\n users.append(user)\n\n#reliq objects have __str__ method\n#get_data() returns data from which the html structure has been compiled\n\n#if the second argument of filter() is True the returned\n#object will use independent data, allowing garbage collector\n#to free the previous unused data\n\n#fsearch()\n# executes expression at parsing saving memory, and because\n# of that it supports only chain expressions i.e use of\n# grouping brackets and separating commas will throw an exception\nfor i in reliq.fsearch(r'ul; img src | \"%(src)v\\n\"',html).split('\\n')[:-1]:\n images.append(i)\n\ntry: #handle errors\n reliq.fsearch('p / /','<p></p>')\nexcept ReliqError:\n print(\"error\")\n```\n## Projects using reliq\n\n- [forumscraper](https://github.com/TUVIMEN/forumscraper)\n",
"bugtrack_url": null,
"license": "GPLv3",
"summary": "Python ctypes bindings for reliq",
"version": "0.0.26",
"project_urls": {
"Homepage": "https://github.com/TUVIMEN/reliq-python"
},
"split_keywords": [
"ctypes",
" html",
" parser",
" text-processing"
],
"urls": [
{
"comment_text": "",
"digests": {
"blake2b_256": "73448a42c16694306a7868c0b9e89bbb2fc6e29ffc193bd39d63a31abb695855",
"md5": "82f2035e1d531aabc51155e66e91c575",
"sha256": "3f78c4d4ac108ad9759e036bf3d8cbf7b403f04bc398688e9855286f12217309"
},
"downloads": -1,
"filename": "reliq-0.0.26-cp310-cp310-manylinux2014_aarch64.whl",
"has_sig": false,
"md5_digest": "82f2035e1d531aabc51155e66e91c575",
"packagetype": "bdist_wheel",
"python_version": "cp310",
"requires_python": ">=3.7",
"size": 110833,
"upload_time": "2024-10-05T12:31:52",
"upload_time_iso_8601": "2024-10-05T12:31:52.917119Z",
"url": "https://files.pythonhosted.org/packages/73/44/8a42c16694306a7868c0b9e89bbb2fc6e29ffc193bd39d63a31abb695855/reliq-0.0.26-cp310-cp310-manylinux2014_aarch64.whl",
"yanked": false,
"yanked_reason": null
},
{
"comment_text": "",
"digests": {
"blake2b_256": "c1683cb608dbdcc007b89b02401c5fc2d6555e80c71304c108fa07fc34b28c62",
"md5": "49243927f50e4c7b9261fd58406fe1d6",
"sha256": "ea7c794ab519f10214213ffd4223a7c9fc0622e70cf795a5b358f886b2eb822d"
},
"downloads": -1,
"filename": "reliq-0.0.26-cp310-cp310-manylinux2014_armv7l.whl",
"has_sig": false,
"md5_digest": "49243927f50e4c7b9261fd58406fe1d6",
"packagetype": "bdist_wheel",
"python_version": "cp310",
"requires_python": ">=3.7",
"size": 90106,
"upload_time": "2024-10-05T12:31:54",
"upload_time_iso_8601": "2024-10-05T12:31:54.520107Z",
"url": "https://files.pythonhosted.org/packages/c1/68/3cb608dbdcc007b89b02401c5fc2d6555e80c71304c108fa07fc34b28c62/reliq-0.0.26-cp310-cp310-manylinux2014_armv7l.whl",
"yanked": false,
"yanked_reason": null
},
{
"comment_text": "",
"digests": {
"blake2b_256": "86870ae76fe83d456aa8684e997300d12fbd20a22430ed26693ac8d8b1f755df",
"md5": "f7b65c984b2a1a458e977207768962f2",
"sha256": "f9643ab02b00e3fabee06a63ade3b55c0ba100ac5940164e887a297328f4629f"
},
"downloads": -1,
"filename": "reliq-0.0.26-cp312-cp312-macosx_12_0_arm64.whl",
"has_sig": false,
"md5_digest": "f7b65c984b2a1a458e977207768962f2",
"packagetype": "bdist_wheel",
"python_version": "cp312",
"requires_python": ">=3.7",
"size": 90632,
"upload_time": "2024-10-05T12:31:56",
"upload_time_iso_8601": "2024-10-05T12:31:56.214652Z",
"url": "https://files.pythonhosted.org/packages/86/87/0ae76fe83d456aa8684e997300d12fbd20a22430ed26693ac8d8b1f755df/reliq-0.0.26-cp312-cp312-macosx_12_0_arm64.whl",
"yanked": false,
"yanked_reason": null
},
{
"comment_text": "",
"digests": {
"blake2b_256": "922a22a564d9b238435a925f889560c90cfc4cab769824eee24d1d9786edb71e",
"md5": "657f62cae15d85967a2c090d9e2a7d3a",
"sha256": "a144a3ae78c467da213f4477bf885103bec646312555f3f0877cbdaee552990a"
},
"downloads": -1,
"filename": "reliq-0.0.26-cp312-cp312-macosx_13_0_arm64.whl",
"has_sig": false,
"md5_digest": "657f62cae15d85967a2c090d9e2a7d3a",
"packagetype": "bdist_wheel",
"python_version": "cp312",
"requires_python": ">=3.7",
"size": 89594,
"upload_time": "2024-10-05T12:31:57",
"upload_time_iso_8601": "2024-10-05T12:31:57.858607Z",
"url": "https://files.pythonhosted.org/packages/92/2a/22a564d9b238435a925f889560c90cfc4cab769824eee24d1d9786edb71e/reliq-0.0.26-cp312-cp312-macosx_13_0_arm64.whl",
"yanked": false,
"yanked_reason": null
},
{
"comment_text": "",
"digests": {
"blake2b_256": "8abafec348465298bd2258d1525ab590beb1835687e190e6ea29bc70553b34dc",
"md5": "7010c8d347874d98a6cd03fef78fa15a",
"sha256": "c9963fec09cdd91765846a13bc20993d68c07be5c91ddac9559c436ea2860d5c"
},
"downloads": -1,
"filename": "reliq-0.0.26-cp312-cp312-manylinux2014_x86_64.whl",
"has_sig": false,
"md5_digest": "7010c8d347874d98a6cd03fef78fa15a",
"packagetype": "bdist_wheel",
"python_version": "cp312",
"requires_python": ">=3.7",
"size": 114241,
"upload_time": "2024-10-05T12:31:59",
"upload_time_iso_8601": "2024-10-05T12:31:59.921093Z",
"url": "https://files.pythonhosted.org/packages/8a/ba/fec348465298bd2258d1525ab590beb1835687e190e6ea29bc70553b34dc/reliq-0.0.26-cp312-cp312-manylinux2014_x86_64.whl",
"yanked": false,
"yanked_reason": null
},
{
"comment_text": "",
"digests": {
"blake2b_256": "94543b60f0da317467cafa12dee15029d248907608f8137eff5e1422c5359242",
"md5": "9e0dc2fa839ffbf8a291d49bea8812cc",
"sha256": "66c5e38d7023b2773ce043565c94161a456d9af3c9ad41349b9a10784ad95d27"
},
"downloads": -1,
"filename": "reliq-0.0.26-cp312-cp312-win_amd64.whl",
"has_sig": false,
"md5_digest": "9e0dc2fa839ffbf8a291d49bea8812cc",
"packagetype": "bdist_wheel",
"python_version": "cp312",
"requires_python": ">=3.7",
"size": 143602,
"upload_time": "2024-10-05T12:32:01",
"upload_time_iso_8601": "2024-10-05T12:32:01.593344Z",
"url": "https://files.pythonhosted.org/packages/94/54/3b60f0da317467cafa12dee15029d248907608f8137eff5e1422c5359242/reliq-0.0.26-cp312-cp312-win_amd64.whl",
"yanked": false,
"yanked_reason": null
},
{
"comment_text": "",
"digests": {
"blake2b_256": "d8cf8757d7e1ebd0f9645c07e67850d5ac7301cfd62de8618a3c653b8b7f0000",
"md5": "9538b8df2437b0c00481f0e79ff326a1",
"sha256": "df445473a1ef1ba832b9df8b2bd357a7cffd52db3859b4141856bba71d47eab4"
},
"downloads": -1,
"filename": "reliq-0.0.26.tar.gz",
"has_sig": false,
"md5_digest": "9538b8df2437b0c00481f0e79ff326a1",
"packagetype": "sdist",
"python_version": "source",
"requires_python": ">=3.7",
"size": 17619,
"upload_time": "2024-10-05T12:32:03",
"upload_time_iso_8601": "2024-10-05T12:32:03.225448Z",
"url": "https://files.pythonhosted.org/packages/d8/cf/8757d7e1ebd0f9645c07e67850d5ac7301cfd62de8618a3c653b8b7f0000/reliq-0.0.26.tar.gz",
"yanked": false,
"yanked_reason": null
}
],
"upload_time": "2024-10-05 12:32:03",
"github": true,
"gitlab": false,
"bitbucket": false,
"codeberg": false,
"github_user": "TUVIMEN",
"github_project": "reliq-python",
"travis_ci": false,
"coveralls": false,
"github_actions": true,
"lcname": "reliq"
}