easy-spider-tool-document


- Name: easy-spider-tool-document
- Version: 1.0.13
- Home page: https://easy-spider-tool-document.xink.top/
- Summary: optional XPath/JSONPath aggregate parsing extension package for easy-spider-tool
- Upload time: 2023-09-21 02:19:33
- Author: hanxinkong
- Requires Python: >=3.6.8
- License: MIT
- Keywords: easy, spider, tool, document
- Requirements: none recorded

# easy-spider-tool-document

An optional extension package for easy-spider-tool that provides aggregated XPath/JSONPath parsing.

## Installation

```shell
pip install easy-spider-tool[document]
```

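## Main features

- `data_extractor`: expression-based data extraction (supports JSONPath and XPath)
- `xpath`: extracts data with XPath syntax (supports returning only the first match and setting a default value)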
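
The "first match" and "default value" options can be illustrated with a small stand-alone sketch. The helper name `pick` and its exact behavior (returning `default` whenever nothing matched, regardless of `first`) are assumptions made for illustration; this is not the package's actual implementation.

```python
from typing import Any, List


def pick(matches: List[Any], first: bool = False, default: Any = None) -> Any:
    """Hypothetical helper mirroring the `first` / `default` options.

    - no matches      -> return `default`
    - first is True   -> return only the first match
    - first is False  -> return the full list of matches
    """
    if not matches:
        return default
    return matches[0] if first else matches


print(pick(["admin", "user"], first=True))   # admin
print(pick(["admin", "user"], first=False))  # ['admin', 'user']
print(repr(pick([], first=True, default="")))  # ''
```

Under this assumption, a caller can always treat the result as usable without checking for an empty list first.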

## Basic usage

```python
from easy_spider_tool_document import data_extractor

data = '<p>This is an example of the document extension for easy_spider_tool</p>'
print(data_extractor(data, ['//p//text()'], first=True, default=''))
# This is an example of the document extension for easy_spider_tool

data = {
    "code": 200,
    "data": [
        {
            "id": 1,
            "username": "admin",
            "level": "boss"
        },
        {
            "id": 2,
            "username": "user",
            "level": "staff"
        }
    ]
}

print(data_extractor(data, ['$.data[*].username'], first=False, default=''))
# ['admin', 'user']
```
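
To make the JSONPath expression in the example above concrete, here is a minimal pure-Python sketch that evaluates a tiny subset of JSONPath (`$`, `.key`, `[index]` and `[*]`). The function `tiny_jsonpath` is hypothetical and is not how `data_extractor` is implemented; it only shows which values an expression such as `$.data[*].username` selects.

```python
import re
from typing import Any, List

# One path step: either `.key` or `[index]` / `[*]`.
_STEP = re.compile(r"\.([A-Za-z_]\w*)|\[(\*|\d+)\]")


def tiny_jsonpath(data: Any, expr: str) -> List[Any]:
    """Evaluate a very small JSONPath subset (illustrative only)."""
    if not expr.startswith("$"):
        raise ValueError("expression must start with '$'")
    rest = expr[1:]
    nodes = [data]  # current set of matched nodes
    pos = 0
    while pos < len(rest):
        step = _STEP.match(rest, pos)
        if step is None:
            raise ValueError(f"unsupported syntax at: {rest[pos:]!r}")
        key, index = step.group(1), step.group(2)
        next_nodes: List[Any] = []
        for node in nodes:
            if key is not None:
                # `.key`: descend into a dict member if present
                if isinstance(node, dict) and key in node:
                    next_nodes.append(node[key])
            elif index == "*":
                # `[*]`: fan out over all list items or dict values
                if isinstance(node, list):
                    next_nodes.extend(node)
                elif isinstance(node, dict):
                    next_nodes.extend(node.values())
            else:
                # `[n]`: pick a single list item if it exists
                i = int(index)
                if isinstance(node, list) and i < len(node):
                    next_nodes.append(node[i])
        nodes = next_nodes
        pos = step.end()
    return nodes


sample = {
    "code": 200,
    "data": [
        {"id": 1, "username": "admin", "level": "boss"},
        {"id": 2, "username": "user", "level": "staff"},
    ],
}

print(tiny_jsonpath(sample, "$.data[*].username"))  # ['admin', 'user']
print(tiny_jsonpath(sample, "$.data[1].level"))     # ['staff']
```

A path that matches nothing, such as `$.missing`, yields an empty list here, which is the case where a default value becomes useful.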

## Links

GitHub: https://github.com/hanxinkong/easy-spider-tool-document

Online documentation: https://easy-spider-tool-document.xink.top/

## Notes


            

Raw data

{
    "_id": null,
    "home_page": "https://easy-spider-tool-document.xink.top/",
    "name": "easy-spider-tool-document",
    "maintainer": "",
    "docs_url": null,
    "requires_python": ">=3.6.8",
    "maintainer_email": "",
    "keywords": "easy,spider,tool,document",
    "author": "hanxinkong",
    "author_email": "xinkonghan@gmail.com",
    "download_url": "https://files.pythonhosted.org/packages/cc/c4/a5c79e72293655f3c83e5e6921a1699a3f3ebc37a78ce7e559b0160f39d7/easy-spider-tool-document-1.0.13.tar.gz",
    "platform": null,
    "bugtrack_url": null,
    "license": "MIT",
    "summary": "easy-spider-tool \u53ef\u9009xpath/jsonpath\u805a\u5408\u89e3\u6790\u6269\u5c55\u5305",
    "version": "1.0.13",
    "project_urls": {
        "Homepage": "https://easy-spider-tool-document.xink.top/"
    },
    "split_keywords": [
        "easy",
        "spider",
        "tool",
        "document"
    ],
    "urls": [
        {
            "comment_text": "",
            "digests": {
                "blake2b_256": "03e65c3d7aefc0e43a5dbbd12231938a67522e16cbdc43b6b96828a3d853999d",
                "md5": "fdb32d2f4568ad90d7f4ea621a08778a",
                "sha256": "a7f1dabd1d1524cac3a0e98b6a6a16406a3bf34fb410decf9137b1ec9080e051"
            },
            "downloads": -1,
            "filename": "easy_spider_tool_document-1.0.13-py3-none-any.whl",
            "has_sig": false,
            "md5_digest": "fdb32d2f4568ad90d7f4ea621a08778a",
            "packagetype": "bdist_wheel",
            "python_version": "py3",
            "requires_python": ">=3.6.8",
            "size": 4765,
            "upload_time": "2023-09-21T02:19:31",
            "upload_time_iso_8601": "2023-09-21T02:19:31.457189Z",
            "url": "https://files.pythonhosted.org/packages/03/e6/5c3d7aefc0e43a5dbbd12231938a67522e16cbdc43b6b96828a3d853999d/easy_spider_tool_document-1.0.13-py3-none-any.whl",
            "yanked": false,
            "yanked_reason": null
        },
        {
            "comment_text": "",
            "digests": {
                "blake2b_256": "ccc4a5c79e72293655f3c83e5e6921a1699a3f3ebc37a78ce7e559b0160f39d7",
                "md5": "80fd5870f941c76bf75b564a72c52d10",
                "sha256": "882295b48f25639bf3c36919f3ad860164c259dd718187df50c8257c268eb28b"
            },
            "downloads": -1,
            "filename": "easy-spider-tool-document-1.0.13.tar.gz",
            "has_sig": false,
            "md5_digest": "80fd5870f941c76bf75b564a72c52d10",
            "packagetype": "sdist",
            "python_version": "source",
            "requires_python": ">=3.6.8",
            "size": 3340,
            "upload_time": "2023-09-21T02:19:33",
            "upload_time_iso_8601": "2023-09-21T02:19:33.197504Z",
            "url": "https://files.pythonhosted.org/packages/cc/c4/a5c79e72293655f3c83e5e6921a1699a3f3ebc37a78ce7e559b0160f39d7/easy-spider-tool-document-1.0.13.tar.gz",
            "yanked": false,
            "yanked_reason": null
        }
    ],
    "upload_time": "2023-09-21 02:19:33",
    "github": false,
    "gitlab": false,
    "bitbucket": false,
    "codeberg": false,
    "lcname": "easy-spider-tool-document"
}
        