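# Introduction
`Bricks` aims to make crawler development as simple and enjoyable as building with blocks. The framework's core idea is to offer an intuitive, efficient way to build complex web crawlers while keeping the code concise and maintainable. Whether you are a beginner or an experienced developer, `Bricks` lets you assemble capable crawlers for anything from simple data scraping to complex web crawling.

With carefully designed interfaces and a modular structure, `Bricks` makes it easy to compose, extend, and maintain crawlers. You can quickly piece together a crawler structure that fits your needs, much like stacking blocks, without digging into low-level details, while still keeping the ability to customize and control its behavior. The goal is to make crawler development efficient and flexible rather than time-consuming and laborious.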
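# Features
`Bricks` offers the following features:
- **Event-driven, extensible crawlers**: once your crawler's main logic is defined, you can extend it through multiple event hooks (for example before and after a request, or before and after storage) without modifying the core code. This keeps the crawl flow clear, and the hook slots themselves are extensible.
- **A rich set of crawler base classes**: the built-in `air` crawler for pure-code development, the `form` crawler for configuration-driven custom flows, and the `template` crawler for configuration-driven fixed flows.
- **Rich parsers**: `json` / `xpath` / `jsonpath` / `regex` / custom; simple parsing requires zero code.
- **Rich downloaders**: the built-in downloader is currently `curl-cffi`, with optional `requests` / `requests-go` / `pycurl` / `Playwright` / `dp` / `httpx` / `tls_client` downloaders. Developers can also write their own by following the downloader specification.
- **Flexible scheduler**: the scheduler handles both synchronous and asynchronous tasks and automatically adjusts the number of `Worker`s to the current number of tasks (a scalable thread pool).
- **Multiple task queues**: built-in `Local` and `Redis` task queues support single-machine and distributed crawlers respectively. Developers can also write their own by following the queue specification.
- **Crawlers as APIs**: the built-in `rpc` and `listener` modes turn a crawler into a remotely callable `api` in one step, which makes it easy to call from external systems.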
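
The event-driven extension idea in the first feature can be illustrated with a small, framework-independent sketch. This is not the `Bricks` API: the class `MiniSpider`, the event names `before_request` / `after_request`, and the hook functions are all hypothetical names chosen for illustration, and the "download" step is a stand-in that performs no network access. It only shows how hooks registered on named events can change behavior without touching the core fetch logic.

```python
from collections import defaultdict
from typing import Callable, Dict, List

# Hypothetical event names; the real event constants in Bricks may differ.
BEFORE_REQUEST = "before_request"
AFTER_REQUEST = "after_request"


class MiniSpider:
    """Toy spider that runs registered hooks around a fake request step."""

    def __init__(self) -> None:
        self._hooks: Dict[str, List[Callable[[dict], None]]] = defaultdict(list)

    def on(self, event: str, hook: Callable[[dict], None]) -> None:
        """Register a hook to run whenever `event` fires."""
        self._hooks[event].append(hook)

    def _fire(self, event: str, context: dict) -> None:
        # Hooks run in registration order and may mutate the shared context.
        for hook in self._hooks[event]:
            hook(context)

    def fetch(self, url: str) -> dict:
        context = {"url": url, "headers": {}, "response": None}
        self._fire(BEFORE_REQUEST, context)
        # Stand-in for a real download; no network access happens here.
        context["response"] = f"<html>{context['url']}</html>"
        self._fire(AFTER_REQUEST, context)
        return context


def add_user_agent(context: dict) -> None:
    """Pre-request hook: attach a header before the (fake) download."""
    context["headers"]["User-Agent"] = "demo-agent"


def record_length(context: dict) -> None:
    """Post-request hook: store the size of the response body."""
    context["length"] = len(context["response"])


spider = MiniSpider()
spider.on(BEFORE_REQUEST, add_user_agent)
spider.on(AFTER_REQUEST, record_length)
result = spider.fetch("https://example.com")
print(result["headers"], result["length"])
```

The core `fetch` method never changes; new behavior is added only by registering more hooks, which is the property the feature list describes.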
# Installation
## Install the latest code
```
pip install -U git+https://github.com/KKKKKKKEM/bricks.git
```
## Install the stable release
```
pip install -U bricks-py
```
## Install the beta release
```
# All beta versions are published on test.pypi.org
pip install -i https://test.pypi.org/simple/ -U bricks-py
```
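
After any of the install commands above, it can be useful to confirm which version actually ended up in the environment. The following is a minimal sketch using only the standard library's `importlib.metadata` (available from Python 3.8, which matches the package's `requires_python`). The helper name `installed_version` is an assumption made for this example and is not part of `Bricks`.

```python
from importlib import metadata
from typing import Optional


def installed_version(dist_name: str) -> Optional[str]:
    """Return the installed version of a distribution, or None if absent."""
    try:
        return metadata.version(dist_name)
    except metadata.PackageNotFoundError:
        return None


# "bricks-py" is the distribution name used by the pip commands above.
version = installed_version("bricks-py")
print(version if version else "bricks-py is not installed")
```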
# Documentation
For full documentation, see [Bricks Docs](https://kkkkkkkem.vercel.app/bricks).
Raw data
{
"_id": null,
"home_page": "https://github.com/KKKKKKKEM/bricks.git",
"name": "bricks-py",
"maintainer": null,
"docs_url": null,
"requires_python": ">=3.8.0",
"maintainer_email": null,
"keywords": "bricks",
"author": "Kem",
"author_email": "531144129@qq.com",
"download_url": "https://files.pythonhosted.org/packages/de/a5/2d468de5c65ef73a4277afcb68b21ffde08c1b83b7f4eb3a9270dee73269/bricks_py-0.0.28.tar.gz",
"platform": null,
"description": "# \u7b80\u4ecb\n\n`Bricks` \u65e8\u5728\u5c06\u722c\u866b\u5f00\u53d1\u53d8\u5f97\u50cf\u642d\u5efa\u79ef\u6728\u4e00\u6837\u7b80\u5355\u800c\u6709\u8da3\u3002\u8fd9\u4e2a\u6846\u67b6\u7684\u6838\u5fc3\u7406\u5ff5\u662f\u63d0\u4f9b\u4e00\u4e2a\u76f4\u89c2\u3001\u9ad8\u6548\u7684\u65b9\u5f0f\u6765\u6784\u5efa\u590d\u6742\u7684\u7f51\u7edc\u722c\u866b\uff0c\u540c\u65f6\u4fdd\u6301\u4ee3\u7801\u7684\u7b80\u6d01\u548c\u53ef\u7ef4\u62a4\u6027\u3002\u65e0\u8bba\u60a8\u662f\u521a\u5165\u95e8\u7684\u65b0\u624b\u8fd8\u662f\u7ecf\u9a8c\u4e30\u5bcc\u7684\u4e13\u5bb6\uff0c`Bricks` \u90fd\u80fd\u8ba9\u60a8\u8f7b\u677e\u5730\u642d\u5efa\u8d77\u5f3a\u5927\u7684\u722c\u866b\uff0c\u6ee1\u8db3\u4ece\u7b80\u5355\u6570\u636e\u6293\u53d6\u5230\u590d\u6742\u7f51\u7edc\u722c\u53d6\u7684\u5404\u79cd\u9700\u6c42\u3002\n\n\u901a\u8fc7\u7cbe\u5fc3\u8bbe\u8ba1\u7684\u63a5\u53e3\u548c\u6a21\u5757\u5316\u7684\u7ed3\u6784\uff0c`Bricks` \u4f7f\u5f97\u7ec4\u5408\u3001\u6269\u5c55\u548c\u7ef4\u62a4\u722c\u866b\u53d8\u5f97\u524d\u6240\u672a\u6709\u7684\u5bb9\u6613\u3002\u60a8\u53ef\u4ee5\u50cf\u642d\u79ef\u6728\u4e00\u6837\uff0c\u5feb\u901f\u7ec4\u5408\u51fa\u9002\u5408\u60a8\u9700\u6c42\u7684\u722c\u866b\u7ed3\u6784\uff0c\u65e0\u9700\u6df1\u5165\u5e95\u5c42\u7ec6\u8282\uff0c\u540c\u65f6\u4e5f\u80fd\u4eab\u53d7\u5230\u5b9a\u5236\u5316\u548c\u63a7\u5236\u7684\u4e50\u8da3\u3002\u4f7f\u7528 `Bricks`\uff0c\u60a8\u5c06\u4f53\u9a8c\u5230\u65e0\u4e0e\u4f26\u6bd4\u7684\u5f00\u53d1\u6548\u7387\u548c\u7075\u6d3b\u6027\uff0c\u8ba9\u722c\u866b\u5f00\u53d1\u4e0d\u518d\u662f\u4e00\u4ef6\u8d39\u65f6\u8d39\u529b\u7684\u4efb\u52a1\u3002\n\n\n# \u7279\u6027\n\n`Bricks` \u62e5\u6709\u4ee5\u4e0b\u7279\u6027\n\n- 
**\u57fa\u4e8e\u4e8b\u4ef6\u89e6\u53d1\u7684\u53ef\u62d3\u5c55\u722c\u866b**\uff1a\u5728\u5b9a\u4e49\u597d\u81ea\u5df1\u722c\u866b\u4e3b\u4f53\u903b\u8f91\u7684\u60c5\u51b5\u4e0b\uff0c\u53ef\u4ee5\u4e0d\u4fee\u6539\u6838\u5fc3\u4ee3\u7801\uff0c\u5728\u8bf7\u6c42\u524d\u540e\uff0c\u5b58\u50a8\u524d\u540e\u7b49\u591a\u4e2a\u4e8b\u4ef6\u63a5\u53e3\u8fdb\u884c\u62d3\u5c55\uff0c\u8ba9\u722c\u866b\u6d41\u7a0b\u66f4\u52a0\u6e05\u6670\uff0c\u4e14\u63d2\u69fd\u4e5f\u53ef\u62d3\u5c55\n- **\u722c\u866b\u57fa\u7c7b\u4e30\u5bcc**\uff1a\u5185\u7f6e\u7eaf\u4ee3\u7801\u5f00\u53d1\u7684 `air` \u722c\u866b\u3001\u6d41\u7a0b\u5316\u81ea\u5b9a\u4e49\u914d\u7f6e\u5f0f\u7684 `form` \u722c\u866b\u3001\u56fa\u5b9a\u6d41\u7a0b\u914d\u7f6e\u5f0f\u7684 `template` \u722c\u866b\n- **\u4e30\u5bcc\u7684\u89e3\u6790\u5668**\uff1a\u5305\u62ec `json` / `xpath` / `jsonpath` / `regex` / \u81ea\u5b9a\u4e49\uff0c\u7b80\u5355\u89e3\u6790 0 \u4ee3\u7801\n- **\u4e30\u5bcc\u7684\u4e0b\u8f7d\u5668**\uff1a\u76ee\u524d\u5185\u7f6e\u7684\u4e0b\u8f7d\u5668\u4e3a `curl-cffi` \uff0c\u5e76\u4e14\u8fd8\u6709\u53ef\u9009\u7684 `requests` / `requests-go` / `pycurl` / `Playwright` / `dp` / `httpx` / `tls_client`\uff0c \u4e14\u5f00\u53d1\u8005\u53ef\u4ee5\u6839\u636e\u89c4\u8303\u81ea\u5df1\u5b9a\u5236\u62d3\u5c55\n- **\u7075\u6d3b\u7684\u8c03\u5ea6\u5668**\uff1a\u8c03\u5ea6\u5668\u652f\u6301\u5904\u7406\u540c\u6b65\u4efb\u52a1\u548c\u5f02\u6b65\u4efb\u52a1\uff0c\u5e76\u4e14\u652f\u6301\u6839\u636e\u5f53\u524d\u4efb\u52a1\u6570\u91cf\u81ea\u52a8\u8c03\u8282 `Worker` \u6570\u91cf\uff08\u53ef\u4f38\u7f29\u7ebf\u7a0b\u6c60\uff09\n- **\u591a\u79cd\u4efb\u52a1\u961f\u5217**\uff1a\u5185\u7f6e `Local` \u548c `Redis` \u4e24\u79cd\u4efb\u52a1\u961f\u5217\uff0c\u4ee5\u4fbf\u5e94\u7528\u5355\u673a\u548c\u5206\u5e03\u5f0f\u722c\u866b\uff0c\u4e14\u5f00\u53d1\u8005\u53ef\u4ee5\u6839\u636e\u89c4\u8303\u81ea\u5df1\u5b9a\u5236\u62d3\u5c55\n- **\u722c\u866bAPI\u5316**\uff1a\u5185\u7f6e`rpc`\u548c`listener` 
\u6a21\u5f0f\uff0c\u53ef\u4ee5\u5c06\u722c\u866b\u4e00\u952e\u8f6c\u5316\u4e3a\u53ef\u8fdc\u7a0b\u8c03\u7528\u7684 `api`\uff0c\u65b9\u4fbf\u5916\u90e8\u8c03\u7528 \n\n\n# \u5b89\u88c5\n## \u5b89\u88c5\u6700\u65b0\u4ee3\u7801\n```\npip install -U git+https://github.com/KKKKKKKEM/bricks.git\n```\n\n## \u5b89\u88c5\u6b63\u5f0f\u7248\n```\npip install -U bricks-py\n```\n\n## \u5b89\u88c5\u6d4b\u8bd5\u7248\n```python\n# beta \u7248\u672c\u5168\u90e8\u90fd\u53d1\u5e03\u5728 test.pypi.org\npip install -i https://test.pypi.org/simple/ -U bricks-py\n\n```\n\n# \u4f7f\u7528\u6587\u6863\n\u5177\u4f53\u6587\u6863\u8bf7\u67e5\u770b [Bricks Docs](https://kkkkkkkem.vercel.app/bricks)\n\n\n\n",
"bugtrack_url": null,
"license": "MIT",
"summary": "quickly build your crawler",
"version": "0.0.28",
"project_urls": {
"Homepage": "https://github.com/KKKKKKKEM/bricks.git"
},
"split_keywords": [
"bricks"
],
"urls": [
{
"comment_text": "",
"digests": {
"blake2b_256": "c0cd80163dad5980c09af42842fcaf60bf9910dfebc794886ea65bc6858faf2c",
"md5": "290ecf8ce64c1d5ad197c1e629d69d00",
"sha256": "698696da6d4dd8cfc370d61406de4ed2373425da5cdb5c1cbc450e465851b5aa"
},
"downloads": -1,
"filename": "bricks_py-0.0.28-py3-none-any.whl",
"has_sig": false,
"md5_digest": "290ecf8ce64c1d5ad197c1e629d69d00",
"packagetype": "bdist_wheel",
"python_version": "py3",
"requires_python": ">=3.8.0",
"size": 180340,
"upload_time": "2025-01-14T09:26:05",
"upload_time_iso_8601": "2025-01-14T09:26:05.586932Z",
"url": "https://files.pythonhosted.org/packages/c0/cd/80163dad5980c09af42842fcaf60bf9910dfebc794886ea65bc6858faf2c/bricks_py-0.0.28-py3-none-any.whl",
"yanked": false,
"yanked_reason": null
},
{
"comment_text": "",
"digests": {
"blake2b_256": "dea52d468de5c65ef73a4277afcb68b21ffde08c1b83b7f4eb3a9270dee73269",
"md5": "040821eaf3e2bf9a06f5bfdb6e8ce090",
"sha256": "7a892703637df9f8da3eb4ba3b8f8bb0d1d814c7fed378b00bedfec989f4ace7"
},
"downloads": -1,
"filename": "bricks_py-0.0.28.tar.gz",
"has_sig": false,
"md5_digest": "040821eaf3e2bf9a06f5bfdb6e8ce090",
"packagetype": "sdist",
"python_version": "source",
"requires_python": ">=3.8.0",
"size": 136577,
"upload_time": "2025-01-14T09:26:08",
"upload_time_iso_8601": "2025-01-14T09:26:08.003851Z",
"url": "https://files.pythonhosted.org/packages/de/a5/2d468de5c65ef73a4277afcb68b21ffde08c1b83b7f4eb3a9270dee73269/bricks_py-0.0.28.tar.gz",
"yanked": false,
"yanked_reason": null
}
],
"upload_time": "2025-01-14 09:26:08",
"github": true,
"gitlab": false,
"bitbucket": false,
"codeberg": false,
"github_user": "KKKKKKKEM",
"github_project": "bricks",
"travis_ci": false,
"coveralls": false,
"github_actions": true,
"requirements": [
{
"name": "loguru",
"specs": [
[
"==",
"0.7.2"
]
]
},
{
"name": "jmespath-community",
"specs": [
[
"==",
"1.1.3"
]
]
},
{
"name": "lxml",
"specs": [
[
"==",
"4.9.3"
]
]
},
{
"name": "jsonpath",
"specs": [
[
"==",
"0.82.2"
]
]
},
{
"name": "w3lib",
"specs": [
[
"==",
"2.1.2"
]
]
},
{
"name": "curl_cffi",
"specs": [
[
"==",
"0.7.3"
]
]
},
{
"name": "redis",
"specs": [
[
"==",
"5.0.1"
]
]
},
{
"name": "better_exceptions",
"specs": [
[
"==",
"0.3.3"
]
]
}
],
"lcname": "bricks-py"
}
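
The `digests` entries in the metadata above can be used to check a manually downloaded artifact. The sketch below is one way to do that with the standard library's `hashlib`. The local file path is an assumption (it presumes the wheel was saved to the current directory under its published filename), and the helper name `sha256_of` is hypothetical. The expected digest is the `sha256` value listed for the wheel.

```python
import hashlib
from pathlib import Path

# sha256 digest listed in the metadata for bricks_py-0.0.28-py3-none-any.whl
EXPECTED_SHA256 = "698696da6d4dd8cfc370d61406de4ed2373425da5cdb5c1cbc450e465851b5aa"


def sha256_of(path, chunk_size: int = 65536) -> str:
    """Compute the hex sha256 digest of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as fh:
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


# Assumed local download location; adjust to wherever the file was saved.
wheel = Path("bricks_py-0.0.28-py3-none-any.whl")
if wheel.exists():
    print("digest matches" if sha256_of(wheel) == EXPECTED_SHA256 else "digest MISMATCH")
else:
    print(f"{wheel} not found; download it first")
```

Note that `pip` already verifies hashes itself when a requirements file pins them with `--hash`; a manual check like this is mainly useful for files fetched outside of `pip`.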