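# Project Description
- A Python toolkit: a web crawler, decorators (timer, strict type checking, ...), a thread pool (that does not overflow memory), quick database operations (MySQL, PostgreSQL), and more
# Installation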
```bash
pip install wauo -U
```
# Python Interpreter
- Python 3.10+
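# Changelog
- Added `db` for working with MySQL and PostgreSQL databases
- Added the `jsonp2json` static method
- The crawler keeps session state by default
- Added static methods for `get_uuid` and `base64` encoding/decoding
- Removed `download_text` and `download_bdata`, merged into `download`
- Added the `update_default_headers` method
- `make_md5` accepts string or bytes arguments and supports a salt
- `send` gained a `delay` parameter, so a delay can be set per request
- Added the `tools` and `spiders` packages
- The thread pool manager supports the context manager protocol and can be used with `with`
- Added the `get_results` method, which collects the return values of all `fs`
- Delay and timeout can be customized ahead of calling `send`
- The thread pool manager gained a `running` method for checking task status
- Added detailed comments to the `send` method
- Added the `todos` method; renamed `tools` to `utils`
- `done` gained a `func_name` parameter, which pinpoints the thread function that raised an exception
- `PoolWait`, `PoolMan`
- Some parameter changes (renames, added annotations)
- Added some decorator functions
- Documented the `**kwargs` of the `send` method
- Added the `block` method for blocking
- Some optimizations
- The `utils` package gained the `cget` method: multi-level dict lookup that returns `<default>` when a KEY does not exist
- `cprint` falls back to printing without color when its arguments are invalid
- Some optimizations; added the `raise_for_status`, `raise_for_text`, and `do` methods; revised the function docstring template, etc.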
# Usage
## Database
### PostgreSQL
- Usage
```python
from wauo.db import PostgresqlClient

psql_cfg = {
    "host": "localhost",
    "port": 5432,
    "db": "test",
    "user": "wauo",
    "password": "admin1",
}
psql = PostgresqlClient(**psql_cfg)
psql.connect()

name = 'temp'

# Drop the table
psql.drop_table(name)
print(f"Table {name} dropped (if it existed)")

# Create a new table
psql.create_table(name, ['name', 'age'])

# Insert data
n = psql.insert_one(name, {'name': 'Alice', 'age': 30})
print(f"Rows inserted: {n}")
# Assumes insert_many returns the affected row count, like insert_one
n = psql.insert_many(name, [{'name': 'Bob', 'age': 25}, {'name': 'Charlie', 'age': 35}])
print(f"Rows inserted in bulk: {n}")

# Query data
lines = psql.query(f"SELECT * FROM {name}")
for line in lines:
    print(dict(line))

# Update data
n = psql.update(name, {'age': 31}, "name = %s", ('Alice',))
print(f"Rows updated: {n}")

# Delete data
psql.delete(name, "name = %s", ('Bob',))
print("Deleted Bob's record")
```
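The `update` and `delete` calls above pass values separately from the SQL text through `%s` placeholders. As a rough illustration of why that matters, here is the same idea with the standard-library `sqlite3` module, which uses `?` instead of `%s`. This is a stand-in, not wauo's code: the `people` table and the `hostile` string are made up for the example.

```python
import sqlite3

# In-memory database, so the sketch needs no server
conn = sqlite3.connect(":memory:")
conn.row_factory = sqlite3.Row  # rows can be converted with dict(row)
conn.execute("CREATE TABLE people (name TEXT, age INTEGER)")
conn.executemany(
    "INSERT INTO people (name, age) VALUES (?, ?)",
    [("Alice", 30), ("Bob", 25)],
)

# A value that would match every row if it were pasted into the SQL text
hostile = "Bob' OR '1'='1"

# Bound as a parameter, it is compared as a plain string and matches nothing
deleted_hostile = conn.execute(
    "DELETE FROM people WHERE name = ?", (hostile,)
).rowcount

# The intended value deletes exactly one row
deleted_bob = conn.execute(
    "DELETE FROM people WHERE name = ?", ("Bob",)
).rowcount

remaining = [dict(row) for row in conn.execute("SELECT * FROM people")]
conn.close()
```

Table names cannot be bound as parameters, so a name interpolated with an f-string (as in the `SELECT` above) should come from trusted code rather than user input.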
## Crawler
### Basic usage
```python
from wauo import WauoSpider
spider = WauoSpider()
```
## Requests
### GET
- Requests default to GET
```python
url = 'https://github.com/markadc'
resp = spider.send(url)
print(resp.text)
```
### POST
- If the `data` or `json` parameter is passed, the request is sent as POST
```python
api = 'https://github.com/markadc'
payload = {
    'key1': 'value1',
    'key2': 'value2'
}
resp = spider.send(api, data=payload)  # using the data parameter
resp = spider.send(api, json=payload)  # using the json parameter
```
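The rule above (GET by default, POST once `data` or `json` is given) can be sketched with the standard library alone. `pick_method` is a hypothetical helper written for this illustration, not part of wauo. The two encodings follow the usual HTTP-client convention of a form-encoded body for `data` and a JSON body for `json`; whether `send` forwards the arguments unchanged is an assumption.

```python
import json
from urllib.parse import urlencode


def pick_method(data=None, json_payload=None):
    """Hypothetical helper: POST if a body was supplied, otherwise GET."""
    if data is not None or json_payload is not None:
        return "POST"
    return "GET"


payload = {"key1": "value1", "key2": "value2"}

# A dict passed as `data` is conventionally sent form-encoded
form_body = urlencode(payload)

# A dict passed as `json` is conventionally serialized to a JSON document
json_body = json.dumps(payload)

print(pick_method())                      # no body supplied
print(pick_method(data=payload))          # body supplied
print(form_body)
print(json_body)
```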
## Response
### Validating the response
#### 1. Restricting the status code
- Raises an exception if the status code is not in `codes`
```python
resp = spider.send('https://github.com/markadc')
resp.raise_for_status(codes=[301, 302])
```
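A minimal sketch of the check described above, for readers who want to see the logic in isolation. `UnexpectedStatus` and `check_status` are hypothetical names; the exception type wauo actually raises is not stated in this README.

```python
class UnexpectedStatus(Exception):
    """Hypothetical exception for a status code outside the allowed set."""


def check_status(status_code: int, codes: list[int]) -> None:
    """Raise if status_code is not one of the allowed codes."""
    if status_code not in codes:
        raise UnexpectedStatus(f"status {status_code} not in {codes}")


check_status(302, [301, 302])  # allowed, returns silently

try:
    check_status(200, [301, 302])
except UnexpectedStatus as exc:
    message = str(exc)
```

Note that with an allow-list of `[301, 302]`, a plain 200 response counts as a failure.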
#### 2. Restricting the response body
- Raises an exception if `is_ok` returns False
```python
def is_ok(html: str):
    # '验证' means "verification"; a page containing it is treated as invalid
    return html.find('验证') == -1
resp = spider.send('https://wenku.baidu.com/wkvcode.html')
resp.raise_for_text(validate=is_ok)
```
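To make the validator contract concrete, the sketch below runs an `is_ok`-style predicate against two sample strings. `check_text`, `UnexpectedContent`, and both sample pages are invented for illustration; only the idea of "raise when the validator returns False" comes from the description above.

```python
class UnexpectedContent(Exception):
    """Hypothetical exception for a body that fails validation."""


def is_ok(html: str) -> bool:
    # Valid only if the page does not mention "验证" (verification)
    return "验证" not in html


def check_text(text: str, validate) -> None:
    """Raise if the validator rejects the text."""
    if not validate(text):
        raise UnexpectedContent("response body failed validation")


normal_page = "<html><body>document list</body></html>"
blocked_page = "<html><body>请完成验证</body></html>"

check_text(normal_page, is_ok)  # passes silently

try:
    check_text(blocked_page, is_ok)
except UnexpectedContent as exc:
    outcome = str(exc)
else:
    outcome = "passed"
```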
## Setting Default Request Options
- Set a Cookie in the headers
- ...
### Example 1
- Send the `Cookie` header with every request
```python
from wauo import WauoSpider
cookie = 'Your Cookies'
spider = WauoSpider(default_headers={'Cookie': cookie})
resp1 = spider.send('https://github.com/markadc')
resp2 = spider.send('https://github.com/markadc/wauo')
print(resp1.request.headers)
print(resp2.request.headers)
```
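# Some Utilities
- Pass in a variable to print its name and its actual value
- Convert timestamps to time strings and back; get the timestamp for any moment of today
- Multi-level dict lookup that returns a configured default when a KEY does not exist
- Run thread tasks and collect the return values of all threads in completion order (earliest first), excluding threads that raised and falsy values
- A colored print function
- Check arguments against their annotations and raise an exception on a type mismatch
- A wrapped thread pool (blocks on submission, so there is no need to worry about overflow)
- ...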
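
The "blocks on submission" idea behind the thread pool item can be sketched with the standard library: a semaphore caps how many tasks are queued or running, so a producer that submits faster than the workers can finish is paused instead of growing the queue without bound. `BoundedPool` and its parameters are hypothetical and are not the API of wauo's `PoolWait` or `PoolMan`.

```python
import threading
from concurrent.futures import ThreadPoolExecutor


class BoundedPool:
    """Hypothetical thread pool whose submit() blocks when too many tasks are pending."""

    def __init__(self, max_workers: int = 4, max_pending: int = 8):
        self._executor = ThreadPoolExecutor(max_workers=max_workers)
        # One slot per task that is queued or running
        self._slots = threading.BoundedSemaphore(max_pending)

    def submit(self, fn, *args, **kwargs):
        self._slots.acquire()  # blocks while max_pending tasks are in flight
        try:
            future = self._executor.submit(fn, *args, **kwargs)
        except BaseException:
            self._slots.release()  # the task never started, so free its slot
            raise
        # Free the slot once the task finishes, whether it succeeded or raised
        future.add_done_callback(lambda _future: self._slots.release())
        return future

    def __enter__(self):
        return self

    def __exit__(self, exc_type, exc, tb):
        self._executor.shutdown(wait=True)
        return False


with BoundedPool(max_workers=2, max_pending=4) as pool:
    futures = [pool.submit(pow, n, 2) for n in range(10)]
    squares = [f.result() for f in futures]
```

At most four tasks exist at any moment in this run, regardless of how many items the loop submits.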
Raw data
{
"_id": null,
"home_page": null,
"name": "wauo",
"maintainer": null,
"docs_url": null,
"requires_python": ">=3.10",
"maintainer_email": null,
"keywords": null,
"author": null,
"author_email": "WangTuo <markadc@126.com>",
"download_url": "https://files.pythonhosted.org/packages/b2/23/ae7667500843ae2705d32290183cfa003dbc57d454b5fcea9650231f76a3/wauo-0.9.1.1.tar.gz",
"platform": null,
"bugtrack_url": null,
"license": null,
"summary": "Python\u5de5\u5177\u5927\u5168",
"version": "0.9.1.1",
"project_urls": {
"Homepage": "https://github.com/markadc/wauo"
},
"split_keywords": [],
"urls": [
{
"comment_text": null,
"digests": {
"blake2b_256": "4283b3de8838807533635a1201b2f2711bbea14e5bfccfc693910dcc9efc9aa3",
"md5": "3352aca9c3fd5fa0559501ceb18d796c",
"sha256": "1eaaeb92e730ca4ad0224db56a23410cd65c0c28f4ccd395c14ad1abbfce4bf2"
},
"downloads": -1,
"filename": "wauo-0.9.1.1-py3-none-any.whl",
"has_sig": false,
"md5_digest": "3352aca9c3fd5fa0559501ceb18d796c",
"packagetype": "bdist_wheel",
"python_version": "py3",
"requires_python": ">=3.10",
"size": 23784,
"upload_time": "2025-09-11T03:36:25",
"upload_time_iso_8601": "2025-09-11T03:36:25.387094Z",
"url": "https://files.pythonhosted.org/packages/42/83/b3de8838807533635a1201b2f2711bbea14e5bfccfc693910dcc9efc9aa3/wauo-0.9.1.1-py3-none-any.whl",
"yanked": false,
"yanked_reason": null
},
{
"comment_text": null,
"digests": {
"blake2b_256": "b223ae7667500843ae2705d32290183cfa003dbc57d454b5fcea9650231f76a3",
"md5": "0c81ba620da8a8f320120db0c353c7c4",
"sha256": "ee4648aee328de96920639c22f141d38c3515d17ed77e6c2bc05edeebfe8b744"
},
"downloads": -1,
"filename": "wauo-0.9.1.1.tar.gz",
"has_sig": false,
"md5_digest": "0c81ba620da8a8f320120db0c353c7c4",
"packagetype": "sdist",
"python_version": "source",
"requires_python": ">=3.10",
"size": 20218,
"upload_time": "2025-09-11T03:36:26",
"upload_time_iso_8601": "2025-09-11T03:36:26.668630Z",
"url": "https://files.pythonhosted.org/packages/b2/23/ae7667500843ae2705d32290183cfa003dbc57d454b5fcea9650231f76a3/wauo-0.9.1.1.tar.gz",
"yanked": false,
"yanked_reason": null
}
],
"upload_time": "2025-09-11 03:36:26",
"github": true,
"gitlab": false,
"bitbucket": false,
"codeberg": false,
"github_user": "markadc",
"github_project": "wauo",
"travis_ci": false,
"coveralls": false,
"github_actions": false,
"requirements": [
{
"name": "fake_useragent",
"specs": [
[
"==",
"0.1.11"
]
]
},
{
"name": "loguru",
"specs": [
[
"==",
"0.5.3"
]
]
},
{
"name": "requests",
"specs": [
[
"==",
"2.28.1"
]
]
},
{
"name": "parsel",
"specs": [
[
"==",
"1.9.1"
]
]
}
],
"lcname": "wauo"
}