# Changelog
- Added the `jsonp2json` static method
- The spider now keeps its session alive by default
- Added the `get_uuid` and base64 encode/decode static methods
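
To make the helpers listed above more concrete, here is a minimal standard-library sketch of what such utilities typically do. It is not wauo's implementation: the function names `jsonp_to_json`, `new_uuid`, `b64_encode`, and `b64_decode` are hypothetical, and the actual signatures of `jsonp2json` and `get_uuid` in wauo may differ.

```python
import base64
import json
import re
import uuid


def jsonp_to_json(jsonp: str):
    """Strip the callback wrapper from a JSONP string and parse the payload.

    Hypothetical helper: assumes the input looks like `name(<json>)`,
    optionally followed by a semicolon.
    """
    match = re.match(r'^\s*[\w$.]+\s*\((.*)\)\s*;?\s*$', jsonp, re.S)
    if match is None:
        raise ValueError('not a JSONP string')
    return json.loads(match.group(1))


def new_uuid() -> str:
    """Return a random UUID (version 4) as a string."""
    return str(uuid.uuid4())


def b64_encode(text: str) -> str:
    """Encode a UTF-8 string to base64 text."""
    return base64.b64encode(text.encode('utf-8')).decode('ascii')


def b64_decode(encoded: str) -> str:
    """Decode base64 text back to a UTF-8 string."""
    return base64.b64decode(encoded).decode('utf-8')


print(jsonp_to_json('callback({"name": "wauo", "ok": true});'))
print(b64_decode(b64_encode('wauo')))
```
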
# Project description
- A spider (crawler) class built on top of requests
# Python interpreter
- Python 3
# How to use
```python
from wauo import WauoSpider
spider = WauoSpider()
```
## GET
```python
url = 'https://github.com/markadc'
resp = spider.send(url)
print(resp.text)
```
## POST
#### Using the data parameter
```python
api = 'https://github.com/markadc'
# Form fields to send as the request body
data = {
    'key1': 'value1',
    'key2': 'value2'
}
resp = spider.send(api, data=data)
```
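#### Using the json parameter
```python
api = 'https://github.com/markadc'
# Named `payload` rather than `json` to avoid shadowing the standard json module
payload = {
    'key1': 'value1',
    'key2': 'value2'
}
resp = spider.send(api, json=payload)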
```
## Restricting responses
#### Restricting status codes
- If the status code is not in `codes`, the response is discarded
```python
resp = spider.send('https://github.com/markadc', codes=[200, 301, 302])
```
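#### Restricting response content
- If `checker` returns False, the response is discarded
```python
def is_ok(response):
    # Reject pages that contain the CAPTCHA marker text ('验证码' means "verification code")
    return '验证码' not in response.text


resp = spider.send('https://github.com/markadc', checker=is_ok)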
```
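
The two rules above (`codes` and `checker`) can be sketched as a single acceptance test. This is only an illustration of the stated rules, not wauo's code: the `accept` function and the fake response objects are hypothetical, and the README does not say what the spider does after discarding a response (retry, return nothing, or raise).

```python
from types import SimpleNamespace


def accept(response, codes=None, checker=None) -> bool:
    """Return False if the response should be discarded, True otherwise.

    Hypothetical helper mirroring the two rules described in this README.
    """
    # Rule 1: the status code must be in the allowed list, if one is given
    if codes is not None and response.status_code not in codes:
        return False
    # Rule 2: the response is discarded only when the checker returns False
    if checker is not None and checker(response) is False:
        return False
    return True


def is_ok(response):
    # '验证码' means "verification code", i.e. a CAPTCHA page
    return '验证码' not in response.text


# Stand-ins for real responses; only the attributes used above are provided
good = SimpleNamespace(status_code=200, text='<html>hello</html>')
blocked = SimpleNamespace(status_code=200, text='<html>请输入验证码</html>')
missing = SimpleNamespace(status_code=404, text='not found')

print(accept(good, codes=[200, 301, 302], checker=is_ok))
print(accept(blocked, codes=[200, 301, 302], checker=is_ok))
print(accept(missing, codes=[200, 301, 302], checker=is_ok))
```
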
#### Adding default fields to headers
- Pass the `default_headers` parameter when instantiating the spider
##### Example 1
- Send the cookie in the headers of every request
```python
spider = WauoSpider(default_headers={'Cookie': 'Your Cookies'})
resp = spider.send('https://github.com/markadc')
print(resp.request.headers)
```
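
Conceptually, default headers are combined with whatever headers a single request supplies. The sketch below shows one plausible merge, assuming per-request headers take precedence over the defaults; the README does not state wauo's precedence rule, and `merge_headers` is a hypothetical name.

```python
def merge_headers(default_headers, request_headers=None):
    """Combine default headers with per-request headers.

    Assumption: on a key clash the per-request value wins.
    """
    merged = dict(default_headers)  # copy so the defaults are never mutated
    merged.update(request_headers or {})
    return merged


defaults = {'Cookie': 'Your Cookies'}
print(merge_headers(defaults))
print(merge_headers(defaults, {'Referer': 'https://github.com/markadc'}))
```
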
Raw data
{
"_id": null,
"home_page": "https://github.com/markadc/wauo",
"name": "wauo",
"maintainer": null,
"docs_url": null,
"requires_python": null,
"maintainer_email": null,
"keywords": "python, requests, spider",
"author": "WangTuo",
"author_email": "markadc@126.com",
"download_url": "https://files.pythonhosted.org/packages/6d/0c/da5b7b7d9c1a3a749aa57281016e5e22248fc9fff08b12eba61b95ffdedf/wauo-0.5.4.tar.gz",
"platform": null,
"bugtrack_url": null,
"license": "MIT",
"summary": "\u722c\u866b\u8005\u7684\u8d34\u5fc3\u52a9\u624b",
"version": "0.5.4",
"project_urls": {
"Homepage": "https://github.com/markadc/wauo"
},
"split_keywords": [
"python",
" requests",
" spider"
],
"urls": [
{
"comment_text": "",
"digests": {
"blake2b_256": "6d0cda5b7b7d9c1a3a749aa57281016e5e22248fc9fff08b12eba61b95ffdedf",
"md5": "bea8d0385abf56a08fbe2ff2567caa78",
"sha256": "6758d885d2673470371b8cc18382fd5124599c02d8a17ae57513fc5b5c9ef44a"
},
"downloads": -1,
"filename": "wauo-0.5.4.tar.gz",
"has_sig": false,
"md5_digest": "bea8d0385abf56a08fbe2ff2567caa78",
"packagetype": "sdist",
"python_version": "source",
"requires_python": null,
"size": 4537,
"upload_time": "2024-05-15T15:38:47",
"upload_time_iso_8601": "2024-05-15T15:38:47.122716Z",
"url": "https://files.pythonhosted.org/packages/6d/0c/da5b7b7d9c1a3a749aa57281016e5e22248fc9fff08b12eba61b95ffdedf/wauo-0.5.4.tar.gz",
"yanked": false,
"yanked_reason": null
}
],
"upload_time": "2024-05-15 15:38:47",
"github": true,
"gitlab": false,
"bitbucket": false,
"codeberg": false,
"github_user": "markadc",
"github_project": "wauo",
"travis_ci": false,
"coveralls": false,
"github_actions": false,
"requirements": [],
"lcname": "wauo"
}