wauo


- Name: wauo
- Version: 0.5.4
- Home page: https://github.com/markadc/wauo
- Summary: A helpful assistant for people who write web crawlers
- Upload time: 2024-05-15 15:38:47
- Maintainer: None
- Docs URL: None
- Author: WangTuo
- Requires Python: None
- License: MIT
- Keywords: python, requests, spider
- Requirements: no requirements were recorded
- Travis CI: none
- Coveralls test coverage: none

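# Changelog

- Added the `jsonp2json` static method
- The spider now keeps its session alive by default
- Added the `get_uuid` static method and static methods for Base64 encoding and decoding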
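
The README does not show how these helpers are called, so the following is only a standard-library sketch of what such utilities typically do. The function names `jsonp_to_json`, `make_uuid`, `b64_encode`, and `b64_decode` are hypothetical and are not wauo's actual API.

```python
import base64
import json
import re
import uuid


def jsonp_to_json(jsonp: str):
    """Strip the callback wrapper from a JSONP string and parse the payload."""
    # Matches e.g. `callback({...});` and captures what is inside the parentheses.
    match = re.match(r'^\s*[\w$.]+\s*\((.*)\)\s*;?\s*$', jsonp, re.S)
    if match is None:
        raise ValueError('input does not look like JSONP')
    return json.loads(match.group(1))


def make_uuid() -> str:
    """Return a random UUID (version 4) as a string."""
    return str(uuid.uuid4())


def b64_encode(text: str) -> str:
    """Encode a UTF-8 string as Base64 text."""
    return base64.b64encode(text.encode('utf-8')).decode('ascii')


def b64_decode(encoded: str) -> str:
    """Decode Base64 text back into a UTF-8 string."""
    return base64.b64decode(encoded).decode('utf-8')


print(jsonp_to_json('callback({"key1": "value1"});'))
print(b64_decode(b64_encode('wauo')))
```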

# Project description

- A spider class built on top of requests

# Python interpreter

- Python 3

# How to use

```python
from wauo import WauoSpider

spider = WauoSpider()
```

## GET

```python
url = 'https://github.com/markadc'
resp = spider.send(url)
print(resp.text)
```

## POST

#### Using the data parameter

```python
api = 'https://github.com/markadc'
data = {
    'key1': 'value1',
    'key2': 'value2'
}
resp = spider.send(api, data=data)
```

#### Using the json parameter

```python
api = 'https://github.com/markadc'
# Named `payload` rather than `json` to avoid shadowing the standard json module
payload = {
    'key1': 'value1',
    'key2': 'value2'
}
resp = spider.send(api, json=payload)
```
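
The difference between the two parameters is the request body format. In requests, a dict passed as `data` is sent form-encoded, while a value passed as `json` is serialized to JSON. The sketch below assumes that `spider.send` forwards both arguments to requests unchanged, which the README implies but does not state. It uses only the standard library to show roughly what each body looks like.

```python
import json
from urllib.parse import urlencode

payload = {'key1': 'value1', 'key2': 'value2'}

# Roughly what data=payload produces: an application/x-www-form-urlencoded body
form_body = urlencode(payload)

# Roughly what json=payload produces: an application/json body
json_body = json.dumps(payload)

print(form_body)  # key1=value1&key2=value2
print(json_body)  # {"key1": "value1", "key2": "value2"}
```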

## Restricting responses

#### Restricting status codes

- If the status code is not in `codes`, the response is discarded

```python
resp = spider.send('https://github.com/markadc', codes=[200, 301, 302])
```

#### Restricting response content

- If `checker` returns False, the response is discarded

```python
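def is_ok(response):
    # Reject pages that show a CAPTCHA prompt ('验证码' means "verification code").
    # Return an explicit boolean so that an acceptable page yields True, not None.
    return '验证码' not in response.text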


resp = spider.send('https://github.com/markadc', checker=is_ok)
```
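
The two filters above can be pictured as one acceptance check applied to each response. The following is a minimal sketch of that idea, not wauo's implementation: `FakeResponse` and `should_keep` are hypothetical names, and treating only an explicit `False` from the checker as a rejection is an assumption based on the wording above.

```python
from dataclasses import dataclass
from typing import Callable, Iterable, Optional


@dataclass
class FakeResponse:
    """Hypothetical stand-in for a response object, used only in this sketch."""
    status_code: int
    text: str


def should_keep(response,
                codes: Optional[Iterable[int]] = None,
                checker: Optional[Callable] = None) -> bool:
    """Return True if the response passes both optional filters."""
    # Filter 1: the status code must be in the allowed list, if one is given.
    if codes is not None and response.status_code not in codes:
        return False
    # Filter 2: reject only when the checker explicitly returns False (assumption).
    if checker is not None and checker(response) is False:
        return False
    return True


def no_captcha(response) -> bool:
    return '验证码' not in response.text


print(should_keep(FakeResponse(200, 'hello'), codes=[200, 301, 302]))  # True
print(should_keep(FakeResponse(404, 'hello'), codes=[200, 301, 302]))  # False
print(should_keep(FakeResponse(200, '请输入验证码'), checker=no_captcha))  # False
```

What happens after a response is discarded (for example a retry or an exception) is not described in this README, so the sketch stops at the decision itself.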

#### Adding default fields to the headers

- Pass the `default_headers` argument when instantiating the spider

##### Example 1

- Send the cookie in the headers of every request

```python
spider = WauoSpider(default_headers={'Cookie': 'Your Cookies'})
resp = spider.send('https://github.com/markadc')
print(resp.request.headers)
```
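
Conceptually, default headers are a base dict that every request starts from. The sketch below illustrates that merge with a hypothetical helper, `merge_headers`. Letting per-request headers override the defaults is an assumption here, since the README does not say how wauo resolves conflicts.

```python
def merge_headers(default_headers=None, request_headers=None):
    """Combine default headers with per-request headers (hypothetical helper)."""
    merged = dict(default_headers or {})
    # Assumption: a header given for a single request overrides the default.
    merged.update(request_headers or {})
    return merged


defaults = {'Cookie': 'Your Cookies'}
print(merge_headers(defaults))
# {'Cookie': 'Your Cookies'}
print(merge_headers(defaults, {'Referer': 'https://github.com/markadc'}))
# {'Cookie': 'Your Cookies', 'Referer': 'https://github.com/markadc'}
```

A plain dict merge is case-sensitive, whereas HTTP header names are not, so a real implementation would likely normalize or use a case-insensitive mapping.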

            

# Raw data

{
    "_id": null,
    "home_page": "https://github.com/markadc/wauo",
    "name": "wauo",
    "maintainer": null,
    "docs_url": null,
    "requires_python": null,
    "maintainer_email": null,
    "keywords": "python, requests, spider",
    "author": "WangTuo",
    "author_email": "markadc@126.com",
    "download_url": "https://files.pythonhosted.org/packages/6d/0c/da5b7b7d9c1a3a749aa57281016e5e22248fc9fff08b12eba61b95ffdedf/wauo-0.5.4.tar.gz",
    "platform": null,
    "bugtrack_url": null,
    "license": "MIT",
    "summary": "\u722c\u866b\u8005\u7684\u8d34\u5fc3\u52a9\u624b",
    "version": "0.5.4",
    "project_urls": {
        "Homepage": "https://github.com/markadc/wauo"
    },
    "split_keywords": [
        "python",
        " requests",
        " spider"
    ],
    "urls": [
        {
            "comment_text": "",
            "digests": {
                "blake2b_256": "6d0cda5b7b7d9c1a3a749aa57281016e5e22248fc9fff08b12eba61b95ffdedf",
                "md5": "bea8d0385abf56a08fbe2ff2567caa78",
                "sha256": "6758d885d2673470371b8cc18382fd5124599c02d8a17ae57513fc5b5c9ef44a"
            },
            "downloads": -1,
            "filename": "wauo-0.5.4.tar.gz",
            "has_sig": false,
            "md5_digest": "bea8d0385abf56a08fbe2ff2567caa78",
            "packagetype": "sdist",
            "python_version": "source",
            "requires_python": null,
            "size": 4537,
            "upload_time": "2024-05-15T15:38:47",
            "upload_time_iso_8601": "2024-05-15T15:38:47.122716Z",
            "url": "https://files.pythonhosted.org/packages/6d/0c/da5b7b7d9c1a3a749aa57281016e5e22248fc9fff08b12eba61b95ffdedf/wauo-0.5.4.tar.gz",
            "yanked": false,
            "yanked_reason": null
        }
    ],
    "upload_time": "2024-05-15 15:38:47",
    "github": true,
    "gitlab": false,
    "bitbucket": false,
    "codeberg": false,
    "github_user": "markadc",
    "github_project": "wauo",
    "travis_ci": false,
    "coveralls": false,
    "github_actions": false,
    "requirements": [],
    "lcname": "wauo"
}
        