# Scrapy-QOS
QOS components for Scrapy
## Usage
### Activate the `QosDownloaderMiddleware` in settings.py
```python
DOWNLOADER_MIDDLEWARES = {
    "scrapy_qos.QosDownloaderMiddleware": 543,
}
```
### Configure the following options in settings.py
- `QOS_IOPS_ENABLED`
  - default `False`
  - set to `True` to enable the IOPS limiter
- `QOS_IOPS_CAPACITY`
  - default `1`
  - burst capacity: the maximum number of requests that can be sent in a burst
- `QOS_IOPS_LIMIT`
  - default `1` / s
  - number of requests sent per second
- `QOS_BPS_ENABLED`
  - default `False`
  - set to `True` to enable the BPS limiter
- `QOS_BPS_CAPACITY`
  - default `1048576` bytes
  - burst capacity: the maximum number of response bytes that can be received in a burst
- `QOS_BPS_LIMIT`
  - default `1048576` bytes / s
  - number of response bytes received per second
- `QOS_SMALL_RESPONSE_SIZE`
  - default `1048576` bytes
  - responses smaller than this value are filtered out when guessing the next response size
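
As an illustration, the options above could be combined in one `settings.py` as follows. The numeric values are arbitrary examples rather than recommendations, and the comments assume the usual token bucket reading of "capacity" (burst size) and "limit" (sustained rate).

```python
# settings.py (illustrative values only)
DOWNLOADER_MIDDLEWARES = {
    "scrapy_qos.QosDownloaderMiddleware": 543,
}

# Request rate: about 5 requests per second, with bursts of up to 10.
QOS_IOPS_ENABLED = True
QOS_IOPS_CAPACITY = 10
QOS_IOPS_LIMIT = 5

# Bandwidth: about 2 MiB per second, with bursts of up to 4 MiB.
QOS_BPS_ENABLED = True
QOS_BPS_CAPACITY = 4 * 1024 * 1024
QOS_BPS_LIMIT = 2 * 1024 * 1024

# Size threshold (in bytes) used when guessing the next response size.
QOS_SMALL_RESPONSE_SIZE = 64 * 1024
```

Both limiters are disabled by default, so each `*_ENABLED` flag has to be set explicitly for its limits to take effect.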
## Requirements
- Python 3.7+
- Scrapy >= 2.0
- asyncio
## Installation
From pip
```shell
pip install scrapy-qos
```
From Gitee
```shell
git clone https://gitee.com/hgdsdq/scrapy_qos.git
cd scrapy_qos
python setup.py install
```
## Implementation
- QOS is implemented with the token bucket algorithm
- For Scrapy, `QosDownloaderMiddleware` guesses the size of the next response body, which the BPS limiter then uses
```python
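α = 0.8  # smoothing factor: weight given to the newest observation
# Exponential moving average; `response_size` (an assumed name) is the body size of the latest response.
guess_response_size = (1 - α) * guess_response_size + α * response_size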
```
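
The two ideas above can be sketched in a few lines of plain Python. This is a minimal illustration of a generic token bucket and an exponential moving average, not the code shipped in `scrapy_qos`: the names `TokenBucket`, `try_consume`, and `update_estimate` are hypothetical, and the injectable `clock` argument exists only to make the sketch easy to test.

```python
import time


class TokenBucket:
    """Generic token bucket: `rate` tokens are added per second, up to `capacity`."""

    def __init__(self, rate, capacity, clock=time.monotonic):
        self.rate = rate
        self.capacity = capacity
        self.clock = clock
        self.tokens = capacity  # start full so an initial burst is allowed
        self.updated = clock()

    def _refill(self):
        # Add the tokens earned since the last update, capped at capacity.
        now = self.clock()
        elapsed = now - self.updated
        self.tokens = min(self.capacity, self.tokens + elapsed * self.rate)
        self.updated = now

    def try_consume(self, amount=1):
        """Take `amount` tokens if available and report whether it succeeded."""
        self._refill()
        if self.tokens >= amount:
            self.tokens -= amount
            return True
        return False


def update_estimate(estimate, response_size, alpha=0.8):
    """Exponential moving average of observed response body sizes."""
    return (1 - alpha) * estimate + alpha * response_size
```

In this sketch an IOPS limiter would consume one token per request, while a BPS limiter would consume as many tokens as the guessed response size in bytes. With `alpha = 0.8` the guess tracks the most recent response closely and older observations fade quickly.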
Raw data
{
"_id": null,
"home_page": "https://gitee.com/hgdsdq/scrapy_qos/tree/master",
"name": "scrapy-qos",
"maintainer": "",
"docs_url": null,
"requires_python": ">=3.7",
"maintainer_email": "",
"keywords": "scrapy-qos,scrapy,qos,iops,bps",
"author": "Caft",
"author_email": "caft0505@163.com",
"download_url": "https://files.pythonhosted.org/packages/98/4a/35c02a605db99de581645acacb4b5708468bc3ec5ca3e43b429bb5b5052c/scrapy-qos-0.0.2.tar.gz",
"platform": null,
"bugtrack_url": null,
"license": "MIT License (MIT)",
"summary": "implement QOS(TokenBucket) in scrapy download middleware",
"version": "0.0.2",
"project_urls": {
"Homepage": "https://gitee.com/hgdsdq/scrapy_qos/tree/master"
},
"split_keywords": [
"scrapy-qos",
"scrapy",
"qos",
"iops",
"bps"
],
"urls": [
{
"comment_text": "",
"digests": {
"blake2b_256": "0feb7c3dd1a218abb9463e058aadba796881fea99f554ff56a656f567192cb4f",
"md5": "4f65ff4930034fb5d2f21f9a6f025d25",
"sha256": "97051473db34c37e2b38ec4fb1e8d3da3a2b1390f735cd4dfed71acd0cfe0c43"
},
"downloads": -1,
"filename": "scrapy_qos-0.0.2-py3-none-any.whl",
"has_sig": false,
"md5_digest": "4f65ff4930034fb5d2f21f9a6f025d25",
"packagetype": "bdist_wheel",
"python_version": "py3",
"requires_python": ">=3.7",
"size": 3718,
"upload_time": "2024-01-06T10:22:48",
"upload_time_iso_8601": "2024-01-06T10:22:48.600491Z",
"url": "https://files.pythonhosted.org/packages/0f/eb/7c3dd1a218abb9463e058aadba796881fea99f554ff56a656f567192cb4f/scrapy_qos-0.0.2-py3-none-any.whl",
"yanked": false,
"yanked_reason": null
},
{
"comment_text": "",
"digests": {
"blake2b_256": "984a35c02a605db99de581645acacb4b5708468bc3ec5ca3e43b429bb5b5052c",
"md5": "9b43bada1ca8839e4fee28431ddabd35",
"sha256": "b2dc10b98f12ad64bb6754061845854f2e07283aa695a4b4f2f9b63734b1a17d"
},
"downloads": -1,
"filename": "scrapy-qos-0.0.2.tar.gz",
"has_sig": false,
"md5_digest": "9b43bada1ca8839e4fee28431ddabd35",
"packagetype": "sdist",
"python_version": "source",
"requires_python": ">=3.7",
"size": 3467,
"upload_time": "2024-01-06T10:22:49",
"upload_time_iso_8601": "2024-01-06T10:22:49.986847Z",
"url": "https://files.pythonhosted.org/packages/98/4a/35c02a605db99de581645acacb4b5708468bc3ec5ca3e43b429bb5b5052c/scrapy-qos-0.0.2.tar.gz",
"yanked": false,
"yanked_reason": null
}
],
"upload_time": "2024-01-06 10:22:49",
"github": false,
"gitlab": false,
"bitbucket": false,
"codeberg": false,
"lcname": "scrapy-qos"
}