masterai-scrapy-extensions


- Name: masterai-scrapy-extensions
- Version: 2024.8.1
- Summary: A Scrapy extension that reports log statistics about your scraped data.
- Upload time: 2024-08-01 03:48:47
- Requires Python: ==3.11.*
- License: MIT
- Keywords: scrapy, log, report, colorful
- Requirements: none recorded

# Scrapy Log Report Extension

A Scrapy extension that reports log statistics about your scraped data.

## Overview

This Scrapy extension reports log statistics about your scraped data. It generates a report every `LOGSTATS_INTERVAL` seconds and sends it to your log server.

A sample report:

```json
{
  "items_add": 0,
  "pages_add": 0,
  "items_rate": 0,
  "pages_rate": 0,
  "items_count": 0,
  "pages_count": 0,
  "spider_name": "douban",
  "log_count/INFO": 8,
  "log_count/DEBUG": 1,
  "log_count/WARNING": 2,
  "item_scraped_add_count": 0,
  "response_received_add_count": 0
}
```
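
The field names suggest that the `*_count` values are running totals, the `*_add` values are the change since the previous report, and the `*_rate` values are that change scaled to a per-minute figure, which is how Scrapy's built-in LogStats extension derives its rates. That reading is an assumption; the package does not document it. Under that assumption, the figures for one interval could be derived roughly as follows (`interval_report` is a hypothetical helper, not part of the package):

```python
def interval_report(prev_items, prev_pages, items, pages, interval=60.0):
    """Derive per-interval deltas and per-minute rates from running totals.

    Assumed semantics only: *_count is a running total, *_add is the change
    since the previous report, and *_rate is that change scaled to one minute.
    """
    per_minute = 60.0 / interval
    items_add = items - prev_items
    pages_add = pages - prev_pages
    return {
        "items_count": items,
        "pages_count": pages,
        "items_add": items_add,
        "pages_add": pages_add,
        "items_rate": items_add * per_minute,
        "pages_rate": pages_add * per_minute,
    }


# 30 new items and 15 new pages during a 30-second interval
report = interval_report(prev_items=100, prev_pages=40, items=130, pages=55, interval=30)
print(report["items_add"], report["items_rate"])  # prints: 30 60.0
```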

## Installation

First, pip install this package:

```bash
$ pip install masterai-scrapy-extensions
```

## Usage

Enable the extension in your project's `settings.py` file, by adding the following lines:

```python
EXTENSIONS = {
    "masterai_scrapy_extensions.logreport.ReportStats": 100,
}
# Interval, in seconds, between reports
LOGSTATS_INTERVAL = 60
# URL of your log server; each report is sent to it with an HTTP POST request
LOGREPORT_URL = "http://127.0.0.1:5000/api/v1/task/worker/status"

# Enable colored log output
from masterai_scrapy_extensions import logcolor

logcolor.log_color_init()
```

That's all! Now run your spider and check the reports arriving at your log server.
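
`LOGREPORT_URL` has to point at a server that accepts POST requests. For local testing, a stand-in server can be sketched with the standard library alone. This assumes the extension sends each report as a JSON request body, which the sample report suggests but the README does not state. `ReportHandler` and `received` are names made up for this sketch.

```python
import http.client
import json
import threading
from http.server import BaseHTTPRequestHandler, HTTPServer

received = []  # reports collected by the stand-in server


class ReportHandler(BaseHTTPRequestHandler):
    def do_POST(self):
        length = int(self.headers.get("Content-Length", 0))
        body = self.rfile.read(length)
        # Assumption: the report arrives as a JSON object in the request body.
        received.append(json.loads(body))
        self.send_response(200)
        self.send_header("Content-Type", "application/json")
        self.end_headers()
        self.wfile.write(b'{"ok": true}')

    def log_message(self, format, *args):
        pass  # keep the demo output quiet


# Port 0 lets the operating system pick a free port for the demo.
server = HTTPServer(("127.0.0.1", 0), ReportHandler)
threading.Thread(target=server.serve_forever, daemon=True).start()

# Simulate one report being posted, as the extension is assumed to do.
payload = json.dumps({"spider_name": "douban", "items_count": 0}).encode()
conn = http.client.HTTPConnection("127.0.0.1", server.server_port, timeout=5)
conn.request(
    "POST",
    "/api/v1/task/worker/status",
    body=payload,
    headers={"Content-Type": "application/json"},
)
status = conn.getresponse().status
conn.close()

server.shutdown()
server.server_close()
print(status, received)
```

To try it against a real crawl, bind the server to a fixed port such as 5000, keep it running, and set `LOGREPORT_URL` to match.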

## Settings

The settings below can be defined like any other Scrapy setting, as described in the [Scrapy docs](https://doc.scrapy.org/en/latest/topics/settings.html#populating-the-settings).

- `LOGREPORT_URL`: the URL of your log server. Reports are sent to it with an HTTP POST request.
- `LOGSTATS_INTERVAL`: the interval, in seconds, between reports.
- `COLORLOG_FORMAT`: the format of colored log messages.
- `COLORLOG_COLORS`: the colors used for log output.
- `COLORLOG_DATEFORMAT`: the date format of colored log messages.
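
Since a mistyped URL or interval would only show up once a crawl is running, it can help to sanity-check the two reporting settings beforehand. The helper below is a hypothetical sketch, not part of the package. It works on a plain dict of setting values and falls back to Scrapy's default `LOGSTATS_INTERVAL` of 60.0 when the key is absent.

```python
from urllib.parse import urlparse


def check_report_settings(settings):
    """Return a list of problems found in a plain dict of setting values."""
    problems = []

    url = urlparse(str(settings.get("LOGREPORT_URL", "")))
    if url.scheme not in ("http", "https") or not url.netloc:
        problems.append("LOGREPORT_URL should be an absolute http(s) URL")

    # Scrapy's default for LOGSTATS_INTERVAL is 60.0 seconds.
    interval = settings.get("LOGSTATS_INTERVAL", 60.0)
    is_number = isinstance(interval, (int, float)) and not isinstance(interval, bool)
    if not is_number or interval <= 0:
        problems.append("LOGSTATS_INTERVAL should be a positive number of seconds")

    return problems


good = {
    "LOGREPORT_URL": "http://127.0.0.1:5000/api/v1/task/worker/status",
    "LOGSTATS_INTERVAL": 60,
}
bad = {"LOGREPORT_URL": "127.0.0.1:5000", "LOGSTATS_INTERVAL": 0}
print(check_report_settings(good))
print(len(check_report_settings(bad)))
```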

            

Raw data

{
    "_id": null,
    "home_page": null,
    "name": "masterai-scrapy-extensions",
    "maintainer": null,
    "docs_url": null,
    "requires_python": "==3.11.*",
    "maintainer_email": null,
    "keywords": "scrapy, log, report, colorful",
    "author": null,
    "author_email": "alexsuthree <alexsuthree@gmail.com>",
    "download_url": "https://files.pythonhosted.org/packages/90/1c/d39aeac8d21b6309c8f40d8ee6d99923bbb2fc02c5da2501caf7d463f43f/masterai_scrapy_extensions-2024.8.1.tar.gz",
    "platform": null,
    "bugtrack_url": null,
    "license": "MIT",
    "summary": "A Scrapy extension that report your log from your scraped data.",
    "version": "2024.8.1",
    "project_urls": null,
    "split_keywords": [
        "scrapy",
        " log",
        " report",
        " colorful"
    ],
    "urls": [
        {
            "comment_text": "",
            "digests": {
                "blake2b_256": "473b591e0cd345545e4ab3a4b4416d533e5ab7bf5fe5867a5e4b809d999faead",
                "md5": "9a4d363c8ef45a57e45dddc7247f9711",
                "sha256": "6d19ba584dfbcb050bade1f10dfce283c829e18b4c9590eab4c1fe2c464d7761"
            },
            "downloads": -1,
            "filename": "masterai_scrapy_extensions-2024.8.1-py3-none-any.whl",
            "has_sig": false,
            "md5_digest": "9a4d363c8ef45a57e45dddc7247f9711",
            "packagetype": "bdist_wheel",
            "python_version": "py3",
            "requires_python": "==3.11.*",
            "size": 4667,
            "upload_time": "2024-08-01T03:48:46",
            "upload_time_iso_8601": "2024-08-01T03:48:46.458083Z",
            "url": "https://files.pythonhosted.org/packages/47/3b/591e0cd345545e4ab3a4b4416d533e5ab7bf5fe5867a5e4b809d999faead/masterai_scrapy_extensions-2024.8.1-py3-none-any.whl",
            "yanked": false,
            "yanked_reason": null
        },
        {
            "comment_text": "",
            "digests": {
                "blake2b_256": "901cd39aeac8d21b6309c8f40d8ee6d99923bbb2fc02c5da2501caf7d463f43f",
                "md5": "0e2510ba0cb3bb12c821f886f6b92830",
                "sha256": "0f8c27862a1b257ddf03be577754aefc87e07f7c8050cb57d424581f7d5cbcb9"
            },
            "downloads": -1,
            "filename": "masterai_scrapy_extensions-2024.8.1.tar.gz",
            "has_sig": false,
            "md5_digest": "0e2510ba0cb3bb12c821f886f6b92830",
            "packagetype": "sdist",
            "python_version": "source",
            "requires_python": "==3.11.*",
            "size": 4404,
            "upload_time": "2024-08-01T03:48:47",
            "upload_time_iso_8601": "2024-08-01T03:48:47.648333Z",
            "url": "https://files.pythonhosted.org/packages/90/1c/d39aeac8d21b6309c8f40d8ee6d99923bbb2fc02c5da2501caf7d463f43f/masterai_scrapy_extensions-2024.8.1.tar.gz",
            "yanked": false,
            "yanked_reason": null
        }
    ],
    "upload_time": "2024-08-01 03:48:47",
    "github": false,
    "gitlab": false,
    "bitbucket": false,
    "codeberg": false,
    "lcname": "masterai-scrapy-extensions"
}
        