===========
scrapy-poet
===========
.. image:: https://img.shields.io/pypi/v/scrapy-poet.svg
:target: https://pypi.python.org/pypi/scrapy-poet
:alt: PyPI Version
.. image:: https://img.shields.io/pypi/pyversions/scrapy-poet.svg
:target: https://pypi.python.org/pypi/scrapy-poet
:alt: Supported Python Versions
.. image:: https://github.com/scrapinghub/scrapy-poet/workflows/tox/badge.svg
:target: https://github.com/scrapinghub/scrapy-poet/actions
:alt: Build Status
.. image:: https://codecov.io/github/scrapinghub/scrapy-poet/coverage.svg?branch=master
:target: https://codecov.io/gh/scrapinghub/scrapy-poet
:alt: Coverage report
.. image:: https://readthedocs.org/projects/scrapy-poet/badge/?version=stable
:target: https://scrapy-poet.readthedocs.io/en/stable/?badge=stable
:alt: Documentation Status
``scrapy-poet`` is the `web-poet`_ Page Object pattern implementation for Scrapy.
``scrapy-poet`` lets you write spiders whose extraction logic is kept separate from the
crawling logic. With ``scrapy-poet`` it is possible to write a single spider that supports
many sites with different layouts.
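
For example, the extraction logic can live in a Page Object while the spider
only deals with crawling. A minimal sketch (``BookPage`` and its selectors are
illustrative, not part of scrapy-poet):

.. code-block:: python

    from web_poet import WebPage


    class BookPage(WebPage):
        """Extraction logic for a book page; it knows nothing about crawling."""

        def to_item(self):
            return {
                "url": self.url,
                "title": self.css("h1::text").get(),
            }
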
Read the `documentation <https://scrapy-poet.readthedocs.io>`_ for more information.
License is BSD 3-clause.
* Documentation: https://scrapy-poet.readthedocs.io
* Source code: https://github.com/scrapinghub/scrapy-poet
* Issue tracker: https://github.com/scrapinghub/scrapy-poet/issues
.. _`web-poet`: https://github.com/scrapinghub/web-poet
Quick Start
***********
Installation
============
.. code-block:: bash
pip install scrapy-poet
Requires **Python 3.9+** and **Scrapy >= 2.6.0**.
Usage in a Scrapy Project
=========================
Add the following inside Scrapy's ``settings.py`` file:
.. code-block:: python
DOWNLOADER_MIDDLEWARES = {
"scrapy_poet.InjectionMiddleware": 543,
"scrapy.downloadermiddlewares.stats.DownloaderStats": None,
"scrapy_poet.DownloaderStatsMiddleware": 850,
}
SPIDER_MIDDLEWARES = {
"scrapy_poet.RetryMiddleware": 275,
}
REQUEST_FINGERPRINTER_CLASS = "scrapy_poet.ScrapyPoetRequestFingerprinter"
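
With this configuration, ``scrapy_poet.InjectionMiddleware`` builds Page Objects
declared as type-annotated callback arguments and passes them to the callback.
A minimal sketch of a spider using the ``BookPage`` Page Object from earlier,
condensed here so the example is self-contained (the class and URL are
illustrative, not part of scrapy-poet):

.. code-block:: python

    import scrapy
    from web_poet import WebPage


    class BookPage(WebPage):
        def to_item(self):
            return {"url": self.url, "title": self.css("h1::text").get()}


    class BooksSpider(scrapy.Spider):
        name = "books"
        start_urls = ["http://books.toscrape.com/"]

        def parse(self, response, page: BookPage):
            # ``page`` is created and injected by InjectionMiddleware.
            yield page.to_item()
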
Developing
==========
Set up your local Python environment via:

1. ``pip install -r requirements-dev.txt``
2. ``pre-commit install``

Now every time you perform a ``git commit``, these tools will run against the
staged files:

* ``black``
* ``isort``
* ``flake8``

You can also directly invoke ``pre-commit run --all-files`` or ``tox -e linters``
to run them without performing a commit.