:Name: scrapyrt
:Version: 0.16.0
:Summary: Put Scrapy spiders behind an HTTP API
:Home page: https://github.com/scrapinghub/scrapyrt
:Upload time: 2024-02-14 09:20:11
:Author: Scrapinghub
:Maintainer: Scrapinghub
:Requires Python: >=3.8
:License: BSD

.. image:: https://raw.githubusercontent.com/scrapinghub/scrapyrt/master/artwork/logo.gif
   :width: 400px
   :align: center

==========================
ScrapyRT (Scrapy realtime)
==========================

.. image:: https://github.com/scrapinghub/scrapyrt/workflows/CI/badge.svg
   :target: https://github.com/scrapinghub/scrapyrt/actions

.. image:: https://img.shields.io/pypi/pyversions/scrapyrt.svg
    :target: https://pypi.python.org/pypi/scrapyrt

.. image:: https://img.shields.io/pypi/v/scrapyrt.svg
    :target: https://pypi.python.org/pypi/scrapyrt

.. image:: https://img.shields.io/pypi/l/scrapyrt.svg
    :target: https://pypi.python.org/pypi/scrapyrt

.. image:: https://img.shields.io/pypi/dm/scrapyrt.svg
   :target: https://pypistats.org/packages/scrapyrt
   :alt: Downloads count

.. image:: https://readthedocs.org/projects/scrapyrt/badge/?version=latest
   :target: https://scrapyrt.readthedocs.io/en/latest/api.html

Add an HTTP API to your `Scrapy <https://scrapy.org/>`_ project in minutes.

You send ScrapyRT a request with a spider name and a URL, and in response you get the items
collected by that spider when it visits the URL.

* All Scrapy project components (e.g. middleware, pipelines, extensions) are supported.
* You run ScrapyRT in the Scrapy project directory. It starts an HTTP server that lets you schedule spiders and get their output as JSON.
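
As a rough illustration of the JSON output mentioned above, the sketch below
pulls the items out of a response body. The ``status``, ``items`` and
``message`` keys are assumptions based on the ScrapyRT API documentation, the
sample payload is made up, and ``extract_items`` is a hypothetical helper, so
check the API docs for the exact response shape.

.. code-block:: python

    import json

    # Hypothetical response body; real responses carry more fields (e.g. stats).
    SAMPLE_RESPONSE = (
        '{"status": "ok", "spider_name": "toscrape-css", '
        '"items": [{"text": "example quote", "author": "example author"}]}'
    )

    def extract_items(raw):
        """Return the scraped items from a crawl.json response body."""
        data = json.loads(raw)
        # Assumption: failed crawls report a non-"ok" status and a message.
        if data.get("status") != "ok":
            raise RuntimeError(f"crawl failed: {data.get('message', 'unknown error')}")
        return data.get("items", [])

    for item in extract_items(SAMPLE_RESPONSE):
        print(item["author"], "-", item["text"])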


Quickstart
==========

**1. install**

.. code-block:: shell

    > pip install scrapyrt

**2. switch to your Scrapy project directory (e.g. the quotesbot project)**

.. code-block:: shell

    > cd my/project_path/is/quotesbot

**3. launch ScrapyRT**

.. code-block:: shell

    > scrapyrt

**4. run your spiders**

.. code-block:: shell

    > curl "localhost:9080/crawl.json?spider_name=toscrape-css&url=http://quotes.toscrape.com/"
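
The same GET request can be assembled from Python with the standard library.
This is a minimal sketch: ``build_crawl_url`` is a hypothetical helper, and the
host and port are simply the ones used in this quickstart.

.. code-block:: python

    from urllib.parse import urlencode

    def build_crawl_url(spider_name, url, base="http://localhost:9080"):
        """Build a GET URL for the crawl.json endpoint (hypothetical helper)."""
        # urlencode percent-encodes the target URL so it is safe as a query value.
        query = urlencode({"spider_name": spider_name, "url": url})
        return f"{base}/crawl.json?{query}"

    print(build_crawl_url("toscrape-css", "http://quotes.toscrape.com/"))

Passing the resulting URL to ``urllib.request.urlopen`` sends the request,
provided a ScrapyRT server is running in the project directory.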

**5. run a more complex query, e.g. specify a callback for the Scrapy request and a zipcode argument for the spider**

.. code-block:: shell

    > curl --data '{"request": {"url": "http://quotes.toscrape.com/page/2/", "callback":"some_callback"}, "spider_name": "toscrape-css", "crawl_args": {"zipcode":"14000"}}' http://localhost:9080/crawl.json -v
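
A possible Python counterpart of the POST query above, again using only the
standard library. ``build_crawl_request`` is a hypothetical helper, and setting
the ``Content-Type`` header to JSON is a choice made here rather than something
the ``curl`` example does.

.. code-block:: python

    import json
    from urllib.request import Request

    def build_crawl_request(spider_name, url, callback=None, crawl_args=None,
                            endpoint="http://localhost:9080/crawl.json"):
        """Build a POST request mirroring the curl example (hypothetical helper)."""
        payload = {"spider_name": spider_name, "request": {"url": url}}
        if callback:
            # Name of the spider method that should handle the response.
            payload["request"]["callback"] = callback
        if crawl_args:
            # Arguments passed to the spider, e.g. {"zipcode": "14000"}.
            payload["crawl_args"] = crawl_args
        body = json.dumps(payload).encode("utf-8")
        return Request(endpoint, data=body, method="POST",
                       headers={"Content-Type": "application/json"})

    req = build_crawl_request(
        "toscrape-css",
        "http://quotes.toscrape.com/page/2/",
        callback="some_callback",
        crawl_args={"zipcode": "14000"},
    )
    print(req.get_method(), req.full_url)
    print(req.data.decode("utf-8"))

Passing ``req`` to ``urllib.request.urlopen`` sends it to a running ScrapyRT
server.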

ScrapyRT looks for a ``scrapy.cfg`` file to determine your project settings
and raises an error if it cannot find one. Note that all your project
requirements must be installed.
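
If you want to check for ``scrapy.cfg`` before launching, a small pre-flight
script along these lines may help. It is only a sketch: ``find_scrapy_cfg`` is
a hypothetical helper, and searching parent directories is an assumption made
here, not a description of how ScrapyRT itself locates the file.

.. code-block:: python

    from pathlib import Path

    def find_scrapy_cfg(start="."):
        """Return the nearest scrapy.cfg at or above ``start``, or None."""
        directory = Path(start).resolve()
        # Check the starting directory first, then each parent in turn.
        for candidate in (directory, *directory.parents):
            cfg = candidate / "scrapy.cfg"
            if cfg.is_file():
                return cfg
        return None

    cfg = find_scrapy_cfg()
    if cfg is None:
        print("No scrapy.cfg found; run scrapyrt from a Scrapy project directory.")
    else:
        print(f"Found project config: {cfg}")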

Note
====
* ScrapyRT is not a replacement for `Scrapyd <https://scrapyd.readthedocs.io/en/stable/>`_, `Scrapy Cloud <https://www.zyte.com/scrapy-cloud/>`_, or other infrastructure for running long-running crawls.
* It is not suitable for long-running spiders; it works best for spiders that fetch a single response from a website and return items quickly.


Documentation
=============

`Documentation is available on Read the Docs <https://scrapyrt.readthedocs.io/en/latest/index.html>`_.

Support
=======

Open source support is provided on GitHub. Please `create a question
issue`_ (i.e. an issue with the "question" label).

Commercial support is also available from `Zyte`_.

.. _create a question issue: https://github.com/scrapinghub/scrapyrt/issues/new?labels=question
.. _Zyte: http://zyte.com

License
=======
ScrapyRT is offered under the `BSD 3-Clause license <https://en.wikipedia.org/wiki/BSD_licenses#3-clause_license_(%22BSD_License_2.0%22,_%22Revised_BSD_License%22,_%22New_BSD_License%22,_or_%22Modified_BSD_License%22)>`_.


Development
===========
Development takes place on `GitHub <https://github.com/scrapinghub/scrapyrt>`_.

            
