ScrapPyJS

Name	ScrapPyJS
Version	1.1.0
home_page
Summary	An easy to use web scrapping library via JS scripts
upload_time	2023-05-20 20:00:23
maintainer
docs_url	None
author	Hind Sagar Biswas
requires_python
license
keywords	python web scrapping scrape data
VCS
bugtrack_url
requirements	No requirements were recorded.

            
# ScrapPyJS



![Project Language](https://img.shields.io/static/v1?label=language&message=python&color=blue)

![Project Type](https://img.shields.io/static/v1?label=type&message=package&color=red)

[![PyPI project](https://img.shields.io/static/v1?label=PyPI&message=ScrapPyJS&color=blue)](https://pypi.org/project/ScrapPyJS/)

![Current Version](https://img.shields.io/static/v1?label=current-version&message=v1.0.2&color=lightgrey)

![Stable Version](https://img.shields.io/static/v1?label=stable-version&message=v1.0.0&color=brightgreen)

![Maintained](https://img.shields.io/static/v1?label=maintained&message=yes&color=green)

![Ask Me Anything](https://img.shields.io/static/v1?label=ask-me&message=anything&color=green)

[![PRs Welcome](https://img.shields.io/badge/PRs-welcome-brightgreen.svg)](http://makeapullrequest.com)



The `ScrapPyJS` class provides web scraping on top of Selenium: you scrape data by running a JS script against the page directly from Python.



## Installing



```terminal

pip install ScrapPyJS

```



## How to Use



### Including and Initiating



```python

from ScrapPyJS import ScrapPyJS



# initiate ScrapPyJS

scrappy = ScrapPyJS()



# set js script

JS_SCRIPT = "return 'ScrapPy scrapping!'"

scrappy.set_script(JS_SCRIPT)



# rest of the code goes here...



# close ScrapPyJS

scrappy.end()

```
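
The script passed to `set_script` is plain JavaScript held in a Python string, so it can be built programmatically. The sketch below is one possible way to do that; the helper name `build_text_script` and the selector `h2.title` are illustrative assumptions and not part of ScrapPyJS. It assumes, as in the example above, that the script hands its data back with a top-level `return`.

```python
import json


def build_text_script(css_selector: str) -> str:
    """Return a JS snippet that collects the trimmed text of every match.

    Hypothetical helper for illustration; not part of ScrapPyJS.
    """
    # json.dumps produces a quoted, escaped string literal, so quotes or
    # backslashes inside the selector cannot break the generated script.
    selector_literal = json.dumps(css_selector)
    return (
        f"return Array.from(document.querySelectorAll({selector_literal}))"
        ".map(el => el.textContent.trim());"
    )


JS_SCRIPT = build_text_script("h2.title")
print(JS_SCRIPT)
```

The resulting string can then be passed to `scrappy.set_script(JS_SCRIPT)` exactly like the hand-written script.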



### Simple way



1. Use the `scrap` method to scrape a webpage:



    ```python

    result = scrappy.scrap(url, wait=True, wait_for='id', wait_target='elementId')

    ```



2. Retrieve the result of the scraping operation:



    ```python

    print(result)

    ```
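
Because `end()` has to be called once scraping is finished, it can help to wrap the steps above so that the call still happens when an error occurs. The following is a minimal sketch of that idea using a context manager. `scraping_session` and `FakeScraper` are hypothetical names introduced here; `FakeScraper` only stands in for `ScrapPyJS` so the example runs without a browser.

```python
from contextlib import contextmanager


@contextmanager
def scraping_session(factory):
    """Yield a scraper built by `factory` and always call its `end()`."""
    scraper = factory()
    try:
        yield scraper
    finally:
        # Runs on normal exit and when the body raises.
        scraper.end()


class FakeScraper:
    """Stand-in for ScrapPyJS (assumption: same `scrap`/`end` method names)."""

    def __init__(self):
        self.closed = False

    def scrap(self, url, **kwargs):
        return f"scraped {url}"

    def end(self):
        self.closed = True


with scraping_session(FakeScraper) as scrappy:
    result = scrappy.scrap("https://example.com/")

print(result, scrappy.closed)
```

With the real package installed, passing `ScrapPyJS` as the factory in place of `FakeScraper` should give the same guarantee, assuming its constructor can be called without arguments as shown earlier.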



### Loop through list of URLs



1. Set up a list of target URLs:



    ```python

    URLS = [

        'https://url1.com/',

        'https://url2.com/homepage/',

        'https://url2.com/about',

    ]

    ```



2. Use the `loop_through` method to scrape the target webpages:



    ```python

    # The result value will be a list if save mode is on, else a JSON string

    # Arguments are assumed to mirror `scrap`; see CLASS_STRUCTURE.md to confirm

    result = scrappy.loop_through(URLS, wait=True, wait_for='id', wait_target='elementId')

    ```



3. Retrieve the result of the scraping operation:



    ```python

    print(result)

    ```
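
Since the result may be either a list or a JSON string depending on save mode, downstream code may want a single shape to work with. The helper below is a small sketch of that normalization. The name `as_list` is hypothetical, and it assumes that the JSON string encodes an array.

```python
import json


def as_list(result):
    """Return scraped results as a Python list.

    Accepts either a list or a JSON string that encodes a list.
    Hypothetical helper for illustration; not part of ScrapPyJS.
    """
    if isinstance(result, str):
        result = json.loads(result)
    if not isinstance(result, list):
        raise TypeError(f"expected a list, got {type(result).__name__}")
    return result


# Works the same for both return shapes.
for item in as_list('["first page", "second page"]'):
    print(item)
```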



### Save results to a file



#### Activate save mode



1. Via toggle:



    ```python

    scrappy.toggle_save_mode()

    ```



    Here, save mode, which is `False` by default, is toggled to `True`, and the remaining save-file settings keep their default values.



2. Via `set_save_info` method:



    ```python

    scrappy.set_save_info(save=True)

    ```



    Here, save mode is set to `True` directly, leaving the other settings at their defaults.



#### Configure save mode



1. Via `set_save_info` method:



    ```python

    FILE_NAME = "output"

    FILE_FORMAT = "json"

    SAVE_LOCATION = "path/to/file/"



    scrappy.set_save_info(save=True, file_name=FILE_NAME, file_format=FILE_FORMAT, location=SAVE_LOCATION)

    ```
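
The README does not say whether the save location is created automatically when it does not exist. As a precaution, the directory can be created before save mode is configured. The sketch below shows one way to do this with the standard library. `prepare_save_location` is a hypothetical helper, and the temporary directory is used only to keep the example harmless to run.

```python
import tempfile
from pathlib import Path


def prepare_save_location(location: str) -> str:
    """Create `location` if needed and return it with a trailing slash."""
    path = Path(location)
    # parents=True builds intermediate folders; exist_ok=True makes the
    # call safe to repeat.
    path.mkdir(parents=True, exist_ok=True)
    return path.as_posix() + "/"


with tempfile.TemporaryDirectory() as base:
    SAVE_LOCATION = prepare_save_location(f"{base}/results")
    print(Path(SAVE_LOCATION).is_dir())
```

The returned string keeps the trailing-slash form used for `SAVE_LOCATION` above and can be passed as the `location` argument.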



Please note that you will need to have the necessary `Selenium` and `WebDriver` dependencies installed to use this code.



## Documentation



Details on the `ScrapPyJS` class are available in `.\CLASS_STRUCTURE.md`.



## License



This code is licensed under the `GNU AGPLv3` open-source copyleft license.



## Author



**Name:** *Hind Sagar Biswas*



**Website:** [coderaptors.epizy.com](http://coderaptors.epizy.com/)



[![Author Facebook](https://img.shields.io/static/v1?label=facebook&message=hindsagar.biswas&style=social&logo=facebook)](https://m.facebook.com/hindsagar.biswas)

Raw data

            {
    "_id": null,
    "home_page": "",
    "name": "ScrapPyJS",
    "maintainer": "",
    "docs_url": null,
    "requires_python": "",
    "maintainer_email": "",
    "keywords": "python,web scrapping,scrape data",
    "author": "Hind Sagar Biswas",
    "author_email": "<hindsbhk@gmail.com>",
    "download_url": "https://files.pythonhosted.org/packages/70/f4/3d84e2cf6aa25ce88e7ee6f7756801aa2f4d75a5bc9afffc4127eefaf62f/ScrapPyJS-1.1.0.tar.gz",
    "platform": null,
    "description": "\r\n# ScrapPyJS\r\n\r\n\r\n\r\n![Project Language](https://img.shields.io/static/v1?label=language&message=python&color=blue)\r\n\r\n![Project Type](https://img.shields.io/static/v1?label=type&message=package&color=red)\r\n\r\n[![PyPI project](https://img.shields.io/static/v1?label=PyPI&message=ScrapPyJS&color=blue)](https://pypi.org/project/ScrapPyJS/)\r\n\r\n![Current Version](https://img.shields.io/static/v1?label=current-version&message=v1.0.2&color=lightgrey)\r\n\r\n![Stable Version](https://img.shields.io/static/v1?label=stable-version&message=v1.0.0&color=brightgreen)\r\n\r\n![Maintained](https://img.shields.io/static/v1?label=maintained&message=yes&color=green)\r\n\r\n![Ask Me Anything](https://img.shields.io/static/v1?label=ask-me&message=anything&color=green)\r\n\r\n[![PRs Welcome](https://img.shields.io/badge/PRs-welcome-brightgreen.svg)](http://makeapullrequest.com)\r\n\r\n\r\n\r\nThe `ScrapPyJS` class provides functionality for web scraping using Selenium were you can Scrap data via running JS script directly from python.\r\n\r\n\r\n\r\n## Installing\r\n\r\n\r\n\r\n```terminal\r\n\r\npip install ScrapPyJS\r\n\r\n```\r\n\r\n\r\n\r\n## How to Use\r\n\r\n\r\n\r\n### Including and Initiating\r\n\r\n\r\n\r\n```python\r\n\r\nfrom ScrapPyJS import ScrapPyJS\r\n\r\n\r\n\r\n# initiate ScrapPyJS\r\n\r\nscrappy = ScrapPyJS()\r\n\r\n\r\n\r\n# set js script\r\n\r\nJS_SCRIPT = \"return 'ScrapPy scrapping!'\"\r\n\r\nscrappy.set_script(JS_SCRIPT)\r\n\r\n\r\n\r\n# rest of the code goes here...\r\n\r\n\r\n\r\n# close ScrapPyJS\r\n\r\nscrappy.end()\r\n\r\n```\r\n\r\n\r\n\r\n### Simple way\r\n\r\n\r\n\r\n1. Use the `scrap` method to scrape a webpage:\r\n\r\n\r\n\r\n    ```python\r\n\r\n    result = scrappy.scrap(url, wait=True, wait_for='id', wait_target='elementId')\r\n\r\n    ```\r\n\r\n\r\n\r\n2. 
Retrieve the result of the scraping operation:\r\n\r\n\r\n\r\n    ```python\r\n\r\n    print(result)\r\n\r\n    ```\r\n\r\n\r\n\r\n### Loop through list of URLs\r\n\r\n\r\n\r\n1. Set up a list of target URLs\r\n\r\n\r\n\r\n    ```python\r\n\r\n    URLS = [\r\n\r\n        'https://url1.com/',\r\n\r\n        'https://url2.com/homepage/',\r\n\r\n        'https://url2.com/about',\r\n\r\n    ]\r\n\r\n    ```\r\n\r\n\r\n\r\n2. Use the `loop_through` method to scrape through the target webpages webpage:\r\n\r\n\r\n\r\n    ```python\r\n\r\n    # The result value will be a list if save mode is on, else a JSON string\r\n\r\n    result = scrappy.scrap(url, wait=True, wait_for='id', wait_target='elementId')\r\n\r\n    ```\r\n\r\n\r\n\r\n3. Retrieve the result of the scraping operation:\r\n\r\n\r\n\r\n    ```python\r\n\r\n    print(result)\r\n\r\n    ```\r\n\r\n\r\n\r\n### Save results to a file\r\n\r\n\r\n\r\n#### Activate save mode\r\n\r\n\r\n\r\n1. Via toggle:\r\n\r\n\r\n\r\n    ```python\r\n\r\n    scrappy.toggle_save_mode()\r\n\r\n    ```\r\n\r\n\r\n\r\n    Here, the save mode which is set to `False` by Default is toggled to `True`. So the save file informations are default.\r\n\r\n\r\n\r\n2. Via `set_save_info` method:\r\n\r\n\r\n\r\n    ```python\r\n\r\n    scrappy.set_save_info(save=True)\r\n\r\n    ```\r\n\r\n\r\n\r\n    Here, we directly set save mode to `True` leaving other infos to default.\r\n\r\n\r\n\r\n#### Configure save mode\r\n\r\n\r\n\r\n1. 
Via `set_save_info` method:\r\n\r\n\r\n\r\n    ```python\r\n\r\n    FILE_NAME = \"output\"\r\n\r\n    FILE_FORMAT = \"json\"\r\n\r\n    SAVE_LOCATION = \"path/to/file/\"\r\n\r\n\r\n\r\n    scrappy.toggle_save_mode(save=True, file_name=FILE_NAME, file_format=FILE_FORMAT, location=SAVE_LOCATION)\r\n\r\n    ```\r\n\r\n\r\n\r\nPlease note that you will need to have the necessary `Selenium` and `WebDriver` dependencies installed to use this code.\r\n\r\n\r\n\r\n## Documentation\r\n\r\n\r\n\r\nThe necessary informations on the ScrapPyJS class is available in `.\\CLASS_STRUCTURE.md`\r\n\r\n\r\n\r\n## License\r\n\r\n\r\n\r\nThis code has been licensed under `GNU AGPLv3` open source copyleft license.\r\n\r\n\r\n\r\n## Author\r\n\r\n\r\n\r\n**NAME:** *Hind Sagar Biswas*\r\n\r\n\r\n\r\n**Website:** [coderaptors.epizy.com](http://coderaptors.epizy.com/)\r\n\r\n\r\n\r\n[![Author Facebook](https://img.shields.io/static/v1?label=facebook&message=hindsagar.biswas&style=social&logo=facebook)](https://m.facebook.com/hindsagar.biswas)\r\n\r\n",
    "bugtrack_url": null,
    "license": "",
    "summary": "An easy to use web scrapping library via JS scripts",
    "version": "1.1.0",
    "project_urls": null,
    "split_keywords": [
        "python",
        "web scrapping",
        "scrape data"
    ],
    "urls": [
        {
            "comment_text": "",
            "digests": {
                "blake2b_256": "3b95d32c12e9ef88d2baa5a11e8359eaa3ea4b8cce132e1fe21364e3e88126fc",
                "md5": "2c5e1469b30abac80263b97c9f0f8776",
                "sha256": "4770bd9985be81327fd3092385f9cb6df2762125eb2cb6f6115580d4a35e18dd"
            },
            "downloads": -1,
            "filename": "ScrapPyJS-1.1.0-py3-none-any.whl",
            "has_sig": false,
            "md5_digest": "2c5e1469b30abac80263b97c9f0f8776",
            "packagetype": "bdist_wheel",
            "python_version": "py3",
            "requires_python": null,
            "size": 5849,
            "upload_time": "2023-05-20T20:00:20",
            "upload_time_iso_8601": "2023-05-20T20:00:20.532569Z",
            "url": "https://files.pythonhosted.org/packages/3b/95/d32c12e9ef88d2baa5a11e8359eaa3ea4b8cce132e1fe21364e3e88126fc/ScrapPyJS-1.1.0-py3-none-any.whl",
            "yanked": false,
            "yanked_reason": null
        },
        {
            "comment_text": "",
            "digests": {
                "blake2b_256": "70f43d84e2cf6aa25ce88e7ee6f7756801aa2f4d75a5bc9afffc4127eefaf62f",
                "md5": "ddf4d6caffa9559f08e1886f8040c109",
                "sha256": "f8fa767d2771b0b406b60b81571a304a4b179bf3347c40779f3de1a32f6f110f"
            },
            "downloads": -1,
            "filename": "ScrapPyJS-1.1.0.tar.gz",
            "has_sig": false,
            "md5_digest": "ddf4d6caffa9559f08e1886f8040c109",
            "packagetype": "sdist",
            "python_version": "source",
            "requires_python": null,
            "size": 5876,
            "upload_time": "2023-05-20T20:00:23",
            "upload_time_iso_8601": "2023-05-20T20:00:23.802921Z",
            "url": "https://files.pythonhosted.org/packages/70/f4/3d84e2cf6aa25ce88e7ee6f7756801aa2f4d75a5bc9afffc4127eefaf62f/ScrapPyJS-1.1.0.tar.gz",
            "yanked": false,
            "yanked_reason": null
        }
    ],
    "upload_time": "2023-05-20 20:00:23",
    "github": false,
    "gitlab": false,
    "bitbucket": false,
    "codeberg": false,
    "lcname": "scrappyjs"
}
