gi-scraper

Name	gi-scraper
Version	0.4.6
home_page	None
Summary	Google Image Scraper.
upload_time	2024-03-24 19:03:18
maintainer	None
docs_url	None
author	Roy6801
requires_python	None
license	None
keywords	python, selenium, web scraping, images, google image scraper, web scraper, image scraping, google images, image scraper, image API, API, automation, data extraction, scraping tool, image downloader, web automation, scraping framework, data scraping, image search, image retrieval
requirements	No requirements were recorded.

# Google-Image-Scraper

## About

This module scrapes Google Images and exposes the results as a streamable image API.

### Supported Browsers

- **Chrome**

## How to Use

```python
# import Scraper class
from gi_scraper import Scraper


# Optional: pass a Cache instance with a custom directory path and timeout.
# Set the cache timeout to -1 to cache indefinitely.
#
# from gi_scraper import Cache
#
# cache = Cache(dir_path="gi_cache", timeout=-1)
# sc = Scraper(workers=8, headless=False, cache=cache)

# Creating a Scraper has a startup overhead,
# so reuse the same object for multiple queries.
sc = Scraper(headless=False)

for query, count in {"Naruto": 20, "Gintoki": 30}.items():
    print("Scraping...", query, ":", count)

    # scrape method returns a stream object
    stream = sc.scrape(query, count)

    # stream.get() yields Response objects with the following attributes:
    # - query (str): The query associated with the response.
    # - name (str): The name attribute of the response.
    # - src_name (str): The source name attribute of the response.
    # - src_page (str): The source page attribute of the response.
    # - thumbnail (str): The thumbnail attribute of the response.
    # - image (str): The image attribute of the response.
    # - width (int): The width attribute of the response.
    # - height (int): The height attribute of the response.

    for index, response in enumerate(stream.get()):
        if index == 10:
            sc.terminate_query()  # Terminate current query midway
            break
        # response.to_dict returns the response as a Python dictionary
        print(response.width, "x", response.height, ":", response.image)


# Terminate scraping explicitly (the destructor also calls this automatically)
sc.terminate()
```
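
Once responses are streaming in, a common next step is to keep only images above a minimum size and save their metadata. The sketch below is one way to do that, not part of the library: `filter_by_size`, `write_jsonl`, and `make_fake` are hypothetical helper names. It assumes `width` and `height` are integers as listed above, and that `to_dict` is callable with no arguments and returns JSON-serializable values. Stand-in objects replace real responses so the example runs without a browser.

```python
import io
import json
from types import SimpleNamespace


def filter_by_size(responses, min_width=0, min_height=0):
    """Yield only responses at least min_width x min_height pixels."""
    for response in responses:
        if response.width >= min_width and response.height >= min_height:
            yield response


def write_jsonl(responses, fp):
    """Write one JSON object per line; return the number of lines written."""
    written = 0
    for response in responses:
        # Assumption: to_dict() takes no arguments and its values are JSON-serializable.
        fp.write(json.dumps(response.to_dict()) + "\n")
        written += 1
    return written


def make_fake(width, height, image):
    """Build a stand-in object exposing the attributes used above (not a real Response)."""
    data = {"width": width, "height": height, "image": image}
    return SimpleNamespace(to_dict=lambda: dict(data), **data)


# Placeholder data; with the real library, pass stream.get() instead.
fake_responses = [
    make_fake(640, 480, "https://example.com/a.jpg"),
    make_fake(1920, 1080, "https://example.com/b.jpg"),
]

buffer = io.StringIO()
written_count = write_jsonl(filter_by_size(fake_responses, min_width=1000), buffer)
print(written_count, "record(s) written")
```

Because both helpers consume any iterable lazily, they can wrap `stream.get()` directly, and an open file object can take the place of the in-memory buffer.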

Raw data

{
    "_id": null,
    "home_page": null,
    "name": "gi-scraper",
    "maintainer": null,
    "docs_url": null,
    "requires_python": null,
    "maintainer_email": null,
    "keywords": "python, selenium, web scraping, images, google image scraper, web scraper, image scraping, google images, image scraper, image API, API, automation, data extraction, scraping tool, image downloader, web automation, scraping framework, data scraping, image search, image retrieval",
    "author": "Roy6801",
    "author_email": "<mondal6801@gmail.com>",
    "download_url": "https://files.pythonhosted.org/packages/1d/03/9333737254bda8f9b45a5c405630828cef30288574d07b1ffb539ee1ab48/gi_scraper-0.4.6.tar.gz",
    "platform": null,
    "bugtrack_url": null,
    "license": null,
    "summary": "Google Image Scraper.",
    "version": "0.4.6",
    "project_urls": null,
    "split_keywords": [
        "python",
        " selenium",
        " web scraping",
        " images",
        " google image scraper",
        " web scraper",
        " image scraping",
        " google images",
        " image scraper",
        " image api",
        " api",
        " automation",
        " data extraction",
        " scraping tool",
        " image downloader",
        " web automation",
        " scraping framework",
        " data scraping",
        " image search",
        " image retrieval"
    ],
    "urls": [
        {
            "comment_text": "",
            "digests": {
                "blake2b_256": "c862e1928081d14a1b8539ca236c28b5e020f69217656b3e2006346b184b19c4",
                "md5": "318d088fbd05da23c5439b41666c8d70",
                "sha256": "67f42deb3b5b4898c2169e11a98416322eaff626b3931432a046aa072d519bba"
            },
            "downloads": -1,
            "filename": "gi_scraper-0.4.6-py3-none-any.whl",
            "has_sig": false,
            "md5_digest": "318d088fbd05da23c5439b41666c8d70",
            "packagetype": "bdist_wheel",
            "python_version": "py3",
            "requires_python": null,
            "size": 9910,
            "upload_time": "2024-03-24T19:03:17",
            "upload_time_iso_8601": "2024-03-24T19:03:17.278452Z",
            "url": "https://files.pythonhosted.org/packages/c8/62/e1928081d14a1b8539ca236c28b5e020f69217656b3e2006346b184b19c4/gi_scraper-0.4.6-py3-none-any.whl",
            "yanked": false,
            "yanked_reason": null
        },
        {
            "comment_text": "",
            "digests": {
                "blake2b_256": "1d039333737254bda8f9b45a5c405630828cef30288574d07b1ffb539ee1ab48",
                "md5": "35f6bfbe75ada73f7732d7da55e87dc5",
                "sha256": "b8351dfe076c1346c66b1d4271fafc4912bc9ff008f9ce190043440b5e3e336f"
            },
            "downloads": -1,
            "filename": "gi_scraper-0.4.6.tar.gz",
            "has_sig": false,
            "md5_digest": "35f6bfbe75ada73f7732d7da55e87dc5",
            "packagetype": "sdist",
            "python_version": "source",
            "requires_python": null,
            "size": 9038,
            "upload_time": "2024-03-24T19:03:18",
            "upload_time_iso_8601": "2024-03-24T19:03:18.875973Z",
            "url": "https://files.pythonhosted.org/packages/1d/03/9333737254bda8f9b45a5c405630828cef30288574d07b1ffb539ee1ab48/gi_scraper-0.4.6.tar.gz",
            "yanked": false,
            "yanked_reason": null
        }
    ],
    "upload_time": "2024-03-24 19:03:18",
    "github": false,
    "gitlab": false,
    "bitbucket": false,
    "codeberg": false,
    "lcname": "gi-scraper"
}
