gi-scraper


Name: gi-scraper
Version: 0.4.6
Summary: Google Image Scraper.
Upload time: 2024-03-24 19:03:18
Author: Roy6801
Keywords: python, selenium, web scraping, images, google image scraper, web scraper, image scraping, google images, image scraper, image API, API, automation, data extraction, scraping tool, image downloader, web automation, scraping framework, data scraping, image search, image retrieval
Requirements: No requirements were recorded.

# Google-Image-Scraper

## About

This module scrapes Google Images and exposes the results through a streamable image API.
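The term "streamable" is not defined further here. One common reading is that results are handed to the caller as they are produced, so the caller can start consuming (or stop early) without waiting for the full batch. The following is a minimal sketch of that general pattern using only the standard library. It is not gi-scraper's implementation; `produce`, `stream`, and the `result-N` placeholders are hypothetical names chosen for illustration.

```python
import queue
import threading
from typing import Iterator

_DONE = object()  # sentinel marking the end of the stream


def produce(q: queue.Queue, count: int) -> None:
    """Hypothetical producer: pushes placeholder results one at a time."""
    for i in range(count):
        q.put(f"result-{i}")
    q.put(_DONE)


def stream(count: int) -> Iterator[str]:
    """Yield each result as soon as the background producer supplies it."""
    q: queue.Queue = queue.Queue()
    worker = threading.Thread(target=produce, args=(q, count), daemon=True)
    worker.start()
    while True:
        item = q.get()  # blocks until the next result is available
        if item is _DONE:
            break
        yield item
    worker.join()


for item in stream(3):
    print(item)
```

Because `stream` is a generator, the consumer decides how much to read; breaking out of the loop simply stops pulling further items.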

### Supported Browsers

- **Chrome**

## How to Use?

```python
# import Scraper class
from gi_scraper import Scraper


# Optional: pass a Cache instance with a custom directory path and timeout.
# Set the cache timeout to -1 to cache indefinitely.
#
# from gi_scraper import Cache
#
# cache = Cache(dir_path="gi_cache", timeout=-1)
# sc = Scraper(workers=8, headless=False, cache=cache)

# Creating a Scraper object has startup overhead,
# so reuse the same object to run multiple queries
sc = Scraper(headless=False)

for query, count in {"Naruto": 20, "Gintoki": 30}.items():
    print("Scraping...", query, ":", count)

    # scrape method returns a stream object
    stream = sc.scrape(query, count)

    # stream.get() yields Response objects with the following attributes:
    # - query (str): The query associated with the response.
    # - name (str): The name attribute of the response.
    # - src_name (str): The source name attribute of the response.
    # - src_page (str): The source page attribute of the response.
    # - thumbnail (str): The thumbnail attribute of the response.
    # - image (str): The image attribute of the response.
    # - width (int): The width attribute of the response.
    # - height (int): The height attribute of the response.

    for index, response in enumerate(stream.get()):
        if index == 10:
            sc.terminate_query()  # Terminate current query midway
            break
        # response.to_dict returns a Python dictionary representation of the response
        print(response.width, "x", response.height, ":", response.image)


# Call this to terminate scraping (also called automatically by the destructor)
sc.terminate()
```
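Since each response carries `width`, `height`, and `image` attributes, a natural follow-up step is to filter results by size before using them. The sketch below shows one way to do that lazily. It does not import gi-scraper: `StandInResponse` is a hypothetical stand-in that mirrors only the attribute names listed in the comments above, and `at_least` and the example.com URLs are likewise made up for illustration.

```python
import json
from dataclasses import asdict, dataclass
from typing import Iterable, Iterator


@dataclass
class StandInResponse:
    """Hypothetical stand-in mirroring the documented Response attributes."""
    query: str
    name: str
    src_name: str
    src_page: str
    thumbnail: str
    image: str
    width: int
    height: int


def at_least(responses: Iterable, min_width: int, min_height: int) -> Iterator:
    """Lazily keep only responses that meet the minimum dimensions."""
    for response in responses:
        if response.width >= min_width and response.height >= min_height:
            yield response


# Made-up sample data standing in for a real result stream
samples = [
    StandInResponse("Naruto", "a", "site-a", "https://example.com/a",
                    "https://example.com/a_thumb.jpg",
                    "https://example.com/a.jpg", 1920, 1080),
    StandInResponse("Naruto", "b", "site-b", "https://example.com/b",
                    "https://example.com/b_thumb.jpg",
                    "https://example.com/b.jpg", 320, 240),
]

large = list(at_least(samples, min_width=1280, min_height=720))
print(json.dumps([asdict(r) for r in large], indent=2))
```

With the real library, the iterable returned by `stream.get()` in the example above would presumably take the place of `samples`, and the `to_dict` attribute mentioned there could replace `asdict`; both substitutions are assumptions that have not been checked against the package source.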

            

Raw data

{
    "_id": null,
    "home_page": null,
    "name": "gi-scraper",
    "maintainer": null,
    "docs_url": null,
    "requires_python": null,
    "maintainer_email": null,
    "keywords": "python, selenium, web scraping, images, google image scraper, web scraper, image scraping, google images, image scraper, image API, API, automation, data extraction, scraping tool, image downloader, web automation, scraping framework, data scraping, image search, image retrieval",
    "author": "Roy6801",
    "author_email": "<mondal6801@gmail.com>",
    "download_url": "https://files.pythonhosted.org/packages/1d/03/9333737254bda8f9b45a5c405630828cef30288574d07b1ffb539ee1ab48/gi_scraper-0.4.6.tar.gz",
    "platform": null,
    "bugtrack_url": null,
    "license": null,
    "summary": "Google Image Scraper.",
    "version": "0.4.6",
    "project_urls": null,
    "split_keywords": [
        "python",
        " selenium",
        " web scraping",
        " images",
        " google image scraper",
        " web scraper",
        " image scraping",
        " google images",
        " image scraper",
        " image api",
        " api",
        " automation",
        " data extraction",
        " scraping tool",
        " image downloader",
        " web automation",
        " scraping framework",
        " data scraping",
        " image search",
        " image retrieval"
    ],
    "urls": [
        {
            "comment_text": "",
            "digests": {
                "blake2b_256": "c862e1928081d14a1b8539ca236c28b5e020f69217656b3e2006346b184b19c4",
                "md5": "318d088fbd05da23c5439b41666c8d70",
                "sha256": "67f42deb3b5b4898c2169e11a98416322eaff626b3931432a046aa072d519bba"
            },
            "downloads": -1,
            "filename": "gi_scraper-0.4.6-py3-none-any.whl",
            "has_sig": false,
            "md5_digest": "318d088fbd05da23c5439b41666c8d70",
            "packagetype": "bdist_wheel",
            "python_version": "py3",
            "requires_python": null,
            "size": 9910,
            "upload_time": "2024-03-24T19:03:17",
            "upload_time_iso_8601": "2024-03-24T19:03:17.278452Z",
            "url": "https://files.pythonhosted.org/packages/c8/62/e1928081d14a1b8539ca236c28b5e020f69217656b3e2006346b184b19c4/gi_scraper-0.4.6-py3-none-any.whl",
            "yanked": false,
            "yanked_reason": null
        },
        {
            "comment_text": "",
            "digests": {
                "blake2b_256": "1d039333737254bda8f9b45a5c405630828cef30288574d07b1ffb539ee1ab48",
                "md5": "35f6bfbe75ada73f7732d7da55e87dc5",
                "sha256": "b8351dfe076c1346c66b1d4271fafc4912bc9ff008f9ce190043440b5e3e336f"
            },
            "downloads": -1,
            "filename": "gi_scraper-0.4.6.tar.gz",
            "has_sig": false,
            "md5_digest": "35f6bfbe75ada73f7732d7da55e87dc5",
            "packagetype": "sdist",
            "python_version": "source",
            "requires_python": null,
            "size": 9038,
            "upload_time": "2024-03-24T19:03:18",
            "upload_time_iso_8601": "2024-03-24T19:03:18.875973Z",
            "url": "https://files.pythonhosted.org/packages/1d/03/9333737254bda8f9b45a5c405630828cef30288574d07b1ffb539ee1ab48/gi_scraper-0.4.6.tar.gz",
            "yanked": false,
            "yanked_reason": null
        }
    ],
    "upload_time": "2024-03-24 19:03:18",
    "github": false,
    "gitlab": false,
    "bitbucket": false,
    "codeberg": false,
    "lcname": "gi-scraper"
}
        