llama-index-tools-scrapegraphai


- Name: llama-index-tools-scrapegraphai
- Version: 0.2.1
- Summary: llama-index tools integrating ScrapegraphAI
- Upload time: 2025-09-08 20:47:59
- Maintainer: Vincigit00
- Requires Python: >=3.10,<4.0
- Keywords: scraping

# LlamaIndex Tool - Scrapegraph

This tool integrates [Scrapegraph](https://scrapegraphai.com) with LlamaIndex, providing intelligent web scraping capabilities with structured data extraction.

## Installation

```bash
pip install llama-index-tools-scrapegraphai
```

## Usage

First, import and initialize the ScrapegraphToolSpec:

```python
from llama_index.tools.scrapegraph import ScrapegraphToolSpec

scrapegraph_tool = ScrapegraphToolSpec()
```

### Available Functions

The tool provides the following capabilities:

1. **Smart Scraper**

Extract structured data from a web page using a natural-language prompt and an optional schema:

```python
from pydantic import BaseModel


# Define your schema (optional)
class ProductSchema(BaseModel):
    name: str
    price: float
    description: str


schema = [ProductSchema]

# Perform the scraping
result = scrapegraph_tool.scrapegraph_smartscraper(
    prompt="Extract product information",
    url="https://example.com/product",
    api_key="your-api-key",
    schema=schema,  # Optional
)
```
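The shape of the value returned by `scrapegraph_smartscraper` is not documented here. Assuming it is either a single dict or a list of dicts whose keys match `ProductSchema`, a standard-library-only check of the result might look like the sketch below. The helper `normalize_products` and the sample response are hypothetical and are not part of the package.

```python
from typing import Any

# Mirrors the fields of ProductSchema above.
REQUIRED_FIELDS = ("name", "price", "description")


def normalize_products(result: Any) -> list[dict]:
    """Return a list of product dicts, raising ValueError on bad input.

    Assumption: `result` is one dict or a list of dicts.
    """
    items = result if isinstance(result, list) else [result]
    products = []
    for item in items:
        if not isinstance(item, dict):
            raise ValueError(f"expected a dict, got {type(item).__name__}")
        missing = [field for field in REQUIRED_FIELDS if field not in item]
        if missing:
            raise ValueError(f"missing fields: {missing}")
        products.append(
            {
                "name": str(item["name"]),
                "price": float(item["price"]),  # accepts "9.99" as well as 9.99
                "description": str(item["description"]),
            }
        )
    return products


# Hypothetical response for illustration only, not real API output.
sample = {"name": "Widget", "price": "9.99", "description": "A small widget"}
print(normalize_products(sample))
```

If the real response turns out to be nested differently, only the first line of `normalize_products` should need adjusting.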

2. **Markdownify**

Convert webpage content to markdown format:

```python
markdown_content = scrapegraph_tool.scrapegraph_markdownify(
    url="https://example.com", api_key="your-api-key"
)
```
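Assuming `scrapegraph_markdownify` returns the markdown as a plain string (the return type is not stated here), the result can be written to disk with a file name derived from the URL. The helpers `markdown_filename` and `save_markdown` below are hypothetical names used for this sketch only.

```python
import tempfile
from pathlib import Path
from urllib.parse import urlparse


def markdown_filename(url: str) -> str:
    """Derive a file name such as 'example.com_docs_intro.md' from a URL."""
    parsed = urlparse(url)
    path = parsed.path.strip("/").replace("/", "_")
    stem = f"{parsed.netloc}_{path}" if path else parsed.netloc
    return f"{stem}.md"


def save_markdown(content: str, url: str, out_dir: str = ".") -> Path:
    """Write markdown text to out_dir and return the path that was written."""
    target = Path(out_dir) / markdown_filename(url)
    target.parent.mkdir(parents=True, exist_ok=True)
    target.write_text(content, encoding="utf-8")
    return target


# Placeholder content stands in for the value returned by the tool.
with tempfile.TemporaryDirectory() as tmp:
    saved = save_markdown("# Example\n", "https://example.com/docs/intro", tmp)
    print(saved.name)  # example.com_docs_intro.md
```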

3. **Local Scrape**

Extract structured data from raw text:

```python
text = """
Your raw text content here...
"""

structured_data = scrapegraph_tool.scrapegraph_local_scrape(
    text=text, api_key="your-api-key"
)
```

## Requirements

- Python 3.10 or later (below 4.0)
- `scrapegraph-py` package
- Valid Scrapegraph API key
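Every call above takes the API key as an argument, so one way to keep it out of source code is to read it from the environment. The sketch below uses only the standard library. The variable name `SCRAPEGRAPH_API_KEY` and the helper `load_api_key` are assumptions made for this example, not names defined by the package.

```python
import os
from typing import Mapping

# Hypothetical environment variable name chosen for this example.
ENV_VAR = "SCRAPEGRAPH_API_KEY"


def load_api_key(env: Mapping[str, str] = os.environ) -> str:
    """Return the API key from the environment, failing early if it is unset."""
    key = env.get(ENV_VAR, "").strip()
    if not key:
        raise RuntimeError(f"Set the {ENV_VAR} environment variable first")
    return key


# Usage, with the key exported in the shell beforehand:
# markdown_content = scrapegraph_tool.scrapegraph_markdownify(
#     url="https://example.com", api_key=load_api_key()
# )
```

Passing the mapping as a parameter keeps the helper easy to exercise without touching the real environment.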

            

Raw data

{
    "_id": null,
    "home_page": null,
    "name": "llama-index-tools-scrapegraphai",
    "maintainer": "Vincigit00",
    "docs_url": null,
    "requires_python": "<4.0,>=3.10",
    "maintainer_email": null,
    "keywords": "scraping",
    "author": null,
    "author_email": "Marco Vinciguerra <mvincig11@gmail.com>",
    "download_url": "https://files.pythonhosted.org/packages/28/e4/0f4838086008d9e050a28b50b01160f89f979c7643f89594f3bc249f5929/llama_index_tools_scrapegraphai-0.2.1.tar.gz",
    "platform": null,
    "bugtrack_url": null,
    "license": null,
    "summary": "llama-index tools integrating ScrapegraphAI",
    "version": "0.2.1",
    "project_urls": null,
    "split_keywords": [
        "scraping"
    ],
    "urls": [
        {
            "comment_text": null,
            "digests": {
                "blake2b_256": "85f3e87370b59369aa48cb79b3d857b14813d10ca0d5f852ec545875bc4d9e01",
                "md5": "4c18ad948adfb1891efda89079fefe33",
                "sha256": "e8cf5e012ce1fa92216df6e8ee564417903bc481a6740fdc9f2f29ebfbd49a4c"
            },
            "downloads": -1,
            "filename": "llama_index_tools_scrapegraphai-0.2.1-py3-none-any.whl",
            "has_sig": false,
            "md5_digest": "4c18ad948adfb1891efda89079fefe33",
            "packagetype": "bdist_wheel",
            "python_version": "py3",
            "requires_python": "<4.0,>=3.10",
            "size": 3751,
            "upload_time": "2025-09-08T20:47:58",
            "upload_time_iso_8601": "2025-09-08T20:47:58.887626Z",
            "url": "https://files.pythonhosted.org/packages/85/f3/e87370b59369aa48cb79b3d857b14813d10ca0d5f852ec545875bc4d9e01/llama_index_tools_scrapegraphai-0.2.1-py3-none-any.whl",
            "yanked": false,
            "yanked_reason": null
        },
        {
            "comment_text": null,
            "digests": {
                "blake2b_256": "28e40f4838086008d9e050a28b50b01160f89f979c7643f89594f3bc249f5929",
                "md5": "d275386b08bc46cd73a14257148bc15e",
                "sha256": "e82eb9d5fa84ff87b3b8bb9ed1e779edf93374384dc8ac1f5626286f8756e2b3"
            },
            "downloads": -1,
            "filename": "llama_index_tools_scrapegraphai-0.2.1.tar.gz",
            "has_sig": false,
            "md5_digest": "d275386b08bc46cd73a14257148bc15e",
            "packagetype": "sdist",
            "python_version": "source",
            "requires_python": "<4.0,>=3.10",
            "size": 4079,
            "upload_time": "2025-09-08T20:47:59",
            "upload_time_iso_8601": "2025-09-08T20:47:59.556954Z",
            "url": "https://files.pythonhosted.org/packages/28/e4/0f4838086008d9e050a28b50b01160f89f979c7643f89594f3bc249f5929/llama_index_tools_scrapegraphai-0.2.1.tar.gz",
            "yanked": false,
            "yanked_reason": null
        }
    ],
    "upload_time": "2025-09-08 20:47:59",
    "github": false,
    "gitlab": false,
    "bitbucket": false,
    "codeberg": false,
    "lcname": "llama-index-tools-scrapegraphai"
}
        