# 🌐 ScrapeGraph Python SDK
[PyPI version](https://badge.fury.io/py/scrapegraph-py)
[PyPI project](https://pypi.org/project/scrapegraph-py/)
[License: MIT](https://opensource.org/licenses/MIT)
[Code style: black](https://github.com/psf/black)
[Documentation](https://docs.scrapegraphai.com)
<p align="left">
<img src="https://raw.githubusercontent.com/VinciGit00/Scrapegraph-ai/main/docs/assets/api-banner.png" alt="ScrapeGraph API Banner" style="width: 70%;">
</p>
Official [Python SDK](https://scrapegraphai.com) for the ScrapeGraph API - smart web scraping powered by AI.
## 📦 Installation
```bash
pip install scrapegraph-py
```
## 🚀 Features
- 🤖 AI-powered web scraping and search
- 🔄 Both sync and async clients
- 📊 Structured output with Pydantic schemas
- 🔍 Detailed logging
- ⚡ Automatic retries
- 🔐 Secure authentication
## 🎯 Quick Start
```python
from scrapegraph_py import Client
client = Client(api_key="your-api-key-here")
```
> [!NOTE]
> You can set the `SGAI_API_KEY` environment variable and initialize the client without parameters: `client = Client()`
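For example, here is a minimal sketch of environment-based initialization (assuming `SGAI_API_KEY` is read when the client is constructed):

```python
import os

from scrapegraph_py import Client

# For illustration only: in practice, export SGAI_API_KEY in your shell or
# load it from a .env file instead of hard-coding it here.
os.environ["SGAI_API_KEY"] = "your-api-key-here"

# No api_key argument needed; the client picks it up from the environment.
client = Client()
```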
## 📚 Available Endpoints
### 🤖 SmartScraper
Extract structured data from any webpage or HTML content using AI.
```python
from scrapegraph_py import Client
client = Client(api_key="your-api-key-here")
# Using a URL
response = client.smartscraper(
    website_url="https://example.com",
    user_prompt="Extract the main heading and description"
)

# Or using HTML content
html_content = """
<html>
    <body>
        <h1>Company Name</h1>
        <p>We are a technology company focused on AI solutions.</p>
    </body>
</html>
"""

response = client.smartscraper(
    website_html=html_content,
    user_prompt="Extract the company description"
)
print(response)
```
<details>
<summary>Output Schema (Optional)</summary>
```python
from pydantic import BaseModel, Field
from scrapegraph_py import Client
client = Client(api_key="your-api-key-here")
class WebsiteData(BaseModel):
    title: str = Field(description="The page title")
    description: str = Field(description="The meta description")

response = client.smartscraper(
    website_url="https://example.com",
    user_prompt="Extract the title and description",
    output_schema=WebsiteData
)
```
</details>
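If you pass an `output_schema`, you can also re-validate the returned data on your side. Continuing from the schema example above, here is a minimal sketch that assumes the extracted fields arrive under a `result` key (as in the SearchScraper example below); adjust the key to match the actual response shape:

```python
# Assumption: the extracted fields live under response["result"].
data = WebsiteData.model_validate(response["result"])
print(data.title)
print(data.description)
```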
### 🔍 SearchScraper
Perform AI-powered web searches with structured results and reference URLs.
```python
from scrapegraph_py import Client
client = Client(api_key="your-api-key-here")
response = client.searchscraper(
    user_prompt="What is the latest version of Python and its main features?"
)
print(f"Answer: {response['result']}")
print(f"Sources: {response['reference_urls']}")
```
<details>
<summary>Output Schema (Optional)</summary>
```python
from pydantic import BaseModel, Field
from scrapegraph_py import Client
client = Client(api_key="your-api-key-here")
class PythonVersionInfo(BaseModel):
    version: str = Field(description="The latest Python version number")
    release_date: str = Field(description="When this version was released")
    major_features: list[str] = Field(description="List of main features")

response = client.searchscraper(
    user_prompt="What is the latest version of Python and its main features?",
    output_schema=PythonVersionInfo
)
```
</details>
### 📝 Markdownify
Convert any webpage into clean, formatted Markdown.
```python
from scrapegraph_py import Client
client = Client(api_key="your-api-key-here")
response = client.markdownify(
    website_url="https://example.com"
)
print(response)
```
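A common follow-up is to persist the converted page to disk. Here is a minimal sketch, assuming the Markdown text is returned either directly as a string or under a `result` key (check the actual response shape for your SDK version):

```python
from scrapegraph_py import Client

client = Client(api_key="your-api-key-here")

response = client.markdownify(website_url="https://example.com")

# Assumption: the Markdown is either the response itself or under "result".
markdown_text = response["result"] if isinstance(response, dict) else response

with open("example.md", "w", encoding="utf-8") as f:
    f.write(markdown_text)
```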
## ⚡ Async Support
All endpoints support async operations:
```python
import asyncio
from scrapegraph_py import AsyncClient
async def main():
    async with AsyncClient() as client:
        response = await client.smartscraper(
            website_url="https://example.com",
            user_prompt="Extract the main content"
        )
        print(response)

asyncio.run(main())
```
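Because the async client works with standard asyncio, you can also fan out several requests concurrently. A minimal sketch using `asyncio.gather` (the URLs and prompt are placeholders):

```python
import asyncio

from scrapegraph_py import AsyncClient

async def main():
    urls = ["https://example.com", "https://example.org"]
    async with AsyncClient() as client:
        # Issue all requests concurrently and wait for every result.
        responses = await asyncio.gather(
            *(
                client.smartscraper(
                    website_url=url,
                    user_prompt="Extract the main content",
                )
                for url in urls
            )
        )
    for url, response in zip(urls, responses):
        print(url, response)

asyncio.run(main())
```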
## 📖 Documentation
For detailed documentation, visit [docs.scrapegraphai.com](https://docs.scrapegraphai.com).
## 🛠️ Development
For information about setting up the development environment and contributing to the project, see our [Contributing Guide](CONTRIBUTING.md).
## 💬 Support & Feedback
- 📧 Email: support@scrapegraphai.com
- 💻 GitHub Issues: [Create an issue](https://github.com/ScrapeGraphAI/scrapegraph-sdk/issues)
- 🌟 Feature Requests: [Request a feature](https://github.com/ScrapeGraphAI/scrapegraph-sdk/issues/new)
- ⭐ API Feedback: You can also submit feedback programmatically using the feedback endpoint:
```python
from scrapegraph_py import Client
client = Client(api_key="your-api-key-here")
client.submit_feedback(
    request_id="your-request-id",
    rating=5,
    feedback_text="Great results!"
)
```
## 📄 License
This project is licensed under the MIT License - see the [LICENSE](LICENSE) file for details.
## 🔗 Links
- [Website](https://scrapegraphai.com)
- [Documentation](https://docs.scrapegraphai.com)
- [GitHub](https://github.com/ScrapeGraphAI/scrapegraph-sdk)
---
Made with ❤️ by [ScrapeGraph AI](https://scrapegraphai.com)