# 🌐 ScrapeGraph Python SDK
[![PyPI version](https://badge.fury.io/py/scrapegraph-py.svg)](https://badge.fury.io/py/scrapegraph-py)
[![Python Support](https://img.shields.io/pypi/pyversions/scrapegraph-py.svg)](https://pypi.org/project/scrapegraph-py/)
[![License](https://img.shields.io/badge/License-MIT-blue.svg)](https://opensource.org/licenses/MIT)
[![Code style: black](https://img.shields.io/badge/code%20style-black-000000.svg)](https://github.com/psf/black)
[![Documentation Status](https://readthedocs.org/projects/scrapegraph-py/badge/?version=latest)](https://scrapegraph-py.readthedocs.io/en/latest/?badge=latest)
Official Python SDK for the ScrapeGraph API - Smart web scraping powered by AI.
## 📦 Installation
```bash
pip install scrapegraph-py
```
## 🚀 Features
- 🤖 AI-powered web scraping
- 🔄 Both sync and async clients
- 📊 Structured output with Pydantic schemas
- 🔍 Detailed logging
- ⚡ Automatic retries
- 🔐 Secure authentication
## 🎯 Quick Start
```python
from scrapegraph_py import Client
client = Client(api_key="your-api-key-here")
```
> [!NOTE]
> You can set the `SGAI_API_KEY` environment variable and initialize the client without parameters: `client = Client()`
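For example, after exporting the key in your shell (`export SGAI_API_KEY="your-api-key-here"`), the client can be created with no arguments:

```python
from scrapegraph_py import Client

# Assumes the SGAI_API_KEY environment variable is already set,
# as described in the note above.
client = Client()
```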
## 📚 Available Endpoints
### 🔍 SmartScraper
Scrapes any webpage using AI to extract specific information.
```python
from scrapegraph_py import Client
client = Client(api_key="your-api-key-here")
# Basic usage
response = client.smartscraper(
    website_url="https://example.com",
    user_prompt="Extract the main heading and description"
)
print(response)
```
<details>
<summary>Output Schema (Optional)</summary>
```python
from pydantic import BaseModel, Field
from scrapegraph_py import Client
client = Client(api_key="your-api-key-here")
class WebsiteData(BaseModel):
    title: str = Field(description="The page title")
    description: str = Field(description="The meta description")

response = client.smartscraper(
    website_url="https://example.com",
    user_prompt="Extract the title and description",
    output_schema=WebsiteData
)
```
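
Because `output_schema` is a plain Pydantic model, the same class can be reused to re-validate the returned data. This is only a sketch: the `result` key below is an assumption about the response layout, not a documented guarantee.

```python
# Hypothetical post-processing step; adjust the key to whatever
# structure your SDK version actually returns.
data = WebsiteData.model_validate(response["result"])
print(data.title, data.description)
```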
</details>
### 📝 Markdownify
Converts any webpage into clean, formatted markdown.
```python
from scrapegraph_py import Client
client = Client(api_key="your-api-key-here")
response = client.markdownify(
    website_url="https://example.com"
)
print(response)
```
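A common follow-up is persisting the markdown to disk. The `result` key used below is an assumption about the response shape; inspect the response you get back and adjust the key accordingly.

```python
from pathlib import Path

# "result" is assumed to hold the markdown string (not a documented guarantee).
markdown = response.get("result", "")
Path("example.md").write_text(markdown, encoding="utf-8")
```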
### 💻 LocalScraper
Extracts information from HTML content using AI.
```python
from scrapegraph_py import Client
client = Client(api_key="your-api-key-here")
html_content = """
<html>
    <body>
        <h1>Company Name</h1>
        <p>We are a technology company focused on AI solutions.</p>
        <div class="contact">
            <p>Email: contact@example.com</p>
        </div>
    </body>
</html>
"""

response = client.localscraper(
    user_prompt="Extract the company description",
    website_html=html_content
)
print(response)
```
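LocalScraper is also handy for pages you have already downloaded. A minimal sketch reusing the same `localscraper` call with HTML read from a local file (the file name is purely illustrative):

```python
from pathlib import Path

from scrapegraph_py import Client

client = Client(api_key="your-api-key-here")

# Load a previously saved page from disk (file name is just an example).
html_content = Path("saved_page.html").read_text(encoding="utf-8")

response = client.localscraper(
    user_prompt="Extract the company description",
    website_html=html_content,
)
print(response)
```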
## ⚡ Async Support
All endpoints support async operations:
```python
import asyncio
from scrapegraph_py import AsyncClient
async def main():
    async with AsyncClient() as client:
        response = await client.smartscraper(
            website_url="https://example.com",
            user_prompt="Extract the main content"
        )
        print(response)

asyncio.run(main())
```
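Because requests are non-blocking, several pages can be scraped concurrently. A sketch using `asyncio.gather` with the same `smartscraper` call shown above (the URLs are placeholders, and `SGAI_API_KEY` is assumed to be set in the environment):

```python
import asyncio

from scrapegraph_py import AsyncClient

async def scrape_many(urls: list[str]) -> list:
    # One client shared across all requests; the calls run concurrently.
    async with AsyncClient() as client:
        tasks = [
            client.smartscraper(
                website_url=url,
                user_prompt="Extract the main content",
            )
            for url in urls
        ]
        return await asyncio.gather(*tasks)

results = asyncio.run(scrape_many(["https://example.com", "https://example.org"]))
print(results)
```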
## 📖 Documentation
For detailed documentation, visit [scrapegraphai.com/docs](https://scrapegraphai.com/docs)
## 🛠️ Development
For information about setting up the development environment and contributing to the project, see our [Contributing Guide](CONTRIBUTING.md).
## 💬 Support & Feedback
- 📧 Email: support@scrapegraphai.com
- 💻 GitHub Issues: [Create an issue](https://github.com/ScrapeGraphAI/scrapegraph-sdk/issues)
- 🌟 Feature Requests: [Request a feature](https://github.com/ScrapeGraphAI/scrapegraph-sdk/issues/new)
- ⭐ API Feedback: You can also submit feedback programmatically using the feedback endpoint:
```python
from scrapegraph_py import Client
client = Client(api_key="your-api-key-here")
client.submit_feedback(
    request_id="your-request-id",
    rating=5,
    feedback_text="Great results!"
)
```
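The `request_id` normally comes from an earlier scraping response. Whether it is exposed under a `request_id` key is an assumption about the response layout; confirm against the response you actually receive.

```python
# Hypothetical: reuse the id returned by a previous smartscraper call.
# The "request_id" key is an assumption, not a documented guarantee.
request_id = response.get("request_id")
if request_id:
    client.submit_feedback(
        request_id=request_id,
        rating=5,
        feedback_text="Great results!",
    )
```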
## 📄 License
This project is licensed under the MIT License - see the [LICENSE](LICENSE) file for details.
## 🔗 Links
- [Website](https://scrapegraphai.com)
- [Documentation](https://scrapegraphai.com/docs)
- [GitHub](https://github.com/ScrapeGraphAI/scrapegraph-sdk)
---
Made with ❤️ by [ScrapeGraph AI](https://scrapegraphai.com)