# SimpleCrawl
A typed client for the [`firecrawl-simple`](https://github.com/nustato/firecrawl-simple) self-hosted API.
## Installation
```bash
pip install firecrawl-simple-client
```
## Quick Start
### Synchronous Usage
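Point the client at your instance by setting the `FIRECRAWL_URL_BASE` environment variable, e.g. `export FIRECRAWL_URL_BASE="url"`, or by passing `base_url` explicitly: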
```python
from simplecrawl import Client

# Initialize the client. If base_url is omitted and FIRECRAWL_URL_BASE is not
# set in the environment, the base URL defaults to https://api.firecrawl.dev/v1.
client = Client(base_url="some-url")

# Scrape a single page
result = client.scrape("https://example.com")
print(result.markdown)
print(result.metadata.title)

# Crawl multiple pages
job = client.crawl(
    "https://example.com",
    include_paths=["/blog/*"],
    max_depth=2,
    limit=10,
)
```
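The base URL fallback described in the comment above can be pictured roughly as follows. This is a minimal sketch, not the library's actual code: the helper name `resolve_base_url` and the precedence order (explicit argument, then `FIRECRAWL_URL_BASE`, then the hosted default) are assumptions made for illustration.

```python
import os
from typing import Optional

DEFAULT_BASE_URL = "https://api.firecrawl.dev/v1"


def resolve_base_url(base_url: Optional[str] = None) -> str:
    """Hypothetical helper: choose the base URL a client would talk to."""
    # Assumed precedence: explicit argument > environment variable > default.
    if base_url:
        return base_url.rstrip("/")
    from_env = os.environ.get("FIRECRAWL_URL_BASE")
    # Treat an unset or empty variable as "not found in environment".
    return (from_env or DEFAULT_BASE_URL).rstrip("/")
```

Under these assumptions, a self-hosted deployment only needs the environment variable set once, and individual scripts can still override it per client.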
### Async Usage
```python
import asyncio

from simplecrawl import AsyncClient


async def main():
    # The async client is used as a context manager so it is closed on exit
    async with AsyncClient(token="your-api-token") as client:
        result = await client.scrape("https://example.com")
        print(result.markdown)


asyncio.run(main())
```
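Because `scrape` is awaited on the async client, several pages can in principle be fetched concurrently with `asyncio.gather`. The sketch below shows only the pattern: `FakeAsyncClient` and `FakeResult` are stand-ins invented for this example so that it runs without a server, and it is an assumption that a real `AsyncClient` handles concurrent calls safely.

```python
import asyncio
from dataclasses import dataclass


@dataclass
class FakeResult:
    markdown: str


class FakeAsyncClient:
    """Stand-in for an async client; not part of simplecrawl."""

    async def __aenter__(self):
        return self

    async def __aexit__(self, exc_type, exc, tb):
        return False

    async def scrape(self, url: str) -> FakeResult:
        await asyncio.sleep(0)  # yield control, as a real network call would
        return FakeResult(markdown=f"# Content of {url}")


async def scrape_all(client, urls):
    # gather() runs the coroutines concurrently and preserves input order
    return await asyncio.gather(*(client.scrape(u) for u in urls))


async def main():
    urls = ["https://example.com", "https://example.com/blog"]
    async with FakeAsyncClient() as client:
        return await scrape_all(client, urls)


results = asyncio.run(main())
for r in results:
    print(r.markdown)
```

Swapping `FakeAsyncClient()` for a real client instance would leave `scrape_all` unchanged, provided the real `scrape` method takes a URL and returns an object with a `markdown` attribute as in the example above.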
## Features
- Synchronous and asynchronous clients
- Single page scraping
- Multi-page crawling
- URL discovery/mapping
- Content format options (Markdown, HTML, Links, etc.)
- Customizable scraping options
## Documentation
For detailed examples, see the examples folder in the [repository](https://github.com/darinkishore/simplecrawl).
## License
This project is licensed under the MIT License - see the [LICENSE](LICENSE) file for details.
Raw data
{
"_id": null,
"home_page": null,
"name": "firecrawl-simple-client",
"maintainer": null,
"docs_url": null,
"requires_python": ">=3.11",
"maintainer_email": null,
"keywords": "web-scraping, crawling, async, http",
"author": null,
"author_email": "Darin <86675935+darinkishore@users.noreply.github.com>",
"download_url": "https://files.pythonhosted.org/packages/36/ca/11fcbe6aa2a6a7eedcc730a635a6c097524d4b9e063edf8ab95e9a415305/firecrawl_simple_client-0.1.3.tar.gz",
"platform": null,
"bugtrack_url": null,
"license": "MIT",
"summary": "Python client for Firecrawl-Simple",
"version": "0.1.3",
"project_urls": {
"Documentation": "https://github.com/darinkishore/simplecrawl#readme",
"Homepage": "https://github.com/darinkishore/simplecrawl",
"Repository": "https://github.com/darinkishore/simplecrawl"
},
"split_keywords": [
"web-scraping",
" crawling",
" async",
" http"
],
"urls": [
{
"comment_text": "",
"digests": {
"blake2b_256": "86ff04ea4d7e9ba9396a58239656da2b780867d74a42943e100e621a4ecebb02",
"md5": "0baeb8664d2fab68c5683bcd00ab5436",
"sha256": "8f52f2ad3982e456f5391e4e9445f06bdbdf189bcda915636a5c46f8a4962e69"
},
"downloads": -1,
"filename": "firecrawl_simple_client-0.1.3-py3-none-any.whl",
"has_sig": false,
"md5_digest": "0baeb8664d2fab68c5683bcd00ab5436",
"packagetype": "bdist_wheel",
"python_version": "py3",
"requires_python": ">=3.11",
"size": 10074,
"upload_time": "2024-11-15T01:32:11",
"upload_time_iso_8601": "2024-11-15T01:32:11.918926Z",
"url": "https://files.pythonhosted.org/packages/86/ff/04ea4d7e9ba9396a58239656da2b780867d74a42943e100e621a4ecebb02/firecrawl_simple_client-0.1.3-py3-none-any.whl",
"yanked": false,
"yanked_reason": null
},
{
"comment_text": "",
"digests": {
"blake2b_256": "36ca11fcbe6aa2a6a7eedcc730a635a6c097524d4b9e063edf8ab95e9a415305",
"md5": "8c5c20e7e56bbe55d9b2d49e1f557216",
"sha256": "e4252b0f641b4cdc1aed66fca766dba96743252e2b35abad2e7bae4080ea23d7"
},
"downloads": -1,
"filename": "firecrawl_simple_client-0.1.3.tar.gz",
"has_sig": false,
"md5_digest": "8c5c20e7e56bbe55d9b2d49e1f557216",
"packagetype": "sdist",
"python_version": "source",
"requires_python": ">=3.11",
"size": 7157,
"upload_time": "2024-11-15T01:32:13",
"upload_time_iso_8601": "2024-11-15T01:32:13.116008Z",
"url": "https://files.pythonhosted.org/packages/36/ca/11fcbe6aa2a6a7eedcc730a635a6c097524d4b9e063edf8ab95e9a415305/firecrawl_simple_client-0.1.3.tar.gz",
"yanked": false,
"yanked_reason": null
}
],
"upload_time": "2024-11-15 01:32:13",
"github": true,
"gitlab": false,
"bitbucket": false,
"codeberg": false,
"github_user": "darinkishore",
"github_project": "simplecrawl#readme",
"travis_ci": false,
"coveralls": false,
"github_actions": true,
"lcname": "firecrawl-simple-client"
}