# Scrapfly SDK
## Installation
`pip install scrapfly-sdk`
You can also install extra dependencies:

* `pip install "scrapfly-sdk[seepdup]"` for performance improvements
* `pip install "scrapfly-sdk[concurrency]"` for out-of-the-box concurrency (asyncio / thread)
* `pip install "scrapfly-sdk[scrapy]"` for Scrapy integration
* `pip install "scrapfly-sdk[all]"` for everything above
To use the built-in HTML parser (via the `ScrapeApiResponse.selector` property), either [parsel](https://pypi.org/project/parsel/) or [scrapy](https://pypi.org/project/Scrapy/) must be installed.

For usage references and examples, please check out the `/examples` folder in this repository.
## Get Your API Key
You can create a free account on [Scrapfly](https://scrapfly.io/register) to get your API Key.
* [Usage](https://scrapfly.io/docs/sdk/python)
* [Python API](https://scrapfly.github.io/python-scrapfly/scrapfly)
* [Open API 3 Spec](https://scrapfly.io/docs/openapi#get-/scrape)
* [Scrapy Integration](https://scrapfly.io/docs/sdk/scrapy)
## Migration
### Migrate from 0.7.x to 0.8
The `asyncio-pool` dependency has been dropped.

`scrapfly.concurrent_scrape` is now an async generator. If `concurrency` is `None` or not set, the maximum concurrency
allowed by your current subscription is used.
```python
async for result in scrapfly.concurrent_scrape(concurrency=10, scrape_configs=[ScrapeConfig(...), ...]):
    print(result)
```
The `brotli` argument is deprecated and will be removed in the next minor release. In most cases it offers no size
benefit over gzip and uses more CPU.
## What's new

### 0.8.x
* Better error logging
* Async improvements for concurrent scraping with asyncio
* Scrapy media pipelines are now supported out of the box