## Baidu Image Crawling
A super-lightweight Baidu image crawler, modified from <https://github.com/kong36088/BaiduImageCrawling>.
### Installation
```bash
pip install baidu_image_crawling
```
### Python usage
```python
from baidu_image_crawling.main import Crawler
crawler = Crawler(0.05, save_dir="outputs")  # crawl delay of 0.05 seconds between requests

# Crawl the keyword "美女": 2 pages in total, starting from page 1, 30 images per page,
# i.e. 2 * 30 = 60 images overall
crawler(word="美女", total_page=2, start_page=1, per_page=30)
```
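Building on the call above, a minimal sketch for crawling several keywords with a single `Crawler` instance; the keyword list and page counts here are only illustrative:

```python
from baidu_image_crawling.main import Crawler

# One crawler instance reused across keywords; 0.05 s delay, images saved under "outputs"
crawler = Crawler(0.05, save_dir="outputs")

# Hypothetical keyword list for illustration
keywords = ["美女", "风景"]

for word in keywords:
    # Fetch 1 page of 30 images for each keyword, starting from page 1
    crawler(word=word, total_page=1, start_page=1, per_page=30)
```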
### Command-line usage
```bash
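# Crawl the keyword 美女: 1 page starting from page 1, 2 images per page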
baidu_image_crawling -w 美女 -tp 1 -sp 1 -pp 2
```
View the argument documentation:
```bash
$ baidu_image_crawling -h
usage: baidu_image_crawling [-h] -w WORD -tp TOTAL_PAGE -sp START_PAGE [-pp [PER_PAGE]] [-sd SAVE_DIR] [-d DELAY]
options:
  -h, --help            show this help message and exit
  -w WORD, --word WORD  keyword to crawl
  -tp TOTAL_PAGE, --total_page TOTAL_PAGE
                        total number of pages to crawl
  -sp START_PAGE, --start_page START_PAGE
                        starting page number
  -pp [PER_PAGE], --per_page [PER_PAGE]
                        number of images per page
  -sd SAVE_DIR, --save_dir SAVE_DIR
                        directory to save images
  -d DELAY, --delay DELAY
                        crawl delay (interval between requests)
```
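As a fuller illustration, a run that sets every documented flag might look like the following; the values are only examples:

```bash
baidu_image_crawling -w 美女 -tp 2 -sp 1 -pp 30 -sd outputs -d 0.05
```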