Name | xhs-spider |
Version | 1.1.0 |
home_page | |
Summary | Little Red Book notes, home page, detailed page crawler |
upload_time | 2023-10-08 06:35:59 |
maintainer | |
docs_url | None |
author | cv_cat |
requires_python | |
license | |
keywords | python, xhs, spider |
VCS | |
bugtrack_url | |
requirements | No requirements were recorded. |
Travis-CI | No Travis. |
coveralls test coverage | No coveralls. |
# Spider_XHS
![image](https://img.shields.io/badge/cv_cat-Spider_XHS-blue)
Crawls watermark-free images and videos from Xiaohongshu (Little Red Book) user profile pages.
## Screenshots
![image](https://github.com/cv-cat/Spider_XHS/assets/94289429/ef8990bc-d568-4b63-9dfc-4e2e4f235f99)
![image](https://github.com/cv-cat/Spider_XHS/assets/94289429/a5eb7df4-434a-4e6e-91e1-b60b40ca08e8)
![image](https://github.com/cv-cat/Spider_XHS/assets/94289429/d8c2e84e-3e78-4ca8-8c93-406a3e74da91)
![image](https://github.com/cv-cat/Spider_XHS/assets/94289429/7a0ea368-5507-469f-84f4-6dda59568b86)
## Requirements
A Python environment
A Node.js environment
Usage: put every ID you want to crawl into the list, as in the URLs below.
```
# Profile pages
from xhs_spider.home import Home
home = Home()
url_list = [
    'https://www.xiaohongshu.com/user/profile/6185ce66000000001000705b',
    'https://www.xiaohongshu.com/user/profile/6034d6f20000000001006fbb',
]
home.main(url_list)
# Individual notes
from xhs_spider.note import Note
# Note is the class imported above; it is assumed here to be the intended entry point
one_note = Note()
url_list = [
    'https://www.xiaohongshu.com/explore/64356527000000001303282b',
]
one_note.main(url_list)
# Search results
from xhs_spider.search import Search
search = Search()
# Search keyword
query = '你好'
# Number of results to fetch (the first N)
number = 22
search.main(query, number)
```
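If you only have the bare IDs, the URL lists above can be assembled from them. The following is a minimal sketch, not part of xhs-spider: `build_urls`, `PROFILE_URL` and `NOTE_URL` are hypothetical names, and the two URL patterns are taken from the example URLs in the usage snippet.

```python
# Hypothetical helper, not part of xhs-spider.
# URL patterns are copied from the example URLs in the usage snippet.
PROFILE_URL = "https://www.xiaohongshu.com/user/profile/{}"
NOTE_URL = "https://www.xiaohongshu.com/explore/{}"


def build_urls(ids, template):
    """Return one URL per unique, non-empty ID, preserving input order."""
    seen = set()
    urls = []
    for raw in ids:
        item = raw.strip()
        if not item or item in seen:
            continue  # skip blanks and duplicates
        seen.add(item)
        urls.append(template.format(item))
    return urls


profile_urls = build_urls(
    ["6185ce66000000001000705b", "6034d6f20000000001006fbb"],
    PROFILE_URL,
)
note_urls = build_urls(["64356527000000001303282b"], NOTE_URL)
```

The resulting lists can then be passed to `home.main(...)` and `one_note.main(...)` in place of the hand-written `url_list` values.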
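## Changelog
1. 23/08/08 First commit.
2. 23/09/13 [API change: two new fields added to params] Fixed images failing to download and errors raised when some pages could not be accessed.
3. 23/09/16 [Encoding problem with larger videos] Fixed the video encoding issue and added exception handling.
4. 23/09/18 Refactored the code and added retry on failure.
5. 23/09/19 Added support for downloading search results.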
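## Notes
**This project is for learning and discussion only; infringing content will be removed on request.**
Other
1. Put your own cookies into cookies.txt in the project directory. You can find them under Application in the browser settings or in the network requests; see the cookie.txt file for which cookies are needed.
2. You can obtain the cookie as shown in the screenshots below, then run the corresponding file.
![image](https://github.com/cv-cat/Spider_XHS/assets/94289429/e2ceaa15-defc-4d41-a6db-4a9d3f3055e4)
![image](https://github.com/cv-cat/Spider_XHS/assets/94289429/78e791a6-ba51-455a-a438-3c829db5c387)
3. Stars are welcome; the project is updated from time to time.
4. For questions, get in touch via QQ or WeChat (992822653).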
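Before running the crawler it can help to check that the cookie text you copied parses cleanly. The sketch below assumes the file holds a single `Cookie` header string of the form `name=value; name2=value2`. The README does not document the actual format that xhs-spider expects, so treat both the format and the helper names `parse_cookie_header` and `load_cookies` as hypothetical.

```python
from pathlib import Path


def parse_cookie_header(text):
    """Split a 'name=value; name2=value2' string into a dict.

    Assumption: this is the format stored in cookies.txt; the README
    does not specify it.
    """
    cookies = {}
    for pair in text.split(";"):
        # Split on the first '=' only, since values may contain '='.
        name, sep, value = pair.strip().partition("=")
        if sep and name:
            cookies[name] = value
    return cookies


def load_cookies(path="cookies.txt"):
    """Read the file and return the parsed cookies."""
    return parse_cookie_header(Path(path).read_text(encoding="utf-8").strip())
```

An empty result from `load_cookies()` would suggest that the file is blank or not in the assumed format.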
Raw data
{
"_id": null,
"home_page": "",
"name": "xhs-spider",
"maintainer": "",
"docs_url": null,
"requires_python": "",
"maintainer_email": "",
"keywords": "python,xhs,spider",
"author": "cv_cat",
"author_email": "",
"download_url": "https://files.pythonhosted.org/packages/61/7d/567062956d3326676bc7fe9627d7f44fd4ec534358f350a9b50dfece86ee/xhs_spider-1.1.0.tar.gz",
"platform": null,
"description": "\r\n# Spider_XHS\r\r\n![image](https://img.shields.io/badge/cv_cat-Spider_XHS-blue)\r\r\n\r\r\n\u5c0f\u7ea2\u4e66\u4e2a\u4eba\u4e3b\u9875\u56fe\u7247\u548c\u89c6\u9891\u65e0\u6c34\u5370\u722c\u53d6\r\r\n\r\r\n## \u6548\u679c\u56fe\r\r\n![image](https://github.com/cv-cat/Spider_XHS/assets/94289429/ef8990bc-d568-4b63-9dfc-4e2e4f235f99)\r\r\n\r\r\n![image](https://github.com/cv-cat/Spider_XHS/assets/94289429/a5eb7df4-434a-4e6e-91e1-b60b40ca08e8)\r\r\n\r\r\n![image](https://github.com/cv-cat/Spider_XHS/assets/94289429/d8c2e84e-3e78-4ca8-8c93-406a3e74da91)\r\r\n\r\r\n![image](https://github.com/cv-cat/Spider_XHS/assets/94289429/7a0ea368-5507-469f-84f4-6dda59568b86)\r\r\n\r\r\n## \u8fd0\u884c\u73af\u5883\r\r\nPython\u73af\u5883\r\r\nNodeJS\u73af\u5883\r\r\n\r\r\n\u8fd0\u884c\u65b9\u6cd5\uff1a\u628a\u4f60\u60f3\u8981\u7684id\u5168\u90e8\u653e\u5230\u5217\u8868\u91cc\r\r\n```\r\r\n# \u4e3b\u9875\u5904\u7406\r\r\nfrom xhs_spider.home import Home\r\r\nhome = Home()\r\r\nurl_list = [\r\r\n 'https://www.xiaohongshu.com/user/profile/6185ce66000000001000705b',\r\r\n 'https://www.xiaohongshu.com/user/profile/6034d6f20000000001006fbb',\r\r\n]\r\r\nhome.main(url_list)\r\r\n# \u7b14\u8bb0\u5904\u7406\r\r\nfrom xhs_spider.note import Note\r\r\none_note = OneNote()\r\r\nurl_list = [\r\r\n 'https://www.xiaohongshu.com/explore/64356527000000001303282b',\r\r\n]\r\r\none_note.main(url_list)\r\r\n# \u641c\u7d22\u7ed3\u679c\u5904\u7406\r\r\nfrom xhs_spider.search import Search\r\r\nsearch = Search()\r\r\nquery = '\u4f60\u597d'\r\r\n# \u641c\u7d22\u7684\u6570\u91cf\uff08\u524d\u591a\u5c11\u4e2a\uff09\r\r\nnumber = 22\r\r\nsearch.main(query, number)\r\r\n```\r\r\n## \u65e5\u5fd7\r\r\n1. 23/08/08 first commit\r\r\n2. 23/09/13 \u3010api\u66f4\u6539params\u589e\u52a0\u4e24\u4e2a\u5b57\u6bb5\u3011\u4fee\u590d\u56fe\u7247\u65e0\u6cd5\u4e0b\u8f7d\uff0c\u6709\u4e9b\u9875\u9762\u65e0\u6cd5\u8bbf\u95ee\u5bfc\u81f4\u62a5\u9519\u3002\r\r\n3. 
23/09/16 \u3010\u8f83\u5927\u89c6\u9891\u51fa\u73b0\u7f16\u7801\u95ee\u9898\u3011\u4fee\u590d\u89c6\u9891\u7f16\u7801\u95ee\u9898\uff0c\u52a0\u5165\u5f02\u5e38\u5904\u7406\u3002\r\r\n4. 23/09/18 \u4ee3\u7801\u91cd\u6784\uff0c\u52a0\u5165\u5931\u8d25\u91cd\u8bd5\u3002\r\r\n5. 23/09/19 \u65b0\u589e\u4e0b\u8f7d\u641c\u7d22\u7ed3\u679c\u529f\u80fd\r\r\n\r\r\n## \u6ce8\u610f\u4e8b\u9879\r\r\n**\u672c\u9879\u76ee\u4ec5\u4f9b\u5b66\u4e60\u4e0e\u4ea4\u6d41\uff0c\u4fb5\u6743\u5fc5\u5220**\r\r\n\r\r\nother\r\r\n1. \u81ea\u884c\u5c06cookies\u653e\u5230\u76ee\u5f55\u4e0bcookies.txt\u4e2d\uff0c\u53bb\u8bbe\u7f6e\u91cc\u7684\u5e94\u7528\u7a0b\u5e8f\u91cc\u627e\u6216\u8005\u7f51\u7edc\u8bf7\u6c42\u91cc\u627e\uff0c\u9700\u8981\u54ea\u4e9b\u53ef\u4ee5\u53c2\u8003cookie.txt\u6587\u4ef6\u3002\r\r\n2. \u53ef\u91c7\u7528\u4ee5\u4e0b\u65b9\u6cd5\u83b7\u53d6cookie\uff0c\u5e76\u8fd0\u884c\u5bf9\u5e94\u6587\u4ef6\u3002\r\r\n![image](https://github.com/cv-cat/Spider_XHS/assets/94289429/e2ceaa15-defc-4d41-a6db-4a9d3f3055e4)\r\r\n![image](https://github.com/cv-cat/Spider_XHS/assets/94289429/78e791a6-ba51-455a-a438-3c829db5c387)\r\r\n\r\r\n3. \u6b22\u8fcestar\uff0c\u4e0d\u65f6\u66f4\u65b0\u3002\r\r\n4. \u6709\u95ee\u9898\u53ef\u4ee5\u52a0QQ\u6216\u8005\u5fae\u4fe1\u4ea4\u6d41\uff08992822653\uff09\r\r\n",
"bugtrack_url": null,
"license": "",
"summary": "Little Red Book notes, home page, detailed page crawler",
"version": "1.1.0",
"project_urls": null,
"split_keywords": [
"python",
"xhs",
"spider"
],
"urls": [
{
"comment_text": "",
"digests": {
"blake2b_256": "617d567062956d3326676bc7fe9627d7f44fd4ec534358f350a9b50dfece86ee",
"md5": "232366ada138472276a0c8901c5aae90",
"sha256": "3e88d48b9ac89b5a5ebcac9b0a3a94402cfc3b22726f0773cac473420fb4bdba"
},
"downloads": -1,
"filename": "xhs_spider-1.1.0.tar.gz",
"has_sig": false,
"md5_digest": "232366ada138472276a0c8901c5aae90",
"packagetype": "sdist",
"python_version": "source",
"requires_python": null,
"size": 44901,
"upload_time": "2023-10-08T06:35:59",
"upload_time_iso_8601": "2023-10-08T06:35:59.163001Z",
"url": "https://files.pythonhosted.org/packages/61/7d/567062956d3326676bc7fe9627d7f44fd4ec534358f350a9b50dfece86ee/xhs_spider-1.1.0.tar.gz",
"yanked": false,
"yanked_reason": null
}
],
"upload_time": "2023-10-08 06:35:59",
"github": false,
"gitlab": false,
"bitbucket": false,
"codeberg": false,
"lcname": "xhs-spider"
}
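The `digests` entry in the raw data can be used to check a downloaded copy of `xhs_spider-1.1.0.tar.gz`. This is a minimal sketch using only the standard library; `sha256_of` and `matches_sha256` are hypothetical helper names, and the local file path is whatever you saved the archive as.

```python
import hashlib


def sha256_of(path, chunk_size=65536):
    """Return the hex SHA-256 digest of a file, read in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as fh:
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_sha256(path, expected_hex):
    """Compare a file's digest with the expected hex string, ignoring case."""
    return sha256_of(path) == expected_hex.strip().lower()


# sha256 value copied from the "digests" entry in the raw data above.
EXPECTED_SHA256 = "3e88d48b9ac89b5a5ebcac9b0a3a94402cfc3b22726f0773cac473420fb4bdba"
```

Calling `matches_sha256("xhs_spider-1.1.0.tar.gz", EXPECTED_SHA256)` on the downloaded archive should return `True` if the file is intact.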