xhs-spider


Name: xhs-spider
Version: 1.1.0
Summary: Little Red Book notes, home page, detailed page crawler
Upload time: 2023-10-08 06:35:59
Author: cv_cat
Docs URL: None
Keywords: python, xhs, spider
Requirements: No requirements were recorded.
Travis-CI: No Travis.
Coveralls test coverage: No coveralls.
            
# Spider_XHS

![image](https://img.shields.io/badge/cv_cat-Spider_XHS-blue)



Watermark-free crawler for the images and videos on Xiaohongshu (Little Red Book) user profile pages.



## Screenshots

![image](https://github.com/cv-cat/Spider_XHS/assets/94289429/ef8990bc-d568-4b63-9dfc-4e2e4f235f99)



![image](https://github.com/cv-cat/Spider_XHS/assets/94289429/a5eb7df4-434a-4e6e-91e1-b60b40ca08e8)



![image](https://github.com/cv-cat/Spider_XHS/assets/94289429/d8c2e84e-3e78-4ca8-8c93-406a3e74da91)



![image](https://github.com/cv-cat/Spider_XHS/assets/94289429/7a0ea368-5507-469f-84f4-6dda59568b86)



## Requirements

Python

Node.js
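
Because both Python and Node.js have to be installed, it can help to confirm that `node` is on `PATH` before starting a crawl. The following is a minimal sketch, not part of the package: the `missing_tools` helper is hypothetical, and the assumption that the executable is named `node` comes from standard Node.js installs rather than from this README.

```python
import shutil


def missing_tools(names):
    """Return the executables from `names` that cannot be found on PATH."""
    return [name for name in names if shutil.which(name) is None]


if __name__ == "__main__":
    # "node" is the usual executable name for Node.js (an assumption here).
    missing = missing_tools(["node"])
    if missing:
        print("Missing required tools:", ", ".join(missing))
    else:
        print("Node.js found on PATH.")
```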



Usage: put every ID (profile or note URL) you want to crawl into the list.

```
# Profile (home page) crawling
from xhs_spider.home import Home

home = Home()
url_list = [
    'https://www.xiaohongshu.com/user/profile/6185ce66000000001000705b',
    'https://www.xiaohongshu.com/user/profile/6034d6f20000000001006fbb',
]
home.main(url_list)

# Single-note crawling
from xhs_spider.note import Note

# Instantiate the class imported above.
one_note = Note()
url_list = [
    'https://www.xiaohongshu.com/explore/64356527000000001303282b',
]
one_note.main(url_list)

# Search-result crawling
from xhs_spider.search import Search

search = Search()
query = '你好'
# Number of results to fetch (the first N)
number = 22
search.main(query, number)
```
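
The usage line talks about IDs, while the example passes full URLs. If you only have bare IDs, a small helper can build the URLs, and the reverse lookup is handy for naming output folders. This is a sketch under assumptions: `profile_urls`, `note_urls`, and `extract_id` are hypothetical helpers, not functions of `xhs_spider`, and the URL prefixes are copied from the example above.

```python
from urllib.parse import urlparse

# URL prefixes taken from the README example (assumed to stay stable).
PROFILE_PREFIX = "https://www.xiaohongshu.com/user/profile/"
NOTE_PREFIX = "https://www.xiaohongshu.com/explore/"


def profile_urls(user_ids):
    """Build profile URLs from bare user IDs."""
    return [PROFILE_PREFIX + user_id for user_id in user_ids]


def note_urls(note_ids):
    """Build note URLs from bare note IDs."""
    return [NOTE_PREFIX + note_id for note_id in note_ids]


def extract_id(url):
    """Return the last non-empty path segment of a URL, ignoring the query string."""
    segments = [part for part in urlparse(url).path.split("/") if part]
    if not segments:
        raise ValueError(f"no ID found in URL: {url!r}")
    return segments[-1]
```

The resulting lists can then be passed to `home.main(...)` or `one_note.main(...)` exactly as in the example.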

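## Changelog

1. 23/08/08   First commit.

2. 23/09/13   [API change: two fields added to params] Fixed images failing to download and the errors raised when some pages could not be accessed.

3. 23/09/16   [Encoding problem with larger videos] Fixed the video encoding issue and added exception handling.

4. 23/09/18   Refactored the code and added retry on failure.

5. 23/09/19   Added support for downloading search results.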



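## Notes

**This project is for learning and discussion only; any infringing content will be removed on request.**



Other

1. Put your own cookies into cookies.txt in the project directory. You can find them in the browser's developer tools, either under Application or in a network request. See the cookie.txt file for which cookies are needed.

2. You can obtain the cookie as shown below, then run the corresponding file.

![image](https://github.com/cv-cat/Spider_XHS/assets/94289429/e2ceaa15-defc-4d41-a6db-4a9d3f3055e4)

![image](https://github.com/cv-cat/Spider_XHS/assets/94289429/78e791a6-ba51-455a-a438-3c829db5c387)



3. Stars are welcome; the project is updated from time to time.

4. If you have questions, get in touch on QQ or WeChat (992822653).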
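
To check a cookie string copied from the browser before saving it (item 1 above), it can be split into name/value pairs. This is a minimal sketch that assumes the string uses the standard `name=value; name2=value2` request-header form. The exact format the package expects in its cookie file is not documented here, and `parse_cookie_header` plus the cookie names below are hypothetical.

```python
def parse_cookie_header(raw):
    """Split a 'name=value; name2=value2' cookie string into a dict.

    Values may themselves contain '=', so only the first '=' in each
    pair is treated as the separator. Pairs without '=' are skipped.
    """
    cookies = {}
    for part in raw.split(";"):
        name, sep, value = part.strip().partition("=")
        if sep and name:
            cookies[name] = value
    return cookies


if __name__ == "__main__":
    # Placeholder names and values, for illustration only.
    example = "session_id=abc123; token=x=y"
    print(parse_cookie_header(example))
```

Comparing the parsed names against the reference cookie file mentioned above is a quick way to spot a missing cookie.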


            

Raw data

{
    "_id": null,
    "home_page": "",
    "name": "xhs-spider",
    "maintainer": "",
    "docs_url": null,
    "requires_python": "",
    "maintainer_email": "",
    "keywords": "python,xhs,spider",
    "author": "cv_cat",
    "author_email": "",
    "download_url": "https://files.pythonhosted.org/packages/61/7d/567062956d3326676bc7fe9627d7f44fd4ec534358f350a9b50dfece86ee/xhs_spider-1.1.0.tar.gz",
    "platform": null,
    "bugtrack_url": null,
    "license": "",
    "summary": "Little Red Book notes, home page, detailed page crawler",
    "version": "1.1.0",
    "project_urls": null,
    "split_keywords": [
        "python",
        "xhs",
        "spider"
    ],
    "urls": [
        {
            "comment_text": "",
            "digests": {
                "blake2b_256": "617d567062956d3326676bc7fe9627d7f44fd4ec534358f350a9b50dfece86ee",
                "md5": "232366ada138472276a0c8901c5aae90",
                "sha256": "3e88d48b9ac89b5a5ebcac9b0a3a94402cfc3b22726f0773cac473420fb4bdba"
            },
            "downloads": -1,
            "filename": "xhs_spider-1.1.0.tar.gz",
            "has_sig": false,
            "md5_digest": "232366ada138472276a0c8901c5aae90",
            "packagetype": "sdist",
            "python_version": "source",
            "requires_python": null,
            "size": 44901,
            "upload_time": "2023-10-08T06:35:59",
            "upload_time_iso_8601": "2023-10-08T06:35:59.163001Z",
            "url": "https://files.pythonhosted.org/packages/61/7d/567062956d3326676bc7fe9627d7f44fd4ec534358f350a9b50dfece86ee/xhs_spider-1.1.0.tar.gz",
            "yanked": false,
            "yanked_reason": null
        }
    ],
    "upload_time": "2023-10-08 06:35:59",
    "github": false,
    "gitlab": false,
    "bitbucket": false,
    "codeberg": false,
    "lcname": "xhs-spider"
}
        