netnovelcrawler


| Field | Value |
| --- | --- |
| Name | netnovelcrawler |
| Version | 0.0.2 |
| home_page | None |
| Summary | Crawler framework to download Internet-novels from web. |
| upload_time | 2024-09-16 06:27:24 |
| maintainer | None |
| docs_url | None |
| author | None |
| requires_python | None |
| license | MIT |
| keywords | net novel, crawler |
| VCS | |
| bugtrack_url | |
| requirements | No requirements were recorded. |
| Travis-CI | No Travis. |
| coveralls test coverage | No coveralls. |

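## Crawl novel websites and generate TXT files

### Currently supported
- [Yeduku (夜读库)](https://m.yeduku.net)
- [Dingdian Novel (顶点小说)](https://www.dingdianks.com)
- [Biquge (笔趣阁)](https://www.22biqu.com)
- [~~Huanle Shuke / Ciweimao (欢乐书客/刺猬猫)~~](https://www.ciweimao.com)
- [~~SF Light Novel (sf轻小说)~~](https://book.sfacg.com)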

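### Installation

The package is published on PyPI as `netnovelcrawler`, so `pip install netnovelcrawler` should install it.

### Usage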

#### CLI
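```python
from netnovelcrawler import Crawler
from netnovelcrawler.utils.starter_stopper import AfterChapterStarter, CountStopper

# Positional arguments for Crawler: a local directory, then an options dict.
config = (
    r'D:\net_novels\crawler_ocr\lord',  # local working/output directory
    {
        'start_page': 'https://ccc.xxxx.com/Novel/xxxxxx/',  # placeholder URL of the novel's page
        'login_info': ('test_login', 'test_pwd'),  # placeholder (username, password)
        'image_folder': 'vip_images',  # presumably where chapter images are stored
        'image_process': 'ocr',  # presumably converts image chapters to text via OCR
        'text_file': 'xxx.txt',  # name of the generated TXT file
    }
)

mycrawler = Crawler(*config)
# "10. 某章节" is a placeholder chapter title ("某章节" means "some chapter").
# Judging by the class names, crawling begins after that chapter
# and stops once 50 chapters have been fetched.
mycrawler.crawl(starter=AfterChapterStarter("10. 某章节"), stopper=CountStopper(50))
```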

#### GUI
```bash
python -m netnovelcrawlertaskmgr
```


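#### Bypassing the slider-verification anti-crawler mechanism

###### Modify chromedriver.exe
- Open chromedriver.exe in a text editor
- Find the `cdc_` string
- Replace `$cdc_lasutopfhvcZLmcfl` with a string of exactly the same length
- Save the file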
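
The manual steps above can also be scripted. The following is a minimal sketch, not part of netnovelcrawler: the helper names `replace_equal_length` and `patch_file` are hypothetical, and it only performs a generic same-length byte substitution. Keeping the length unchanged matters because inserting or removing bytes would shift offsets inside the executable.

```python
from __future__ import annotations

from pathlib import Path


def replace_equal_length(data: bytes, old: bytes, new: bytes) -> tuple[bytes, int]:
    """Replace every occurrence of `old` with `new`; both must have the same length.

    Returns the patched bytes and the number of occurrences replaced.
    """
    if len(old) != len(new):
        raise ValueError("replacement must have the same length as the original")
    count = data.count(old)
    return data.replace(old, new), count


def patch_file(path: Path, old: bytes, new: bytes) -> int:
    """Patch `path` in place and keep a `.bak` copy of the original bytes.

    Hypothetical helper: the file is only rewritten if `old` was found.
    Returns the number of replacements made.
    """
    original = path.read_bytes()
    patched, count = replace_equal_length(original, old, new)
    if count:
        path.with_name(path.name + ".bak").write_bytes(original)
        path.write_bytes(patched)
    return count
```

A call such as `patch_file(Path("chromedriver.exe"), b"cdc_", b"xyz_")` would mirror the steps above; the replacement `b"xyz_"` is an arbitrary choice made for illustration, and any string of the same length as the one it replaces would do.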


            

Raw data

            {
    "_id": null,
    "home_page": null,
    "name": "netnovelcrawler",
    "maintainer": null,
    "docs_url": null,
    "requires_python": null,
    "maintainer_email": null,
    "keywords": "net novel, crawler",
    "author": null,
    "author_email": "NovelReader <xxxx@hotmail.com>",
    "download_url": "https://files.pythonhosted.org/packages/8e/59/e22840195f7f24d51fb0c8a501d94385218e84eadd3efeda008e5f7859cb/netnovelcrawler-0.0.2.tar.gz",
    "platform": null,
    "description": "## \u722c\u53d6\u5c0f\u8bf4\u7f51\u7ad9\u5e76\u751f\u6210TXT\r\n\r\n### \u76ee\u524d\u652f\u6301\r\n- [\u591c\u8bfb\u5e93](https://m.yeduku.net)\r\n- [\u9876\u70b9\u5c0f\u8bf4](https://www.dingdianks.com)\r\n- [\u7b14\u8da3\u9601](https://www.22biqu.com)\r\n- [~~\u6b22\u4e50\u4e66\u5ba2/\u523a\u732c\u732b~~](https://www.ciweimao.com)\r\n- [~~sf\u8f7b\u5c0f\u8bf4~~](https://book.sfacg.com)\r\n\r\n### \u5b89\u88c5\r\n\r\n\r\n### \u7528\u6cd5\r\n\r\n#### CLI\r\n```python\r\nconfig = (\r\n    r'D:\\net_novels\\crawler_ocr\\lord',\r\n    {\r\n        'start_page': 'https://ccc.xxxx.com/Novel/xxxxxx/',\r\n        'login_info': ('test_login', 'test_pwd'),\r\n        'image_folder': 'vip_images',\r\n        'image_process': 'ocr',\r\n        'text_file': 'xxx.txt',\r\n    }\r\n)\r\nfrom netnovelcrawler import Crawler\r\nfrom netnovelcrawler.utils.starter_stopper import AfterChapterStarter, CountStopper\r\n\r\nmycrawler = Crawler(*config)\r\nmycrawler.crawl(starter=AfterChapterStarter(\"10. \u67d0\u7ae0\u8282\"), stopper=CountStopper(50))\r\n```\r\n\r\n#### GUI\r\n```bash\r\npython -m netnovelcrawlertaskmgr\r\n```\r\n\r\n\r\n#### \u7ed5\u8fc7\u6ed1\u5757\u9a8c\u8bc1\u53cd\u722c\u866b\u673a\u5236\r\n\r\n######\u4fee\u6539chromedriver.exe\r\n- \u6587\u672c\u7f16\u8f91\u5668\u6253\u5f00chromedriver.exe\r\n- \u627e\u5230`cdc_`\u5b57\u7b26\u4e32\r\n- \u7b49\u957f\u66ff\u6362$cdc_lasutopfhvcZLmcfl\r\n- \u4fdd\u5b58\r\n\r\n",
    "bugtrack_url": null,
    "license": "MIT",
    "summary": "Crawler framework to download Internet-novels from web.",
    "version": "0.0.2",
    "project_urls": null,
    "split_keywords": [
        "net novel",
        " crawler"
    ],
    "urls": [
        {
            "comment_text": "",
            "digests": {
                "blake2b_256": "dce722e1501b93433511075322f2c34e02cecb5aeaa31028f99a7cee677024be",
                "md5": "32184814f367d73c564675c2f70387d3",
                "sha256": "13e8f51512a33c739366b502caa3c17d752e6e48bd12e8f20defee2a600eb811"
            },
            "downloads": -1,
            "filename": "netnovelcrawler-0.0.2-py3-none-any.whl",
            "has_sig": false,
            "md5_digest": "32184814f367d73c564675c2f70387d3",
            "packagetype": "bdist_wheel",
            "python_version": "py3",
            "requires_python": null,
            "size": 26200,
            "upload_time": "2024-09-16T06:27:23",
            "upload_time_iso_8601": "2024-09-16T06:27:23.377052Z",
            "url": "https://files.pythonhosted.org/packages/dc/e7/22e1501b93433511075322f2c34e02cecb5aeaa31028f99a7cee677024be/netnovelcrawler-0.0.2-py3-none-any.whl",
            "yanked": false,
            "yanked_reason": null
        },
        {
            "comment_text": "",
            "digests": {
                "blake2b_256": "8e59e22840195f7f24d51fb0c8a501d94385218e84eadd3efeda008e5f7859cb",
                "md5": "35794f4b0455ffe7166c72984fc2dcbc",
                "sha256": "4264c6c68a8ff46aaed21f97881f988dde6bd392e7fd05ccd77fa51e7d90dd18"
            },
            "downloads": -1,
            "filename": "netnovelcrawler-0.0.2.tar.gz",
            "has_sig": false,
            "md5_digest": "35794f4b0455ffe7166c72984fc2dcbc",
            "packagetype": "sdist",
            "python_version": "source",
            "requires_python": null,
            "size": 25438,
            "upload_time": "2024-09-16T06:27:24",
            "upload_time_iso_8601": "2024-09-16T06:27:24.637592Z",
            "url": "https://files.pythonhosted.org/packages/8e/59/e22840195f7f24d51fb0c8a501d94385218e84eadd3efeda008e5f7859cb/netnovelcrawler-0.0.2.tar.gz",
            "yanked": false,
            "yanked_reason": null
        }
    ],
    "upload_time": "2024-09-16 06:27:24",
    "github": false,
    "gitlab": false,
    "bitbucket": false,
    "codeberg": false,
    "lcname": "netnovelcrawler"
}
        