Name | weibo-scrapy |
Version | 2.2.5 |
home_page | https://github.com/yanjlee/weibo_scrapy |
Summary | WEIBO\_SCRAPY is a Python framework for extracting WEIBO data with multi-threading. It provides a WEIBO login simulator and a multi-threaded extraction interface, so users only need to care about their extraction logic rather than the tricky WEIBO login simulation and multi-threaded programming. |
upload_time | 2024-06-01 08:41:52 |
maintainer | None |
docs_url | None |
author | yanjlee |
requires_python | None |
license | None |
keywords | |
VCS | |
bugtrack_url | |
requirements | No requirements were recorded. |
Travis-CI | No Travis. |
coveralls test coverage | No coveralls. |
WEIBO_SCRAPY
============
WEIBO\_SCRAPY is a **multi-threading** SINA WEIBO data extraction framework written in Python. It provides a WEIBO login simulator and a multi-threaded extraction interface, saving users the time of writing a WEIBO login simulator from scratch and of doing multi-threaded programming themselves; users can instead focus on their own **extraction** logic.
### WEIBO\_SCRAPY Provides
1\. WEIBO Login Simulator
2\. Multi-Threading Extraction Framework
3\. **Extraction Task** Interface
4\. Easy Way of Parameters Configuration
### How to Use WEIBO\_SCRAPY
    #!/usr/bin/env python
    # coding=utf8

    from weibo_scrapy import scrapy

    class my_scrapy(scrapy):

        def scrapy_do_task(self, uid=None):
            '''
            Overwrite this method to perform a uid-based scrapy task.
            @param uid: weibo uid
            @return: a list of uids gained from this task, optional
            '''
            # Do what you want with uid here. Note that this scrapy is
            # uid-based, so make sure there are uids in the task queue,
            # or gain new uids from this method's return value.
            print('WOW...')
            return []  # replace with the list of uids gained from this task

    if __name__ == '__main__':
        s = my_scrapy(uids_file='uids_all.txt', config='my.ini')
        s.scrapy()
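The core pattern behind the framework, a shared uid task queue drained by worker threads, where each task may feed new uids back into the queue, can be sketched independently of weibo\_scrapy. All names below are illustrative, not the library's API:

```python
import queue
import threading

def run_uid_workers(seed_uids, do_task, num_threads=4):
    """Drain a uid queue with worker threads; do_task may return new uids."""
    tasks = queue.Queue()
    for uid in seed_uids:
        tasks.put(uid)

    processed = []
    lock = threading.Lock()

    def worker():
        while True:
            try:
                uid = tasks.get_nowait()
            except queue.Empty:
                # No work left for this worker; any uids re-queued later
                # are handled by the worker that queued them.
                return
            new_uids = do_task(uid) or []
            with lock:
                processed.append(uid)
            for new_uid in new_uids:
                tasks.put(new_uid)

    threads = [threading.Thread(target=worker) for _ in range(num_threads)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return processed
```

A `scrapy_do_task`-style callback plugs in as `do_task`: it processes one uid and optionally returns newly discovered uids, which are appended to the queue exactly as the README describes.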
### Related Reading
[WEIBO_SCRAPY: a UID-based WEIBO data extraction framework with multi-threading](http://yoyzhou.github.io/blog/2013/04/08/weibo-scrapy-framework-with-multi-threading/)
Raw data
{
"_id": null,
"home_page": "https://github.com/yanjlee/weibo_scrapy",
"name": "weibo-scrapy",
"maintainer": null,
"docs_url": null,
"requires_python": null,
"maintainer_email": null,
"keywords": null,
"author": "yanjlee",
"author_email": "yanjlee@163.com",
"download_url": "https://files.pythonhosted.org/packages/ea/7f/6feb10c5cd5182bab7b58b0cf270df79061889ac3be8473878755d4f8950/weibo_scrapy-2.2.5.tar.gz",
"platform": null,
"description": "WEIBO_SCRAPY\r\n============\r\n\r\nWEIBO\\_SCRAPY\u662f\u4e00\u4e2aPYTHON\u5b9e\u73b0\u7684\uff0c\u4f7f\u7528\u591a\u7ebf\u7a0b\u6293\u53d6WEIBO\u4fe1\u606f\u7684\u6846\u67b6\u3002WEIBO\\_SCRAPY\u6846\u67b6\u7ed9\u7528\u6237\u63d0\u4f9bWEIBO\u7684\u6a21\u62df\u767b\u5f55\u548c\u591a\u7ebf\u7a0b\u6293\u53d6\u5fae\u535a\u4fe1\u606f\u7684\u63a5\u53e3\uff0c\u8ba9\u7528\u6237\u53ea\u9700\u5173\u5fc3\u6293\u53d6\u7684\u4e1a\u52a1\u903b\u8f91\uff0c\u800c\u4e0d\u7528\u5904\u7406\u68d8\u624b\u7684WEIBO\u6a21\u62df\u767b\u5f55\u548c\u591a\u7ebf\u7a0b\u7f16\u7a0b\u3002\r\n\r\nWEIBO\\_SCRAPY is a **Multi-Threading** SINA WEIBO data extraction Framework in Python. WEIBO\\_SCRAPY provides WEIBO login simulator and interface for WEIBO data extraction with multi-threading, it saves users a lot of time by getting users out of writing WEIBO login simulator from scratch and multi-threading programming, users now can focus on their own **extraction** logic.\r\n\r\n\r\n=======\r\n\r\n###WEIBO\\_SCRAPY\u7684\u529f\u80fd\r\n1\\. \u5fae\u535a\u6a21\u62df\u767b\u5f55\r\n\r\n2\\. \u591a\u7ebf\u7a0b\u6293\u53d6\u6846\u67b6\r\n\r\n3\\. **\u6293\u53d6\u4efb\u52a1**\u63a5\u53e3\r\n\r\n4\\. \u6293\u53d6\u53c2\u6570\u914d\u7f6e\r\n\r\n###WEIBO\\_SCRAPY Provides\r\n1\\. WEIBO Login Simulator\r\n\r\n2\\. Multi-Threading Extraction Framework\r\n\r\n3\\. **Extraction Task** Interface\r\n\r\n4\\. 
Easy Way of Parameters Configuration\r\n\r\n###How to Use WEIBO\\_SCRAPY\r\n\t#!/usr/bin/env python\r\n\t#coding=utf8\r\n\r\n\tfrom weibo_scrapy import scrapy\r\n\r\n\tclass my_scrapy(scrapy):\r\n\t\t\r\n\t\tdef scrapy_do_task(self, uid=None):\r\n\t\t '''\r\n\t\t User needs to overwrite this method to perform uid-based scrapy task.\r\n\t\t @param uid: weibo uid\r\n\t\t @return: a list of uids gained from this task, optional\r\n\t\t '''\r\n\t\t super(my_scrapy, self).__init__(**kwds)\r\n\t\t \r\n\t\t #do what you want with uid here, note that this scrapy is uid based, so make sure there are uids in task queue, \r\n\t\t #or gain new uids from this function\r\n\t\t print 'WOW...'\r\n\t\t return 'replace this string with uid list which gained from this task'\r\n\t\t \r\n\tif __name__ == '__main__':\r\n\t\t\r\n\t\ts = my_scrapy(uids_file = 'uids_all.txt', config = 'my.ini')\r\n\t\ts.scrapy()\r\n\r\n###\u76f8\u5173\u9605\u8bfb(Readings)\r\n[\u57fa\u4e8eUID\u7684WEIBO\u4fe1\u606f\u6293\u53d6\u6846\u67b6WEIBO_SCRAPY](http://yoyzhou.github.io/blog/2013/04/08/weibo-scrapy-framework-with-multi-threading/)\r\n",
"bugtrack_url": null,
"license": null,
"summary": "WEIBO\\_SCRAPY\u662f\u4e00\u4e2aPYTHON\u5b9e\u73b0\u7684\uff0c\u4f7f\u7528\u591a\u7ebf\u7a0b\u6293\u53d6WEIBO\u4fe1\u606f\u7684\u6846\u67b6\u3002WEIBO\\_SCRAPY\u6846\u67b6\u7ed9\u7528\u6237\u63d0\u4f9bWEIBO\u7684\u6a21\u62df\u767b\u5f55\u548c\u591a\u7ebf\u7a0b\u6293\u53d6\u5fae\u535a\u4fe1\u606f\u7684\u63a5\u53e3\uff0c\u8ba9\u7528\u6237\u53ea\u9700\u5173\u5fc3\u6293\u53d6\u7684\u4e1a\u52a1\u903b\u8f91\uff0c\u800c\u4e0d\u7528\u5904\u7406\u68d8\u624b\u7684WEIBO\u6a21\u62df\u767b\u5f55\u548c\u591a\u7ebf\u7a0b\u7f16\u7a0b.",
"version": "2.2.5",
"project_urls": {
"Homepage": "https://github.com/yanjlee/weibo_scrapy"
},
"split_keywords": [],
"urls": [
{
"comment_text": "",
"digests": {
"blake2b_256": "4cb435712e2d263f01951473aeb482d3c84f5cfc480ac47caebeeaf3eb32e4cb",
"md5": "84d696824ba3eed7161229fd42142000",
"sha256": "d9d2754bc0ebc489a8c511bf8337e258db2d1994af71be579fde05731ff08dad"
},
"downloads": -1,
"filename": "weibo_scrapy-2.2.5-py3-none-any.whl",
"has_sig": false,
"md5_digest": "84d696824ba3eed7161229fd42142000",
"packagetype": "bdist_wheel",
"python_version": "py3",
"requires_python": null,
"size": 3333,
"upload_time": "2024-06-01T08:41:50",
"upload_time_iso_8601": "2024-06-01T08:41:50.013498Z",
"url": "https://files.pythonhosted.org/packages/4c/b4/35712e2d263f01951473aeb482d3c84f5cfc480ac47caebeeaf3eb32e4cb/weibo_scrapy-2.2.5-py3-none-any.whl",
"yanked": false,
"yanked_reason": null
},
{
"comment_text": "",
"digests": {
"blake2b_256": "ea7f6feb10c5cd5182bab7b58b0cf270df79061889ac3be8473878755d4f8950",
"md5": "053fb6e5a734fe423bb092ccfca5a606",
"sha256": "4af2de4a6f1da5e26f2e724f9537be9c79c520a4bbd8044499f5f60f2b87da19"
},
"downloads": -1,
"filename": "weibo_scrapy-2.2.5.tar.gz",
"has_sig": false,
"md5_digest": "053fb6e5a734fe423bb092ccfca5a606",
"packagetype": "sdist",
"python_version": "source",
"requires_python": null,
"size": 3980,
"upload_time": "2024-06-01T08:41:52",
"upload_time_iso_8601": "2024-06-01T08:41:52.783991Z",
"url": "https://files.pythonhosted.org/packages/ea/7f/6feb10c5cd5182bab7b58b0cf270df79061889ac3be8473878755d4f8950/weibo_scrapy-2.2.5.tar.gz",
"yanked": false,
"yanked_reason": null
}
],
"upload_time": "2024-06-01 08:41:52",
"github": true,
"gitlab": false,
"bitbucket": false,
"codeberg": false,
"github_user": "yanjlee",
"github_project": "weibo_scrapy",
"travis_ci": false,
"coveralls": false,
"github_actions": false,
"lcname": "weibo-scrapy"
}
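The raw metadata above follows the layout of the PyPI JSON API (`https://pypi.org/pypi/<project>/json`). A small sketch of pulling the release artifacts and their digests out of such a payload, using an abbreviated copy of the data above so no network access is needed:

```python
import json

# Abbreviated copy of the "urls" entries from the raw metadata above.
payload = json.loads("""
{
  "name": "weibo-scrapy",
  "version": "2.2.5",
  "urls": [
    {"filename": "weibo_scrapy-2.2.5-py3-none-any.whl",
     "packagetype": "bdist_wheel",
     "digests": {"sha256": "d9d2754bc0ebc489a8c511bf8337e258db2d1994af71be579fde05731ff08dad"}},
    {"filename": "weibo_scrapy-2.2.5.tar.gz",
     "packagetype": "sdist",
     "digests": {"sha256": "4af2de4a6f1da5e26f2e724f9537be9c79c520a4bbd8044499f5f60f2b87da19"}}
  ]
}
""")

# Map each uploaded file to its sha256 digest, e.g. for download verification.
artifacts = {u["filename"]: u["digests"]["sha256"] for u in payload["urls"]}
for name, sha in sorted(artifacts.items()):
    print(name, sha[:12])
```

Fetching the live payload would only require replacing the embedded string with an HTTP GET of the JSON API URL; the field layout is the same.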