Name | weibo-scrapy |
Version | 2.2.5 |
home_page | https://github.com/yanjlee/weibo_scrapy |
Summary | WEIBO\_SCRAPY is a Python framework for extracting WEIBO data with multi-threading. It provides a WEIBO login simulator and a multi-threaded extraction interface, so users only need to care about their extraction logic rather than the tricky WEIBO login simulation and multi-threaded programming. |
upload_time | 2024-06-01 08:41:52 |
maintainer | None |
docs_url | None |
author | yanjlee |
requires_python | None |
license | None |
keywords | |
VCS | |
bugtrack_url | |
requirements | No requirements were recorded. |
Travis-CI | No Travis. |
coveralls test coverage | No coveralls. |
WEIBO_SCRAPY
============
WEIBO\_SCRAPY is a **multi-threading** SINA WEIBO data extraction framework written in Python. It provides a WEIBO login simulator and a multi-threaded extraction interface, saving users the time of writing a WEIBO login simulator from scratch and of doing multi-threaded programming themselves; users can instead focus on their own **extraction** logic.
### WEIBO\_SCRAPY Provides
1\. WEIBO Login Simulator
2\. Multi-Threading Extraction Framework
3\. **Extraction Task** Interface
4\. Easy Way of Parameters Configuration
### How to Use WEIBO\_SCRAPY
    #!/usr/bin/env python
    # coding=utf8

    from weibo_scrapy import scrapy

    class my_scrapy(scrapy):

        def scrapy_do_task(self, uid=None):
            '''
            Overwrite this method to perform a uid-based scrapy task.
            @param uid: weibo uid
            @return: a list of uids gained from this task, optional
            '''
            # Do what you want with uid here. Note that this scrapy is
            # uid-based, so make sure there are uids in the task queue,
            # or gain new uids from this method's return value.
            print('WOW...')
            return []  # replace with the list of uids gained from this task

    if __name__ == '__main__':
        s = my_scrapy(uids_file='uids_all.txt', config='my.ini')
        s.scrapy()
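The core pattern behind the framework, a shared uid task queue drained by worker threads, where each task may feed new uids back into the queue, can be sketched independently of weibo\_scrapy. All names below are illustrative, not the library's API:

```python
import queue
import threading

def run_uid_workers(seed_uids, do_task, num_threads=4):
    """Drain a uid queue with worker threads; do_task may return new uids."""
    tasks = queue.Queue()
    for uid in seed_uids:
        tasks.put(uid)

    processed = []
    lock = threading.Lock()

    def worker():
        while True:
            try:
                uid = tasks.get_nowait()
            except queue.Empty:
                # No work left for this worker; any uids re-queued later
                # are handled by the worker that queued them.
                return
            new_uids = do_task(uid) or []
            with lock:
                processed.append(uid)
            for new_uid in new_uids:
                tasks.put(new_uid)

    threads = [threading.Thread(target=worker) for _ in range(num_threads)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return processed
```

A `scrapy_do_task`-style callback plugs in as `do_task`: it processes one uid and optionally returns newly discovered uids, which are appended to the queue exactly as the README describes.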
### Related Reading
[WEIBO_SCRAPY: a UID-based WEIBO data extraction framework with multi-threading](http://yoyzhou.github.io/blog/2013/04/08/weibo-scrapy-framework-with-multi-threading/)
Raw data
{
"_id": null,
"home_page": "https://github.com/yanjlee/weibo_scrapy",
"name": "weibo-scrapy",
"maintainer": null,
"docs_url": null,
"requires_python": null,
"maintainer_email": null,
"keywords": null,
"author": "yanjlee",
"author_email": "yanjlee@163.com",
"download_url": "https://files.pythonhosted.org/packages/ea/7f/6feb10c5cd5182bab7b58b0cf270df79061889ac3be8473878755d4f8950/weibo_scrapy-2.2.5.tar.gz",
"platform": null,
"description": "WEIBO_SCRAPY\r\n============\r\n\r\nWEIBO\\_SCRAPY\u662f\u4e00\u4e2aPYTHON\u5b9e\u73b0\u7684\uff0c\u4f7f\u7528\u591a\u7ebf\u7a0b\u6293\u53d6WEIBO\u4fe1\u606f\u7684\u6846\u67b6\u3002WEIBO\\_SCRAPY\u6846\u67b6\u7ed9\u7528\u6237\u63d0\u4f9bWEIBO\u7684\u6a21\u62df\u767b\u5f55\u548c\u591a\u7ebf\u7a0b\u6293\u53d6\u5fae\u535a\u4fe1\u606f\u7684\u63a5\u53e3\uff0c\u8ba9\u7528\u6237\u53ea\u9700\u5173\u5fc3\u6293\u53d6\u7684\u4e1a\u52a1\u903b\u8f91\uff0c\u800c\u4e0d\u7528\u5904\u7406\u68d8\u624b\u7684WEIBO\u6a21\u62df\u767b\u5f55\u548c\u591a\u7ebf\u7a0b\u7f16\u7a0b\u3002\r\n\r\nWEIBO\\_SCRAPY is a **Multi-Threading** SINA WEIBO data extraction Framework in Python. WEIBO\\_SCRAPY provides WEIBO login simulator and interface for WEIBO data extraction with multi-threading, it saves users a lot of time by getting users out of writing WEIBO login simulator from scratch and multi-threading programming, users now can focus on their own **extraction** logic.\r\n\r\n\r\n=======\r\n\r\n###WEIBO\\_SCRAPY\u7684\u529f\u80fd\r\n1\\. \u5fae\u535a\u6a21\u62df\u767b\u5f55\r\n\r\n2\\. \u591a\u7ebf\u7a0b\u6293\u53d6\u6846\u67b6\r\n\r\n3\\. **\u6293\u53d6\u4efb\u52a1**\u63a5\u53e3\r\n\r\n4\\. \u6293\u53d6\u53c2\u6570\u914d\u7f6e\r\n\r\n###WEIBO\\_SCRAPY Provides\r\n1\\. WEIBO Login Simulator\r\n\r\n2\\. Multi-Threading Extraction Framework\r\n\r\n3\\. **Extraction Task** Interface\r\n\r\n4\\. 
Easy Way of Parameters Configuration\r\n\r\n###How to Use WEIBO\\_SCRAPY\r\n\t#!/usr/bin/env python\r\n\t#coding=utf8\r\n\r\n\tfrom weibo_scrapy import scrapy\r\n\r\n\tclass my_scrapy(scrapy):\r\n\t\t\r\n\t\tdef scrapy_do_task(self, uid=None):\r\n\t\t '''\r\n\t\t User needs to overwrite this method to perform uid-based scrapy task.\r\n\t\t @param uid: weibo uid\r\n\t\t @return: a list of uids gained from this task, optional\r\n\t\t '''\r\n\t\t super(my_scrapy, self).__init__(**kwds)\r\n\t\t \r\n\t\t #do what you want with uid here, note that this scrapy is uid based, so make sure there are uids in task queue, \r\n\t\t #or gain new uids from this function\r\n\t\t print 'WOW...'\r\n\t\t return 'replace this string with uid list which gained from this task'\r\n\t\t \r\n\tif __name__ == '__main__':\r\n\t\t\r\n\t\ts = my_scrapy(uids_file = 'uids_all.txt', config = 'my.ini')\r\n\t\ts.scrapy()\r\n\r\n###\u76f8\u5173\u9605\u8bfb(Readings)\r\n[\u57fa\u4e8eUID\u7684WEIBO\u4fe1\u606f\u6293\u53d6\u6846\u67b6WEIBO_SCRAPY](http://yoyzhou.github.io/blog/2013/04/08/weibo-scrapy-framework-with-multi-threading/)\r\n",
"bugtrack_url": null,
"license": null,
"summary": "WEIBO\\_SCRAPY\u662f\u4e00\u4e2aPYTHON\u5b9e\u73b0\u7684\uff0c\u4f7f\u7528\u591a\u7ebf\u7a0b\u6293\u53d6WEIBO\u4fe1\u606f\u7684\u6846\u67b6\u3002WEIBO\\_SCRAPY\u6846\u67b6\u7ed9\u7528\u6237\u63d0\u4f9bWEIBO\u7684\u6a21\u62df\u767b\u5f55\u548c\u591a\u7ebf\u7a0b\u6293\u53d6\u5fae\u535a\u4fe1\u606f\u7684\u63a5\u53e3\uff0c\u8ba9\u7528\u6237\u53ea\u9700\u5173\u5fc3\u6293\u53d6\u7684\u4e1a\u52a1\u903b\u8f91\uff0c\u800c\u4e0d\u7528\u5904\u7406\u68d8\u624b\u7684WEIBO\u6a21\u62df\u767b\u5f55\u548c\u591a\u7ebf\u7a0b\u7f16\u7a0b.",
"version": "2.2.5",
"project_urls": {
"Homepage": "https://github.com/yanjlee/weibo_scrapy"
},
"split_keywords": [],
"urls": [
{
"comment_text": "",
"digests": {
"blake2b_256": "4cb435712e2d263f01951473aeb482d3c84f5cfc480ac47caebeeaf3eb32e4cb",
"md5": "84d696824ba3eed7161229fd42142000",
"sha256": "d9d2754bc0ebc489a8c511bf8337e258db2d1994af71be579fde05731ff08dad"
},
"downloads": -1,
"filename": "weibo_scrapy-2.2.5-py3-none-any.whl",
"has_sig": false,
"md5_digest": "84d696824ba3eed7161229fd42142000",
"packagetype": "bdist_wheel",
"python_version": "py3",
"requires_python": null,
"size": 3333,
"upload_time": "2024-06-01T08:41:50",
"upload_time_iso_8601": "2024-06-01T08:41:50.013498Z",
"url": "https://files.pythonhosted.org/packages/4c/b4/35712e2d263f01951473aeb482d3c84f5cfc480ac47caebeeaf3eb32e4cb/weibo_scrapy-2.2.5-py3-none-any.whl",
"yanked": false,
"yanked_reason": null
},
{
"comment_text": "",
"digests": {
"blake2b_256": "ea7f6feb10c5cd5182bab7b58b0cf270df79061889ac3be8473878755d4f8950",
"md5": "053fb6e5a734fe423bb092ccfca5a606",
"sha256": "4af2de4a6f1da5e26f2e724f9537be9c79c520a4bbd8044499f5f60f2b87da19"
},
"downloads": -1,
"filename": "weibo_scrapy-2.2.5.tar.gz",
"has_sig": false,
"md5_digest": "053fb6e5a734fe423bb092ccfca5a606",
"packagetype": "sdist",
"python_version": "source",
"requires_python": null,
"size": 3980,
"upload_time": "2024-06-01T08:41:52",
"upload_time_iso_8601": "2024-06-01T08:41:52.783991Z",
"url": "https://files.pythonhosted.org/packages/ea/7f/6feb10c5cd5182bab7b58b0cf270df79061889ac3be8473878755d4f8950/weibo_scrapy-2.2.5.tar.gz",
"yanked": false,
"yanked_reason": null
}
],
"upload_time": "2024-06-01 08:41:52",
"github": true,
"gitlab": false,
"bitbucket": false,
"codeberg": false,
"github_user": "yanjlee",
"github_project": "weibo_scrapy",
"travis_ci": false,
"coveralls": false,
"github_actions": false,
"lcname": "weibo-scrapy"
}
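The raw metadata above follows the layout of the PyPI JSON API (`https://pypi.org/pypi/<project>/json`). A small sketch of pulling the release artifacts and their digests out of such a payload, using an abbreviated copy of the data above so no network access is needed:

```python
import json

# Abbreviated copy of the "urls" entries from the raw metadata above.
payload = json.loads("""
{
  "name": "weibo-scrapy",
  "version": "2.2.5",
  "urls": [
    {"filename": "weibo_scrapy-2.2.5-py3-none-any.whl",
     "packagetype": "bdist_wheel",
     "digests": {"sha256": "d9d2754bc0ebc489a8c511bf8337e258db2d1994af71be579fde05731ff08dad"}},
    {"filename": "weibo_scrapy-2.2.5.tar.gz",
     "packagetype": "sdist",
     "digests": {"sha256": "4af2de4a6f1da5e26f2e724f9537be9c79c520a4bbd8044499f5f60f2b87da19"}}
  ]
}
""")

# Map each uploaded file to its sha256 digest, e.g. for download verification.
artifacts = {u["filename"]: u["digests"]["sha256"] for u in payload["urls"]}
for name, sha in sorted(artifacts.items()):
    print(name, sha[:12])
```

Fetching the live payload would only require replacing the embedded string with an HTTP GET of the JSON API URL; the field layout is the same.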