## Opinion Analysis Toolkit
A toolkit for extracting opinions and other useful information from text.
### Installation
```shell
pip install opinionx
```
### Example Usage
1. Find opinions
```python
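from opinionx.text import get_opinion

# Read the input document as UTF-8 text.
with open("test.txt", "r", encoding="utf-8") as f:
    text = f.read()
# Trigger words that introduce a reported opinion (Chinese verbs such as
# "stated", "believes", "said", "pointed out", plus the full-width colon).
opinion_words = ['表示', '认为', '说', '介绍', '提出', '透露', '指出', '强调', '：']
list_opinion, _, _ = get_opinion(text, lang='zh', opinion_words=opinion_words)
for opinion in list_opinion:
    print(opinion)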
```
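To make the idea behind example 1 concrete, here is a minimal, self-contained sketch of keyword-triggered opinion extraction. It is not how `opinionx` is implemented; the helper `find_opinion_sentences` and its sentence-splitting rule are assumptions made purely for illustration.

```python
import re


def find_opinion_sentences(text, opinion_words):
    """Return (sentence, trigger) pairs for sentences containing an opinion word.

    Hypothetical helper for illustration; not part of opinionx.
    """
    # Split on Chinese/Western sentence-ending punctuation and newlines.
    sentences = [s.strip() for s in re.split(r"[。！？!?\n]+", text) if s.strip()]
    results = []
    for sentence in sentences:
        for word in opinion_words:
            if word in sentence:
                results.append((sentence, word))
                break  # record only the first trigger word that matches
    return results


sample = "专家表示，经济正在复苏。今天天气很好。他认为风险仍然存在。"
pairs = find_opinion_sentences(sample, ["表示", "认为"])
for sentence, trigger in pairs:
    print(trigger, "->", sentence)
```

The sketch keeps only sentences that contain a trigger word, so the middle sentence about the weather is dropped.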
2. Find leaders' opinions
```python
from opinionx.text import get_leader_opinions

with open("test.txt", "r", encoding="utf-8") as f:
    text = f.read()

list_opinion = get_leader_opinions(
    text,
    save_path="",
    search_keywords_path="data/search_keywords.csv",
    leader_path="data/g20_leaders.csv",
)
print()
for opinion in list_opinion:
    print(opinion)
    print(opinion["opinion"])
    print(opinion["first_found_keyword"])
    print(opinion["first_found_leader"])
    print()
```
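The result fields used in example 2 (`opinion`, `first_found_keyword`, `first_found_leader`) suggest that each hit pairs a statement with a leader and a search keyword. The sketch below illustrates that shape under stated assumptions: `find_leader_statements` is a hypothetical helper, the leader and keyword lists are passed in directly instead of being read from CSV files, and "first found" here simply means first in list order. It is not the library's implementation.

```python
import re


def find_leader_statements(text, leaders, keywords):
    """Return dicts for sentences that mention both a leader and a keyword.

    Hypothetical helper for illustration; not part of opinionx.
    """
    sentences = [s.strip() for s in re.split(r"[。！？!?\n]+", text) if s.strip()]
    found = []
    for sentence in sentences:
        # Take the first leader and the first keyword (in list order) that occur.
        leader = next((name for name in leaders if name in sentence), None)
        keyword = next((kw for kw in keywords if kw in sentence), None)
        if leader is not None and keyword is not None:
            found.append({
                "opinion": sentence,
                "first_found_keyword": keyword,
                "first_found_leader": leader,
            })
    return found


sample = "拜登表示气候变化需要紧急行动。市场今天很平静。马克龙强调贸易必须保持开放。"
hits = find_leader_statements(sample, ["拜登", "马克龙"], ["气候", "贸易"])
for hit in hits:
    print(hit["first_found_leader"], hit["first_found_keyword"], hit["opinion"])
```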
3. Run TF-IDF and TF models over a large collection of text files
```python
from opinionx.tfidf_shell import run_tfidf_shell

run_tfidf_shell(
    input_folder="tfidf_folder/raw_data",  # folder containing the text files
    output_folder="tfidf_folder/output",  # output folder
    user_dict_path="tfidf_folder/user_dictionaries",  # folder of CSV files, one word per line
    font_path="utils/fonts/SimHei.ttf",  # font to use when analyzing Chinese text
    is_html=True,
)
```
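For readers unfamiliar with the scores that example 3 computes, the following is a minimal pure-Python sketch of TF-IDF. It assumes the common definitions tf = term count / document length and idf = log(N / document frequency); the exact weighting used by `run_tfidf_shell` is not documented here and may differ. The function `tf_idf` is a hypothetical name, and a real pipeline for Chinese text would first segment each document with a tokenizer such as jieba.

```python
import math
from collections import Counter


def tf_idf(documents):
    """Compute TF-IDF scores for a list of tokenized documents.

    Returns one {term: score} dict per document.
    """
    n_docs = len(documents)
    # Document frequency: in how many documents each term appears.
    doc_freq = Counter()
    for tokens in documents:
        doc_freq.update(set(tokens))

    scores = []
    for tokens in documents:
        counts = Counter(tokens)
        total = len(tokens)
        scores.append({
            term: (count / total) * math.log(n_docs / doc_freq[term])
            for term, count in counts.items()
        })
    return scores


docs = [
    "the economy is growing".split(),
    "the economy is slowing".split(),
    "the weather is nice".split(),
]
scores = tf_idf(docs)
for doc_scores in scores:
    top = max(doc_scores, key=doc_scores.get)
    print(top, round(doc_scores[top], 3))
```

Terms that occur in every document (here "the" and "is") get a score of zero, which is why TF-IDF surfaces the words that distinguish one file from the rest.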
### Credits & References
- [Stanza](https://stanfordnlp.github.io/stanza/index.html)
- [jieba](https://github.com/fxsjy/jieba)
### License
The `opinionx` project is provided by [Donghua Chen](https://github.com/dhchenx) under the MIT license.