# `liwc`
[![PyPI version](https://badge.fury.io/py/liwc.svg)](https://pypi.org/project/liwc/)
[![Travis CI Build Status](https://travis-ci.org/chbrown/liwc-python.svg?branch=master)](https://travis-ci.org/chbrown/liwc-python)
Linguistic Inquiry and Word Count (LIWC) analyzer.
The LIWC lexicon is proprietary, so it is _not_ included in this repository,
but this Python package requires it.
The lexicon data can be acquired (purchased) from [liwc.net](http://liwc.net/).
This package reads from the `LIWC2007_English100131.dic` (MD5: `2a8c06ee3748218aa89b975574b4e84d`) file,
which must be available on any system where this package is used.
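
Since the lexicon file is obtained separately, it can be useful to confirm that the copy on disk matches the MD5 digest quoted above before loading it. The following is a minimal sketch using only the standard library; the helper names `file_md5` and `is_expected_lexicon` are hypothetical and not part of this package.

```python
import hashlib

# MD5 digest quoted in this README for LIWC2007_English100131.dic
EXPECTED_MD5 = "2a8c06ee3748218aa89b975574b4e84d"


def file_md5(path, chunk_size=65536):
    """Return the hex MD5 digest of the file at `path`, read in chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as f:
        # iter() with a sentinel stops once read() returns empty bytes
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def is_expected_lexicon(path):
    """True if the file's MD5 matches the digest quoted in this README."""
    return file_md5(path) == EXPECTED_MD5
```

For example, `is_expected_lexicon('LIWC2007_English100131.dic')` would return `False` for a different release or a corrupted copy of the lexicon.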
The LIWC2007 `.dic` format looks like this:

    %
    1 funct
    2 pronoun
    [...]
    %
    a 1 10
    abdomen* 146 147
    about 1 16 17
    [...]

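
To make the layout concrete: the lines between the two `%` markers map category ids to names, and each line after the second `%` lists a word pattern followed by the ids of its categories. The sketch below reads that layout into plain Python structures. It is an illustration only, not this package's implementation; `read_dic_sketch` is a hypothetical helper, and the sample entries are invented rather than taken from the real lexicon.

```python
def read_dic_sketch(lines):
    """Parse LIWC2007-style .dic lines into (lexicon, category_names).

    Hypothetical helper for illustration; not part of the liwc package API.
    """
    category_by_id = {}   # id string -> category name
    category_names = []   # category names in file order
    lexicon = {}          # word pattern -> list of category names
    percent_lines_seen = 0
    for line in lines:
        line = line.strip()
        if not line:
            continue
        if line == "%":
            percent_lines_seen += 1
            continue
        parts = line.split()
        if percent_lines_seen == 1:
            # Header section: "<id> <name>"
            category_by_id[parts[0]] = parts[1]
            category_names.append(parts[1])
        elif percent_lines_seen >= 2:
            # Body section: "<pattern> <id> <id> ..."
            lexicon[parts[0]] = [category_by_id[i] for i in parts[1:]]
    return lexicon, category_names


# Invented toy fragment in the same layout (not real LIWC entries)
sample = """%
1 funct
2 pronoun
%
foo 1
bar* 1 2
"""
lexicon, category_names = read_dic_sketch(sample.splitlines())
print(lexicon["bar*"])   # ['funct', 'pronoun']
print(category_names)    # ['funct', 'pronoun']
```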
## Setup
Install from [PyPI](https://pypi.python.org/pypi/liwc):

    pip install -U liwc

## Example
```python
import re
from collections import Counter

import liwc


def tokenize(text):
    # you may want to use a smarter tokenizer
    for match in re.finditer(r'\w+', text, re.UNICODE):
        yield match.group(0)


# the proprietary lexicon file must exist at this path
parse, category_names = liwc.load_token_parser('LIWC2007_English100131.dic')
```
* `parse` is a function that takes a token of text (a string) and returns the LIWC categories it matches (a list of strings)
* `category_names` is a list of all LIWC category names in the lexicon (a list of strings)
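
For readers without the lexicon at hand, the shape of a token-to-categories function can be sketched with a toy stand-in. This is not the package's implementation: `make_parse_sketch` is a hypothetical helper, the category names are made up, and it assumes that a trailing `*` in a pattern (as in `abdomen*`) matches any token beginning with the text before it.

```python
def make_parse_sketch(lexicon):
    """Build a token -> categories function from {pattern: [category, ...]}.

    Hypothetical stand-in for illustration; assumes a trailing `*` in a
    pattern matches any token that starts with the text before it.
    """
    exact = {p: cats for p, cats in lexicon.items() if not p.endswith("*")}
    prefixes = {p[:-1]: cats for p, cats in lexicon.items() if p.endswith("*")}

    def parse(token):
        # Exact matches first, then any wildcard prefixes that apply
        matches = list(exact.get(token, []))
        for prefix, cats in prefixes.items():
            if token.startswith(prefix):
                matches.extend(cats)
        return matches

    return parse


# Toy lexicon with made-up category names (not real LIWC data)
toy_parse = make_parse_sketch({"about": ["funct"], "abdomen*": ["body"]})
print(toy_parse("about"))      # ['funct']
print(toy_parse("abdomens"))   # ['body']
print(toy_parse("abdominal"))  # []
```

A linear scan over prefixes is fine for a toy example; a full-size lexicon would call for a faster lookup structure.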
```python
gettysburg = '''Four score and seven years ago our fathers brought forth on
this continent a new nation, conceived in liberty, and dedicated to the
proposition that all men are created equal. Now we are engaged in a great
civil war, testing whether that nation, or any nation so conceived and so
dedicated, can long endure. We are met on a great battlefield of that war.
We have come to dedicate a portion of that field, as a final resting place
for those who here gave their lives that that nation might live. It is
altogether fitting and proper that we should do this.'''
gettysburg_tokens = tokenize(gettysburg)
# now flatmap over all the categories in all of the tokens using a generator:
gettysburg_counts = Counter(category for token in gettysburg_tokens for category in parse(token))
# and print the results:
print(gettysburg_counts)
```
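
The resulting `Counter` can be summarized in the usual ways, for instance by listing the most frequent categories or by dividing by the token count to get per-token proportions. The sketch below uses made-up counts in place of real LIWC output, and `category_proportions` is a hypothetical helper rather than part of this package.

```python
from collections import Counter


def category_proportions(counts, total_tokens):
    """Map each category to its count divided by the number of tokens."""
    if total_tokens == 0:
        return {}
    return {category: n / float(total_tokens) for category, n in counts.items()}


# Made-up counts standing in for `gettysburg_counts` (not real LIWC output)
example_counts = Counter({"funct": 6, "pronoun": 3, "verb": 1})
print(example_counts.most_common(2))                       # [('funct', 6), ('pronoun', 3)]
print(category_proportions(example_counts, 10)["funct"])   # 0.6
```

Because `tokenize` is a generator, its output is consumed when the `Counter` is built; collect the tokens into a list first (for example `tokens = list(tokenize(text))`) if the total token count is needed afterwards.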
## License
Copyright (c) 2012-2019 Christopher Brown.
[MIT Licensed](LICENSE.txt).