| Field | Value |
| --- | --- |
| Name | CTApy |
| Version | 0.1.4 |
| home_page | https://github.com/twekhof/CTA |
| Summary | Python package for the Conditional Topic Allocation (CTA) |
| upload_time | 2024-08-22 10:06:56 |
| maintainer | None |
| docs_url | None |
| author | Tobias Wekhof |
| requires_python | >=3.9 |
| license | MIT |
| keywords | |
| VCS | |
| bugtrack_url | |
| requirements | No requirements were recorded. |
| Travis-CI | No Travis. |
| coveralls test coverage | No coveralls. |

# `CTApy`
Python package for Conditional Topic Allocation (CTA), a text-analysis method that identifies topics correlated with numerical outcomes.
* Corresponding research paper: [Conditional Topic Allocations for Open-Ended Survey Responses (2024)](https://papers.ssrn.com/sol3/papers.cfm?abstract_id=4190308).
## How does CTA work?
CTA finds topics by conditioning on observables. For example, do Republicans write differently about politics than Democrats?
It consists of three steps:

1. Predict the outcome variable with text.
   * Uses DistilBERT to predict the outcome.
2. Select words with high predictive power (positive or negative).
   * Calculates SHAP values for each word and selects words with a statistically significant SHAP value.
3. Group words by semantic similarity.
   * Returns topics with either positive or negative correlation with the outcome.

CTA supports all languages.
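
The three steps above can be illustrated with a toy sketch that uses only the Python standard library. It is not CTApy's implementation and does not call its API. DistilBERT and SHAP are replaced by a much simpler word score (mean outcome of responses containing a word minus mean outcome of the rest) with a permutation test for significance, and semantic similarity uses hand-written, hypothetical 2-D word vectors. All function names, data, and thresholds here are assumptions made for illustration.

```python
import math
import random
from statistics import mean


def word_scores(texts, outcomes):
    """Toy stand-in for steps 1-2: mean outcome with a word minus mean outcome without it."""
    tokens = [set(t.lower().split()) for t in texts]
    scores = {}
    for word in set().union(*tokens):
        with_w = [y for toks, y in zip(tokens, outcomes) if word in toks]
        without_w = [y for toks, y in zip(tokens, outcomes) if word not in toks]
        if with_w and without_w:  # skip words that appear in every response
            scores[word] = mean(with_w) - mean(without_w)
    return scores


def significant_words(texts, outcomes, n_perm=500, alpha=0.05, seed=0):
    """Keep words whose score is rarely matched when the outcomes are shuffled."""
    rng = random.Random(seed)
    observed = word_scores(texts, outcomes)
    exceed = dict.fromkeys(observed, 0)
    for _ in range(n_perm):
        shuffled = list(outcomes)
        rng.shuffle(shuffled)
        permuted = word_scores(texts, shuffled)
        for word, score in observed.items():
            if abs(permuted[word]) >= abs(score):
                exceed[word] += 1
    return {w: s for w, s in observed.items()
            if (exceed[w] + 1) / (n_perm + 1) <= alpha}


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    return dot / (math.hypot(*u) * math.hypot(*v))


def group_words(selected, vectors, threshold=0.8):
    """Toy step 3: greedily cluster same-sign words that are close to a cluster's first word."""
    topics = {"positive": [], "negative": []}
    for word, score in sorted(selected.items()):
        clusters = topics["positive" if score > 0 else "negative"]
        for cluster in clusters:
            if cosine(vectors[word], vectors[cluster[0]]) >= threshold:
                cluster.append(word)
                break
        else:
            clusters.append([word])
    return topics


# Hypothetical survey responses with a binary outcome (illustrative data only).
texts = (["taxes income policy"] * 3 + ["taxes income"] * 3
         + ["climate energy policy"] * 3 + ["climate energy"] * 3)
outcomes = [1] * 6 + [0] * 6
# Hypothetical 2-D embeddings; a real pipeline would use learned word vectors.
vectors = {"taxes": (1.0, 0.1), "income": (0.9, 0.2),
           "climate": (0.1, 1.0), "energy": (0.2, 0.9)}

selected = significant_words(texts, outcomes)
topics = group_words(selected, vectors)
print(topics)
```

In this toy data, "policy" appears equally often at both outcome values, so it is filtered out in the selection step, while the remaining words are split into one topic per direction of correlation.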
## Installation
CTApy requires Python 3.9 or later and pip.
It is highly recommended to use a virtual environment (or conda environment) for the installation.
```bash
# upgrade pip, wheel and setuptools
python -m pip install -U pip wheel setuptools
# install the package
python -m pip install -U CTApy
```
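
After installing, a short check like the one below can confirm that the interpreter meets the `>=3.9` requirement listed in the package metadata and that the distribution is visible in the current environment. This is a minimal sketch using only the standard library. The helper names `python_is_supported` and `installed_version` are hypothetical and not part of CTApy.

```python
import sys
from importlib import metadata


def python_is_supported(version_info=sys.version_info, minimum=(3, 9)):
    """Return True if the (major, minor) version is at least `minimum`."""
    return tuple(version_info[:2]) >= minimum


def installed_version(dist_name="CTApy"):
    """Return the installed version of a distribution, or None if it is not installed."""
    try:
        return metadata.version(dist_name)
    except metadata.PackageNotFoundError:
        return None


print("Python supported:", python_is_supported())
print("CTApy version:", installed_version() or "not installed")
```

Running this inside the virtual environment (rather than the system interpreter) shows which environment the package was actually installed into.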
If you want to use Jupyter, make sure you have it installed in the current environment.
## Quickstart
Please see the hands-on tutorials, which replicate the research paper: [https://github.com/twekhof/CTA/tree/main/tutorials](https://github.com/twekhof/CTA/tree/main/tutorials).
## Author
`CTApy` was developed by [Tobias Wekhof](https://tobiaswekhof.com), ETH Zurich.
## Disclaimer
This Python package is a research tool currently under development. The authors take no responsibility for the accuracy or reliability of the results produced by it.
## Raw data
{
"_id": null,
"home_page": "https://github.com/twekhof/CTA",
"name": "CTApy",
"maintainer": null,
"docs_url": null,
"requires_python": ">=3.9",
"maintainer_email": null,
"keywords": null,
"author": "Tobias Wekhof",
"author_email": "tobiaswekhof@gmail.com",
"download_url": "https://files.pythonhosted.org/packages/10/f0/5911f184262209332830461edf159a2aff137e786675285fc3497cacf159/ctapy-0.1.4.tar.gz",
"platform": null,
"bugtrack_url": null,
"license": "MIT",
"summary": "Python package for the Conditional Topic Allocation (CTA)",
"version": "0.1.4",
"project_urls": {
"Homepage": "https://github.com/twekhof/CTA"
},
"split_keywords": [],
"urls": [
{
"comment_text": "",
"digests": {
"blake2b_256": "04a33142356f920c0532354531a37ed08e7f2c32d298eb323d6d9bb0182014ef",
"md5": "6ccd9081ea686a84ffaa127fec46a0b1",
"sha256": "51e9eb7f901fbb50fec3f4b9e5a651daca2c848ce73240f805675cbda7af65bf"
},
"downloads": -1,
"filename": "CTApy-0.1.4-py3-none-any.whl",
"has_sig": false,
"md5_digest": "6ccd9081ea686a84ffaa127fec46a0b1",
"packagetype": "bdist_wheel",
"python_version": "py3",
"requires_python": ">=3.9",
"size": 12709,
"upload_time": "2024-08-22T10:06:54",
"upload_time_iso_8601": "2024-08-22T10:06:54.798210Z",
"url": "https://files.pythonhosted.org/packages/04/a3/3142356f920c0532354531a37ed08e7f2c32d298eb323d6d9bb0182014ef/CTApy-0.1.4-py3-none-any.whl",
"yanked": false,
"yanked_reason": null
},
{
"comment_text": "",
"digests": {
"blake2b_256": "10f05911f184262209332830461edf159a2aff137e786675285fc3497cacf159",
"md5": "9723d6a0cb7c9a2e89bb7387fed20ef1",
"sha256": "166cf8ea9e2b8e93b07a2df359d995ea55ddb5cf2e288b6be50f5b93369b7de9"
},
"downloads": -1,
"filename": "ctapy-0.1.4.tar.gz",
"has_sig": false,
"md5_digest": "9723d6a0cb7c9a2e89bb7387fed20ef1",
"packagetype": "sdist",
"python_version": "source",
"requires_python": ">=3.9",
"size": 9743,
"upload_time": "2024-08-22T10:06:56",
"upload_time_iso_8601": "2024-08-22T10:06:56.671145Z",
"url": "https://files.pythonhosted.org/packages/10/f0/5911f184262209332830461edf159a2aff137e786675285fc3497cacf159/ctapy-0.1.4.tar.gz",
"yanked": false,
"yanked_reason": null
}
],
"upload_time": "2024-08-22 10:06:56",
"github": true,
"gitlab": false,
"bitbucket": false,
"codeberg": false,
"github_user": "twekhof",
"github_project": "CTA",
"travis_ci": false,
"coveralls": false,
"github_actions": false,
"lcname": "ctapy"
}
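
The `digests` entries in the raw data can be used to check a manually downloaded file. The sketch below compares a local file's SHA-256 hash with an expected hex digest. The helper name `sha256_matches` and the local file location are assumptions for illustration, and the example only runs the check if the file is present in the working directory.

```python
import hashlib
from pathlib import Path


def sha256_matches(path, expected_hex, chunk_size=1 << 16):
    """Return True if the file's SHA-256 hex digest equals `expected_hex`."""
    digest = hashlib.sha256()
    with open(path, "rb") as fh:
        # Read in chunks so large files do not need to fit in memory.
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest() == expected_hex.lower()


# sha256 value copied from the sdist entry in the raw data above.
expected = "166cf8ea9e2b8e93b07a2df359d995ea55ddb5cf2e288b6be50f5b93369b7de9"
sdist = Path("ctapy-0.1.4.tar.gz")  # assumed local download location
if sdist.exists():
    print("sdist digest OK:", sha256_matches(sdist, expected))
```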