AutoThemeGenerator

Name	AutoThemeGenerator
Version	0.1.9
home_page	None
Summary	Performing thematic analysis with OpenAI's GPT-4 models
upload_time	2025-01-11 01:40:36
maintainer	None
docs_url	None
author	Charles Alba
requires_python	None
license	None
keywords	gpt models thematic analysis openai qualitiative studies transcripts interviews
VCS
bugtrack_url
requirements	No requirements were recorded.
Travis-CI	No Travis.
coveralls test coverage	No coveralls.

AutoThemeGenerator is a package for performing thematic analysis in qualitative studies using OpenAI's GPT models.





[![Documentation](https://img.shields.io/badge/Documentation-v0.1.9-orange)](https://cja5553.github.io/ReadTheDocs_AutoThemeGenerator/) [![pypi package](https://img.shields.io/badge/pypi_package-v0.1.9-brightgreen)](https://pypi.org/project/AutoThemeGenerator/) [![GitHub Source Code](https://img.shields.io/badge/github_source_code-source_code?logo=github&color=green)](https://github.com/cja5553/AutoThemeGenerator) [![Colab Example](https://img.shields.io/badge/-Colab_Example-grey?logo=google&logoColor=F9AB00)](https://colab.research.google.com/drive/1BoAI-QNL-yL8j8hUJ3K8cJkbyp4spoQ3)





## User inputs

Users only need to specify the folder where their interview transcripts are stored. Accepted transcript formats are `PDF`, `.docx`, and `.txt` (preferred). `AutoThemeGenerator` assumes that each document is the transcript of a single interviewed participant.
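Before running an analysis, it can help to confirm which files in the folder would count as transcripts. The following is a minimal sketch, not part of `AutoThemeGenerator`: the helper name `list_transcripts` is hypothetical, and the case-insensitive extension match is an assumption about how the formats above should be recognized.

```python
from pathlib import Path

# Extensions corresponding to the accepted transcript formats listed above.
ACCEPTED_SUFFIXES = {".pdf", ".docx", ".txt"}


def list_transcripts(directory_path):
    """Return the transcript files in a folder, sorted by path (hypothetical helper)."""
    folder = Path(directory_path)
    if not folder.is_dir():
        raise NotADirectoryError(f"{directory_path!r} is not a folder")
    # Keep regular files only, matching the extension case-insensitively.
    return sorted(
        p for p in folder.iterdir()
        if p.is_file() and p.suffix.lower() in ACCEPTED_SUFFIXES
    )
```

Since each document is treated as one participant, the length of the returned list should match the number of interviewees you expect.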



## Requirements

### Required packages

To use `AutoThemeGenerator`, you are required to have the following packages installed:  

- `openai`  

- `docx`    

- `tqdm`    

- `nltk`    

- `nltk.tokenize` (submodule of `nltk`)   

- `python-docx`  

- `textract`  

- `requests`  

- `zipfile` (Python standard library)   

- `shutil`  (Python standard library)  

- `json`  (Python standard library)  

- `pprint`  (Python standard library)  



If these packages are not yet installed in your Python environment, install them with `pip`:

```bash

pip install openai==1.12.0 python-docx docx tqdm nltk textract requests

```
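To see which of the third-party packages are still missing, a quick check along the following lines may be useful. This is a sketch under assumptions: the helper `missing_modules` is hypothetical, and the list uses import names (for example, `python-docx` is imported as `docx`) rather than PyPI distribution names.

```python
from importlib.util import find_spec

# Import names for the third-party packages listed above (assumed mapping).
REQUIRED_MODULES = ["openai", "docx", "tqdm", "nltk", "textract", "requests"]


def missing_modules(module_names):
    """Return the top-level module names that cannot be found in this environment."""
    return [name for name in module_names if find_spec(name) is None]


# An empty list means every required module can be located.
print(missing_modules(REQUIRED_MODULES))
```

The check only confirms that a module can be located; it does not verify the installed version, such as the `openai==1.12.0` pin in the install command.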

### OpenAI API key

You also need an OpenAI API key to use this package. If you do not have one, you can create one at [platform.openai.com/api-keys](https://platform.openai.com/api-keys).
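One way to avoid pasting the key directly into scripts is to read it from an environment variable. The sketch below is an illustration rather than part of the package: `load_api_key` is a hypothetical helper, and the variable name `OPENAI_API_KEY` is an assumed convention that you can change.

```python
import os


def load_api_key(env_var="OPENAI_API_KEY"):
    """Read the API key from an environment variable (hypothetical helper)."""
    key = os.environ.get(env_var, "").strip()
    if not key:
        # Fail early with a clear message instead of sending an empty key.
        raise RuntimeError(f"Set the {env_var} environment variable first")
    return key
```

The returned string can then be passed wherever the package expects an `api_key` argument.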





### `pip` version  

The package can only be installed with a `pip` version older than `24.1`. Newer versions of `pip` will not work due to compatibility issues with `textract`. To downgrade to a version older than `24.1`, run the following:

```bash

pip install "pip<24.1"

```
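To check whether the current environment already satisfies this constraint, the installed `pip` version can be inspected from Python. This is a rough sketch: `release_tuple` and `pip_is_older_than` are hypothetical helpers, and only the leading major and minor numbers are compared, so pre-release suffixes are ignored.

```python
import re
from importlib.metadata import PackageNotFoundError, version


def release_tuple(version_string):
    """Parse the leading 'major.minor' numbers of a version string."""
    match = re.match(r"(\d+)(?:\.(\d+))?", version_string)
    if match is None:
        raise ValueError(f"Unrecognized version: {version_string!r}")
    return int(match.group(1)), int(match.group(2) or 0)


def pip_is_older_than(version_string, limit=(24, 1)):
    """Return True when the version is strictly older than `limit` (24.1 by default)."""
    return release_tuple(version_string) < limit


try:
    installed = version("pip")
    print(installed, "OK" if pip_is_older_than(installed) else "needs downgrade")
except PackageNotFoundError:
    print("pip is not installed in this environment")
```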



## Installation

To install the package, run the following:

```bash

pip install AutoThemeGenerator

```



## Quick Start

Here is a quick example of how to run `AutoThemeGenerator` to perform qualitative analysis on your transcripts. For details on each of the package's functions and parameters, refer to the [documentation](https://cja5553.github.io/ReadTheDocs_AutoThemeGenerator/).

```python
from AutoThemeGenerator import analyze_and_synthesize_transcripts

# Folder containing your transcripts in .docx, .PDF or .txt format
directory_path = "my_transcript_folder"
# Your OpenAI API key
api_key = "<insert your API key>"
# Folder where you wish to save your themes
save_results_path = "folder_of_my_saved_results"

# Context of your study
context = (
    "Physical inactivity is a major risk factor for developing several chronic illnesses. "
    "However, university students and staff in the UK are found to be more physically inactive "
    "compared to the general UK population. "
)

# Your research questions
research_questions = (
    "This study seeks to understand the barriers and enablers "
    "of physical activity (PA) among university staff and students in "
    "the UK under the university setting, using the Theoretical "
    "Domain Framework (TDF) to guide the investigation. "
)

# Your survey script
survey_script = (
    "Knowledge\n "
    "What do you know about physical activity? How might you define physical activity? "
    "... ..."  # note: truncated to save space
    "... ..."
)

# Analyze and synthesize transcripts.
# Parentheses are used instead of a backslash continuation so the
# statement stays valid regardless of blank lines.
initial_themes, individual_synthesized_themes, overall_synthesized_themes = (
    analyze_and_synthesize_transcripts(
        directory_path=directory_path,
        context=context,
        research_questions=research_questions,
        script=survey_script,
        api_key=api_key,
        save_results_path=save_results_path,
    )
)

# Display your study-level themes
print(overall_synthesized_themes)
```

You can now view the themes, each presented as a topic sentence, a detailed explanation, and a relevant quote.



## Citation

Yuyi Yang, Charles Alba, Chenyu Wang, Xi Wang, Jami Anderson, and Ruopeng An. "GPT Models Can Perform Thematic Analysis in Public Health Studies, Akin to Qualitative Researchers." Journal of Social Computing, vol. 5, no. 4, (2024): 293-312. doi: [10.23919/JSC.2024.0024](https://doi.org/10.23919/JSC.2024.0024)



## Questions?



Contact me at [alba@wustl.edu](mailto:alba@wustl.edu)

Raw data

            {
    "_id": null,
    "home_page": null,
    "name": "AutoThemeGenerator",
    "maintainer": null,
    "docs_url": null,
    "requires_python": null,
    "maintainer_email": null,
    "keywords": "GPT models, Thematic analysis, OpenAI, Qualitiative studies, transcripts, interviews",
    "author": "Charles Alba",
    "author_email": "alba@wustl.edu",
    "download_url": "https://files.pythonhosted.org/packages/2a/b2/0e1b1d1492a58990bf70a978b6d9fac0df5cd2c4921151a131c48dad8312/autothemegenerator-0.1.9.tar.gz",
    "platform": null,
    "bugtrack_url": null,
    "license": null,
    "summary": "Performing thematic analysis with OpenAI's GPT-4 models",
    "version": "0.1.9",
    "project_urls": null,
    "split_keywords": [
        "gpt models",
        " thematic analysis",
        " openai",
        " qualitiative studies",
        " transcripts",
        " interviews"
    ],
    "urls": [
        {
            "comment_text": "",
            "digests": {
                "blake2b_256": "84251b103c59bfb49ae84b99e08fddd9e54bce4777d2440cce425543912be4e4",
                "md5": "931061a9bb28ab17e322fe36693897ee",
                "sha256": "a0b8afd15f134190639d4685172c3c10bda356a19fe1e6b0fd9c214a5d26851f"
            },
            "downloads": -1,
            "filename": "AutoThemeGenerator-0.1.9-py3-none-any.whl",
            "has_sig": false,
            "md5_digest": "931061a9bb28ab17e322fe36693897ee",
            "packagetype": "bdist_wheel",
            "python_version": "py3",
            "requires_python": null,
            "size": 12021,
            "upload_time": "2025-01-11T01:40:32",
            "upload_time_iso_8601": "2025-01-11T01:40:32.652283Z",
            "url": "https://files.pythonhosted.org/packages/84/25/1b103c59bfb49ae84b99e08fddd9e54bce4777d2440cce425543912be4e4/AutoThemeGenerator-0.1.9-py3-none-any.whl",
            "yanked": false,
            "yanked_reason": null
        },
        {
            "comment_text": "",
            "digests": {
                "blake2b_256": "2ab20e1b1d1492a58990bf70a978b6d9fac0df5cd2c4921151a131c48dad8312",
                "md5": "fcae629c03bbe240e7dfe5d1ca1a615a",
                "sha256": "3ac63ebef8c879181b4816459600ed097cb89ba89f13d0399122b5815c671656"
            },
            "downloads": -1,
            "filename": "autothemegenerator-0.1.9.tar.gz",
            "has_sig": false,
            "md5_digest": "fcae629c03bbe240e7dfe5d1ca1a615a",
            "packagetype": "sdist",
            "python_version": "source",
            "requires_python": null,
            "size": 12280,
            "upload_time": "2025-01-11T01:40:36",
            "upload_time_iso_8601": "2025-01-11T01:40:36.389917Z",
            "url": "https://files.pythonhosted.org/packages/2a/b2/0e1b1d1492a58990bf70a978b6d9fac0df5cd2c4921151a131c48dad8312/autothemegenerator-0.1.9.tar.gz",
            "yanked": false,
            "yanked_reason": null
        }
    ],
    "upload_time": "2025-01-11 01:40:36",
    "github": false,
    "gitlab": false,
    "bitbucket": false,
    "codeberg": false,
    "lcname": "autothemegenerator"
}
