GailBot


Name: GailBot
Version: 0.0.3a7
Summary: GailBot API
Upload time: 2024-08-23 17:47:10
Requires Python: >=3.10
License: MIT License Copyright (c) 2023 jasonycwu Permission is hereby granted, free of charge, to any person obtaining a copy of this software and associated documentation files (the "Software"), to deal in the Software without restriction, including without limitation the rights to use, copy, modify, merge, publish, distribute, sublicense, and/or sell copies of the Software, and to permit persons to whom the Software is furnished to do so, subject to the following conditions: The above copyright notice and this permission notice shall be included in all copies or substantial portions of the Software. THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE SOFTWARE.

# GailBot

## About

Researchers studying human interaction, such as conversation analysts, psychologists, and linguists, all rely on detailed transcriptions of language use. Ideally, these should include so-called paralinguistic features of talk, such as overlaps, prosody, and intonation, as they convey important information. However, transcribing these features by hand requires substantial amounts of time from trained transcribers. There are currently no Speech-to-Text (STT) systems that are able to annotate these features. To reduce the resources needed to create transcripts that include paralinguistic features, we developed a program called GailBot. GailBot combines STT services with plugins to automatically generate first drafts of conversation analytic transcripts. It also enables researchers to add new plugins to transcribe additional features, or to improve the plugins it currently uses. We argue that despite its limitations, GailBot represents a substantial improvement over existing dialogue transcription software.

Find the full paper, published in Dialogue & Discourse, [here](https://journals.uic.edu/ojs/index.php/dad/article/view/11392).


## Status

GailBot version: 0.0.3a7
Release type: API


## Installation

You will need to run GailBot with Python 3.10. GailBot and its dependencies can be installed with pip using the following commands:
```
pip install --upgrade pip
pip install gailbot
pip install git+https://github.com/m-bain/whisperx.git
```
To install GailBot in a virtual environment (here, a conda environment), use the following commands:
```
conda create --name gailbot-api python==3.10.6
conda activate gailbot-api
pip install --upgrade pip
pip install gailbot
pip install git+https://github.com/m-bain/whisperx.git

```
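Before installing, it can help to confirm that the active interpreter is recent enough. The following is a minimal sketch, assuming the minimum of 3.10 stated in the package metadata (`requires_python >=3.10`); the function name `python_is_supported` is hypothetical and not part of GailBot.

```python
import sys

# Minimum version taken from the package metadata (requires_python >= 3.10).
MIN_PYTHON = (3, 10)


def python_is_supported(version_info=sys.version_info):
    """Return True if the given interpreter version meets the minimum."""
    return tuple(version_info[:2]) >= MIN_PYTHON


if __name__ == "__main__":
    current = sys.version.split()[0]
    if python_is_supported():
        print("Python version OK:", current)
    else:
        print("GailBot needs Python 3.10 or later; found", current)
```

Run the script inside the environment you plan to install GailBot into, since the result depends on which interpreter is active.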

## Usage - GailBot API

This release features a convenient API for using GailBot and creating custom plugin suites. To use the API and its features, import GailBot from the package as follows:

```
from gailbot import GailBot
```

Once you have imported GailBot, initialize an instance called "gb" (or a name of your choosing) as shown below. The "ws_root" argument, omitted here, takes a path to your workspace directory:

```
gb = GailBot()
```
The GailBot API's methods are now available through your GailBot instance. Check out GailBot's backend documentation for a full list of methods and their uses [here](https://gailbot-release-document.s3.us-east-2.amazonaws.com/Documentation/Backend_Technical_Documentation.pdf).


### Example Usage 1 - Default Settings
Next, we will run GailBot on an input audio file.
To do so, we need to initialize a GailBot instance, add the input audio file as a source, and transcribe.
This example uses GailBot's default settings, with the default engine (Whisper) and the pre-installed plugin suite, so there is no need to create and apply profile settings.
See the example below:
```
from gailbot import GailBot

gb = GailBot()
gb.add_source(
    source_path="your_source_file_path",
    output_dir="your_output_directory_path",
)
gb.transcribe()
```
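The paths passed to `add_source` can be checked before GailBot is involved at all. The sketch below is a hypothetical helper, not part of the GailBot API: it verifies that the source exists and creates the output directory, then returns a dictionary keyed by the `source_path` and `output_dir` argument names used in the example above. Whether GailBot creates a missing output directory itself is not stated in this README, so creating it up front is an assumption made for safety.

```python
from pathlib import Path


def prepare_source_args(source_path, output_dir):
    """Validate the input path and create the output directory.

    Returns a dict whose keys match the keyword arguments used in the
    add_source example (source_path, output_dir).
    """
    source = Path(source_path).expanduser()
    if not source.exists():
        raise FileNotFoundError(f"Source not found: {source}")
    output = Path(output_dir).expanduser()
    output.mkdir(parents=True, exist_ok=True)  # create it if missing
    return {"source_path": str(source), "output_dir": str(output)}
```

With a GailBot instance `gb`, the result can be unpacked directly: `gb.add_source(**prepare_source_args("your_source_file_path", "your_output_directory_path"))`.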

### Example Usage 2 - Custom Profile
Here is an example of using GailBot with a customized transcription profile.
To do so, you need to create a profile and apply it to the input source files before transcribing.
The example below shows how to create and apply a profile:

```
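# Assumption: ProfileSetting can be imported from the gailbot package;
# check the backend documentation for its exact location.
from gailbot import GailBot, ProfileSetting

gb = GailBot()
google_api = "path/to/google-api.json"
source_path = "path/to/source"
output_dir = "path/to/output"

# Register a Google Cloud STT engine under a name of your choosing.
google_engine_setting = {"engine": "google", "google_api_key": google_api}
google_engine_name = "google engine"
gb.add_engine(name=google_engine_name, setting=google_engine_setting)

# Create a profile that pairs the engine with plugins from HiLabSuite.
google_profile_setting = ProfileSetting(
    engine_setting_name=google_engine_name,
    plugin_suite_setting={
        "HiLabSuite": ["XmlPlugin", "ChatPlugin", "TextPlugin", "CSVPlugin"]
    },
)
google_profile_name = "google profile"
gb.create_profile(name=google_profile_name, setting=google_profile_setting)

# Add the source, apply the profile to it, and transcribe.
source_id = gb.add_source(source_path, output_dir)
gb.apply_profile_to_source(source_id=source_id, profile_name=google_profile_name)
google_transcription_result = gb.transcribe()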
```
In the example above, we added a Google Cloud STT engine called "google engine" using your Google API key file.
Then, we used that engine to create a new profile called "google profile". The profile also uses GailBot's default plugin suite, HiLabSuite.
Finally, we applied the custom profile to our input source and transcribed it.


## Supported Plugin Suites

A core GailBot feature is its ability to apply plugin suites during the transcription process. While different use cases may require custom plugins, the Human Interaction Lab maintains and distributes a pre-developed plugin suite -- HiLabSuite. For more information about the default plugin suite, click [here](https://sites.tufts.edu/hilab/gailbot-an-automatic-transcription-system-for-conversation-analysis/).

### Custom Plugins

GailBot also allows researchers to develop and add custom plugins that may be applied during the transcription process, in addition to the built-in HiLabSuite. To create a compatible plugin suite for the GailBot app, follow these steps:

1. Prepare the Folder Structure and Files:
Ensure your plugin suite directory contains the following files with these exact names: "init.py", "CHANGELOG.md", "DOCUMENT.md", "TECH_DOCUMENT.md", "README.md", "format.md", and "config.toml". Include a subfolder named "src" within the main directory.

    File Descriptions:

    - init.py: Used for package initialization and can be left empty.
    - format.md: Provides users information about generated output files.
    - CHANGELOG.md: Documents version-to-version changes.
    - README.md: Offers a high-level explanation and purpose of the plugin suite.
    - DOCUMENT.md: Contains specifics about algorithms, plugins, and developers.
    - TECH_DOCUMENT.md: Explains technical aspects like logging implementation.
    - config.toml: Vital file that outlines plugin execution order and settings.

2. Configure config.toml:
Begin by setting suite_name = "<mySuite>", matching the directory name. Define metadata in the [metadata] section, including Author, Email, and Version.

3. Define Plugins:
For each plugin, create a section in config.toml under the plugins section. Specify plugin_name, dependencies, rel_path, and module_name.

    - plugin_name: The name of your plugin.
    - dependencies: A list of plugins that must run before this one.
    - rel_path: Path to the file with the plugin's apply function.
    - module_name: The name of the file without the .py suffix.

4. Plugin Coding:
Define each plugin as a class with the exact name from plugin_name.
Inside the class, include an apply function with the signature def apply(self, dependency_outputs: Dict[str, Any], methods: <your methods>).
At the end of the apply function, set self.is_successful = True.

5. Dependencies and Output Flow:
Use the dependency_outputs dictionary to pass outputs between plugins.
When a plugin depends on another, it receives the earlier plugin's outputs through this dictionary, with the plugin class name as the key.
Ensure plugin classes are properly formatted in config.toml for the apply function to run.
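
Steps 4 and 5 can be sketched as two small classes. This is an illustration under stated assumptions, not GailBot's implementation: the plugin names (`WordCountPlugin`, `SummaryPlugin`) are hypothetical, the `methods` argument is replaced by a plain list of words, any base class GailBot may require is omitted, and `run_in_order` is a stand-in for the host that only mimics how outputs are passed along by class name.

```python
from typing import Any, Dict, List


class WordCountPlugin:
    """Hypothetical plugin with no dependencies."""

    def __init__(self) -> None:
        self.is_successful = False

    def apply(self, dependency_outputs: Dict[str, Any], methods: Any) -> int:
        # `methods` is whatever object the host passes in; here it is a
        # plain list of words standing in for the real helper object.
        count = len(methods)
        self.is_successful = True  # set at the end of apply (step 4)
        return count


class SummaryPlugin:
    """Hypothetical plugin that depends on WordCountPlugin."""

    def __init__(self) -> None:
        self.is_successful = False

    def apply(self, dependency_outputs: Dict[str, Any], methods: Any) -> str:
        # Outputs of earlier plugins are keyed by plugin class name (step 5).
        count = dependency_outputs["WordCountPlugin"]
        summary = f"{count} words transcribed"
        self.is_successful = True
        return summary


def run_in_order(plugins: List[Any], methods: Any) -> Dict[str, Any]:
    """Stand-in for the host: run plugins in order and collect outputs.

    This is NOT GailBot's scheduler, only an illustration of the data flow.
    """
    outputs: Dict[str, Any] = {}
    for plugin in plugins:
        outputs[type(plugin).__name__] = plugin.apply(dict(outputs), methods)
    return outputs


results = run_in_order([WordCountPlugin(), SummaryPlugin()], ["hello", "world"])
print(results["SummaryPlugin"])  # prints "2 words transcribed"
```

The order of the list passed to `run_in_order` plays the role that the dependencies entries play in config.toml: a plugin must run after every plugin whose output it reads.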

With correctly structured files, code, and dependencies, your plugins will run as intended when uploaded to GailBot.
Congratulations on creating your plugin suite for GailBot, and thank you for following our tutorial! Happy transcribing!
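
To recap steps 1 to 3, the folder structure can be generated with a short script. This is a sketch under assumptions: the file names and config keys come from the steps above, but the TOML layout (a `[[plugins]]` array of tables, and `rel_path` being relative to the suite root) is a guess, and `scaffold_suite`, `MyPlugin`, and the placeholder metadata values are hypothetical. Check the result against an existing suite such as HiLabSuite before relying on it.

```python
from pathlib import Path

# File names listed in step 1 (config.toml is written separately below).
REQUIRED_FILES = [
    "init.py", "CHANGELOG.md", "DOCUMENT.md", "TECH_DOCUMENT.md",
    "README.md", "format.md",
]

# ASSUMPTION: the table layout ([metadata], [[plugins]]) is a guess based on
# the steps above; only the key names come from this README.
CONFIG_TEMPLATE = '''suite_name = "{suite}"

[metadata]
Author = "Your Name"
Email = "you@example.org"
Version = "0.0.1"

[[plugins]]
plugin_name = "MyPlugin"
dependencies = []
rel_path = "src/my_plugin.py"
module_name = "my_plugin"
'''


def scaffold_suite(parent_dir, suite_name):
    """Create an empty plugin suite skeleton and return its root path."""
    root = Path(parent_dir) / suite_name  # directory name matches suite_name
    (root / "src").mkdir(parents=True, exist_ok=True)
    for name in REQUIRED_FILES:
        (root / name).touch()  # empty placeholder files to fill in later
    (root / "config.toml").write_text(CONFIG_TEMPLATE.format(suite=suite_name))
    return root
```

Calling `scaffold_suite(".", "mySuite")` creates a `mySuite` directory in the current folder with empty placeholder files and a starter config.toml.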


## Contribute

Users are encouraged to send installation and usage questions, feedback, bug details, and development ideas by [email](mailto:hilab-dev@elist.tufts.edu).


## Acknowledgements

Special thanks to members of the [Human Interaction Lab](https://sites.tufts.edu/hilab/) at Tufts University and to the interns who have worked on this project.


## Cite

Users are encouraged to cite GailBot using the following BibTeX:
```
@article{umair2022gailbot,
  title={GailBot: An automatic transcription system for Conversation Analysis},
  author={Umair, Muhammad and Mertens, Julia Beret and Albert, Saul and de Ruiter, Jan P},
  journal={Dialogue \& Discourse},
  volume={13},
  number={1},
  pages={63--95},
  year={2022}
}
```

## Liability Notice

GailBot is a tool to be used to generate specialized transcripts. However, it
is not responsible for output quality. Generated transcripts are meant to
be first drafts that can be manually improved. They are not meant to replace
manual transcription.

GailBot may use external Speech-to-Text systems or third-party services. The
development team is not responsible for any transactions between users and these
services. Additionally, the development team does not guarantee the accuracy or 
correctness of any plugin. Plugins have been developed in good faith and we hope 
that they are accurate. However, users should always verify results.

By using GailBot, users agree to cite GailBot and the Tufts Human Interaction Lab
in any publications or results produced as a direct or indirect result of using GailBot.

            

Raw data

            {
    "_id": null,
    "home_page": null,
    "name": "GailBot",
    "maintainer": null,
    "docs_url": null,
    "requires_python": ">=3.10",
    "maintainer_email": "Human Interaction Lab - Tufts University <hilab-dev@elist.tufts.edu>",
    "keywords": null,
    "author": null,
    "author_email": "Muhammad Umair <muhammad.umair@tufts.edu>",
    "download_url": "https://files.pythonhosted.org/packages/d9/d4/8bc461348b7b1b29ecf7b2facc2f1f802b36592186e94a18e7cf193b6bd5/GailBot-0.0.3a7.tar.gz",
    "platform": null,
    "bugtrack_url": null,
    "license": "MIT License Copyright (c) 2023 jasonycwu Permission is hereby granted, free of charge, to any person obtaining a copy of this software and associated documentation files (the \"Software\"), to deal in the Software without restriction, including without limitation the rights to use, copy, modify, merge, publish, distribute, sublicense, and/or sell copies of the Software, and to permit persons to whom the Software is furnished to do so, subject to the following conditions: The above copyright notice and this permission notice shall be included in all copies or substantial portions of the Software. THE SOFTWARE IS PROVIDED \"AS IS\", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE SOFTWARE.",
    "summary": "GailBot API",
    "version": "0.0.3a7",
    "project_urls": {
        "homepage": "https://sites.tufts.edu/hilab/gailbot-an-automatic-transcription-system-for-conversation-analysis/",
        "source": "https://github.com/mumair01/GailBot",
        "tracker": "https://github.com/mumair01/GailBot/issues"
    },
    "split_keywords": [],
    "urls": [
        {
            "comment_text": "",
            "digests": {
                "blake2b_256": "21c503d414d241a8b97425f070edb954e52071238a0e5789f5b53899d8aed9a1",
                "md5": "3c935d690c0cc976aeb6f436765cf011",
                "sha256": "8667d5c82c1bea3c25095746c2637da370ce1583fb068a46c4442dd9512955d6"
            },
            "downloads": -1,
            "filename": "GailBot-0.0.3a7-py3-none-any.whl",
            "has_sig": false,
            "md5_digest": "3c935d690c0cc976aeb6f436765cf011",
            "packagetype": "bdist_wheel",
            "python_version": "py3",
            "requires_python": ">=3.10",
            "size": 174537,
            "upload_time": "2024-08-23T17:47:08",
            "upload_time_iso_8601": "2024-08-23T17:47:08.679292Z",
            "url": "https://files.pythonhosted.org/packages/21/c5/03d414d241a8b97425f070edb954e52071238a0e5789f5b53899d8aed9a1/GailBot-0.0.3a7-py3-none-any.whl",
            "yanked": false,
            "yanked_reason": null
        },
        {
            "comment_text": "",
            "digests": {
                "blake2b_256": "d9d48bc461348b7b1b29ecf7b2facc2f1f802b36592186e94a18e7cf193b6bd5",
                "md5": "726cda3712dab25c39ddc16efba6f8a2",
                "sha256": "ec63da7676148ace4d5873d9f80a8209087f7dc91466a9ce14b3c2f4ca26b46c"
            },
            "downloads": -1,
            "filename": "GailBot-0.0.3a7.tar.gz",
            "has_sig": false,
            "md5_digest": "726cda3712dab25c39ddc16efba6f8a2",
            "packagetype": "sdist",
            "python_version": "source",
            "requires_python": ">=3.10",
            "size": 114451,
            "upload_time": "2024-08-23T17:47:10",
            "upload_time_iso_8601": "2024-08-23T17:47:10.749045Z",
            "url": "https://files.pythonhosted.org/packages/d9/d4/8bc461348b7b1b29ecf7b2facc2f1f802b36592186e94a18e7cf193b6bd5/GailBot-0.0.3a7.tar.gz",
            "yanked": false,
            "yanked_reason": null
        }
    ],
    "upload_time": "2024-08-23 17:47:10",
    "github": true,
    "gitlab": false,
    "bitbucket": false,
    "codeberg": false,
    "github_user": "mumair01",
    "github_project": "GailBot",
    "github_not_found": true,
    "lcname": "gailbot"
}
        