live-illustrate

Name	live-illustrate
Version	0.2.0
Summary	Live-ish illustration for your role-playing campaign
upload_time	2024-01-27 08:30:07
docs_url	None
requires_python	>=3.10
license	MIT License Copyright (c) 2023 Eric Hennenfent Permission is hereby granted, free of charge, to any person obtaining a copy of this software and associated documentation files (the "Software"), to deal in the Software without restriction, including without limitation the rights to use, copy, modify, merge, publish, distribute, sublicense, and/or sell copies of the Software, and to permit persons to whom the Software is furnished to do so, subject to the following conditions: The above copyright notice and this permission notice shall be included in all copies or substantial portions of the Software. THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE SOFTWARE.
keywords	ttrpg dnd genai llm diffusion art illustration

# TTRPG live_illustrate
ASR + LLM + Diffusion = ???

This project:
* Uses [Whisper](https://github.com/openai/whisper) to transcribe live audio of a tabletop RPG session
* Uses [GPT-3.5](https://platform.openai.com/docs/guides/text-generation) to extract a description of the current setting from the transcript
* Uses [DALL-E](https://platform.openai.com/docs/guides/images) to draw the setting
* Uses [Flask](https://flask.palletsprojects.com) & [HTMX](https://htmx.org) to display a new image every few minutes

And like most AI projects, it simultaneously works better and worse than one might expect.
The images generated are usually an amusingly flawed rendition of what's going on, but are almost _too_ good to be just ambient background flavor.

## Demo Reel

Some scenes from our party's first trial session:

![image](https://github.com/ehennenfent/live_illustrate/assets/7294647/3525a789-2f07-4b76-b704-bb163b5d6a9e)
The party enjoys dinner together on the deck of the _Daydream_. No one's quite sure where the other ship came from, but it looks nice.

![image](https://github.com/ehennenfent/live_illustrate/assets/7294647/ea25229d-ace4-409f-a4b9-5f6a86921f27)
The party sails the _Daydream_ through a narrow canal in a swamp, searching for the hidden pirate city of Siren's Cove.
Perhaps they should ask the barrel people for directions.

![image](https://github.com/ehennenfent/live_illustrate/assets/7294647/f1c381f4-22b8-49bf-ba29-e7e550045e5c)
The party eavesdrops on a red-haired gnome and a halfling in a Siren's Cove tavern who are plotting to steal a competitor's shipping manifest.
Pay no attention to the faces of the other patrons.

![image](https://github.com/ehennenfent/live_illustrate/assets/7294647/0af383e8-5276-47ce-9ed1-6385348398c9)
The party seeks further gossip at a luxe brothel called _The Rich Dagger_, guarded by a Goliath bouncer and famed for its perplexing architecture.

## Installation
I recommend installing in a [virtual environment](https://docs.python.org/3/library/venv.html).

```
# From PyPI:
pip install live-illustrate

# Or for hacking:
git clone git@github.com:ehennenfent/live_illustrate.git
cd live_illustrate
pip install -e ".[dev]"
```

Whisper will be _much_ faster if you use a CUDA-enabled PyTorch build. I recommend installing this manually afterwards.
```
pip install --index-url https://download.pytorch.org/whl/cu118 torch torchvision torchaudio # https://pytorch.org/get-started/locally/
```

**You'll need an OpenAI API key, exposed via environment variable or in the `.env` file, like so: `OPENAI_API_KEY=<my_secret_api_key>`.**
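As a rough illustration of the two options, here is a minimal sketch of resolving the key from the environment first and a `.env` file second. The helper name `load_api_key`, the precedence order, and the simple `KEY=VALUE` parsing are all assumptions made for this example; the tool's own loading logic may differ.

```python
from __future__ import annotations

import os
from pathlib import Path


def load_api_key(env_file: str = ".env", name: str = "OPENAI_API_KEY") -> str | None:
    """Return the key from the environment, falling back to a .env file.

    Hypothetical helper for illustration only; the precedence is an assumption.
    """
    value = os.environ.get(name)
    if value:
        return value

    path = Path(env_file)
    if not path.is_file():
        return None

    for line in path.read_text(encoding="utf-8").splitlines():
        line = line.strip()
        # Skip blanks and comments; only simple KEY=VALUE lines are handled.
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, raw = line.partition("=")
        if key.strip() == name:
            # Drop optional surrounding quotes.
            return raw.strip().strip('"').strip("'")
    return None
```

Calling `load_api_key()` returns `None` when the key is set in neither place, which is a convenient point to fail early with a clear message.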

With the default settings, it costs about $1/hour to run. You can lower the cost by reducing the size of the generated images or by
increasing the interval between them.
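To see how the interval drives the cost, here is a small back-of-the-envelope sketch. It only counts image generations; `price_per_image` is a placeholder you would fill in from your provider's current pricing, and the text-generation cost is ignored, so treat the result as a rough lower bound rather than a bill.

```python
def images_per_hour(wait_minutes: float) -> float:
    """Number of images drawn per hour for a given interval in minutes."""
    return 60.0 / wait_minutes


def hourly_image_cost(wait_minutes: float, price_per_image: float) -> float:
    """Image-generation cost per hour; price_per_image is a placeholder value."""
    return images_per_hour(wait_minutes) * price_per_image


print(images_per_hour(7.5))  # 8.0 images/hour at a 7.5 minute interval
print(images_per_hour(15))   # 4.0 images/hour if the interval is doubled
```

Doubling the interval halves the number of images, and therefore halves this part of the cost.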

### Running
Once installed, run the `illustrate` command line tool, which will automatically start recording with your default microphone.
A `data\` directory will be created containing the generated images and transcripts, and a web server will start on `localhost:8080` to display the generated images.

A few words about the most important command line options:
* `--wait_minutes`: This controls how frequently the tool draws an image, which directly translates into how expensive it is
to run. The default of 7.5 minutes seems to work well for our campaign.
* `--max_context`: Each interval, the tool looks back at the transcript and collects up to `max_context` tokens to send to GPT-3.5.
It will get as close to that limit as possible, so some of these tokens may come from _before_ the previous image was generated. GPT can be
a bit slow at summarizing large amounts of text, so be careful about making this too large. The default of 2000 tokens seems
to correspond _very_ roughly to about ten minutes of conversation from one of our sessions, but YMMV.
* `--persistence_of_memory`: When summarizing long conversations, the LLM can seem to get "stuck" on the first setting described.
This argument controls what fraction of the previous context is retained each time an image is generated. The default setting of 0.2
may lead to some discontinuity if your party is in one place for a long time.
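
The interplay between `--max_context` and `--persistence_of_memory` can be sketched roughly as follows. This is one plausible reading of the descriptions above, not the tool's actual code: the function names are hypothetical, tokens are approximated by whitespace-separated words, and the retained fraction is measured in transcript lines rather than tokens.

```python
def collect_context(lines: list[str], max_context: int) -> list[str]:
    """Walk backwards through the transcript, keeping the most recent lines
    whose combined (approximate) token count stays within max_context."""
    selected: list[str] = []
    used = 0
    for line in reversed(lines):
        cost = len(line.split())  # crude whitespace token count (assumption)
        if used + cost > max_context:
            break
        selected.append(line)
        used += cost
    selected.reverse()  # restore chronological order
    return selected


def retain_fraction(context: list[str], persistence_of_memory: float) -> list[str]:
    """Keep only the most recent fraction of the previous context."""
    keep = int(len(context) * persistence_of_memory)
    return context[len(context) - keep:]


transcript = ["a b c", "d e", "f g h i", "j"]
print(collect_context(transcript, 7))  # ['d e', 'f g h i', 'j']
```

With a fraction of 0.2, a ten-line context would carry its last two lines into the next interval, which is why a long stay in one place can produce some discontinuity between images.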

Optionally, you can upload generated images to a Discord server automatically by configuring a [Discord webhook](https://support.discord.com/hc/en-us/articles/228383668) and supplying the URL in the `DISCORD_WEBHOOK` environment variable.
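
For readers curious what a webhook post looks like, here is a minimal standard-library sketch that builds (but does not send) a request. It assumes the image is reachable at a URL and is attached as an embed; the helper name `build_webhook_request`, the caption, and the `User-Agent` string are made up for this example, and the tool itself may upload the image file in a different way.

```python
import json
import urllib.request


def build_webhook_request(
    webhook_url: str, image_url: str, caption: str = ""
) -> urllib.request.Request:
    """Build a POST request that asks a Discord webhook to show an image embed.

    Hypothetical helper: posting by URL is an assumption for illustration.
    """
    payload = {
        "content": caption,
        "embeds": [{"image": {"url": image_url}}],
    }
    return urllib.request.Request(
        webhook_url,
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Content-Type": "application/json",
            "User-Agent": "live-illustrate-example",
        },
        method="POST",
    )
```

To actually send it, read the URL from `os.environ["DISCORD_WEBHOOK"]`, build the request, and pass it to `urllib.request.urlopen`.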

Raw data

            {
    "_id": null,
    "home_page": "",
    "name": "live-illustrate",
    "maintainer": "",
    "docs_url": null,
    "requires_python": ">=3.10",
    "maintainer_email": "",
    "keywords": "ttrpg,dnd,GenAI,LLM,diffusion,art,illustration",
    "author": "",
    "author_email": "Eric Hennenfent <eric@hennenfent.com>",
    "download_url": "https://files.pythonhosted.org/packages/e3/08/77d827f4ae32e18123f3e4d30e0491088592adb5a54f96a2d5595c15d245/live_illustrate-0.2.0.tar.gz",
    "platform": null,
    "bugtrack_url": null,
    "license": "MIT License  Copyright (c) 2023 Eric Hennenfent  Permission is hereby granted, free of charge, to any person obtaining a copy of this software and associated documentation files (the \"Software\"), to deal in the Software without restriction, including without limitation the rights to use, copy, modify, merge, publish, distribute, sublicense, and/or sell copies of the Software, and to permit persons to whom the Software is furnished to do so, subject to the following conditions:  The above copyright notice and this permission notice shall be included in all copies or substantial portions of the Software.  THE SOFTWARE IS PROVIDED \"AS IS\", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE SOFTWARE. ",
    "summary": "Live-ish illustration for your role-playing campaign",
    "version": "0.2.0",
    "project_urls": null,
    "split_keywords": [
        "ttrpg",
        "dnd",
        "genai",
        "llm",
        "diffusion",
        "art",
        "illustration"
    ],
    "urls": [
        {
            "comment_text": "",
            "digests": {
                "blake2b_256": "c1f0296dccfcb1730a7d98e5c5f2122c2101cb3eaac0fbc83bc7aa4a5a67a6b4",
                "md5": "925f39b1a33b0111c746a3de982aef14",
                "sha256": "117ca419ddc12db1881e78780a5e00f8918300cc4b1e99c78034a4f786dec437"
            },
            "downloads": -1,
            "filename": "live_illustrate-0.2.0-py3-none-any.whl",
            "has_sig": false,
            "md5_digest": "925f39b1a33b0111c746a3de982aef14",
            "packagetype": "bdist_wheel",
            "python_version": "py3",
            "requires_python": ">=3.10",
            "size": 5255,
            "upload_time": "2024-01-27T08:30:06",
            "upload_time_iso_8601": "2024-01-27T08:30:06.273663Z",
            "url": "https://files.pythonhosted.org/packages/c1/f0/296dccfcb1730a7d98e5c5f2122c2101cb3eaac0fbc83bc7aa4a5a67a6b4/live_illustrate-0.2.0-py3-none-any.whl",
            "yanked": false,
            "yanked_reason": null
        },
        {
            "comment_text": "",
            "digests": {
                "blake2b_256": "e30877d827f4ae32e18123f3e4d30e0491088592adb5a54f96a2d5595c15d245",
                "md5": "d8ec306e1ca777e4c62a7ec9bbe63b40",
                "sha256": "542989b6c37970b58ea417ceaf4dc037b49cecd0152c0f1586d93a887dd8197d"
            },
            "downloads": -1,
            "filename": "live_illustrate-0.2.0.tar.gz",
            "has_sig": false,
            "md5_digest": "d8ec306e1ca777e4c62a7ec9bbe63b40",
            "packagetype": "sdist",
            "python_version": "source",
            "requires_python": ">=3.10",
            "size": 17238,
            "upload_time": "2024-01-27T08:30:07",
            "upload_time_iso_8601": "2024-01-27T08:30:07.715047Z",
            "url": "https://files.pythonhosted.org/packages/e3/08/77d827f4ae32e18123f3e4d30e0491088592adb5a54f96a2d5595c15d245/live_illustrate-0.2.0.tar.gz",
            "yanked": false,
            "yanked_reason": null
        }
    ],
    "upload_time": "2024-01-27 08:30:07",
    "github": false,
    "gitlab": false,
    "bitbucket": false,
    "codeberg": false,
    "lcname": "live-illustrate"
}