ai-stream-interact

Name: ai-stream-interact
Version: 0.0.9
Summary: A model-agnostic, extensible package that allows for AI & LLM interactions on a video stream
Author: Omar Aref
Keywords: python, ai, llm, artificial intelligence, large language models, nlp, natural language processing, video, video stream
Upload time: 2024-02-17 18:05:31
            # AI Stream Interact🧠🎞️ - LLM interaction capabilities on live USB camera video stream.
This package can easily be extended to accommodate different LLMs, but in this first version interactions are implemented only for [Google's Gemini Pro & Vision Pro models](https://ai.google.dev/tutorials).

**Note: This is a basic Alpha version that has been written & tested on Ubuntu Linux only, so it may behave unexpectedly on other operating systems.**

<br>
<br>

## Installation:
- `pip install ai-stream-interact`

Note that the pip install might take a while, as it also installs [coqui-ai](https://github.com/coqui-ai/TTS) for text-to-speech. Although TTS is partially implemented, it is not turned on by default due to some glitchy behavior. (This will be fixed in future releases.)

## Example Usage:

1. You need a Gemini API key (if you don't already have one, you can get one [here](https://ai.google.dev/tutorials/setup)).
2. Have a USB camera connected.
3. Run `aisi --llm gemini` to enter the AI Stream Interact🧠🎞️ main menu. _(Note that you can always return to the main menu from the video stream by pressing "**m**" while the video stream window is focused.)_
4. Enter the API key, or press Enter if you've added it to a .env file.
5. You will be asked to enter your camera index. There is currently no straightforward way to map your camera's name to its exact index, due to how OpenCV enumerates such indices, so if you have multiple cameras connected you'll have to try a few until you find the right one. If you have only one camera connected, you can try passing "**-1**", as in most cases it'll just pick that one.
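Regarding the API key in step 4, the key is read from a `.env` file. The variable name below is an assumption for illustration only; check the package source for the exact name it expects:

```shell
# .env — hypothetical entry; verify the exact variable name the package reads
GEMINI_API_KEY=your-api-key-here
```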
   
Now you're in! You have access to three types of interactions as of today.
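The camera-index trial-and-error from step 5 can also be automated. The helper below is a hypothetical sketch, not part of this package: it tries to open each index with OpenCV and reports which ones work. The `open_camera` parameter defaults to `cv2.VideoCapture` and exists only so the sketch can be exercised without real hardware:

```python
def probe_camera_indices(max_index=5, open_camera=None):
    """Return the camera indices that open successfully.

    open_camera defaults to cv2.VideoCapture; it is injectable purely
    so this sketch can be tested without a physical camera attached.
    """
    if open_camera is None:
        import cv2  # requires opencv-python
        open_camera = cv2.VideoCapture
    working = []
    for index in range(max_index):
        capture = open_camera(index)
        if capture.isOpened():
            working.append(index)
        capture.release()
    return working
```

With a real camera attached, `probe_camera_indices()` would typically return `[0]`; the first index listed is usually the one to enter at the prompt.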

### Detect Default:
This fires up a window with your camera stream; whenever you press "**d**", the model will identify the object the camera is looking at. (Make sure to press "**d**" with the camera window focused, not your terminal.)
![](https://github.com/The0mar/ai_stream_interact/blob/main/gifs/detect.gif)


### Detect with Custom Prompt:
Use this to write a custom prompt before showing the model an object, for custom interactions beyond just identifying objects.
![](https://github.com/The0mar/ai_stream_interact/blob/main/gifs/detect_custom.gif)

### Interactions:
This simply allows for back-and-forth chat with the model.

## Troubleshooting:

### Errors:

- `google.api_core.exceptions.FailedPrecondition: 400 User location is not supported for the API use.`: **This is specific to Gemini, which is not yet generally available in all regions, so make sure your region is supported [here](https://ai.google.dev/available_regions#available_regions).**
