| Field | Value |
| --- | --- |
| Name | mllm |
| Version | 0.1.53 |
| Summary | Multimodal Large Language Models |
| Author | Patrick Barker |
| Requires Python | <4.0,>=3.10 |
| License | Apache 2.0 |
| Upload time | 2024-12-15 19:31:07 |
<!-- PROJECT LOGO -->
<br />
<p align="center">
<!-- <a href="https://github.com/agentsea/skillpacks">
<img src="https://project-logo.png" alt="Logo" width="80">
</a> -->
<h1 align="center">MLLM</h1>
<p align="center">
Multimodal Large Language Models
<br />
<a href="https://docs.hub.agentsea.ai/introduction"><strong>Explore the docs »</strong></a>
<br />
<br />
<a href="https://youtu.be/exoOUUwFRB8">View Demo</a>
·
<a href="https://github.com/agentsea/mllm/issues">Report Bug</a>
·
<a href="https://github.com/agentsea/mllm/issues">Request Feature</a>
</p>
<br>
</p>
## Installation
```sh
pip install mllm
```
### Extra dependencies
Some features require extra dependencies.

For example, to use the Gemini models, install the `gemini` extra (quoted so that shells such as zsh do not expand the brackets):

```sh
pip install "mllm[gemini]"
```
## Usage
Create an MLLM router with a list of preferred models:
```python
import os
from mllm import Router
os.environ["OPENAI_API_KEY"] = "..."
os.environ["ANTHROPIC_API_KEY"] = "..."
os.environ["GEMINI_API_KEY"] = "..."
router = Router(
    preference=["gpt-4-turbo", "anthropic/claude-3-opus-20240229", "gemini/gemini-1.5-pro-latest"]
)
```
Create a new role-based chat thread:
```python
from mllm import RoleThread
thread = RoleThread(owner_id="dolores@agentsea.ai")
thread.post(role="user", msg="Describe the image", images=["data:image/jpeg;base64,..."])
```
Chat with the MLLM and store the prompt data in the namespace `foo`:
```python
response = router.chat(thread, namespace="foo")
thread.add_msg(response.msg)
```
Ask for a structured response:
```python
from pydantic import BaseModel
class Animal(BaseModel):
    species: str
    color: str

thread.post(
    role="user",
    msg=f"What animal is in this image? Please output as schema {Animal.model_json_schema()}",
    images=["data:image/jpeg;base64,..."]
)
response = router.chat(thread, namespace="animal", expect=Animal)
animal_parsed = response.parsed
assert isinstance(animal_parsed, Animal)
```
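Since `Animal` is a regular Pydantic model, the parsed result can be used like any other Python object:

```python
print(animal_parsed.species)  # e.g. "cat"
print(animal_parsed.color)    # e.g. "black"
```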
Find a saved thread or a prompt:

```python
from mllm import Prompt, RoleThread

RoleThread.find(id="123")
Prompt.find(id="456")
```
To store a raw OpenAI prompt:

```python
from mllm import Prompt, RoleThread, RoleMessage

thread = RoleThread()

msg = {
    "role": "user",
    "content": [
        {
            "type": "text",
            "text": "What's in this image?",
        },
        {
            "type": "image_url",
            "image_url": {"url": "data:image/jpeg;base64,..."},
        },
    ],
}
role_message = RoleMessage.from_openai(msg)
thread.add_msg(role_message)

# call_openai is your own function that sends OpenAI-format messages
# to the API; a sketch is given below.
response = call_openai(thread.to_openai())
response_msg = RoleMessage.from_openai(response["choices"][0]["message"])

saved_prompt = Prompt(thread, response_msg, namespace="foo")
```
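`call_openai` above is left to the caller. A minimal sketch using the official `openai` client is shown here; the helper name and model choice are assumptions, not part of mllm:

```python
from openai import OpenAI

def call_openai(messages: list[dict]) -> dict:
    # Hypothetical helper: send OpenAI-format messages and return the raw
    # response as a dict, matching the dict access used above.
    client = OpenAI()  # reads OPENAI_API_KEY from the environment
    completion = client.chat.completions.create(
        model="gpt-4-turbo",
        messages=messages,
    )
    return completion.model_dump()
```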
Add images of any variety to the thread. Base64 strings, file paths, PIL images, and URLs are all supported:
```python
from PIL import Image
img1 = Image.open("img1.png")
thread.post(
    role="user",
    msg="What's this image?",
    images=["data:image/jpeg;base64,...", "./img1.png", img1, "https://shorturl.at/rVyAS"]
)
```
### Custom vLLM endpoints
Custom endpoints are supported. They can be added to a `Router` instance via a `RouterConfig`:
```python
from mllm import RouterConfig
custom_model = RouterConfig(
    model="hosted_vllm/allenai/Molmo-7B-D-0924",  # must use the `hosted_vllm` prefix
    api_base="https://hosted-vllm-api.co",  # your API base URL
    api_key_name="MOLMO_API_KEY"  # name of the env var that holds the API key
)
router = Router(custom_model)
```
You can also mix custom and named models:
```python
router = Router([custom_model, "gpt-4-turbo"])
```
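A mixed router is used exactly like any other; reusing the `thread` from earlier:

```python
response = router.chat(thread, namespace="custom")
thread.add_msg(response.msg)
```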
## Integrations
MLLM is integrated with:
- [Taskara](https://github.com/agentsea/taskara), a task management library for AI agents
- [Skillpacks](https://github.com/agentsea/skillpacks), a library to fine-tune AI agents on tasks
- [Surfkit](https://github.com/agentsea/surfkit), a platform for AI agents
- [Threadmem](https://github.com/agentsea/threadmem), a thread management library for AI agents
## Community
Come join us on [Discord](https://discord.gg/hhaq7XYPS6).
## Backends
Thread and prompt storage can be backed by:
- SQLite
- PostgreSQL

SQLite is used by default. To use Postgres, set the following environment variables:
```sh
DB_TYPE=postgres
DB_NAME=mllm
DB_HOST=localhost
DB_USER=postgres
DB_PASS=abc123
```
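The same configuration can be applied from Python before mllm is first used (a sketch; it assumes mllm reads these variables when the storage backend initializes):

```python
import os

# Assumption: mllm picks these up from the environment when the
# storage backend is first initialized.
os.environ["DB_TYPE"] = "postgres"
os.environ["DB_NAME"] = "mllm"
os.environ["DB_HOST"] = "localhost"
os.environ["DB_USER"] = "postgres"
os.environ["DB_PASS"] = "abc123"
```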
By default, thread images are stored in the database. To configure GCS bucket storage instead:

- Create a bucket with fine-grained permissions
- Create a GCP service account JSON with permissions to write to the bucket
```sh
export THREAD_STORAGE_SA_JSON='{
"type": "service_account",
...
}'
export THREAD_STORAGE_BUCKET=my-bucket
```
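Rather than pasting the JSON inline, the service-account file can be read from disk; a sketch, where `service-account.json` is a hypothetical path:

```python
import os

# Load the GCP service-account JSON and expose it to mllm via the
# environment (path and bucket name are placeholders).
with open("service-account.json") as f:
    os.environ["THREAD_STORAGE_SA_JSON"] = f.read()
os.environ["THREAD_STORAGE_BUCKET"] = "my-bucket"
```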