# Cache for LLM, Multimodal LLM and embeddings
## Overview
LaVague Context Cache is a caching layer designed for Language Models, Multimodal Language Models and embeddings.
It ensures deterministic results, avoids unnecessary token consumption, and speeds up local development by caching responses and embeddings.
This tool is ideal for:
- Developers seeking to streamline their workflow when working with models.
- Builders aiming to reduce model costs and enhance performance when agent objectives are stable.
Key features:
- Guarantees consistent results in both automated and manual testing by caching previous responses
- Reduces API token consumption by reusing cached results, avoiding redundant API calls
- Speeds up local development by eliminating the need to repeatedly query the same results
- Cached scenarios can be replayed offline
## Installation
### From PyPI
```bash
pip install lavague-contexts-cache
```
### From source
```bash
pip install -e lavague-integrations/contexts/lavague-contexts-cache
```
## Usage
### Using wrappers
```python
from lavague.contexts.cache import LLMCache
from llama_index.llms.openai import OpenAI

llm = LLMCache(yml_prompts_file="llm.yml", fallback=OpenAI(model="gpt-4o"))
```
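The wrapper can then be used like the underlying LlamaIndex LLM. A minimal sketch, assuming `LLMCache` forwards the standard `complete` call to its fallback and records the response in `llm.yml`:
```python
# First call misses the cache, hits the OpenAI API and stores the response
# in llm.yml; repeating the same prompt returns the cached answer, making
# the result deterministic and consuming no extra tokens.
response = llm.complete("Summarize the LaVague project in one sentence.")
print(response.text)
```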
```python
from lavague.contexts.cache import MultiModalLLMCache
from llama_index.multi_modal_llms.openai import OpenAIMultiModal

mm_llm = MultiModalLLMCache(yml_prompts_file="mmllm.yml", fallback=OpenAIMultiModal(model="gpt-4o"))
```
```python
from lavague.contexts.cache import EmbeddingCache
from llama_index.embeddings.openai import OpenAIEmbedding

embedding = EmbeddingCache(yml_prompts_file="embeddings.yml", fallback=OpenAIEmbedding(model="text-embedding-3-large"))
```
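The embedding wrapper works the same way. A sketch, assuming `EmbeddingCache` proxies the usual LlamaIndex embedding calls and caches the resulting vectors in `embeddings.yml`:
```python
# The first call computes the embedding through the OpenAI fallback and
# caches it; subsequent calls for the same text are served from the cache.
vector = embedding.get_text_embedding("LaVague caches model outputs.")
print(len(vector))
```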
### Using LaVague context
```python
from lavague.core import WorldModel, ActionEngine
from lavague.core.agents import WebAgent
from lavague.drivers.selenium import SeleniumDriver
from lavague.core.context import get_default_context
from lavague.contexts.cache import ContextCache

driver = SeleniumDriver()
context = get_default_context()
cached_context = ContextCache.from_context(context)
world_model = WorldModel.from_context(cached_context)
action_engine = ActionEngine.from_context(cached_context, driver)
agent = WebAgent(world_model, action_engine)
```
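The agent is then driven as usual; the sketch below assumes the standard LaVague `get`/`run` flow, with every LLM, multimodal LLM and embedding call made during the run being cached:
```python
# Responses produced during this run are cached, so replaying the same
# objective later returns identical results and can even work offline.
agent.get("https://huggingface.co/docs")
agent.run("Go on the quicktour of PEFT")
```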
## Performance
Cached values are stored in YAML files and loaded into memory by default.
While this approach works well for a few runs, memory usage can grow excessively once the cache spans many websites.
For production use cases, we recommend an optimized storage backend (a database, a dedicated cache, etc.).
To do so, pass a `store` that implements the abstract `PromptsStore` methods.
```python
from lavague.contexts.cache import LLMCache
from lavague.contexts.cache.prompts_store import PromptsStore
from llama_index.llms.openai import OpenAI

class MyDataBaseStore(PromptsStore[str]):
    def _get_for_prompt(self, prompt: str) -> str:
        # look up the cached output in the database, keyed by the prompt
        pass

    def _add_prompt(self, prompt: str, output: str):
        # persist the prompt / output pair in the database
        pass

my_database_store = MyDataBaseStore()

llm = LLMCache(yml_prompts_file="llm.yml", fallback=OpenAI(model="gpt-4o"), store=my_database_store)
```
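As an illustration, here is a minimal sketch of a persistent store backed by SQLite. The class name, table layout, and the convention that returning `None` signals a cache miss are assumptions made for this example, not part of the library:
```python
import sqlite3
from typing import Optional

from lavague.contexts.cache.prompts_store import PromptsStore


class SqlitePromptsStore(PromptsStore[str]):
    """Hypothetical store keeping prompt/output pairs in a SQLite table."""

    def __init__(self, db_path: str = "prompts_cache.db"):
        super().__init__()
        self.conn = sqlite3.connect(db_path)
        self.conn.execute(
            "CREATE TABLE IF NOT EXISTS prompts (prompt TEXT PRIMARY KEY, output TEXT)"
        )

    def _get_for_prompt(self, prompt: str) -> Optional[str]:
        # return the cached output for this prompt, or None on a cache miss
        row = self.conn.execute(
            "SELECT output FROM prompts WHERE prompt = ?", (prompt,)
        ).fetchone()
        return row[0] if row else None

    def _add_prompt(self, prompt: str, output: str):
        # persist the prompt/output pair
        self.conn.execute(
            "INSERT OR REPLACE INTO prompts (prompt, output) VALUES (?, ?)",
            (prompt, output),
        )
        self.conn.commit()
```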