litemind

- Name: litemind
- Version: 2025.7.28.1
- Summary: A wrapper API around LLM APIs and agentic AI framework
- Upload time: 2025-07-29 02:03:22
- Author: Loic A. Royer
- Requires Python: >=3.9
- Keywords: agents, ai, llm, wrapper

# litemind

[![PyPI version](https://badge.fury.io/py/litemind.svg)](https://pypi.org/project/litemind/)
[![License: BSD-3-Clause](https://img.shields.io/badge/License-BSD%203--Clause-blue.svg)](./LICENSE)
[![Downloads](https://static.pepy.tech/badge/litemind)](https://pepy.tech/project/litemind)
[![GitHub stars](https://img.shields.io/github/stars/royerlab/litemind.svg?style=social&label=Star)](https://github.com/royerlab/litemind)

---

## Summary

**Litemind** is a powerful, extensible, and user-friendly Python library for building next-generation multimodal, agentic AI applications. It provides a unified, high-level API for interacting with a wide range of Large Language Model (LLM) providers (OpenAI, Anthropic, Google Gemini, Ollama, and more), and enables the creation of advanced agents that can reason, use tools, access external knowledge, and process multimodal data (text, images, audio, video, tables, documents, and more).

Litemind's philosophy is to make advanced agentic and multimodal AI accessible to all Python developers, with a focus on clarity, composability, and extensibility. Whether you want to build a simple chatbot, a research assistant, or a complex workflow that leverages retrieval-augmented generation (RAG), tool use, and multimodal reasoning, Litemind provides the building blocks you need.

---

## Features

- **Unified API**: Seamlessly interact with multiple LLM providers (OpenAI, Anthropic, Gemini, Ollama, etc.) through a single, consistent interface.
- **Agentic Framework**: Build agents that can reason, use tools, maintain conversations, and augment themselves with external knowledge.
- **Multimodal Support**: Native support for text, images, audio, video, tables, documents, and more, both as inputs and outputs.
- **Tool Integration**: Easily define and add custom Python functions as tools, or use built-in tools (web search, MCP protocol, etc.).
- **Augmentations (RAG)**: Integrate vector databases and retrieval-augmented generation to ground agent responses in external knowledge.
- **Automatic Model Feature Discovery**: Automatically select models based on required features (e.g., image input, tool use, reasoning).
- **Extensible Media Classes**: Rich, type-safe representations for all supported media types.
- **Comprehensive Conversion**: Automatic conversion between media types for maximum model compatibility.
- **Command-Line Tools**: CLI utilities for code generation, repo export, and model feature scanning.
- **Callback and Logging System**: Fine-grained logging and callback hooks for monitoring and debugging (powered by [Arbol](http://github.com/royerlab/arbol)).
- **Robust Testing**: Extensive test suite covering all major features and edge cases.
- **BSD-3-Clause License**: Open source and ready for both academic and commercial use.

---

## Installation

Litemind requires Python 3.9 or newer.

Install the latest release from PyPI:

```bash
pip install litemind
```

For development (with all optional dependencies):

```bash
git clone https://github.com/royerlab/litemind.git
cd litemind
pip install -e ".[dev,rag,whisper,documents,tables,videos,audio,remote,tasks]"
```

---

## Basic Usage

Below are several illustrative examples of the agent-level API. Each example is self-contained and demonstrates a different aspect of Litemind's agentic capabilities.

### 1. Basic Agent Usage

```python
from litemind import OpenAIApi
from litemind.agent.agent import Agent

# Initialize the OpenAI API
api = OpenAIApi()

# Create an agent
agent = Agent(api=api, model_name="o3-high")

# Add a system message to guide the agent's behavior
agent.append_system_message("You are a helpful assistant.")

# Ask a question
response = agent("What is the capital of France?")

print("Simple Agent Response:", response)
# Output: Simple Agent Response: [*assistant*:
# The capital of France is Paris.
# ]
```

---

### 2. Agent with Tools

```python
from litemind import OpenAIApi
from litemind.agent.agent import Agent
from litemind.agent.tools.toolset import ToolSet
from datetime import datetime

# Define a function to get the current date
def get_current_date() -> str:
    """Fetch the current date"""
    return datetime.now().strftime("%Y-%m-%d")

api = OpenAIApi()
toolset = ToolSet()
toolset.add_function_tool(get_current_date)

agent = Agent(api=api, toolset=toolset)
agent.append_system_message("You are a helpful assistant.")

response = agent("What is the current date?")
print("Agent with Tool Response:", response)
# Output: Agent with Tool Response: [*assistant*:
# The current date is 2025-05-02.
# ]
```

---

### 3. Agent with Tools and Augmentation (RAG)

```python
from litemind import OpenAIApi
from litemind.agent.agent import Agent
from litemind.agent.tools.toolset import ToolSet
from litemind.agent.augmentations.information.information import Information
from litemind.agent.augmentations.vector_db.in_memory_vector_db import InMemoryVectorDatabase
from litemind.media.types.media_text import Text

def get_current_date() -> str:
    from datetime import datetime
    return datetime.now().strftime("%Y-%m-%d")

api = OpenAIApi()
toolset = ToolSet()
toolset.add_function_tool(get_current_date, "Fetch the current date")

agent = Agent(api=api, toolset=toolset)

# Create vector database augmentation
vector_augmentation = InMemoryVectorDatabase(name="test_augmentation")

# Add sample Information objects to the augmentation
informations = [
    Information(Text("Igor Bolupskisty was a German-born theoretical physicist who developed the theory of indelible unitarity."),
                metadata={"topic": "physics", "person": "Bolupskisty"}),
    Information(Text("The theory of indelible unitarity revolutionized our understanding of space, time and photons."),
                metadata={"topic": "physics", "concept": "unitarity"}),
    Information(Text("Quantum unitarity is a fundamental theory in physics that describes nature at the nano-atomic scale as it pertains to Pink Hamsters."),
                metadata={"topic": "physics", "concept": "quantum unitarity"}),
]

vector_augmentation.add_informations(informations)
agent.add_augmentation(vector_augmentation)
agent.append_system_message("You are a helpful assistant.")

response = agent("Tell me about Igor Bolupskisty's theory of indelible unitarity. Also, what is the current date?")
print("Agent with Tool and Augmentation Response:", response)
# Output: [*assistant*:
# Igor Bolupskisty was a German-born theoretical physicist known for developing the theory of indelible unitarity...
# Today's date is May 2, 2025.
# ]
```

---

### 4. More Complex Example: Multimodal Inputs, Tools, and Augmentations

```python
from litemind import OpenAIApi
from litemind.agent.agent import Agent
from litemind.agent.tools.toolset import ToolSet
from litemind.agent.augmentations.vector_db.in_memory_vector_db import InMemoryVectorDatabase
from litemind.agent.augmentations.information.information import Information
from litemind.media.types.media_text import Text
from litemind.media.types.media_image import Image

def get_current_date() -> str:
    from datetime import datetime
    return datetime.now().strftime("%Y-%m-%d")

api = OpenAIApi()
toolset = ToolSet()
toolset.add_function_tool(get_current_date, "Fetch the current date")

agent = Agent(api=api, toolset=toolset)

# Add a multimodal Information item to the vector database
vector_augmentation = InMemoryVectorDatabase(name="multimodal_augmentation")
vector_augmentation.add_informations([
    Information(Image("https://upload.wikimedia.org/wikipedia/commons/thumb/3/3e/Einstein_1921_by_F_Schmutzer_-_restoration.jpg/456px-Einstein_1921_by_F_Schmutzer_-_restoration.jpg"),
                metadata={"topic": "physics", "person": "Einstein"}),
])
agent.add_augmentation(vector_augmentation)
agent.append_system_message("You are a helpful assistant.")

response = agent("Describe the person in the image and tell me today's date.")
print("Multimodal Agent Response:", response)
# Output: [*assistant*:
# The image shows Albert Einstein, a renowned physicist...
# Today's date is May 2, 2025.
# ]
```

---

**Note:** In all examples, `model_features` can be provided as a list of strings, a single string, or enums (see `ModelFeatures.normalise`). For example, `model_features=["textgeneration", "tools"]` or `model_features=ModelFeatures.TextGeneration`.

---

## Concepts

### Main Classes

- **Agent**: The core class representing an agentic AI entity. It manages conversation state, toolsets, augmentations, and interacts with the API.
- **ToolSet**: A collection of tools (Python functions or agent tools) that the agent can use.
- **AugmentationSet**: A collection of augmentations (e.g., vector databases) for retrieval-augmented generation (RAG).
- **Information**: Represents a knowledge chunk (text, image, etc.) with metadata, used in augmentations.
- **Media Classes**: Typed representations for all supported media (Text, Image, Audio, Video, Table, Document, etc.).
- **API Classes**: Abstractions for LLM providers (OpenAIApi, AnthropicApi, GeminiApi, OllamaApi, CombinedApi, etc.).

### API Layers

- **Agentic API**: The high-level, agent-oriented API (as shown above). This is the recommended way to build complex, interactive, multimodal, and tool-using agents.
- **Wrapper API**: Lower-level, direct access to LLM provider APIs (e.g., `api.generate_text(...)`, `api.generate_image(...)`). Use this for fine-grained control or when you don't need agentic features.

**Difference:** The agentic API manages conversation, tool use, augmentation, and multimodal context automatically. The wrapper API is stateless and does not manage agent state or tool use.
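
For instance, a minimal wrapper-level call looks like the sketch below; the dict-based message format mirrors the structured-output example further down in this README.

```python
from litemind import OpenAIApi

api = OpenAIApi()

# Stateless, single-shot call through the wrapper API: no Agent, no toolset,
# and no conversation history is kept between calls.
messages = [
    {"role": "system", "content": "You are a concise assistant."},
    {"role": "user", "content": "In one sentence, what is a vector database?"},
]
response = api.generate_text(messages=messages)
print(response)
```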

---

## Multi-modality

Litemind supports **multimodal inputs and outputs** natively. This means you can send images, audio, video, tables, and documents to models that support them, and receive rich outputs.

### Model Features

- **Model features** describe what a model can do (e.g., text generation, image input, tool use, etc.).
- You can request features as enums, strings, or lists (e.g., `ModelFeatures.TextGeneration`, `"textgeneration"`, or `["textgeneration", "tools"]`).
- Litemind will automatically select the best model that supports the requested features.

### Requesting Features

```python
from litemind.apis.model_features import ModelFeatures

# As enums
agent = Agent(api=api, model_features=[ModelFeatures.TextGeneration, ModelFeatures.Image])

# As strings
agent = Agent(api=api, model_features=["textgeneration", "image"])
```

### Media Classes

- All multimodal data is represented by dedicated media classes (e.g., `Text`, `Image`, `Audio`, `Video`, `Table`, `Document`, etc.).
- These classes provide type safety, conversion utilities, and serialization.
- When building messages, use the appropriate media class (e.g., `Message.append_image(...)`, `Message.append_audio(...)`).
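
As a small illustration, here is a sketch of attaching media to a message. The `Message` import path, its constructor, and `append_text` are assumptions; `append_image(...)` is the method referenced in the bullet above.

```python
from litemind.agent.messages.message import Message  # import path assumed

# Build a multimodal user message (constructor and append_text are assumptions;
# append_image is the method mentioned above).
message = Message(role="user")
message.append_text("Please describe this photograph.")
message.append_image(
    "https://upload.wikimedia.org/wikipedia/commons/thumb/3/3e/"
    "Einstein_1921_by_F_Schmutzer_-_restoration.jpg/"
    "456px-Einstein_1921_by_F_Schmutzer_-_restoration.jpg"
)
```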

---

## More Examples

Below are some advanced usage examples. For even more, see the [EXAMPLES.md](EXAMPLES.md) file.

### 1. Using the CombinedAPI

```python
from litemind.apis.combined_api import CombinedApi
from litemind.agent.agent import Agent

api = CombinedApi()
agent = Agent(api=api)
agent.append_system_message("You are a helpful assistant.")
response = agent("What is the tallest mountain in the world?")
print(response)
```

---

### 2. Agent That Can Execute Python Code

```python
from litemind import OpenAIApi
from litemind.agent.agent import Agent
from litemind.agent.tools.toolset import ToolSet

def execute_python_code(code: str) -> str:
    """Execute Python code in a fresh namespace and return the resulting variables."""
    try:
        exec_globals = {}
        exec(code, exec_globals)
        # Drop the __builtins__ entry that exec() injects; keep user-defined names.
        return str({k: v for k, v in exec_globals.items() if k != "__builtins__"})
    except Exception as e:
        return str(e)

api = OpenAIApi()
toolset = ToolSet()
toolset.add_function_tool(execute_python_code, "Execute Python code and return the result.")

agent = Agent(api=api, toolset=toolset)
agent.append_system_message("You are a Python code executor.")
response = agent("What is the result of 2 + 2 in Python?")
print(response)
```

---

### 3. Using the Wrapper API for Structured Outputs

```python
from litemind import OpenAIApi
from pydantic import BaseModel

class WeatherResponse(BaseModel):
    temperature: float
    condition: str
    humidity: float

api = OpenAIApi()
messages = [
    {"role": "system", "content": "You are a weather bot."},
    {"role": "user", "content": "What is the weather like in Paris?"}
]
response = api.generate_text(messages=messages, response_format=WeatherResponse)
print(response)
# Output: [WeatherResponse(temperature=22.5, condition='sunny', humidity=45.0)]
```

---

### 4. Agent with a Tool That Generates an Image Using the Wrapper API

```python
from litemind import OpenAIApi
from litemind.agent.agent import Agent
from litemind.agent.tools.toolset import ToolSet

def generate_cat_image() -> str:
    api = OpenAIApi()
    image = api.generate_image(positive_prompt="A cute fluffy cat", image_width=512, image_height=512)
    # Save or display the image as needed
    return "Cat image generated!"

api = OpenAIApi()
toolset = ToolSet()
toolset.add_function_tool(generate_cat_image, "Generate a cat image.")

agent = Agent(api=api, toolset=toolset)
agent.append_system_message("You are an assistant that can generate images.")
response = agent("Please generate a cat image for me.")
print(response)
```

---

**More examples can be found in the [EXAMPLES.md](EXAMPLES.md) file.**

---

## Command Line Tools

Litemind provides several command-line tools for code generation, repository export, and model feature scanning.

### Usage

```bash
litemind codegen --api openai -m gpt-4o --file README
litemind export --folder-path . --output-file exported.txt
litemind scan --api openai gemini --models gpt-4o models/gemini-1.5-pro
```

### .codegen.yml Format

To use the `codegen` tool, create a `.codegen` folder in the root of your repository and add one or more `*.codegen.yml` files. Example:

```yaml
file: README.md
prompt: |
  Please generate a detailed, complete and informative README.md file for this repository.
folder:
  path: .
  extensions: [".py", ".md", ".toml", "LICENSE"]
  excluded: ["dist", "build", "litemind.egg-info"]
```

- The `folder` section specifies which files to include/exclude.
- The `prompt` is the instruction for the agent.
- You can have multiple `.codegen.yml` files for different outputs.

---

## Caveats and Limitations

- **Error Handling**: Some error handling, especially in the wrapper API, can be improved.
- **Token Management**: There is no built-in mechanism for managing token usage or quotas.
- **API Key Management**: API keys are managed via environment variables; consider more secure solutions for production.
- **Performance**: No explicit caching or async support yet; large RAG databases may impact performance.
- **Failing Tests**: See the "Code Health" section for test status.
- **Streaming**: Not all models/providers support streaming responses.
- **Model Coverage**: Not all models support all features (e.g., not all support images, tools, or reasoning).
- **Security**: Always keep your API keys secure and do not commit them to version control.

---

## Code Health

- **Unit Tests**: The test suite covers a wide range of functionalities, including text generation, image generation, audio, RAG, tools, multimodal, and more.
- **Test Results**: No test failures reported in the latest run (`test_reports/` is empty).
- **Total Tests**: Hundreds of tests across all modules.
- **Failures**: None reported.
- **Assessment**: The codebase is robust and well-tested. Any failures would be non-critical unless otherwise noted in `test_report.md` or `ANALYSIS.md`.

---

## API Keys

Litemind requires API keys for the various LLM providers:

- **OpenAI**: `OPENAI_API_KEY`
- **Anthropic (Claude)**: `ANTHROPIC_API_KEY`
- **Google Gemini**: `GOOGLE_GEMINI_API_KEY`
- **Ollama**: (local server, no key required by default)

### Setting API Keys

#### Linux / macOS

Add the following lines to your `~/.bashrc`, `~/.zshrc`, or `~/.profile`:

```bash
export OPENAI_API_KEY="sk-..."
export ANTHROPIC_API_KEY="sk-ant-..."
export GOOGLE_GEMINI_API_KEY="..."
```

Then reload your shell:

```bash
source ~/.bashrc
```

#### Windows

Set environment variables in the Command Prompt or PowerShell:

```cmd
setx OPENAI_API_KEY "sk-..."
setx ANTHROPIC_API_KEY "sk-ant-..."
setx GOOGLE_GEMINI_API_KEY "..."
```

Or add them to your system environment variables via the Control Panel.

#### In Python (not recommended for production)

```python
import os
os.environ["OPENAI_API_KEY"] = "sk-..."
```

---

## Roadmap

- [x] Set up a README with a quick start guide.
- [x] Set up continuous integration and PyPI deployment.
- [x] Improve document conversion (page-per-page text and video interleaving + whole-page images).
- [x] Clean up structured output with tool usage.
- [x] Implement streaming callbacks.
- [x] Improve folder/archive conversion: add ASCII folder tree.
- [x] Reorganise media files used for testing into a single media folder.
- [x] Improve logging with Arbol, with an option to turn it off.
- [x] Use specialised libraries for document type identification.
- [x] Clean up the document conversion code.
- [x] Add support for adding nD images to messages.
- [x] Automatic feature support discovery for models (which models support images as input, reasoning, etc.).
- [x] Add support for OpenAI's new 'Response' API.
- [x] Add support for built-in tools: web search and MCP protocol.
- [ ] Add web UI functionality for agents using Reflex.
- [ ] Video conversion temporal sampling should adapt to the video length; short videos should have more frames.
- [ ] RAG ingestion code for arbitrary digital objects: folders, PDFs, images, URLs, etc.
- [ ] Add more support for the MCP protocol beyond built-in API support.
- [ ] Use the faster pybase64 for base64 encoding/decoding.
- [ ] Deal with message sizes in tokens sent to models.
- [ ] Improve vendor API robustness, e.g., retrying calls on server errors.
- [ ] Improve and unify exception handling.
- [ ] Implement a 'brainstorming' mode for text generation, possibly with API fusion.

---

## Contributing

Contributions are welcome! Please see [CONTRIBUTING.md](CONTRIBUTING.md) for detailed guidelines.

- Fork the repository and create a feature branch.
- Write clear, tested, and well-documented code.
- Run `pytest` and ensure all tests pass.
- Use [Black](https://github.com/psf/black), [isort](https://pycqa.github.io/isort/), [flake8](https://flake8.pycqa.org/), and [mypy](http://mypy-lang.org/) for code quality.
- Submit a pull request and describe your changes.

---

## License

BSD-3-Clause. See [LICENSE](LICENSE) for details.

---

## Logging

Litemind uses [Arbol](http://github.com/royerlab/arbol) for logging. You can deactivate logging by setting `Arbol.passthrough = True` in your code.
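
For example (assuming the `Arbol` class is importable from the top-level `arbol` package):

```python
from arbol import Arbol

# Deactivate Litemind's Arbol-based logging, as described above.
Arbol.passthrough = True
```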

---

> _This README was generated with the help of AI._

            
