qv-ollama-sdk

Name: qv-ollama-sdk
Version: 0.9.1
Summary: A simple SDK for interacting with the Ollama API by automatically creating a conversation (chat history)
Homepage: https://github.com/quantyverse/qv-ollama-sdk
Upload time: 2025-07-15 06:35:23
Requires Python: >=3.10
License: MIT
Keywords: ollama, llm, ai, chat, sdk, domain-driven-design
# QV Ollama SDK

A simple SDK for interacting with the Ollama API, with **thinking mode** and **tool calling** support.

## Features

- 🧠 **Thinking Mode** - See AI reasoning process before answers
- 🛠️ **Tool Calling** - Execute Python functions automatically  
- 💬 **Simple Conversation** - Easy chat interface
- ⚡ **Streaming Support** - Real-time responses
- 🔧 **Explicit Parameters** - No unnecessary defaults
- 🛡️ **Model Compatibility** - Auto-fallback for unsupported features

## Installation

```bash
pip install qv-ollama-sdk
```

## Quick Start

```python
from qv_ollama_sdk import OllamaChatClient

# Create a client with a system message
client = OllamaChatClient(
    model_name="qwen3:8b",
    system_message="You are a helpful assistant."
)

# Simple chat - uses Ollama's default parameters
response = client.chat("What is the capital of France?")
print(response.content)

# Continue the conversation
response = client.chat("And what is its population?")
print(response.content)

# Set specific parameters only when you need them
client.temperature = 1.0  # Using property setter
client.max_tokens = 500   # Using property setter
client.set_parameters(num_ctx=2048)  # For multiple parameters

# Get conversation history
history = client.get_history()
```

## 🧠 Thinking Mode

```python
# Enable thinking globally
client.enable_thinking()
response = client.chat("Solve this complex problem...")
print(f"🧠 Thinking: {response.thinking}")
print(f"💬 Answer: {response.content}")

# Disable when you want fast responses
client.disable_thinking()
```

## 🛠️ Tool Calling

```python
def add_numbers(a: str, b: str) -> str:
    """Add two numbers."""
    return str(int(a) + int(b))

def get_weather(city: str) -> str:
    """Get weather for a city."""
    return f"Sunny in {city}, 23°C"

tools = [add_numbers, get_weather]

# AI automatically calls functions when needed
response = client.chat("What's 15+27? And weather in Berlin?", tools=tools)
print(response.content)
```
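Under the hood, SDKs like this typically derive an Ollama-style JSON tool schema from each function's name, signature, and docstring. The helper below is a hypothetical sketch of that mapping (it is not the SDK's actual implementation, and it assumes all parameters are strings, as in the examples above):

```python
import inspect

def function_to_schema(fn):
    """Sketch: build an Ollama-style tool schema from a Python function.

    Hypothetical helper, not part of qv-ollama-sdk. Assumes every
    parameter is a string, matching the tool examples above.
    """
    sig = inspect.signature(fn)
    return {
        "type": "function",
        "function": {
            "name": fn.__name__,
            "description": inspect.getdoc(fn) or "",
            "parameters": {
                "type": "object",
                # One string-typed property per function parameter
                "properties": {name: {"type": "string"} for name in sig.parameters},
                "required": list(sig.parameters),
            },
        },
    }

def get_weather(city: str) -> str:
    """Get weather for a city."""
    return f"Sunny in {city}, 23°C"

schema = function_to_schema(get_weather)
print(schema["function"]["name"])  # get_weather
```

Keeping the docstring short and descriptive matters here: it becomes the tool description the model uses to decide when to call the function.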

## 🎯 Thinking + Tools

```python
client.enable_thinking()
response = client.chat("Calculate 25 + 18", tools=tools)

print(f"🧠 Thinking: {response.thinking}")
print(f"💬 Answer: {response.content}")
print(f"🛠️ Tools used: {len(response.tool_calls)}")
```

## ⚡ Streaming

```python
# Stream with thinking and tools
for chunk in client.stream_chat("Add 12 + 8", tools=tools):
    if chunk.thinking:
        print(chunk.thinking, end="")
    if chunk.tool_calls:
        print(f"🛠️ Using: {chunk.tool_calls[0].function.name}")
    if chunk.content:
        print(chunk.content, end="")
```

## API Reference

### Main Methods
- `chat(message, tools=None, auto_execute=True)` - Get response
- `stream_chat(message, tools=None, auto_execute=True)` - Stream response

### Thinking Control
- `enable_thinking()` - Enable thinking globally
- `disable_thinking()` - Disable thinking globally

### Response Object
- `response.content` - The answer
- `response.thinking` - AI's thought process
- `response.tool_calls` - Tools that were called
- `response.tool_results` - Tool execution results

### Parameters
- `tools=None` - List of Python functions
- `auto_execute=True` - Auto-run tools (default)
- `auto_execute=False` - Raw tool calls only
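With `auto_execute=False`, the SDK returns the raw tool calls and you dispatch them yourself. The sketch below is self-contained: the `tool_call` dict is a stand-in for one entry of `response.tool_calls` (its exact shape is an assumption, not captured SDK output):

```python
def add_numbers(a: str, b: str) -> str:
    """Add two numbers."""
    return str(int(a) + int(b))

# Registry of available tools, keyed by function name
tools = {fn.__name__: fn for fn in [add_numbers]}

# Stand-in for one raw tool call returned by the model
tool_call = {"function": {"name": "add_numbers",
                          "arguments": {"a": "15", "b": "27"}}}

# Look up the requested function and invoke it with the model's arguments
fn = tools[tool_call["function"]["name"]]
result = fn(**tool_call["function"]["arguments"])
print(result)  # 42
```

Manual dispatch is useful when you want to validate arguments, log calls, or sandbox tool execution before running anything.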

## 🛡️ Model Compatibility

The SDK automatically handles different model capabilities:

```python
# Works with any model - features auto-disabled if unsupported
client = OllamaChatClient(model_name="gemma2:2b")  # No tool/thinking support
client.enable_thinking()  # Will be ignored if not supported
tools = [add_numbers]

# This still works! Falls back to normal chat
response = client.chat("What is 15 + 27?", tools=tools)
# → "15 + 27 equals 42" (calculated by model, no tools used)
```

**Supported Models:**
- ✅ **Modern models** (e.g., `qwen3:8b`) - Full features
- ✅ **Tool-only models** (e.g., `llama3:8b`) - Tools but no thinking  
- ✅ **Thinking-only models** - Thinking but no tools
- ✅ **Basic models** (e.g., `gemma2:2b`) - Normal chat only

**Graceful Degradation:**
- Unsupported features are automatically disabled
- No errors or exceptions thrown
- Always provides a response

## Advanced Usage

For more control, you can use the lower-level API:

```python
from qv_ollama_sdk import Conversation, OllamaConversationService, ModelParameters

# Create a conversation
conversation = Conversation(model_name="qwen3:8b")
conversation.add_system_message("You are a helpful assistant.")
conversation.add_user_message("What is the capital of France?")

# Generate a response with specific parameters including thinking
service = OllamaConversationService()
parameters = ModelParameters(temperature=0.7, num_ctx=2048, think=True)
response = service.generate_response(conversation, parameters)

print(f"Thinking: {response.thinking}")
print(f"Answer: {response.content}")
```

            
