| Field | Value |
| --- | --- |
| Name | qv-ollama-sdk |
| Version | 0.9.1 |
| home_page | None |
| Summary | A simple SDK for interacting with the Ollama API by automatically creating a conversation (chat history) |
| upload_time | 2025-07-15 06:35:23 |
| maintainer | None |
| docs_url | None |
| author | None |
| requires_python | >=3.10 |
| license | MIT |
| keywords | ollama, llm, ai, chat, sdk, domain-driven-design |
| requirements | No requirements were recorded. |
| Travis-CI | No Travis. |
| coveralls test coverage | No coveralls. |
# QV Ollama SDK
A simple SDK for interacting with the Ollama API, with support for **thinking mode** and **tool calling**.
## Features
- 🧠 **Thinking Mode** - See AI reasoning process before answers
- 🛠️ **Tool Calling** - Execute Python functions automatically
- 💬 **Simple Conversation** - Easy chat interface
- ⚡ **Streaming Support** - Real-time responses
- 🔧 **Explicit Parameters** - No unnecessary defaults
- 🛡️ **Model Compatibility** - Auto-fallback for unsupported features
## Installation
```bash
pip install qv-ollama-sdk
```
## Quick Start
```python
from qv_ollama_sdk import OllamaChatClient

# Create a client with a system message
client = OllamaChatClient(
    model_name="qwen3:8b",
    system_message="You are a helpful assistant."
)

# Simple chat - uses Ollama's default parameters
response = client.chat("What is the capital of France?")
print(response.content)

# Continue the conversation
response = client.chat("And what is its population?")
print(response.content)

# Set specific parameters only when you need them
client.temperature = 1.0             # Using property setter
client.max_tokens = 500              # Using property setter
client.set_parameters(num_ctx=2048)  # For multiple parameters

# Get conversation history
history = client.get_history()
```
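Each `chat()` call above is appended to the same conversation, so `get_history()` returns the whole exchange. A minimal, hedged sketch of walking it (the exact message shape is not documented above, so entries are printed as-is):

```python
# The history should now hold the system message plus both Q&A turns.
for message in client.get_history():
    print(message)  # message structure undocumented here; printed opaquely
```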
## 🧠 Thinking Mode
```python
# Enable thinking globally
client.enable_thinking()
response = client.chat("Solve this complex problem...")
print(f"🧠 Thinking: {response.thinking}")
print(f"💬 Answer: {response.content}")

# Disable when you want fast responses
client.disable_thinking()
```
## 🛠️ Tool Calling
```python
def add_numbers(a: str, b: str) -> str:
    """Add two numbers."""
    return str(int(a) + int(b))

def get_weather(city: str) -> str:
    """Get weather for a city."""
    return f"Sunny in {city}, 23°C"

tools = [add_numbers, get_weather]

# AI automatically calls functions when needed
response = client.chat("What's 15+27? And weather in Berlin?", tools=tools)
print(response.content)
```
## 🎯 Thinking + Tools
```python
client.enable_thinking()
response = client.chat("Calculate 25 + 18", tools=tools)

print(f"🧠 Thinking: {response.thinking}")
print(f"💬 Answer: {response.content}")
print(f"🛠️ Tools used: {len(response.tool_calls)}")
```
## ⚡ Streaming
```python
# Stream with thinking and tools
for chunk in client.stream_chat("Add 12 + 8", tools=tools):
    if chunk.thinking:
        print(chunk.thinking, end="")
    if chunk.tool_calls:
        print(f"🛠️ Using: {chunk.tool_calls[0].function.name}")
    if chunk.content:
        print(chunk.content, end="")
```
## API Reference
### Main Methods
- `chat(message, tools=None, auto_execute=True)` - Get response
- `stream_chat(message, tools=None, auto_execute=True)` - Stream response
### Thinking Control
- `enable_thinking()` - Enable thinking globally
- `disable_thinking()` - Disable thinking globally
### Response Object
- `response.content` - The answer
- `response.thinking` - AI's thought process
- `response.tool_calls` - Tools that were called
- `response.tool_results` - Tool execution results
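
As a quick illustration, a hedged sketch that reads each of these fields after a tool-assisted chat (field names as listed above; the shape of individual `tool_results` entries is not documented here, so they are printed as-is):

```python
response = client.chat("Calculate 25 + 18", tools=tools)

print(response.content)                  # the final answer
print(response.thinking)                 # reasoning, when thinking is enabled
for call in response.tool_calls or []:   # guard in case no tools were called
    print("called:", call.function.name)
for result in response.tool_results or []:
    print("result:", result)             # entry shape undocumented; printed as-is
```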
### Parameters
- `tools=None` - List of Python functions
- `auto_execute=True` - Auto-run tools (default)
- `auto_execute=False` - Raw tool calls only
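
For `auto_execute=False`, a minimal sketch of dispatching the raw tool calls yourself. The `function.arguments` mapping is an assumption (only `function.name` appears in the examples above), so adapt it to the actual call shape:

```python
# Hedged sketch: manual tool execution with auto_execute=False.
response = client.chat("What's 15 + 27?", tools=[add_numbers], auto_execute=False)

for call in response.tool_calls or []:
    if call.function.name == "add_numbers":
        # ASSUMPTION: arguments arrive as a dict keyed by parameter name.
        print(add_numbers(**call.function.arguments))
```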
## 🛡️ Model Compatibility
The SDK automatically handles different model capabilities:
```python
# Works with any model - features auto-disabled if unsupported
client = OllamaChatClient(model_name="gemma2:2b")  # No tool/thinking support
client.enable_thinking()  # Will be ignored if not supported
tools = [add_numbers]

# This still works! Falls back to normal chat
response = client.chat("What is 15 + 27?", tools=tools)
# → "15 + 27 equals 42" (calculated by model, no tools used)
```
**Supported Models:**
- ✅ **Modern models** (e.g., `qwen3:8b`) - Full features
- ✅ **Tool-only models** (e.g., `llama3:8b`) - Tools but no thinking
- ✅ **Thinking-only models** - Thinking but no tools
- ✅ **Basic models** (e.g., `gemma2:2b`) - Normal chat only
**Graceful Degradation:**
- Unsupported features are automatically disabled
- No errors or exceptions thrown
- Always provides a response
## Advanced Usage
For more control, you can use the lower-level API:
```python
from qv_ollama_sdk import Conversation, OllamaConversationService, ModelParameters

# Create a conversation
conversation = Conversation(model_name="qwen3:8b")
conversation.add_system_message("You are a helpful assistant.")
conversation.add_user_message("What is the capital of France?")

# Generate a response with specific parameters including thinking
service = OllamaConversationService()
parameters = ModelParameters(temperature=0.7, num_ctx=2048, think=True)
response = service.generate_response(conversation, parameters)

print(f"Thinking: {response.thinking}")
print(f"Answer: {response.content}")
```