| Field | Value |
|-----------------|-------|
| Name | hanzo-memory |
| Version | 1.0.0 |
| Summary | AI memory service with FastAPI and MCP support |
| Author | Hanzo Industries Inc. (dev@hanzo.ai) |
| Homepage | https://github.com/hanzoai/memory |
| Requires Python | >=3.13 |
| License | BSD |
| Keywords | ai, memory, mcp, fastapi, embeddings |
| Upload time | 2025-07-26 00:12:18 |
# Hanzo Memory Service
## Add memory to any AI application!
A high-performance FastAPI service that provides memory and knowledge management for AI applications. Built on the LanceDB vector database (which runs on all platforms, including browsers via WASM), local embeddings, and LiteLLM for flexible LLM integration.
## Features
- **🧠 Intelligent Memory Management**: Store and retrieve contextual memories with semantic search
- **📚 Knowledge Base System**: Organize facts in hierarchical knowledge bases with parent-child relationships
- **💬 Chat History**: Store and search conversation history with de-duplication
- **🔍 Unified Search API**: Fast semantic search using FastEmbed embeddings
- **🤖 Flexible LLM Support**: Use any LLM via LiteLLM (OpenAI, Anthropic, Ollama, etc.)
- **🔐 Multi-tenancy**: Secure user- and project-based data isolation
- **🚀 High Performance**: Local embeddings and efficient vector storage
- **🗄️ Cross-Platform Database**: LanceDB runs everywhere: Linux, macOS, Windows, and even browsers
- **🔌 MCP Support**: Model Context Protocol server for AI tool integration
- **📦 Easy Deployment**: Docker support and uvx compatibility
## Architecture
```
┌─────────────────┐     ┌─────────────────┐     ┌─────────────────┐
│     FastAPI     │────▶│    Vector DB    │────▶│   Embeddings    │
│     Server      │     │    (LanceDB/    │     │   (FastEmbed/   │
└─────────────────┘     │   InfinityDB)   │     │    LanceDB)     │
         │              └─────────────────┘     └─────────────────┘
         ▼                        │
┌─────────────────┐     ┌─────────────────┐
│     LiteLLM     │     │  Local Models   │
│  (LLM Bridge)   │     │   (BGE, etc.)   │
└─────────────────┘     └─────────────────┘
```
## Quick Start
### Install with uvx
```bash
# Install and run directly with uvx
uvx hanzo-memory
# Or install globally
uvx install hanzo-memory
```
### Install from source
```bash
# Clone the repository
git clone https://github.com/hanzoai/memory
cd memory
# Install with uv
make setup
# Run the server
make dev
```
### Docker
```bash
# Using docker-compose
docker-compose up
# Or build and run manually
docker build -t hanzo-memory .
docker run -p 4000:4000 -v $(pwd)/data:/app/data hanzo-memory
```
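Once the container is up, a quick smoke test can confirm the server is reachable. This sketch assumes the service listens on port 4000 as in the `docker run` example above, and that FastAPI's default interactive docs at `/docs` are enabled; both are assumptions, not guarantees:

```python
# Hedged smoke test: port 4000 comes from the docker run example above;
# the /docs route is FastAPI's default and may be disabled or moved.
import httpx

resp = httpx.get("http://localhost:4000/docs", timeout=5.0)
print(resp.status_code)  # expect 200 if the server is up and docs are enabled
```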
## Configuration
Create a `.env` file (see `.env.example`):
```env
# API Authentication
HANZO_API_KEY=your-api-key-here
HANZO_DISABLE_AUTH=false # Set to true for local development
# LLM Configuration (choose one)
# OpenAI
HANZO_LLM_MODEL=gpt-4o-mini
OPENAI_API_KEY=your-openai-key
# Anthropic
HANZO_LLM_MODEL=claude-3-haiku-20240307
ANTHROPIC_API_KEY=your-anthropic-key
# Local Models (Ollama)
HANZO_LLM_MODEL=ollama/llama3.2
HANZO_LLM_API_BASE=http://localhost:11434
# Embedding Model
HANZO_EMBEDDING_MODEL=BAAI/bge-small-en-v1.5
# Database Backend (optional, defaults to lancedb)
HANZO_DB_BACKEND=lancedb
HANZO_LANCEDB_PATH=data/lancedb
```
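For orientation, here is a minimal sketch of how variables like these are typically loaded in a FastAPI project with pydantic-settings. The field names below simply mirror the variables above; the project's actual settings class lives in `config.py` and may differ:

```python
# Illustrative only: the real settings schema lives in src/hanzo_memory/config.py.
from pydantic_settings import BaseSettings, SettingsConfigDict


class Settings(BaseSettings):
    model_config = SettingsConfigDict(env_prefix="HANZO_", env_file=".env", extra="ignore")

    api_key: str = ""                                 # HANZO_API_KEY
    disable_auth: bool = False                        # HANZO_DISABLE_AUTH
    llm_model: str = "gpt-4o-mini"                    # HANZO_LLM_MODEL
    embedding_model: str = "BAAI/bge-small-en-v1.5"   # HANZO_EMBEDDING_MODEL
    db_backend: str = "lancedb"                       # HANZO_DB_BACKEND
    lancedb_path: str = "data/lancedb"                # HANZO_LANCEDB_PATH


settings = Settings()  # reads .env, then environment variables
```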
## Database Backends
Hanzo Memory supports multiple vector database backends:
### LanceDB (Default)
- Modern embedded vector database that works on ALL platforms
- Cross-platform: Linux, macOS, Windows, ARM, and even browsers (via WASM)
- Built-in support for FastEmbed and sentence-transformers
- Efficient columnar storage format (Apache Arrow/Parquet)
- Native vector similarity search
- Can be embedded in Python, JavaScript/TypeScript, Rust applications
### InfinityDB (Alternative, Linux/Windows only)
- High-performance embedded vector database
- Not available on macOS
- Optimized for production workloads
- Built-in vector indexing
To configure the database backend:
```env
# Use LanceDB
HANZO_DB_BACKEND=lancedb
HANZO_LANCEDB_PATH=data/lancedb
# Use InfinityDB
HANZO_DB_BACKEND=infinity
HANZO_INFINITY_DB_PATH=data/infinity_db
```
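To get a feel for what the LanceDB backend provides, here is a self-contained sketch of LanceDB's own Python API. The table name and schema are illustrative, not the service's actual internal layout:

```python
# Standalone LanceDB sketch; "demo_memories" and its schema are made up
# for illustration and do not reflect this service's internal tables.
import lancedb

db = lancedb.connect("data/lancedb")
table = db.create_table(
    "demo_memories",
    data=[
        {"vector": [0.1, 0.2, 0.3, 0.4], "text": "User prefers dark mode"},
        {"vector": [0.9, 0.8, 0.7, 0.6], "text": "Project deadline is Friday"},
    ],
    mode="overwrite",
)

# Nearest-neighbour search against a query vector of the same dimension.
results = table.search([0.1, 0.2, 0.3, 0.4]).limit(1).to_list()
print(results[0]["text"])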
## API Documentation
For complete API documentation including all endpoints, request/response formats, and examples, see [docs/API.md](docs/API.md).
### Quick API Overview
- **Memory Management**: `/v1/remember`, `/v1/memories/*`
- **Knowledge Bases**: `/v1/kb/*`, `/v1/kb/facts/*`
- **Chat Sessions**: `/v1/chat/sessions/*`, `/v1/chat/messages/*`
- **Search**: Unified semantic search across all data types
- **MCP Server**: Model Context Protocol integration for AI tools
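As a taste of the API, here is a hedged example of calling the `/v1/remember` endpoint with `httpx`. The endpoint path comes from the overview above, but the payload fields and auth header format are assumptions for illustration; the authoritative schemas are in docs/API.md:

```python
# Illustrative request: the payload fields (user_id, project_id, content)
# and Bearer auth are assumptions; consult docs/API.md for the real schema.
import httpx

resp = httpx.post(
    "http://localhost:4000/v1/remember",
    headers={"Authorization": "Bearer your-api-key-here"},
    json={
        "user_id": "user-123",
        "project_id": "project-abc",
        "content": "The user prefers concise answers.",
    },
    timeout=10.0,
)
print(resp.status_code, resp.json())
```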
### LLM Features
The service can:
- **Summarize content** for knowledge extraction
- **Generate knowledge update instructions** in JSON format
- **Filter search results** for relevance
- **Strip PII** from stored content
Example summarization request:
```python
llm_service.summarize_for_knowledge(
    content="Long document...",
    skip_summarization=False,  # Set to True to skip
    provided_summary="Optional pre-made summary",
)
```
Returns:
```json
{
  "summary": "Concise summary of content",
  "knowledge_instructions": {
    "action": "add_fact",
    "facts": [{"content": "Extracted fact", "metadata": {...}}],
    "reasoning": "Why these facts are important"
  }
```
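A caller might then apply those instructions, for example by adding each extracted fact to a knowledge base. A rough sketch follows; the facts route and payload here are guesses based on the `/v1/kb/facts/*` paths listed above, not a documented interface:

```python
# Rough sketch only: the /v1/kb/facts/add route is hypothetical;
# see docs/API.md for the real knowledge-base endpoints.
import httpx

result = {
    "summary": "Concise summary of content",
    "knowledge_instructions": {
        "action": "add_fact",
        "facts": [{"content": "Extracted fact", "metadata": {}}],
        "reasoning": "Why these facts are important",
    },
}

instructions = result["knowledge_instructions"]
if instructions["action"] == "add_fact":
    for fact in instructions["facts"]:
        httpx.post(
            "http://localhost:4000/v1/kb/facts/add",  # hypothetical route
            headers={"Authorization": "Bearer your-api-key-here"},
            json=fact,
        )
```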
## Development
### Running Tests
```bash
# Run all tests
make test
# Run with coverage
make test-cov
# Run specific test
uvx pytest tests/test_memory_api.py -v
```
### Code Quality
```bash
# Format code
make format
# Run linter
make lint
# Type checking
make type-check
```
### Project Structure
```
memory/
├── src/hanzo_memory/
│   ├── api/            # API authentication
│   ├── db/             # Database clients (LanceDB/InfinityDB)
│   ├── models/         # Pydantic models
│   ├── services/       # Business logic
│   ├── config.py       # Settings
│   └── server.py       # FastAPI app
├── tests/              # Pytest tests
├── Makefile            # Build automation
└── pyproject.toml      # Project config
```
## Deployment
### Production Checklist
1. Set strong `HANZO_API_KEY`
2. Configure appropriate LLM model and API keys
3. Set `HANZO_DISABLE_AUTH=false`
4. Configure data persistence volume
5. Set up monitoring and logging
6. Configure rate limiting if needed
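
For the first item, one way to generate a strong key is Python's standard `secrets` module:

```python
# Generate a random URL-safe API key (stdlib only).
import secrets

print(secrets.token_urlsafe(32))  # e.g. use the output as HANZO_API_KEY
```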
### Scaling Considerations
- Both database backends (LanceDB and InfinityDB) are embedded and run in-process (no separate DB server)
- FastEmbed generates embeddings locally (no API calls)
- LLM calls can be directed to local models for full offline operation
- Use Redis for caching in high-traffic scenarios
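
The embedding and caching points combine naturally: vectors computed locally with FastEmbed can be cached in Redis keyed by a content hash, so repeated texts are never re-embedded. A minimal sketch, assuming a local Redis instance; the key scheme is illustrative and not part of this service:

```python
# Sketch: cache FastEmbed vectors in Redis by content hash.
# Assumes Redis on the default localhost port; "emb:" prefix is made up.
import hashlib
import json

import redis
from fastembed import TextEmbedding

r = redis.Redis()
model = TextEmbedding("BAAI/bge-small-en-v1.5")


def embed_cached(text: str) -> list[float]:
    key = "emb:" + hashlib.sha256(text.encode()).hexdigest()
    cached = r.get(key)
    if cached is not None:
        return json.loads(cached)
    vector = next(iter(model.embed([text]))).tolist()
    r.set(key, json.dumps(vector))
    return vector
```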
## Contributing
Pull requests are welcome! Please:
1. Write tests for new features
2. Follow existing code style
3. Update documentation as needed
4. Run `make check` before submitting
## License
BSD License - see LICENSE file for details.