hanzo-memory 1.0.0

AI memory service with FastAPI and MCP support

- Author: Hanzo Industries Inc. <dev@hanzo.ai>
- License: BSD
- Requires Python: >=3.13
- Keywords: ai, memory, mcp, fastapi, embeddings
- Uploaded: 2025-07-26
# Hanzo Memory Service

## Add memory to any AI application!

A high-performance FastAPI service that provides memory and knowledge management for AI applications. Built with the LanceDB vector database (which works on all platforms, including browsers via WASM), local embeddings, and LiteLLM for flexible LLM integration.

## Features

- **🧠 Intelligent Memory Management**: Store and retrieve contextual memories with semantic search
- **📚 Knowledge Base System**: Organize facts in hierarchical knowledge bases with parent-child relationships
- **💬 Chat History**: Store and search conversation history with de-duplication
- **🔍 Unified Search API**: Fast semantic search using FastEmbed embeddings
- **🤖 Flexible LLM Support**: Use any LLM via LiteLLM (OpenAI, Anthropic, Ollama, etc.)
- **🔐 Multi-tenancy**: Secure user- and project-based data isolation
- **🚀 High Performance**: Local embeddings and efficient vector storage
- **🗄️ Cross-Platform Database**: LanceDB works everywhere - Linux, macOS, Windows, and even browsers
- **🔌 MCP Support**: Model Context Protocol server for AI tool integration
- **📦 Easy Deployment**: Docker support and uvx compatibility

## Architecture

```
β”Œβ”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”     β”Œβ”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”     β”Œβ”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”
β”‚   FastAPI       │────▢│  Vector DB      │────▢│   Embeddings    β”‚
β”‚   Server        β”‚     β”‚ (LanceDB/       β”‚     β”‚ (FastEmbed/     β”‚
β””β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”˜     β”‚  InfinityDB)    β”‚     β”‚  LanceDB)       β”‚
         β”‚              β””β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”˜     β””β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”˜
         β–Ό                                                β”‚
β”Œβ”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”                              β”Œβ”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”
β”‚   LiteLLM       β”‚                              β”‚   Local Models  β”‚
β”‚   (LLM Bridge)  β”‚                              β”‚   (BGE, etc.)   β”‚
β””β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”˜                              β””β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”˜
```

## Quick Start

### Install with uvx

```bash
# Install and run directly with uvx
uvx hanzo-memory

# Or install globally
uv tool install hanzo-memory
```

### Install from source

```bash
# Clone the repository
git clone https://github.com/hanzoai/memory
cd memory

# Install with uv
make setup

# Run the server
make dev
```

### Docker

```bash
# Using docker-compose
docker-compose up

# Or build and run manually
docker build -t hanzo-memory .
docker run -p 4000:4000 -v $(pwd)/data:/app/data hanzo-memory
```

## Configuration

Create a `.env` file (see `.env.example`):

```env
# API Authentication
HANZO_API_KEY=your-api-key-here
HANZO_DISABLE_AUTH=false  # Set to true for local development

# LLM Configuration (choose one)
# OpenAI
HANZO_LLM_MODEL=gpt-4o-mini
OPENAI_API_KEY=your-openai-key

# Anthropic
HANZO_LLM_MODEL=claude-3-haiku-20240307
ANTHROPIC_API_KEY=your-anthropic-key

# Local Models (Ollama)
HANZO_LLM_MODEL=ollama/llama3.2
HANZO_LLM_API_BASE=http://localhost:11434

# Embedding Model
HANZO_EMBEDDING_MODEL=BAAI/bge-small-en-v1.5

# Database Backend (optional, defaults to lancedb)
HANZO_DB_BACKEND=lancedb
HANZO_LANCEDB_PATH=data/lancedb
```
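
Under the hood these variables are read into a settings object. For illustration only, here is a minimal sketch of how they could be loaded with pydantic-settings (a common pattern for FastAPI apps); the project's actual `config.py` may differ, and the field names below simply mirror the `.env` keys above.

```python
# Hypothetical settings loader; field names mirror the HANZO_* keys above.
# This is an illustration, not the project's actual config.py.
from pydantic_settings import BaseSettings, SettingsConfigDict


class Settings(BaseSettings):
    model_config = SettingsConfigDict(
        env_prefix="HANZO_",  # api_key -> HANZO_API_KEY, etc.
        env_file=".env",
        extra="ignore",  # skip unrelated keys such as OPENAI_API_KEY
    )

    api_key: str = ""
    disable_auth: bool = False
    llm_model: str = "gpt-4o-mini"
    llm_api_base: str | None = None
    embedding_model: str = "BAAI/bge-small-en-v1.5"
    db_backend: str = "lancedb"
    lancedb_path: str = "data/lancedb"


settings = Settings()
```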

## Database Backends

Hanzo Memory supports multiple vector database backends:

### LanceDB (Default)
- Modern embedded vector database that works on ALL platforms
- Cross-platform: Linux, macOS, Windows, ARM, and even browsers (via WASM)
- Built-in support for FastEmbed and sentence-transformers
- Efficient columnar storage format (Apache Arrow/Parquet)
- Native vector similarity search
- Can be embedded in Python, JavaScript/TypeScript, Rust applications

### InfinityDB (Alternative, Linux/Windows only)
- High-performance embedded vector database
- Not available on macOS
- Optimized for production workloads
- Built-in vector indexing

To configure the database backend:

```env
# Use LanceDB
HANZO_DB_BACKEND=lancedb
HANZO_LANCEDB_PATH=data/lancedb

# Use InfinityDB
HANZO_DB_BACKEND=infinity
HANZO_INFINITY_DB_PATH=data/infinity_db
```
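
To see roughly what the default backend does, here is a small standalone sketch that embeds text with FastEmbed and stores and searches it in LanceDB directly. The table name and schema are illustrative, not the service's internal layout.

```python
# Standalone illustration of the FastEmbed + LanceDB pipeline.
# Table name and schema are made up for this demo.
import lancedb
from fastembed import TextEmbedding

model = TextEmbedding(model_name="BAAI/bge-small-en-v1.5")  # 384-dim vectors

texts = ["Alice prefers dark mode", "Deploys run at midnight"]
vectors = list(model.embed(texts))

db = lancedb.connect("data/lancedb")
table = db.create_table(
    "memories_demo",
    data=[{"vector": v, "text": t} for v, t in zip(vectors, texts)],
)

query = next(iter(model.embed(["What UI theme does Alice like?"])))
for hit in table.search(query).limit(1).to_list():
    print(hit["text"])  # -> "Alice prefers dark mode"
```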

## API Documentation

For complete API documentation including all endpoints, request/response formats, and examples, see [docs/API.md](docs/API.md).

### Quick API Overview

- **Memory Management**: `/v1/remember`, `/v1/memories/*` (example request below)
- **Knowledge Bases**: `/v1/kb/*`, `/v1/kb/facts/*`
- **Chat Sessions**: `/v1/chat/sessions/*`, `/v1/chat/messages/*`
- **Search**: Unified semantic search across all data types
- **MCP Server**: Model Context Protocol integration for AI tools
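
As a rough illustration of calling one of these endpoints, the snippet below posts a memory to `/v1/remember`. The payload fields and auth header are assumptions, not the documented schema; see [docs/API.md](docs/API.md) for the real request format.

```python
# Hypothetical request to /v1/remember; field names are assumptions.
import httpx

resp = httpx.post(
    "http://localhost:4000/v1/remember",
    headers={"Authorization": "Bearer your-api-key-here"},  # assumed header scheme
    json={
        "user_id": "user-123",      # assumed field
        "project_id": "proj-456",   # assumed field
        "content": "Alice prefers dark mode",
    },
)
resp.raise_for_status()
print(resp.json())
```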

### LLM Features

The service can:
- **Summarize content** for knowledge extraction
- **Generate knowledge update instructions** in JSON format
- **Filter search results** for relevance
- **Strip PII** from stored content

Example summarization request:
```python
llm_service.summarize_for_knowledge(
    content="Long document...",
    skip_summarization=False,  # Set to True to skip
    provided_summary="Optional pre-made summary"
)
```

Returns:
```json
{
  "summary": "Concise summary of content",
  "knowledge_instructions": {
    "action": "add_fact",
    "facts": [{"content": "Extracted fact", "metadata": {...}}],
    "reasoning": "Why these facts are important"
  }
}
```
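
A caller might act on this payload along the following lines; the dispatch logic is purely illustrative and not part of the service API.

```python
# Illustrative handling of the knowledge_instructions payload above.
result = llm_service.summarize_for_knowledge(content="Long document...")

instructions = result["knowledge_instructions"]
if instructions["action"] == "add_fact":
    for fact in instructions["facts"]:
        # Store each extracted fact, e.g. via /v1/kb/facts/*.
        print("would store:", fact["content"], fact.get("metadata"))
```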

## Development

### Running Tests

```bash
# Run all tests
make test

# Run with coverage
make test-cov

# Run specific test
uv run pytest tests/test_memory_api.py -v
```

### Code Quality

```bash
# Format code
make format

# Run linter
make lint

# Type checking
make type-check
```

### Project Structure

```
memory/
├── src/hanzo_memory/
│   ├── api/          # API authentication
│   ├── db/           # Database backends (LanceDB/InfinityDB)
│   ├── models/       # Pydantic models
│   ├── services/     # Business logic
│   ├── config.py     # Settings
│   └── server.py     # FastAPI app
├── tests/            # Pytest tests
├── Makefile          # Build automation
└── pyproject.toml    # Project config
```

## Deployment

### Production Checklist

1. Set strong `HANZO_API_KEY`
2. Configure appropriate LLM model and API keys
3. Set `HANZO_DISABLE_AUTH=false`
4. Configure data persistence volume
5. Set up monitoring and logging
6. Configure rate limiting if needed

### Scaling Considerations

- Both LanceDB and InfinityDB run embedded, in-process (no separate database server)
- FastEmbed generates embeddings locally (no API calls)
- LLM calls can be directed to local models for full offline operation
- Use Redis for caching in high-traffic scenarios (see the sketch below)
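
A minimal sketch of the Redis idea: cache embeddings keyed by a content hash so repeated texts skip re-embedding. The key scheme, TTL, and client setup are illustrative choices, not something the service prescribes.

```python
# Illustrative embedding cache; key naming and TTL are arbitrary choices.
import hashlib
import json

import redis

r = redis.Redis(host="localhost", port=6379)


def cached_embed(text: str, embed_fn, ttl: int = 3600) -> list[float]:
    """Return a cached embedding, computing and storing it on a miss."""
    key = "emb:" + hashlib.sha256(text.encode()).hexdigest()
    hit = r.get(key)
    if hit is not None:
        return json.loads(hit)
    vector = embed_fn(text)  # embed_fn must return a JSON-serializable list
    r.setex(key, ttl, json.dumps(vector))
    return vector
```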

## Contributing

Pull requests are welcome! Please:
1. Write tests for new features
2. Follow existing code style
3. Update documentation as needed
4. Run `make check` before submitting

## License

BSD License - see LICENSE file for details.

            
