Name | hanzo-memory
Version | 0.1.1
Summary | AI memory service with FastAPI and MCP support
upload_time | 2025-07-23 23:07:01
home_page | None
maintainer | None
docs_url | None
author | None
requires_python | >=3.13
license | BSD
keywords | ai, memory, mcp, fastapi, embeddings
project_urls | Homepage: https://github.com/hanzoai/memory · Documentation: https://docs.hanzo.ai/memory · Issues: https://github.com/hanzoai/memory/issues
# Hanzo Memory Service
## Add memory to any AI application!
A high-performance FastAPI service that provides memory and knowledge management capabilities for AI applications. Built with InfinityDB for vector storage, FastEmbed for local embeddings, and LiteLLM for flexible LLM integration.
## Features
- **🧠 Intelligent Memory Management**: Store and retrieve contextual memories with semantic search
- **📚 Knowledge Base System**: Organize facts in hierarchical knowledge bases with parent-child relationships
- **💬 Chat History**: Store and search conversation history with de-duplication
- **🔍 Unified Search API**: Fast semantic search using FastEmbed embeddings (see the sketch after this list)
- **🤖 Flexible LLM Support**: Use any LLM via LiteLLM (OpenAI, Anthropic, Ollama, etc.)
- **🔐 Multi-tenancy**: Secure user and project-based data isolation
- **🚀 High Performance**: Local embeddings and efficient vector storage with InfinityDB
- **🔌 MCP Support**: Model Context Protocol server for AI tool integration
- **📦 Easy Deployment**: Docker support and uvx compatibility
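The embedding path the features above rely on can be exercised with FastEmbed directly. The snippet below is generic FastEmbed usage, not code from this package; the model name matches the default configured later in this README.

```python
# Generic FastEmbed usage (not hanzo-memory code): generate a local
# embedding with the same default model this service configures.
from fastembed import TextEmbedding

model = TextEmbedding(model_name="BAAI/bge-small-en-v1.5")
vectors = list(model.embed(["Store and retrieve contextual memories"]))
print(len(vectors[0]))  # bge-small-en-v1.5 yields 384-dimensional vectors
```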
## Architecture
```
┌─────────────────┐     ┌─────────────────┐     ┌─────────────────┐
│     FastAPI     │────▶│    InfinityDB   │────▶│    FastEmbed    │
│     Server      │     │   (Vector DB)   │     │   (Embeddings)  │
└─────────────────┘     └─────────────────┘     └─────────────────┘
         │                                               │
         ▼                                               ▼
┌─────────────────┐     ┌─────────────────┐
│     LiteLLM     │     │  Local Models   │
│  (LLM Bridge)   │     │   (BGE, etc.)   │
└─────────────────┘     └─────────────────┘
```
## Quick Start
### Install with uvx
```bash
# Install and run directly with uvx
uvx hanzo-memory
# Or install globally as a tool
uv tool install hanzo-memory
```
### Install from source
```bash
# Clone the repository
git clone https://github.com/hanzoai/memory
cd memory
# Install with uv
make setup
# Run the server
make dev
```
### Docker
```bash
# Using docker-compose
docker-compose up
# Or build and run manually
docker build -t hanzo-memory .
docker run -p 4000:4000 -v $(pwd)/data:/app/data hanzo-memory
```
## Configuration
Create a `.env` file (see `.env.example`):
```env
# API Authentication
HANZO_API_KEY=your-api-key-here
HANZO_DISABLE_AUTH=false # Set to true for local development
# LLM Configuration (choose one)
# OpenAI
HANZO_LLM_MODEL=gpt-4o-mini
OPENAI_API_KEY=your-openai-key
# Anthropic
HANZO_LLM_MODEL=claude-3-haiku-20240307
ANTHROPIC_API_KEY=your-anthropic-key
# Local Models (Ollama)
HANZO_LLM_MODEL=ollama/llama3.2
HANZO_LLM_API_BASE=http://localhost:11434
# Embedding Model
HANZO_EMBEDDING_MODEL=BAAI/bge-small-en-v1.5
```
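
Internally, settings like these are typically loaded into a typed object. A minimal sketch of that pattern with pydantic-settings follows; the field names mirror the variables above, but the package's actual `config.py` may differ.

```python
# Sketch only: mapping the HANZO_* variables above onto a typed
# settings object with pydantic-settings. Field names are assumptions
# based on the .env example, not the package's actual config.py.
from pydantic_settings import BaseSettings, SettingsConfigDict

class Settings(BaseSettings):
    model_config = SettingsConfigDict(env_prefix="HANZO_", env_file=".env")

    api_key: str = ""
    disable_auth: bool = False
    llm_model: str = "gpt-4o-mini"
    llm_api_base: str | None = None
    embedding_model: str = "BAAI/bge-small-en-v1.5"

settings = Settings()
print(settings.embedding_model)
```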
## API Documentation
For complete API documentation including all endpoints, request/response formats, and examples, see [docs/API.md](docs/API.md).
### Quick API Overview
- **Memory Management**: `/v1/remember`, `/v1/memories/*` (example call after this list)
- **Knowledge Bases**: `/v1/kb/*`, `/v1/kb/facts/*`
- **Chat Sessions**: `/v1/chat/sessions/*`, `/v1/chat/messages/*`
- **Search**: Unified semantic search across all data types
- **MCP Server**: Model Context Protocol integration for AI tools
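
As a quick illustration of the memory endpoints listed above, the sketch below stores a memory over HTTP. The exact request schema lives in [docs/API.md](docs/API.md); the JSON field names (`user_id`, `project_id`, `content`) and the bearer-token header are assumptions.

```python
# Hypothetical call to /v1/remember. Consult docs/API.md for the real
# schema; the body fields below are illustrative assumptions.
import requests

resp = requests.post(
    "http://localhost:4000/v1/remember",
    headers={"Authorization": "Bearer your-api-key-here"},
    json={
        "user_id": "u1",
        "project_id": "p1",
        "content": "The user prefers dark mode.",
    },
    timeout=30,
)
resp.raise_for_status()
print(resp.json())
```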
### LLM Features
The service can:
- **Summarize content** for knowledge extraction
- **Generate knowledge update instructions** in JSON format
- **Filter search results** for relevance
- **Strip PII** from stored content
Example summarization request:
```python
llm_service.summarize_for_knowledge(
    content="Long document...",
    skip_summarization=False,  # Set to True to skip
    provided_summary="Optional pre-made summary",
)
```
Returns:
```json
{
  "summary": "Concise summary of content",
  "knowledge_instructions": {
    "action": "add_fact",
    "facts": [{"content": "Extracted fact", "metadata": {...}}],
    "reasoning": "Why these facts are important"
  }
}
```
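
A caller would typically act on the returned `knowledge_instructions`. Here is a minimal sketch assuming the response shape above; `kb_service.add_fact` is a hypothetical helper named for illustration, not a documented API of this package.

```python
# Sketch: dispatch on knowledge_instructions from the response above.
# kb_service.add_fact is hypothetical, named only for illustration.
result = llm_service.summarize_for_knowledge(content="Long document...")
instructions = result["knowledge_instructions"]
if instructions["action"] == "add_fact":
    for fact in instructions["facts"]:
        kb_service.add_fact(
            content=fact["content"],
            metadata=fact.get("metadata", {}),
        )
```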
## Development
### Running Tests
```bash
# Run all tests
make test
# Run with coverage
make test-cov
# Run specific test
uvx pytest tests/test_memory_api.py -v
```
### Code Quality
```bash
# Format code
make format
# Run linter
make lint
# Type checking
make type-check
```
### Project Structure
```
memory/
├── src/hanzo_memory/
│   ├── api/           # API authentication
│   ├── db/            # InfinityDB client
│   ├── models/        # Pydantic models
│   ├── services/      # Business logic
│   ├── config.py      # Settings
│   └── server.py      # FastAPI app
├── tests/             # Pytest tests
├── Makefile           # Build automation
└── pyproject.toml     # Project config
```
## Deployment
### Production Checklist
1. Set strong `HANZO_API_KEY`
2. Configure appropriate LLM model and API keys
3. Set `HANZO_DISABLE_AUTH=false`
4. Configure data persistence volume
5. Set up monitoring and logging
6. Configure rate limiting if needed
### Scaling Considerations
- InfinityDB embedded runs in-process (no separate DB server)
- FastEmbed generates embeddings locally (no API calls)
- LLM calls can be directed to local models for full offline operation
- Use Redis for caching in high-traffic scenarios (see the sketch below)
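
A minimal sketch of that Redis caching idea, assuming redis-py and a caller-supplied `embed` function; this is not part of hanzo-memory itself.

```python
# Sketch only: cache embedding vectors in Redis for high-traffic use.
# Assumes redis-py; embed() is any function returning a list of floats.
import hashlib
import json

import redis

r = redis.Redis(host="localhost", port=6379)

def cached_embedding(text: str, embed) -> list[float]:
    key = "emb:" + hashlib.sha256(text.encode()).hexdigest()
    hit = r.get(key)
    if hit is not None:
        return json.loads(hit)
    vector = embed(text)
    r.set(key, json.dumps(vector), ex=86400)  # expire after one day
    return vector
```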
## Contributing
Pull requests are welcome! Please:
1. Write tests for new features
2. Follow existing code style
3. Update documentation as needed
4. Run `make check` before submitting
## License
BSD License - see LICENSE file for details.