# SemWare π
[](https://github.com/semware/semware)
[](https://github.com/semware/semware)
[](https://python.org)
[](LICENSE)
A high-performance semantic search API server built with modern Python technologies. SemWare provides REST APIs for vector-based document storage, embedding generation, and similarity search using state-of-the-art machine learning models.
## β¨ Features
- **π High Performance**: Built on FastAPI with automatic async/await support
- **π§ Smart Embeddings**: Supports multiple embedding models (all-MiniLM-L6-v2, EmbeddingGemma-300M)
- **π Advanced Search**: Similarity threshold and top-k search with sub-second response times
- **π‘οΈ Secure**: API key authentication with Bearer token support
- **π Vector Storage**: Powered by LanceDB for efficient vector operations
- **π§ Developer Friendly**: Comprehensive OpenAPI docs, type hints, and test coverage
- **π Scalable**: Handles documents of any length with intelligent text batching
- **ποΈ Production Ready**: Comprehensive logging, error handling, and monitoring
## ποΈ Architecture
SemWare follows a clean architecture pattern with separate layers:
```
βββββββββββββββββββ βββββββββββββββββββ βββββββββββββββββββ
β FastAPI β β Services β β Storage β
β REST APIs βββββΆβ Business βββββΆβ LanceDB β
β (Routes) β β Logic β β Vector DB β
βββββββββββββββββββ βββββββββββββββββββ βββββββββββββββββββ
β
βββββββββββββββββββ
β ML Models β
β Embeddings β
β (HuggingFace) β
βββββββββββββββββββ
```
**Core Components:**
- **Table Management**: Create custom schemas for different document types
- **Data Operations**: CRUD operations with automatic embedding generation
- **Semantic Search**: Vector similarity search with configurable parameters
- **Text Processing**: Smart tokenization and batching for long documents
## π Quick Start
### Installation
**Using uv (Recommended):**
```bash
git clone https://github.com/your-org/semware.git
cd SemWare
uv sync --native-tls
```
**Using pip:**
```bash
git clone https://github.com/your-org/semware.git
cd SemWare
pip install -e .
```
### Configuration
Create a `.env` file:
```bash
# Required
API_KEY=your-super-secret-api-key-here
# Optional (with defaults)
DEBUG=false
DB_PATH=./data
HOST=0.0.0.0
PORT=8000
LOG_LEVEL=INFO
EMBEDDING_MODEL_NAME=all-MiniLM-L6-v2
EMBEDDING_DIMENSION=384
MAX_TOKENS_PER_BATCH=2000
```
### Start the Server
**Simple Command (Recommended):**
```bash
# Start with default settings from .env
semware
# Start with custom options
semware --debug --port 8080
semware --workers 4 --host 127.0.0.1
semware --reload # Development mode with auto-reload
```
**Alternative Methods:**
```bash
# Using uv directly
uv run --native-tls semware
# Using Python module
uv run --native-tls python -m semware.main
# Using uvicorn directly
uv run --native-tls uvicorn semware.main:app --host 0.0.0.0 --port 8000 --workers 4
```
The server will be available at `http://localhost:8000` with automatic API documentation at `/docs`.
### CLI Options
The `semware` command supports these options:
```bash
semware --help Show help message
semware --version Show version
semware --debug Enable debug mode & API docs
semware --reload Development mode with auto-reload
semware --host 127.0.0.1 Bind to specific host
semware --port 8080 Use custom port
semware --workers 4 Number of worker processes
semware --log-level DEBUG Set logging level
```
## π API Reference
### Authentication
All endpoints require authentication using one of:
- **Header**: `X-API-Key: your-api-key`
- **Bearer Token**: `Authorization: Bearer your-api-key`
### ποΈ Table Management
#### Create Table
Create a new table with custom schema.
```http
POST /tables
Content-Type: application/json
X-API-Key: your-api-key
{
"schema": {
"name": "research_papers",
"columns": {
"id": "string",
"title": "string",
"abstract": "string",
"authors": "string",
"year": "int",
"doi": "string"
},
"id_column": "id",
"embedding_column": "abstract"
}
}
```
**Response (201):**
```json
{
"message": "Table 'research_papers' created successfully",
"table_name": "research_papers"
}
```
#### List Tables
Get all available tables.
```http
GET /tables
X-API-Key: your-api-key
```
**Response (200):**
```json
{
"tables": ["research_papers", "product_docs", "customer_support"],
"count": 3
}
```
#### Get Table Info
Get detailed information about a specific table.
```http
GET /tables/research_papers
X-API-Key: your-api-key
```
**Response (200):**
```json
{
"table_name": "research_papers",
"schema": {
"name": "research_papers",
"columns": {
"id": "string",
"title": "string",
"abstract": "string",
"authors": "string",
"year": "int",
"doi": "string"
},
"id_column": "id",
"embedding_column": "abstract"
},
"record_count": 1547,
"created_at": "2024-01-15T10:30:00Z"
}
```
#### Delete Table
Delete a table and all its data.
```http
DELETE /tables/research_papers
X-API-Key: your-api-key
```
**Response (200):**
```json
{
"message": "Table 'research_papers' deleted successfully",
"table_name": "research_papers"
}
```
### π Data Operations
#### Insert/Update Documents
Insert new documents or update existing ones. Embeddings are generated automatically.
```http
POST /tables/research_papers/data
Content-Type: application/json
X-API-Key: your-api-key
{
"records": [
{
"data": {
"id": "paper_001",
"title": "Attention Is All You Need",
"abstract": "The dominant sequence transduction models are based on complex recurrent or convolutional neural networks...",
"authors": "Ashish Vaswani, Noam Shazeer, Niki Parmar",
"year": 2017,
"doi": "10.48550/arXiv.1706.03762"
}
},
{
"data": {
"id": "paper_002",
"title": "BERT: Pre-training of Deep Bidirectional Transformers",
"abstract": "We introduce a new language representation model called BERT, which stands for Bidirectional Encoder Representations...",
"authors": "Jacob Devlin, Ming-Wei Chang, Kenton Lee",
"year": 2018,
"doi": "10.48550/arXiv.1810.04805"
}
}
]
}
```
**Response (201):**
```json
{
"message": "Successfully processed 2 records",
"inserted_count": 2,
"updated_count": 0,
"processing_time_ms": 1247.3
}
```
#### Get Document
Retrieve a specific document by ID.
```http
GET /tables/research_papers/data/paper_001
X-API-Key: your-api-key
```
**Response (200):**
```json
{
"table_name": "research_papers",
"record_id": "paper_001",
"data": {
"id": "paper_001",
"title": "Attention Is All You Need",
"abstract": "The dominant sequence transduction models are based on complex recurrent...",
"authors": "Ashish Vaswani, Noam Shazeer, Niki Parmar",
"year": 2017,
"doi": "10.48550/arXiv.1706.03762"
}
}
```
#### Delete Document
Remove a document from the table.
```http
DELETE /tables/research_papers/data/paper_001
X-API-Key: your-api-key
```
**Response (200):**
```json
{
"message": "Record 'paper_001' deleted successfully",
"table_name": "research_papers",
"deleted_id": "paper_001"
}
```
### π Search Operations
#### Similarity Search
Find all documents with similarity above a threshold.
```http
POST /tables/research_papers/search/similarity
Content-Type: application/json
X-API-Key: your-api-key
{
"query": "transformer neural network attention mechanism",
"threshold": 0.7,
"limit": 10
}
```
**Response (200):**
```json
{
"query": "transformer neural network attention mechanism",
"results": [
{
"id": "paper_001",
"data": {
"id": "paper_001",
"title": "Attention Is All You Need",
"abstract": "The dominant sequence transduction models...",
"authors": "Ashish Vaswani, Noam Shazeer, Niki Parmar",
"year": 2017,
"doi": "10.48550/arXiv.1706.03762"
},
"similarity_score": 0.89
},
{
"id": "paper_002",
"data": {
"id": "paper_002",
"title": "BERT: Pre-training of Deep Bidirectional Transformers",
"abstract": "We introduce a new language representation model...",
"authors": "Jacob Devlin, Ming-Wei Chang, Kenton Lee",
"year": 2018,
"doi": "10.48550/arXiv.1810.04805"
},
"similarity_score": 0.76
}
],
"total_results": 2,
"search_time_ms": 23.4,
"threshold": 0.7
}
```
#### Top-K Search
Find the K most similar documents.
```http
POST /tables/research_papers/search/top-k
Content-Type: application/json
X-API-Key: your-api-key
{
"query": "natural language processing BERT",
"k": 5
}
```
**Response (200):**
```json
{
"query": "natural language processing BERT",
"results": [
{
"id": "paper_002",
"data": {
"id": "paper_002",
"title": "BERT: Pre-training of Deep Bidirectional Transformers",
"abstract": "We introduce a new language representation model...",
"authors": "Jacob Devlin, Ming-Wei Chang, Kenton Lee",
"year": 2018,
"doi": "10.48550/arXiv.1810.04805"
},
"similarity_score": 0.94
},
{
"id": "paper_001",
"data": {
"id": "paper_001",
"title": "Attention Is All You Need",
"abstract": "The dominant sequence transduction models...",
"authors": "Ashish Vaswani, Noam Shazeer, Niki Parmar",
"year": 2017,
"doi": "10.48550/arXiv.1706.03762"
},
"similarity_score": 0.81
}
],
"total_results": 5,
"search_time_ms": 31.7,
"k": 5
}
```
### β€οΈ Health Check
```http
GET /health
```
**Response (200):**
```json
{
"status": "healthy",
"app_name": "SemWare",
"version": "0.1.0",
"timestamp": "2024-01-15T14:30:25.123456"
}
```
## π§ Embedding Process
SemWare uses advanced text processing for optimal semantic understanding:
### 1. **Text Tokenization**
- Long texts are intelligently split into manageable chunks
- Uses `tiktoken` with `cl100k_base` encoding for precise token counting
- Default batch size: 2000 tokens with configurable limits
### 2. **Batch Processing**
- Each text chunk is processed through the embedding model
- Supports multiple embedding models via Hugging Face transformers
- Automatic GPU acceleration when available
### 3. **Embedding Aggregation**
- Multiple batch embeddings are combined using average pooling
- Preserves semantic meaning across the entire document
- Results in high-quality 384-dimensional vectors (MiniLM)
### 4. **Normalization & Storage**
- Final embeddings are L2 normalized for consistent similarity scoring
- Stored efficiently in LanceDB with optimized vector indexing
- Enables sub-second search across millions of documents
## π οΈ Development
### Running Tests
```bash
# Run all tests with coverage
uv run --native-tls pytest --cov=src --cov-report=html
# Run specific test file
uv run --native-tls pytest tests/test_api/test_search.py -v
# Run with debug output
uv run --native-tls pytest -s --log-cli-level=DEBUG
```
### Code Quality
```bash
# Format code
uv run --native-tls ruff format src/ tests/
# Lint and fix issues
uv run --native-tls ruff check src/ tests/ --fix
# Type checking
uv run --native-tls mypy src/
```
### API Documentation
Start the server with `DEBUG=true` in your `.env` and visit:
- **Swagger UI**: http://localhost:8000/docs
- **ReDoc**: http://localhost:8000/redoc
- **OpenAPI JSON**: http://localhost:8000/openapi.json
## π Project Structure
```
SemWare/
βββ src/semware/
β βββ api/ # FastAPI route handlers
β β βββ __init__.py
β β βββ auth.py # Authentication middleware
β β βββ data.py # Data CRUD operations
β β βββ search.py # Search endpoints
β β βββ tables.py # Table management
β βββ models/ # Pydantic data models
β β βββ __init__.py
β β βββ requests.py # Request/response models
β β βββ schemas.py # Core data schemas
β βββ services/ # Business logic services
β β βββ __init__.py
β β βββ embedding.py # ML embedding generation
β β βββ search.py # Search orchestration
β β βββ vectordb.py # Vector database operations
β βββ utils/ # Utility functions
β β βββ __init__.py
β β βββ logging.py # Logging configuration
β β βββ tokenizer.py # Text tokenization
β βββ config.py # Configuration management
β βββ main.py # FastAPI application factory
βββ tests/ # Comprehensive test suite
β βββ conftest.py # Test configuration & fixtures
β βββ test_api/ # API endpoint tests
β βββ test_services/ # Service layer tests
β βββ test_utils/ # Utility function tests
βββ pyproject.toml # Project configuration
βββ .env.example # Environment template
βββ README.md # This file
```
## βοΈ Configuration Reference
| Variable | Description | Default | Required |
|----------|-------------|---------|----------|
| `API_KEY` | Authentication key for all endpoints | - | β
|
| `DEBUG` | Enable debug mode and API docs | `false` | β |
| `DB_PATH` | Database storage directory | `./data` | β |
| `HOST` | Server bind address | `0.0.0.0` | β |
| `PORT` | Server port | `8000` | β |
| `LOG_LEVEL` | Logging level (DEBUG/INFO/WARNING/ERROR) | `INFO` | β |
| `LOG_FILE` | Log file path (optional) | - | β |
| `EMBEDDING_MODEL_NAME` | Hugging Face model name | `all-MiniLM-L6-v2` | β |
| `EMBEDDING_DIMENSION` | Embedding vector dimensions | `384` | β |
| `MAX_TOKENS_PER_BATCH` | Max tokens per embedding batch | `2000` | β |
| `WORKERS` | Number of server workers | `1` | β |
## π’ Deployment
### Docker
```dockerfile
FROM python:3.11-slim
WORKDIR /app
COPY . .
RUN pip install uv
RUN uv sync --native-tls
EXPOSE 8000
CMD ["uv", "run", "--native-tls", "uvicorn", "semware.main:app", "--host", "0.0.0.0", "--port", "8000"]
```
### Production Considerations
- Use multiple workers: `--workers 4`
- Enable access logs: `--access-log`
- Set up reverse proxy (nginx) for HTTPS termination
- Configure log rotation and monitoring
- Use a dedicated vector storage solution for large scale
## π€ Contributing
We welcome contributions! Please see our [Contributing Guide](CONTRIBUTING.md) for details.
1. Fork the repository
2. Create a feature branch: `git checkout -b feature/amazing-feature`
3. Make your changes and add tests
4. Run the test suite: `uv run --native-tls pytest`
5. Submit a pull request
## π Performance
**Benchmarks** (on Apple M2 Pro, 16GB RAM):
- **Embedding Generation**: ~200ms per batch (2000 tokens)
- **Document Insertion**: ~500ms per document (including embedding)
- **Vector Search**: <50ms for similarity search across 10K documents
- **Throughput**: ~100 requests/second with 4 workers
## π Troubleshooting
### Common Issues
**Authentication Errors**
```bash
# Ensure API key is set correctly
export API_KEY=your-secret-key
# Or check your .env file
```
**Model Download Issues**
```bash
# Clear Hugging Face cache
rm -rf ~/.cache/huggingface/
# Restart with debug logging
DEBUG=true uv run --native-tls python -m semware.main
```
**Database Permissions**
```bash
# Ensure write permissions to data directory
chmod 755 ./data
```
## π License
This project is licensed under the MIT License - see the [LICENSE](LICENSE) file for details.
## π Acknowledgments
- **FastAPI** for the excellent async web framework
- **LanceDB** for high-performance vector storage
- **Hugging Face** for the transformer models and ecosystem
- **Pydantic** for robust data validation
- **The Python Community** for the amazing open-source ecosystem
---
<p align="center">
<strong>Built with β€οΈ by the SemWare team</strong>
</p>
<p align="center">
<a href="https://github.com/semware/semware/issues">Report Bug</a> β’
<a href="https://github.com/semware/semware/discussions">Discussions</a> β’
<a href="https://github.com/semware/semware/wiki">Wiki</a>
</p>
Raw data
{
"_id": null,
"home_page": null,
"name": "semware",
"maintainer": null,
"docs_url": null,
"requires_python": ">=3.11",
"maintainer_email": null,
"keywords": "api, embeddings, machine-learning, semantic-search, vector-database",
"author": "SemWare Team",
"author_email": null,
"download_url": "https://files.pythonhosted.org/packages/8a/02/c6394e787bf1bb03319e324567ac107c6550550fc6c0febdffc0aa6bb7a2/semware-0.1.0.tar.gz",
"platform": null,
"description": "# SemWare \ud83d\ude80\n\n[](https://github.com/semware/semware)\n[](https://github.com/semware/semware)\n[](https://python.org)\n[](LICENSE)\n\nA high-performance semantic search API server built with modern Python technologies. SemWare provides REST APIs for vector-based document storage, embedding generation, and similarity search using state-of-the-art machine learning models.\n\n## \u2728 Features\n\n- **\ud83d\ude84 High Performance**: Built on FastAPI with automatic async/await support\n- **\ud83e\udde0 Smart Embeddings**: Supports multiple embedding models (all-MiniLM-L6-v2, EmbeddingGemma-300M)\n- **\ud83d\udd0d Advanced Search**: Similarity threshold and top-k search with sub-second response times\n- **\ud83d\udee1\ufe0f Secure**: API key authentication with Bearer token support\n- **\ud83d\udcca Vector Storage**: Powered by LanceDB for efficient vector operations\n- **\ud83d\udd27 Developer Friendly**: Comprehensive OpenAPI docs, type hints, and test coverage\n- **\ud83d\udcc8 Scalable**: Handles documents of any length with intelligent text batching\n- **\ud83c\udfd7\ufe0f Production Ready**: Comprehensive logging, error handling, and monitoring\n\n## \ud83c\udfdb\ufe0f Architecture\n\nSemWare follows a clean architecture pattern with separate layers:\n\n```\n\u250c\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2510 \u250c\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2510 \u250c\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2510\n\u2502 FastAPI \u2502 \u2502 Services \u2502 \u2502 Storage \u2502\n\u2502 REST APIs \u2502\u2500\u2500\u2500\u25b6\u2502 Business \u2502\u2500\u2500\u2500\u25b6\u2502 LanceDB \u2502\n\u2502 (Routes) \u2502 \u2502 Logic \u2502 \u2502 Vector DB \u2502\n\u2514\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2518 \u2514\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2518 \u2514\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2518\n \u2502\n \u250c\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2510\n \u2502 ML Models \u2502\n \u2502 Embeddings \u2502\n \u2502 (HuggingFace) \u2502\n \u2514\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2518\n```\n\n**Core Components:**\n- **Table Management**: Create custom schemas for different document types\n- **Data Operations**: CRUD operations with automatic embedding generation \n- **Semantic Search**: Vector similarity search with configurable parameters\n- **Text Processing**: Smart tokenization and batching for long documents\n\n## \ud83d\ude80 Quick Start\n\n### Installation\n\n**Using uv (Recommended):**\n```bash\ngit clone https://github.com/your-org/semware.git\ncd SemWare\nuv sync --native-tls\n```\n\n**Using pip:**\n```bash\ngit clone https://github.com/your-org/semware.git\ncd SemWare\npip install -e .\n```\n\n### Configuration\n\nCreate a `.env` file:\n```bash\n# Required\nAPI_KEY=your-super-secret-api-key-here\n\n# Optional (with defaults)\nDEBUG=false\nDB_PATH=./data\nHOST=0.0.0.0\nPORT=8000\nLOG_LEVEL=INFO\nEMBEDDING_MODEL_NAME=all-MiniLM-L6-v2\nEMBEDDING_DIMENSION=384\nMAX_TOKENS_PER_BATCH=2000\n```\n\n### Start the Server\n\n**Simple Command (Recommended):**\n```bash\n# Start with default settings from .env\nsemware\n\n# Start with custom options\nsemware --debug --port 8080\nsemware --workers 4 --host 127.0.0.1\nsemware --reload # Development mode with auto-reload\n```\n\n**Alternative Methods:**\n```bash\n# Using uv directly\nuv run --native-tls semware\n\n# Using Python module\nuv run --native-tls python -m semware.main\n\n# Using uvicorn directly\nuv run --native-tls uvicorn semware.main:app --host 0.0.0.0 --port 8000 --workers 4\n```\n\nThe server will be available at `http://localhost:8000` with automatic API documentation at `/docs`.\n\n### CLI Options\n\nThe `semware` command supports these options:\n\n```bash\nsemware --help Show help message\nsemware --version Show version\nsemware --debug Enable debug mode & API docs\nsemware --reload Development mode with auto-reload\nsemware --host 127.0.0.1 Bind to specific host\nsemware --port 8080 Use custom port\nsemware --workers 4 Number of worker processes\nsemware --log-level DEBUG Set logging level\n```\n\n## \ud83d\udcda API Reference\n\n### Authentication\n\nAll endpoints require authentication using one of:\n- **Header**: `X-API-Key: your-api-key`\n- **Bearer Token**: `Authorization: Bearer your-api-key`\n\n### \ud83d\uddc2\ufe0f Table Management\n\n#### Create Table\nCreate a new table with custom schema.\n\n```http\nPOST /tables\nContent-Type: application/json\nX-API-Key: your-api-key\n\n{\n \"schema\": {\n \"name\": \"research_papers\",\n \"columns\": {\n \"id\": \"string\",\n \"title\": \"string\", \n \"abstract\": \"string\",\n \"authors\": \"string\",\n \"year\": \"int\",\n \"doi\": \"string\"\n },\n \"id_column\": \"id\",\n \"embedding_column\": \"abstract\"\n }\n}\n```\n\n**Response (201):**\n```json\n{\n \"message\": \"Table 'research_papers' created successfully\",\n \"table_name\": \"research_papers\"\n}\n```\n\n#### List Tables\nGet all available tables.\n\n```http\nGET /tables\nX-API-Key: your-api-key\n```\n\n**Response (200):**\n```json\n{\n \"tables\": [\"research_papers\", \"product_docs\", \"customer_support\"],\n \"count\": 3\n}\n```\n\n#### Get Table Info\nGet detailed information about a specific table.\n\n```http\nGET /tables/research_papers\nX-API-Key: your-api-key\n```\n\n**Response (200):**\n```json\n{\n \"table_name\": \"research_papers\",\n \"schema\": {\n \"name\": \"research_papers\",\n \"columns\": {\n \"id\": \"string\",\n \"title\": \"string\",\n \"abstract\": \"string\",\n \"authors\": \"string\", \n \"year\": \"int\",\n \"doi\": \"string\"\n },\n \"id_column\": \"id\",\n \"embedding_column\": \"abstract\"\n },\n \"record_count\": 1547,\n \"created_at\": \"2024-01-15T10:30:00Z\"\n}\n```\n\n#### Delete Table\nDelete a table and all its data.\n\n```http\nDELETE /tables/research_papers\nX-API-Key: your-api-key\n```\n\n**Response (200):**\n```json\n{\n \"message\": \"Table 'research_papers' deleted successfully\",\n \"table_name\": \"research_papers\"\n}\n```\n\n### \ud83d\udcc4 Data Operations\n\n#### Insert/Update Documents\nInsert new documents or update existing ones. Embeddings are generated automatically.\n\n```http\nPOST /tables/research_papers/data\nContent-Type: application/json\nX-API-Key: your-api-key\n\n{\n \"records\": [\n {\n \"data\": {\n \"id\": \"paper_001\",\n \"title\": \"Attention Is All You Need\",\n \"abstract\": \"The dominant sequence transduction models are based on complex recurrent or convolutional neural networks...\",\n \"authors\": \"Ashish Vaswani, Noam Shazeer, Niki Parmar\",\n \"year\": 2017,\n \"doi\": \"10.48550/arXiv.1706.03762\"\n }\n },\n {\n \"data\": {\n \"id\": \"paper_002\", \n \"title\": \"BERT: Pre-training of Deep Bidirectional Transformers\",\n \"abstract\": \"We introduce a new language representation model called BERT, which stands for Bidirectional Encoder Representations...\",\n \"authors\": \"Jacob Devlin, Ming-Wei Chang, Kenton Lee\",\n \"year\": 2018,\n \"doi\": \"10.48550/arXiv.1810.04805\"\n }\n }\n ]\n}\n```\n\n**Response (201):**\n```json\n{\n \"message\": \"Successfully processed 2 records\",\n \"inserted_count\": 2,\n \"updated_count\": 0,\n \"processing_time_ms\": 1247.3\n}\n```\n\n#### Get Document\nRetrieve a specific document by ID.\n\n```http\nGET /tables/research_papers/data/paper_001\nX-API-Key: your-api-key\n```\n\n**Response (200):**\n```json\n{\n \"table_name\": \"research_papers\",\n \"record_id\": \"paper_001\",\n \"data\": {\n \"id\": \"paper_001\",\n \"title\": \"Attention Is All You Need\",\n \"abstract\": \"The dominant sequence transduction models are based on complex recurrent...\",\n \"authors\": \"Ashish Vaswani, Noam Shazeer, Niki Parmar\",\n \"year\": 2017,\n \"doi\": \"10.48550/arXiv.1706.03762\"\n }\n}\n```\n\n#### Delete Document\nRemove a document from the table.\n\n```http\nDELETE /tables/research_papers/data/paper_001\nX-API-Key: your-api-key\n```\n\n**Response (200):**\n```json\n{\n \"message\": \"Record 'paper_001' deleted successfully\",\n \"table_name\": \"research_papers\",\n \"deleted_id\": \"paper_001\"\n}\n```\n\n### \ud83d\udd0d Search Operations\n\n#### Similarity Search\nFind all documents with similarity above a threshold.\n\n```http\nPOST /tables/research_papers/search/similarity\nContent-Type: application/json\nX-API-Key: your-api-key\n\n{\n \"query\": \"transformer neural network attention mechanism\",\n \"threshold\": 0.7,\n \"limit\": 10\n}\n```\n\n**Response (200):**\n```json\n{\n \"query\": \"transformer neural network attention mechanism\",\n \"results\": [\n {\n \"id\": \"paper_001\",\n \"data\": {\n \"id\": \"paper_001\",\n \"title\": \"Attention Is All You Need\",\n \"abstract\": \"The dominant sequence transduction models...\",\n \"authors\": \"Ashish Vaswani, Noam Shazeer, Niki Parmar\",\n \"year\": 2017,\n \"doi\": \"10.48550/arXiv.1706.03762\"\n },\n \"similarity_score\": 0.89\n },\n {\n \"id\": \"paper_002\",\n \"data\": {\n \"id\": \"paper_002\",\n \"title\": \"BERT: Pre-training of Deep Bidirectional Transformers\", \n \"abstract\": \"We introduce a new language representation model...\",\n \"authors\": \"Jacob Devlin, Ming-Wei Chang, Kenton Lee\",\n \"year\": 2018,\n \"doi\": \"10.48550/arXiv.1810.04805\"\n },\n \"similarity_score\": 0.76\n }\n ],\n \"total_results\": 2,\n \"search_time_ms\": 23.4,\n \"threshold\": 0.7\n}\n```\n\n#### Top-K Search \nFind the K most similar documents.\n\n```http\nPOST /tables/research_papers/search/top-k\nContent-Type: application/json\nX-API-Key: your-api-key\n\n{\n \"query\": \"natural language processing BERT\",\n \"k\": 5\n}\n```\n\n**Response (200):**\n```json\n{\n \"query\": \"natural language processing BERT\",\n \"results\": [\n {\n \"id\": \"paper_002\",\n \"data\": {\n \"id\": \"paper_002\",\n \"title\": \"BERT: Pre-training of Deep Bidirectional Transformers\",\n \"abstract\": \"We introduce a new language representation model...\",\n \"authors\": \"Jacob Devlin, Ming-Wei Chang, Kenton Lee\", \n \"year\": 2018,\n \"doi\": \"10.48550/arXiv.1810.04805\"\n },\n \"similarity_score\": 0.94\n },\n {\n \"id\": \"paper_001\", \n \"data\": {\n \"id\": \"paper_001\",\n \"title\": \"Attention Is All You Need\",\n \"abstract\": \"The dominant sequence transduction models...\",\n \"authors\": \"Ashish Vaswani, Noam Shazeer, Niki Parmar\",\n \"year\": 2017,\n \"doi\": \"10.48550/arXiv.1706.03762\"\n },\n \"similarity_score\": 0.81\n }\n ],\n \"total_results\": 5,\n \"search_time_ms\": 31.7,\n \"k\": 5\n}\n```\n\n### \u2764\ufe0f Health Check\n\n```http\nGET /health\n```\n\n**Response (200):**\n```json\n{\n \"status\": \"healthy\",\n \"app_name\": \"SemWare\",\n \"version\": \"0.1.0\", \n \"timestamp\": \"2024-01-15T14:30:25.123456\"\n}\n```\n\n## \ud83e\udde0 Embedding Process\n\nSemWare uses advanced text processing for optimal semantic understanding:\n\n### 1. **Text Tokenization**\n- Long texts are intelligently split into manageable chunks\n- Uses `tiktoken` with `cl100k_base` encoding for precise token counting\n- Default batch size: 2000 tokens with configurable limits\n\n### 2. **Batch Processing**\n- Each text chunk is processed through the embedding model\n- Supports multiple embedding models via Hugging Face transformers\n- Automatic GPU acceleration when available\n\n### 3. **Embedding Aggregation**\n- Multiple batch embeddings are combined using average pooling\n- Preserves semantic meaning across the entire document\n- Results in high-quality 384-dimensional vectors (MiniLM)\n\n### 4. **Normalization & Storage**\n- Final embeddings are L2 normalized for consistent similarity scoring\n- Stored efficiently in LanceDB with optimized vector indexing\n- Enables sub-second search across millions of documents\n\n## \ud83d\udee0\ufe0f Development\n\n### Running Tests\n```bash\n# Run all tests with coverage\nuv run --native-tls pytest --cov=src --cov-report=html\n\n# Run specific test file\nuv run --native-tls pytest tests/test_api/test_search.py -v\n\n# Run with debug output\nuv run --native-tls pytest -s --log-cli-level=DEBUG\n```\n\n### Code Quality\n```bash\n# Format code\nuv run --native-tls ruff format src/ tests/\n\n# Lint and fix issues\nuv run --native-tls ruff check src/ tests/ --fix\n\n# Type checking\nuv run --native-tls mypy src/\n```\n\n### API Documentation\nStart the server with `DEBUG=true` in your `.env` and visit:\n- **Swagger UI**: http://localhost:8000/docs\n- **ReDoc**: http://localhost:8000/redoc\n- **OpenAPI JSON**: http://localhost:8000/openapi.json\n\n## \ud83d\udcc1 Project Structure\n\n```\nSemWare/\n\u251c\u2500\u2500 src/semware/\n\u2502 \u251c\u2500\u2500 api/ # FastAPI route handlers\n\u2502 \u2502 \u251c\u2500\u2500 __init__.py\n\u2502 \u2502 \u251c\u2500\u2500 auth.py # Authentication middleware\n\u2502 \u2502 \u251c\u2500\u2500 data.py # Data CRUD operations\n\u2502 \u2502 \u251c\u2500\u2500 search.py # Search endpoints \n\u2502 \u2502 \u2514\u2500\u2500 tables.py # Table management\n\u2502 \u251c\u2500\u2500 models/ # Pydantic data models\n\u2502 \u2502 \u251c\u2500\u2500 __init__.py\n\u2502 \u2502 \u251c\u2500\u2500 requests.py # Request/response models\n\u2502 \u2502 \u2514\u2500\u2500 schemas.py # Core data schemas\n\u2502 \u251c\u2500\u2500 services/ # Business logic services\n\u2502 \u2502 \u251c\u2500\u2500 __init__.py\n\u2502 \u2502 \u251c\u2500\u2500 embedding.py # ML embedding generation\n\u2502 \u2502 \u251c\u2500\u2500 search.py # Search orchestration\n\u2502 \u2502 \u2514\u2500\u2500 vectordb.py # Vector database operations\n\u2502 \u251c\u2500\u2500 utils/ # Utility functions\n\u2502 \u2502 \u251c\u2500\u2500 __init__.py\n\u2502 \u2502 \u251c\u2500\u2500 logging.py # Logging configuration\n\u2502 \u2502 \u2514\u2500\u2500 tokenizer.py # Text tokenization\n\u2502 \u251c\u2500\u2500 config.py # Configuration management\n\u2502 \u2514\u2500\u2500 main.py # FastAPI application factory\n\u251c\u2500\u2500 tests/ # Comprehensive test suite\n\u2502 \u251c\u2500\u2500 conftest.py # Test configuration & fixtures\n\u2502 \u251c\u2500\u2500 test_api/ # API endpoint tests\n\u2502 \u251c\u2500\u2500 test_services/ # Service layer tests\n\u2502 \u2514\u2500\u2500 test_utils/ # Utility function tests\n\u251c\u2500\u2500 pyproject.toml # Project configuration\n\u251c\u2500\u2500 .env.example # Environment template\n\u2514\u2500\u2500 README.md # This file\n```\n\n## \u2699\ufe0f Configuration Reference\n\n| Variable | Description | Default | Required |\n|----------|-------------|---------|----------|\n| `API_KEY` | Authentication key for all endpoints | - | \u2705 |\n| `DEBUG` | Enable debug mode and API docs | `false` | \u274c |\n| `DB_PATH` | Database storage directory | `./data` | \u274c |\n| `HOST` | Server bind address | `0.0.0.0` | \u274c |\n| `PORT` | Server port | `8000` | \u274c |\n| `LOG_LEVEL` | Logging level (DEBUG/INFO/WARNING/ERROR) | `INFO` | \u274c |\n| `LOG_FILE` | Log file path (optional) | - | \u274c |\n| `EMBEDDING_MODEL_NAME` | Hugging Face model name | `all-MiniLM-L6-v2` | \u274c |\n| `EMBEDDING_DIMENSION` | Embedding vector dimensions | `384` | \u274c |\n| `MAX_TOKENS_PER_BATCH` | Max tokens per embedding batch | `2000` | \u274c |\n| `WORKERS` | Number of server workers | `1` | \u274c |\n\n## \ud83d\udea2 Deployment\n\n### Docker\n```dockerfile\nFROM python:3.11-slim\n\nWORKDIR /app\nCOPY . .\n\nRUN pip install uv\nRUN uv sync --native-tls\n\nEXPOSE 8000\nCMD [\"uv\", \"run\", \"--native-tls\", \"uvicorn\", \"semware.main:app\", \"--host\", \"0.0.0.0\", \"--port\", \"8000\"]\n```\n\n### Production Considerations\n- Use multiple workers: `--workers 4`\n- Enable access logs: `--access-log`\n- Set up reverse proxy (nginx) for HTTPS termination\n- Configure log rotation and monitoring\n- Use a dedicated vector storage solution for large scale\n\n## \ud83e\udd1d Contributing\n\nWe welcome contributions! Please see our [Contributing Guide](CONTRIBUTING.md) for details.\n\n1. Fork the repository\n2. Create a feature branch: `git checkout -b feature/amazing-feature`\n3. Make your changes and add tests\n4. Run the test suite: `uv run --native-tls pytest`\n5. Submit a pull request\n\n## \ud83d\udcca Performance\n\n**Benchmarks** (on Apple M2 Pro, 16GB RAM):\n- **Embedding Generation**: ~200ms per batch (2000 tokens)\n- **Document Insertion**: ~500ms per document (including embedding)\n- **Vector Search**: <50ms for similarity search across 10K documents\n- **Throughput**: ~100 requests/second with 4 workers\n\n## \ud83d\udc1b Troubleshooting\n\n### Common Issues\n\n**Authentication Errors**\n```bash\n# Ensure API key is set correctly\nexport API_KEY=your-secret-key\n# Or check your .env file\n```\n\n**Model Download Issues**\n```bash\n# Clear Hugging Face cache\nrm -rf ~/.cache/huggingface/\n# Restart with debug logging\nDEBUG=true uv run --native-tls python -m semware.main\n```\n\n**Database Permissions**\n```bash\n# Ensure write permissions to data directory\nchmod 755 ./data\n```\n\n## \ud83d\udcc4 License\n\nThis project is licensed under the MIT License - see the [LICENSE](LICENSE) file for details.\n\n## \ud83d\ude4f Acknowledgments\n\n- **FastAPI** for the excellent async web framework\n- **LanceDB** for high-performance vector storage\n- **Hugging Face** for the transformer models and ecosystem\n- **Pydantic** for robust data validation\n- **The Python Community** for the amazing open-source ecosystem\n\n---\n\n<p align=\"center\">\n <strong>Built with \u2764\ufe0f by the SemWare team</strong>\n</p>\n\n<p align=\"center\">\n <a href=\"https://github.com/semware/semware/issues\">Report Bug</a> \u2022\n <a href=\"https://github.com/semware/semware/discussions\">Discussions</a> \u2022\n <a href=\"https://github.com/semware/semware/wiki\">Wiki</a>\n</p>",
"bugtrack_url": null,
"license": "MIT",
"summary": "Semantic search API server using vector databases and ML embeddings",
"version": "0.1.0",
"project_urls": null,
"split_keywords": [
"api",
" embeddings",
" machine-learning",
" semantic-search",
" vector-database"
],
"urls": [
{
"comment_text": null,
"digests": {
"blake2b_256": "4439860f21c0724ca08ffa9dc258a3d92db7529497e0755a8ea846c3f848b53a",
"md5": "b686de875291ac586b5c238ff11981d7",
"sha256": "9322ccfa2c6b5f1bd808dda6f3dcb5a1a82115c482d2e02b28653e688374de67"
},
"downloads": -1,
"filename": "semware-0.1.0-py3-none-any.whl",
"has_sig": false,
"md5_digest": "b686de875291ac586b5c238ff11981d7",
"packagetype": "bdist_wheel",
"python_version": "py3",
"requires_python": ">=3.11",
"size": 27974,
"upload_time": "2025-09-10T03:14:45",
"upload_time_iso_8601": "2025-09-10T03:14:45.112356Z",
"url": "https://files.pythonhosted.org/packages/44/39/860f21c0724ca08ffa9dc258a3d92db7529497e0755a8ea846c3f848b53a/semware-0.1.0-py3-none-any.whl",
"yanked": false,
"yanked_reason": null
},
{
"comment_text": null,
"digests": {
"blake2b_256": "8a02c6394e787bf1bb03319e324567ac107c6550550fc6c0febdffc0aa6bb7a2",
"md5": "a87b340ee795e67d0cd7917c505d1d6d",
"sha256": "8ce0a48cb8a9395fc2a35c863f7141a371dd0e11f1cf2cf54ef615b02dbe0f36"
},
"downloads": -1,
"filename": "semware-0.1.0.tar.gz",
"has_sig": false,
"md5_digest": "a87b340ee795e67d0cd7917c505d1d6d",
"packagetype": "sdist",
"python_version": "source",
"requires_python": ">=3.11",
"size": 221366,
"upload_time": "2025-09-10T03:14:47",
"upload_time_iso_8601": "2025-09-10T03:14:47.037358Z",
"url": "https://files.pythonhosted.org/packages/8a/02/c6394e787bf1bb03319e324567ac107c6550550fc6c0febdffc0aa6bb7a2/semware-0.1.0.tar.gz",
"yanked": false,
"yanked_reason": null
}
],
"upload_time": "2025-09-10 03:14:47",
"github": false,
"gitlab": false,
"bitbucket": false,
"codeberg": false,
"lcname": "semware"
}