- **Name**: eless
- **Version**: 1.0.3
- **Summary**: Evolving Low-resource Embedding and Storage System - A resilient RAG data processing pipeline with comprehensive logging, multi-database support, and CLI interface.
- **Upload time**: 2025-10-27 10:50:19
- **Author**: ELESS Team
- **Requires Python**: >=3.8
- **License**: MIT
- **Keywords**: rag, embedding, vector-database, nlp, ai, document-processing
- **Requirements**: click, PyYAML, numpy, psutil, sentence-transformers, torch, chromadb, langchain-community, langchain-core, qdrant-client, faiss-cpu, psycopg2-binary, cassandra-driver, pypdf, python-docx, openpyxl, pandas, beautifulsoup4, lxml

# ELESS - Evolving Low-resource Embedding and Storage System

[![Python 3.8+](https://img.shields.io/badge/python-3.8+-blue.svg)](https://www.python.org/downloads/)
[![License: MIT](https://img.shields.io/badge/License-MIT-yellow.svg)](https://opensource.org/licenses/MIT)
[![Tests](https://img.shields.io/badge/tests-56%20passing-brightgreen.svg)](https://github.com/Bandalaro/eless)
[![Code style: black](https://img.shields.io/badge/code%20style-black-000000.svg)](https://github.com/psf/black)

A resilient RAG (Retrieval-Augmented Generation) data processing pipeline with comprehensive logging, multi-database support, and an intuitive CLI interface. Built for efficiency on low-resource systems while maintaining production-grade reliability.

## Features

- **Multi-Database Support**: ChromaDB, Qdrant, FAISS, PostgreSQL, Cassandra (install extras for full support)
- **Multiple File Formats**: PDF, DOCX, TXT, MD, HTML, and more (install parsers extra)
- **Resumable Processing**: Checkpoint-based system for interrupted workflows
- **Comprehensive Logging**: Structured logs with rotation and performance tracking
- **Smart Caching**: Content-based hashing and atomic manifest writes
- **Flexible Embeddings**: Support for various sentence-transformers models (install embeddings extra)
- **Memory Efficient**: Streaming processing for large files
- **Production Ready**: Graceful error handling and data safety features
- **CLI Interface**: Easy-to-use command-line tools
- **Modular Design**: Extensible architecture for custom parsers and databases

**Note**: ELESS gracefully handles missing optional dependencies with warnings. Install extras for full features. For Qdrant and PostgreSQL, ensure the database instances are running on the specified ports (default: Qdrant 6333, PostgreSQL 5432).
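Before a run, it can help to confirm which optional packages are importable and whether a service is listening on the default ports named above. The following is a minimal sketch using only the standard library; it is not part of ELESS, and the helper names (`available_backends`, `port_open`) and the `localhost` host are assumptions made for illustration.

```python
import importlib.util
import socket

# Import names of the optional packages listed in this README.
OPTIONAL_MODULES = {
    "embeddings": "sentence_transformers",
    "chroma": "chromadb",
    "qdrant": "qdrant_client",
    "faiss": "faiss",
    "postgresql": "psycopg2",
    "cassandra": "cassandra",
}


def available_backends(modules=OPTIONAL_MODULES):
    """Map each feature name to True if its package can be imported."""
    return {
        feature: importlib.util.find_spec(module) is not None
        for feature, module in modules.items()
    }


def port_open(host, port, timeout=1.0):
    """Return True if a TCP connection to host:port succeeds."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


if __name__ == "__main__":
    for feature, ok in available_backends().items():
        print(f"{feature}: {'installed' if ok else 'missing'}")
    # Default ports from the note above; the host is an assumption.
    for name, port in (("qdrant", 6333), ("postgresql", 5432)):
        state = "up" if port_open("localhost", port) else "down"
        print(f"{name} on localhost:{port}: {state}")
```

`find_spec` only locates a package without importing it, so the check stays fast even when heavy packages such as torch are installed.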

## Project Structure

For contributors, the project is organized as follows:

- `src/`: Source code for the ELESS package
- `tests/`: Unit and integration tests
- `docs/`: Documentation, including user guides, API reference, and contributing guidelines
- `tools/`: Utility scripts for deployment, packaging, and verification
- `config/`: Configuration files and templates
- `build/`: Build configuration files (setup.py, pyproject.toml, etc.)

## Quick Start

### Installation

```bash
# Install from source
git clone https://github.com/Bandalaro/eless.git
cd eless
pip install -e .

# Install with all features (recommended for full functionality)
pip install -e ".[full]"

# Or install specific extras
pip install -e ".[embeddings,databases,parsers]"
```

### Basic Usage

```bash
# Process documents with default settings
eless process /path/to/documents

# Process with specific database
eless process /path/to/documents --databases chroma

# Process with custom settings
eless process /path/to/documents --chunk-size 1000 --log-level DEBUG

# Check processing status
eless status --all

# Resume interrupted processing
eless process /path/to/documents --resume
```

### Python API

```python
from eless import ElessPipeline
import yaml

# Load configuration
with open("config/default_config.yaml") as f:
    config = yaml.safe_load(f)

# Create and run pipeline
pipeline = ElessPipeline(config)
pipeline.run_process("/path/to/documents")

# Check status
files = pipeline.state_manager.get_all_files()
for file in files:
    print(f"{file['path']}: {file['status']}")
```
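Since the pipeline above is constructed from a plain dictionary, individual settings can be overridden in Python before the dictionary is passed in. The sketch below shows one way to do that with a recursive merge; `merge_config` is a hypothetical helper, not an ELESS function, and the keys are taken from the Configuration section of this README.

```python
import copy


def merge_config(base, overrides):
    """Return a new dict with `overrides` merged recursively into `base`."""
    merged = copy.deepcopy(base)
    for key, value in overrides.items():
        if isinstance(value, dict) and isinstance(merged.get(key), dict):
            # Both sides are mappings: merge them key by key.
            merged[key] = merge_config(merged[key], value)
        else:
            # Scalars, lists, and new keys replace the base value.
            merged[key] = copy.deepcopy(value)
    return merged


# Keys below mirror the Configuration section of this README.
defaults = {
    "embedding": {"model_name": "all-MiniLM-L6-v2", "device": "cpu", "batch_size": 32},
    "chunking": {"chunk_size": 500, "overlap": 50},
}
config = merge_config(defaults, {"embedding": {"batch_size": 8}})
print(config["embedding"])
```

The merged dictionary could then be passed to `ElessPipeline(config)` in place of the dictionary loaded from YAML. The defaults are left untouched because the merge works on a deep copy.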

## 📋 Requirements

### Core Dependencies
- Python 3.8+
- click >= 8.0.0
- PyYAML >= 6.0
- numpy >= 1.21.0
- psutil >= 5.8.0

### Optional Dependencies

**Embeddings:**
```bash
pip install sentence-transformers torch
```

**Databases:**
```bash
# ChromaDB
pip install chromadb langchain-community langchain-core

# Qdrant
pip install qdrant-client

# FAISS
pip install faiss-cpu  # or faiss-gpu for GPU support

# PostgreSQL
pip install psycopg2-binary

# Cassandra
pip install cassandra-driver
```

**Document Parsers:**
```bash
pip install pypdf python-docx openpyxl pandas beautifulsoup4 lxml
```

**All Features:**
```bash
pip install -e ".[full]"
```

## 🏗️ Architecture

```
┌─────────────────────────────────────────┐
│         CLI Interface (Click)           │
├─────────────────────────────────────────┤
│      ElessPipeline (Orchestrator)       │
├──────────┬──────────┬───────────────────┤
│ Scanner  │Dispatcher│  State Manager    │
├──────────┼──────────┼───────────────────┤
│ Parsers  │ Chunker  │  Archiver         │
├──────────┴──────────┼───────────────────┤
│      Embedder       │  Resource Monitor │
├─────────────────────┼───────────────────┤
│  Database Loader    │  Logging System   │
└─────────────────────┴───────────────────┘
```

### Key Components

- **FileScanner**: Discovers and hashes files using SHA-256
- **Dispatcher**: Routes files to appropriate parsers
- **TextChunker**: Intelligent text segmentation with overlap
- **Embedder**: Generates vector embeddings with caching
- **DatabaseLoader**: Multi-database coordination
- **StateManager**: Tracks processing state with atomic writes
- **ResourceMonitor**: Adaptive resource management
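Two of the techniques named above, SHA-256 content hashing and atomic state writes, can be sketched with the standard library alone. This is an illustration of the general approach, not the ELESS implementation; the function names and the JSON manifest layout are assumptions.

```python
import hashlib
import json
import os
import tempfile


def sha256_of_file(path, block_size=1 << 16):
    """Hash a file in fixed-size blocks so large files are never fully loaded."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for block in iter(lambda: handle.read(block_size), b""):
            digest.update(block)
    return digest.hexdigest()


def write_json_atomically(path, data):
    """Write JSON to a temp file in the same directory, then rename over `path`."""
    directory = os.path.dirname(os.path.abspath(path))
    fd, tmp_path = tempfile.mkstemp(dir=directory, suffix=".tmp")
    try:
        with os.fdopen(fd, "w", encoding="utf-8") as handle:
            json.dump(data, handle, indent=2)
            handle.flush()
            os.fsync(handle.fileno())  # push the bytes to disk before renaming
        # The rename replaces the target in one step, so readers never see a partial file.
        os.replace(tmp_path, path)
    except BaseException:
        if os.path.exists(tmp_path):
            os.remove(tmp_path)
        raise
```

Keying the manifest by content hash rather than by file name means a renamed but unchanged file is recognized as already processed. Writing the temporary file in the target directory keeps both paths on the same filesystem, which the rename relies on.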

## 🎛️ Configuration

Create a `config.yaml` file or modify `config/default_config.yaml`:

```yaml
# Logging
logging:
  directory: .eless_logs
  level: INFO
  enable_console: true

# Embedding
embedding:
  model_name: all-MiniLM-L6-v2
  device: cpu
  batch_size: 32

# Chunking
chunking:
  chunk_size: 500
  overlap: 50
  strategy: semantic

# Databases
databases:
  targets:
    - chroma
  connections:
    chroma:
      type: chroma
      path: .eless_chroma
      collection_name: eless_vectors

# Resource Limits
resource_limits:
  max_memory_mb: 512
  enable_adaptive_batching: true

# Streaming
streaming:
  buffer_size: 8192
  max_file_size_mb: 100
  auto_streaming_threshold: 0.7
```
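To make the `chunk_size` and `overlap` settings concrete, here is a minimal sketch of fixed-window chunking with overlap. It assumes both values are measured in characters and does not attempt the `semantic` strategy named in the configuration; `chunk_text` is a hypothetical helper, not the ELESS `TextChunker`.

```python
def chunk_text(text, chunk_size=500, overlap=50):
    """Split text into windows of `chunk_size` characters sharing `overlap` characters."""
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    if not 0 <= overlap < chunk_size:
        raise ValueError("overlap must satisfy 0 <= overlap < chunk_size")
    step = chunk_size - overlap  # how far each window advances
    chunks = []
    for start in range(0, len(text), step):
        chunks.append(text[start:start + chunk_size])
        if start + chunk_size >= len(text):
            break  # this window already reached the end of the text
    return chunks


# With the defaults above, a 1200-character text yields windows
# starting at offsets 0, 450, and 900.
sizes = [len(chunk) for chunk in chunk_text("x" * 1200)]
print(sizes)
```

Each window repeats the last 50 characters of the previous one, so a sentence that straddles a boundary still appears whole in at least one chunk when it is shorter than the overlap.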

## Documentation

- **[Quick Start Guide](docs/QUICK_START.md)** - Get started in 5 minutes
- **[API Reference](docs/API_REFERENCE.md)** - Complete API documentation
- **[Developer Guide](docs/DEVELOPER_GUIDE.md)** - Contributing and development
- **[Documentation Index](docs/README.md)** - All documentation

## Use Cases

### Document Processing Pipeline
```bash
# Process research papers
eless process papers/ \
  --databases chroma \
  --chunk-size 1000 \
  --log-level INFO
```

### RAG System Setup
```bash
# Index documentation
eless process docs/ \
  --databases qdrant \
  --databases faiss

# Query your RAG application
python query_rag.py "machine learning techniques"
```
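`query_rag.py` above stands for your own application script; it is not shipped with ELESS. As a rough illustration of the retrieval step such a script performs, the sketch below ranks stored chunks by cosine similarity to a query vector. It uses toy three-dimensional vectors in place of real embeddings and plain Python in place of a vector database, and all names in it are hypothetical.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def top_k(query_vector, indexed_chunks, k=3):
    """Return the k (score, text) pairs most similar to the query."""
    scored = [(cosine_similarity(query_vector, vec), text) for text, vec in indexed_chunks]
    return sorted(scored, key=lambda pair: pair[0], reverse=True)[:k]


# Toy 3-dimensional vectors stand in for real embeddings.
chunks = [
    ("gradient descent", [0.9, 0.1, 0.0]),
    ("baking bread", [0.0, 0.2, 0.9]),
    ("neural networks", [0.8, 0.3, 0.1]),
]
for score, text in top_k([1.0, 0.2, 0.0], chunks, k=2):
    print(f"{score:.3f}  {text}")
```

In a real application the query would be embedded with the same model used at indexing time, and the search would be delegated to the database client.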

### Batch Processing
```bash
# Process multiple directories
for dir in dataset1 dataset2 dataset3; do
  eless process "$dir" --databases chroma --resume
done
```

## CLI Commands

### Process Documents
```bash
eless process <path> [OPTIONS]

Options:
  --databases, -db <name>    Select databases (repeatable)
  --config <file>            Custom configuration file
  --resume                   Resume interrupted processing
  --chunk-size <size>        Override chunk size
  --batch-size <size>        Override batch size
  --log-level <level>        Set log level
  --log-dir <path>           Custom log directory
```
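When the CLI is driven from a script, it is often safer to build the argument list in code than to concatenate strings. The sketch below assembles an `eless process` command from a subset of the options listed above; `build_process_command` is a hypothetical helper written for this example.

```python
import shlex


def build_process_command(path, databases=(), resume=False, chunk_size=None, log_level=None):
    """Build the argv list for `eless process` from the options documented above."""
    argv = ["eless", "process", str(path)]
    for name in databases:
        argv += ["--databases", name]  # the flag is repeatable
    if resume:
        argv.append("--resume")
    if chunk_size is not None:
        argv += ["--chunk-size", str(chunk_size)]
    if log_level is not None:
        argv += ["--log-level", log_level]
    return argv


command = build_process_command("docs/", databases=["qdrant", "faiss"], resume=True)
print(shlex.join(command))
# To execute it: subprocess.run(command, check=True)
```

Passing the list to `subprocess.run` avoids shell quoting problems with paths that contain spaces.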

### Check Status
```bash
eless status [OPTIONS]

Options:
  --all                      Show all tracked files
  <file_id>                  Show specific file details
```

### System Management
```bash
eless config-info          # Display configuration
eless test                 # Run system tests
eless logs [--days N]      # Manage log files
```

## Testing

```bash
# Run all tests
pytest tests/

# Run specific test suite
pytest tests/test_cli.py -v

# Run with coverage
pytest tests/ --cov=src --cov-report=html

# Test results: 56/56 passing ✅
```

## Contributing

We welcome contributions! Please see [docs/CONTRIBUTING.md](docs/CONTRIBUTING.md) for guidelines.

### Development Setup

```bash
# Clone and setup
git clone https://github.com/Bandalaro/eless.git
cd eless
python3 -m venv venv
source venv/bin/activate

# Install development dependencies
pip install -e ".[dev,full]"

# Run tests
pytest tests/

# Format code
black src/ tests/

# Check linting
flake8 src/ tests/
```

## Performance

### Optimized for Low-Resource Systems
```yaml
resource_limits:
  max_memory_mb: 256
  enable_adaptive_batching: true

embedding:
  batch_size: 8

streaming:
  auto_streaming_threshold: 0.5
```
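The idea behind `enable_adaptive_batching` can be illustrated with a small function that shrinks the embedding batch as memory use approaches `max_memory_mb`. This is one possible policy under stated assumptions, not the rule ELESS applies; the function name and the linear scaling are choices made for this sketch.

```python
def adaptive_batch_size(base_batch_size, used_memory_mb, max_memory_mb, min_batch_size=1):
    """Scale the batch size by the fraction of the memory budget still free."""
    if max_memory_mb <= 0:
        raise ValueError("max_memory_mb must be positive")
    free_fraction = max(0.0, 1.0 - used_memory_mb / max_memory_mb)
    return max(min_batch_size, int(base_batch_size * free_fraction))


# With the low-resource settings above (batch_size 8, max_memory_mb 256):
for used in (0, 128, 300):
    print(used, adaptive_batch_size(8, used, 256))
# `used_memory_mb` could come from psutil, for example
# psutil.Process().memory_info().rss / (1024 * 1024).
```

The floor of one item per batch keeps the pipeline moving even when the budget is exceeded, at the cost of throughput.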

### High-Performance Configuration
```yaml
resource_limits:
  max_memory_mb: 4096

embedding:
  batch_size: 128
  device: cuda

parallel:
  enable: true
  max_workers: 8
```

## Troubleshooting

### Common Issues

**Missing Dependencies:**
```bash
# Install embedding support
pip install sentence-transformers

# Install database support
pip install chromadb langchain-community
```

**Memory Issues:**
```yaml
# Reduce memory usage
embedding:
  batch_size: 8
streaming:
  auto_streaming_threshold: 0.5
```

**Slow Processing:**
```yaml
# Increase performance
embedding:
  batch_size: 64
parallel:
  enable: true
  max_workers: 4
```

See [docs/QUICK_START.md](docs/QUICK_START.md#troubleshooting) for more solutions.

## Project Status

- **56/56 tests passing**
- **Zero warnings**
- **Production ready**
- **Comprehensive documentation**
- **Active development**

## Roadmap

- [x] PyPI publication
- [ ] Additional database connectors (Milvus, Weaviate)
- [ ] Web interface
- [ ] Docker support
- [ ] Distributed processing
- [ ] Advanced query capabilities

## License

This project is licensed under the MIT License - see the [docs/LICENSE](docs/LICENSE) file for details.

## Acknowledgments

- Built with [sentence-transformers](https://www.sbert.net/)
- Supports [ChromaDB](https://www.trychroma.com/), [Qdrant](https://qdrant.tech/), and more
- Powered by the Python ecosystem

## Support

- **Issues**: [GitHub Issues](https://github.com/Bandalaro/eless/issues)
- **Discussions**: [GitHub Discussions](https://github.com/Bandalaro/eless/discussions)
- **Documentation**: [docs/](docs/)

## Star History

If you find ELESS useful, please consider giving it a star on GitHub!

---

**Made with love by [Bandalaro](https://github.com/Bandalaro)**

**Status: Production Ready** | **Version: 1.0.3**

            

Raw data

            {
    "_id": null,
    "home_page": null,
    "name": "eless",
    "maintainer": null,
    "docs_url": null,
    "requires_python": ">=3.8",
    "maintainer_email": "Bandalaro <bandalaro@users.noreply.github.com>",
    "keywords": "rag, embedding, vector-database, nlp, ai, document-processing",
    "author": "ELESS Team",
    "author_email": "Bandalaro <bandalaro@users.noreply.github.com>",
    "download_url": "https://files.pythonhosted.org/packages/ee/52/f2da28b65bbdb1c7cebb469c0343a48975d4cd8873908f789424ba9b81ad/eless-1.0.3.tar.gz",
    "platform": null,
    "bugtrack_url": null,
    "license": "MIT",
    "summary": "Evolving Low-resource Embedding and Storage System - A resilient RAG data processing pipeline with comprehensive logging, multi-database support, and CLI interface.",
    "version": "1.0.3",
    "project_urls": {
        "Bug Tracker": "https://github.com/Bandalaro/eless/issues",
        "Changelog": "https://github.com/Bandalaro/eless/blob/main/CHANGELOG.md",
        "Documentation": "https://github.com/Bandalaro/eless/tree/main/docs",
        "Homepage": "https://github.com/Bandalaro/eless",
        "Repository": "https://github.com/Bandalaro/eless.git",
        "Source Code": "https://github.com/Bandalaro/eless"
    },
    "split_keywords": [
        "rag",
        " embedding",
        " vector-database",
        " nlp",
        " ai",
        " document-processing"
    ],
    "urls": [
        {
            "comment_text": null,
            "digests": {
                "blake2b_256": "30e291d0f7fc76451af5ae7fa23e8ab5d29bff0d93f7b7a8125f1f13b7ffdd45",
                "md5": "d69f6aea163862b3edd00e3f33e2894f",
                "sha256": "776969e31fe33c536dcd0c79b80d8888a9a640fe64e0aaed8ee2404eed9e52bd"
            },
            "downloads": -1,
            "filename": "eless-1.0.3-py3-none-any.whl",
            "has_sig": false,
            "md5_digest": "d69f6aea163862b3edd00e3f33e2894f",
            "packagetype": "bdist_wheel",
            "python_version": "py3",
            "requires_python": ">=3.8",
            "size": 103959,
            "upload_time": "2025-10-27T10:50:17",
            "upload_time_iso_8601": "2025-10-27T10:50:17.564809Z",
            "url": "https://files.pythonhosted.org/packages/30/e2/91d0f7fc76451af5ae7fa23e8ab5d29bff0d93f7b7a8125f1f13b7ffdd45/eless-1.0.3-py3-none-any.whl",
            "yanked": false,
            "yanked_reason": null
        },
        {
            "comment_text": null,
            "digests": {
                "blake2b_256": "ee52f2da28b65bbdb1c7cebb469c0343a48975d4cd8873908f789424ba9b81ad",
                "md5": "93ad9e0eca7f216a0e60a61698cf372a",
                "sha256": "09bbd93260cd12dd8f9952633e1a6a1d08addcf55b01f6497ec3908e2a9db324"
            },
            "downloads": -1,
            "filename": "eless-1.0.3.tar.gz",
            "has_sig": false,
            "md5_digest": "93ad9e0eca7f216a0e60a61698cf372a",
            "packagetype": "sdist",
            "python_version": "source",
            "requires_python": ">=3.8",
            "size": 91694,
            "upload_time": "2025-10-27T10:50:19",
            "upload_time_iso_8601": "2025-10-27T10:50:19.578610Z",
            "url": "https://files.pythonhosted.org/packages/ee/52/f2da28b65bbdb1c7cebb469c0343a48975d4cd8873908f789424ba9b81ad/eless-1.0.3.tar.gz",
            "yanked": false,
            "yanked_reason": null
        }
    ],
    "upload_time": "2025-10-27 10:50:19",
    "github": true,
    "gitlab": false,
    "bitbucket": false,
    "codeberg": false,
    "github_user": "Bandalaro",
    "github_project": "eless",
    "travis_ci": false,
    "coveralls": false,
    "github_actions": false,
    "requirements": [
        {
            "name": "click",
            "specs": [
                [
                    ">=",
                    "8.0.0"
                ]
            ]
        },
        {
            "name": "PyYAML",
            "specs": [
                [
                    ">=",
                    "6.0"
                ]
            ]
        },
        {
            "name": "numpy",
            "specs": [
                [
                    ">=",
                    "1.21.0"
                ]
            ]
        },
        {
            "name": "psutil",
            "specs": [
                [
                    ">=",
                    "5.8.0"
                ]
            ]
        },
        {
            "name": "sentence-transformers",
            "specs": [
                [
                    ">=",
                    "2.2.0"
                ]
            ]
        },
        {
            "name": "torch",
            "specs": [
                [
                    ">=",
                    "1.9.0"
                ]
            ]
        },
        {
            "name": "chromadb",
            "specs": [
                [
                    ">=",
                    "0.4.0"
                ]
            ]
        },
        {
            "name": "langchain-community",
            "specs": [
                [
                    ">=",
                    "0.0.10"
                ]
            ]
        },
        {
            "name": "langchain-core",
            "specs": [
                [
                    ">=",
                    "0.1.0"
                ]
            ]
        },
        {
            "name": "qdrant-client",
            "specs": [
                [
                    ">=",
                    "1.6.0"
                ]
            ]
        },
        {
            "name": "faiss-cpu",
            "specs": [
                [
                    ">=",
                    "1.7.0"
                ]
            ]
        },
        {
            "name": "psycopg2-binary",
            "specs": [
                [
                    ">=",
                    "2.9.0"
                ]
            ]
        },
        {
            "name": "cassandra-driver",
            "specs": [
                [
                    ">=",
                    "3.28.0"
                ]
            ]
        },
        {
            "name": "pypdf",
            "specs": [
                [
                    ">=",
                    "3.17.0"
                ]
            ]
        },
        {
            "name": "python-docx",
            "specs": [
                [
                    ">=",
                    "0.8.11"
                ]
            ]
        },
        {
            "name": "openpyxl",
            "specs": [
                [
                    ">=",
                    "3.1.0"
                ]
            ]
        },
        {
            "name": "pandas",
            "specs": [
                [
                    ">=",
                    "1.5.0"
                ]
            ]
        },
        {
            "name": "beautifulsoup4",
            "specs": [
                [
                    ">=",
                    "4.11.0"
                ]
            ]
        },
        {
            "name": "lxml",
            "specs": [
                [
                    ">=",
                    "4.9.0"
                ]
            ]
        }
    ],
    "lcname": "eless"
}
        