evolvishub-outlook-ingestor

- **Version:** 1.0.2
- **Uploaded:** 2025-10-06 22:41:12
- **Requires Python:** >=3.9
- **License:** Evolvis AI License
- **Summary:** Production-ready, secure email ingestion system for Microsoft Outlook with advanced processing, monitoring, and database integration
- **Keywords:** outlook, email, ingestion, exchange, graph-api, imap, pop3, database, async, batch-processing, security, monitoring, performance, postgresql, mongodb, enterprise
            <div align="center">
  <img src="https://evolvis.ai/wp-content/uploads/2025/08/evie-solutions-03.png" alt="Evolvis AI - Evie Solutions Logo" width="400">
</div>

# Evolvishub Outlook Ingestor

**Production-ready, secure email ingestion system for Microsoft Outlook with advanced processing, monitoring, and hybrid storage capabilities.**

A comprehensive Python library for ingesting, processing, and storing email data from Microsoft Outlook and Exchange systems. Built with enterprise-grade security, performance, and scalability in mind, featuring intelligent hybrid storage architecture for optimal cost and performance.

## Download Statistics

[![PyPI Downloads](https://pepy.tech/badge/evolvishub-outlook-ingestor/month)](https://pepy.tech/project/evolvishub-outlook-ingestor)
[![Total Downloads](https://pepy.tech/badge/evolvishub-outlook-ingestor)](https://pepy.tech/project/evolvishub-outlook-ingestor)
[![PyPI Version](https://img.shields.io/pypi/v/evolvishub-outlook-ingestor)](https://pypi.org/project/evolvishub-outlook-ingestor/)
[![Python Versions](https://img.shields.io/pypi/pyversions/evolvishub-outlook-ingestor)](https://pypi.org/project/evolvishub-outlook-ingestor/)
[![License](https://img.shields.io/pypi/l/evolvishub-outlook-ingestor)](LICENSE)
[![Code Style](https://img.shields.io/badge/code%20style-black-000000.svg)](https://github.com/psf/black)
[![Type Hints](https://img.shields.io/badge/type%20hints-yes-brightgreen.svg)](https://mypy.readthedocs.io/)

## Table of Contents

- [Features](#features)
- [Architecture](#architecture)
- [About Evolvis AI](#about-evolvis-ai)
- [Installation](#installation)
- [Quick Start](#quick-start)
- [Hybrid Storage Configuration](#hybrid-storage-configuration)
- [Configuration](#configuration)
- [Performance](#performance)
- [Advanced Usage](#advanced-usage)
- [Support and Documentation](#support-and-documentation)
- [Technical Specifications](#technical-specifications)
- [Acknowledgments](#acknowledgments)
- [License](#license)

## Features

### Protocol Support
- **Microsoft Graph API** - Modern OAuth2-based access to Office 365 and Exchange Online
- **Exchange Web Services (EWS)** - Enterprise-grade access to on-premises Exchange servers
- **IMAP/POP3** - Universal email protocol support for legacy systems and third-party providers

### Database Integration
- **PostgreSQL** - High-performance relational database with advanced indexing and async support
- **MongoDB** - Scalable NoSQL document storage for flexible email data structures
- **MySQL** - Reliable relational database support for existing infrastructure
- **SQLite** - Lightweight file-based database for development, testing, and small deployments
- **Microsoft SQL Server** - Enterprise database for Windows-centric environments with advanced features
- **MariaDB** - Open-source MySQL alternative with enhanced performance and features
- **Oracle Database** - Enterprise-grade database for mission-critical applications
- **CockroachDB** - Distributed, cloud-native database for global scale and resilience
- **ClickHouse** - High-performance columnar database for analytics and real-time queries

### Data Lake Integration
- **Delta Lake** - Apache Spark-based ACID transactional storage layer with time travel capabilities
- **Apache Iceberg** - Open table format for large-scale analytics with schema evolution support
- **Hybrid Analytics** - Seamless integration between operational databases and analytical data lakes

### Hybrid Storage Architecture
- **MinIO** - Self-hosted S3-compatible storage for on-premises control and high performance
- **AWS S3** - Enterprise cloud storage with global CDN, lifecycle policies, and encryption
- **Azure Blob Storage** - Microsoft ecosystem integration with hot/cool/archive storage tiers
- **Google Cloud Storage** - Global infrastructure with ML integration and advanced analytics
- **Intelligent Routing** - Size-based and content-type-based storage decisions with configurable rules
- **Content Deduplication** - SHA256-based deduplication to eliminate duplicate attachments
- **Automatic Compression** - GZIP/ZLIB compression for text-based attachments
- **Secure Access** - Pre-signed URLs with configurable expiration for secure attachment access
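
The deduplication step above can be sketched in a few lines: content is hashed with SHA256, and a previously seen digest short-circuits the store. This is an illustrative sketch, not the library's API; the `DedupStore` class is hypothetical.

```python
import hashlib


class DedupStore:
    """Illustrative content-addressed store: one copy per unique SHA256 digest."""

    def __init__(self):
        self._blobs = {}  # digest -> content

    def put(self, content: bytes) -> tuple[str, bool]:
        """Store content; return (digest, was_new)."""
        digest = hashlib.sha256(content).hexdigest()
        if digest in self._blobs:
            return digest, False  # duplicate: reuse the existing copy
        self._blobs[digest] = content
        return digest, True


store = DedupStore()
key1, new1 = store.put(b"quarterly-report.pdf bytes")
key2, new2 = store.put(b"quarterly-report.pdf bytes")  # same attachment again
```

Because the key is derived from the content itself, the second `put` of an identical attachment returns the same digest without storing a second copy.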

### Performance & Scalability
- **Async/Await Architecture** - Non-blocking operations for maximum throughput (1000+ emails/minute)
- **Hybrid Storage Strategy** - Intelligent routing between database and object storage
- **Batch Processing** - Efficient handling of large email volumes with concurrent workers
- **Connection Pooling** - Optimized database connections for enterprise workloads
- **Memory Optimization** - Smart caching and resource management for large datasets
- **Multi-tier Storage** - Automatic lifecycle management between hot/warm/cold storage

### Enterprise Security
- **Credential Encryption** - Fernet symmetric encryption for sensitive data storage
- **Input Sanitization** - Protection against SQL injection, XSS, and other attacks
- **Secure Configuration** - Environment variable-based configuration with validation
- **Audit Logging** - Complete audit trail without sensitive data exposure
- **Access Control** - IAM-based permissions and secure URL generation
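
As an illustration of the sanitization layer (a stdlib-only sketch; the library's own validators are not shown here), untrusted values are escaped before rendering and bound as query parameters rather than interpolated into SQL:

```python
import html
import re
import sqlite3

# Simplistic address check for illustration; real validation is stricter.
EMAIL_RE = re.compile(r"^[^@\s]+@[^@\s]+\.[^@\s]+$")


def sanitize_for_html(value: str) -> str:
    """Escape untrusted text before embedding it in HTML output."""
    return html.escape(value)


def store_sender(conn: sqlite3.Connection, sender: str) -> None:
    """Validate, then use a parameterized query -- never string formatting."""
    if not EMAIL_RE.match(sender):
        raise ValueError(f"invalid sender address: {sender!r}")
    conn.execute("INSERT INTO senders (address) VALUES (?)", (sender,))


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE senders (address TEXT)")
store_sender(conn, "alice@example.com")
escaped = sanitize_for_html("<script>alert(1)</script>")
```

The parameterized `?` placeholder keeps attacker-controlled addresses out of the SQL text entirely, which is the core of the injection defense.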

### Developer Experience
- **Type Safety** - Full type hints and IDE support for enhanced development experience
- **Comprehensive Testing** - 80%+ test coverage with unit, integration, and performance tests
- **Extensive Documentation** - Complete API reference with examples and best practices
- **Configuration-Based Setup** - Flexible YAML/JSON configuration with validation
- **Error Handling** - Comprehensive exception hierarchy with automatic retry logic

## Architecture

### System Overview

```mermaid
graph TB
    subgraph "Email Sources"
        A[Microsoft Graph API]
        B[Exchange Web Services]
        C[IMAP/POP3]
    end

    subgraph "Evolvishub Outlook Ingestor"
        D[Protocol Adapters]
        E[Enhanced Attachment Processor]
        F[Email Processor]
        G[Security Layer]
    end

    subgraph "Storage Layer"
        H[Database Storage]
        I[Object Storage]
    end

    subgraph "Database Backends"
        J[PostgreSQL]
        K[MongoDB]
        L[MySQL]
    end

    subgraph "Object Storage Backends"
        M[MinIO]
        N[AWS S3]
        O[Azure Blob]
        P[Google Cloud Storage]
    end

    A --> D
    B --> D
    C --> D
    D --> F
    D --> E
    F --> G
    E --> G
    G --> H
    G --> I
    H --> J
    H --> K
    H --> L
    I --> M
    I --> N
    I --> O
    I --> P
```

### Hybrid Storage Strategy

```mermaid
sequenceDiagram
    participant E as Email Processor
    participant AP as Attachment Processor
    participant DB as Database
    participant OS as Object Storage

    E->>AP: Process Email with Attachments
    AP->>AP: Evaluate Storage Rules

    alt Small Attachment (<1MB)
        AP->>DB: Store in Database
        DB-->>AP: Confirmation
    else Medium Attachment (1-5MB)
        AP->>OS: Store Content
        OS-->>AP: Storage Key
        AP->>DB: Store Metadata + Reference
        DB-->>AP: Confirmation
    else Large Attachment (>5MB)
        AP->>OS: Store Content Only
        OS-->>AP: Storage Key + Metadata
    end

    AP-->>E: Processing Complete
```

### Data Flow Architecture

```mermaid
flowchart LR
    subgraph "Ingestion Layer"
        A[Email Source] --> B[Protocol Adapter]
        B --> C[Rate Limiter]
        C --> D[Authentication]
    end

    subgraph "Processing Layer"
        D --> E[Email Processor]
        E --> F[Attachment Processor]
        F --> G[Security Scanner]
        G --> H[Deduplication Engine]
        H --> I[Compression Engine]
    end

    subgraph "Storage Decision Engine"
        I --> J{Storage Rules}
        J -->|Small Files| K[Database Storage]
        J -->|Medium Files| L[Hybrid Storage]
        J -->|Large Files| M[Object Storage]
    end

    subgraph "Storage Backends"
        K --> N[(PostgreSQL/MongoDB)]
        L --> N
        L --> O[(MinIO/S3/Azure/GCS)]
        M --> O
    end
```

## About Evolvis AI

**Evolvis AI** is a cutting-edge technology company specializing in AI-powered solutions for enterprise email processing, data ingestion, and intelligent automation. Founded with a mission to revolutionize how organizations handle and analyze their email communications, Evolvis AI develops sophisticated tools that combine artificial intelligence with robust engineering practices.

### Our Focus
- **AI-Powered Email Processing** - Advanced algorithms for intelligent email analysis, classification, and extraction
- **Enterprise Data Solutions** - Scalable systems for large-scale email ingestion and processing
- **Intelligent Automation** - Smart workflows that adapt to organizational needs and patterns
- **Security-First Architecture** - Enterprise-grade security and compliance for sensitive email data

### Innovation at Scale
Evolvis AI's solutions are designed to handle enterprise-scale email processing challenges, from small businesses to global corporations. Our technology stack emphasizes performance, security, and scalability while maintaining ease of use and deployment flexibility.

**Learn more about our solutions:** [https://evolvis.ai](https://evolvis.ai)

## Installation

### Basic Installation

```bash
# Install core package
pip install evolvishub-outlook-ingestor
```

### Feature-Specific Installation

```bash
# Protocol adapters (Microsoft Graph, EWS, IMAP/POP3)
pip install evolvishub-outlook-ingestor[protocols]

# Core database connectors (PostgreSQL, MongoDB, MySQL)
pip install evolvishub-outlook-ingestor[database]

# Individual database connectors
pip install evolvishub-outlook-ingestor[database-sqlite]      # SQLite
pip install evolvishub-outlook-ingestor[database-mssql]       # SQL Server
pip install evolvishub-outlook-ingestor[database-mariadb]     # MariaDB
pip install evolvishub-outlook-ingestor[database-oracle]      # Oracle
pip install evolvishub-outlook-ingestor[database-cockroachdb] # CockroachDB

# All database connectors
pip install evolvishub-outlook-ingestor[database-all]

# Data lake connectors
pip install evolvishub-outlook-ingestor[datalake-delta]    # Delta Lake
pip install evolvishub-outlook-ingestor[datalake-iceberg]  # Apache Iceberg
pip install evolvishub-outlook-ingestor[database-clickhouse] # ClickHouse

# All data lake connectors
pip install evolvishub-outlook-ingestor[datalake-all]

# Object storage support (MinIO S3-compatible)
pip install evolvishub-outlook-ingestor[storage]

# Data processing features (HTML parsing, image processing)
pip install evolvishub-outlook-ingestor[processing]
```

### Cloud Storage Installation

```bash
# AWS S3 support
pip install evolvishub-outlook-ingestor[cloud-aws]

# Azure Blob Storage support
pip install evolvishub-outlook-ingestor[cloud-azure]

# Google Cloud Storage support
pip install evolvishub-outlook-ingestor[cloud-gcp]

# All cloud storage backends
pip install evolvishub-outlook-ingestor[cloud-all]
```

### Complete Installation

```bash
# Install all features and dependencies
pip install evolvishub-outlook-ingestor[all]

# Development installation with testing tools
pip install evolvishub-outlook-ingestor[dev]
```

### Requirements

- **Python**: 3.9 or higher
- **Operating System**: Linux, macOS, Windows
- **Memory**: Minimum 512MB RAM (2GB+ recommended for large datasets)
- **Storage**: Varies based on email volume and attachment storage strategy

## Quick Start

### Basic Email Ingestion

```python
import asyncio
from evolvishub_outlook_ingestor.protocols.microsoft_graph import GraphAPIAdapter
from evolvishub_outlook_ingestor.connectors.postgresql_connector import PostgreSQLConnector
from evolvishub_outlook_ingestor.processors.email_processor import EmailProcessor

async def basic_email_ingestion():
    # Configure Microsoft Graph API
    graph_config = {
        "client_id": "your_client_id",
        "client_secret": "your_client_secret",
        "tenant_id": "your_tenant_id"
    }

    # Configure PostgreSQL database
    db_config = {
        "host": "localhost",
        "port": 5432,
        "database": "outlook_data",
        "username": "postgres",
        "password": "your_password"
    }

    # Initialize components
    async with GraphAPIAdapter("graph", graph_config) as protocol, \
               PostgreSQLConnector("db", db_config) as connector:

        # Create email processor
        processor = EmailProcessor("email_processor")

        # Fetch and process emails
        emails = await protocol.fetch_emails(limit=10)

        for email in emails:
            # Process email content
            result = await processor.process(email)

            # Store in database
            if result.status.value == "success":
                await connector.store_email(result.processed_data)
                print(f"Stored email: {email.subject}")

asyncio.run(basic_email_ingestion())
```

## Hybrid Storage Configuration

### Enterprise-Grade Attachment Processing

```python
import asyncio
from evolvishub_outlook_ingestor.processors.enhanced_attachment_processor import (
    EnhancedAttachmentProcessor,
    StorageStrategy
)
from evolvishub_outlook_ingestor.connectors.minio_connector import MinIOConnector
from evolvishub_outlook_ingestor.connectors.aws_s3_connector import AWSS3Connector
from evolvishub_outlook_ingestor.connectors.postgresql_connector import PostgreSQLConnector

async def hybrid_storage_setup():
    # Configure MinIO for hot storage (frequently accessed files)
    minio_config = {
        "endpoint_url": "localhost:9000",
        "access_key": "minioadmin",
        "secret_key": "minioadmin",
        "bucket_name": "email-attachments-hot",
        "use_ssl": False  # Set to True for production
    }

    # Configure AWS S3 for archive storage (long-term storage)
    s3_config = {
        "access_key": "your_aws_access_key",
        "secret_key": "your_aws_secret_key",
        "bucket_name": "email-attachments-archive",
        "region": "us-east-1"
    }

    # Configure enhanced attachment processor with intelligent routing
    processor_config = {
        "storage_strategy": "hybrid",
        "size_threshold": 1024 * 1024,  # 1MB threshold
        "enable_compression": True,
        "enable_deduplication": True,
        "enable_virus_scanning": False,  # Configure as needed
        "default_storage_backend": "hot_storage",

        # Intelligent storage routing rules
        "storage_rules": [
            {
                "name": "large_files",
                "condition": "size > 5*1024*1024",  # Files > 5MB
                "strategy": "storage_only",
                "storage_backend": "archive_storage"
            },
            {
                "name": "medium_files",
                "condition": "size > 1024*1024 and size <= 5*1024*1024",  # 1-5MB
                "strategy": "hybrid",
                "storage_backend": "hot_storage"
            },
            {
                "name": "small_files",
                "condition": "size <= 1024*1024",  # Files <= 1MB
                "strategy": "database_only"
            },
            {
                "name": "compressible_text",
                "condition": "content_type.startswith('text/') and size > 1024",
                "strategy": "hybrid",
                "storage_backend": "hot_storage",
                "compress": True,
                "compression_type": "gzip"
            }
        ]
    }

    # Initialize storage connectors
    minio_connector = MinIOConnector("hot_storage", minio_config)
    s3_connector = AWSS3Connector("archive_storage", s3_config)

    # Initialize enhanced processor
    processor = EnhancedAttachmentProcessor("hybrid_attachments", processor_config)

    async with minio_connector, s3_connector:
        # Add storage backends to processor
        await processor.add_storage_backend("hot_storage", minio_connector)
        await processor.add_storage_backend("archive_storage", s3_connector)

        # Process emails with intelligent attachment routing
        # (email processing code here)

        # Generate secure URLs for attachment access
        storage_info = {
            "storage_location": "2024/01/15/abc123.pdf",
            "storage_backend": "hot_storage"
        }

        backend = processor.storage_backends[storage_info["storage_backend"]]
        secure_url = await backend.generate_presigned_url(
            storage_info["storage_location"],
            expires_in=3600  # 1 hour expiration
        )

        print(f"Secure attachment URL: {secure_url}")

asyncio.run(hybrid_storage_setup())
```

### Storage Strategy Decision Matrix

| File Size | Content Type | Storage Strategy | Backend | Compression |
|-----------|--------------|------------------|---------|-------------|
| < 1MB | Any | Database Only | PostgreSQL/MongoDB | No |
| 1-5MB | Documents/Images | Hybrid | MinIO/Hot Storage | Optional |
| 1-5MB | Text Files | Hybrid | MinIO/Hot Storage | Yes (GZIP) |
| > 5MB | Any | Storage Only | AWS S3/Archive | Optional |
| > 10MB | Any | Storage Only | AWS S3/Glacier | Yes |
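
The matrix above can be expressed as a small routing function. This is a sketch of the rule logic with thresholds mirroring the table, not the processor's actual implementation:

```python
MB = 1024 * 1024


def route_attachment(size: int, content_type: str) -> dict:
    """Map an attachment to a storage decision, mirroring the matrix above."""
    if size < 1 * MB:
        return {"strategy": "database_only", "backend": "database", "compress": False}
    if size <= 5 * MB:
        compress = content_type.startswith("text/")  # text files gzip well
        return {"strategy": "hybrid", "backend": "hot_storage", "compress": compress}
    # > 5MB: content goes to archive object storage only; compress past 10MB
    return {"strategy": "storage_only", "backend": "archive_storage",
            "compress": size > 10 * MB}


small = route_attachment(200_000, "application/pdf")
medium_text = route_attachment(2 * MB, "text/csv")
large = route_attachment(20 * MB, "video/mp4")
```

Expressing the matrix as data-driven rules (as the `storage_rules` configuration does) rather than hard-coded branches makes the thresholds tunable without code changes.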

### Database Selection Guide

| Database | Best For | Pros | Cons | Recommended Use Case |
|----------|----------|------|------|---------------------|
| **SQLite** | Development, Testing, Small Scale | Simple setup, no server required, ACID compliant | Single writer, limited concurrency | Development environments, small deployments (<10K emails/day) |
| **PostgreSQL** | General Purpose, High Performance | Excellent performance, rich features, strong consistency | Requires server setup and maintenance | Most production deployments, complex queries |
| **MongoDB** | Flexible Schema, Document Storage | Schema flexibility, horizontal scaling, JSON-native | Eventual consistency, memory usage | Variable email structures, rapid prototyping |
| **MySQL/MariaDB** | Web Applications, Existing Infrastructure | Wide adoption, good performance, familiar | Limited JSON support (older versions) | Web applications, existing MySQL infrastructure |
| **SQL Server** | Windows Environments, Enterprise | Enterprise features, excellent tooling, integration | Windows-centric, licensing costs | Windows-based enterprises, .NET applications |
| **Oracle** | Mission-Critical, Large Enterprise | Proven reliability, advanced features, scalability | High cost, complexity | Large enterprises, mission-critical systems |
| **CockroachDB** | Global Scale, Cloud-Native | Distributed, strong consistency, cloud-native | Newer technology, learning curve | Global deployments, cloud-native applications |

### Data Lake and Analytics Selection Guide

| Platform | Best For | Pros | Cons | Recommended Use Case |
|----------|----------|------|------|---------------------|
| **Delta Lake** | ACID Analytics, Time Travel | ACID transactions, time travel, schema evolution, Spark ecosystem | Requires Spark, Java/Scala ecosystem | Data science workflows, ML pipelines, audit requirements |
| **Apache Iceberg** | Multi-Engine Analytics | Engine agnostic, hidden partitioning, snapshot isolation | Newer ecosystem, complex setup | Multi-tool analytics, data warehouse modernization |
| **ClickHouse** | Real-Time Analytics | Extremely fast queries, columnar storage, SQL interface | Limited transaction support, specialized use case | Real-time dashboards, email analytics, reporting |

### Hybrid Architecture Patterns

| Pattern | Description | Use Case | Benefits |
|---------|-------------|----------|----------|
| **Operational + Analytics** | PostgreSQL for operations, Delta Lake for analytics | Real-time app + historical analysis | Best of both worlds, optimized for each workload |
| **Hot + Cold Storage** | ClickHouse for recent data, Iceberg for historical | Email analytics with time-based access patterns | Cost optimization, query performance |
| **Multi-Engine Lake** | Iceberg with Spark, Trino, and Flink | Complex analytics requiring different compute engines | Flexibility, avoid vendor lock-in |

## Configuration

### Complete Configuration Example

Create a `config.yaml` file for comprehensive system configuration. The `database` blocks below are alternatives — include exactly one of them in your file:

```yaml
# Database configuration examples

# PostgreSQL
database:
  type: "postgresql"
  host: "localhost"
  port: 5432
  database: "outlook_data"
  username: "postgres"
  password: "your_password"
  pool_size: 20
  max_overflow: 30

# SQLite (for development/testing)
database:
  type: "sqlite"
  database_path: "outlook_data.db"
  enable_wal: true
  timeout: 30.0

# SQL Server
database:
  type: "mssql"
  server: "localhost\\SQLEXPRESS"
  port: 1433
  database: "outlook_data"
  username: "sa"
  password: "your_password"
  trusted_connection: false
  encrypt: true

# MariaDB
database:
  type: "mariadb"
  host: "localhost"
  port: 3306
  database: "outlook_data"
  username: "root"
  password: "your_password"
  charset: "utf8mb4"

# Oracle Database
database:
  type: "oracle"
  host: "localhost"
  port: 1521
  service_name: "XEPDB1"
  username: "outlook_user"
  password: "your_password"

# CockroachDB
database:
  type: "cockroachdb"
  host: "localhost"
  port: 26257
  database: "outlook_data"
  username: "root"
  password: "your_password"
  sslmode: "require"

# Data Lake configuration examples

# Delta Lake (local)
database:
  type: "deltalake"
  table_path: "./delta-tables/emails"
  app_name: "outlook-ingestor"
  master: "local[*]"
  partition_columns: ["received_date_partition", "sender_domain"]
  z_order_columns: ["received_date", "sender_email"]
  enable_time_travel: true

# Delta Lake (AWS S3)
database:
  type: "deltalake"
  table_path: "s3a://my-bucket/delta-tables/emails"
  app_name: "outlook-ingestor-prod"
  master: "spark://spark-master:7077"
  cloud_provider: "aws"
  cloud_config:
    access_key: "your_access_key"
    secret_key: "your_secret_key"
    region: "us-west-2"

# Apache Iceberg (Hadoop catalog)
database:
  type: "iceberg"
  catalog_type: "hadoop"
  warehouse_path: "./iceberg-warehouse"
  namespace: "outlook_data"
  table_name: "emails"
  enable_compaction: true

# Apache Iceberg (AWS Glue catalog)
database:
  type: "iceberg"
  catalog_type: "glue"
  catalog_config:
    warehouse: "s3://my-bucket/iceberg-warehouse"
    region: "us-west-2"
  namespace: "outlook_analytics"
  table_name: "emails"

# ClickHouse (local)
database:
  type: "clickhouse"
  host: "localhost"
  port: 8123
  database: "outlook_data"
  username: "default"
  password: "your_password"
  compression: true

# ClickHouse (cluster)
database:
  type: "clickhouse"
  host: "clickhouse-cluster.example.com"
  port: 8123
  database: "outlook_data"
  username: "analytics_user"
  password: "your_password"
  cluster: "outlook_cluster"
  secure: true

# Protocol configurations
protocols:
  graph_api:
    client_id: "your_client_id"
    client_secret: "your_client_secret"
    tenant_id: "your_tenant_id"
    scopes: ["https://graph.microsoft.com/.default"]

  exchange:
    server: "outlook.office365.com"
    username: "your_email@company.com"
    password: "your_password"
    autodiscover: true

# Storage backend configurations
storage:
  minio:
    endpoint_url: "localhost:9000"
    access_key: "minioadmin"
    secret_key: "minioadmin"
    bucket_name: "email-attachments"
    use_ssl: false

  aws_s3:
    access_key: "your_aws_access_key"
    secret_key: "your_aws_secret_key"
    bucket_name: "email-attachments-prod"
    region: "us-east-1"

  azure_blob:
    connection_string: "DefaultEndpointsProtocol=https;AccountName=..."
    container_name: "email-attachments"

# Enhanced attachment processing
attachment_processing:
  storage_strategy: "hybrid"
  size_threshold: 1048576  # 1MB
  enable_compression: true
  enable_deduplication: true
  enable_virus_scanning: false
  max_attachment_size: 52428800  # 50MB

  storage_rules:
    - name: "large_files"
      condition: "size > 5*1024*1024"
      strategy: "storage_only"
      storage_backend: "aws_s3"
    - name: "medium_files"
      condition: "size > 1024*1024 and size <= 5*1024*1024"
      strategy: "hybrid"
      storage_backend: "minio"
    - name: "small_files"
      condition: "size <= 1024*1024"
      strategy: "database_only"

# Processing settings
processing:
  batch_size: 1000
  max_workers: 10
  timeout_seconds: 300
  retry_attempts: 3
  retry_delay: 1.0

# Email settings
email:
  extract_attachments: true
  include_folders:
    - "Inbox"
    - "Sent Items"
    - "Archive"
  exclude_folders:
    - "Deleted Items"
    - "Junk Email"

# Security settings
security:
  encrypt_credentials: true
  master_key: "your_encryption_key"
  enable_audit_logging: true

# Monitoring settings
monitoring:
  enable_metrics: true
  metrics_port: 8080
  health_check_interval: 30
  log_level: "INFO"
```

### Environment Variables

```bash
# Database settings
export DATABASE__HOST=localhost
export DATABASE__PORT=5432
export DATABASE__USERNAME=postgres
export DATABASE__PASSWORD=your_password

# Graph API settings
export PROTOCOLS__GRAPH_API__CLIENT_ID=your_client_id
export PROTOCOLS__GRAPH_API__CLIENT_SECRET=your_client_secret
export PROTOCOLS__GRAPH_API__TENANT_ID=your_tenant_id

# Storage backend settings
export STORAGE__MINIO__ACCESS_KEY=minioadmin
export STORAGE__MINIO__SECRET_KEY=minioadmin
export STORAGE__AWS_S3__ACCESS_KEY=your_aws_access_key
export STORAGE__AWS_S3__SECRET_KEY=your_aws_secret_key

# Security settings
export SECURITY__MASTER_KEY=your_encryption_key
export SECURITY__ENCRYPT_CREDENTIALS=true

# Load configuration file
export CONFIG_FILE=/path/to/config.yaml
```

## Performance

### Throughput Benchmarks

| Configuration | Emails/Minute | Attachments/Minute | Memory Usage |
|---------------|---------------|-------------------|--------------|
| Basic (Database Only) | 500-800 | 200-400 | 256MB |
| Hybrid Storage | 800-1200 | 400-800 | 512MB |
| Object Storage Only | 1000-1500 | 600-1200 | 128MB |
| Multi-tier Enterprise | 1200-2000 | 800-1500 | 1GB |

### Performance Optimization

```python
# High-performance configuration
performance_config = {
    "processing": {
        "batch_size": 2000,        # Larger batches for better throughput
        "max_workers": 20,         # More concurrent workers
        "connection_pool_size": 50, # Larger connection pool
        "prefetch_count": 100      # Prefetch more emails
    },

    "attachment_processing": {
        "enable_compression": True,     # Reduce storage I/O
        "enable_deduplication": True,   # Avoid duplicate processing
        "concurrent_uploads": 10,       # Parallel storage uploads
        "chunk_size": 8192             # Optimal chunk size for uploads
    },

    "storage": {
        "connection_timeout": 30,       # Reasonable timeout
        "retry_attempts": 3,           # Automatic retries
        "use_connection_pooling": True  # Reuse connections
    }
}
```
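
Large attachments keep memory flat by streaming in fixed-size chunks rather than buffering whole files, using the 8 KB chunk size configured above. A stdlib sketch of the pattern (the `stream_upload` helper is illustrative, not the connector's API):

```python
import hashlib
import io

CHUNK_SIZE = 8192  # matches the chunk_size setting above


def stream_upload(source, sink, chunk_size: int = CHUNK_SIZE) -> str:
    """Copy source to sink in fixed-size chunks, hashing as we go.

    Peak memory stays at one chunk regardless of attachment size.
    """
    digest = hashlib.sha256()
    while True:
        chunk = source.read(chunk_size)
        if not chunk:
            break
        digest.update(chunk)
        sink.write(chunk)
    return digest.hexdigest()


payload = b"x" * 100_000  # stand-in for a large attachment
sink = io.BytesIO()
sha = stream_upload(io.BytesIO(payload), sink)
```

Computing the SHA256 digest during the same pass means deduplication needs no second read of the attachment.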

### Memory Management

```mermaid
graph LR
    A[Email Batch] --> B{Size Check}
    B -->|Small| C[Database Storage]
    B -->|Medium| D[Hybrid Processing]
    B -->|Large| E[Stream to Object Storage]

    C --> F[Memory: ~256KB per email]
    D --> G[Memory: ~100KB per email]
    E --> H[Memory: ~10KB per email]

    F --> I[Total: 256MB for 1000 emails]
    G --> J[Total: 100MB for 1000 emails]
    H --> K[Total: 10MB for 1000 emails]
```

### Scaling Recommendations

#### Small Deployments (< 10,000 emails/day)
- **Configuration**: Basic database storage
- **Resources**: 2 CPU cores, 4GB RAM
- **Storage**: PostgreSQL with SSD storage

#### Medium Deployments (10,000 - 100,000 emails/day)
- **Configuration**: Hybrid storage with MinIO
- **Resources**: 4 CPU cores, 8GB RAM
- **Storage**: PostgreSQL + MinIO cluster

#### Large Deployments (100,000+ emails/day)
- **Configuration**: Multi-tier object storage
- **Resources**: 8+ CPU cores, 16GB+ RAM
- **Storage**: PostgreSQL + AWS S3/Azure Blob + CDN

## Advanced Usage

### Protocol Adapters

#### Microsoft Graph API Adapter
- **Features**: OAuth2 authentication, rate limiting, pagination support
- **Configuration**: Client ID, Client Secret, Tenant ID
- **Usage**: Modern REST API for Office 365 and Outlook.com

```python
from evolvishub_outlook_ingestor.protocols import GraphAPIAdapter

adapter = GraphAPIAdapter("graph_api", {
    "client_id": "your_client_id",
    "client_secret": "your_client_secret",
    "tenant_id": "your_tenant_id",
    "rate_limit": 100,  # requests per minute
})
```

#### Exchange Web Services (EWS) Adapter
- **Features**: Basic and OAuth2 authentication, connection pooling
- **Configuration**: Server URL, credentials, timeout settings
- **Usage**: On-premises Exchange servers and Exchange Online

```python
from evolvishub_outlook_ingestor.protocols import ExchangeWebServicesAdapter

adapter = ExchangeWebServicesAdapter("exchange", {
    "server": "outlook.office365.com",
    "username": "your_email@company.com",
    "password": "your_password",
    "auth_type": "basic",  # or "oauth2"
})
```

#### IMAP/POP3 Adapter
- **Features**: SSL/TLS support, folder synchronization, UID tracking
- **Configuration**: Server details, authentication credentials
- **Usage**: Standard email protocols for broad compatibility

```python
from evolvishub_outlook_ingestor.protocols import IMAPAdapter

adapter = IMAPAdapter("imap", {
    "server": "outlook.office365.com",
    "port": 993,
    "username": "your_email@company.com",
    "password": "your_password",
    "use_ssl": True,
})
```

### Database Connectors

#### PostgreSQL Connector
- **Features**: Async operations, connection pooling, JSON fields, full-text search
- **Schema**: Optimized tables with proper indexes for email data
- **Performance**: Batch operations, transaction support

```python
from evolvishub_outlook_ingestor.connectors import PostgreSQLConnector

connector = PostgreSQLConnector("postgresql", {
    "host": "localhost",
    "port": 5432,
    "database": "outlook_data",
    "username": "postgres",
    "password": "your_password",
    "pool_size": 20,
})
```

#### MongoDB Connector
- **Features**: Document storage, GridFS for large attachments, aggregation pipelines
- **Schema**: Flexible document structure with proper indexing
- **Scalability**: Horizontal scaling support, replica sets

```python
from evolvishub_outlook_ingestor.connectors import MongoDBConnector

connector = MongoDBConnector("mongodb", {
    "host": "localhost",
    "port": 27017,
    "database": "outlook_data",
    "username": "mongo_user",
    "password": "your_password",
})
```

### Data Processors

#### Email Processor
- **Features**: Content normalization, HTML to text conversion, duplicate detection
- **Capabilities**: Email validation, link extraction, encoding detection
- **Configuration**: Customizable processing rules

```python
from evolvishub_outlook_ingestor.processors import EmailProcessor

processor = EmailProcessor("email", {
    "normalize_content": True,
    "extract_links": True,
    "validate_addresses": True,
    "html_to_text": True,
    "remove_duplicates": True,
})
```
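
The HTML-to-text conversion mentioned above can be illustrated with a stdlib parser. This is a minimal sketch; the library's processor handles encodings and edge cases well beyond it:

```python
from html.parser import HTMLParser


class TextExtractor(HTMLParser):
    """Collect visible text, skipping script/style content."""

    def __init__(self):
        super().__init__()
        self._parts = []
        self._skip_depth = 0

    def handle_starttag(self, tag, attrs):
        if tag in ("script", "style"):
            self._skip_depth += 1

    def handle_endtag(self, tag):
        if tag in ("script", "style") and self._skip_depth:
            self._skip_depth -= 1

    def handle_data(self, data):
        if not self._skip_depth and data.strip():
            self._parts.append(data.strip())

    def text(self) -> str:
        return " ".join(self._parts)


parser = TextExtractor()
parser.feed("<html><body><p>Hello <b>world</b></p><script>x()</script></body></html>")
plain = parser.text()
```

Tracking a skip depth (rather than a boolean) keeps the extractor correct even if script or style elements were nested.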

#### Attachment Processor
- **Features**: File type detection, virus scanning hooks, metadata extraction
- **Security**: Size validation, type filtering, content analysis
- **Optimization**: Image compression, hash calculation

```python
from evolvishub_outlook_ingestor.processors import AttachmentProcessor

processor = AttachmentProcessor("attachment", {
    "max_attachment_size": 50 * 1024 * 1024,  # 50MB
    "scan_for_viruses": True,
    "extract_metadata": True,
    "calculate_hashes": True,
    "compress_images": True,
})
```
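
The `calculate_hashes` option is what makes duplicate detection possible: identical content produces identical digests. A minimal sketch of content-hash deduplication (the processor's internals may differ):

```python
import hashlib


def content_hash(data: bytes) -> str:
    """SHA256 digest of attachment content, usable as a dedup key."""
    return hashlib.sha256(data).hexdigest()


def deduplicate(attachments):
    """Keep one copy per unique content hash (first occurrence wins)."""
    unique = {}
    for data in attachments:
        unique.setdefault(content_hash(data), data)
    return unique
```

Two byte-identical attachments collapse to a single stored copy keyed by their shared digest.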

## 🔧 Advanced Usage

### Hybrid Storage Configuration

```python
import asyncio
from evolvishub_outlook_ingestor.processors.enhanced_attachment_processor import (
    EnhancedAttachmentProcessor,
    StorageStrategy
)
from evolvishub_outlook_ingestor.connectors.minio_connector import MinIOConnector
from evolvishub_outlook_ingestor.connectors.aws_s3_connector import AWSS3Connector

async def setup_hybrid_storage():
    # Configure storage backends
    minio_config = {
        "endpoint_url": "localhost:9000",
        "access_key": "minioadmin",
        "secret_key": "minioadmin",
        "bucket_name": "email-attachments-hot",
        "use_ssl": False
    }

    s3_config = {
        "access_key": "your_aws_access_key",
        "secret_key": "your_aws_secret_key",
        "bucket_name": "email-attachments-archive",
        "region": "us-east-1"
    }

    # Initialize storage connectors
    minio_connector = MinIOConnector("hot_storage", minio_config)
    s3_connector = AWSS3Connector("archive_storage", s3_config)

    # Configure enhanced processor with storage rules
    processor_config = {
        "storage_strategy": "hybrid",
        "size_threshold": 1024 * 1024,  # 1MB
        "enable_compression": True,
        "enable_deduplication": True,
        "storage_rules": [
            {
                "name": "large_files",
                "condition": "size > 5*1024*1024",  # Files > 5MB
                "strategy": "storage_only",
                "storage_backend": "archive_storage"
            },
            {
                "name": "medium_files",
                "condition": "size > 1024*1024 and size <= 5*1024*1024",
                "strategy": "hybrid",
                "storage_backend": "hot_storage"
            },
            {
                "name": "small_files",
                "condition": "size <= 1024*1024",
                "strategy": "database_only"
            }
        ]
    }

    # Create enhanced processor
    processor = EnhancedAttachmentProcessor("hybrid_attachments", processor_config)

    # Add storage backends
    async with minio_connector, s3_connector:
        await processor.add_storage_backend("hot_storage", minio_connector)
        await processor.add_storage_backend("archive_storage", s3_connector)

        # Process emails with hybrid storage
        # (email_with_attachments is an EmailMessage fetched via a protocol adapter)
        result = await processor.process(email_with_attachments)

        # Generate secure URLs for attachment access
        for storage_info in result.metadata.get("storage_infos", []):
            if storage_info.get("storage_backend"):
                backend = processor.storage_backends[storage_info["storage_backend"]]
                secure_url = await backend.generate_presigned_url(
                    storage_info["storage_location"],
                    expires_in=3600  # 1 hour
                )
                print(f"Secure URL: {secure_url}")

asyncio.run(setup_hybrid_storage())
```
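
The routing that the rules above describe reduces to a first-match size check. This standalone sketch mirrors that decision logic for illustration only; the real processor evaluates the configured `condition` expressions rather than hard-coded thresholds:

```python
from typing import Optional, Tuple

MB = 1024 * 1024


def choose_strategy(size: int) -> Tuple[str, Optional[str]]:
    """Return (strategy, backend) for an attachment, mirroring the rules above."""
    if size > 5 * MB:
        # large_files rule: content goes to archive object storage only
        return ("storage_only", "archive_storage")
    if size > 1 * MB:
        # medium_files rule: content in hot storage, metadata in the database
        return ("hybrid", "hot_storage")
    # small_files rule: store entirely in the database
    return ("database_only", None)
```

Note the boundaries match the rule conditions: exactly 1 MB stays in the database, exactly 5 MB is still hybrid.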

### Custom Protocol Adapter

```python
from evolvishub_outlook_ingestor.protocols import BaseProtocol
from evolvishub_outlook_ingestor.core.data_models import EmailMessage

class CustomProtocol(BaseProtocol):
    async def _fetch_emails_impl(self, **kwargs):
        # Implement custom email fetching logic
        emails = []
        # ... fetch emails from custom source
        return emails

# Use the custom protocol (settings and config as defined in your application)
ingestor = OutlookIngestor(
    settings=settings,
    protocol_adapters={"custom": CustomProtocol("custom", config)}
)
```

### Batch Processing with Progress Tracking

```python
from evolvishub_outlook_ingestor.core.data_models import BatchProcessingConfig

async def process_with_progress():
    def progress_callback(processed, total, rate):
        print(f"Progress: {processed}/{total} ({rate:.2f} emails/sec)")
    
    batch_config = BatchProcessingConfig(
        batch_size=500,
        max_workers=8,
        progress_callback=progress_callback
    )
    
    result = await ingestor.process_emails(
        protocol="exchange",
        database="mongodb",
        batch_config=batch_config
    )
```

### Database Transactions

```python
from evolvishub_outlook_ingestor.connectors import PostgreSQLConnector

async def transactional_processing():
    connector = PostgreSQLConnector("postgres", config)
    await connector.initialize()
    
    async with connector.transaction() as tx:
        # All operations within this block are transactional
        for email in emails:
            await connector.store_email(email, transaction=tx)
        # Automatically commits on success, rolls back on error
```

## πŸ—οΈ Architecture

### Component Overview

```
┌─────────────────┐    ┌──────────────────┐    ┌─────────────────┐
│   Protocols     │    │   Processors     │    │   Connectors    │
│                 │    │                  │    │                 │
│ • Exchange EWS  │───▶│ • Email Proc.    │───▶│ • PostgreSQL    │
│ • Graph API     │    │ • Attachment     │    │ • MongoDB       │
│ • IMAP/POP3     │    │ • Batch Proc.    │    │ • MySQL         │
└─────────────────┘    └──────────────────┘    └─────────────────┘
         │                       │                       │
         └───────────────────────┼───────────────────────┘
                                 │
                    ┌─────────────────────┐
                    │   Core Framework    │
                    │                     │
                    │ • Configuration     │
                    │ • Logging           │
                    │ • Error Handling    │
                    │ • Retry Logic       │
                    │ • Metrics           │
                    └─────────────────────┘
```

### Design Patterns

- **Strategy Pattern**: Interchangeable protocol adapters
- **Factory Pattern**: Dynamic component creation
- **Repository Pattern**: Database abstraction
- **Observer Pattern**: Progress and metrics tracking
- **Circuit Breaker**: Fault tolerance
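
The Strategy pattern is what lets protocol adapters be swapped without touching calling code. A minimal sketch, deliberately simplified relative to the library's actual `BaseProtocol`:

```python
from abc import ABC, abstractmethod


class Protocol(ABC):
    """Strategy interface: each adapter fetches emails its own way."""

    @abstractmethod
    def fetch(self) -> list:
        ...


class ImapProtocol(Protocol):
    def fetch(self) -> list:
        return ["email via IMAP"]


class GraphProtocol(Protocol):
    def fetch(self) -> list:
        return ["email via Graph API"]


def ingest(protocol: Protocol) -> list:
    """Caller depends only on the interface, not the concrete adapter."""
    return protocol.fetch()
```

Swapping `ImapProtocol()` for `GraphProtocol()` changes the transport without changing `ingest`.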

## 🧪 Testing

```bash
# Run all tests
pytest

# Run with coverage
pytest --cov=evolvishub_outlook_ingestor --cov-report=html

# Run specific test categories
pytest -m unit          # Unit tests only
pytest -m integration   # Integration tests only
pytest -m performance   # Performance tests only

# Run tests in parallel
pytest -n auto
```

## 📊 Performance

### Benchmarks

- **Email Processing**: 1000+ emails/minute
- **Memory Usage**: <100MB for 10K emails
- **Database Throughput**: 500+ inserts/second
- **Concurrent Connections**: 50+ simultaneous

### Optimization Tips

1. **Use Batch Processing**: Process emails in batches for better throughput
2. **Enable Connection Pooling**: Reuse database connections
3. **Configure Rate Limiting**: Avoid API throttling
4. **Monitor Memory Usage**: Use streaming for large datasets
5. **Tune Worker Count**: Match your system's CPU cores
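
For tip 5, a reasonable starting point is to derive the worker count from the machine. This is a heuristic, not a library API; I/O-bound email fetching typically tolerates more workers than CPU cores:

```python
import os


def default_worker_count(io_bound: bool = True) -> int:
    """Heuristic: oversubscribe cores for I/O-bound work, match them for CPU-bound."""
    cores = os.cpu_count() or 1
    return cores * 4 if io_bound else cores
```

The result can feed directly into a `max_workers` setting such as the one in `BatchProcessingConfig` above.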

## 🔍 Monitoring

### Metrics Collection

```python
# Enable Prometheus metrics
settings.monitoring.enable_metrics = True
settings.monitoring.metrics_port = 8000

# Access metrics endpoint
# http://localhost:8000/metrics
```

### Health Checks

```python
# Check component health
status = await ingestor.get_status()
print(f"Protocol Status: {status['protocols']}")
print(f"Database Status: {status['database']}")
```

## 🤝 Contributing

1. Fork the repository
2. Create a feature branch (`git checkout -b feature/amazing-feature`)
3. Commit your changes (`git commit -m 'Add amazing feature'`)
4. Push to the branch (`git push origin feature/amazing-feature`)
5. Open a Pull Request

### Development Setup

```bash
# Clone repository
git clone https://github.com/evolvisai/metcal.git
cd metcal/shared/libs/evolvis-outlook-ingestor

# Install development dependencies
pip install -e ".[dev]"

# Run pre-commit hooks
pre-commit install

# Run tests
pytest
```

## 📄 License

This project is licensed under the Evolvis AI License - see the [LICENSE](LICENSE) file for details.

## 📚 API Reference

### Core Components

**Protocol Adapters**
- `GraphAPIAdapter`: Microsoft Graph API integration
- `ExchangeWebServicesAdapter`: Exchange Web Services (EWS) support
- `IMAPAdapter`: IMAP protocol support

**Database Connectors**
- `PostgreSQLConnector`: PostgreSQL database integration
- `MongoDBConnector`: MongoDB database integration

**Data Processors**
- `EmailProcessor`: Email content processing and normalization
- `AttachmentProcessor`: Attachment handling and security scanning

**Security Utilities**
- `SecureCredentialManager`: Credential encryption and management
- `CredentialMasker`: Sensitive data masking for logs
- `InputSanitizer`: Input validation and sanitization

### Configuration Reference

```python
# Complete configuration example
config = {
    "graph_api": {
        "client_id": "your_client_id",
        "client_secret": "your_client_secret",
        "tenant_id": "your_tenant_id",
        "rate_limit": 100,  # requests per minute
        "timeout": 30,      # request timeout in seconds
    },
    "database": {
        "host": "localhost",
        "port": 5432,
        "database": "outlook_ingestor",
        "username": "ingestor_user",
        "password": "secure_password",
        "ssl_mode": "require",
        "enable_connection_pooling": True,
        "pool_size": 10,
    },
    "email_processing": {
        "normalize_content": True,
        "extract_links": True,
        "validate_addresses": True,
        "html_to_text": True,
        "remove_duplicates": True,
    },
    "attachment_processing": {
        "max_attachment_size": 10 * 1024 * 1024,  # 10MB
        "extract_metadata": True,
        "calculate_hashes": True,
        "scan_for_viruses": False,
    }
}
```

### Error Handling

```python
from evolvishub_outlook_ingestor.core.exceptions import (
    ConnectionError,
    AuthenticationError,
    DatabaseError,
    ProcessingError,
    ValidationError,
)

try:
    await protocol.fetch_emails()
except AuthenticationError:
    # Handle authentication issues
    print("Check your API credentials")
except ConnectionError:
    # Handle network/connection issues
    print("Check network connectivity")
except ProcessingError as e:
    # Handle processing errors
    print(f"Processing failed: {e}")
```
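
The feature list mentions automatic retry logic; where you need it manually, an exponential-backoff sketch around transient connection failures can look like the following. The helper name and delays are illustrative, and the built-in `ConnectionError` stands in for the library's exception of the same name:

```python
import asyncio
import random


async def with_retry(coro_factory, attempts: int = 3, base_delay: float = 0.5):
    """Retry a transient async operation with exponential backoff and jitter."""
    for attempt in range(attempts):
        try:
            return await coro_factory()
        except ConnectionError:
            if attempt == attempts - 1:
                raise  # out of retries: surface the error to the caller
            delay = base_delay * (2 ** attempt) + random.uniform(0, 0.1)
            await asyncio.sleep(delay)


# Usage: result = await with_retry(lambda: protocol.fetch_emails())
```

Passing a factory (`lambda: protocol.fetch_emails()`) rather than a coroutine object matters: a coroutine can only be awaited once, so each retry needs a fresh one.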

## Support and Documentation

### Documentation Resources
- **[Storage Architecture Guide](docs/STORAGE_ARCHITECTURE.md)** - Comprehensive guide to hybrid storage configuration
- **[Migration Guide](docs/MIGRATION_GUIDE.md)** - Step-by-step migration from basic to hybrid storage
- **[API Reference](docs/API_REFERENCE.md)** - Complete API documentation with examples
- **[Performance Tuning](docs/PERFORMANCE_TUNING.md)** - Optimization guidelines for large-scale deployments

### Community and Support
- **[GitHub Issues](https://github.com/evolvisai/metcal/issues)** - Bug reports and feature requests
- **[GitHub Discussions](https://github.com/evolvisai/metcal/discussions)** - Community discussions and Q&A
- **[Examples Directory](examples/)** - Comprehensive usage examples and tutorials

### Enterprise Support
For enterprise deployments requiring dedicated support, custom integrations, or professional services, please contact our team for tailored solutions and SLA-backed support options.

## Technical Specifications

### Supported Platforms
- **Operating Systems**: Linux (Ubuntu 18.04+, CentOS 7+), macOS (10.15+), Windows (10+)
- **Python Versions**: 3.9, 3.10, 3.11, 3.12
- **Database Systems**: PostgreSQL 12+, MongoDB 4.4+, MySQL 8.0+
- **Object Storage**: MinIO, AWS S3, Azure Blob Storage, Google Cloud Storage

### Performance Characteristics
- **Throughput**: Up to 2,000 emails/minute with hybrid storage
- **Concurrency**: Support for 50+ concurrent processing workers
- **Memory Efficiency**: <10KB per email with object storage strategy
- **Storage Optimization**: Up to 70% reduction in database size with intelligent routing

### Security Compliance
- **Encryption**: AES-256 encryption for credentials and sensitive data
- **Authentication**: OAuth2, Basic Auth, and certificate-based authentication
- **Access Control**: Role-based access control and audit logging
- **Compliance**: GDPR, HIPAA, and SOX compliance features available

## Acknowledgments

This project is built on top of excellent open-source technologies:

- **[Pydantic](https://pydantic.dev/)** - Data validation and settings management
- **[SQLAlchemy](https://sqlalchemy.org/)** - Database ORM with async support
- **[asyncio](https://docs.python.org/3/library/asyncio.html)** - Asynchronous programming framework
- **[pytest](https://pytest.org/)** - Testing framework with async support
- **[Black](https://black.readthedocs.io/)**, **[isort](https://pycqa.github.io/isort/)**, **[mypy](https://mypy.readthedocs.io/)** - Code quality and type checking tools

## License

### Evolvis AI License

This software is proprietary to **Evolvis AI** and is protected by copyright and other intellectual property laws.

#### 📋 **License Terms**

- **✅ Evaluation and Non-Commercial Use**: This package is available for evaluation, research, and non-commercial use
- **⚠️ Commercial Use Restrictions**: Commercial or production use of this library requires a valid Evolvis AI License
- **🚫 Redistribution Prohibited**: Redistribution or commercial use without proper licensing is strictly prohibited

#### 💼 **Commercial Licensing**

For commercial licensing, production deployments, or enterprise use, please contact:

**Montgomery Miralles**
📧 **Email**: [m.miralles@evolvis.ai](mailto:m.miralles@evolvis.ai)
🏢 **Company**: Evolvis AI
🌐 **Website**: [https://evolvis.ai](https://evolvis.ai)

#### ⚖️ **Important Notice**

> **Commercial users must obtain proper licensing before deploying this software in production environments.** Unauthorized commercial use may result in legal action. Contact Montgomery Miralles for licensing agreements and compliance requirements.

#### 📄 **Full License**

For complete license terms and conditions, see the [LICENSE](LICENSE) file included with this distribution.

---

**Evolvishub Outlook Ingestor** - Enterprise-grade email ingestion with intelligent hybrid storage architecture.

            

true\n  timeout: 30.0\n\n# SQL Server\ndatabase:\n  type: \"mssql\"\n  server: \"localhost\\\\SQLEXPRESS\"\n  port: 1433\n  database: \"outlook_data\"\n  username: \"sa\"\n  password: \"your_password\"\n  trusted_connection: false\n  encrypt: true\n\n# MariaDB\ndatabase:\n  type: \"mariadb\"\n  host: \"localhost\"\n  port: 3306\n  database: \"outlook_data\"\n  username: \"root\"\n  password: \"your_password\"\n  charset: \"utf8mb4\"\n\n# Oracle Database\ndatabase:\n  type: \"oracle\"\n  host: \"localhost\"\n  port: 1521\n  service_name: \"XEPDB1\"\n  username: \"outlook_user\"\n  password: \"your_password\"\n\n# CockroachDB\ndatabase:\n  type: \"cockroachdb\"\n  host: \"localhost\"\n  port: 26257\n  database: \"outlook_data\"\n  username: \"root\"\n  password: \"your_password\"\n  sslmode: \"require\"\n\n# Data Lake configuration examples\n\n# Delta Lake (local)\ndatabase:\n  type: \"deltalake\"\n  table_path: \"./delta-tables/emails\"\n  app_name: \"outlook-ingestor\"\n  master: \"local[*]\"\n  partition_columns: [\"received_date_partition\", \"sender_domain\"]\n  z_order_columns: [\"received_date\", \"sender_email\"]\n  enable_time_travel: true\n\n# Delta Lake (AWS S3)\ndatabase:\n  type: \"deltalake\"\n  table_path: \"s3a://my-bucket/delta-tables/emails\"\n  app_name: \"outlook-ingestor-prod\"\n  master: \"spark://spark-master:7077\"\n  cloud_provider: \"aws\"\n  cloud_config:\n    access_key: \"your_access_key\"\n    secret_key: \"your_secret_key\"\n    region: \"us-west-2\"\n\n# Apache Iceberg (Hadoop catalog)\ndatabase:\n  type: \"iceberg\"\n  catalog_type: \"hadoop\"\n  warehouse_path: \"./iceberg-warehouse\"\n  namespace: \"outlook_data\"\n  table_name: \"emails\"\n  enable_compaction: true\n\n# Apache Iceberg (AWS Glue catalog)\ndatabase:\n  type: \"iceberg\"\n  catalog_type: \"glue\"\n  catalog_config:\n    warehouse: \"s3://my-bucket/iceberg-warehouse\"\n    region: \"us-west-2\"\n  namespace: \"outlook_analytics\"\n  table_name: \"emails\"\n\n# 
ClickHouse (local)\ndatabase:\n  type: \"clickhouse\"\n  host: \"localhost\"\n  port: 8123\n  database: \"outlook_data\"\n  username: \"default\"\n  password: \"your_password\"\n  compression: true\n\n# ClickHouse (cluster)\ndatabase:\n  type: \"clickhouse\"\n  host: \"clickhouse-cluster.example.com\"\n  port: 8123\n  database: \"outlook_data\"\n  username: \"analytics_user\"\n  password: \"your_password\"\n  cluster: \"outlook_cluster\"\n  secure: true\n\n# Protocol configurations\nprotocols:\n  graph_api:\n    client_id: \"your_client_id\"\n    client_secret: \"your_client_secret\"\n    tenant_id: \"your_tenant_id\"\n    scopes: [\"https://graph.microsoft.com/.default\"]\n\n  exchange:\n    server: \"outlook.office365.com\"\n    username: \"your_email@company.com\"\n    password: \"your_password\"\n    autodiscover: true\n\n# Storage backend configurations\nstorage:\n  minio:\n    endpoint_url: \"localhost:9000\"\n    access_key: \"minioadmin\"\n    secret_key: \"minioadmin\"\n    bucket_name: \"email-attachments\"\n    use_ssl: false\n\n  aws_s3:\n    access_key: \"your_aws_access_key\"\n    secret_key: \"your_aws_secret_key\"\n    bucket_name: \"email-attachments-prod\"\n    region: \"us-east-1\"\n\n  azure_blob:\n    connection_string: \"DefaultEndpointsProtocol=https;AccountName=...\"\n    container_name: \"email-attachments\"\n\n# Enhanced attachment processing\nattachment_processing:\n  storage_strategy: \"hybrid\"\n  size_threshold: 1048576  # 1MB\n  enable_compression: true\n  enable_deduplication: true\n  enable_virus_scanning: false\n  max_attachment_size: 52428800  # 50MB\n\n  storage_rules:\n    - name: \"large_files\"\n      condition: \"size > 5*1024*1024\"\n      strategy: \"storage_only\"\n      storage_backend: \"aws_s3\"\n    - name: \"medium_files\"\n      condition: \"size > 1024*1024 and size <= 5*1024*1024\"\n      strategy: \"hybrid\"\n      storage_backend: \"minio\"\n    - name: \"small_files\"\n      condition: \"size <= 1024*1024\"\n    
  strategy: \"database_only\"\n\n# Processing settings\nprocessing:\n  batch_size: 1000\n  max_workers: 10\n  timeout_seconds: 300\n  retry_attempts: 3\n  retry_delay: 1.0\n\n# Email settings\nemail:\n  extract_attachments: true\n  include_folders:\n    - \"Inbox\"\n    - \"Sent Items\"\n    - \"Archive\"\n  exclude_folders:\n    - \"Deleted Items\"\n    - \"Junk Email\"\n\n# Security settings\nsecurity:\n  encrypt_credentials: true\n  master_key: \"your_encryption_key\"\n  enable_audit_logging: true\n\n# Monitoring settings\nmonitoring:\n  enable_metrics: true\n  metrics_port: 8080\n  health_check_interval: 30\n  log_level: \"INFO\"\n```\n\n### Environment Variables\n\n```bash\n# Database settings\nexport DATABASE__HOST=localhost\nexport DATABASE__PORT=5432\nexport DATABASE__USERNAME=postgres\nexport DATABASE__PASSWORD=your_password\n\n# Graph API settings\nexport PROTOCOLS__GRAPH_API__CLIENT_ID=your_client_id\nexport PROTOCOLS__GRAPH_API__CLIENT_SECRET=your_client_secret\nexport PROTOCOLS__GRAPH_API__TENANT_ID=your_tenant_id\n\n# Storage backend settings\nexport STORAGE__MINIO__ACCESS_KEY=minioadmin\nexport STORAGE__MINIO__SECRET_KEY=minioadmin\nexport STORAGE__AWS_S3__ACCESS_KEY=your_aws_access_key\nexport STORAGE__AWS_S3__SECRET_KEY=your_aws_secret_key\n\n# Security settings\nexport SECURITY__MASTER_KEY=your_encryption_key\nexport SECURITY__ENCRYPT_CREDENTIALS=true\n\n# Load configuration file\nexport CONFIG_FILE=/path/to/config.yaml\n```\n\n## Performance\n\n### Throughput Benchmarks\n\n| Configuration | Emails/Minute | Attachments/Minute | Memory Usage |\n|---------------|---------------|-------------------|--------------|\n| Basic (Database Only) | 500-800 | 200-400 | 256MB |\n| Hybrid Storage | 800-1200 | 400-800 | 512MB |\n| Object Storage Only | 1000-1500 | 600-1200 | 128MB |\n| Multi-tier Enterprise | 1200-2000 | 800-1500 | 1GB |\n\n### Performance Optimization\n\n```python\n# High-performance configuration\nperformance_config = {\n    \"processing\": 
{\n        \"batch_size\": 2000,        # Larger batches for better throughput\n        \"max_workers\": 20,         # More concurrent workers\n        \"connection_pool_size\": 50, # Larger connection pool\n        \"prefetch_count\": 100      # Prefetch more emails\n    },\n\n    \"attachment_processing\": {\n        \"enable_compression\": True,     # Reduce storage I/O\n        \"enable_deduplication\": True,   # Avoid duplicate processing\n        \"concurrent_uploads\": 10,       # Parallel storage uploads\n        \"chunk_size\": 8192             # Optimal chunk size for uploads\n    },\n\n    \"storage\": {\n        \"connection_timeout\": 30,       # Reasonable timeout\n        \"retry_attempts\": 3,           # Automatic retries\n        \"use_connection_pooling\": True  # Reuse connections\n    }\n}\n```\n\n### Memory Management\n\n```mermaid\ngraph LR\n    A[Email Batch] --> B{Size Check}\n    B -->|Small| C[Database Storage]\n    B -->|Medium| D[Hybrid Processing]\n    B -->|Large| E[Stream to Object Storage]\n\n    C --> F[Memory: ~1MB per email]\n    D --> G[Memory: ~100KB per email]\n    E --> H[Memory: ~10KB per email]\n\n    F --> I[Total: 256MB for 1000 emails]\n    G --> J[Total: 100MB for 1000 emails]\n    H --> K[Total: 10MB for 1000 emails]\n```\n\n### Scaling Recommendations\n\n#### Small Deployments (< 10,000 emails/day)\n- **Configuration**: Basic database storage\n- **Resources**: 2 CPU cores, 4GB RAM\n- **Storage**: PostgreSQL with SSD storage\n\n#### Medium Deployments (10,000 - 100,000 emails/day)\n- **Configuration**: Hybrid storage with MinIO\n- **Resources**: 4 CPU cores, 8GB RAM\n- **Storage**: PostgreSQL + MinIO cluster\n\n#### Large Deployments (100,000+ emails/day)\n- **Configuration**: Multi-tier object storage\n- **Resources**: 8+ CPU cores, 16GB+ RAM\n- **Storage**: PostgreSQL + AWS S3/Azure Blob + CDN\n\n## Advanced Usage\n\n### Protocol Adapters\n\n#### Microsoft Graph API Adapter\n- **Features**: OAuth2 authentication, 
rate limiting, pagination support\n- **Configuration**: Client ID, Client Secret, Tenant ID\n- **Usage**: Modern REST API for Office 365 and Outlook.com\n\n```python\nfrom evolvishub_outlook_ingestor.protocols import GraphAPIAdapter\n\nadapter = GraphAPIAdapter(\"graph_api\", {\n    \"client_id\": \"your_client_id\",\n    \"client_secret\": \"your_client_secret\",\n    \"tenant_id\": \"your_tenant_id\",\n    \"rate_limit\": 100,  # requests per minute\n})\n```\n\n#### Exchange Web Services (EWS) Adapter\n- **Features**: Basic and OAuth2 authentication, connection pooling\n- **Configuration**: Server URL, credentials, timeout settings\n- **Usage**: On-premises Exchange servers and Exchange Online\n\n```python\nfrom evolvishub_outlook_ingestor.protocols import ExchangeWebServicesAdapter\n\nadapter = ExchangeWebServicesAdapter(\"exchange\", {\n    \"server\": \"outlook.office365.com\",\n    \"username\": \"your_email@company.com\",\n    \"password\": \"your_password\",\n    \"auth_type\": \"basic\",  # or \"oauth2\"\n})\n```\n\n#### IMAP/POP3 Adapter\n- **Features**: SSL/TLS support, folder synchronization, UID tracking\n- **Configuration**: Server details, authentication credentials\n- **Usage**: Standard email protocols for broad compatibility\n\n```python\nfrom evolvishub_outlook_ingestor.protocols import IMAPAdapter\n\nadapter = IMAPAdapter(\"imap\", {\n    \"server\": \"outlook.office365.com\",\n    \"port\": 993,\n    \"username\": \"your_email@company.com\",\n    \"password\": \"your_password\",\n    \"use_ssl\": True,\n})\n```\n\n### Database Connectors\n\n#### PostgreSQL Connector\n- **Features**: Async operations, connection pooling, JSON fields, full-text search\n- **Schema**: Optimized tables with proper indexes for email data\n- **Performance**: Batch operations, transaction support\n\n```python\nfrom evolvishub_outlook_ingestor.connectors import PostgreSQLConnector\n\nconnector = PostgreSQLConnector(\"postgresql\", {\n    \"host\": \"localhost\",\n    
\"port\": 5432,\n    \"database\": \"outlook_data\",\n    \"username\": \"postgres\",\n    \"password\": \"your_password\",\n    \"pool_size\": 20,\n})\n```\n\n#### MongoDB Connector\n- **Features**: Document storage, GridFS for large attachments, aggregation pipelines\n- **Schema**: Flexible document structure with proper indexing\n- **Scalability**: Horizontal scaling support, replica sets\n\n```python\nfrom evolvishub_outlook_ingestor.connectors import MongoDBConnector\n\nconnector = MongoDBConnector(\"mongodb\", {\n    \"host\": \"localhost\",\n    \"port\": 27017,\n    \"database\": \"outlook_data\",\n    \"username\": \"mongo_user\",\n    \"password\": \"your_password\",\n})\n```\n\n### Data Processors\n\n#### Email Processor\n- **Features**: Content normalization, HTML to text conversion, duplicate detection\n- **Capabilities**: Email validation, link extraction, encoding detection\n- **Configuration**: Customizable processing rules\n\n```python\nfrom evolvishub_outlook_ingestor.processors import EmailProcessor\n\nprocessor = EmailProcessor(\"email\", {\n    \"normalize_content\": True,\n    \"extract_links\": True,\n    \"validate_addresses\": True,\n    \"html_to_text\": True,\n    \"remove_duplicates\": True,\n})\n```\n\n#### Attachment Processor\n- **Features**: File type detection, virus scanning hooks, metadata extraction\n- **Security**: Size validation, type filtering, content analysis\n- **Optimization**: Image compression, hash calculation\n\n```python\nfrom evolvishub_outlook_ingestor.processors import AttachmentProcessor\n\nprocessor = AttachmentProcessor(\"attachment\", {\n    \"max_attachment_size\": 50 * 1024 * 1024,  # 50MB\n    \"scan_for_viruses\": True,\n    \"extract_metadata\": True,\n    \"calculate_hashes\": True,\n    \"compress_images\": True,\n})\n```\n\n## \ud83d\udd27 Advanced Usage\n\n### Hybrid Storage Configuration\n\n```python\nimport asyncio\nfrom evolvishub_outlook_ingestor.processors.enhanced_attachment_processor import (\n 
   EnhancedAttachmentProcessor,\n    StorageStrategy\n)\nfrom evolvishub_outlook_ingestor.connectors.minio_connector import MinIOConnector\nfrom evolvishub_outlook_ingestor.connectors.aws_s3_connector import AWSS3Connector\n\nasync def setup_hybrid_storage():\n    # Configure storage backends\n    minio_config = {\n        \"endpoint_url\": \"localhost:9000\",\n        \"access_key\": \"minioadmin\",\n        \"secret_key\": \"minioadmin\",\n        \"bucket_name\": \"email-attachments-hot\",\n        \"use_ssl\": False\n    }\n\n    s3_config = {\n        \"access_key\": \"your_aws_access_key\",\n        \"secret_key\": \"your_aws_secret_key\",\n        \"bucket_name\": \"email-attachments-archive\",\n        \"region\": \"us-east-1\"\n    }\n\n    # Initialize storage connectors\n    minio_connector = MinIOConnector(\"hot_storage\", minio_config)\n    s3_connector = AWSS3Connector(\"archive_storage\", s3_config)\n\n    # Configure enhanced processor with storage rules\n    processor_config = {\n        \"storage_strategy\": \"hybrid\",\n        \"size_threshold\": 1024 * 1024,  # 1MB\n        \"enable_compression\": True,\n        \"enable_deduplication\": True,\n        \"storage_rules\": [\n            {\n                \"name\": \"large_files\",\n                \"condition\": \"size > 5*1024*1024\",  # Files > 5MB\n                \"strategy\": \"storage_only\",\n                \"storage_backend\": \"archive_storage\"\n            },\n            {\n                \"name\": \"medium_files\",\n                \"condition\": \"size > 1024*1024 and size <= 5*1024*1024\",\n                \"strategy\": \"hybrid\",\n                \"storage_backend\": \"hot_storage\"\n            },\n            {\n                \"name\": \"small_files\",\n                \"condition\": \"size <= 1024*1024\",\n                \"strategy\": \"database_only\"\n            }\n        ]\n    }\n\n    # Create enhanced processor\n    processor = 
EnhancedAttachmentProcessor(\"hybrid_attachments\", processor_config)\n\n    # Add storage backends\n    async with minio_connector, s3_connector:\n        await processor.add_storage_backend(\"hot_storage\", minio_connector)\n        await processor.add_storage_backend(\"archive_storage\", s3_connector)\n\n        # Process emails with hybrid storage\n        result = await processor.process(email_with_attachments)\n\n        # Generate secure URLs for attachment access\n        for storage_info in result.metadata.get(\"storage_infos\", []):\n            if storage_info.get(\"storage_backend\"):\n                backend = processor.storage_backends[storage_info[\"storage_backend\"]]\n                secure_url = await backend.generate_presigned_url(\n                    storage_info[\"storage_location\"],\n                    expires_in=3600  # 1 hour\n                )\n                print(f\"Secure URL: {secure_url}\")\n\nasyncio.run(setup_hybrid_storage())\n```\n\n### Custom Protocol Adapter\n\n```python\nfrom evolvishub_outlook_ingestor.protocols import BaseProtocol\nfrom evolvishub_outlook_ingestor.core.data_models import EmailMessage\n\nclass CustomProtocol(BaseProtocol):\n    async def _fetch_emails_impl(self, **kwargs):\n        # Implement custom email fetching logic\n        emails = []\n        # ... 
fetch emails from custom source\n        return emails\n\n# Use custom protocol\ningestor = OutlookIngestor(\n    settings=settings,\n    protocol_adapters={\"custom\": CustomProtocol(\"custom\", config)}\n)\n```\n\n### Batch Processing with Progress Tracking\n\n```python\nfrom evolvishub_outlook_ingestor.core.data_models import BatchProcessingConfig\n\nasync def process_with_progress():\n    def progress_callback(processed, total, rate):\n        print(f\"Progress: {processed}/{total} ({rate:.2f} emails/sec)\")\n    \n    batch_config = BatchProcessingConfig(\n        batch_size=500,\n        max_workers=8,\n        progress_callback=progress_callback\n    )\n    \n    result = await ingestor.process_emails(\n        protocol=\"exchange\",\n        database=\"mongodb\",\n        batch_config=batch_config\n    )\n```\n\n### Database Transactions\n\n```python\nfrom evolvishub_outlook_ingestor.connectors import PostgreSQLConnector\n\nasync def transactional_processing():\n    connector = PostgreSQLConnector(\"postgres\", config)\n    await connector.initialize()\n    \n    async with connector.transaction() as tx:\n        # All operations within this block are transactional\n        for email in emails:\n            await connector.store_email(email, transaction=tx)\n        # Automatically commits on success, rolls back on error\n```\n\n## \ud83c\udfd7\ufe0f Architecture\n\n### Component Overview\n\n```\n\u250c\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2510    \u250c\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2510    \u250c\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2510\n\u2502   Protocols     \u2502    \u2502   Processors     \u2502    \u2502   Connectors    \u2502\n\u2502                 \u2502    \u2502                  \u2502    \u2502                 \u2502\n\u2502 \u2022 
Exchange EWS  \u2502\u2500\u2500\u2500\u25b6\u2502 \u2022 Email Proc.    \u2502\u2500\u2500\u2500\u25b6\u2502 \u2022 PostgreSQL    \u2502\n\u2502 \u2022 Graph API     \u2502    \u2502 \u2022 Attachment     \u2502    \u2502 \u2022 MongoDB       \u2502\n\u2502 \u2022 IMAP/POP3     \u2502    \u2502 \u2022 Batch Proc.    \u2502    \u2502 \u2022 MySQL         \u2502\n\u2514\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2518    \u2514\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2518    \u2514\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2518\n         \u2502                       \u2502                       \u2502\n         \u2514\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u253c\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2518\n                                 \u2502\n                    \u250c\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2510\n                    \u2502   Core Framework    \u2502\n                    \u2502                     \u2502\n                    \u2502 \u2022 Configuration     \u2502\n                    \u2502 \u2022 Logging           \u2502\n                    \u2502 \u2022 Error Handling    \u2502\n                    \u2502 \u2022 Retry Logic       \u2502\n                    \u2502 \u2022 Metrics           \u2502\n                    \u2514\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2518\n```\n\n### Design Patterns\n\n- **Strategy Pattern**: Interchangeable protocol adapters\n- **Factory Pattern**: Dynamic 
component creation\n- **Repository Pattern**: Database abstraction\n- **Observer Pattern**: Progress and metrics tracking\n- **Circuit Breaker**: Fault tolerance\n\n## \ud83e\uddea Testing\n\n```bash\n# Run all tests\npytest\n\n# Run with coverage\npytest --cov=evolvishub_outlook_ingestor --cov-report=html\n\n# Run specific test categories\npytest -m unit          # Unit tests only\npytest -m integration   # Integration tests only\npytest -m performance   # Performance tests only\n\n# Run tests in parallel\npytest -n auto\n```\n\n## \ud83d\udcca Performance\n\n### Benchmarks\n\n- **Email Processing**: 1000+ emails/minute\n- **Memory Usage**: <100MB for 10K emails\n- **Database Throughput**: 500+ inserts/second\n- **Concurrent Connections**: 50+ simultaneous\n\n### Optimization Tips\n\n1. **Use Batch Processing**: Process emails in batches for better throughput\n2. **Enable Connection Pooling**: Reuse database connections\n3. **Configure Rate Limiting**: Avoid API throttling\n4. **Monitor Memory Usage**: Use streaming for large datasets\n5. **Tune Worker Count**: Match your system's CPU cores\n\n## \ud83d\udd0d Monitoring\n\n### Metrics Collection\n\n```python\n# Enable Prometheus metrics\nsettings.monitoring.enable_metrics = True\nsettings.monitoring.metrics_port = 8000\n\n# Access metrics endpoint\n# http://localhost:8000/metrics\n```\n\n### Health Checks\n\n```python\n# Check component health\nstatus = await ingestor.get_status()\nprint(f\"Protocol Status: {status['protocols']}\")\nprint(f\"Database Status: {status['database']}\")\n```\n\n## \ud83e\udd1d Contributing\n\n1. Fork the repository\n2. Create a feature branch (`git checkout -b feature/amazing-feature`)\n3. Commit your changes (`git commit -m 'Add amazing feature'`)\n4. Push to the branch (`git push origin feature/amazing-feature`)\n5. 
Open a Pull Request\n\n### Development Setup\n\n```bash\n# Clone repository\ngit clone https://github.com/evolvisai/metcal.git\ncd metcal/shared/libs/evolvis-outlook-ingestor\n\n# Install development dependencies\npip install -e \".[dev]\"\n\n# Run pre-commit hooks\npre-commit install\n\n# Run tests\npytest\n```\n\n## \ud83d\udcda API Reference\n\n### Core Components\n\n**Protocol Adapters**\n- `GraphAPIAdapter`: Microsoft Graph API integration\n- `ExchangeWebServicesAdapter`: Exchange Web Services (EWS) support\n- `IMAPAdapter`: IMAP protocol support\n\n**Database Connectors**\n- `PostgreSQLConnector`: PostgreSQL database integration\n- `MongoDBConnector`: MongoDB database integration\n\n**Data Processors**\n- `EmailProcessor`: Email content processing and normalization\n- `AttachmentProcessor`: Attachment handling and security scanning\n\n**Security Utilities**\n- `SecureCredentialManager`: Credential encryption and management\n- `CredentialMasker`: Sensitive data masking for logs\n- `InputSanitizer`: Input validation and sanitization\n\n### Configuration Reference\n\n```python\n# Complete configuration example\nconfig = {\n    \"graph_api\": {\n        \"client_id\": \"your_client_id\",\n        \"client_secret\": \"your_client_secret\",\n        \"tenant_id\": \"your_tenant_id\",\n        \"rate_limit\": 100,  # requests per minute\n        \"timeout\": 30,      # request timeout in seconds\n    },\n    \"database\": {\n        \"host\": \"localhost\",\n        \"port\": 5432,\n        \"database\": \"outlook_ingestor\",\n        \"username\": \"ingestor_user\",\n        \"password\": \"secure_password\",\n        \"ssl_mode\": \"require\",\n        \"enable_connection_pooling\": True,\n        \"pool_size\": 10,\n    },\n    \"email_processing\": {\n        \"normalize_content\": True,\n        \"extract_links\": True,\n        
\"validate_addresses\": True,\n        \"html_to_text\": True,\n        \"remove_duplicates\": True,\n    },\n    \"attachment_processing\": {\n        \"max_attachment_size\": 10 * 1024 * 1024,  # 10MB\n        \"extract_metadata\": True,\n        \"calculate_hashes\": True,\n        \"scan_for_viruses\": False,\n    }\n}\n```\n\n### Error Handling\n\n```python\nfrom evolvishub_outlook_ingestor.core.exceptions import (\n    ConnectionError,\n    AuthenticationError,\n    DatabaseError,\n    ProcessingError,\n    ValidationError,\n)\n\ntry:\n    await protocol.fetch_emails()\nexcept AuthenticationError:\n    # Handle authentication issues\n    print(\"Check your API credentials\")\nexcept ConnectionError:\n    # Handle network/connection issues\n    print(\"Check network connectivity\")\nexcept ProcessingError as e:\n    # Handle processing errors\n    print(f\"Processing failed: {e}\")\n```\n\n## Support and Documentation\n\n### Documentation Resources\n- **[Storage Architecture Guide](docs/STORAGE_ARCHITECTURE.md)** - Comprehensive guide to hybrid storage configuration\n- **[Migration Guide](docs/MIGRATION_GUIDE.md)** - Step-by-step migration from basic to hybrid storage\n- **[API Reference](docs/API_REFERENCE.md)** - Complete API documentation with examples\n- **[Performance Tuning](docs/PERFORMANCE_TUNING.md)** - Optimization guidelines for large-scale deployments\n\n### Community and Support\n- **[GitHub Issues](https://github.com/evolvisai/metcal/issues)** - Bug reports and feature requests\n- **[GitHub Discussions](https://github.com/evolvisai/metcal/discussions)** - Community discussions and Q&A\n- **[Examples Directory](examples/)** - Comprehensive usage examples and tutorials\n\n### Enterprise Support\nFor enterprise deployments requiring dedicated support, custom integrations, or professional services, please contact our team for tailored solutions and SLA-backed support options.\n\n## Technical Specifications\n\n### Supported Platforms\n- **Operating 
Systems**: Linux (Ubuntu 18.04+, CentOS 7+), macOS (10.15+), Windows (10+)\n- **Python Versions**: 3.9, 3.10, 3.11, 3.12\n- **Database Systems**: PostgreSQL 12+, MongoDB 4.4+, MySQL 8.0+\n- **Object Storage**: MinIO, AWS S3, Azure Blob Storage, Google Cloud Storage\n\n### Performance Characteristics\n- **Throughput**: Up to 2,000 emails/minute with hybrid storage\n- **Concurrency**: Support for 50+ concurrent processing workers\n- **Memory Efficiency**: <10KB per email with object storage strategy\n- **Storage Optimization**: Up to 70% reduction in database size with intelligent routing\n\n### Security Compliance\n- **Encryption**: AES-256 encryption for credentials and sensitive data\n- **Authentication**: OAuth2, Basic Auth, and certificate-based authentication\n- **Access Control**: Role-based access control and audit logging\n- **Compliance**: GDPR, HIPAA, and SOX compliance features available\n\n## Acknowledgments\n\nThis project is built on top of excellent open-source technologies:\n\n- **[Pydantic](https://pydantic.dev/)** - Data validation and settings management\n- **[SQLAlchemy](https://sqlalchemy.org/)** - Database ORM with async support\n- **[asyncio](https://docs.python.org/3/library/asyncio.html)** - Asynchronous programming framework\n- **[pytest](https://pytest.org/)** - Testing framework with async support\n- **[Black](https://black.readthedocs.io/)**, **[isort](https://pycqa.github.io/isort/)**, **[mypy](https://mypy.readthedocs.io/)** - Code quality and type checking tools\n\n## License\n\n### Evolvis AI License\n\nThis software is proprietary to **Evolvis AI** and is protected by copyright and other intellectual property laws.\n\n#### \ud83d\udccb **License Terms**\n\n- **\u2705 Evaluation and Non-Commercial Use**: This package is available for evaluation, research, and non-commercial use\n- **\u26a0\ufe0f Commercial Use Restrictions**: Commercial or production use of this library requires a valid Evolvis AI License\n- **\ud83d\udeab 
Redistribution Prohibited**: Redistribution or commercial use without proper licensing is strictly prohibited\n\n#### \ud83d\udcbc **Commercial Licensing**\n\nFor commercial licensing, production deployments, or enterprise use, please contact:\n\n**Montgomery Miralles**\n\ud83d\udce7 **Email**: [m.miralles@evolvis.ai](mailto:m.miralles@evolvis.ai)\n\ud83c\udfe2 **Company**: Evolvis AI\n\ud83c\udf10 **Website**: [https://evolvis.ai](https://evolvis.ai)\n\n#### \u2696\ufe0f **Important Notice**\n\n> **Commercial users must obtain proper licensing before deploying this software in production environments.** Unauthorized commercial use may result in legal action. Contact Montgomery Miralles for licensing agreements and compliance requirements.\n\n#### \ud83d\udcc4 **Full License**\n\nFor complete license terms and conditions, see the [LICENSE](LICENSE) file included with this distribution.\n\n---\n\n**Evolvishub Outlook Ingestor** - Enterprise-grade email ingestion with intelligent hybrid storage architecture.\n",
## Package Metadata

### Project Links

- **Homepage**: [https://github.com/evolvisai/metcal](https://github.com/evolvisai/metcal)
- **Repository**: [https://github.com/evolvisai/metcal.git](https://github.com/evolvisai/metcal.git)
- **Documentation**: [docs](https://github.com/evolvisai/metcal/tree/main/shared/libs/evolvis-outlook-ingestor/docs)
- **Examples**: [examples](https://github.com/evolvisai/metcal/tree/main/shared/libs/evolvis-outlook-ingestor/examples)
- **Changelog**: [CHANGELOG.md](https://github.com/evolvisai/metcal/blob/main/shared/libs/evolvis-outlook-ingestor/CHANGELOG.md)
- **Issues**: [https://github.com/evolvisai/metcal/issues](https://github.com/evolvisai/metcal/issues)

### Release Files (1.0.2)

| File | Type | Size | Uploaded (UTC) |
|------|------|------|----------------|
| `evolvishub_outlook_ingestor-1.0.2-py3-none-any.whl` | wheel (py3, requires Python >=3.9) | 153,970 bytes | 2025-10-06 22:41:10 |
| `evolvishub_outlook_ingestor-1.0.2.tar.gz` | sdist (requires Python >=3.9) | 185,062 bytes | 2025-10-06 22:41:12 |

SHA-256 digests:

- `evolvishub_outlook_ingestor-1.0.2-py3-none-any.whl`: `690bd9d04c272ff4771628fb590a6e3ead227ef78575b08c7130820a6b0c652c`
- `evolvishub_outlook_ingestor-1.0.2.tar.gz`: `dcc5248cffd7d4c5208b4f940e6ec5238c4244de4c2e188e223ed0f67ca4aff2`

Neither file is signed or yanked.
        