ai-pipeline-core


Name: ai-pipeline-core
Version: 0.1.14
Summary: Core utilities for AI-powered processing pipelines using Prefect
Upload time: 2025-09-01 15:46:57
Requires Python: >=3.12
License: MIT
Author email: bbarwik <bbarwik@gmail.com>
Homepage: https://github.com/bbarwik/ai-pipeline-core
# AI Pipeline Core

A high-performance async framework for building type-safe AI pipelines with LLMs, document processing, and workflow orchestration.

[![Python Version](https://img.shields.io/badge/python-3.12%2B-blue)](https://www.python.org/downloads/)
[![License: MIT](https://img.shields.io/badge/License-MIT-yellow.svg)](https://opensource.org/licenses/MIT)
[![Code Style: Ruff](https://img.shields.io/badge/code%20style-ruff-000000.svg)](https://github.com/astral-sh/ruff)
[![Type Checked: Basedpyright](https://img.shields.io/badge/type%20checked-basedpyright-blue)](https://github.com/DetachHead/basedpyright)

## Overview

AI Pipeline Core is a production-ready framework that combines document processing, LLM integration, and workflow orchestration into a unified system. Built with strong typing (Pydantic), automatic retries, cost tracking, and distributed tracing, it enforces best practices while maintaining high performance through fully async operations.

### Key Features

- **Document Processing**: Type-safe handling of text, JSON, YAML, PDFs, and images with automatic MIME type detection and provenance tracking
- **LLM Integration**: Unified interface to any model via LiteLLM proxy with configurable context caching
- **Structured Output**: Type-safe generation with Pydantic model validation
- **Workflow Orchestration**: Prefect-based flows and tasks with automatic retries
- **Observability**: Built-in distributed tracing via Laminar (LMNR) with cost tracking for debugging and monitoring
- **Local Development**: Simple runner for testing pipelines without infrastructure

## Installation

```bash
pip install ai-pipeline-core
```

### Requirements

- Python 3.12 or higher
- Linux/macOS (Windows via WSL2)

### Development Installation

```bash
git clone https://github.com/bbarwik/ai-pipeline-core.git
cd ai-pipeline-core
pip install -e ".[dev]"
make install-dev  # Installs pre-commit hooks
```

## Quick Start

### Basic Pipeline

```python
from ai_pipeline_core import (
    pipeline_flow,
    FlowDocument,
    DocumentList,
    FlowOptions,
    FlowConfig,
    llm,
    AIMessages
)

# Define document types
class InputDoc(FlowDocument):
    """Input document for processing."""

class OutputDoc(FlowDocument):
    """Analysis result document."""

# Define flow configuration
class AnalysisConfig(FlowConfig):
    INPUT_DOCUMENT_TYPES = [InputDoc]
    OUTPUT_DOCUMENT_TYPE = OutputDoc

# Create pipeline flow
@pipeline_flow
async def analyze_flow(
    project_name: str,
    documents: DocumentList,
    flow_options: FlowOptions
) -> DocumentList:
    config = AnalysisConfig()

    # Process documents
    outputs = []
    for doc in documents:
        # Use AIMessages for LLM interaction
        response = await llm.generate(
            model="gpt-5",
            messages=AIMessages([doc])
        )

        output = OutputDoc.create(
            name=f"analysis_{doc.name}",
            content=response.content
        )
        outputs.append(output)

    # RECOMMENDED: Always validate output
    return config.create_and_validate_output(outputs)
```
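
To try the flow locally without any Prefect infrastructure (the "Local Development" feature above), here is a minimal sketch; it assumes a `@pipeline_flow`-decorated coroutine is directly awaitable and that `FlowOptions` can be constructed with defaults:

```python
import asyncio

from ai_pipeline_core import DocumentList, FlowOptions

async def main():
    # InputDoc is the document class defined in the pipeline above
    docs = DocumentList([InputDoc.create(name="review.txt", content="Great product!")])
    results = await analyze_flow(
        project_name="demo",
        documents=docs,
        flow_options=FlowOptions(),  # assumption: default options suffice
    )
    for doc in results:
        print(doc.name)

asyncio.run(main())
```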

### Structured Output

```python
from pydantic import BaseModel
from ai_pipeline_core import llm

class Analysis(BaseModel):
    summary: str
    sentiment: float
    key_points: list[str]

# Generate structured output
response = await llm.generate_structured(
    model="gpt-5",
    response_format=Analysis,
    messages="Analyze this product review: ..."
)

# Access parsed result with type safety
analysis = response.parsed
print(f"Sentiment: {analysis.sentiment}")
for point in analysis.key_points:
    print(f"- {point}")
```

### Document Handling

```python
from ai_pipeline_core import FlowDocument, TemporaryDocument

class MyDocument(FlowDocument):
    """Custom document type."""

# Create documents with automatic conversion
doc = MyDocument.create(
    name="data.json",
    content={"key": "value"}  # Automatically converted to JSON bytes
)

# Parse back to original type
data = doc.parse(dict)  # Returns {"key": "value"}

# Document provenance tracking (new in v0.1.14)
# source_doc is any previously created document; its sha256 digest records provenance
doc_with_sources = MyDocument.create(
    name="derived.json",
    content={"result": "processed"},
    sources=[source_doc.sha256, "https://api.example.com/data"]
)

# Check provenance
for source_hash in doc_with_sources.get_source_documents():
    print(f"Derived from document: {source_hash}")
for ref in doc_with_sources.get_source_references():
    print(f"External source: {ref}")

# Temporary documents (never persisted)
temp = TemporaryDocument.create(
    name="api_response.json",
    content={"status": "ok"}
)
```

## Core Concepts

### Documents

Documents are immutable Pydantic models that wrap binary content with metadata:

- **FlowDocument**: Persists across flow runs, saved to filesystem
- **TaskDocument**: Temporary within task execution, not persisted
- **TemporaryDocument**: Never persisted, useful for sensitive data

```python
class MyDocument(FlowDocument):
    """Custom document type."""

# Use create() for automatic conversion
doc = MyDocument.create(
    name="data.json",
    content={"key": "value"}  # Auto-converts to JSON
)

# Access content
if doc.is_text:
    print(doc.text)

# Parse structured data
data = doc.as_json()  # or as_yaml(), as_pydantic_model()

# Enhanced filtering (new in v0.1.14); documents is a DocumentList
filtered = documents.filter_by([Doc1, Doc2, Doc3])  # Multiple document types
named = documents.filter_by(["file1.txt", "file2.txt"])  # Multiple names
```

### LLM Integration

The framework provides a unified interface for LLM interactions with smart caching:

```python
from ai_pipeline_core import llm, AIMessages, ModelOptions

# Simple generation
response = await llm.generate(
    model="gpt-5",
    messages="Explain quantum computing"
)
print(response.content)

# With context caching (can save 50-90% of input tokens on repeated context)
# large_document is any document (or string) you reuse across calls
static_context = AIMessages([large_document])

# First call: caches context
r1 = await llm.generate(
    model="gpt-5",
    context=static_context,  # Cached for 120 seconds by default
    messages="Summarize"     # Dynamic query
)

# Second call: reuses cache
r2 = await llm.generate(
    model="gpt-5",
    context=static_context,  # Reused from cache!
    messages="Key points?"   # Different query
)

# Custom cache TTL (new in v0.1.14)
response = await llm.generate(
    model="gpt-5",
    context=static_context,
    messages="Analyze",
    options=ModelOptions(cache_ttl="300s")  # Cache for 5 minutes
)

# Disable caching for dynamic contexts
response = await llm.generate(
    model="gpt-5",
    context=dynamic_context,
    messages="Process",
    options=ModelOptions(cache_ttl=None)  # No caching
)
```
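
Note that prompt caches typically match on an exact prefix: keep the `context` content byte-for-byte identical between calls, because any change to the cached context invalidates the cache and the next call pays full price for those tokens again.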

### Flow Configuration

Type-safe flow configuration ensures proper document flow:

```python
from ai_pipeline_core import DocumentList, FlowConfig, FlowOptions, pipeline_flow

# RawDataDocument and ProcessedDocument are FlowDocument subclasses defined elsewhere
class ProcessingConfig(FlowConfig):
    INPUT_DOCUMENT_TYPES = [RawDataDocument]
    OUTPUT_DOCUMENT_TYPE = ProcessedDocument  # Must differ from every input type!

# Use the config in flows for output validation
@pipeline_flow
async def process(
    project_name: str,
    documents: DocumentList,
    flow_options: FlowOptions
) -> DocumentList:
    config = ProcessingConfig()
    # ... processing logic that builds `outputs` ...
    return config.create_and_validate_output(outputs)

### Pipeline Decorators

Enhanced decorators with built-in tracing and monitoring:

```python
from ai_pipeline_core import pipeline_flow, pipeline_task, set_trace_cost

@pipeline_task  # Automatic retry, tracing, and monitoring
async def process_chunk(data: str) -> str:
    result = await transform(data)  # transform() is your own async processing function
    set_trace_cost(0.05)  # Track costs (new in v0.1.14)
    return result

@pipeline_flow  # Full observability and orchestration
async def main_flow(
    project_name: str,
    documents: DocumentList,
    flow_options: FlowOptions
) -> DocumentList:
    # Your pipeline logic builds `results`, a list of output documents
    return DocumentList(results)
```

## Configuration

### Environment Variables

```bash
# LLM Configuration (via LiteLLM proxy)
OPENAI_BASE_URL=http://localhost:4000
OPENAI_API_KEY=your-api-key

# Optional: Observability
LMNR_PROJECT_API_KEY=your-lmnr-key
LMNR_DEBUG=true  # Enable debug traces

# Optional: Orchestration
PREFECT_API_URL=http://localhost:4200/api
PREFECT_API_KEY=your-prefect-key
```

### Settings Management

Create custom settings by inheriting from the base Settings class:

```python
from ai_pipeline_core import Settings

class ProjectSettings(Settings):
    """Project-specific configuration."""
    app_name: str = "my-app"
    max_retries: int = 3
    enable_cache: bool = True

# Create singleton instance
settings = ProjectSettings()

# Access configuration
print(settings.openai_base_url)
print(settings.app_name)
```
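
As the example suggests, framework fields such as `openai_base_url` are populated from the matching environment variables (`OPENAI_BASE_URL` above), so a single settings instance carries both framework and project configuration. Project-specific fields like `app_name` can presumably be overridden the same way (e.g. `APP_NAME=other-app`), following the usual pydantic-settings convention.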

## Best Practices

### Framework Rules (90% of use cases)

1. **Decorators**: Use `@trace`, `@pipeline_task`, `@pipeline_flow` WITHOUT parameters
2. **Logging**: Use `get_pipeline_logger(__name__)` - NEVER `print()` or the stdlib `logging` module
3. **LLM calls**: Use `AIMessages` or `str`. Wrap Documents in `AIMessages`
4. **Options**: Omit `ModelOptions` unless specifically needed (defaults are optimal)
5. **Documents**: Create with just `name` and `content` - skip `description`
6. **FlowConfig**: `OUTPUT_DOCUMENT_TYPE` must differ from all `INPUT_DOCUMENT_TYPES`
7. **Initialization**: `PromptManager` and logger at module scope, not in functions
8. **DocumentList**: Use default constructor - no validation flags needed
9. **setup_logging()**: Call only in the application's `main()`, never at import time (see the sketch below)
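
Rules 2, 7, and 9 combine into the following module layout. A minimal sketch, assuming `get_pipeline_logger` and `setup_logging` are both exported from the top-level package (per the import convention below):

```python
from ai_pipeline_core import get_pipeline_logger, setup_logging

# Rule 7: initialize the logger once, at module scope
logger = get_pipeline_logger(__name__)

def main():
    setup_logging()  # Rule 9: configure logging only at the application entry point
    logger.info("pipeline starting")  # Rule 2: no print(), no stdlib logging

if __name__ == "__main__":
    main()
```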

### Import Convention

Always import from the top-level package:

```python
# CORRECT
from ai_pipeline_core import llm, pipeline_flow, FlowDocument

# WRONG - Never import from submodules
from ai_pipeline_core.llm import generate  # NO!
from ai_pipeline_core.documents import FlowDocument  # NO!
```

## Development

### Running Tests

```bash
make test           # Run all tests
make test-cov       # Run with coverage report
make test-showcase  # Test showcase example
```

### Code Quality

```bash
make lint      # Run linting
make format    # Auto-format code
make typecheck # Type checking with basedpyright
```

### Building Documentation

```bash
make docs-build  # Generate API.md
make docs-check  # Verify documentation is up-to-date
```

## Examples

The `examples/` directory contains:

- `showcase.py` - Comprehensive example demonstrating all major features
- Run with: `cd examples && python showcase.py /path/to/documents`

## API Reference

See [API.md](API.md) for complete API documentation.

### Navigation Tips

For humans:
```bash
grep -n '^##' API.md    # List all main sections
grep -n '^###' API.md   # List all classes and functions
grep -n '^####' API.md  # List all methods and properties
```

For AI assistants:
- Use pattern `^##` to find module sections
- Use pattern `^###` for classes and functions
- Use pattern `^####` for methods and properties

## Project Structure

```
ai-pipeline-core/
├── ai_pipeline_core/
│   ├── documents/      # Document abstraction system
│   ├── flow/           # Flow configuration and options
│   ├── llm/            # LLM client and response handling
│   ├── logging/        # Logging infrastructure
│   ├── tracing.py      # Distributed tracing
│   ├── pipeline.py     # Pipeline decorators
│   ├── prompt_manager.py # Jinja2 template management
│   └── settings.py     # Configuration management
├── tests/              # Comprehensive test suite
├── examples/           # Usage examples
├── API.md             # Complete API reference
└── pyproject.toml     # Project configuration
```

## Contributing

1. Fork the repository
2. Create a feature branch (`git checkout -b feature/amazing-feature`)
3. Make changes following the project's style guide
4. Run tests and linting (`make test lint typecheck`)
5. Commit your changes
6. Push to the branch (`git push origin feature/amazing-feature`)
7. Open a Pull Request

## License

This project is licensed under the MIT License - see the [LICENSE](LICENSE) file for details.

## Support

- **Issues**: [GitHub Issues](https://github.com/bbarwik/ai-pipeline-core/issues)
- **Discussions**: [GitHub Discussions](https://github.com/bbarwik/ai-pipeline-core/discussions)
- **Documentation**: [API Reference](API.md)

## Acknowledgments

- Built on [Prefect](https://www.prefect.io/) for workflow orchestration
- Uses [LiteLLM](https://github.com/BerriAI/litellm) for LLM provider abstraction
- Integrates [Laminar (LMNR)](https://www.lmnr.ai/) for observability
- Type checking with [Pydantic](https://pydantic.dev/) and [basedpyright](https://github.com/DetachHead/basedpyright)

            
