compression-prompt

Name: compression-prompt
Version: 0.1.1
Home page: https://github.com/hivellm/compression-prompt
Summary: Fast statistical compression for LLM prompts - 50% token reduction with 91% quality retention
Upload time: 2025-10-24 00:02:51
Author: HiveLLM Team
Requires Python: >=3.8
Keywords: llm, compression, prompt, optimization, token-reduction, nlp, ai

# Compression Prompt - Python Implementation

> Fast, intelligent prompt compression for LLMs - Save 50% tokens while maintaining 91% quality

Python port of the Rust implementation. Achieves **50% token reduction** with **91% quality retention** using pure statistical filtering.

## Quick Start

### Installation
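
Install the published package from PyPI:

```bash
pip install compression-prompt
```

Or install from a clone of the repository in editable mode: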

```bash
cd python
pip install -e .
```

Or include the development dependencies:

```bash
pip install -e ".[dev]"  # With development dependencies
```

### Basic Usage

```python
from compression_prompt import Compressor

# Use default configuration (50% compression)
compressor = Compressor()

text = """
Your long text here...
This will be compressed using statistical filtering
to save 50% tokens while maintaining quality.
"""

result = compressor.compress(text)

print(f"Original: {result.original_tokens} tokens")
print(f"Compressed: {result.compressed_tokens} tokens")
print(f"Saved: {result.tokens_removed} tokens ({(1-result.compression_ratio)*100:.1f}%)")
print(f"\nCompressed text:\n{result.compressed}")
```

### Advanced Configuration

```python
from compression_prompt import (
    Compressor, CompressorConfig, 
    StatisticalFilterConfig
)

# Custom compression ratio
config = CompressorConfig(target_ratio=0.7)  # Keep 70% of tokens
filter_config = StatisticalFilterConfig(
    compression_ratio=0.7,
    idf_weight=0.3,
    position_weight=0.2,
    pos_weight=0.2,
    entity_weight=0.2,
    entropy_weight=0.1,
)

compressor = Compressor(config, filter_config)
result = compressor.compress(text)
```

### Quality Metrics

```python
from compression_prompt import QualityMetrics

original = "Your original text..."
compressed = "Your compressed text..."

metrics = QualityMetrics.calculate(original, compressed)
print(metrics.format())
```

Output:
```
Quality Metrics:
- Keyword Retention: 92.0%
- Entity Retention: 89.5%
- Vocabulary Ratio: 78.3%
- Info Density: 0.845
- Overall Score: 89.2%
```
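
Putting the two together, a compress-then-verify step is a natural pattern. The sketch below assumes the overall score is exposed as a 0-1 `overall_score` attribute; that attribute name is a guess based on the formatted output above:

```python
from compression_prompt import Compressor, QualityMetrics

compressor = Compressor()
result = compressor.compress(long_text)  # long_text: any large prompt

# Only use the compressed text if it clears a quality bar;
# `overall_score` is an assumed attribute name (0.0-1.0).
metrics = QualityMetrics.calculate(long_text, result.compressed)
context = result.compressed if metrics.overall_score >= 0.85 else long_text
```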

### Command Line Usage

```bash
# Compress file to stdout
compress input.txt

# Conservative compression (keep 70% of tokens)
compress -r 0.7 input.txt

# Aggressive compression (keep 30% of tokens)
compress -r 0.3 input.txt

# Show statistics
compress -s input.txt

# Save to file
compress -o output.txt input.txt

# Read from stdin
cat input.txt | compress
```
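
The flags compose with ordinary shell loops for batch work. A minimal sketch, assuming a `docs/` directory of `.txt` files and using only the flags shown above:

```bash
# Compress every .txt file in docs/ into compressed/, keeping 50% of tokens
mkdir -p compressed
for f in docs/*.txt; do
    compress -r 0.5 -o "compressed/$(basename "$f")" "$f"
done
```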

## Features

- ✅ **Zero Dependencies**: Pure Python implementation, no external libraries required
- ✅ **Fast**: Optimized statistical filtering
- ✅ **Multilingual**: Supports 10+ languages (EN, ES, PT, FR, DE, IT, RU, ZH, JA, AR, HI)
- ✅ **Smart Filtering**: Preserves code blocks, JSON, paths, identifiers (see the sketch after this list)
- ✅ **Contextual**: Intelligent stopword handling based on context
- ✅ **Customizable**: Fine-tune weights and parameters for your use case
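
As a quick way to see the protection masks in action, the sketch below compresses a short mixed prose-and-structure text; per the documented behavior, the JSON payload and file path should survive intact. The demo text is hypothetical, and the size thresholds are lowered so this small input is actually compressed:

```python
from compression_prompt import Compressor, CompressorConfig

# Lower the size thresholds so this short demo text is compressed
# (the defaults skip inputs under 100 tokens / 1024 bytes).
config = CompressorConfig(min_input_tokens=10, min_input_bytes=64)
compressor = Compressor(config)

mixed = (
    "The service writes its configuration to /etc/myapp/config.json on "
    "startup, and generally speaking you should not edit that file by "
    "hand unless you are quite sure you know what you are doing. The "
    'default payload is {"retries": 3, "timeout_ms": 500}.'
)

result = compressor.compress(mixed)
print(result.compressed)  # path and JSON expected to be preserved
```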

## Configuration Options

### CompressorConfig

```python
CompressorConfig(
    target_ratio=0.5,        # Fraction of tokens to keep (0.0-1.0)
    min_input_tokens=100,    # Minimum tokens to attempt compression
    min_input_bytes=1024     # Minimum bytes to attempt compression
)
```
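
The two minimums guard against mangling inputs that are too short to benefit. A sketch of the expected behavior, assuming below-threshold inputs are passed through unchanged (implied by "minimum ... to attempt compression", though not stated explicitly):

```python
from compression_prompt import Compressor, CompressorConfig

compressor = Compressor(CompressorConfig(min_input_tokens=100))

short = "Too short to be worth compressing."
result = compressor.compress(short)

# Assumed pass-through: below the threshold, compression is not
# attempted, so the output should equal the input.
assert result.compressed == short
```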

### StatisticalFilterConfig

```python
StatisticalFilterConfig(
    compression_ratio=0.5,              # Keep 50% of tokens
    
    # Feature weights (sum should be ~1.0)
    idf_weight=0.3,                     # Inverse document frequency
    position_weight=0.2,                # Position in text (start/end important)
    pos_weight=0.2,                     # Part-of-speech heuristics
    entity_weight=0.2,                  # Named entity detection
    entropy_weight=0.1,                 # Local vocabulary diversity
    
    # Protection features
    enable_protection_masks=True,       # Protect code/JSON/paths
    enable_contextual_stopwords=True,   # Smart stopword filtering
    preserve_negations=True,            # Keep "not", "never", etc.
    preserve_comparators=True,          # Keep ">=", "!=", etc.
    
    # Domain-specific
    domain_terms=["YourTerm"],          # Always preserve these terms
    min_gap_between_critical=3          # Fill gaps between important tokens
)
```
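
For intuition: each token gets per-feature scores, the five weights blend them into a single importance score, and the highest-scoring tokens are kept until the target ratio is met. The sketch below is a hypothetical illustration of that weighting scheme, not the library's actual internals:

```python
# Hypothetical illustration only; the real feature extraction differs.
def token_score(features: dict, cfg) -> float:
    """Blend per-token feature scores using the configured weights."""
    return (
        cfg.idf_weight * features["idf"]              # token rarity
        + cfg.position_weight * features["position"]  # near start/end
        + cfg.pos_weight * features["pos"]            # part-of-speech
        + cfg.entity_weight * features["entity"]      # named-entity-like
        + cfg.entropy_weight * features["entropy"]    # local diversity
    )

def select_tokens(tokens, scores, ratio=0.5):
    """Keep the top `ratio` fraction of tokens, preserving order."""
    k = max(1, int(len(tokens) * ratio))
    keep = set(sorted(range(len(tokens)), key=lambda i: -scores[i])[:k])
    return [t for i, t in enumerate(tokens) if i in keep]
```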

## Examples

### Example 1: RAG System Context Compression

```python
from compression_prompt import Compressor

# Compress retrieved context before sending to LLM
retrieved_docs = get_documents(query)
context = "\n\n".join(doc.text for doc in retrieved_docs)

compressor = Compressor()
result = compressor.compress(context)

# Save 50% tokens while maintaining quality
prompt = f"Context: {result.compressed}\n\nQuestion: {user_question}"
response = llm.generate(prompt)
```

### Example 2: Custom Domain Terms

```python
from compression_prompt import StatisticalFilterConfig, Compressor

# Preserve domain-specific terms
filter_config = StatisticalFilterConfig(
    domain_terms=["TensorFlow", "PyTorch", "CUDA", "GPU"]
)

compressor = Compressor(filter_config=filter_config)
result = compressor.compress(technical_text)
```

### Example 3: Aggressive Compression

```python
from compression_prompt import CompressorConfig, StatisticalFilterConfig, Compressor

# 70% compression (keep only 30% of tokens)
config = CompressorConfig(target_ratio=0.3, min_input_tokens=50)
filter_config = StatisticalFilterConfig(compression_ratio=0.3)

compressor = Compressor(config, filter_config)
result = compressor.compress(text)

print(f"Compressed to {result.compressed_tokens} tokens (from {result.original_tokens})")
```

## Performance Characteristics

| Ratio (tokens kept) | Token Savings | Keyword Retention | Entity Retention | Use Case |
|-------------|--------------|-------------------|------------------|----------|
| **50% (default)** ⭐ | **50%** | **92.0%** | **89.5%** | Best balance |
| 70% (conservative) | 30% | 99.2% | 98.4% | High precision |
| 30% (aggressive) | 70% | 72.4% | 71.5% | Maximum savings |
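
To turn token savings into dollars, a back-of-the-envelope estimate is enough. The per-token price below is a placeholder; substitute your provider's actual input-token rate:

```python
PRICE_PER_TOKEN = 3.00 / 1_000_000   # hypothetical $3.00 per 1M input tokens

monthly_input_tokens = 500_000_000   # e.g. 500M prompt tokens per month
savings_ratio = 0.5                  # default 50% token reduction

saved_usd = monthly_input_tokens * savings_ratio * PRICE_PER_TOKEN
print(f"Estimated monthly savings: ${saved_usd:,.2f}")  # $750.00
```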

## Development

### Running Tests

```bash
pytest tests/
```

### Code Formatting

```bash
black compression_prompt/
```

### Type Checking

```bash
mypy compression_prompt/
```

## Differences from Rust Version

The Python implementation maintains near-complete feature parity with the Rust version:

- ✅ Same statistical filtering algorithm
- ✅ Same configuration options
- ✅ Same quality metrics
- ✅ CLI tool with identical interface
- ⏳ Image output (optional, requires Pillow)

Performance differences:
- **Rust**: ~0.16ms average, 10.58 MB/s throughput
- **Python**: ~1-5ms average (still very fast for most use cases)
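
If the difference matters for your workload, measure it directly. A minimal timing sketch using only the standard library:

```python
import time
from compression_prompt import Compressor

compressor = Compressor()
with open("input.txt") as fh:      # any sufficiently long document
    text = fh.read()

start = time.perf_counter()
result = compressor.compress(text)
elapsed_ms = (time.perf_counter() - start) * 1000
print(f"{elapsed_ms:.2f} ms for {result.original_tokens} tokens")
```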

## License

MIT

## See Also

- [Rust Implementation](../rust/) - Original high-performance implementation
- [Main README](../README.md) - Project overview and benchmarks
- [Architecture](../docs/ARCHITECTURE.md) - Technical details


            
