# Compression Prompt - Python Implementation
> Fast, intelligent prompt compression for LLMs - Save 50% tokens while maintaining 91% quality
Python port of the Rust implementation. Achieves **50% token reduction** with **91% quality retention** using pure statistical filtering.
## Quick Start
### Installation
```bash
cd python
pip install -e .
```
Or with development dependencies:
```bash
pip install -e ".[dev]" # With development dependencies
```
### Basic Usage
```python
from compression_prompt import Compressor, CompressorConfig
# Use default configuration (50% compression)
compressor = Compressor()
text = """
Your long text here...
This will be compressed using statistical filtering
to save 50% tokens while maintaining quality.
"""
result = compressor.compress(text)
print(f"Original: {result.original_tokens} tokens")
print(f"Compressed: {result.compressed_tokens} tokens")
print(f"Saved: {result.tokens_removed} tokens ({(1-result.compression_ratio)*100:.1f}%)")
print(f"\nCompressed text:\n{result.compressed}")
```
### Advanced Configuration
```python
from compression_prompt import (
Compressor, CompressorConfig,
StatisticalFilterConfig
)
# Custom compression ratio
config = CompressorConfig(target_ratio=0.7) # Keep 70% of tokens
filter_config = StatisticalFilterConfig(
compression_ratio=0.7,
idf_weight=0.3,
position_weight=0.2,
pos_weight=0.2,
entity_weight=0.2,
entropy_weight=0.1,
)
compressor = Compressor(config, filter_config)
result = compressor.compress(text)
```
### Quality Metrics
```python
from compression_prompt import QualityMetrics
original = "Your original text..."
compressed = "Your compressed text..."
metrics = QualityMetrics.calculate(original, compressed)
print(metrics.format())
```
Output:
```
Quality Metrics:
- Keyword Retention: 92.0%
- Entity Retention: 89.5%
- Vocabulary Ratio: 78.3%
- Info Density: 0.845
- Overall Score: 89.2%
```
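Because compression is lossy, one useful pattern is to gate on these metrics before using the compressed text. A minimal sketch, assuming the metrics object exposes a field matching the "Overall Score" line above; the `overall_score` attribute name and the 0.85 threshold are illustrative assumptions, not documented API:

```python
from compression_prompt import Compressor, QualityMetrics

compressor = Compressor()
result = compressor.compress(long_text)

metrics = QualityMetrics.calculate(long_text, result.compressed)

# `overall_score` is a hypothetical attribute inferred from the formatted
# output above; check metrics.format() or the source for the exact field.
if metrics.overall_score >= 0.85:
    prompt_text = result.compressed
else:
    prompt_text = long_text  # fall back to the original when quality drops
```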
### Command Line Usage
```bash
# Compress file to stdout
compress input.txt
# Conservative compression (keep 70% of tokens)
compress -r 0.7 input.txt
# Aggressive compression (keep 30% of tokens)
compress -r 0.3 input.txt
# Show statistics
compress -s input.txt
# Save to file
compress -o output.txt input.txt
# Read from stdin
cat input.txt | compress
```
## Features
- ✅ **Zero Dependencies**: Pure Python implementation, no external libraries required
- ✅ **Fast**: Optimized statistical filtering
- ✅ **Multilingual**: Supports 10+ languages (EN, ES, PT, FR, DE, IT, RU, ZH, JA, AR, HI)
- ✅ **Smart Filtering**: Preserves code blocks, JSON, paths, and identifiers (see the sketch after this list)
- ✅ **Contextual**: Intelligent stopword handling based on context
- ✅ **Customizable**: Fine-tune weights and parameters for your use case
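For example, with protection masks enabled (the default), file paths and inline JSON should survive compression intact while the surrounding prose is filtered. An illustrative sketch using only the API shown above:

```python
from compression_prompt import Compressor

text = (
    "A long explanation with plenty of filler prose that the "
    "statistical filter is free to drop. "
    'The config lives at /etc/app/config.json and contains '
    '{"retries": 3, "endpoint": "https://api.example.com"}. '
) * 10  # repeated so the input clears the minimum-size thresholds

result = Compressor().compress(text)

# The path and the JSON object are protected and should appear in the
# output even as the surrounding prose is dropped.
print(result.compressed)
```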
## Configuration Options
### CompressorConfig
```python
CompressorConfig(
target_ratio=0.5, # Target compression ratio (0.0-1.0)
min_input_tokens=100, # Minimum tokens to attempt compression
min_input_bytes=1024 # Minimum bytes to attempt compression
)
```
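Inputs below these thresholds are too short for compression to be attempted, so for short prompts you can lower both limits. A minimal sketch (the threshold values here are illustrative, not recommendations):

```python
from compression_prompt import Compressor, CompressorConfig

# Allow compression of inputs shorter than the defaults permit.
config = CompressorConfig(
    target_ratio=0.5,
    min_input_tokens=20,   # default: 100
    min_input_bytes=128,   # default: 1024
)
result = Compressor(config).compress(short_text)
```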
### StatisticalFilterConfig
```python
StatisticalFilterConfig(
compression_ratio=0.5, # Keep 50% of tokens
# Feature weights (sum should be ~1.0)
idf_weight=0.3, # Inverse document frequency
position_weight=0.2, # Position in text (start/end important)
pos_weight=0.2, # Part-of-speech heuristics
entity_weight=0.2, # Named entity detection
entropy_weight=0.1, # Local vocabulary diversity
# Protection features
enable_protection_masks=True, # Protect code/JSON/paths
enable_contextual_stopwords=True, # Smart stopword filtering
preserve_negations=True, # Keep "not", "never", etc.
preserve_comparators=True, # Keep ">=", "!=", etc.
# Domain-specific
domain_terms=["YourTerm"], # Always preserve these terms
min_gap_between_critical=3 # Fill gaps between important tokens
)
```
## Examples
### Example 1: RAG System Context Compression
```python
from compression_prompt import Compressor
# Compress retrieved context before sending to LLM
retrieved_docs = get_documents(query)
context = "\n\n".join(doc.text for doc in retrieved_docs)
compressor = Compressor()
result = compressor.compress(context)
# Save 50% tokens while maintaining quality
prompt = f"Context: {result.compressed}\n\nQuestion: {user_question}"
response = llm.generate(prompt)
```
### Example 2: Custom Domain Terms
```python
from compression_prompt import StatisticalFilterConfig, Compressor
# Preserve domain-specific terms
filter_config = StatisticalFilterConfig(
domain_terms=["TensorFlow", "PyTorch", "CUDA", "GPU"]
)
compressor = Compressor(filter_config=filter_config)
result = compressor.compress(technical_text)
```
### Example 3: Aggressive Compression
```python
from compression_prompt import CompressorConfig, StatisticalFilterConfig, Compressor
# 70% compression (keep only 30% of tokens)
config = CompressorConfig(target_ratio=0.3, min_input_tokens=50)
filter_config = StatisticalFilterConfig(compression_ratio=0.3)
compressor = Compressor(config, filter_config)
result = compressor.compress(text)
print(f"Compressed to {result.compressed_tokens} tokens (from {result.original_tokens})")
```
## Performance Characteristics
| Target Ratio (tokens kept) | Token Savings | Keyword Retention | Entity Retention | Use Case |
|-------------|--------------|-------------------|------------------|----------|
| **50% (default)** ⭐ | **50%** | **92.0%** | **89.5%** | Best balance |
| 70% (conservative) | 30% | 99.2% | 98.4% | High precision |
| 30% (aggressive) | 70% | 72.4% | 71.5% | Maximum savings |
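These figures come from the project's benchmarks; to see where the tradeoff lands on your own corpus, you can sweep the ratio and score each result with the quality metrics shown earlier. A minimal sketch using only the documented API:

```python
from compression_prompt import (
    Compressor, CompressorConfig,
    StatisticalFilterConfig, QualityMetrics
)

for ratio in (0.7, 0.5, 0.3):
    compressor = Compressor(
        CompressorConfig(target_ratio=ratio),
        StatisticalFilterConfig(compression_ratio=ratio),
    )
    result = compressor.compress(corpus_text)  # corpus_text: your own sample
    metrics = QualityMetrics.calculate(corpus_text, result.compressed)
    print(f"ratio={ratio}: kept {result.compressed_tokens} of {result.original_tokens} tokens")
    print(metrics.format())
```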
## Development
### Running Tests
```bash
pytest tests/
```
### Code Formatting
```bash
black compression_prompt/
```
### Type Checking
```bash
mypy compression_prompt/
```
## Differences from Rust Version
The Python implementation maintains near-complete feature parity with the Rust version:
- ✅ Same statistical filtering algorithm
- ✅ Same configuration options
- ✅ Same quality metrics
- ✅ CLI tool with identical interface
- ⏳ Image output (optional, requires Pillow)
Performance differences:
- **Rust**: ~0.16ms average, 10.58 MB/s throughput
- **Python**: ~1-5ms average (still very fast for most use cases)
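If latency matters for your workload, it is straightforward to measure on representative inputs; a minimal sketch using only the standard library:

```python
import time
from compression_prompt import Compressor

compressor = Compressor()
runs = 100

start = time.perf_counter()
for _ in range(runs):
    compressor.compress(sample_text)  # sample_text: your representative input
elapsed_ms = (time.perf_counter() - start) * 1000 / runs

print(f"average: {elapsed_ms:.2f} ms per call")
```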
## License
MIT
## See Also
- [Rust Implementation](../rust/) - Original high-performance implementation
- [Main README](../README.md) - Project overview and benchmarks
- [Architecture](../docs/ARCHITECTURE.md) - Technical details