# LLM Redact
Privacy-first text redaction using local LLMs. It automatically detects and redacts sensitive information such as names, emails, and phone numbers.
## Features
- 🔒 **Privacy-first** - Uses local LLMs; no data is sent to external services
- 🚀 **Simple API** - One-liner redaction: `llm_redact.mask(text)`
- 💾 **Smart Caching** - SQLite database (by default) for caching and history
- 🔧 **Configurable** - Custom rules, models, and database connections
- 📊 **Tracking** - Full history and analytics of redaction operations
## Installation
```bash
pip install llm-redact
```
## Quick Start
```python
import llm_redact
# Simple redaction
result = llm_redact.mask("Hi, I'm John Doe from john@example.com")
print(result.redacted_text)
# Output: "Hi, I'm |_NAME_A1B2C3D4_| from |_EMAIL_E5F6G7H8_|"
print(result.replacements)
# Output: [
# Replacement(original_text="John Doe", replacement_text="|_NAME_A1B2C3D4_|"),
# Replacement(original_text="john@example.com", replacement_text="|_EMAIL_E5F6G7H8_|")
# ]
# Note: each placeholder carries a unique ID (e.g. |_NAME_A1B2C3D4_|) and maps back
# to the original text via a database lookup, so redacted text can be restored later.
```
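The same mapping is also available in `result.replacements`, so text can be restored without touching the database. The sketch below is not part of the library: `restore` is a hypothetical helper, and the `Replacement` dataclass is a stand-in that mirrors only the two fields shown in the output above, so the example runs without an LLM host.

```python
from dataclasses import dataclass


@dataclass
class Replacement:
    """Stand-in with the two fields shown in the Quick Start output (assumption)."""
    original_text: str
    replacement_text: str


def restore(redacted_text, replacements):
    """Put each original value back in place of its placeholder (hypothetical helper)."""
    restored = redacted_text
    for item in replacements:
        restored = restored.replace(item.replacement_text, item.original_text)
    return restored


replacements = [
    Replacement("John Doe", "|_NAME_A1B2C3D4_|"),
    Replacement("john@example.com", "|_EMAIL_E5F6G7H8_|"),
]
redacted = "Hi, I'm |_NAME_A1B2C3D4_| from |_EMAIL_E5F6G7H8_|"
print(restore(redacted, replacements))
# Hi, I'm John Doe from john@example.com
```

Because every placeholder embeds a unique ID, plain string replacement is enough here; no placeholder can be a prefix of another.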
## Configuration
### Environment Variables
```bash
# LLM Host (default: http://localhost:8000)
export LLM_REDACT_LLM_HOST_URL=http://localhost:8000
# Database (default: sqlite:///llm_redact.db)
export LLM_REDACT_DATABASE_URL=sqlite:///my_redact.db
# Model (default: gemma3:1b)
export LLM_REDACT_DEFAULT_MODEL=gemma3:1b
# Caching (default: True)
export LLM_REDACT_ENABLE_CACHING=true
```
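To check which values would be in effect on a given machine, a small script can read the same variables and fall back to the documented defaults. This is only a sketch: `effective_settings` is a hypothetical helper, and how the library itself parses these variables (for example, which strings count as a true value for caching) is not stated here.

```python
import os

# Variable names and defaults as documented in the block above.
DEFAULTS = {
    "LLM_REDACT_LLM_HOST_URL": "http://localhost:8000",
    "LLM_REDACT_DATABASE_URL": "sqlite:///llm_redact.db",
    "LLM_REDACT_DEFAULT_MODEL": "gemma3:1b",
    "LLM_REDACT_ENABLE_CACHING": "true",
}


def effective_settings(environ=None):
    """Return each variable's value, falling back to the documented default."""
    environ = os.environ if environ is None else environ
    return {name: environ.get(name, default) for name, default in DEFAULTS.items()}


for name, value in effective_settings().items():
    print(f"{name}={value}")
```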
### Custom Database and LLM Host
```python
import llm_redact
# Use PostgreSQL
llm_redact.configure_client(
    database_url="postgresql://user:pass@localhost/redact_db"
)
# Use custom LLM host
llm_redact.configure_client(
    llm_host_url="http://my-llm-server:8000"
)
```
## Advanced Usage
### Custom Rules
```python
import llm_redact
from llm_redact import RedactionRule
custom_rules = [
    RedactionRule(
        name="Replace SSN with [SSN]",
        description="Social Security Numbers",
        data_type="SSN"
    ),
    RedactionRule(
        name="Replace addresses with [ADDRESS]",
        description="Physical addresses",
        data_type="ADDRESS"
    )
]
# Pass the custom rules to mask() via the rules parameter
result = llm_redact.mask(
    "My SSN is 123-45-6789 and I live at 123 Main St",
    rules=custom_rules
)
```
### Using the Client Directly
```python
from llm_redact import LLMRedactClient
client = LLMRedactClient(
    llm_host_url="http://localhost:8000",
    database_url="sqlite:///custom.db"
)
result = client.mask("Sensitive text here")
# Get history
history = client.get_history(limit=10)
# Create custom rules
rule = client.create_rule(
    name="Replace API keys with [API_KEY]",
    description="API keys and tokens"
)
```
## Prerequisites
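1. **LLM Host Server**: Run the llm-redact host server locally (Ollama must already be installed):
   ```bash
   # Start Ollama (runs in the foreground; keep it running in its own terminal)
   ollama serve
   # In another terminal, pull the default model
   ollama pull gemma3:1b
   # Run the llm-redact host server
   python -m llm_redact_host
   ```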
2. **Database**: SQLite (default) or any SQLAlchemy-supported database
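
Before calling `mask`, it can help to confirm that something is listening at the host URL. The following is a minimal sketch using only the standard library; `host_is_reachable` is a hypothetical helper, and it checks only that a TCP connection succeeds, not that the llm-redact host server is healthy.

```python
import socket
from urllib.parse import urlparse


def host_is_reachable(url, timeout=2.0):
    """Return True if a TCP connection to the URL's host and port succeeds."""
    parsed = urlparse(url)
    # Fall back to the scheme's usual port when the URL does not name one.
    port = parsed.port or (443 if parsed.scheme == "https" else 80)
    try:
        with socket.create_connection((parsed.hostname, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and name-resolution failures.
        return False


print(host_is_reachable("http://localhost:8000"))
```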
## Supported Redaction Types
- Personal names → `|_NAME_XXXX_|`
- Email addresses → `|_EMAIL_XXXX_|`
- Phone numbers → `|_PHONE_XXXX_|`
- Countries → `|_COUNTRY_XXXX_|`
- Universities → `|_UNIVERSITY_XXXX_|`
- Job titles → `|_JOB_TITLE_XXXX_|`
- Addresses → `|_ADDRESS_XXXX_|`
- Social Security Numbers → `|_SSN_XXXX_|`
- Credit card numbers → `|_CREDIT_CARD_XXXX_|`
Here `XXXX` stands for a unique 8-character ID (for example `A1B2C3D4`) assigned to each redacted value.
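
Given this placeholder format, redacted text can be scanned for placeholders with a regular expression. The pattern below is a sketch, not part of the library: it assumes type names use uppercase letters and underscores and that IDs are eight uppercase letters or digits, as in the examples in this README. The ID `0F9E8D7C` in the sample is made up for illustration.

```python
import re

# |_<TYPE>_<8-char ID>_|  (character classes are an assumption based on the examples)
PLACEHOLDER = re.compile(r"\|_([A-Z_]+?)_([A-Z0-9]{8})_\|")


def find_placeholders(text):
    """Return (data_type, id) pairs for every placeholder found in text."""
    return [(m.group(1), m.group(2)) for m in PLACEHOLDER.finditer(text)]


sample = "Hi, I'm |_NAME_A1B2C3D4_|, a |_JOB_TITLE_0F9E8D7C_|."
print(find_placeholders(sample))
# [('NAME', 'A1B2C3D4'), ('JOB_TITLE', '0F9E8D7C')]
```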
## API Reference
### `llm_redact.mask(text, rules=None, model=None)`
Redact sensitive information from text.
**Parameters:**
- `text` (str): Text to redact
- `rules` (list, optional): Custom redaction rules
- `model` (str, optional): LLM model to use
**Returns:** `RedactionResult` object
### `RedactionResult`
- `original_text`: Original input text
- `redacted_text`: Text with sensitive data redacted
- `replacements`: List of replacements made
- `is_redacted`: Whether any redactions were made
- `processing_time_ms`: Processing time in milliseconds
- `cached`: Whether result was from cache
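
As a rough illustration of how these fields fit together, the sketch below builds a one-line summary from a result. `summarize` is a hypothetical helper, and the `SimpleNamespace` object is a stand-in carrying the documented attribute names with made-up values, so the example runs without the library or an LLM host.

```python
from types import SimpleNamespace


def summarize(result):
    """Describe a redaction result using only the documented attributes."""
    if not result.is_redacted:
        return "no redactions"
    source = "cache" if result.cached else "LLM"
    count = len(result.replacements)
    return f"{count} replacement(s) from {source} in {result.processing_time_ms} ms"


# Stand-in for a RedactionResult (values are illustrative, not real output).
fake_result = SimpleNamespace(
    original_text="Hi, I'm John Doe",
    redacted_text="Hi, I'm |_NAME_A1B2C3D4_|",
    replacements=[("John Doe", "|_NAME_A1B2C3D4_|")],
    is_redacted=True,
    processing_time_ms=12,
    cached=False,
)
print(summarize(fake_result))
# 1 replacement(s) from LLM in 12 ms
```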
## License
MIT License
Raw data
{
"_id": null,
"home_page": "https://github.com/lookr-fyi/llm-redact",
"name": "llm-redact",
"maintainer": null,
"docs_url": null,
"requires_python": "<4.0.0,>=3.12.4",
"maintainer_email": null,
"keywords": "privacy, redaction, llm, pii, data-protection, sensitive-data, ai",
"author": "LLM Redact Contributors",
"author_email": "yuqil@lookr.fyi",
"download_url": "https://files.pythonhosted.org/packages/39/63/c44473eded2d91d75e640c57293ddc8fb821a88f880c3cdf4c8f0d42ac65/llm_redact-0.1.1.tar.gz",
"platform": null,
"bugtrack_url": null,
"license": "MIT",
"summary": "Privacy-first text redaction using local LLM models with rule generation capabilities",
"version": "0.1.1",
"project_urls": {
"Documentation": "https://github.com/lookr-fyi/llm-redact/blob/main/README.md",
"Homepage": "https://github.com/lookr-fyi/llm-redact",
"Repository": "https://github.com/lookr-fyi/llm-redact"
},
"split_keywords": [
"privacy",
" redaction",
" llm",
" pii",
" data-protection",
" sensitive-data",
" ai"
],
"urls": [
{
"comment_text": "",
"digests": {
"blake2b_256": "b94690776b1b60c99e65eb5db90e8b28fd157d94698b416c1fef60248b496c0b",
"md5": "2ac6d5f387128241d4b3098aae6710ad",
"sha256": "0c08d26a8cdeb469c09addfb1731ae1f8bdd8ada4e2e732a7f187b9c8fcc9a6b"
},
"downloads": -1,
"filename": "llm_redact-0.1.1-py3-none-any.whl",
"has_sig": false,
"md5_digest": "2ac6d5f387128241d4b3098aae6710ad",
"packagetype": "bdist_wheel",
"python_version": "py3",
"requires_python": "<4.0.0,>=3.12.4",
"size": 27811,
"upload_time": "2025-07-21T04:58:16",
"upload_time_iso_8601": "2025-07-21T04:58:16.732419Z",
"url": "https://files.pythonhosted.org/packages/b9/46/90776b1b60c99e65eb5db90e8b28fd157d94698b416c1fef60248b496c0b/llm_redact-0.1.1-py3-none-any.whl",
"yanked": false,
"yanked_reason": null
},
{
"comment_text": "",
"digests": {
"blake2b_256": "3963c44473eded2d91d75e640c57293ddc8fb821a88f880c3cdf4c8f0d42ac65",
"md5": "ac6c14c4877a4d042539fb9a26eae010",
"sha256": "61db1eed1179007a0ad25cda31c6b086088d49632b1902ec9d142906570937a6"
},
"downloads": -1,
"filename": "llm_redact-0.1.1.tar.gz",
"has_sig": false,
"md5_digest": "ac6c14c4877a4d042539fb9a26eae010",
"packagetype": "sdist",
"python_version": "source",
"requires_python": "<4.0.0,>=3.12.4",
"size": 18981,
"upload_time": "2025-07-21T04:58:17",
"upload_time_iso_8601": "2025-07-21T04:58:17.994850Z",
"url": "https://files.pythonhosted.org/packages/39/63/c44473eded2d91d75e640c57293ddc8fb821a88f880c3cdf4c8f0d42ac65/llm_redact-0.1.1.tar.gz",
"yanked": false,
"yanked_reason": null
}
],
"upload_time": "2025-07-21 04:58:17",
"github": true,
"gitlab": false,
"bitbucket": false,
"codeberg": false,
"github_user": "lookr-fyi",
"github_project": "llm-redact",
"travis_ci": false,
"coveralls": false,
"github_actions": false,
"lcname": "llm-redact"
}