# SecretSentry 🛡️
> **The first sensitive data scanner built for modern data science and web development workflows**
SecretSentry is an advanced sensitive data scanner that goes beyond traditional secret detection. Built specifically for **Jupyter notebooks**, **web development**, and **data science workflows**, it intelligently filters false positives while detecting API keys, PII, credentials, and other sensitive information.
## 🎯 **Why SecretSentry?**
### **Built for Modern Workflows**
- 🔬 **Jupyter Notebook Specialist**: First scanner designed for `.ipynb` files
- 🧠 **Smart False Positive Filtering**: Ignores base64 images, cell IDs, and CSS colors
- 🌐 **Multi-Environment**: CLI, Jupyter notebooks, and Python scripts
- 🎛️ **Interactive Analysis**: Built-in widgets for exploring findings
### **Comprehensive Detection**
- 🔑 **50+ Built-in Patterns**: API keys, tokens, secrets, credentials
- 👤 **PII Detection**: SSNs, credit cards, phone numbers, emails
- 💰 **Financial Data**: Salary information, bank accounts, routing numbers
- 🌍 **Geographic Data**: Coordinates, IP addresses, postal codes
- 🏥 **Sensitive Categories**: Ethnic data, religious information, medical records
### **Advanced Features**
- 🛡️ **Smart Sanitization**: Context-aware gibberish replacement
- 📊 **Rich Visualizations**: Charts and statistics (with matplotlib/seaborn)
- 📈 **Pandas Integration**: Export to DataFrames for analysis
- 🔄 **CI/CD Ready**: Perfect for automation and pipelines
## 🚀 **Quick Start**
### **Installation**
```bash
# Basic installation
pip install secretsentry
# Full installation with all features (quoted so shells such as zsh do not expand the brackets)
pip install "secretsentry[full]"
# For Jupyter notebooks only
pip install "secretsentry[jupyter]"
```
### **Basic Usage**
```python
from secretsentry import SecretSentry, quick_scan
# Quick scan with automatic results
scanner = quick_scan("./my_project")
# Manual scanning with custom options
scanner = SecretSentry()
findings = scanner.scan_directory("./my_project")
scanner.display_findings()
# Sanitize files (creates backups automatically)
stats = scanner.sanitize_files(dry_run=True) # Preview changes
stats = scanner.sanitize_files() # Actually sanitize
```
### **Command Line**
```bash
# Scan and display results
secretsentry scan ./my_project --display
# Scan specific file types
secretsentry scan ./my_project --extensions .py .js .ipynb --display
# Export findings
secretsentry scan ./my_project --export findings.json
# Sanitize files (with backup)
secretsentry scan ./my_project --sanitize --dry-run
secretsentry scan ./my_project --sanitize
# List all detection patterns
secretsentry list-patterns
```
## 🎓 **Jupyter Notebook Integration**
SecretSentry shines in Jupyter environments with **zero false positives** from notebook metadata:
```python
# In Jupyter notebook
from secretsentry import SecretSentry, create_sample_files, quick_scan
# Create test data
create_sample_files("./test_data")
# Quick scan with visualizations
scanner = quick_scan("./test_data", show_plots=True)
# Interactive exploration
scanner.create_interactive_viewer()
# Data analysis with pandas
df = scanner.to_dataframe()
summary = df.groupby('pattern_type').size()
```
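To illustrate why notebook-aware scanning avoids metadata noise, the sketch below pulls only the cell sources out of an `.ipynb` document with the standard `json` module, so keys such as `cell_type`, cell IDs, and embedded image outputs never reach a pattern matcher. It assumes the nbformat v4 layout and is not SecretSentry's internal implementation; `iter_cell_sources` and `sample_notebook` are hypothetical names used only for this example.

```python
import json


def iter_cell_sources(notebook_text):
    """Yield (cell_index, source) for each code or markdown cell.

    Assumes the nbformat v4 layout, where "source" is either a string
    or a list of lines.
    """
    notebook = json.loads(notebook_text)
    for index, cell in enumerate(notebook.get("cells", [])):
        if cell.get("cell_type") not in ("code", "markdown"):
            continue  # skip raw cells and anything unrecognised
        source = cell.get("source", "")
        if isinstance(source, list):
            source = "".join(source)
        yield index, source


# A tiny in-memory notebook: one code cell with an image output, one raw cell.
sample_notebook = json.dumps({
    "cells": [
        {
            "cell_type": "code",
            "id": "a1b2c3",
            "metadata": {},
            "source": ["API_KEY = 'abc'\n", "print(API_KEY)"],
            "outputs": [{"data": {"image/png": "iVBORw0KGgoAAAANSUhEUg"}}],
        },
        {"cell_type": "raw", "source": "ignored"},
    ],
    "nbformat": 4,
})

for index, source in iter_cell_sources(sample_notebook):
    print(index, repr(source))
```

Only the text a user actually typed is yielded; the base64 image in `outputs` and the cell `id` are never inspected.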
## 📊 **What Makes It Special**
### **Intelligent False Positive Filtering**
**Traditional scanners** flag this as secrets:
```
❌ aws_secret_key: iVBORw0KGgoAAAANSUhEUgAABKYAAAMW... # Just a PNG image!
❌ api_key: "cell_type": "code" # Notebook metadata!
❌ secret: #3498db # CSS color!
```
**SecretSentry** ignores these and only reports **real issues**:
```
✅ aws_secret_key: AKIAIOSFODNN7EXAMPLE
✅ stripe_key: sk_live_1234567890abcdef123456789
✅ database_url: postgresql://user:password@localhost/db
```
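The three false positives above can be recognised with a few cheap checks. The following is a minimal sketch of that idea, assuming simple heuristics (base64 image headers, CSS hex colors, notebook metadata keys); the rules SecretSentry actually applies are not documented here, and `looks_like_false_positive` is a hypothetical helper.

```python
import re

# Illustrative heuristics only; these are assumptions, not SecretSentry's rules.
CSS_HEX_COLOR = re.compile(r"#(?:[0-9a-fA-F]{3}|[0-9a-fA-F]{6})")
# Base64 encodings of the PNG, JPEG, and GIF file signatures.
BASE64_IMAGE_PREFIXES = ("iVBORw0KGgo", "/9j/", "R0lGOD")
NOTEBOOK_METADATA_KEY = re.compile(
    r'"(?:cell_type|id|nbformat|execution_count)"\s*:'
)


def looks_like_false_positive(candidate):
    """Return True when a matched value is very unlikely to be a secret."""
    candidate = candidate.strip()
    if CSS_HEX_COLOR.fullmatch(candidate):
        return True  # e.g. "#3498db"
    if candidate.startswith(BASE64_IMAGE_PREFIXES):
        return True  # embedded image data
    if NOTEBOOK_METADATA_KEY.match(candidate):
        return True  # e.g. '"cell_type": "code"'
    return False


for value in ("#3498db", "iVBORw0KGgoAAAANSUhEUgAABKYAAAMW",
              "sk_live_1234567890abcdef123456789"):
    print(value, looks_like_false_positive(value))
```

Checks like these run after a pattern has matched, so they only suppress findings and never create new ones.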
### **Smart Sanitization**
SecretSentry doesn't just find secrets; it **fixes them safely**:
```python
# Before sanitization
API_KEY = "sk_live_1234567890abcdef"
employee_ssn = "123-45-6789"
coordinates = "40.7128, -74.0060"
# After sanitization (context-aware gibberish)
API_KEY = "sk_live_xK8mP9nQ4vL7wR2Z"
employee_ssn = "456-78-9123"
coordinates = "38.8951, -77.0364"
```
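One simple way to produce replacements that keep the original shape is to swap each character for a random one of the same class while leaving a known prefix and all punctuation untouched. The sketch below shows that strategy under stated assumptions; it is not necessarily how SecretSentry generates its gibberish, and `scramble_preserving_format` with its `keep_prefix` parameter is hypothetical.

```python
import random
import string


def scramble_preserving_format(value, keep_prefix="", rng=None):
    """Replace letters and digits with random ones of the same class.

    Punctuation (dashes, dots, commas) and the optional prefix are kept,
    so the result still looks like the original kind of value.
    """
    rng = rng or random.SystemRandom()
    prefix_len = len(keep_prefix) if value.startswith(keep_prefix) else 0
    out = list(value[:prefix_len])
    for ch in value[prefix_len:]:
        if ch.isdigit():
            out.append(rng.choice(string.digits))
        elif ch.isalpha():
            pool = string.ascii_uppercase if ch.isupper() else string.ascii_lowercase
            out.append(rng.choice(pool))
        else:
            out.append(ch)  # keep separators and other punctuation
    return "".join(out)


print(scramble_preserving_format("sk_live_1234567890abcdef", keep_prefix="sk_live_"))
print(scramble_preserving_format("123-45-6789"))
```

Because the replacement is random, a value can occasionally contain some of its original characters; a production sanitizer would also need to check that the output no longer matches a live credential.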
## 🔧 **Advanced Usage**
### **Custom Patterns**
```python
# Add organization-specific patterns
custom_patterns = {
    'employee_id': r'EMP-\d{6}',
    'project_code': r'PROJ-[A-Z]{3}-\d{4}',
    'internal_api': r'internal_key_[a-zA-Z0-9]{32}'
}
scanner = SecretSentry(custom_patterns=custom_patterns)
```
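Before handing custom patterns to the scanner, it can help to try them against sample text with the standard `re` module. This sketch reuses the three patterns above; `match_custom_patterns` and the `sample` string are hypothetical, and the matching shown is plain `re.finditer`, which may differ from how SecretSentry applies patterns internally.

```python
import re

custom_patterns = {
    "employee_id": r"EMP-\d{6}",
    "project_code": r"PROJ-[A-Z]{3}-\d{4}",
    "internal_api": r"internal_key_[a-zA-Z0-9]{32}",
}


def match_custom_patterns(text, patterns):
    """Return (pattern_name, matched_text) pairs, grouped in pattern order."""
    hits = []
    for name, pattern in patterns.items():
        for match in re.finditer(pattern, text):
            hits.append((name, match.group(0)))
    return hits


# "EMP-12" is too short to match, so only two hits are expected.
sample = "owner=EMP-004217 code=PROJ-ABC-2024 note=EMP-12"
print(match_custom_patterns(sample, custom_patterns))
```

Note that `EMP-\d{6}` also matches the first six digits of a longer number; wrapping a pattern in `\b...\b` is one way to require exact-length identifiers.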
### **CI/CD Integration**
```python
#!/usr/bin/env python3
# security_check.py
import sys
from secretsentry import SecretSentry
def security_gate():
    scanner = SecretSentry()
    findings = scanner.scan_directory(".", show_progress=False)

    if findings:
        print(f"❌ SECURITY CHECK FAILED: {len(findings)} secrets found")
        scanner.display_findings(max_display=10)
        return 1
    else:
        print("✅ SECURITY CHECK PASSED: No secrets detected")
        return 0


if __name__ == "__main__":
    # A non-zero exit code fails the CI job
    sys.exit(security_gate())
```
### **Batch Processing**
```python
# Scan multiple projects
from secretsentry import SecretSentry
import os
projects = ["./frontend", "./backend", "./data-science"]
all_results = {}
for project in projects:
    if os.path.exists(project):
        scanner = SecretSentry()
        findings = scanner.scan_directory(project)
        all_results[project] = len(findings)

        # Export individual reports
        scanner.export_findings(f"{project.replace('./', '')}_security_report.json")
print("Security Summary:", all_results)
```
## 📈 **Detection Categories**
<details>
<summary><b>🔑 API Keys & Secrets (20+ patterns)</b></summary>
- AWS Access/Secret Keys
- GitHub Tokens (classic & fine-grained)
- Google API Keys
- Stripe Keys (live & test)
- Slack Tokens & Webhooks
- SendGrid API Keys
- Twilio Keys
- Mailgun Keys
- Azure Storage Keys
- Heroku API Keys
- Generic API patterns
</details>
<details>
<summary><b>💳 Financial Data (8+ patterns)</b></summary>
- Credit Cards (Visa, MasterCard, AmEx, Discover, JCB, Diners)
- Bank Account Numbers
- Routing Numbers
- IBAN & SWIFT Codes
- Salary Information
</details>
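Digit-only patterns such as credit card numbers tend to match order IDs and timestamps too. A common way to cut that noise is the Luhn checksum, sketched below; whether SecretSentry applies this check is an assumption, and `luhn_is_valid` is a hypothetical helper.

```python
def luhn_is_valid(number):
    """Return True if the digits in `number` pass the Luhn checksum.

    Spaces and dashes are ignored; fewer than 12 digits is rejected.
    """
    digits = [int(ch) for ch in number if ch in "0123456789"]
    if len(digits) < 12:
        return False
    total = 0
    # Walk from the rightmost digit, doubling every second one.
    for position, digit in enumerate(reversed(digits)):
        if position % 2 == 1:
            digit *= 2
            if digit > 9:
                digit -= 9
        total += digit
    return total % 10 == 0


print(luhn_is_valid("4111 1111 1111 1111"))  # widely used Visa test number
print(luhn_is_valid("1234-5678-9012-3456"))  # arbitrary digits
```

A regex match that fails the checksum can be dropped or reported at lower severity.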
<details>
<summary><b>👤 Personal Information (10+ patterns)</b></summary>
- Social Security Numbers
- Phone Numbers (US & International)
- Email Addresses
- Passport Numbers
- Driver's License Numbers
- Medical Record Numbers
</details>
<details>
<summary><b>🌍 Geographic Data (5+ patterns)</b></summary>
- GPS Coordinates
- IP Addresses (IPv4 & IPv6)
- MAC Addresses
- ZIP/Postal Codes
</details>
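Strings that look like IP addresses can be confirmed with Python's standard `ipaddress` module before they are reported. This is a sketch of one possible post-match check, not a description of SecretSentry's behaviour; `classify_ip` is a hypothetical helper.

```python
import ipaddress


def classify_ip(candidate):
    """Return (family, scope) for a valid IP address, or None otherwise."""
    try:
        address = ipaddress.ip_address(candidate)
    except ValueError:
        return None  # not a syntactically valid IPv4/IPv6 address
    family = "IPv4" if address.version == 4 else "IPv6"
    scope = "private" if address.is_private else "public"
    return family, scope


for value in ("192.168.1.10", "8.8.8.8", "999.1.1.1", "::1"):
    print(value, classify_ip(value))
```

Separating private from public addresses lets a report rank a leaked public endpoint above a local development address.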
<details>
<summary><b>🏥 Sensitive Personal Data (5+ patterns)</b></summary>
- Ethnic/Racial Categories
- Religious Affiliations
- Medical Information
- Disability Status
</details>
<details>
<summary><b>🔐 Cryptographic Material (5+ patterns)</b></summary>
- Private Keys (RSA, SSH)
- Public Keys & Certificates
- JWT Tokens
- OAuth Tokens
</details>
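A JWT has a recognisable structure: three dot-separated base64url segments, the first of which decodes to a JSON object containing an `alg` field. The sketch below uses that structure to tell real tokens from look-alikes such as version strings. It is an illustrative check under those assumptions, not SecretSentry's detection logic, and `decode_jwt_header` is a hypothetical helper.

```python
import base64
import json


def decode_jwt_header(token):
    """Return the decoded JWT header as a dict, or None if it is not a JWT."""
    parts = token.split(".")
    if len(parts) != 3:
        return None
    header_segment = parts[0]
    # base64url segments in JWTs omit padding; restore it before decoding.
    padded = header_segment + "=" * (-len(header_segment) % 4)
    try:
        header = json.loads(base64.urlsafe_b64decode(padded))
    except ValueError:
        # Covers invalid base64, invalid UTF-8, and invalid JSON.
        return None
    if isinstance(header, dict) and "alg" in header:
        return header
    return None


# Build a structurally valid (unsigned, fake) token for demonstration.
fake_header = base64.urlsafe_b64encode(
    json.dumps({"alg": "HS256", "typ": "JWT"}).encode()
).rstrip(b"=").decode()
fake_token = fake_header + ".e30.not-a-real-signature"

print(decode_jwt_header(fake_token))
print(decode_jwt_header("1.2.3"))
```

The signature is deliberately not verified here; the goal is only to confirm that a match has the shape of a token.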
## 🎛️ **Configuration**
### **Environment Variables**
```bash
# Disable progress bars
export SECRETSENTRY_NO_PROGRESS=1
# Custom config file
export SECRETSENTRY_CONFIG=/path/to/config.json
```
### **Configuration File**
```json
{
  "excluded_patterns": ["test_", "example_", "demo_"],
  "excluded_files": ["*.test.js", "test_*.py"],
  "excluded_dirs": ["tests", "examples", "docs"],
  "custom_patterns": {
    "company_id": "COMP-\\d{8}"
  },
  "sanitization": {
    "create_backups": true,
    "backup_suffix": ".backup"
  }
}
```
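To show how exclusion rules of this shape could be evaluated, the sketch below treats `excluded_files` entries as shell-style globs on the file name and `excluded_dirs` entries as exact directory names anywhere in the path. That interpretation is an assumption and may differ from how SecretSentry reads its configuration; `is_excluded` is a hypothetical helper.

```python
import fnmatch
import json
from pathlib import PurePosixPath

config = json.loads("""
{
  "excluded_files": ["*.test.js", "test_*.py"],
  "excluded_dirs": ["tests", "examples", "docs"]
}
""")


def is_excluded(path, config):
    """Return True if `path` falls under an excluded directory or file glob."""
    parts = PurePosixPath(path).parts
    if not parts:
        return False
    # Any parent directory with an excluded name skips the whole subtree.
    if any(part in config.get("excluded_dirs", []) for part in parts[:-1]):
        return True
    filename = parts[-1]
    return any(
        fnmatch.fnmatchcase(filename, pattern)
        for pattern in config.get("excluded_files", [])
    )


for path in ("src/app.test.js", "tests/helpers.py", "src/main.py"):
    print(path, is_excluded(path, config))
```

Matching globs against the file name rather than the full path keeps `test_*.py` from excluding a file such as `src/contest.py`.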
## 🤝 **Contributing**
We welcome contributions! Here's how to get started:
```bash
# Clone the repository
git clone https://github.com/yourusername/secretsentry.git
cd secretsentry
# Install development dependencies
pip install -e ".[full]"
pip install pytest black flake8
# Run tests
pytest tests/
# Format code
black secretsentry/
flake8 secretsentry/
```
## 📝 **License**
MIT License - see [LICENSE](LICENSE) file for details.
## 🙏 **Acknowledgments**
- Inspired by [detect-secrets](https://github.com/Yelp/detect-secrets) and [truffleHog](https://github.com/dxa4481/truffleHog)
- Built for the data science and security communities
- Special thanks to all contributors and the open source community
## 📞 **Support**
- 📖 **Documentation**: [Full docs](https://github.com/yourusername/secretsentry#readme)
- 🐛 **Issues**: [Report bugs](https://github.com/yourusername/secretsentry/issues)
- 💬 **Discussions**: [Community forum](https://github.com/yourusername/secretsentry/discussions)
- 📧 **Contact**: your.email@example.com
---
**SecretSentry** - *Standing guard over your sensitive data* 🛡️
Raw data
{
"_id": null,
"home_page": null,
"name": "secretsentry",
"maintainer": null,
"docs_url": null,
"requires_python": ">=3.7",
"maintainer_email": null,
"keywords": "security, secrets, scanner, pii, jupyter, notebook, api-keys, credentials, sanitization, privacy, devops, ci-cd",
"author": null,
"author_email": "Abdul Jilani <abdul.jilani@evolveailabs.com>",
"download_url": "https://files.pythonhosted.org/packages/7b/ee/d7d58f4eab50bf55da0c4a6bb863754be63a681bb01c42f69bae26ed9ab3/secretsentry-1.0.0.tar.gz",
"platform": null,
"bugtrack_url": null,
"license": "MIT",
"summary": "Advanced sensitive data scanner with Jupyter notebook support and intelligent false positive filtering",
"version": "1.0.0",
"project_urls": {
"Bug Tracker": "https://github.com/y2ee201/secretsentry/issues",
"Documentation": "https://github.com/y2ee201/secretsentry#readme",
"Homepage": "https://github.com/y2ee201/secretsentry",
"Repository": "https://github.com/y2ee201/secretsentry.git"
},
"split_keywords": [
"security",
" secrets",
" scanner",
" pii",
" jupyter",
" notebook",
" api-keys",
" credentials",
" sanitization",
" privacy",
" devops",
" ci-cd"
],
"urls": [
{
"comment_text": "",
"digests": {
"blake2b_256": "1b203df43e80545f1c51ea210f30f886a16b465671e43d467705d2299d6621b0",
"md5": "c1e71fa35a644d6125dba9c4ce449f98",
"sha256": "b8837236cb3218a4cde4bd81c505b553b833582190072697e7e25e1b09dca0b4"
},
"downloads": -1,
"filename": "secretsentry-1.0.0-py3-none-any.whl",
"has_sig": false,
"md5_digest": "c1e71fa35a644d6125dba9c4ce449f98",
"packagetype": "bdist_wheel",
"python_version": "py3",
"requires_python": ">=3.7",
"size": 23082,
"upload_time": "2025-07-29T10:29:29",
"upload_time_iso_8601": "2025-07-29T10:29:29.577442Z",
"url": "https://files.pythonhosted.org/packages/1b/20/3df43e80545f1c51ea210f30f886a16b465671e43d467705d2299d6621b0/secretsentry-1.0.0-py3-none-any.whl",
"yanked": false,
"yanked_reason": null
},
{
"comment_text": "",
"digests": {
"blake2b_256": "7beed7d58f4eab50bf55da0c4a6bb863754be63a681bb01c42f69bae26ed9ab3",
"md5": "d4acec655419c150e1199a234536fd88",
"sha256": "f900708af29c5c43e60e7da0192c5579652adef2efd5f91c6d4cdbb693c285c1"
},
"downloads": -1,
"filename": "secretsentry-1.0.0.tar.gz",
"has_sig": false,
"md5_digest": "d4acec655419c150e1199a234536fd88",
"packagetype": "sdist",
"python_version": "source",
"requires_python": ">=3.7",
"size": 29493,
"upload_time": "2025-07-29T10:29:31",
"upload_time_iso_8601": "2025-07-29T10:29:31.885709Z",
"url": "https://files.pythonhosted.org/packages/7b/ee/d7d58f4eab50bf55da0c4a6bb863754be63a681bb01c42f69bae26ed9ab3/secretsentry-1.0.0.tar.gz",
"yanked": false,
"yanked_reason": null
}
],
"upload_time": "2025-07-29 10:29:31",
"github": true,
"gitlab": false,
"bitbucket": false,
"codeberg": false,
"github_user": "y2ee201",
"github_project": "secretsentry",
"github_not_found": true,
"lcname": "secretsentry"
}