# Secret & PII Scanner
This tool detects hardcoded secrets and PII in source code using pattern-based (regex) detection only.
## Features
- Detects API keys, tokens, passwords, emails, credit cards, and more using regex patterns.
- Customizable detection patterns via `.secret-scanner.yaml`.
- Multi-language support.
- Reports in JSON, HTML, or PDF.
## How It Works
- The scanner uses a set of regex patterns to match secrets and PII in your codebase.
- No entropy-based detection is used; only explicit pattern matches are reported.
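As a rough illustration of pattern-based detection, the sketch below runs a small table of compiled regexes over each line of a string and records every match. The pattern names, the regexes, and the `scan_text` helper are hypothetical examples, not the scanner's built-in rules or API.

```python
import re
from typing import List, NamedTuple

# Hypothetical example patterns; the scanner's real rule set is not shown here.
EXAMPLE_PATTERNS = {
    "AWS Access Key ID": re.compile(r"\bAKIA[0-9A-Z]{16}\b"),
    "Email Address": re.compile(r"\b[\w.+-]+@[\w-]+\.[\w.-]+\b"),
}


class Finding(NamedTuple):
    line_number: int
    pattern_name: str
    match: str


def scan_text(text: str) -> List[Finding]:
    """Return one Finding per regex match, in line order."""
    findings = []
    for line_number, line in enumerate(text.splitlines(), start=1):
        for name, pattern in EXAMPLE_PATTERNS.items():
            for m in pattern.finditer(line):
                findings.append(Finding(line_number, name, m.group(0)))
    return findings
```

Because only explicit matches are reported, a random-looking string that fits no pattern is ignored, which is the trade-off of skipping entropy-based detection.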
## Configuration
- Add or customize patterns in `.secret-scanner.yaml`.
---
## 🚀 Quick Start
```bash
# 1. Install (from source, recommended for latest features)
git clone https://gitlab.com/ox-saro/SecretDetection.git
cd SecretDetection
pip install -e .
# 2. Run your first scan
secret-scanner scan .
# 3. See all options
secret-scanner --help
```
---
## 📦 Installation
### System Requirements
- **Python**: 3.7 or higher
- **Git**: For diff-based scanning and pre-commit hooks
- **Operating Systems**: Windows, macOS, Linux
### Check System Compatibility
After installing, verify that your system is compatible:
```bash
# Run after installation to check your system
secret-scanner check-system
```
### From Source (Recommended)
```bash
git clone https://gitlab.com/ox-saro/SecretDetection.git
cd SecretDetection
pip install -e .
```
### From Built Package
```bash
python3 build_package.py build
pip install dist/secret_scanner_tool-1.0.0-py3-none-any.whl
```
### For Development
```bash
python -m venv venv
source venv/bin/activate # On Windows: venv\Scripts\activate
pip install -e ".[dev]"
```
### Platform-Specific Installation
#### Windows
```bash
# Install Python 3.7+ from https://python.org
# Install Git from https://git-scm.com
pip install "secret-scanner-tool[pdf]"
```
#### macOS
```bash
# Using Homebrew (recommended)
brew install python3 git
pip3 install "secret-scanner-tool[pdf]"
# For PDF support with WeasyPrint
brew install cairo pango gdk-pixbuf libffi
pip3 install weasyprint
```
#### Linux (Ubuntu/Debian)
```bash
sudo apt-get update
sudo apt-get install python3 python3-pip git
pip3 install "secret-scanner-tool[pdf]"
# For PDF support with WeasyPrint
sudo apt-get install build-essential python3-dev python3-pip python3-setuptools python3-wheel python3-cffi libcairo2 libpango-1.0-0 libpangocairo-1.0-0 libgdk-pixbuf2.0-0 libffi-dev shared-mime-info
pip3 install weasyprint
```
#### Linux (CentOS/RHEL)
```bash
sudo yum install python3 python3-pip git
pip3 install "secret-scanner-tool[pdf]"
# For PDF support with WeasyPrint
sudo yum install gcc python3-devel cairo-devel pango-devel gdk-pixbuf2-devel libffi-devel
pip3 install weasyprint
```
---
## 🏃 Usage
### CLI (Recommended)
```bash
# Scan a directory
secret-scanner scan /path/to/repo
# Scan only Python and JS files
secret-scanner scan . --files "*.py,*.js"
# Scan only changed files (git diff)
secret-scanner scan . --diff-only
# Output as JSON, HTML, or PDF with custom path
secret-scanner scan . --output json --report-path /tmp/my_report.json
secret-scanner scan . --output html --report-name custom_report.html
secret-scanner scan . --output pdf --report-path /reports/security_scan.pdf
# Multi-threaded scan with progress tracking
secret-scanner scan . --max-workers 8 --chunk-size 20 --verbose
# Multi-project scan with custom report
secret-scanner scan-multi /path/to/projects --report-name multi_scan_summary.html
# Install pre-commit hook
secret-scanner install-hook
# Test a regex pattern
secret-scanner test "api_key_[a-zA-Z0-9]{32}" "api_key_1234567890abcdef1234567890abcdef"
```
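The regex from the `test` example can also be sanity-checked directly with Python's `re` module. This checks only the pattern itself; how `secret-scanner test` reports its result is not shown here.

```python
import re

pattern = re.compile(r"api_key_[a-zA-Z0-9]{32}")

# Sample from the CLI example: the prefix followed by exactly 32 alphanumerics.
sample = "api_key_1234567890abcdef1234567890abcdef"
# Only 16 alphanumerics after the prefix, so the {32} quantifier cannot be met.
too_short = "api_key_1234567890abcdef"

print(bool(pattern.search(sample)))     # True
print(bool(pattern.search(too_short)))  # False
```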
### As a Python Module
```python
from secret_scanner.core.scanner import SecretScanner
from secret_scanner.core.config import Config
config = Config.load_default()
scanner = SecretScanner(config)
result = scanner.scan_directory("/path/to/repo")
print(f"Found {len(result.findings)} secrets")
```
---
## ⚙️ Configuration
Create `.secret-scanner.yaml` in your repo root:
```yaml
patterns:
  - name: "Custom API Key"
    regex: "my_custom_key_[a-zA-Z0-9]{32}"
    severity: "high"
exclude:
  - "*.log"
  - "node_modules/"
  - "vendor/"
max_file_size: 10485760 # 10MB
context_lines: 3
max_workers: 8
chunk_size: 10
```
- **Custom Patterns**: Add your own regex rules
- **Exclude**: Skip files or directories
- **Multi-threading**: Tune `max_workers` and `chunk_size`
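To make the `exclude` and `max_file_size` settings concrete, here is a minimal sketch of how such filtering could work, using the values from the example above. The `should_scan` helper is hypothetical, and the scanner's actual matching rules (for instance, how it treats a trailing `/`) may differ.

```python
import fnmatch
from pathlib import PurePosixPath
from typing import Sequence

EXCLUDE = ["*.log", "node_modules/", "vendor/"]
MAX_FILE_SIZE = 10 * 1024 * 1024  # 10485760 bytes


def should_scan(rel_path: str, size_bytes: int,
                exclude: Sequence[str] = EXCLUDE,
                max_file_size: int = MAX_FILE_SIZE) -> bool:
    """Return True if a file (path relative to the repo root) should be scanned."""
    if size_bytes > max_file_size:
        return False
    path = PurePosixPath(rel_path)
    for rule in exclude:
        if rule.endswith("/"):
            # Assumed directory rule: skip files with that directory in their path.
            if rule.rstrip("/") in path.parts[:-1]:
                return False
        elif fnmatch.fnmatch(path.name, rule):
            # Assumed file rule: glob-match against the file name only.
            return False
    return True


print(should_scan("src/app.py", 1200))                     # True
print(should_scan("logs/server.log", 1200))                # False
print(should_scan("web/node_modules/lib/index.js", 1200))  # False
```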
---
## 📧 Email Notification Setup
1. **Create `email_config.json`:**
```json
{
  "smtp_server": "smtp.gmail.com",
  "smtp_port": 587,
  "username": "your-email@gmail.com",
  "password": "your-app-password",
  "use_tls": true,
  "from_email": "your-email@gmail.com",
  "to_emails": ["recipient@example.com"]
}
```
2. **Test Email Setup:**
```bash
secret-scanner scan-multi . --test-email --email-config email_config.json
```
3. **Send Reports via Email:**
```bash
secret-scanner scan-multi /path/to/projects --send-email --email-config email_config.json
```
- **Gmail**: Use an App Password (see Google Account > Security)
- **Other Providers**: Use your SMTP settings
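For readers curious what these settings map to, the sketch below builds a message with Python's standard `email` module and sends it with `smtplib`, using the same keys as `email_config.json`. It is an illustration only: the function names are hypothetical and the scanner's own email code may differ.

```python
import smtplib
from email.message import EmailMessage


def build_report_email(config: dict, subject: str, body: str) -> EmailMessage:
    """Assemble a plain-text message from the config's sender and recipients."""
    msg = EmailMessage()
    msg["Subject"] = subject
    msg["From"] = config["from_email"]
    msg["To"] = ", ".join(config["to_emails"])
    msg.set_content(body)
    return msg


def send_report_email(config: dict, msg: EmailMessage) -> None:
    """Connect, upgrade to TLS if requested, authenticate, and send."""
    with smtplib.SMTP(config["smtp_server"], config["smtp_port"]) as smtp:
        if config.get("use_tls", False):
            smtp.starttls()
        smtp.login(config["username"], config["password"])
        smtp.send_message(msg)


# Placeholder values mirroring the example config above.
cfg = {
    "smtp_server": "smtp.gmail.com",
    "smtp_port": 587,
    "username": "your-email@gmail.com",
    "password": "your-app-password",
    "use_tls": True,
    "from_email": "your-email@gmail.com",
    "to_emails": ["recipient@example.com"],
}
msg = build_report_email(cfg, "Secret scan report", "Scan finished.")
# send_report_email(cfg, msg)  # uncomment once real credentials are in place
```

Port 587 with `use_tls: true` corresponds to a plain connection upgraded via STARTTLS, which is why the sketch calls `starttls()` before `login()`.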
---
## 🏗️ Build & Run
### Build the Package
```bash
python3 build_package.py build
```
### Install from Build
```bash
pip install dist/secret_scanner_tool-1.0.0-py3-none-any.whl
```
### Run the CLI
```bash
secret-scanner scan . --help
```
---
## 🆕 Enhanced Features
### Real-time Progress Tracking
The scanner displays progress bars showing:
- Current file being scanned
- Number of findings discovered
- Completion percentage and time elapsed
- Multi-project scan progress with project-level updates
```bash
# Watch progress in real-time
secret-scanner scan . --verbose
```
### Custom Report Paths
Specify exactly where and how to save your reports:
```bash
# Full path (overrides report name)
secret-scanner scan . --output json --report-path /tmp/custom_report.json
# Custom filename in current directory
secret-scanner scan . --output html --report-name my_scan_report.html
# Multi-project with custom summary report
secret-scanner scan-multi /projects --report-name security_summary.pdf
```
### Improved Pattern Accuracy
The built-in regex patterns are written to reduce false positives:
- Most patterns require an assignment operator (`=` or `:`) or other surrounding context
- Actual secrets are better distinguished from ordinary text
- API key and token patterns match more precisely
- PII detection is more precise
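To illustrate why requiring an assignment operator helps, the sketch below compares a bare keyword pattern with one that demands `=` or `:` followed by a quoted value. Both regexes are hypothetical examples, not the scanner's built-in rules.

```python
import re

# Naive: fires on any mention of the word "password".
NAIVE = re.compile(r"password", re.IGNORECASE)

# Contextual: keyword, '=' or ':', then a quoted value of at least 4 characters.
CONTEXTUAL = re.compile(
    r"""password\s*[=:]\s*(["'])(?P<value>[^"']{4,})\1""",
    re.IGNORECASE,
)

lines = [
    'password = "hunter2-prod"',          # hardcoded secret
    "# Ask the user for their password",  # ordinary comment
    'db_password: "s3cr3t-value"',        # YAML-style assignment
]

for line in lines:
    print(bool(NAIVE.search(line)), bool(CONTEXTUAL.search(line)), line)
```

The naive pattern flags all three lines, while the contextual one skips the comment and still catches both assignments.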
---
## 🛠️ Troubleshooting
- **Missing command?**
  - Reinstall: `pip install -e .`
  - Check: `which secret-scanner` (on Windows: `where secret-scanner`)
- **Import errors?**
  - Check: `python -c "import secret_scanner; print(secret_scanner.__version__)"`
- **PDF issues?**
  - Ensure the system dependencies for WeasyPrint are installed (see the WeasyPrint docs)
- **Email not sending?**
  - Check SMTP credentials and provider restrictions
- **Performance?**
  - Use multi-threading: `--max-workers 8 --chunk-size 20`
  - Exclude large/unnecessary directories
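
As a sketch of what `--max-workers` and `--chunk-size` plausibly control, the example below splits a file list into chunks and scans the chunks on a thread pool. Reading `chunk_size` as "files per task" is an assumption, and `scan_file` is a stand-in, not the scanner's implementation.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Iterator, List, Sequence


def chunked(items: Sequence[str], size: int) -> Iterator[Sequence[str]]:
    """Yield consecutive slices of at most `size` items."""
    for start in range(0, len(items), size):
        yield items[start:start + size]


def scan_file(path: str) -> List[str]:
    """Stand-in scanner: 'finds' something only in files ending in .env."""
    return [f"{path}: possible secret"] if path.endswith(".env") else []


def scan_chunk(paths: Sequence[str]) -> List[str]:
    findings: List[str] = []
    for path in paths:
        findings.extend(scan_file(path))
    return findings


def scan_all(paths: Sequence[str], max_workers: int = 8,
             chunk_size: int = 20) -> List[str]:
    findings: List[str] = []
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        # map() yields results in submission order, so output is deterministic.
        for chunk_findings in pool.map(scan_chunk, chunked(paths, chunk_size)):
            findings.extend(chunk_findings)
    return findings
```

Under this reading, larger chunks mean fewer tasks and less scheduling overhead, while smaller chunks spread uneven work more evenly across workers.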
---
## 🤝 Contributing
1. Fork the repo
2. Create a feature branch
3. Add your changes & tests
4. Submit a pull request
- Code style: [Black](https://black.readthedocs.io/), [flake8](https://flake8.pycqa.org/)
- Tests: [pytest](https://docs.pytest.org/)
---
## 📄 License
MIT License. See [LICENSE](LICENSE) for details.
---
## 🆘 Support
- Create an issue for bugs or questions
- Check this README for common solutions
---
## 🕒 Version History
- **v1.0.0** (Latest): Initial release with multi-threaded scanning, 50+ patterns, multi-project/email support, rich CLI, progress tracking, custom report paths, and improved regex accuracy
---
Raw data
{
"_id": null,
"home_page": null,
"name": "secret-scanner-tool",
"maintainer": null,
"docs_url": null,
"requires_python": ">=3.7",
"maintainer_email": "Saravanan Sathiyamoorthi <saravanansaro976@gmail.com>",
"keywords": "security, secrets, pii, detection, scanning, git, pre-commit, ci-cd, static-analysis, code-review",
"author": null,
"author_email": "Saravanan Sathiyamoorthi <saravanansaro976@gmail.com>",
"download_url": "https://files.pythonhosted.org/packages/00/46/afeb045b85e9ac73b87f81c1ee17c3092bce6fd3366e4bdee941d3f68ed0/secret_scanner_tool-1.0.0.tar.gz",
"platform": null,
"description": "# Secret & PII Scanner\n\nThis tool detects hardcoded secrets and PII in source code using pattern-based (regex) detection only.\n\n## Features\n- Detects API keys, tokens, passwords, emails, credit cards, and more using regex patterns.\n- Customizable detection patterns via `.secret-scanner.yaml`.\n- Multi-language support.\n- Rich reporting (HTML, JSON, etc.).\n\n## How It Works\n- The scanner uses a set of regex patterns to match secrets and PII in your codebase.\n- No entropy-based detection is used; only explicit pattern matches are reported.\n\n## Configuration\n- Add or customize patterns in `.secret-scanner.yaml`.\n\n---\n\n## \ud83d\ude80 Quick Start\n\n```bash\n# 1. Install (from source, recommended for latest features)\ngit clone https://gitlab.com/ox-saro/SecretDetection.git\ncd SecretDetection\npip install -e .\n\n# 2. Run your first scan\nsecret-scanner scan .\n\n# 3. See all options\nsecret-scanner --help\n```\n\n---\n\n## \ud83d\udce6 Installation\n\n### System Requirements\n- **Python**: 3.7 or higher\n- **Git**: For diff-based scanning and pre-commit hooks\n- **Operating Systems**: Windows, macOS, Linux\n\n### Check System Compatibility\nBefore installing, verify your system compatibility:\n```bash\n# After installation, run this to check your system\nsecret-scanner check-system\n```\n\n### From Source (Recommended)\n```bash\ngit clone https://gitlab.com/ox-saro/SecretDetection.git\ncd SecretDetection\npip install -e .\n```\n\n### From Built Package\n```bash\npython3 build_package.py build\npip install dist/secret_scanner-1.0.0-py3-none-any.whl\n```\n\n### For Development\n```bash\npython -m venv venv\nsource venv/bin/activate # On Windows: venv\\Scripts\\activate\npip install -e \".[dev]\"\n```\n\n### Platform-Specific Installation\n\n#### Windows\n```bash\n# Install Python 3.7+ from https://python.org\n# Install Git from https://git-scm.com\npip install secret-scanner[pdf]\n```\n\n#### macOS\n```bash\n# Using Homebrew 
(recommended)\nbrew install python3 git\npip3 install secret-scanner[pdf]\n\n# For PDF support with WeasyPrint\nbrew install cairo pango gdk-pixbuf libffi\npip3 install weasyprint\n```\n\n#### Linux (Ubuntu/Debian)\n```bash\nsudo apt-get update\nsudo apt-get install python3 python3-pip git\npip3 install secret-scanner[pdf]\n\n# For PDF support with WeasyPrint\nsudo apt-get install build-essential python3-dev python3-pip python3-setuptools python3-wheel python3-cffi libcairo2 libpango-1.0-0 libpangocairo-1.0-0 libgdk-pixbuf2.0-0 libffi-dev shared-mime-info\npip3 install weasyprint\n```\n\n#### Linux (CentOS/RHEL)\n```bash\nsudo yum install python3 python3-pip git\npip3 install secret-scanner[pdf]\n\n# For PDF support with WeasyPrint\nsudo yum install gcc python3-devel cairo-devel pango-devel gdk-pixbuf2-devel libffi-devel\npip3 install weasyprint\n```\n\n---\n\n## \ud83c\udfc3 Usage\n\n### CLI (Recommended)\n\n```bash\n# Scan a directory\nsecret-scanner scan /path/to/repo\n\n# Scan only Python and JS files\nsecret-scanner scan . --files \"*.py,*.js\"\n\n# Scan only changed files (git diff)\nsecret-scanner scan . --diff-only\n\n# Output as JSON, HTML, or PDF with custom path\nsecret-scanner scan . --output json --report-path /tmp/my_report.json\nsecret-scanner scan . --output html --report-name custom_report.html\nsecret-scanner scan . --output pdf --report-path /reports/security_scan.pdf\n\n# Multi-threaded scan with progress tracking\nsecret-scanner scan . 
--max-workers 8 --chunk-size 20 --verbose\n\n# Multi-project scan with custom report\nsecret-scanner scan-multi /path/to/projects --report-name multi_scan_summary.html\n\n# Install pre-commit hook\nsecret-scanner install-hook\n\n# Test a regex pattern\nsecret-scanner test \"api_key_[a-zA-Z0-9]{32}\" \"api_key_1234567890abcdef1234567890abcdef\"\n```\n\n### As a Python Module\n\n```python\nfrom secret_scanner.core.scanner import SecretScanner\nfrom secret_scanner.core.config import Config\n\nconfig = Config.load_default()\nscanner = SecretScanner(config)\nresult = scanner.scan_directory(\"/path/to/repo\")\nprint(f\"Found {len(result.findings)} secrets\")\n```\n\n---\n\n## \u2699\ufe0f Configuration\n\nCreate `.secret-scanner.yaml` in your repo root:\n\n```yaml\npatterns:\n - name: \"Custom API Key\"\n regex: \"my_custom_key_[a-zA-Z0-9]{32}\"\n severity: \"high\"\n\nexclude:\n - \"*.log\"\n - \"node_modules/\"\n - \"vendor/\"\n\nmax_file_size: 10485760 # 10MB\ncontext_lines: 3\nmax_workers: 8\nchunk_size: 10\n```\n\n- **Custom Patterns**: Add your own regex rules\n- **Exclude**: Skip files or directories\n- **Multi-threading**: Tune `max_workers` and `chunk_size`\n\n---\n\n## \ud83d\udce7 Email Notification Setup\n\n1. **Create `email_config.json`:**\n```json\n{\n \"smtp_server\": \"smtp.gmail.com\",\n \"smtp_port\": 587,\n \"username\": \"your-email@gmail.com\",\n \"password\": \"your-app-password\",\n \"use_tls\": true,\n \"from_email\": \"your-email@gmail.com\",\n \"to_emails\": [\"recipient@example.com\"]\n}\n```\n\n2. **Test Email Setup:**\n```bash\nsecret-scanner scan-multi . --test-email --email-config email_config.json\n```\n\n3. 
**Send Reports via Email:**\n```bash\nsecret-scanner scan-multi /path/to/projects --send-email --email-config email_config.json\n```\n\n- **Gmail**: Use an App Password (see Google Account > Security)\n- **Other Providers**: Use your SMTP settings\n\n---\n\n## \ud83c\udfd7\ufe0f Build & Run\n\n### Build the Package\n```bash\npython3 build_package.py build\n```\n\n### Install from Build\n```bash\npip install dist/secret_scanner-1.0.0-py3-none-any.whl\n```\n\n### Run the CLI\n```bash\nsecret-scanner scan . --help\n```\n\n---\n\n## \ud83c\udd95 Enhanced Features\n\n### Real-time Progress Tracking\nThe scanner now provides beautiful progress bars showing:\n- Current file being scanned\n- Number of findings discovered\n- Completion percentage and time elapsed\n- Multi-project scan progress with project-level updates\n\n```bash\n# Watch progress in real-time\nsecret-scanner scan . --verbose\n```\n\n### Custom Report Paths\nSpecify exactly where and how to save your reports:\n\n```bash\n# Full path (overrides report name)\nsecret-scanner scan . --output json --report-path /tmp/custom_report.json\n\n# Custom filename in current directory\nsecret-scanner scan . --output html --report-name my_scan_report.html\n\n# Multi-project with custom summary report\nsecret-scanner scan-multi /projects --report-name security_summary.pdf\n```\n\n### Improved Pattern Accuracy\nEnhanced regex patterns reduce false positives:\n- Most patterns now require assignment operators (`=`, `:`) or context\n- Better detection of actual secrets vs. 
legitimate text\n- Improved API key and token pattern matching\n- More precise PII detection\n\n---\n\n## \ud83d\udee0\ufe0f Troubleshooting\n\n- **Missing command?**\n - Reinstall: `pip install -e .`\n - Check: `which secret-scanner`\n- **Import errors?**\n - Check: `python -c \"import secret_scanner; print(secret_scanner.__version__)\"`\n- **PDF/email issues?**\n - Ensure system dependencies for WeasyPrint are installed (see their docs)\n- **Email not sending?**\n - Check SMTP credentials and provider restrictions\n- **Performance?**\n - Use multi-threading: `--max-workers 8 --chunk-size 20`\n - Exclude large/unnecessary directories\n\n---\n\n## \ud83e\udd1d Contributing\n\n1. Fork the repo\n2. Create a feature branch\n3. Add your changes & tests\n4. Submit a pull request\n\n- Code style: [Black](https://black.readthedocs.io/), [flake8](https://flake8.pycqa.org/)\n- Tests: [pytest](https://docs.pytest.org/)\n\n---\n\n## \ud83d\udcc4 License\n\nMIT License. See [LICENSE](LICENSE) for details.\n\n---\n\n## \ud83c\udd98 Support\n\n- Create an issue for bugs or questions\n- Check this README for common solutions\n\n---\n\n## \ud83d\udd52 Version History\n\n- **v1.0.0** (Latest): Initial release with multi-threaded scanning, 50+ patterns, multi-project/email support, rich CLI, progress tracking, custom report paths, and improved regex accuracy\n\n---\n",
"bugtrack_url": null,
"license": "MIT",
"summary": "A lightweight, extensible tool to automatically scan source code for hardcoded secrets and PII",
"version": "1.0.0",
"project_urls": {
"Homepage": "https://gitlab.com/ox-saro/SecretDetection"
},
"split_keywords": [
"security",
" secrets",
" pii",
" detection",
" scanning",
" git",
" pre-commit",
" ci-cd",
" static-analysis",
" code-review"
],
"urls": [
{
"comment_text": null,
"digests": {
"blake2b_256": "29de2835bbec7069942e3cbf7afbb54736a4aeeaec7c1e119290f89d5d6a605e",
"md5": "29382c946fb440b5ba36d78d95b4d211",
"sha256": "4b21c071818b7b467200b81d47db6b11def91590c39514359db5531a1ef3131c"
},
"downloads": -1,
"filename": "secret_scanner_tool-1.0.0-py3-none-any.whl",
"has_sig": false,
"md5_digest": "29382c946fb440b5ba36d78d95b4d211",
"packagetype": "bdist_wheel",
"python_version": "py3",
"requires_python": ">=3.7",
"size": 54657,
"upload_time": "2025-07-10T09:35:47",
"upload_time_iso_8601": "2025-07-10T09:35:47.564837Z",
"url": "https://files.pythonhosted.org/packages/29/de/2835bbec7069942e3cbf7afbb54736a4aeeaec7c1e119290f89d5d6a605e/secret_scanner_tool-1.0.0-py3-none-any.whl",
"yanked": false,
"yanked_reason": null
},
{
"comment_text": null,
"digests": {
"blake2b_256": "0046afeb045b85e9ac73b87f81c1ee17c3092bce6fd3366e4bdee941d3f68ed0",
"md5": "7b4e63ea3f4969a7bddfe37b1e2e957f",
"sha256": "8a19db737b4c81336017665b45f3e7c7d13a2fd00ba974a9424844c76ef15471"
},
"downloads": -1,
"filename": "secret_scanner_tool-1.0.0.tar.gz",
"has_sig": false,
"md5_digest": "7b4e63ea3f4969a7bddfe37b1e2e957f",
"packagetype": "sdist",
"python_version": "source",
"requires_python": ">=3.7",
"size": 51891,
"upload_time": "2025-07-10T09:35:49",
"upload_time_iso_8601": "2025-07-10T09:35:49.234062Z",
"url": "https://files.pythonhosted.org/packages/00/46/afeb045b85e9ac73b87f81c1ee17c3092bce6fd3366e4bdee941d3f68ed0/secret_scanner_tool-1.0.0.tar.gz",
"yanked": false,
"yanked_reason": null
}
],
"upload_time": "2025-07-10 09:35:49",
"github": false,
"gitlab": true,
"bitbucket": false,
"codeberg": false,
"gitlab_user": "ox-saro",
"gitlab_project": "SecretDetection",
"lcname": "secret-scanner-tool"
}