detect-secrets

Name	detect-secrets
Version	1.5.0
home_page	https://github.com/Yelp/detect-secrets
Summary	Tool for detecting secrets in the codebase
upload_time	2024-05-06 17:46:19
maintainer	None
docs_url	None
author	Yelp, Inc.
requires_python	None
license	None
keywords	secret-management pre-commit security entropy-checks

[![Build Status](https://github.com/Yelp/detect-secrets/actions/workflows/ci.yml/badge.svg)](https://github.com/Yelp/detect-secrets/actions/workflows/ci.yml?query=branch%3Amaster++)
[![PyPI version](https://badge.fury.io/py/detect-secrets.svg)](https://badge.fury.io/py/detect-secrets)
[![Homebrew](https://img.shields.io/badge/dynamic/json.svg?url=https://formulae.brew.sh/api/formula/detect-secrets.json&query=$.versions.stable&label=homebrew)](https://formulae.brew.sh/formula/detect-secrets)
[![PRs Welcome](https://img.shields.io/badge/PRs-welcome-ff69b4.svg)](https://github.com/Yelp/detect-secrets/issues?q=is%3Aissue+is%3Aopen+label%3A%22good+first+issue%22+)
[![AMF](https://img.shields.io/badge/Donate-Charity-orange.svg)](https://www.againstmalaria.com/donation.aspx)

# detect-secrets

## About

`detect-secrets` is an aptly named module for (surprise, surprise) **detecting secrets** within a
code base.

However, unlike other similar packages that solely focus on finding secrets, this package is
designed with the enterprise client in mind: providing a **backwards compatible**, systematic
means of:

1. Preventing new secrets from entering the code base,
2. Detecting if such preventions are explicitly bypassed, and
3. Providing a checklist of secrets to roll, and migrate off to a more secure storage.

This way, you create a
[separation of concerns](https://en.wikipedia.org/wiki/Separation_of_concerns):
accepting that there may *currently* be secrets hiding in your large repository
(this is what we refer to as a _baseline_), but preventing this issue from getting any larger,
without dealing with the potentially gargantuan effort of moving existing secrets away.

It does this by checking diffs against heuristically crafted regular expressions
to identify whether any *new* secret has been committed. This avoids the overhead of
digging through all git history, as well as the need to scan the entire repository every time.
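
The baseline idea can be sketched in a few lines of Python. This is a simplified illustration rather than the library's implementation: it assumes a finding is identified by its filename, detector type, and a hash of the secret value, and the helper names `fingerprint` and `find_new_secrets` are hypothetical.

```python
import hashlib


def fingerprint(filename, secret_type, secret_value):
    """Identify a finding without keeping the secret in plain text."""
    digest = hashlib.sha1(secret_value.encode('utf-8')).hexdigest()
    return (filename, secret_type, digest)


def find_new_secrets(baseline, scan_results):
    """Return findings from the latest scan that the baseline does not contain."""
    return sorted(set(scan_results) - set(baseline))


# Assumed example data: one known finding, one newly introduced finding.
baseline = {fingerprint('config.ini', 'Keyword', 'old-password')}
latest = {
    fingerprint('config.ini', 'Keyword', 'old-password'),    # already accepted
    fingerprint('deploy.sh', 'Keyword', 'brand-new-token'),  # newly added
}

new_findings = find_new_secrets(baseline, latest)
print([filename for filename, _, _ in new_findings])
```

Only the finding that is missing from the baseline is reported, which is why existing secrets do not block new commits.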

For a look at recent changes, please see [CHANGELOG.md](CHANGELOG.md).

If you are looking to contribute, please see [CONTRIBUTING.md](CONTRIBUTING.md).

For more detailed documentation, check out our other [documentation](docs/).

## Examples

### Quickstart:

Create a baseline of potential secrets currently found in your git repository.

```bash
$ detect-secrets scan > .secrets.baseline
```

or, to run it from a different directory:

```bash
$ detect-secrets -C /path/to/directory scan > /path/to/directory/.secrets.baseline
```

**Scanning non-git tracked files:**

```bash
$ detect-secrets scan test_data/ --all-files > .secrets.baseline
```

### Adding New Secrets to Baseline:

This will rescan your codebase, and:

1. Update/upgrade your baseline to be compatible with the latest version,
2. Add any new secrets it finds to your baseline,
3. Remove any secrets no longer in your codebase

This will also preserve any labelled secrets you have.

```bash
$ detect-secrets scan --baseline .secrets.baseline
```

For baselines older than version 0.9, just recreate them.
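
The three update steps can be pictured as a merge of two result sets. The sketch below is only a model of that behaviour under assumed data shapes: findings are keyed by `(filename, hashed_secret)`, labels live under an `is_secret` key, and `update_baseline` is a hypothetical helper, not a function exported by `detect-secrets`.

```python
def update_baseline(old_results, new_results):
    """Merge a fresh scan into an existing baseline.

    Findings only present in the old baseline are dropped, findings only
    present in the new scan are added, and labels on findings present in
    both are carried over.
    """
    merged = {}
    for key, finding in new_results.items():
        entry = dict(finding)
        previous = old_results.get(key)
        if previous is not None and 'is_secret' in previous:
            # Preserve the label assigned during an earlier audit.
            entry['is_secret'] = previous['is_secret']
        merged[key] = entry
    return merged


old = {
    ('a.py', 'hash1'): {'type': 'Keyword', 'is_secret': False},
    ('b.py', 'hash2'): {'type': 'Keyword'},
}
new = {
    ('a.py', 'hash1'): {'type': 'Keyword'},  # still present
    ('c.py', 'hash3'): {'type': 'Keyword'},  # newly found
}

merged = update_baseline(old, new)
print(sorted(merged))
```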

### Alerting off newly added secrets:

**Scanning Staged Files Only:**

```bash
$ git diff --staged --name-only -z | xargs -0 detect-secrets-hook --baseline .secrets.baseline
```

**Scanning All Tracked Files:**

```bash
$ git ls-files -z | xargs -0 detect-secrets-hook --baseline .secrets.baseline
```

### Viewing All Enabled Plugins:

```bash
$ detect-secrets scan --list-all-plugins
ArtifactoryDetector
AWSKeyDetector
AzureStorageKeyDetector
BasicAuthDetector
CloudantDetector
DiscordBotTokenDetector
GitHubTokenDetector
GitLabTokenDetector
Base64HighEntropyString
HexHighEntropyString
IbmCloudIamDetector
IbmCosHmacDetector
IPPublicDetector
JwtTokenDetector
KeywordDetector
MailchimpDetector
NpmDetector
OpenAIDetector
PrivateKeyDetector
PypiTokenDetector
SendGridDetector
SlackDetector
SoftlayerDetector
SquareOAuthDetector
StripeDetector
TelegramBotTokenDetector
TwilioKeyDetector
```

### Disabling Plugins:

```bash
$ detect-secrets scan --disable-plugin KeywordDetector --disable-plugin AWSKeyDetector
```

If you want to **only** run a specific plugin, you can do:

```bash
$ detect-secrets scan --list-all-plugins | \
    grep -v 'BasicAuthDetector' | \
    sed "s#^#--disable-plugin #g" | \
    xargs detect-secrets scan test_data
```
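
If you drive the CLI from a script, the same argument list can be built in Python. This is a minimal sketch: `only_plugin_args` is a hypothetical helper, and the plugin list is a shortened, hard-coded sample rather than the output of `--list-all-plugins`.

```python
import shlex


def only_plugin_args(all_plugins, keep):
    """Build `--disable-plugin` arguments for every plugin except `keep`."""
    args = []
    for name in all_plugins:
        if name != keep:
            args += ['--disable-plugin', name]
    return args


# Shortened sample of plugin names, for illustration only.
plugins = ['AWSKeyDetector', 'BasicAuthDetector', 'KeywordDetector']

command = ['detect-secrets', 'scan', 'test_data']
command += only_plugin_args(plugins, 'BasicAuthDetector')
print(shlex.join(command))
```

The resulting list can be passed to `subprocess.run` as-is, which avoids shell quoting issues.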

### Auditing a Baseline:

This is an optional step to label the results in your baseline. It can be used to narrow down your
checklist of secrets to migrate, or to better configure your plugins to improve its signal-to-noise
ratio.

```bash
$ detect-secrets audit .secrets.baseline
```

### Usage in Other Python Scripts

**Basic Use:**

```python
import json

from detect_secrets import SecretsCollection
from detect_secrets.settings import default_settings

secrets = SecretsCollection()
with default_settings():
    secrets.scan_file('test_data/config.ini')

print(json.dumps(secrets.json(), indent=2))
```

**More Advanced Configuration:**

```python
from detect_secrets import SecretsCollection
from detect_secrets.settings import transient_settings

secrets = SecretsCollection()
with transient_settings({
    # Only run scans with only these plugins.
    # This format is the same as the one that is saved in the generated baseline.
    'plugins_used': [
        # Example of configuring a built-in plugin
        {
            'name': 'Base64HighEntropyString',
            'limit': 5.0,
        },

        # Example of using a custom plugin
        {
            'name': 'HippoDetector',
            'path': 'file:///Users/aaronloo/Documents/github/detect-secrets/testing/plugins.py',
        },
    ],

    # We can also specify whichever additional filters we want.
    # This is an example of using the function `is_identified_by_ML_model` within the
    # local file `./private-filters/example.py`.
    'filters_used': [
        {
            'path': 'file://private-filters/example.py::is_identified_by_ML_model',
        },
    ]
}) as settings:
    # If we want to make any further adjustments to the created settings object (e.g.
    # disabling default filters), we can do so as such.
    settings.disable_filters(
        'detect_secrets.filters.heuristic.is_prefixed_with_dollar_sign',
        'detect_secrets.filters.heuristic.is_likely_id_string',
    )

    secrets.scan_file('test_data/config.ini')
```

## Installation

```bash
$ pip install detect-secrets
✨🍰✨
```

Install via [brew](https://brew.sh/):

```bash
$ brew install detect-secrets
```

## Usage

`detect-secrets` comes with three different tools, and there is often confusion around which one
to use. Use this handy checklist to help you decide:

1. Do you want to add secrets to your baseline? If so, use **`detect-secrets scan`**.
2. Do you want to alert off new secrets not in the baseline? If so, use **`detect-secrets-hook`**.
3. Are you analyzing the baseline itself? If so, use **`detect-secrets audit`**.

### Adding Secrets to Baseline

```
$ detect-secrets scan --help
usage: detect-secrets scan [-h] [--string [STRING]] [--only-allowlisted]
                           [--all-files] [--baseline FILENAME]
                           [--force-use-all-plugins] [--slim]
                           [--list-all-plugins] [-p PLUGIN]
                           [--base64-limit [BASE64_LIMIT]]
                           [--hex-limit [HEX_LIMIT]]
                           [--disable-plugin DISABLE_PLUGIN]
                           [-n | --only-verified]
                           [--exclude-lines EXCLUDE_LINES]
                           [--exclude-files EXCLUDE_FILES]
                           [--exclude-secrets EXCLUDE_SECRETS]
                           [--word-list WORD_LIST_FILE] [-f FILTER]
                           [--disable-filter DISABLE_FILTER]
                           [path [path ...]]

Scans a repository for secrets in code. The generated output is compatible
with `detect-secrets-hook --baseline`.

positional arguments:
  path                  Scans the entire codebase and outputs a snapshot of
                        currently identified secrets.

optional arguments:
  -h, --help            show this help message and exit
  --string [STRING]     Scans an individual string, and displays configured
                        plugins' verdict.
  --only-allowlisted    Only scans the lines that are flagged with `allowlist
                        secret`. This helps verify that individual exceptions
                        are indeed non-secrets.

scan options:
  --all-files           Scan all files recursively (as compared to only
                        scanning git tracked files).
  --baseline FILENAME   If provided, will update existing baseline by
                        importing settings from it.
  --force-use-all-plugins
                        If a baseline is provided, detect-secrets will default
                        to loading the plugins specified by that baseline.
                        However, this may also mean it doesn't perform the
                        scan with the latest plugins. If this flag is
                        provided, it will always use the latest plugins
  --slim                Slim baselines are created with the intention of
                        minimizing differences between commits. However, they
                        are not compatible with the `audit` functionality, and
                        slim baselines will need to be remade to be audited.

plugin options:
  Configure settings for each secret scanning ruleset. By default, all
  plugins are enabled unless explicitly disabled.

  --list-all-plugins    Lists all plugins that will be used for the scan.
  -p PLUGIN, --plugin PLUGIN
                        Specify path to custom secret detector plugin.
  --base64-limit [BASE64_LIMIT]
                        Sets the entropy limit for high entropy strings. Value
                        must be between 0.0 and 8.0, defaults to 4.5.
  --hex-limit [HEX_LIMIT]
                        Sets the entropy limit for high entropy strings. Value
                        must be between 0.0 and 8.0, defaults to 3.0.
  --disable-plugin DISABLE_PLUGIN
                        Plugin class names to disable. e.g.
                        Base64HighEntropyString

filter options:
  Configure settings for filtering out secrets after they are flagged by the
  engine.

  -n, --no-verify       Disables additional verification of secrets via
                        network call.
  --only-verified       Only flags secrets that can be verified.
  --exclude-lines EXCLUDE_LINES
                        If lines match this regex, it will be ignored.
  --exclude-files EXCLUDE_FILES
                        If filenames match this regex, it will be ignored.
  --exclude-secrets EXCLUDE_SECRETS
                        If secrets match this regex, it will be ignored.
  --word-list WORD_LIST_FILE
                        Text file with a list of words, if a secret contains a
                        word in the list we ignore it.
  -f FILTER, --filter FILTER
                        Specify path to custom filter. May be a python module
                        path (e.g.
                        detect_secrets.filters.common.is_invalid_file) or a
                        local file path (e.g.
                        file://path/to/file.py::function_name).
  --disable-filter DISABLE_FILTER
                        Specify filter to disable. e.g.
                        detect_secrets.filters.common.is_invalid_file
```

### Blocking Secrets not in Baseline

```
$ detect-secrets-hook --help
usage: detect-secrets-hook [-h] [-v] [--version] [--baseline FILENAME]
                           [--list-all-plugins] [-p PLUGIN]
                           [--base64-limit [BASE64_LIMIT]]
                           [--hex-limit [HEX_LIMIT]]
                           [--disable-plugin DISABLE_PLUGIN]
                           [-n | --only-verified]
                           [--exclude-lines EXCLUDE_LINES]
                           [--exclude-files EXCLUDE_FILES]
                           [--exclude-secrets EXCLUDE_SECRETS]
                           [--word-list WORD_LIST_FILE] [-f FILTER]
                           [--disable-filter DISABLE_FILTER]
                           [filenames [filenames ...]]

positional arguments:
  filenames             Filenames to check.

optional arguments:
  -h, --help            show this help message and exit
  -v, --verbose         Verbose mode.
  --version             Display version information.
  --json                Print detect-secrets-hook output as JSON
  --baseline FILENAME   Explicitly ignore secrets through a baseline generated
                        by `detect-secrets scan`

plugin options:
  Configure settings for each secret scanning ruleset. By default, all
  plugins are enabled unless explicitly disabled.

  --list-all-plugins    Lists all plugins that will be used for the scan.
  -p PLUGIN, --plugin PLUGIN
                        Specify path to custom secret detector plugin.
  --base64-limit [BASE64_LIMIT]
                        Sets the entropy limit for high entropy strings. Value
                        must be between 0.0 and 8.0, defaults to 4.5.
  --hex-limit [HEX_LIMIT]
                        Sets the entropy limit for high entropy strings. Value
                        must be between 0.0 and 8.0, defaults to 3.0.
  --disable-plugin DISABLE_PLUGIN
                        Plugin class names to disable. e.g.
                        Base64HighEntropyString

filter options:
  Configure settings for filtering out secrets after they are flagged by the
  engine.

  -n, --no-verify       Disables additional verification of secrets via
                        network call.
  --only-verified       Only flags secrets that can be verified.
  --exclude-lines EXCLUDE_LINES
                        If lines match this regex, it will be ignored.
  --exclude-files EXCLUDE_FILES
                        If filenames match this regex, it will be ignored.
  --exclude-secrets EXCLUDE_SECRETS
                        If secrets match this regex, it will be ignored.
  -f FILTER, --filter FILTER
                        Specify path to custom filter. May be a python module
                        path (e.g.
                        detect_secrets.filters.common.is_invalid_file) or a
                        local file path (e.g.
                        file://path/to/file.py::function_name).
  --disable-filter DISABLE_FILTER
                        Specify filter to disable. e.g.
                        detect_secrets.filters.common.is_invalid_file
```

We recommend setting this up as a pre-commit hook. One way to do this is by using the
[pre-commit](https://github.com/pre-commit/pre-commit) framework:

```yaml
# .pre-commit-config.yaml
repos:
-   repo: https://github.com/Yelp/detect-secrets
    rev: v1.5.0
    hooks:
    -   id: detect-secrets
        args: ['--baseline', '.secrets.baseline']
        exclude: package.lock.json
```

#### Inline Allowlisting

There are times when we want to exclude a false positive from blocking a commit, without creating
a baseline to do so. You can do so by adding a comment as such:

```python
secret = "hunter2" # pragma: allowlist secret
```

```javascript
// pragma: allowlist nextline secret
const secret = "hunter2";
```

### Auditing Secrets in Baseline

```bash
$ detect-secrets audit --help
usage: detect-secrets audit [-h] [--diff] [--stats]
                            [--report] [--only-real | --only-false]
                            [--json]
                            filename [filename ...]

Auditing a baseline allows analysts to label results, and optimize plugins for
the highest signal-to-noise ratio for their environment.

positional arguments:
  filename      Audit a given baseline file to distinguish the difference
                between false and true positives.

optional arguments:
  -h, --help    show this help message and exit
  --diff        Allows the comparison of two baseline files, in order to
                effectively distinguish the difference between various plugin
                configurations.
  --stats       Displays the results of an interactive auditing session which
                have been saved to a baseline file.
  --report      Displays a report with the secrets detected

reporting:
  Display a summary with all the findings and the made decisions. To be used with the report mode (--report).

  --only-real   Only includes real secrets in the report
  --only-false  Only includes false positives in the report

analytics:
  Quantify the success of your plugins based on the labelled results in your
  baseline. To be used with the statistics mode (--stats).

  --json        Outputs results in a machine-readable format.
```
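
To give a feel for what quantifying plugins from labelled results can mean, here is a rough sketch that computes a per-plugin precision. It is not the output format or the exact metric of `--stats`; the `(plugin, label)` input shape and the function `precision_by_plugin` are assumptions made for illustration.

```python
from collections import Counter


def precision_by_plugin(findings):
    """Compute the share of real secrets among labelled findings, per plugin.

    `findings` is an iterable of (plugin_name, label) pairs, where label is
    True (real secret), False (false positive), or None (not yet audited).
    """
    real = Counter()
    total = Counter()
    for plugin, label in findings:
        if label is None:
            continue  # unlabelled findings say nothing about precision
        total[plugin] += 1
        if label:
            real[plugin] += 1
    return {plugin: real[plugin] / total[plugin] for plugin in total}


# Hypothetical audit results.
labelled = [
    ('KeywordDetector', True),
    ('KeywordDetector', False),
    ('KeywordDetector', False),
    ('KeywordDetector', False),
    ('HexHighEntropyString', True),
    ('HexHighEntropyString', False),
    ('HexHighEntropyString', None),
]

stats = precision_by_plugin(labelled)
print(stats)
```

A plugin with low precision in your environment is a candidate for tuning its limits or adding filters.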

## Configuration

This tool operates through a system of **plugins** and **filters**.

- **Plugins** find secrets in code
- **Filters** ignore false positives to increase scanning precision

You can adjust both to suit your precision/recall needs.

### Plugins

There are three different strategies we employ to try and find secrets in code:

1. Regex-based Rules

These are the most common type of plugin, and work well with well-structured secrets.
These secrets can optionally be [verified](docs/plugins.md#Verified-Secrets), which increases
scanning precision. However, solely depending on these may negatively affect the recall of your
scan.

2. Entropy Detector

This searches for "secret-looking" strings through a variety of heuristic approaches. This
is great for non-structured secrets, but may require tuning to adjust the scanning precision.

3. Keyword Detector

This ignores the secret value, and searches for variable names that are often associated with
assigning secrets with hard-coded values. This is great for "non-secret-looking" strings (e.g.
le3tc0de passwords), but may require tuning filters to adjust the scanning precision.
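
To make the entropy strategy concrete, the following sketch computes the Shannon entropy of a string and compares it against a limit. It is an illustration under assumptions, not the plugins' actual code: the library's calculation may differ in detail (for example in how it treats each plugin's character set), and `shannon_entropy` and `looks_high_entropy` are hypothetical names. The limits 4.5 and 3.0 are the defaults quoted in the `--base64-limit` and `--hex-limit` help text.

```python
import math
from collections import Counter


def shannon_entropy(value):
    """Return the Shannon entropy of `value`, in bits per character."""
    if not value:
        return 0.0
    length = len(value)
    counts = Counter(value)
    return sum(
        -(count / length) * math.log2(count / length)
        for count in counts.values()
    )


def looks_high_entropy(value, limit):
    """Flag a string whose entropy exceeds the configured limit."""
    return shannon_entropy(value) > limit


HEX_LIMIT = 3.0     # default quoted for --hex-limit
BASE64_LIMIT = 4.5  # default quoted for --base64-limit

print(looks_high_entropy('aaaaaaaa', HEX_LIMIT))
print(looks_high_entropy('0123456789abcdef', HEX_LIMIT))
```

Raising the limit trades recall for precision: fewer random-looking identifiers are flagged, but short or repetitive secrets are more likely to slip through.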

Want to find a secret that we don't currently catch? You can also (easily) develop your own
plugin, and use it with the engine! For more information, check out the
[plugin documentation](docs/plugins.md#Using-Your-Own-Plugin).

### Filters

`detect-secrets` comes with several different built-in filters that may suit your needs.

#### --exclude-lines

Sometimes, you want to be able to globally allow certain lines in your scan, if they match a
specific pattern. You can specify a regex rule as such:

```bash
$ detect-secrets scan --exclude-lines 'password = (blah|fake)'
```

Or you can specify multiple regex rules as such:

```bash
$ detect-secrets scan --exclude-lines 'password = blah' --exclude-lines 'password = fake'
```
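
Both invocations are meant to exclude the same lines. The sketch below models that in Python, assuming a line is excluded when any pattern matches anywhere in it (search semantics are an assumption here, and `is_line_excluded` is a hypothetical helper).

```python
import re


def is_line_excluded(line, patterns):
    """Return True if any exclusion regex matches somewhere in the line."""
    return any(re.search(pattern, line) for pattern in patterns)


single_rule = ['password = (blah|fake)']
multiple_rules = ['password = blah', 'password = fake']

lines = ['password = blah', 'password = fake', 'password = hunter2']
for line in lines:
    print(line, is_line_excluded(line, single_rule), is_line_excluded(line, multiple_rules))
```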

#### --exclude-files

Sometimes, you want to be able to ignore certain files in your scan. You can specify a regex
pattern to do so, and if the filename meets this regex pattern, it will not be scanned:

```bash
$ detect-secrets scan --exclude-files '.*\.signature$'
```

Or you can specify multiple regex patterns as such:

```bash
$ detect-secrets scan --exclude-files '.*\.signature$' --exclude-files '.*/i18n/.*'
```

#### --exclude-secrets

Sometimes, you want to be able to ignore certain secret values in your scan. You can specify
a regex rule as such:

```bash
$ detect-secrets scan --exclude-secrets '(fakesecret|\${.*})'
```

Or you can specify multiple regex rules as such:

```bash
$ detect-secrets scan --exclude-secrets 'fakesecret' --exclude-secrets '\${.*}'
```
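
The `\${.*}` rule is aimed at template placeholders such as `${DB_PASSWORD}`, where the `$` must be escaped so it is read as a literal character rather than an end-of-line anchor. A small sketch of the effect, using Python's `re` module and a hypothetical helper `filter_secrets` over made-up candidate values:

```python
import re

# Same pattern as in the shell example; the backslash makes `$` literal.
EXCLUDE = re.compile(r'(fakesecret|\${.*})')


def filter_secrets(candidates):
    """Drop candidate values that match the exclusion regex."""
    return [value for value in candidates if not EXCLUDE.search(value)]


candidates = ['${DB_PASSWORD}', 'fakesecret', 'hunter2']
print(filter_secrets(candidates))
```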

#### Inline Allowlisting

Sometimes, you want to apply an exclusion to a specific line, rather than globally excluding it.
You can do so with inline allowlisting as such:

```python
API_KEY = 'this-will-ordinarily-be-detected-by-a-plugin' # pragma: allowlist secret
```

These comments are supported in multiple languages. e.g.

```javascript
const GoogleCredentialPassword = "something-secret-here"; // pragma: allowlist secret
```

You can also use:

```python
# pragma: allowlist nextline secret
API_KEY = 'WillAlsoBeIgnored'
```

This may be a convenient way for you to ignore secrets, without needing to regenerate the entire
baseline again. If you need to explicitly search for these allowlisted secrets, you can also do:

```bash
$ detect-secrets scan --only-allowlisted
```
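
Conceptually, the two comment forms cover the current line and the following line. The sketch below models that rule with plain substring checks; the real filter is aware of per-language comment syntax, so treat this as an approximation, and note that `is_allowlisted` is a hypothetical helper.

```python
SAME_LINE = 'pragma: allowlist secret'
NEXT_LINE = 'pragma: allowlist nextline secret'


def is_allowlisted(lines, index):
    """Return True if lines[index] is exempted by an allowlist comment."""
    if SAME_LINE in lines[index]:
        return True
    # A `nextline` comment on the previous line exempts this line.
    if index > 0 and NEXT_LINE in lines[index - 1]:
        return True
    return False


source = [
    "API_KEY = 'abc123'  # pragma: allowlist secret",
    "# pragma: allowlist nextline secret",
    "TOKEN = 'WillAlsoBeIgnored'",
    "PASSWORD = 'still-flagged'",
]

print([is_allowlisted(source, i) for i in range(len(source))])
```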

Want to write more custom logic to filter out false positives? Check out how to do this in
our [filters documentation](docs/filters.md#Using-Your-Own-Filters).

## Extensions

### wordlist

The `--exclude-secrets` flag allows you to specify regex rules to exclude secret values. However,
if you want to specify a large list of words instead, you can use the `--word-list` flag.

To use this feature, be sure to install the `pyahocorasick` package, or simply use:

```bash
$ pip install detect-secrets[word_list]
```

Then, you can use it as such:

```bash
$ cat wordlist.txt
not-a-real-secret
$ cat sample.ini
password = not-a-real-secret

# Will show results
$ detect-secrets scan sample.ini

# No results found
$ detect-secrets scan --word-list wordlist.txt sample.ini
```
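
The effect of the word list can be approximated in plain Python. This sketch uses a naive substring check instead of the `pyahocorasick` automaton that the extension relies on, and the case-insensitive comparison is an assumption; `load_words` and `contains_listed_word` are hypothetical helpers.

```python
def load_words(text):
    """Parse word-list content: one word per line, blank lines ignored."""
    return [line.strip().lower() for line in text.splitlines() if line.strip()]


def contains_listed_word(secret, words):
    """Return True if the secret contains any word from the list."""
    lowered = secret.lower()
    return any(word in lowered for word in words)


words = load_words("not-a-real-secret\n")

print(contains_listed_word('not-a-real-secret', words))
print(contains_listed_word('hunter2', words))
```

A naive scan like this costs one pass per word, which is why a dedicated multi-pattern matcher is preferable for large lists.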

### Gibberish Detector

The Gibberish Detector is a simple ML model, that attempts to determine whether a secret value
is actually gibberish, with the assumption that **real** secret values are not word-like.

To use this feature, be sure to install the `gibberish-detector` package, or use:

```bash
$ pip install detect-secrets[gibberish]
```

Check out the [gibberish-detector](https://github.com/domanchi/gibberish-detector) package for
more information on how to train the model. A pre-trained model (seeded by processing RFCs) is
included for easy use.

You can also specify your own model as such:

```bash
$ detect-secrets scan --gibberish-model custom.model
```

This is not a default filter, given that it will ignore word-like secrets such as `password`.

## Caveats

This is not meant to be a sure-fire solution to prevent secrets from entering the codebase. Only
proper developer education can truly do that. This pre-commit hook merely implements several
heuristics to try and prevent obvious cases of committing secrets.

**Things That Won't Be Prevented:**

- Multi-line secrets
- Default passwords that don't trigger the `KeywordDetector` (e.g. `login = "hunter2"`)

## FAQ

### General

- **"Did not detect git repository." warning encountered, even though I'm in a git repo.**

Check to see whether your `git` version is >= 1.8.5. If not, please upgrade it then try again.
[More details here](https://github.com/Yelp/detect-secrets/issues/220).

### Windows

- **`detect-secrets audit` displays "Not a valid baseline file!" after creating baseline.**

Ensure the file encoding of your baseline file is UTF-8.
[More details here](https://github.com/Yelp/detect-secrets/issues/272#issuecomment-619187136).
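
If the baseline was written in another encoding, it can be rewritten as UTF-8 with a few lines of Python. This is a sketch under an assumption: that the file was saved as UTF-16, which some Windows shells produce when output is redirected with `>`. Adjust `source_encoding` to whatever your file actually uses; `reencode_to_utf8` is a hypothetical helper.

```python
import json
import os
import tempfile


def reencode_to_utf8(path, source_encoding='utf-16'):
    """Rewrite the file at `path` in place as UTF-8."""
    with open(path, 'r', encoding=source_encoding) as f:
        content = f.read()
    with open(path, 'w', encoding='utf-8', newline='\n') as f:
        f.write(content)


# Demonstration with a temporary file standing in for .secrets.baseline.
tmp_dir = tempfile.mkdtemp()
baseline_path = os.path.join(tmp_dir, '.secrets.baseline')
with open(baseline_path, 'w', encoding='utf-16') as f:
    f.write('{"results": {}}')

reencode_to_utf8(baseline_path)

with open(baseline_path, 'r', encoding='utf-8') as f:
    loaded = json.load(f)
print(loaded)
```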

Raw data

            {
    "_id": null,
    "home_page": "https://github.com/Yelp/detect-secrets",
    "name": "detect-secrets",
    "maintainer": null,
    "docs_url": null,
    "requires_python": null,
    "maintainer_email": null,
    "keywords": "secret-management, pre-commit, security, entropy-checks",
    "author": "Yelp, Inc.",
    "author_email": "opensource@yelp.com",
    "download_url": "https://files.pythonhosted.org/packages/69/67/382a863fff94eae5a0cf05542179169a1c49a4c8784a9480621e2066ca7d/detect_secrets-1.5.0.tar.gz",
    "platform": null,
    "description": "[![Build Status](https://github.com/Yelp/detect-secrets/actions/workflows/ci.yml/badge.svg)](https://github.com/Yelp/detect-secrets/actions/workflows/ci.yml?query=branch%3Amaster++)\n[![PyPI version](https://badge.fury.io/py/detect-secrets.svg)](https://badge.fury.io/py/detect-secrets)\n[![Homebrew](https://img.shields.io/badge/dynamic/json.svg?url=https://formulae.brew.sh/api/formula/detect-secrets.json&query=$.versions.stable&label=homebrew)](https://formulae.brew.sh/formula/detect-secrets)\n[![PRs Welcome](https://img.shields.io/badge/PRs-welcome-ff69b4.svg)](https://github.com/Yelp/detect-secrets/issues?q=is%3Aissue+is%3Aopen+label%3A%22good+first+issue%22+)\n[![AMF](https://img.shields.io/badge/Donate-Charity-orange.svg)](https://www.againstmalaria.com/donation.aspx)\n\n# detect-secrets\n\n## About\n\n`detect-secrets` is an aptly named module for (surprise, surprise) **detecting secrets** within a\ncode base.\n\nHowever, unlike other similar packages that solely focus on finding secrets, this package is\ndesigned with the enterprise client in mind: providing a **backwards compatible**, systematic\nmeans of:\n\n1. Preventing new secrets from entering the code base,\n2. Detecting if such preventions are explicitly bypassed, and\n3. Providing a checklist of secrets to roll, and migrate off to a more secure storage.\n\nThis way, you create a\n[separation of concern](https://en.wikipedia.org/wiki/Separation_of_concerns):\naccepting that there may *currently* be secrets hiding in your large repository\n(this is what we refer to as a _baseline_), but preventing this issue from getting any larger,\nwithout dealing with the potentially gargantuan effort of moving existing secrets away.\n\nIt does this by running periodic diff outputs against heuristically crafted regex statements,\nto identify whether any *new* secret has been committed. 
This way, it avoids the overhead of\ndigging through all git history, as well as the need to scan the entire repository every time.\n\nFor a look at recent changes, please see [CHANGELOG.md](CHANGELOG.md).\n\nIf you are looking to contribute, please see [CONTRIBUTING.md](CONTRIBUTING.md).\n\nFor more detailed documentation, check out our other [documentation](docs/).\n\n## Examples\n\n### Quickstart:\n\nCreate a baseline of potential secrets currently found in your git repository.\n\n```bash\n$ detect-secrets scan > .secrets.baseline\n```\n\nor, to run it from a different directory:\n\n```bash\n$ detect-secrets -C /path/to/directory scan > /path/to/directory/.secrets.baseline\n```\n\n**Scanning non-git tracked files:**\n\n```bash\n$ detect-secrets scan test_data/ --all-files > .secrets.baseline\n```\n\n### Adding New Secrets to Baseline:\n\nThis will rescan your codebase, and:\n\n1. Update/upgrade your baseline to be compatible with the latest version,\n2. Add any new secrets it finds to your baseline,\n3. 
Remove any secrets no longer in your codebase\n\nThis will also preserve any labelled secrets you have.\n\n```bash\n$ detect-secrets scan --baseline .secrets.baseline\n```\n\nFor baselines older than version 0.9, just recreate it.\n\n### Alerting off newly added secrets:\n\n**Scanning Staged Files Only:**\n\n```bash\n$ git diff --staged --name-only -z | xargs -0 detect-secrets-hook --baseline .secrets.baseline\n```\n\n**Scanning All Tracked Files:**\n\n```bash\n$ git ls-files -z | xargs -0 detect-secrets-hook --baseline .secrets.baseline\n```\n\n### Viewing All Enabled Plugins:\n\n```bash\n$ detect-secrets scan --list-all-plugins\nArtifactoryDetector\nAWSKeyDetector\nAzureStorageKeyDetector\nBasicAuthDetector\nCloudantDetector\nDiscordBotTokenDetector\nGitHubTokenDetector\nGitLabTokenDetector\nBase64HighEntropyString\nHexHighEntropyString\nIbmCloudIamDetector\nIbmCosHmacDetector\nIPPublicDetector\nJwtTokenDetector\nKeywordDetector\nMailchimpDetector\nNpmDetector\nOpenAIDetector\nPrivateKeyDetector\nPypiTokenDetector\nSendGridDetector\nSlackDetector\nSoftlayerDetector\nSquareOAuthDetector\nStripeDetector\nTelegramBotTokenDetector\nTwilioKeyDetector\n```\n\n### Disabling Plugins:\n\n```bash\n$ detect-secrets scan --disable-plugin KeywordDetector --disable-plugin AWSKeyDetector\n```\n\nIf you want to **only** run a specific plugin, you can do:\n\n```bash\n$ detect-secrets scan --list-all-plugins | \\\n    grep -v 'BasicAuthDetector' | \\\n    sed \"s#^#--disable-plugin #g\" | \\\n    xargs detect-secrets scan test_data\n```\n\n### Auditing a Baseline:\n\nThis is an optional step to label the results in your baseline. 
It can be used to narrow down your\nchecklist of secrets to migrate, or to better configure your plugins to improve its signal-to-noise\nratio.\n\n```bash\n$ detect-secrets audit .secrets.baseline\n```\n\n### Usage in Other Python Scripts\n\n**Basic Use:**\n\n```python\nfrom detect_secrets import SecretsCollection\nfrom detect_secrets.settings import default_settings\n\nsecrets = SecretsCollection()\nwith default_settings():\n    secrets.scan_file('test_data/config.ini')\n\n\nimport json\nprint(json.dumps(secrets.json(), indent=2))\n```\n\n**More Advanced Configuration:**\n\n```python\nfrom detect_secrets import SecretsCollection\nfrom detect_secrets.settings import transient_settings\n\nsecrets = SecretsCollection()\nwith transient_settings({\n    # Only run scans with only these plugins.\n    # This format is the same as the one that is saved in the generated baseline.\n    'plugins_used': [\n        # Example of configuring a built-in plugin\n        {\n            'name': 'Base64HighEntropyString',\n            'limit': 5.0,\n        },\n\n        # Example of using a custom plugin\n        {\n            'name': 'HippoDetector',\n            'path': 'file:///Users/aaronloo/Documents/github/detect-secrets/testing/plugins.py',\n        },\n    ],\n\n    # We can also specify whichever additional filters we want.\n    # This is an example of using the function `is_identified_by_ML_model` within the\n    # local file `./private-filters/example.py`.\n    'filters_used': [\n        {\n            'path': 'file://private-filters/example.py::is_identified_by_ML_model',\n        },\n    ]\n}) as settings:\n    # If we want to make any further adjustments to the created settings object (e.g.\n    # disabling default filters), we can do so as such.\n    settings.disable_filters(\n        'detect_secrets.filters.heuristic.is_prefixed_with_dollar_sign',\n        'detect_secrets.filters.heuristic.is_likely_id_string',\n    )\n\n    
secrets.scan_file('test_data/config.ini')\n```\n\n## Installation\n\n```bash\n$ pip install detect-secrets\n\u2728\ud83c\udf70\u2728\n```\n\nInstall via [brew](https://brew.sh/):\n\n```bash\n$ brew install detect-secrets\n```\n\n## Usage\n\n`detect-secrets` comes with three different tools, and there is often confusion around which one\nto use. Use this handy checklist to help you decide:\n\n1. Do you want to add secrets to your baseline? If so, use **`detect-secrets scan`**.\n2. Do you want to alert off new secrets not in the baseline? If so, use **`detect-secrets-hook`**.\n3. Are you analyzing the baseline itself? If so, use **`detect-secrets audit`**.\n\n### Adding Secrets to Baseline\n\n```\n$ detect-secrets scan --help\nusage: detect-secrets scan [-h] [--string [STRING]] [--only-allowlisted]\n                           [--all-files] [--baseline FILENAME]\n                           [--force-use-all-plugins] [--slim]\n                           [--list-all-plugins] [-p PLUGIN]\n                           [--base64-limit [BASE64_LIMIT]]\n                           [--hex-limit [HEX_LIMIT]]\n                           [--disable-plugin DISABLE_PLUGIN]\n                           [-n | --only-verified]\n                           [--exclude-lines EXCLUDE_LINES]\n                           [--exclude-files EXCLUDE_FILES]\n                           [--exclude-secrets EXCLUDE_SECRETS]\n                           [--word-list WORD_LIST_FILE] [-f FILTER]\n                           [--disable-filter DISABLE_FILTER]\n                           [path [path ...]]\n\nScans a repository for secrets in code. 
The generated output is compatible\nwith `detect-secrets-hook --baseline`.\n\npositional arguments:\n  path                  Scans the entire codebase and outputs a snapshot of\n                        currently identified secrets.\n\noptional arguments:\n  -h, --help            show this help message and exit\n  --string [STRING]     Scans an individual string, and displays configured\n                        plugins' verdict.\n  --only-allowlisted    Only scans the lines that are flagged with `allowlist\n                        secret`. This helps verify that individual exceptions\n                        are indeed non-secrets.\n\nscan options:\n  --all-files           Scan all files recursively (as compared to only\n                        scanning git tracked files).\n  --baseline FILENAME   If provided, will update existing baseline by\n                        importing settings from it.\n  --force-use-all-plugins\n                        If a baseline is provided, detect-secrets will default\n                        to loading the plugins specified by that baseline.\n                        However, this may also mean it doesn't perform the\n                        scan with the latest plugins. If this flag is\n                        provided, it will always use the latest plugins\n  --slim                Slim baselines are created with the intention of\n                        minimizing differences between commits. However, they\n                        are not compatible with the `audit` functionality, and\n                        slim baselines will need to be remade to be audited.\n\nplugin options:\n  Configure settings for each secret scanning ruleset. 
By default, all\n  plugins are enabled unless explicitly disabled.\n\n  --list-all-plugins    Lists all plugins that will be used for the scan.\n  -p PLUGIN, --plugin PLUGIN\n                        Specify path to custom secret detector plugin.\n  --base64-limit [BASE64_LIMIT]\n                        Sets the entropy limit for high entropy strings. Value\n                        must be between 0.0 and 8.0, defaults to 4.5.\n  --hex-limit [HEX_LIMIT]\n                        Sets the entropy limit for high entropy strings. Value\n                        must be between 0.0 and 8.0, defaults to 3.0.\n  --disable-plugin DISABLE_PLUGIN\n                        Plugin class names to disable. e.g.\n                        Base64HighEntropyString\n\nfilter options:\n  Configure settings for filtering out secrets after they are flagged by the\n  engine.\n\n  -n, --no-verify       Disables additional verification of secrets via\n                        network call.\n  --only-verified       Only flags secrets that can be verified.\n  --exclude-lines EXCLUDE_LINES\n                        If lines match this regex, it will be ignored.\n  --exclude-files EXCLUDE_FILES\n                        If filenames match this regex, it will be ignored.\n  --exclude-secrets EXCLUDE_SECRETS\n                        If secrets match this regex, it will be ignored.\n  --word-list WORD_LIST_FILE\n                        Text file with a list of words, if a secret contains a\n                        word in the list we ignore it.\n  -f FILTER, --filter FILTER\n                        Specify path to custom filter. May be a python module\n                        path (e.g.\n                        detect_secrets.filters.common.is_invalid_file) or a\n                        local file path (e.g.\n                        file://path/to/file.py::function_name).\n  --disable-filter DISABLE_FILTER\n                        Specify filter to disable. 
e.g.\n                        detect_secrets.filters.common.is_invalid_file\n```\n\n### Blocking Secrets not in Baseline\n\n```\n$ detect-secrets-hook --help\nusage: detect-secrets-hook [-h] [-v] [--version] [--baseline FILENAME]\n                           [--list-all-plugins] [-p PLUGIN]\n                           [--base64-limit [BASE64_LIMIT]]\n                           [--hex-limit [HEX_LIMIT]]\n                           [--disable-plugin DISABLE_PLUGIN]\n                           [-n | --only-verified]\n                           [--exclude-lines EXCLUDE_LINES]\n                           [--exclude-files EXCLUDE_FILES]\n                           [--exclude-secrets EXCLUDE_SECRETS]\n                           [--word-list WORD_LIST_FILE] [-f FILTER]\n                           [--disable-filter DISABLE_FILTER]\n                           [filenames [filenames ...]]\n\npositional arguments:\n  filenames             Filenames to check.\n\noptional arguments:\n  -h, --help            show this help message and exit\n  -v, --verbose         Verbose mode.\n  --version             Display version information.\n  --json                Print detect-secrets-hook output as JSON\n  --baseline FILENAME   Explicitly ignore secrets through a baseline generated\n                        by `detect-secrets scan`\n\nplugin options:\n  Configure settings for each secret scanning ruleset. By default, all\n  plugins are enabled unless explicitly disabled.\n\n  --list-all-plugins    Lists all plugins that will be used for the scan.\n  -p PLUGIN, --plugin PLUGIN\n                        Specify path to custom secret detector plugin.\n  --base64-limit [BASE64_LIMIT]\n                        Sets the entropy limit for high entropy strings. Value\n                        must be between 0.0 and 8.0, defaults to 4.5.\n  --hex-limit [HEX_LIMIT]\n                        Sets the entropy limit for high entropy strings. 
Value\n                        must be between 0.0 and 8.0, defaults to 3.0.\n  --disable-plugin DISABLE_PLUGIN\n                        Plugin class names to disable. e.g.\n                        Base64HighEntropyString\n\nfilter options:\n  Configure settings for filtering out secrets after they are flagged by the\n  engine.\n\n  -n, --no-verify       Disables additional verification of secrets via\n                        network call.\n  --only-verified       Only flags secrets that can be verified.\n  --exclude-lines EXCLUDE_LINES\n                        If lines match this regex, it will be ignored.\n  --exclude-files EXCLUDE_FILES\n                        If filenames match this regex, it will be ignored.\n  --exclude-secrets EXCLUDE_SECRETS\n                        If secrets match this regex, it will be ignored.\n  -f FILTER, --filter FILTER\n                        Specify path to custom filter. May be a python module\n                        path (e.g.\n                        detect_secrets.filters.common.is_invalid_file) or a\n                        local file path (e.g.\n                        file://path/to/file.py::function_name).\n  --disable-filter DISABLE_FILTER\n                        Specify filter to disable. e.g.\n                        detect_secrets.filters.common.is_invalid_file\n```\n\nWe recommend setting this up as a pre-commit hook. One way to do this is by using the\n[pre-commit](https://github.com/pre-commit/pre-commit) framework:\n\n```yaml\n# .pre-commit-config.yaml\nrepos:\n-   repo: https://github.com/Yelp/detect-secrets\n    rev: v1.5.0\n    hooks:\n    -   id: detect-secrets\n        args: ['--baseline', '.secrets.baseline']\n        exclude: package.lock.json\n```\n\n#### Inline Allowlisting\n\nThere are times when we want to exclude a false positive from blocking a commit, without creating\na baseline to do so. 
You can do so by adding a comment as such:\n\n```python\nsecret = \"hunter2\"      # pragma: allowlist secret\n```\n\nor\n\n```javascript\n//  pragma: allowlist nextline secret\nconst secret = \"hunter2\";\n```\n\n### Auditing Secrets in Baseline\n\n```bash\n$ detect-secrets audit --help\nusage: detect-secrets audit [-h] [--diff] [--stats]\n                      [--report] [--only-real | --only-false]\n                      [--json]\n                      filename [filename ...]\n\nAuditing a baseline allows analysts to label results, and optimize plugins for\nthe highest signal-to-noise ratio for their environment.\n\npositional arguments:\n  filename      Audit a given baseline file to distinguish the difference\n                between false and true positives.\n\noptional arguments:\n  -h, --help    show this help message and exit\n  --diff        Allows the comparison of two baseline files, in order to\n                effectively distinguish the difference between various plugin\n                configurations.\n  --stats       Displays the results of an interactive auditing session which\n                have been saved to a baseline file.\n  --report      Displays a report with the secrets detected\n\nreporting:\n  Display a summary with all the findings and the made decisions. To be used with the report mode (--report).\n\n  --only-real   Only includes real secrets in the report\n  --only-false  Only includes false positives in the report\n\nanalytics:\n  Quantify the success of your plugins based on the labelled results in your\n  baseline. 
To be used with the statistics mode (--stats).\n\n  --json        Outputs results in a machine-readable format.\n```\n\n## Configuration\n\nThis tool operates through a system of **plugins** and **filters**.\n\n- **Plugins** find secrets in code\n- **Filters** ignore false positives to increase scanning precision\n\nYou can adjust both to suit your precision/recall needs.\n\n### Plugins\n\nThere are three different strategies we employ to try and find secrets in code:\n\n1. Regex-based Rules\n\n   These are the most common type of plugin, and work well with well-structured secrets.\n   These secrets can optionally be [verified](docs/plugins.md#Verified-Secrets), which increases\n   scanning precision. However, solely depending on these may negatively affect the recall of your\n   scan.\n\n2. Entropy Detector\n\n   This searches for \"secret-looking\" strings through a variety of heuristic approaches. This\n   is great for non-structured secrets, but may require tuning to adjust the scanning precision.\n\n3. Keyword Detector\n\n   This ignores the secret value, and searches for variable names that are often associated with\n   assigning secrets with hard-coded values. This is great for \"non-secret-looking\" strings (e.g.\n   le3tc0de passwords), but may require tuning filters to adjust the scanning precision.\n\nWant to find a secret that we don't currently catch? You can also (easily) develop your own\nplugin, and use it with the engine! For more information, check out the\n[plugin documentation](docs/plugins.md#Using-Your-Own-Plugin).\n\n### Filters\n\n`detect-secrets` comes with several different in-built filters that may suit your needs.\n\n#### --exclude-lines\n\nSometimes, you want to be able to globally allow certain lines in your scan, if they match a\nspecific pattern. 
You can specify a regex rule as such:\n\n```bash\n$ detect-secrets scan --exclude-lines 'password = (blah|fake)'\n```\n\nOr you can specify multiple regex rules as such:\n\n```bash\n$ detect-secrets scan --exclude-lines 'password = blah' --exclude-lines 'password = fake'\n```\n\n#### --exclude-files\n\nSometimes, you want to be able to ignore certain files in your scan. You can specify a regex\npattern to do so, and if the filename matches this regex pattern, it will not be scanned:\n\n```bash\n$ detect-secrets scan --exclude-files '.*\\.signature$'\n```\n\nOr you can specify multiple regex patterns as such:\n\n```bash\n$ detect-secrets scan --exclude-files '.*\\.signature$' --exclude-files '.*/i18n/.*'\n```\n\n#### --exclude-secrets\n\nSometimes, you want to be able to ignore certain secret values in your scan. You can specify\na regex rule as such:\n\n```bash\n$ detect-secrets scan --exclude-secrets '(fakesecret|\\${.*})'\n```\n\nOr you can specify multiple regex rules as such:\n\n```bash\n$ detect-secrets scan --exclude-secrets 'fakesecret' --exclude-secrets '\\${.*}'\n```\n\n#### Inline Allowlisting\n\nSometimes, you want to apply an exclusion to a specific line, rather than globally excluding it.\nYou can do so with inline allowlisting as such:\n\n```python\nAPI_KEY = 'this-will-ordinarily-be-detected-by-a-plugin'    # pragma: allowlist secret\n```\n\nThese comments are supported in multiple languages. e.g.\n\n```javascript\nconst GoogleCredentialPassword = \"something-secret-here\";     //  pragma: allowlist secret\n```\n\nYou can also use:\n\n```python\n# pragma: allowlist nextline secret\nAPI_KEY = 'WillAlsoBeIgnored'\n```\n\nThis may be a convenient way for you to ignore secrets, without needing to regenerate the entire\nbaseline again. If you need to explicitly search for these allowlisted secrets, you can also do:\n\n```bash\n$ detect-secrets scan --only-allowlisted\n```\n\nWant to write more custom logic to filter out false positives? 
Check out how to do this in\nour [filters documentation](docs/filters.md#Using-Your-Own-Filters).\n\n## Extensions\n\n### wordlist\n\nThe `--exclude-secrets` flag allows you to specify regex rules to exclude secret values. However,\nif you want to specify a large list of words instead, you can use the `--word-list` flag.\n\nTo use this feature, be sure to install the `pyahocorasick` package, or simply use:\n\n```bash\n$ pip install detect-secrets[word_list]\n```\n\nThen, you can use it as such:\n\n```bash\n$ cat wordlist.txt\nnot-a-real-secret\n$ cat sample.ini\npassword = not-a-real-secret\n\n# Will show results\n$ detect-secrets scan sample.ini\n\n# No results found\n$ detect-secrets scan --word-list wordlist.txt\n```\n\n### Gibberish Detector\n\nThe Gibberish Detector is a simple ML model that attempts to determine whether a secret value\nis actually gibberish, with the assumption that **real** secret values are not word-like.\n\nTo use this feature, be sure to install the `gibberish-detector` package, or use:\n\n```bash\n$ pip install detect-secrets[gibberish]\n```\n\nCheck out the [gibberish-detector](https://github.com/domanchi/gibberish-detector) package for\nmore information on how to train the model. A pre-trained model (seeded by processing RFCs) is\nincluded for easy use.\n\nYou can also specify your own model as such:\n\n```bash\n$ detect-secrets scan --gibberish-model custom.model\n```\n\nThis is not enabled by default, given that it will ignore secrets such as `password`.\n\n## Caveats\n\nThis is not meant to be a sure-fire solution to prevent secrets from entering the codebase. Only\nproper developer education can truly do that. This pre-commit hook merely implements several\nheuristics to try and prevent obvious cases of committing secrets.\n\n**Things That Won't Be Prevented:**\n\n- Multi-line secrets\n- Default passwords that don't trigger the `KeywordDetector` (e.g. 
`login = \"hunter2\"`)\n\n## FAQ\n\n### General\n\n- **\"Did not detect git repository.\" warning encountered, even though I'm in a git repo.**\n\n  Check to see whether your `git` version is >= 1.8.5. If not, please upgrade it then try again.\n  [More details here](https://github.com/Yelp/detect-secrets/issues/220).\n\n### Windows\n\n- **`detect-secrets audit` displays \"Not a valid baseline file!\" after creating baseline.**\n\n  Ensure the file encoding of your baseline file is UTF-8.\n  [More details here](https://github.com/Yelp/detect-secrets/issues/272#issuecomment-619187136).\n\n\n",
    "bugtrack_url": null,
    "license": null,
    "summary": "Tool for detecting secrets in the codebase",
    "version": "1.5.0",
    "project_urls": {
        "Download": "https://github.com/Yelp/detect-secrets/archive/1.5.0.tar.gz",
        "Homepage": "https://github.com/Yelp/detect-secrets"
    },
    "split_keywords": [
        "secret-management",
        " pre-commit",
        " security",
        " entropy-checks"
    ],
    "urls": [
        {
            "comment_text": "",
            "digests": {
                "blake2b_256": "4e5e4f5fe4b89fde1dc3ed0eb51bd4ce4c0bca406246673d370ea2ad0c58d747",
                "md5": "803b3510c603c29fec9ed333346e0e0c",
                "sha256": "e24e7b9b5a35048c313e983f76c4bd09dad89f045ff059e354f9943bf45aa060"
            },
            "downloads": -1,
            "filename": "detect_secrets-1.5.0-py3-none-any.whl",
            "has_sig": false,
            "md5_digest": "803b3510c603c29fec9ed333346e0e0c",
            "packagetype": "bdist_wheel",
            "python_version": "py3",
            "requires_python": null,
            "size": 120341,
            "upload_time": "2024-05-06T17:46:16",
            "upload_time_iso_8601": "2024-05-06T17:46:16.628206Z",
            "url": "https://files.pythonhosted.org/packages/4e/5e/4f5fe4b89fde1dc3ed0eb51bd4ce4c0bca406246673d370ea2ad0c58d747/detect_secrets-1.5.0-py3-none-any.whl",
            "yanked": false,
            "yanked_reason": null
        },
        {
            "comment_text": "",
            "digests": {
                "blake2b_256": "6967382a863fff94eae5a0cf05542179169a1c49a4c8784a9480621e2066ca7d",
                "md5": "c21d09fc490c1316b4aa4d066a108b24",
                "sha256": "6bb46dcc553c10df51475641bb30fd69d25645cc12339e46c824c1e0c388898a"
            },
            "downloads": -1,
            "filename": "detect_secrets-1.5.0.tar.gz",
            "has_sig": false,
            "md5_digest": "c21d09fc490c1316b4aa4d066a108b24",
            "packagetype": "sdist",
            "python_version": "source",
            "requires_python": null,
            "size": 97351,
            "upload_time": "2024-05-06T17:46:19",
            "upload_time_iso_8601": "2024-05-06T17:46:19.721804Z",
            "url": "https://files.pythonhosted.org/packages/69/67/382a863fff94eae5a0cf05542179169a1c49a4c8784a9480621e2066ca7d/detect_secrets-1.5.0.tar.gz",
            "yanked": false,
            "yanked_reason": null
        }
    ],
    "upload_time": "2024-05-06 17:46:19",
    "github": true,
    "gitlab": false,
    "bitbucket": false,
    "codeberg": false,
    "github_user": "Yelp",
    "github_project": "detect-secrets",
    "travis_ci": false,
    "coveralls": true,
    "github_actions": true,
    "tox": true,
    "lcname": "detect-secrets"
}

Yelp, Inc.