# SARIF Tools

- Version: 2.0.0 (uploaded 2023-11-07 13:17:03)
- Author: Microsoft
- Home page: <https://github.com/microsoft/sarif-tools>
- Requires Python: >=3.8,<4.0

A set of command-line tools and a Python library for working with SARIF files.

Read more about the SARIF format here:
[sarifweb.azurewebsites.net](https://sarifweb.azurewebsites.net/).

## Installation

### Prerequisites

You need Python 3.8 or later installed.  Get it from [python.org](https://www.python.org/downloads/).
This document assumes that the `python` command runs that version.

### Installing on Windows

Open a user command prompt and type:

```cmd
pip install sarif-tools
```

Check for a warning such as the following:

```log
WARNING: The script sarif.exe is installed in 'C:\tools\Python38\Scripts' which is not on PATH.
```

Go into Windows Settings and search for "env" (Edit environment variables for your account) and
add the missing path to your PATH variable.  You'll need to open a new terminal or reboot, and
then you can type `sarif --version` at the command prompt.

To install system-wide for all users, use an Administrator command prompt instead, if you are
comfortable with the security risks.

### Installing on Linux or Mac

```bash
pip install sarif-tools
```

Check for a warning such as the following:

```log
WARNING: The script sarif is installed in '/home/XYZ/.local/bin' which is not on PATH.
```

Add the missing path to your PATH.  How to do that varies by Linux flavour, but editing `~/.profile`
is often a good approach.  Then after opening a new terminal or running `source ~/.profile`, you
should be able to type `sarif --version` at the command prompt.

To install system-wide, use `sudo pip install`.  Be aware that this is discouraged from a
security perspective.

### Testing the installation

After installing using `pip`, you should then be able to run:

```bash
sarif --version
```

### Troubleshooting installation

This section has suggestions in case the `sarif` command is not available after installation.

A launcher called `sarif` or `sarif.exe` is created in Python's `Scripts` directory.  The `Scripts`
directory needs to be in the `PATH` environment variable for you to be able to type `sarif` at the
command prompt; this is most likely the case if `pip` is run as a super-user when installing (e.g.
Administrator Command Prompt on Windows, or using `sudo` on Linux).
If the SARIF tools are installed for the current user only, adding the user's Scripts directory to
the current user's PATH variable is the best approach.  Search online for how to do that on your
system.

If the `Scripts` directory is not in the `PATH`, then you can type `python -m sarif` instead of
`sarif` to run the tool.

Confusion can arise when the `python` and `pip` commands on the `PATH` are from different
installations, or the `python` installation on the super-user's `PATH` is different from the
`python` command on the normal user's path.  On Windows, you can use `where python` and `where pip`
in normal CMD and Admin CMD to see which installations are in use; on Linux, it's `which python` and
`which pip` with and without `sudo`.
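
The same comparison can be scripted.  The following is a minimal sketch using only the Python standard library; the helper name `describe_tools` is made up for this example and is not part of sarif-tools.  Run it once as a normal user and once as a super-user and compare the output.

```python
import shutil
import sys


def describe_tools(names=("python", "pip", "sarif")):
    """Map each command name to the first match on PATH (None if absent)."""
    return {name: shutil.which(name) for name in names}


if __name__ == "__main__":
    # The interpreter running this script; compare it with the `python` entry below.
    print("running interpreter:", sys.executable)
    for name, path in describe_tools().items():
        print(f"{name}: {path or 'not found on PATH'}")
```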

## Command Line Usage

```plain
usage: sarif [-h] [--version] [--debug] [--check {error,warning,note}] {blame,codeclimate,copy,csv,diff,emacs,html,info,ls,summary,trend,usage,word} ...

Process sets of SARIF files

positional arguments:
  {blame,codeclimate,copy,csv,diff,emacs,html,info,ls,summary,trend,usage,word}
                        command

optional arguments:
  -h, --help            show this help message and exit
  --version, -v         show program's version number and exit
  --debug               Print information useful for debugging
  --check {error,warning,note}, -x {error,warning,note}
                        Exit with error code if there are any issues of the specified level (or for diff, an increase in issues at that level).

commands:
blame        Enhance SARIF file with information from `git blame`
codeclimate  Write a JSON representation in Code Climate format of SARIF file(s) for viewing as a Code Quality report in GitLab UI
copy         Write a new SARIF file containing optionally-filtered data from other SARIF file(s)
csv          Write a CSV file listing the issues from the SARIF files(s) specified
diff         Find the difference between two [sets of] SARIF files
emacs        Write a representation of SARIF file(s) for viewing in emacs
html         Write an HTML representation of SARIF file(s) for viewing in a web browser
info         Print information about SARIF file(s) structure
ls           List all SARIF files in the directories specified
summary      Write a text summary with the counts of issues from the SARIF files(s) specified
trend        Write a CSV file with time series data from SARIF files with "yyyymmddThhmmssZ" timestamps in their filenames
usage        (Command optional) - print usage and exit
word         Produce MS Word .docx summaries of the SARIF files specified
Run `sarif <COMMAND> --help` for command-specific help.
```

### Commands

The commands are illustrated below assuming input files in the following locations:

- `C:\temp\sarif_files` = a directory of SARIF files with arbitrary filenames.
- `C:\temp\sarif_with_date` = a directory of SARIF files with filenames including timestamps e.g. `C:\temp\sarif_with_date\myapp_devskim_output_20211001T012000Z.sarif`.
- `C:\temp\old_sarif_files` = a directory of SARIF files with arbitrary filenames from an older build.
- `C:\code\my_source_repo` = checkout directory of source code files from which SARIF results were obtained.

#### blame

```plain
usage: sarif blame [-h] [--output PATH] [--code PATH] [file_or_dir [file_or_dir ...]]

Enhance SARIF file with information from `git blame`

positional arguments:
  file_or_dir           A SARIF file or a directory containing SARIF files

optional arguments:
  -h, --help            show this help message and exit
  --output PATH, -o PATH
                        Output file or directory
  --code PATH, -c PATH  Path to git repository; if not specified, the current working directory is used
```

Augment SARIF files with `git blame` information, and write the augmented files to a specified location.

```shell
sarif blame -o "C:\temp\sarif_files_with_blame_info" -c "C:\code\my_source_repo" "C:\temp\sarif_files"
```

If the current working directory is the git repository, the `-c` argument can be omitted.

Blame information is added to the property bag of each `result` object for which it was successfully obtained.  The keys and values used are as in the [git blame porcelain format](https://git-scm.com/docs/git-blame#_the_porcelain_format).  E.g.:

```json
{
  "ruleId": "SM00702",
  ...
  "properties": {
    "blame": {
      "author": "aperson",
      "author-mail": "<aperson@acompany.com>",
      "author-time": "1350899798",
      "author-tz": "+0000",
      "committer": "aperson",
      "committer-mail": "<aperson@acompany.com>",
      "committer-time": "1350899798",
      "committer-tz": "+0000",
      "summary": "blah blah commit comment blah",
      "boundary": true,
      "filename": "src/net/myproject/mypackage/MyClass.java"
    }
  }
}
```

Note that the bare `boundary` key is given the automatic value `true`.
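
As an illustration of consuming this data, here is a small sketch that reads the blame entry from a result shaped like the example above.  It works on a plain dict, as obtained from loading the SARIF JSON; the helper name `blame_author_and_time` is hypothetical, and the sketch assumes `author-time` holds seconds since the Unix epoch, as in git's porcelain format.

```python
from datetime import datetime, timezone

# A result shaped like the example above (fields trimmed for brevity).
result = {
    "ruleId": "SM00702",
    "properties": {
        "blame": {
            "author": "aperson",
            "author-mail": "<aperson@acompany.com>",
            "author-time": "1350899798",
            "author-tz": "+0000",
            "boundary": True,
        }
    },
}


def blame_author_and_time(result):
    """Return (author, UTC datetime), or None when the result has no blame info."""
    blame = result.get("properties", {}).get("blame")
    if not blame:
        return None
    # `author-time` is stored as a string, so convert it before use.
    when = datetime.fromtimestamp(int(blame["author-time"]), tz=timezone.utc)
    return blame["author"], when


print(blame_author_and_time(result))
```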

#### codeclimate

```plain
usage: sarif codeclimate [-h] [--output PATH] [--filter FILE] [--autotrim] [--trim PREFIX] [file_or_dir ...]

Write a JSON representation in Code Climate format of SARIF file(s) for viewing as a Code Quality report in GitLab UI

positional arguments:
  file_or_dir           A SARIF file or a directory containing SARIF files

optional arguments:
  -h, --help            show this help message and exit
  --output PATH, -o PATH
                        Output file or directory
  --filter FILE, -b FILE
                        Specify the filter file to apply. See README for format.
  --autotrim, -a        Strip off the common prefix of paths in the CSV output
  --trim PREFIX         Prefix to strip from issue paths, e.g. the checkout directory on the build agent
```

Write out a JSON file of Code Climate tool format from [a set of] SARIF files.
This can then be published as a Code Quality report artefact in a GitLab pipeline and shown in GitLab UI for merge requests.

The JSON output can also be filtered using the blame information; see
[Filtering](#filtering) below for how to use the `--filter` option.

#### copy

```plain
usage: sarif copy [-h] [--output FILE] [--filter FILE] [--timestamp] [file_or_dir [file_or_dir ...]]

Write a new SARIF file containing optionally-filtered data from other SARIF file(s)

positional arguments:
  file_or_dir           A SARIF file or a directory containing SARIF files

optional arguments:
  -h, --help            show this help message and exit
  --output FILE, -o FILE
                        Output file
  --filter FILE, -b FILE
                        Specify the filter file to apply. See README for format.
  --timestamp, -t       Append current timestamp to output filename in the "yyyymmddThhmmssZ" format used by the `sarif trend` command
```

Write a new SARIF file containing optionally-filtered data from an existing SARIF file or multiple
SARIF files.  The resulting file contains each run from the original SARIF files back-to-back.
The results can be filtered (see [Filtering](#filtering) below), in which case only
those results from the original SARIF files that meet the filter are included; the output file
contains no information about the excluded records.  If a run in the original file was empty,
or all its results are filtered out, the empty run is still included.

If no output filename is provided, a file called `out.sarif` in the current directory is written.
If the output file already exists and is also in the input file list, it is not included in the
inputs, to avoid duplication of results.  The output file is overwritten without warning.

The `file_or_dir` specifier can include wildcards e.g. `c:\temp\**\devskim*.sarif` (i.e.
a "glob").  This works for all commands, but it is particularly useful for `copy`.

One use for this is to combine a set of SARIF files from multiple static analysis tools run during
a build process into a single file that can be more easily stored and processed as a build asset.
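
Conceptually, placing runs "back-to-back" means concatenating the `runs` arrays of the input files.  The sketch below shows that idea on plain dicts; it is not the tool's implementation, it ignores filtering, and the function name `combine_sarif_logs` and the two sample logs are invented for illustration.

```python
import json


def combine_sarif_logs(logs):
    """Concatenate the `runs` arrays of several SARIF log dicts, in order."""
    combined = {"version": "2.1.0", "runs": []}
    for log in logs:
        # An empty run is still a run, so it is carried over unchanged.
        combined["runs"].extend(log.get("runs", []))
    return combined


# Two minimal, made-up logs standing in for the output of two tools.
log_a = {"version": "2.1.0", "runs": [{"tool": {"driver": {"name": "tool-a"}}, "results": []}]}
log_b = {"version": "2.1.0", "runs": [{"tool": {"driver": {"name": "tool-b"}}, "results": []}]}

merged = combine_sarif_logs([log_a, log_b])
print(json.dumps([run["tool"]["driver"]["name"] for run in merged["runs"]]))
```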

#### csv

```plain
usage: sarif csv [-h] [--output PATH] [--filter FILE] [--autotrim] [--trim PREFIX] [file_or_dir [file_or_dir ...]]

Write a CSV file listing the issues from the SARIF files(s) specified

positional arguments:
  file_or_dir           A SARIF file or a directory containing SARIF files

optional arguments:
  -h, --help            show this help message and exit
  --output PATH, -o PATH
                        Output file or directory
  --filter FILE, -b FILE
                        Specify the filter file to apply. See README for format.
  --autotrim, -a        Strip off the common prefix of paths in the CSV output
  --trim PREFIX         Prefix to strip from issue paths, e.g. the checkout directory on the build agent
```

Write out a simple tabular list of issues from [a set of] SARIF files.  This can then be analysed, e.g. via Pivot Tables in Excel.

Use the `--trim` option to strip specific prefixes from the paths, to make the CSV less verbose.  Alternatively, use `--autotrim` to strip off the longest common prefix.
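
To make the two trimming modes concrete, here is a rough sketch of what stripping a common prefix and stripping an explicit prefix could look like.  It is an approximation written for this README, not the tool's code: both function names are hypothetical, and details such as cutting at a directory boundary and comparing prefixes case-sensitively are assumptions.

```python
import os.path


def strip_common_prefix(paths):
    """Remove the longest common leading directory part from all paths."""
    prefix = os.path.commonprefix(list(paths))
    # Cut back to the last separator so a directory name is never split.
    cut = max(prefix.rfind("/"), prefix.rfind("\\")) + 1
    return [path[cut:] for path in paths]


def strip_prefix(path, prefix):
    """Remove an explicit prefix if the path starts with it (exact match)."""
    if path.startswith(prefix):
        return path[len(prefix):].lstrip("/\\")
    return path


paths = ["C:\\code\\repo\\src\\a.py", "C:\\code\\repo\\src2\\b.py"]
print(strip_common_prefix(paths))
print(strip_prefix(paths[0], "C:\\code\\repo"))
```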

Generate a CSV summary of a single SARIF file with common file path prefix suppressed:

```shell
sarif csv --autotrim "C:\temp\sarif_files\devskim_myapp.sarif"
```

Generate a CSV summary of a directory of SARIF files with path prefix `C:\code\my_source_repo` suppressed:

```shell
sarif csv --trim "C:\code\my_source_repo" "C:\temp\sarif_files"
```

If the SARIF file(s) contain blame information (as added by the `blame` command), then the CSV
includes an "Author" column indicating who last modified the line in question.

The CSV output can also be filtered using the same blame information; see
[Filtering](#filtering) below for how to use the `--filter` option.

#### diff

```plain
usage: sarif diff [-h] [--output FILE] [--filter FILE] old_file_or_dir new_file_or_dir

Find the difference between two [sets of] SARIF files

positional arguments:
  old_file_or_dir       An old SARIF file or a directory containing the old SARIF files
  new_file_or_dir       A new SARIF file or a directory containing the new SARIF files

optional arguments:
  -h, --help            show this help message and exit
  --output FILE, -o FILE
                        Output file
  --filter FILE, -b FILE
                        Specify the filter file to apply. See README for format.
```

Print the difference between two [sets of] SARIF files.

Difference between the issues in two SARIF files:

```shell
sarif diff "C:\temp\old_sarif_files\devskim_myapp.sarif" "C:\temp\sarif_files\devskim_myapp.sarif"
```

Difference between the issues in two directories of SARIF files:

```shell
sarif diff "C:\temp\old_sarif_files" "C:\temp\sarif_files"
```

Write output to JSON file instead of printing to stdout:

```shell
sarif diff -o mydiff.json "C:\temp\old_sarif_files\devskim_myapp.sarif" "C:\temp\sarif_files\devskim_myapp.sarif"
```

The JSON format is like this:

```json

{
    "all": {
        "+": 5,
        "-": 11
    },
    "error": {
        "+": 2,
        "-": 0,
        "codes": {
            "XYZ1234 Some Issue": {
                "<": 0,
                ">": 2,
                "+@": [
                    {
                        "Location": "C:\\code\\file1.py",
                        "Line": 119
                    },
                    {
                        "Location": "C:\\code\\file2.py",
                        "Line": 61
                    }
                ]
            }
        }
    },
    "warning": {
        "+": 3,
        "-": 11,
        "codes": {...}
    },
    "note": {
        "+": 3,
        "-": 11,
        "codes": {...}
    }
}
```

Where:

- "+" indicates new issue types at this severity, "error", "warning" or "note"
- "-" indicates resolved issue types at this severity (no occurrences remaining)
- "codes" lists each issue code where the number of occurrences has changed:
  - occurrences before indicated by "<"
  - occurrences after indicated by ">"
  - new locations indicated by "+@"

If the set of issue codes at a given severity has changed, diff will report this even if the total
number of issue types at that severity is unchanged.

When the number of occurrences of an issue code is unchanged, diff will not report this issue code,
although it is possible that an equal number of new occurrences of the specific issue have arisen as
have been resolved.  This is to avoid reporting line number changes.

The `diff` operation shows the location of new occurrences of each issue.  When writing to an
output JSON file, all new locations are written, but when writing output to the console, a maximum
of three locations are shown.  Note that there can be some false positives here, if line numbers
have changed.
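
A script can pick out the issue codes that got worse from a diff document in this format.  The following sketch operates on a dict shaped like the JSON above (in practice it would come from `json.load` on the output file); the function name `increased_codes` is invented for this example.

```python
# A trimmed-down diff document in the shape described above.
diff = {
    "all": {"+": 5, "-": 11},
    "error": {
        "+": 2,
        "-": 0,
        "codes": {
            "XYZ1234 Some Issue": {
                "<": 0,
                ">": 2,
                "+@": [
                    {"Location": "C:\\code\\file1.py", "Line": 119},
                    {"Location": "C:\\code\\file2.py", "Line": 61},
                ],
            }
        },
    },
}


def increased_codes(diff, severity):
    """Yield (code, before, after, new_locations) for codes with more occurrences."""
    codes = diff.get(severity, {}).get("codes", {})
    for code, detail in codes.items():
        before, after = detail.get("<", 0), detail.get(">", 0)
        if after > before:
            yield code, before, after, detail.get("+@", [])


for code, before, after, locations in increased_codes(diff, "error"):
    print(f"{code}: {before} -> {after}, {len(locations)} new location(s)")
```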

See [Filtering](#filtering) below for how to use the `--filter` option.

#### emacs

```plain
usage: sarif emacs [-h] [--output PATH] [--filter FILE] [--no-autotrim] [--image IMAGE] [--trim PREFIX] [file_or_dir [file_or_dir ...]]

Write a representation of SARIF file(s) for viewing in emacs

positional arguments:
  file_or_dir           A SARIF file or a directory containing SARIF files

optional arguments:
  -h, --help            show this help message and exit
  --output PATH, -o PATH
                        Output file or directory
  --filter FILE, -b FILE
                        Specify the filter file to apply. See README for format.
  --no-autotrim, -n     Do not strip off the common prefix of paths in the output document
  --image IMAGE         Image to include at top of file - SARIF logo by default
  --trim PREFIX         Prefix to strip from issue paths, e.g. the checkout directory on the build agent
```

#### html

```plain
usage: sarif html [-h] [--output PATH] [--filter FILE] [--no-autotrim] [--image IMAGE] [--trim PREFIX] [file_or_dir [file_or_dir ...]]

Write an HTML representation of SARIF file(s) for viewing in a web browser

positional arguments:
  file_or_dir           A SARIF file or a directory containing SARIF files

optional arguments:
  -h, --help            show this help message and exit
  --output PATH, -o PATH
                        Output file or directory
  --filter FILE, -b FILE
                        Specify the filter file to apply. See README for format.
  --no-autotrim, -n     Do not strip off the common prefix of paths in the output document
  --image IMAGE         Image to include at top of file - SARIF logo by default
  --trim PREFIX         Prefix to strip from issue paths, e.g. the checkout directory on the build agent
```

Create an HTML file summarising SARIF results.

```shell
sarif html -o summary.html "C:\temp\sarif_files"
```

Use the `--trim` option to strip specific prefixes from the paths, to make the generated HTML page less verbose.  The longest common prefix of the paths will be trimmed unless `--no-autotrim` is specified.

Use the `--image` option to provide a header image for the top of the HTML page.  The image is embedded into the HTML, so the HTML document remains a portable standalone file.

See [Filtering](#filtering) below for how to use the `--filter` option.

#### info

```plain
usage: sarif info [-h] [--output FILE] [file_or_dir [file_or_dir ...]]

Print information about SARIF file(s) structure

positional arguments:
  file_or_dir           A SARIF file or a directory containing SARIF files

optional arguments:
  -h, --help            show this help message and exit
  --output FILE, -o FILE
                        Output file
```

Print information about the structure of a SARIF file or multiple files.  This is about the JSON
structure rather than any meaning of the results produced by the tool.  The summary includes the
full path of the file, its size and modified date, the number of runs, and for each run, the
tool that generated the run, the number of results, and the entries in the results' property bags.

```plain
c:\temp\sarif_files\ios_devskim_output.sarif
  1256241 bytes (1.2 MiB)
  modified: 2021-10-13 21:50:01.251544, accessed: 2022-01-09 18:23:00.060573, ctime: 2021-10-13 20:49:00
  1 run
    Tool: devskim
    1323 results
    All results have properties: tags, DevSkimSeverity
```

#### ls

```plain
usage: sarif ls [-h] [--output FILE] [file_or_dir [file_or_dir ...]]

List all SARIF files in the directories specified

positional arguments:
  file_or_dir           A SARIF file or a directory containing SARIF files

optional arguments:
  -h, --help            show this help message and exit
  --output FILE, -o FILE
                        Output file
```

List SARIF files in one or more directories.

```shell
sarif ls "C:\temp\sarif_files" "C:\temp\sarif_with_date"
```

#### summary

```plain
usage: sarif summary [-h] [--output PATH] [--filter FILE] [file_or_dir [file_or_dir ...]]

Write a text summary with the counts of issues from the SARIF files(s) specified

positional arguments:
  file_or_dir           A SARIF file or a directory containing SARIF files

optional arguments:
  -h, --help            show this help message and exit
  --output PATH, -o PATH
                        Output file or directory
  --filter FILE, -b FILE
                        Specify the filter file to apply. See README for format.
```

Print a summary of the issues in one or more SARIF file(s), grouped by severity and then ordered by number of occurrences.

When directories are provided as input and output, a summary is written for each input file, along with another file containing the totals.

```shell
sarif summary -o summaries "C:\temp\sarif_files"
```

When no output directory or file is specified, the overall summary is printed to the standard output.

```shell
sarif summary "C:\temp\sarif_files\devskim_myapp.sarif"
```

See [Filtering](#filtering) below for how to use the `--filter` option.

#### trend

```plain
usage: sarif trend [-h] [--output FILE] [--filter FILE] [--dateformat {dmy,mdy,ymd}] [file_or_dir [file_or_dir ...]]

Write a CSV file with time series data from SARIF files with "yyyymmddThhmmssZ" timestamps in their filenames

positional arguments:
  file_or_dir           A SARIF file or a directory containing SARIF files

optional arguments:
  -h, --help            show this help message and exit
  --output FILE, -o FILE
                        Output file
  --filter FILE, -b FILE
                        Specify the filter file to apply. See README for format.
  --dateformat {dmy,mdy,ymd}, -f {dmy,mdy,ymd}
                        Date component order to use in output CSV. Default is `dmy`
```

Generate a CSV showing a timeline of issues from a set of SARIF files in a directory.  The SARIF file names must contain a
timestamp in the specific format `yyyymmddThhmmssZ` e.g. `20211012T110000Z`.

The CSV can be loaded in Microsoft Excel for graphing and trend analysis.

```shell
sarif trend -o timeline.csv "C:\temp\sarif_with_date" --dateformat dmy
```
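
When generating file names for `trend`, it can help to check that the timestamp parses as expected.  This is a standalone sketch using the standard library, not the tool's own parser; the name `filename_timestamp` and the choice to return a UTC `datetime` are assumptions made for the example.

```python
import re
from datetime import datetime, timezone

# Eight digits, "T", six digits, "Z": the yyyymmddThhmmssZ pattern.
TIMESTAMP_RE = re.compile(r"\d{8}T\d{6}Z")


def filename_timestamp(filename):
    """Return the timestamp embedded in a file name as a UTC datetime, or None."""
    match = TIMESTAMP_RE.search(filename)
    if match is None:
        return None
    parsed = datetime.strptime(match.group(0), "%Y%m%dT%H%M%SZ")
    return parsed.replace(tzinfo=timezone.utc)


print(filename_timestamp("myapp_devskim_output_20211001T012000Z.sarif"))
```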

See [Filtering](#filtering) below for how to use the `--filter` option.

#### upgrade-filter

```plain
usage: sarif upgrade-filter [-h] [--output PATH] [file [file ...]]

Upgrade a v1-style blame filter file to a v2-style filter YAML file

positional arguments:
  file                  A v1-style blame-filter file

optional arguments:
  -h, --help            show this help message and exit
  --output PATH, -o PATH
                        Output file or directory
```

#### usage

```plain
usage: sarif usage [-h] [--output FILE]

(Command optional) - print usage and exit

optional arguments:
  -h, --help            show this help message and exit
  --output FILE, -o FILE
                        Output file
```

Print usage and exit.

#### word

```plain
usage: sarif word [-h] [--output PATH] [--filter FILE] [--no-autotrim] [--image IMAGE] [--trim PREFIX] [file_or_dir [file_or_dir ...]]

Produce MS Word .docx summaries of the SARIF files specified

positional arguments:
  file_or_dir           A SARIF file or a directory containing SARIF files

optional arguments:
  -h, --help            show this help message and exit
  --output PATH, -o PATH
                        Output file or directory
  --filter FILE, -b FILE
                        Specify the filter file to apply. See README for format.
  --no-autotrim, -n     Do not strip off the common prefix of paths in the output document
  --image IMAGE         Image to include at top of file - SARIF logo by default
  --trim PREFIX         Prefix to strip from issue paths, e.g. the checkout directory on the build agent
```

Create Word documents representing a SARIF file or multiple SARIF files.

If directories are provided for the `-o` option and the input, then a Word document is produced for each individual SARIF file
and for the full set of SARIF files.  Otherwise, a single Word document is created.

Create a Word document for each SARIF file and one for all of them together, in the `reports` directory (created if non-existent):

```shell
sarif word -o reports "C:\temp\sarif_files"
```

Create a Word document for a single SARIF file:

```shell
sarif word -o "reports\devskim_myapp.docx" "C:\temp\sarif_files\devskim_myapp.sarif"
```

Use the `--trim` option to strip specific prefixes from the paths, to make the generated documents less verbose.  The longest common prefix of the paths will be trimmed unless `--no-autotrim` is specified.

Use the `--image` option to provide a header image for the top of the Word document.

See [Filtering](#filtering) below for how to use the `--filter` option.

## Filtering

The data in each `result` object can be used for filtering via the `--filter` option, which is available for several commands.  This option takes the path to a filter-list YAML file containing a list of patterns and substrings to match against data in a SARIF file.  The format of a filter-list file is as follows:

```yaml
# Lines beginning with # are interpreted as comments and ignored.
# Optional description for the filter.  If no description is specified, the filter file name is used.
description: Example filter from README.md

# Optional configuration section to override default values.
configuration:
  # This option controls whether to include results where a property to check is missing, default
  # value is true.
  default-include: false
  # This option only applies filter criteria if the line number is present and not equal to 1.
  # Some static analysis tools set the line number to 1 for whole file issues, but this does not
  # work with blame filtering, because who last changed line 1 is irrelevant.  Default value is
  # true.
  check-line-number: true

# Items in the `include` list are interpreted as inclusion filtering rules.
# Items are combined with the OR operator: the filtered results include objects matching any rule.
# Each item can be one rule or a list of rules; in the latter case, the rules in the list are
# combined with the AND operator, so all of them must match.
include:
  # The following line includes issues whose author-mail property contains "@microsoft.com" AND
  # found in Java files.
  # Values with special characters `\:;_()$%^@,` must be enclosed in quotes (single or double):
  - author-mail: "@microsoft.com"
    locations[*].physicalLocation.artifactLocation.uri: "*.java"
  # Instead of a substring, a regular expression can be used, enclosed in "/" characters.
  # Issues whose committer-mail property includes a string matching the regular expression are included.
  # Use ^ and $ to match the whole committer-mail property.
  - committer-mail:
      value: "/^<myname.*\\.com>$/"
      # Configuration options can be overridden for any rule.
      default-include: true
      check-line-number: true
# Lines under `exclude` are interpreted as exclusion filtering rules.
exclude:
  # The following line excludes issues whose location is in test Java files with names starting with
  #  the "Test" prefix.
  - location: "Test*.java"
  # The value for the property can be empty, in this case only existence of the property is checked.
  - suppression:
```

Here's an example of a filter file that includes issues on lines changed by an `@microsoft.com` email address or a `myname.SOMETHING.com` email address, but not if those email addresses end in `bot@microsoft.com` or contain a GUID.  It uses the same format as the example above, without the explanatory comments.

```yaml
description: Example filter from README.md
configuration:
  default-include: true
  check-line-number: true
include:
  - author-mail: "@microsoft.com"
  - author-mail: "/myname\\..*\\.com/"
exclude:
  - author-mail: bot@microsoft.com
  - author-mail: '/[0-9A-F]{8}[-][0-9A-F]{4}[-][0-9A-F]{4}[-][0-9A-F]{4}[-][0-9A-F]{12}\@microsoft.com/'
```

Field names must be specified in [JSONPath notation](https://goessner.net/articles/JsonPath/)
accessing data in the [SARIF `result` object](https://docs.oasis-open.org/sarif/sarif/v2.1.0/cs01/sarif-v2.1.0-cs01.html#_Toc16012594).

For commonly used properties, the following shortcuts are defined:

| Shortcut | Full JSONPath |
| -------- | -------- |
| author | properties.blame.author |
| author-mail | properties.blame.author-mail |
| committer | properties.blame.committer |
| committer-mail | properties.blame.committer-mail |
| location | locations[*].physicalLocation.artifactLocation.uri |
| rule | ruleId |
| suppression | suppressions[*].kind |

Because the `uri` property (e.g. in `locations[*].physicalLocation.artifactLocation.uri`) represents a file location, its filter values can use file name wildcard characters:
- `?` - a single occurrence of any character in a directory or file name
- `*` - zero or more occurrences of any character in a directory or file name
- `**` - zero or more occurrences across multiple directory levels

E.g.
- `tests/Test???.js`
- `src/js/*.js`
- `src/js/**/*.js`

All matching is case-insensitive, because email addresses are.  Whitespace at the start and end of lines is ignored, which also means that line-ending characters don't matter.  The filter file must be UTF-8 encoded (plain 7-bit ASCII is also valid, being a subset of UTF-8).

If there are no inclusion patterns, all issues are included except for those matching the exclusion patterns.  If there are inclusion patterns, only issues matching the inclusion patterns are included.  If an issue matches one or more inclusion patterns and also at least one exclusion pattern, it is excluded.
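
The inclusion and exclusion rules above can be modelled in a few lines.  This is a deliberately simplified sketch for a single property value: it handles substrings and `/regex/` patterns case-insensitively, but it leaves out `default-include`, `check-line-number`, AND-combined rule lists and file name wildcards.  The function names are invented and this is not the library's implementation.

```python
import re


def matches(pattern, value):
    """Substring match, or a regex search if the pattern is wrapped in '/'."""
    if len(pattern) > 2 and pattern.startswith("/") and pattern.endswith("/"):
        return re.search(pattern[1:-1], value, re.IGNORECASE) is not None
    return pattern.lower() in value.lower()


def is_included(value, include, exclude):
    """Apply include patterns (if any) first; any exclude match then wins."""
    if include and not any(matches(p, value) for p in include):
        return False
    return not any(matches(p, value) for p in exclude)


include = ["@microsoft.com", "/myname\\..*\\.com/"]
exclude = ["bot@microsoft.com"]
for mail in ["<alice@microsoft.com>", "<buildbot@microsoft.com>", "<other@example.org>"]:
    print(mail, is_included(mail, include, exclude))
```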

## Usage as a Python library

Although not its primary purpose, you can use sarif-tools from a Python script or module to
load and summarise SARIF results.

### Basic usage pattern

After installation, use `sarif.loader` to load a SARIF file or files, and then use the operations
on the returned `SarifFile` or `SarifFileSet` objects to explore the data.

```python
from sarif import loader

sarif_data = loader.load_sarif_file(path_to_sarif_file)
issue_count_by_severity = sarif_data.get_result_count_by_severity()
error_histogram = sarif_data.get_issue_code_histogram("error")
```

### Result access API

The three classes defined in the `sarif_file` module, `SarifFileSet`, `SarifFile` and `SarifRun`,
provide similar APIs, which allows SARIF results to be handled similarly at multiple levels of
aggregation.  This section briefly describes some of the key APIs at the three levels of
aggregation.

#### get_distinct_tool_names()

Returns a list of distinct tool names in a `SarifFile` or for all files in a `SarifFileSet`.
A `SarifRun` has a single tool name so the equivalent method is `get_tool_name()`.

#### get_results()

Return the list of SARIF results.  These are objects as defined in the
[SARIF standard section 3.27](https://docs.oasis-open.org/sarif/sarif/v2.1.0/os/sarif-v2.1.0-os.html#_Toc34317638).

#### get_records()

Return the list of SARIF results as simplified, flattened record dicts.  Each record has the
attributes defined in `sarif_file.RECORD_ATTRIBUTES`.

- `"Tool"` - the tool name for the run containing the result.
- `"Severity"` - the SARIF severity for the record.  One of `error`, `warning` (the default if the
  record doesn't specify) or `note`.
- `"Code"` - the issue code from the result.
- `"Description"` - the issue name from the result - corresponding to the Code.
- `"Location"` - the location of the issue, typically the file containing the issue.  Format varies
  by tool.
- `"Line"` - the line number in the file where the issue occurs.  Value is a string.  This defaults
  to `"1"` if the tool failed to identify the line.
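
Because records are flat dicts, they are easy to post-process.  The sketch below groups line numbers by location for one severity; the sample records are made up (they are not real tool output) and the helper name `lines_by_location` is hypothetical.  In real use the list would come from `get_records()`.

```python
from collections import defaultdict

# Made-up records using the keys listed above.
records = [
    {"Tool": "devskim", "Severity": "error", "Code": "XYZ1234",
     "Description": "Some Issue", "Location": "src/a.py", "Line": "61"},
    {"Tool": "devskim", "Severity": "error", "Code": "XYZ1234",
     "Description": "Some Issue", "Location": "src/a.py", "Line": "12"},
    {"Tool": "devskim", "Severity": "note", "Code": "XYZ9999",
     "Description": "Another Issue", "Location": "src/b.py", "Line": "1"},
]


def lines_by_location(records, severity):
    """Map each location to the sorted line numbers of issues at one severity."""
    grouped = defaultdict(list)
    for record in records:
        if record["Severity"] == severity:
            # "Line" is a string in the record, so convert it for numeric sorting.
            grouped[record["Location"]].append(int(record["Line"]))
    return {location: sorted(lines) for location, lines in grouped.items()}


print(lines_by_location(records, "error"))
```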

#### get_records_grouped_by_severity()

As per `get_records()`, but the result is a dict from SARIF severity level (`error`, `warning` and
`note`) to the list of records of that severity level.

#### get_result_count(), get_result_count_by_severity()

Get the total number of SARIF results.  `get_result_count_by_severity()` returns a dict from
SARIF severity level (`error`, `warning` and `note`) to the integer number of results of that
severity.

#### get_issue_code_histogram(severity)

For the given severity, get histogram in the form of a list of pairs.  The first item in each pair
is the issue code, the second item is the number of matching records, and the list is sorted in
decreasing order of frequency (the same as the `sarif summary` command output).
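
The shape of the returned histogram can be reproduced with `collections.Counter`.  This sketch is only meant to illustrate the "list of pairs, most frequent first" structure on made-up records; it is not the library's implementation and the function name is invented.

```python
from collections import Counter

# Made-up records; only the keys used here are included.
records = [
    {"Severity": "error", "Code": "XYZ1234"},
    {"Severity": "error", "Code": "ABC0001"},
    {"Severity": "error", "Code": "XYZ1234"},
    {"Severity": "warning", "Code": "XYZ9999"},
]


def issue_code_histogram(records, severity):
    """Return (code, count) pairs for one severity, most frequent first."""
    counts = Counter(r["Code"] for r in records if r["Severity"] == severity)
    return counts.most_common()


print(issue_code_histogram(records, "error"))
```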

#### Disaggregation and filename access

These fields and methods allow access to the underlying information about the SARIF files.

- `SarifFileSet.subdirs` - a list of `SarifFileSet` objects corresponding to the subdirectories of
  the directory from which the `SarifFileSet` was created.
- `SarifFileSet.files` - a list of `SarifFile` objects corresponding to the SARIF files contained
  in the directory from which the `SarifFileSet` was created.
- `SarifFile.get_abs_file_path()` - get the absolute path to the SARIF file.
- `SarifFile.get_file_name()` - get the name of the SARIF file.
- `SarifFile.get_file_name_without_extension()` - get the name of the SARIF file without its
  extension.  Useful for constructing derived filenames.
- `SarifFile.get_filename_timestamp()` - extract the timestamp from the filename of a SARIF file,
  and return it as a string.  The timestamp must be in the format specified in the `sarif trend`
  command.
- `SarifFile.runs` - a list of `SarifRun` objects contained in the SARIF file.  Most SARIF files
  only contain a single run, but it is possible to aggregate runs from multiple tools into a
  single SARIF file.
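
The fields above can be combined to visit every file in a directory tree.  The sketch below relies only on the documented `files` and `subdirs` attributes and on `get_file_name()`; it uses stand-in objects so that it runs without any SARIF files.  The helpers `iter_files` and `timestamp_from_name` are hypothetical names, and the timestamp parsing assumes the `yyyymmddThhmmssZ` filename format used by `sarif trend`.

```python
import re
from datetime import datetime, timezone
from types import SimpleNamespace


def iter_files(file_set):
    """Yield every file object in a file set, descending into subdirectories."""
    yield from file_set.files
    for subdir in file_set.subdirs:
        yield from iter_files(subdir)


def timestamp_from_name(file_name):
    """Parse a "yyyymmddThhmmssZ" timestamp out of a file name, or return None."""
    match = re.search(r"\d{8}T\d{6}Z", file_name)
    if match is None:
        return None
    parsed = datetime.strptime(match.group(0), "%Y%m%dT%H%M%SZ")
    return parsed.replace(tzinfo=timezone.utc)


def fake_file(name):
    """Stand-in for a SarifFile that exposes only get_file_name()."""
    return SimpleNamespace(get_file_name=lambda: name)


# Stand-in file sets; these are not real sarif-tools objects.
nested = SimpleNamespace(
    files=[fake_file("myapp_devskim_output_20211001T012000Z.sarif")], subdirs=[]
)
root = SimpleNamespace(files=[fake_file("other.sarif")], subdirs=[nested])

names = [f.get_file_name() for f in iter_files(root)]
stamps = [timestamp_from_name(name) for name in names]
```

Passing a real `SarifFileSet` as `root` should work the same way, provided it exposes the attributes described above.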

#### Path shortening API

Call `init_path_prefix_stripping(autotrim, path_prefixes)` on a `SarifFileSet`, `SarifFile` or `SarifRun` object to set up path prefix stripping.  Either remove the longest common prefix automatically (`autotrim=True`), or remove specific prefixes (`autotrim=False` and a list of strings in `path_prefixes`).
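
The two modes can be pictured with a small standard-library sketch.  This is a conceptual illustration only, assuming POSIX-style absolute paths; the function names are hypothetical, and the way sarif-tools actually handles Windows paths, URIs and partial directory names may differ.

```python
import posixpath


def strip_common_directory(paths):
    """Remove the longest directory prefix shared by all the given absolute paths."""
    if len(paths) < 2:
        return list(paths)
    prefix = posixpath.commonpath(paths)
    return [posixpath.relpath(path, prefix) for path in paths]


def strip_prefixes(path, prefixes):
    """Remove the first matching prefix from a path, or return it unchanged."""
    for prefix in prefixes:
        if path.startswith(prefix):
            return path[len(prefix):].lstrip("/")
    return path


# Made-up paths such as a build agent might report.
sample_paths = [
    "/build/agent/src/a.py",
    "/build/agent/src/lib/b.py",
    "/build/agent/tests/t.py",
]

auto_trimmed = strip_common_directory(sample_paths)
manually_trimmed = [strip_prefixes(p, ["/build/agent"]) for p in sample_paths]
```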

#### Filtering API

Call `init_general_filter(filter_description, include_filters, exclude_filters)` on a `SarifFileSet`, `SarifFile` or `SarifRun` object to set up filtering.  `filter_description` is a string and the other parameters are lists of inclusion and exclusion rules.  They correspond to the `description`, `include` and `exclude` entries of the filter file described in [Filtering](#filtering) above.

Call `get_filter_stats()` to retrieve the filter stats after reading the results or records from SARIF files.  It returns `None` if there is no filter; otherwise it returns a `sarif_file.FilterStats` object with integer fields `filtered_in_result_count` and `filtered_out_result_count`.  Call `to_string()` on the `FilterStats` object for a readable representation of these statistics, which also includes the filter file name or description (`filter_description` field).
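
The calls above might be combined as in the following sketch.  It uses only the methods named in this section, so any object that provides them can be passed in.  The helper name `filtered_records_with_stats` is hypothetical, and the rule shape (lists of dicts mirroring the YAML filter file entries) is an assumption rather than something this README specifies.

```python
def filtered_records_with_stats(sarif_data, description, include, exclude):
    """Apply a general filter, then return the records and a readable stats string."""
    sarif_data.init_general_filter(description, include, exclude)
    # Read the records first: the stats are retrieved after reading results or records.
    records = sarif_data.get_records()
    stats = sarif_data.get_filter_stats()
    return records, (None if stats is None else stats.to_string())


# Assumed rule shape: lists of dicts mirroring the YAML filter file entries.
include_rules = [{"author-mail": "@microsoft.com"}]
exclude_rules = [{"author-mail": "bot@microsoft.com"}]
```

A call might then look like `filtered_records_with_stats(sarif_data, "Example filter", include_rules, exclude_rules)`, where `sarif_data` comes from `loader.load_sarif_file`.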

## Suggested usage in CI pipelines

Using the `--check` option in combination with the `summary` command causes sarif-tools to exit
with a nonzero exit code if there are any issues of the specified level, or higher.  This can
be useful to fail a continuous integration (CI) pipeline when static analysis (SAST) violations are found.

The SARIF issue levels are `error`, `warning` and `note`.  These are all valid options for the
`--check` option.

E.g. to fail if there are any errors or warnings:

```dos
sarif --check warning summary c:\temp\sarif_files
```
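
A similar gate can be written in a Python build script.  The sketch below assumes a dict of counts shaped like the return value of `get_result_count_by_severity()`; the names `SEVERITY_ORDER` and `should_fail` are hypothetical, and the sample counts are made up.

```python
SEVERITY_ORDER = ["note", "warning", "error"]  # least to most severe


def should_fail(counts_by_severity, check_level):
    """Return True if any result is at check_level or a more severe level."""
    threshold = SEVERITY_ORDER.index(check_level)
    return any(
        counts_by_severity.get(level, 0) > 0
        for level in SEVERITY_ORDER[threshold:]
    )


# Made-up counts: no errors, some warnings and notes.
counts = {"error": 0, "warning": 3, "note": 7}
fail_on_warning = should_fail(counts, "warning")
fail_on_error = should_fail(counts, "error")
```

A build script could then call `sys.exit(1)` when `should_fail` returns `True`.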

The `diff` command can check for any increase in issues of the specified level or above, relative
to a previous or baseline build.

E.g. to fail if there are any new issue codes at error level:

```dos
sarif --check error diff c:\temp\old_sarif_files c:\temp\sarif_files
```

You can also use sarif-tools to filter and consolidate the output from multiple tools.  E.g.

```bash
# First run your static analysis tools, configured to write SARIF output.  How to do that depends
# on the tool.

# Now run the blame command to augment the output with blame information.
sarif blame -o with_blame/myapp_mytool_with_blame.sarif myapp_mytool.sarif

# Now combine all tools' output into a single file
sarif copy --timestamp -o artifacts/myapp_alltools_with_blame.sarif
```

Download the generated file `myapp_alltools_with_blame_TIMESTAMP.sarif`.  Later, you can
filter the results using the `--filter` argument, or generate a graph of code quality over time
using `sarif trend`.

## Credits

sarif-tools was originally developed during the Microsoft Global Hackathon 2021 by Simon Abykov, Nick Brabbs, Anthony Hayward, Sivaji Kondapalli, Matt Parkes and Kathryn Pentland.

Thank you to everyone who has contributed
[pull requests](https://github.com/microsoft/sarif-tools/pulls?q=reason%3Acompleted)
since the initial release!

            

Raw data

            {
    "_id": null,
    "home_page": "https://github.com/microsoft/sarif-tools",
    "name": "sarif-tools",
    "maintainer": "",
    "docs_url": null,
    "requires_python": ">=3.8,<4.0",
    "maintainer_email": "",
    "keywords": "",
    "author": "Microsoft",
    "author_email": "",
    "download_url": "https://files.pythonhosted.org/packages/1e/93/a8d9947f74546b5550ddf18f69ac5f6c0296ec9151e0e7e5f042552fb194/sarif_tools-2.0.0.tar.gz",
    "platform": null,
    "description": "# SARIF Tools\n\nA set of command line tools and Python library for working with SARIF files.\n\nRead more about the SARIF format here:\n[sarifweb.azurewebsites.net](https://sarifweb.azurewebsites.net/).\n\n## Installation\n\n### Prerequisites\n\nYou need Python 3.8 or later installed.  Get it from [python.org](https://www.python.org/downloads/).\nThis document assumes that the `python` command runs that version.\n\n### Installing on Windows\n\nOpen a user command prompt and type:\n\n```cmd\npip install sarif-tools\n```\n\nCheck for a warning such as the following:\n\n```log\nWARNING: The script sarif.exe is installed in 'C:\\tools\\Python38\\Scripts' which is not on PATH.\n```\n\nGo into Windows Settings and search for \"env\" (Edit environment variables for your account) and\nadd the missing path to your PATH variable.  You'll need to open a new terminal or reboot, and\nthen you can type `sarif --version` at the command prompt.\n\nTo install system-wide for all users, use an Administrator command prompt instead, if you are\ncomfortable with the security risks.\n\n### Installing on Linux or Mac\n\n```bash\npip install sarif-tools\n```\n\nCheck for a warning such as the following:\n\n```log\nWARNING: The script sarif is installed in '/home/XYZ/.local/bin' which is not on PATH.\n```\n\nAdd the missing path to your PATH.  How to do that varies by Linux flavour, but editing `~/.profile`\nis often a good approach.  Then after opening a new terminal or running `source ~/.profile`, you\nshould be able to type `sarif --version` at the command prompt.\n\nTo install system-wide, use `sudo pip install`.  
Be aware that this is discouraged from a\nsecurity perspective.\n\n### Testing the installation\n\nAfter installing using `pip`, you should then be able to run:\n\n```bash\nsarif --version\n```\n\n### Troubleshooting installation\n\nThis section has suggestions in case the `sarif` command is not available after installation.\n\nA launcher called `sarif` or `sarif.exe` is created in Python's `Scripts` directory.  The `Scripts`\ndirectory needs to be in the `PATH` environment variable for you to be able to type `sarif` at the\ncommand prompt; this is most likely the case if `pip` is run as a super-user when installing (e.g.\nAdministrator Command Prompt on Windows, or using `sudo` on Linux).\nIf the SARIF tools are installed for the current user only, adding the user's Scripts directory to\nthe current user's PATH variable is the best approach.  Search online for how to do that on your\nsystem.\n\nIf the `Scripts` directory is not in the `PATH`, then you can type `python -m sarif` instead of\n`sarif` to run the tool.\n\nConfusion can arise when the `python` and `pip` commands on the `PATH` are from different\ninstallations, or the `python` installation on the super-user's `PATH` is different from the\n`python` command on the normal user's path.  
On Windows, you can use `where python` and `where pip`\nin normal CMD and Admin CMD to see which installations are in use; on Linux, it's `which python` and\n`which pip` with and without `sudo`.\n\n## Command Line Usage\n\n```plain\nusage: sarif [-h] [--version] [--debug] [--check {error,warning,note}] {blame,codeclimate,copy,csv,diff,emacs,html,info,ls,summary,trend,usage,word} ...\n\nProcess sets of SARIF files\n\npositional arguments:\n  {blame,codeclimate,copy,csv,diff,emacs,html,info,ls,summary,trend,usage,word}\n                        command\n\noptional arguments:\n  -h, --help            show this help message and exit\n  --version, -v         show program's version number and exit\n  --debug               Print information useful for debugging\n  --check {error,warning,note}, -x {error,warning,note}\n                        Exit with error code if there are any issues of the specified level (or for diff, an increase in issues at that level).\n\ncommands:\nblame        Enhance SARIF file with information from `git blame`\ncodeclimate  Write a JSON representation in Code Climate format of SARIF file(s) for viewing as a Code Quality report in GitLab UI\ncopy         Write a new SARIF file containing optionally-filtered data from other SARIF file(s)\ncsv          Write a CSV file listing the issues from the SARIF files(s) specified\ndiff         Find the difference between two [sets of] SARIF files\nemacs        Write a representation of SARIF file(s) for viewing in emacs\nhtml         Write an HTML representation of SARIF file(s) for viewing in a web browser\ninfo         Print information about SARIF file(s) structure\nls           List all SARIF files in the directories specified\nsummary      Write a text summary with the counts of issues from the SARIF files(s) specified\ntrend        Write a CSV file with time series data from SARIF files with \"yyyymmddThhmmssZ\" timestamps in their filenames\nusage        (Command optional) - print usage and 
exit\nword         Produce MS Word .docx summaries of the SARIF files specified\nRun `sarif <COMMAND> --help` for command-specific help.\n```\n\n### Commands\n\nThe commands are illustrated below assuming input files in the following locations:\n\n- `C:\\temp\\sarif_files` = a directory of SARIF files with arbitrary filenames.\n- `C:\\temp\\sarif_with_date` = a directory of SARIF files with filenames including timestamps e.g. `C:\\temp\\sarif_with_date\\myapp_devskim_output_20211001T012000Z.sarif`.\n- `C:\\temp\\old_sarif_files` = a directory of SARIF files with arbitrary filenames from an older build.\n- `C:\\code\\my_source_repo` = checkout directory of source code files from which SARIF results were obtained.\n\n#### blame\n\n```plain\nusage: sarif blame [-h] [--output PATH] [--code PATH] [file_or_dir [file_or_dir ...]]\n\nEnhance SARIF file with information from `git blame`\n\npositional arguments:\n  file_or_dir           A SARIF file or a directory containing SARIF files\n\noptional arguments:\n  -h, --help            show this help message and exit\n  --output PATH, -o PATH\n                        Output file or directory\n  --code PATH, -c PATH  Path to git repository; if not specified, the current working directory is used\n```\n\nAugment SARIF files with `git blame` information, and write the augmented files to a specified location.\n\n```shell\nsarif blame -o \"C:\\temp\\sarif_files_with_blame_info\" -c \"C:\\code\\my_source_repo\" \"C:\\temp\\sarif_files\"\n```\n\nIf the current working directory is the git repository, the `-c` argument can be omitted.\n\nBlame information is added to the property bag of each `result` object for which it was successfully obtained.  The keys and values used are as in the [git blame porcelain format](https://git-scm.com/docs/git-blame#_the_porcelain_format).  
E.g.:\n\n```json\n{\n  \"ruleId\": \"SM00702\",\n  ...\n  \"properties\": {\n    \"blame\": {\n      \"author\": \"aperson\",\n      \"author-mail\": \"<aperson@acompany.com>\",\n      \"author-time\": \"1350899798\",\n      \"author-tz\": \"+0000\",\n      \"committer\": \"aperson\",\n      \"committer-mail\": \"<aperson@acompany.com>\",\n      \"committer-time\": \"1350899798\",\n      \"committer-tz\": \"+0000\",\n      \"summary\": \"blah blah commit comment blah\",\n      \"boundary\": true,\n      \"filename\": \"src/net/myproject/mypackage/MyClass.java\"\n    }\n  }\n}\n```\n\nNote that the bare `boundary` key is given the automatic value `true`.\n\n#### codeclimate\n\n```plain\nusage: sarif codeclimate [-h] [--output PATH] [--filter FILE] [--autotrim] [--trim PREFIX] [file_or_dir ...]\n\nWrite a JSON representation in Code Climate format of SARIF file(s) for viewing as a Code Quality report in GitLab UI\n\npositional arguments:\n  file_or_dir           A SARIF file or a directory containing SARIF files\n\noptional arguments:\n  -h, --help            show this help message and exit\n  --output PATH, -o PATH\n                        Output file or directory\n  --filter FILE, -b FILE\n                        Specify the filter file to apply. See README for format.\n  --autotrim, -a        Strip off the common prefix of paths in the CSV output\n  --trim PREFIX         Prefix to strip from issue paths, e.g. 
the checkout directory on the build agent\n```\n\nWrite out a JSON file of Code Climate tool format from [a set of] SARIF files.\nThis can then be published as a Code Quality report artefact in a GitLab pipeline and shown in GitLab UI for merge requests.\n\nThe JSON output can also be filtered using the blame information; see\n[Filtering](#filtering) below for how to use the `--filter` option.\n\n#### copy\n\n```plain\nusage: sarif copy [-h] [--output FILE] [--filter FILE] [--timestamp] [file_or_dir [file_or_dir ...]]\n\nWrite a new SARIF file containing optionally-filtered data from other SARIF file(s)\n\npositional arguments:\n  file_or_dir           A SARIF file or a directory containing SARIF files\n\noptional arguments:\n  -h, --help            show this help message and exit\n  --output FILE, -o FILE\n                        Output file\n  --filter FILE, -b FILE\n                        Specify the filter file to apply. See README for format.\n  --timestamp, -t       Append current timestamp to output filename in the \"yyyymmddThhmmssZ\" format used by the `sarif trend` command\n```\n\nWrite a new SARIF file containing optionally-filtered data from an existing SARIF file or multiple\nSARIF files.  The resulting file contains each run from the original SARIF files back-to-back.\nThe results can be filtered (see [Filtering](#filtering) below), in which case only\nthose results from the original SARIF files that meet the filter are included; the output file\ncontains no information about the excluded records.  If a run in the original file was empty,\nor all its results are filtered out, the empty run is still included.\n\nIf no output filename is provided, a file called `out.sarif` in the current directory is written.\nIf the output file already exists and is also in the input file list, it is not included in the\ninputs, to avoid duplication of results.  The output file is overwritten without warning.\n\nThe `file_or_dir` specifier can include wildcards e.g. 
`c:\\temp\\**\\devskim*.sarif` (i.e.\na \"glob\").  This works for all commands, but it is particularly useful for `copy`.\n\nOne use for this is to combine a set of SARIF files from multiple static analysis tools run during\na build process into a single file that can be more easily stored and processed as a build asset.\n\n#### csv\n\n```plain\nusage: sarif csv [-h] [--output PATH] [--filter FILE] [--autotrim] [--trim PREFIX] [file_or_dir [file_or_dir ...]]\n\nWrite a CSV file listing the issues from the SARIF files(s) specified\n\npositional arguments:\n  file_or_dir           A SARIF file or a directory containing SARIF files\n\noptional arguments:\n  -h, --help            show this help message and exit\n  --output PATH, -o PATH\n                        Output file or directory\n  --filter FILE, -b FILE\n                        Specify the filter file to apply. See README for format.\n  --autotrim, -a        Strip off the common prefix of paths in the CSV output\n  --trim PREFIX         Prefix to strip from issue paths, e.g. the checkout directory on the build agent\n```\n\nWrite out a simple tabular list of issues from [a set of] SARIF files.  This can then be analysed, e.g. via Pivot Tables in Excel.\n\nUse the `--trim` option to strip specific prefixes from the paths, to make the CSV less verbose.  
Alternatively, use `--autotrim` to strip off the longest common prefix.\n\nGenerate a CSV summary of a single SARIF file with common file path prefix suppressed:\n\n```shell\nsarif csv \"C:\\temp\\sarif_files\\devskim_myapp.sarif\"\n```\n\nGenerate a CSV summary of a directory of SARIF files with path prefix `C:\\code\\my_source_repo` suppressed:\n\n```shell\nsarif csv --trim c:\\code\\my_source_repo \"C:\\temp\\sarif_files\"\n```\n\nIf the SARIF file(s) contain blame information (as added by the `blame` command), then the CSV\nincludes an \"Author\" column indicating who last modified the line in question.\n\nThe CSV output can also be filtered using the same blame information; see\n[Filtering](#filtering) below for how to use the `--filter` option.\n\n#### diff\n\n```plain\nusage: sarif diff [-h] [--output FILE] [--filter FILE] old_file_or_dir new_file_or_dir\n\nFind the difference between two [sets of] SARIF files\n\npositional arguments:\n  old_file_or_dir       An old SARIF file or a directory containing the old SARIF files\n  new_file_or_dir       A new SARIF file or a directory containing the new SARIF files\n\noptional arguments:\n  -h, --help            show this help message and exit\n  --output FILE, -o FILE\n                        Output file\n  --filter FILE, -b FILE\n                        Specify the filter file to apply. 
See README for format.\n```\n\nPrint the difference between two [sets of] SARIF files.\n\nDifference between the issues in two SARIF files:\n\n```shell\nsarif diff \"C:\\temp\\old_sarif_files\\devskim_myapp.sarif\" \"C:\\temp\\sarif_files\\devskim_myapp.sarif\"\n```\n\nDifference between the issues in two directories of SARIF files:\n\n```shell\nsarif diff \"C:\\temp\\old_sarif_files\" \"C:\\temp\\sarif_files\"\n```\n\nWrite output to JSON file instead of printing to stdout:\n\n```shell\nsarif diff -o mydiff.json \"C:\\temp\\old_sarif_files\\devskim_myapp.sarif\" \"C:\\temp\\sarif_files\\devskim_myapp.sarif\"\n```\n\nThe JSON format is like this:\n\n```json\n\n{\n    \"all\": {\n        \"+\": 5,\n        \"-\": 11\n    },\n    \"error\": {\n        \"+\": 2,\n        \"-\": 0,\n        \"codes\": {\n            \"XYZ1234 Some Issue\": {\n                \"<\": 0,\n                \">\": 2,\n                \"+@\": [\n                    {\n                        \"Location\": \"C:\\\\code\\\\file1.py\",\n                        \"Line\": 119\n                    },\n                    {\n                        \"Location\": \"C:\\\\code\\\\file2.py\",\n                        \"Line\": 61\n                    }\n                ]\n            },\n        }\n    },\n    \"warning\": {\n        \"+\": 3,\n        \"-\": 11,\n        \"codes\": {...}\n    },\n    \"note\": {\n        \"+\": 3,\n        \"-\": 11,\n        \"codes\": {...}\n    }\n}\n```\n\nWhere:\n\n- \"+\" indicates new issue types at this severity, \"error\", \"warning\" or \"note\"\n- \"-\" indicates resolved issue types at this severity (no occurrences remaining)\n- \"codes\" lists each issue code where the number of occurrences has changed:\n  - occurrences before indicated by \"<\"\n  - occurrences after indicated by \">\"\n  - new locations indicated by \"+@\"\n\nIf the set of issue codes at a given severity has changed, diff will report this even if the total\nnumber of issue types at that 
severity is unchanged.\n\nWhen the number of occurrences of an issue code is unchanged, diff will not report this issue code,\nalthough it is possible that an equal number of new occurrences of the specific issue have arisen as\nhave been resolved.  This is to avoid reporting line number changes.\n\nThe `diff` operation shows the location of new occurrences of each issue.  When writing to an\noutput JSON file, all new locations are written, but when writing output to the console, a maximum\nof three locations are shown.  Note that there can be some false positives here, if line numbers\nhave changed.\n\nSee [Filtering](#filtering) below for how to use the `--filter` option.\n\n#### emacs\n\n```plain\nusage: sarif emacs [-h] [--output PATH] [--filter FILE] [--no-autotrim] [--image IMAGE] [--trim PREFIX] [file_or_dir [file_or_dir ...]]\n\nWrite a representation of SARIF file(s) for viewing in emacs\n\npositional arguments:\n  file_or_dir           A SARIF file or a directory containing SARIF files\n\noptional arguments:\n  -h, --help            show this help message and exit\n  --output PATH, -o PATH\n                        Output file or directory\n  --filter FILE, -b FILE\n                        Specify the filter file to apply. See README for format.\n  --no-autotrim, -n     Do not strip off the common prefix of paths in the output document\n  --image IMAGE         Image to include at top of file - SARIF logo by default\n  --trim PREFIX         Prefix to strip from issue paths, e.g. 
the checkout directory on the build agent\n```\n\n#### html\n\n```plain\nusage: sarif html [-h] [--output PATH] [--filter FILE] [--no-autotrim] [--image IMAGE] [--trim PREFIX] [file_or_dir [file_or_dir ...]]\n\nWrite an HTML representation of SARIF file(s) for viewing in a web browser\n\npositional arguments:\n  file_or_dir           A SARIF file or a directory containing SARIF files\n\noptional arguments:\n  -h, --help            show this help message and exit\n  --output PATH, -o PATH\n                        Output file or directory\n  --filter FILE, -b FILE\n                        Specify the filter file to apply. See README for format.\n  --no-autotrim, -n     Do not strip off the common prefix of paths in the output document\n  --image IMAGE         Image to include at top of file - SARIF logo by default\n  --trim PREFIX         Prefix to strip from issue paths, e.g. the checkout directory on the build agent\n```\n\nCreate an HTML file summarising SARIF results.\n\n```shell\nsarif html -o summary.html \"C:\\temp\\sarif_files\"\n```\n\nUse the `--trim` option to strip specific prefixes from the paths, to make the generated HTML page less verbose.  The longest common prefix of the paths will be trimmed unless `--no-autotrim` is specified.\n\nUse the `--image` option to provide a header image for the top of the HTML page.  The image is embedded into the HTML, so the HTML document remains a portable standalone file.\n\nSee [Filtering](#filtering) below for how to use the `--filter` option.\n\n#### info\n\n```plain\nusage: sarif info [-h] [--output FILE] [file_or_dir [file_or_dir ...]]\n\nPrint information about SARIF file(s) structure\n\npositional arguments:\n  file_or_dir           A SARIF file or a directory containing SARIF files\n\noptional arguments:\n  -h, --help            show this help message and exit\n  --output FILE, -o FILE\n                        Output file\n```\n\nPrint information about the structure of a SARIF file or multiple files.  
This is about the JSON\nstructure rather than any meaning of the results produced by the tool.  The summary includes the\nfull path of the file, its size and modified date, the number of runs, and for each run, the\ntool that generated the run, the number of results, and the entries in the results' property bags.\n\n```plain\nc:\\temp\\sarif_files\\ios_devskim_output.sarif\n  1256241 bytes (1.2 MiB)\n  modified: 2021-10-13 21:50:01.251544, accessed: 2022-01-09 18:23:00.060573, ctime: 2021-10-13 20:49:00\n  1 run\n    Tool: devskim\n    1323 results\n    All results have properties: tags, DevSkimSeverity\n```\n\n#### ls\n\n```plain\nusage: sarif ls [-h] [--output FILE] [file_or_dir [file_or_dir ...]]\n\nList all SARIF files in the directories specified\n\npositional arguments:\n  file_or_dir           A SARIF file or a directory containing SARIF files\n\noptional arguments:\n  -h, --help            show this help message and exit\n  --output FILE, -o FILE\n                        Output file\n```\n\nList SARIF files in one or more directories.\n\n```shell\nsarif ls \"C:\\temp\\sarif_files\" \"C:\\temp\\sarif_with_date\"\n```\n\n#### summary\n\n```plain\nusage: sarif summary [-h] [--output PATH] [--filter FILE] [file_or_dir [file_or_dir ...]]\n\nWrite a text summary with the counts of issues from the SARIF files(s) specified\n\npositional arguments:\n  file_or_dir           A SARIF file or a directory containing SARIF files\n\noptional arguments:\n  -h, --help            show this help message and exit\n  --output PATH, -o PATH\n                        Output file or directory\n  --filter FILE, -b FILE\n                        Specify the filter file to apply. 
See README for format.\n```\n\nPrint a summary of the issues in one or more SARIF file(s), grouped by severity and then ordered by number of occurrences.\n\nWhen directories are provided as input and output, a summary is written for each input file, along with another file containing the totals.\n\n```shell\nsarif summary -o summaries \"C:\\temp\\sarif_files\"\n```\n\nWhen no output directory or file is specified, the overall summary is printed to the standard output.\n\n```shell\nsarif summary \"C:\\temp\\sarif_files\\devskim_myapp.sarif\"\n```\n\nSee [Filtering](#filtering) below for how to use the `--filter` option.\n\n#### trend\n\n```plain\nusage: sarif trend [-h] [--output FILE] [--filter FILE] [--dateformat {dmy,mdy,ymd}] [file_or_dir [file_or_dir ...]]\n\nWrite a CSV file with time series data from SARIF files with \"yyyymmddThhmmssZ\" timestamps in their filenames\n\npositional arguments:\n  file_or_dir           A SARIF file or a directory containing SARIF files\n\noptional arguments:\n  -h, --help            show this help message and exit\n  --output FILE, -o FILE\n                        Output file\n  --filter FILE, -b FILE\n                        Specify the filter file to apply. See README for format.\n  --dateformat {dmy,mdy,ymd}, -f {dmy,mdy,ymd}\n                        Date component order to use in output CSV. Default is `dmy`\n```\n\nGenerate a CSV showing a timeline of issues from a set of SARIF files in a directory.  The SARIF file names must contain a\ntimestamp in the specific format `yyyymmddThhhmmss` e.g. 
`20211012T110000Z`.\n\nThe CSV can be loaded in Microsoft Excel for graphing and trend analysis.\n\n```shell\nsarif trend -o timeline.csv \"C:\\temp\\sarif_with_date\" --dateformat dmy\n```\n\nSee [Filtering](#filtering) below for how to use the `--filter` option.\n\n#### upgrade-filter\n\n```plain\nusage: sarif upgrade-filter [-h] [--output PATH] [file [file ...]]\n\nUpgrade a v1-style blame filter file to a v2-style filter YAML file\n\npositional arguments:\n  file                  A v1-style blame-filter file\n\noptional arguments:\n  -h, --help            show this help message and exit\n  --output PATH, -o PATH\n                        Output file or directory\n```\n\n#### usage\n\n```plain\nusage: sarif usage [-h] [--output FILE]\n\n(Command optional) - print usage and exit\n\noptional arguments:\n  -h, --help            show this help message and exit\n  --output FILE, -o FILE\n                        Output file\n```\n\nPrint usage and exit.\n\n#### word\n\n```plain\nusage: sarif word [-h] [--output PATH] [--filter FILE] [--no-autotrim] [--image IMAGE] [--trim PREFIX] [file_or_dir [file_or_dir ...]]\n\nProduce MS Word .docx summaries of the SARIF files specified\n\npositional arguments:\n  file_or_dir           A SARIF file or a directory containing SARIF files\n\noptional arguments:\n  -h, --help            show this help message and exit\n  --output PATH, -o PATH\n                        Output file or directory\n  --filter FILE, -b FILE\n                        Specify the filter file to apply. See README for format.\n  --no-autotrim, -n     Do not strip off the common prefix of paths in the output document\n  --image IMAGE         Image to include at top of file - SARIF logo by default\n  --trim PREFIX         Prefix to strip from issue paths, e.g. 
the checkout directory on the build agent\n```\n\nCreate Word documents representing a SARIF file or multiple SARIF files.\n\nIf directories are provided for the `-o` option and the input, then a Word document is produced for each individual SARIF file\nand for the full set of SARIF files.  Otherwise, a single Word document is created.\n\nCreate a Word document for each SARIF file and one for all of them together, in the `reports` directory (created if non-existent):\n\n```shell\nsarif word -o reports \"C:\\temp\\sarif_files\"\n```\n\nCreate a Word document for a single SARIF file:\n\n```shell\nsarif word -o \"reports\\devskim_myapp.docx\" \"C:\\temp\\sarif_files\\devskim_myapp.sarif\"\n```\n\nUse the `--trim` option to strip specific prefixes from the paths, to make the generated documents less verbose.  The longest common prefix of the paths will be trimmed unless `--no-autotrim` is specified.\n\nUse the `--image` option to provide a header image for the top of the Word document.\n\nSee [Filtering](#filtering) below for how to use the `--filter` option.\n\n## Filtering\n\nThe data in each `result` object can then be used for filtering via the `--filter` option available for various commands.  This option requires a path to a filter-list YAML file, containing a list of patterns and substrings to match against data in a SARIF file.  The format of a filter-list file is as follows:\n\n```yaml\n# Lines beginning with # are interpreted as comments and ignored.\n# Optional description for the filter.  
If no title is specified, the filter file name is used.\ndescription: Example filter from README.md\n\n# Optional configuration section to override default values.\nconfiguration:\n  # This option controls whether to include results where a property to check is missing, default\n  # value is true.\n  default-include: false\n  # This option only applies filter criteria if the line number is present and not equal to 1.\n  # Some static analysis tools set the line number to 1 for whole file issues, but this does not\n  # work with blame filtering, because who last changed line 1 is irrelevant.  Default value is\n  # true.\n  check-line-number: true\n\n# Items in `include` list are interpreted as inclusion filtering rules.\n# Items are treated with OR operator, the filtered results includes objects matching any rule.\n# Each item can be one rule or a list of rules, in the latter case rules in the list are treated\n# with AND operator - all rules must match.\ninclude:\n  # The following line includes issues whose author-mail property contains \"@microsoft.com\" AND\n  # found in Java files.\n  # Values with special characters `\\:;_()$%^@,` must be enclosed in quotes (single or double):\n  - author-mail: \"@microsoft.com\"\n    locations[*].physicalLocation.artifactLocation.uri: \"*.java\"\n  # Instead of a substring, a regular expression can be used, enclosed in \"/\" characters.\n  # Issues whose committer-mail property includes a string matching the regular expression are included.\n  # Use ^ and $ to match the whole committer-mail property.\n  - committer-mail:\n      value: \"/^<myname.*\\\\.com>$/\"\n      # Configuration options can be overridden for any rule.\n      default-include: true\n      check-line-number: true\n# Lines under `exclude` are interpreted as exclusion filtering rules.\nexclude:\n  # The following line excludes issues whose location is in test Java files with names starting with\n  #  the \"Test\" prefix.\n  - location: \"Test*.java\"\n  # The 
value for the property can be empty, in this case only existence of the property is checked.\n  - suppression:\n```\n\nHere's an example of a filter-file that includes issues on lines changed by an `@microsoft.com` email address or a `myname.SOMETHING.com` email address, but not if those email addresses end in `bot@microsoft.com` or contain a GUID.  It's the same as the above example, with comments stripped out.\n\n```yaml\ndescription: Example filter from README.md\nconfiguration:\n  default-include: true\n  check-line-number: true\ninclude:\n  - author-mail: \"@microsoft.com\"\n  - author-mail: \"/myname\\\\..*\\\\.com/\"\nexclude:\n  - author-mail: bot@microsoft.com\n  - author-mail: '/[0-9A-F]{8}[-][0-9A-F]{4}[-][0-9A-F]{4}[-][0-9A-F]{4}[-][0-9A-F]{12}\\@microsoft.com/'\n```\n\nField names must be specified in [JSONPath notation](https://goessner.net/articles/JsonPath/)\naccessing data in the [SARIF `result` object](https://docs.oasis-open.org/sarif/sarif/v2.1.0/cs01/sarif-v2.1.0-cs01.html#_Toc16012594).\n\nFor commonly used properties the following shortcuts are defined:\n| Shortcut | Full JSONPath |\n| -------- | -------- |\n| author | properties.blame.author |\n| author-mail | properties.blame.author-mail |\n| committer | properties.blame.committer |\n| committer-mail | properties.blame.committer-mail |\n| location | locations[*].physicalLocation.artifactLocation.uri |\n| rule | ruleId |\n| suppression | suppressions[*].kind |\n\nFor the property `uri` (e.g. in `locations[*].physicalLocation.artifactLocation.uri`) file name wildcard characters can be used as it represents a file location:\n- `?` - a single occurrence of any character in a directory or file name\n- `*` - zero or more occurrences of any character in a directory or file name\n- `**` - zero or more occurrences across multiple directory levels\n\nE.g.\n- `tests/Test???.js`\n- `src/js/*.js`\n- `src/js/**/*.js`\n\nAll matching is case insensitive, because email addresses are.  
Whitespace at the start and end of lines is ignored, which also means that line ending characters don't matter.  The filter file must be UTF-8 encoded (including plain ASCII7).\n\nIf there are no inclusion patterns, all issues are included except for those matching the exclusion patterns.  If there are inclusion patterns, only issues matching the inclusion patterns are included.  If an issue matches one or more inclusion patterns and also at least one exclusion pattern, it is excluded.\n\n## Usage as a Python library\n\nAlthough not its primary purpose, you can use sarif-tools from a Python script or module to\nload and summarise SARIF results.\n\n### Basic usage pattern\n\nAfter installation, use `sarif.loader` to load a SARIF file or files, and then use the operations\non the returned `SarifFile` or `SarifFileSet` objects to explore the data.\n\n```python\nfrom sarif import loader\n\nsarif_data = loader.load_sarif_file(path_to_sarif_file)\nissue_count_by_severity = sarif_data.get_result_count_by_severity()\nerror_histogram = sarif_data.get_issue_code_histogram(\"error\")\n```\n\n### Result access API\n\nThe three classes defined in the `sarif_files` module, `SarifFileSet`, `SarifFile` and `SarifRun`,\nprovide similar APIs, which allows SARIF results to be handled similarly at multiple levels of\naggregation.  This section briefly describes some of the key APIs at the three levels of\naggregation.\n\n#### get_distinct_tool_names()\n\nReturns a list of distinct tool names in a `SarifFile` or for all files in a `SarifFileSet`.\nA `SarifRun` has a single tool name so the equivalent method is `get_tool_name()`.\n\n#### get_results()\n\nReturn the list of SARIF results.  These are objects as defined in the\n[SARIF standard section 3.27](https://docs.oasis-open.org/sarif/sarif/v2.1.0/os/sarif-v2.1.0-os.html#_Toc34317638).\n\n#### get_records()\n\nReturn the list of SARIF results as simplified, flattened record dicts.  
Each record has the\nattributes defined in `sarif_file.RECORD_ATTRIBUTES`.\n\n- `\"Tool\"` - the tool name for the run containing the result.\n- `\"Severity\"` - the SARIF severity for the record.  One of `error`, `warning` (the default if the\n  record doesn't specify) or `note`.\n- `\"Code\"` - the issue code from the result.\n- `\"Description\"` - the issue name from the result - corresponding to the Code.\n- `\"Location\"` - the location of the issue, typically the file containing the issue.  Format varies\n  by tool.\n- `\"Line\"` - the line number in the file where the issue occurs.  Value is a string.  This defaults\n  to `\"1\"` if the tool failed to identify the line.\n\n#### get_records_grouped_by_severity()\n\nAs per `get_records()`, but the result is a dict from SARIF severity level (`error`, `warning` and\n`note`) to the list of records of that severity level.\n\n#### get_result_count(), get_result_count_by_severity()\n\nGet the total number of SARIF results.  `get_result_count_by_severity()` returns a dict from\nSARIF severity level (`error`, `warning` and `note`) to the integer number of results of that\nseverity.\n\n#### get_issue_code_histogram(severity)\n\nFor the given severity, get histogram in the form of a list of pairs.  
The first item in each pair\nis the issue code, the second item is the number of matching records, and the list is sorted in\ndecreasing order of frequency (the same as the `sarif summary` command output).\n\n#### Disaggregation and filename access\n\nThese fields and methods allow access to the underlying information about the SARIF files.\n\n- `SarifFileSet.subdirs` - a list of `SarifFileSet` objects corresponding to the subdirectories of\n  the directory from which the `SarifFileSet` was created.\n- `SarifFileSet.files` - a list of `SarifFile` objects corresponding to the SARIF files contained\n  in the directory from which the `SarifFileSet` was created.\n- `SarifFile.get_abs_file_path()` - get the absolute path to the SARIF file.\n- `SarifFile.get_file_name()` - get the name of the SARIF file.\n- `SarifFile.get_file_name_without_extension()` - get the name of the SARIF file without its\n  extension.  Useful for constructing derived filenames.\n- `SarifFile.get_filename_timestamp()` - extract the timestamp from the filename of a SARIF file,\n  and return it as a string.  The timestamp must be in the format specified in the `sarif trend`\n  command.\n- `SarifFile.runs` - a list of `SarifRun` objects contained in the SARIF file.  Most SARIF files\n  only contain a single run, but it is possible to aggregate runs from multiple tools into a\n  single SARIF file.\n\n#### Path shortening API\n\nCall `init_path_prefix_stripping(autotrim, path_prefixes)` on a `SarifFileSet`, `SarifFile` or `SarifRun` object to set up path filtering, either automatically removing the longest common prefix (`autotrim=True`) or removing specific prefixes (`autotrim=False` and a list of strings in `path_prefixes`).\n\n#### Filtering API\n\nCall `init_general_filter(filter_description, include_filters, exclude_filters)` on a `SarifFileSet`, `SarifFile` or `SarifRun` object to set up filtering.  
`filter_description` is a string and the other parameters are lists of inclusion and exclusion rules.  They correspond in an obvious way to the filter file contents described in [Filtering](#filtering) above.\n\nCall `get_filter_stats()` to retrieve the filter stats after reading the results or records from sarif files.  It returns `None` if there is no filter, or otherwise a `sarif_file.FilterStats` object with integer fields `filtered_in_result_count`, `filtered_out_result_count`.  Call `to_string()` on the `FilterStats` object for a readable representation of these statistics, which also includes the filter file name or description (`filter_description` field).\n\n## Suggested usage in CI pipelines\n\nUsing the `--check` option in combination with the `summary` command causes sarif-tools to exit\nwith a nonzero exit code if there are any issues of the specified level, or higher.  This can\nbe useful to fail a continuous integration (CI) pipeline in the case of SAST violation.\n\nThe SARIF issue levels are `error`, `warning` and `note`.  These are all valid options for the\n`--check` option.\n\nE.g. to fail if there are any errors or warnings:\n\n```dos\nsarif --check warning summary c:\\temp\\sarif_files\n```\n\nThe `diff` command can check for any increase in issues of the specified level or above, relative\nto a previous or baseline build.\n\nE.g. to fail if there are any new issue codes at error level:\n\n```dos\nsarif --check error diff c:\\temp\\old_sarif_files c:\\temp\\sarif_files\n```\n\nYou can also use sarif-tools to filter and consolidate the output from multiple tools.  E.g.\n\n```bash\n# First run your static analysis tools, configured to write SARIF output.  
How to do that depends\n# on the tool.\n\n# Now run the blame command to augment the output with blame information.\nsarif blame -o with_blame/myapp_mytool_with_blame.sarif myapp_mytool.sarif\n\n# Now combine all tools' output into a single file\nsarif copy --timestamp -o artifacts/myapp_alltools_with_blame.sarif\n```\n\nDownload the file `myapp_alltools_with_blame_TIMESTAMP.sarif` that is generated.  Then later you can\nfilter the results using the `--filter` argument, or generate a graph of code quality over time\nusing `sarif trend`.\n\n## Credits\n\nsarif-tools was originally developed during the Microsoft Global Hackathon 2021 by Simon Abykov, Nick Brabbs, Anthony Hayward, Sivaji Kondapalli, Matt Parkes and Kathryn Pentland.\n\nThank you to everyone who has contributed\n[pull requests](https://github.com/microsoft/sarif-tools/pulls?q=reason%3Acompleted)\nsince the initial release!\n",
    "bugtrack_url": null,
    "license": "",
    "summary": "SARIF tools",
    "version": "2.0.0",
    "project_urls": {
        "Bug Tracker": "https://github.com/microsoft/sarif-tools/issues",
        "Homepage": "https://github.com/microsoft/sarif-tools"
    },
    "split_keywords": [],
    "urls": [
        {
            "comment_text": "",
            "digests": {
                "blake2b_256": "26278317cc0ba6363ba9f69936b4fa1cfa354e4f9134b0a1a5b1038449073e7c",
                "md5": "60988f98d602663f7f9c9a66ac43ec1a",
                "sha256": "01d0a1a8b445cc5b7dd41bcf895fb827e8168072f0a6f74353b639e73870da4a"
            },
            "downloads": -1,
            "filename": "sarif_tools-2.0.0-py3-none-any.whl",
            "has_sig": false,
            "md5_digest": "60988f98d602663f7f9c9a66ac43ec1a",
            "packagetype": "bdist_wheel",
            "python_version": "py3",
            "requires_python": ">=3.8,<4.0",
            "size": 50390,
            "upload_time": "2023-11-07T13:17:01",
            "upload_time_iso_8601": "2023-11-07T13:17:01.532408Z",
            "url": "https://files.pythonhosted.org/packages/26/27/8317cc0ba6363ba9f69936b4fa1cfa354e4f9134b0a1a5b1038449073e7c/sarif_tools-2.0.0-py3-none-any.whl",
            "yanked": false,
            "yanked_reason": null
        },
        {
            "comment_text": "",
            "digests": {
                "blake2b_256": "1e93a8d9947f74546b5550ddf18f69ac5f6c0296ec9151e0e7e5f042552fb194",
                "md5": "036172d716ddf88387e60f52b7d19bdd",
                "sha256": "aae95d255b5e5c989a2043d5441f24a727117da0f49f8cad1e7faf7c69eac3a3"
            },
            "downloads": -1,
            "filename": "sarif_tools-2.0.0.tar.gz",
            "has_sig": false,
            "md5_digest": "036172d716ddf88387e60f52b7d19bdd",
            "packagetype": "sdist",
            "python_version": "source",
            "requires_python": ">=3.8,<4.0",
            "size": 48921,
            "upload_time": "2023-11-07T13:17:03",
            "upload_time_iso_8601": "2023-11-07T13:17:03.609456Z",
            "url": "https://files.pythonhosted.org/packages/1e/93/a8d9947f74546b5550ddf18f69ac5f6c0296ec9151e0e7e5f042552fb194/sarif_tools-2.0.0.tar.gz",
            "yanked": false,
            "yanked_reason": null
        }
    ],
    "upload_time": "2023-11-07 13:17:03",
    "github": true,
    "gitlab": false,
    "bitbucket": false,
    "codeberg": false,
    "github_user": "microsoft",
    "github_project": "sarif-tools",
    "travis_ci": false,
    "coveralls": false,
    "github_actions": false,
    "lcname": "sarif-tools"
}
        