packj

- Name: packj
- Version: 0.15
- Home page: https://github.com/ossillate-inc/packj
- Summary: Packj flags "risky" open-source packages in your software supply chain
- Upload time: 2023-02-01 18:14:25
- Author: Ossillate Inc.
- Requires Python: >=3.4
- License: GNU AGPLv3
- Keywords: software supply chain, malware, typo-squatting, vulnerability, open-source software, software composition analysis

# <img src="https://packj.dev/static/img/icons/package.svg" width="45"/>&nbsp;<span style="font-size: 42px"> Packj flags malicious/risky open-source packages</span>

*Packj* (pronounced package) is a tool to mitigate software supply chain attacks. It can detect malicious, vulnerable, abandoned, typo-squatting, and other "risky" packages from popular open-source package registries, such as NPM, RubyGems, and PyPI. It can be easily customized to minimize noise. Packj started as a PhD research project and is currently being developed under various government grants.

[![GitHub Stars](https://img.shields.io/github/stars/ossillate-inc/packj?style=social)](https://github.com/ossillate-inc/packj/stargazers) ![](https://img.shields.io/badge/status-beta-yellow) [![Prs Welcome](https://img.shields.io/badge/PRs-welcome-brightgreen.svg?style=shields)](https://github.com/ossillate-inc/packj/blob/main/CONTRIBUTING.md) ![Github Commit Activity](https://img.shields.io/github/commit-activity/m/ossillate-inc/packj) [![Discord](https://img.shields.io/discord/910733124558802974?label=Discord)](https://discord.gg/qFcqaV2wYa)  [![License: AGPL v3](https://img.shields.io/badge/License-AGPL_v3-blue.svg)](https://www.gnu.org/licenses/agpl-3.0) [![Docker](https://badgen.net/badge/icon/docker?icon=docker&label)](https://hub.docker.com/r/ossillate/packj/tags)
[![PyPI - Downloads](https://img.shields.io/pypi/dm/packj?label=PyPI%20Downloads)](https://pypistats.org/packages/packj)

![demo video](https://drive.google.com/uc?export=view&id=1QfA73i_ihgqo2JbNXoxaGSZ2Wa02RZNq)

# Contents #

* [Get started](#get-started) - available as Docker image, GitHub Action, Python PyPI package
* [Functionality](#functionality) - deep static/dynamic code analysis and sandboxing
* [Our story](#our-story) - started as a PhD research project and is backed by government grants
* [Why Packj](#why-packj) - existing CVE scanners ASSUME code is BENIGN and do not analyze its behavior
* [Customization](#customization) - turn off alerts as per your threat model to reduce noise
* [Malware found](#malware-found) - reported over 70 malicious PyPI and RubyGems packages
* [Talks and videos](#resources) - presentations from PyCon, OpenSourceSummit, BlackHAT
* [Project roadmap](#feature-roadmap) - view or suggest new features; join our [discord channel](https://discord.gg/qFcqaV2wYa)
* [Team and collaboration](#team) - expert Cybersecurity researchers from academia/industry
* [FAQ](#faq) - supported package managers, commonly asked questions on techniques, and more

# Get started #

We support multiple deployment models:

### 1. GitHub runner 

Use Packj to audit dependencies in pull requests.

```yaml
- name: Packj Security Audit
  uses: ossillate-inc/packj-github-action@0.0.4-beta
  with:
    # TODO: replace with your dependency files in the repo
    DEPENDENCY_FILES: pypi:requirements.txt,npm:package.json,rubygems:Gemfile
    REPO_TOKEN: ${{ secrets.GITHUB_TOKEN }}
```

View on GitHub [marketplace](https://packj.dev/go?next=https://github.com/marketplace/actions/packj-security-audit). Example [PR run](https://packj.dev/go?next=https://github.com/ossillate-inc/packj-github-action-demo/pull/3#issuecomment-1274797138).

###  2. PyPI package

The quickest way to try/test Packj is using the PyPI package.

>
> **Warning**: Packj only works on Linux.
>

```
pip3 install packj
```

Auditing RubyGems packages requires additional dependencies:

```
bundle install
```

### 3. Docker image (recommended)

Use Docker or Podman for containerized (isolated) runs.

```
docker run -v /tmp:/tmp/packj -it ossillate/packj:latest --help
```

### 4. Source repo

Clone this repo:

```
git clone https://github.com/ossillate-inc/packj.git && cd packj
```

Install dependencies

```
bundle install && pip3 install -r requirements.txt
```

Start with help:

```
python3 main.py --help 
```

# Functionality #

Packj offers the following tools: 

* [Audit](#auditing-a-package) - to vet a package for "risky" attributes.
* [Sandbox](#sandboxed-package-installation) - for safe installation of a package. 

## Auditing a package ##

Packj audits open-source software packages for "risky" attributes that make them vulnerable to supply chain attacks. For instance, packages with expired email domains (lacking 2FA), large release time gap, sensitive APIs or access permissions, etc. are flagged as risky. 

Auditing the following is supported:

- multiple packages: `python3 main.py audit -p pypi:requests rubygems:overcommit`
- dependency files: `python3 main.py audit -f npm:package.json pypi:requirements.txt`

By default, `audit` only performs static code analysis to detect risky code. You can pass the `-t` or `--trace` flag to perform dynamic code analysis as well, which installs all requested packages under strace and monitors their install-time behavior. Please see the example output below.

<details>
    <summary><h4>Show example run/output</h4></summary>

    $ docker run -v /tmp:/tmp/packj -it ossillate/packj:latest audit --trace -p npm:browserify

    [+] Fetching 'browserify' from npm..........PASS [ver 17.0.0]
    [+]    Checking package description.........PASS [browser-side require() the node way]
    [+]    Checking release history.............PASS [484 version(s)]
    [+] Checking version........................RISK [702 days old]
    [+]    Checking release time gap............PASS [68 days since last release]
    [+] Checking author.........................PASS [mail@substack.net]
    [+]    Checking email/domain validity.......RISK [expired author email domain]
    [+] Checking readme.........................PASS [26838 bytes]
    [+] Checking homepage.......................PASS [https://github.com/browserify/browserify#readme]
    [+] Checking downloads......................PASS [2M weekly]
    [+] Checking repo URL.......................PASS [https://github.com/browserify/browserify]
    [+]    Checking repo data...................PASS [stars: 14189, forks: 1244]
    [+]    Checking if repo is a forked copy....PASS [original, not forked]
    [+]    Checking repo description............PASS [browser-side require() the node.js way]
    [+]    Checking repo activity...............PASS [commits: 2290, contributors: 207, tags: 413]
    [+] Checking for CVEs.......................PASS [none found]
    [+] Checking dependencies...................RISK [48 found]
    [+] Downloading package from npm............PASS [163.83 KB]
    [+] Analyzing code..........................RISK [needs 3 perm(s): decode,codegen,file]
    [+] Checking files/funcs....................PASS [429 files (383 .js), 744 funcs, LoC: 9.7K]
    [+] Installing package and tracing code.....PASS [found 5 process,1130 files,22 network syscalls]
    =============================================
    [+] 5 risk(s) found, package is undesirable!
    => Complete report: /tmp/packj_54rbjhgm/report_npm-browserify-17.0.0_hlr1rhcz.json
    {
        "undesirable": [
            "old package: 702 days old",
            "invalid or no author email: expired author email domain",
            "generates new code at runtime",
            "reads files and dirs",
            "forks or exits OS processes",
        ]
    }
</details>
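
The console output in the example above follows a regular line shape (`[+] <check>....<STATUS> [<detail>]`), so it can be post-processed with a small script. The sketch below is only an illustration based on the lines shown in this README: the regular expression, the `CheckResult` type, and the function names are assumptions, not part of Packj, and the real output format may differ between versions.

```python
import re
from typing import List, NamedTuple, Optional

# Matches lines such as:
#   [+] Checking version........................RISK [702 days old]
# The status words are the ones visible in the example outputs of this README.
LINE_RE = re.compile(
    r"^\[\+\]\s+(?P<check>.+?)\.{2,}"
    r"(?P<status>PASS|RISK|OK|ALERT)\s+\[(?P<detail>.*)\]\s*$"
)


class CheckResult(NamedTuple):
    check: str
    status: str
    detail: str


def parse_line(line: str) -> Optional[CheckResult]:
    """Parse one console line; return None for lines that are not check results."""
    match = LINE_RE.match(line.strip())
    if match is None:
        return None
    return CheckResult(match["check"], match["status"], match["detail"])


def risky_checks(output: str) -> List[CheckResult]:
    """Return only the checks whose status is RISK or ALERT."""
    parsed = (parse_line(line) for line in output.splitlines())
    return [r for r in parsed if r is not None and r.status in {"RISK", "ALERT"}]


# A few lines copied from the browserify example above.
SAMPLE_OUTPUT = """
[+] Checking version........................RISK [702 days old]
[+]    Checking release time gap............PASS [68 days since last release]
[+]    Checking email/domain validity.......RISK [expired author email domain]
=============================================
[+] 5 risk(s) found, package is undesirable!
"""

if __name__ == "__main__":
    for result in risky_checks(SAMPLE_OUTPUT):
        print(f"{result.status}: {result.check} ({result.detail})")
```

For anything beyond a quick filter, the JSON report that Packj writes is probably the better input; its full schema is not described here, so this sketch sticks to the console lines.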

> WARNING: since packages could execute malicious code during installation, it is recommended to ONLY use `-t` or `--trace` when running inside a Docker container or a Virtual Machine.

Audit can also be performed in Docker/Podman containers. Please find details on risky attributes and how to use at [Audit README](https://packj.dev/go?next=https://github.com/ossillate-inc/packj/blob/main/packj/audit/README.md).
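
The audit invocations shown in this section can be assembled programmatically, for example in a CI helper. The following sketch only builds the argument list from the commands that appear in this README (`audit -p`, `audit -f`, `--trace`, and the `docker run` prefix). The function names, and the choice to switch to the Docker form whenever tracing is requested, are assumptions made for illustration and follow the warning above rather than any documented Packj behavior.

```python
import shlex
from typing import List, Sequence

# Registries named in this README.
SUPPORTED_REGISTRIES = {"npm", "pypi", "rubygems"}

# Command prefixes copied from the examples in this README.
LOCAL_PREFIX = ["python3", "main.py"]
DOCKER_PREFIX = ["docker", "run", "-v", "/tmp:/tmp/packj", "-it", "ossillate/packj:latest"]


def check_spec(spec: str) -> str:
    """Validate a 'registry:target' spec such as 'pypi:requests'."""
    registry, sep, target = spec.partition(":")
    if not sep or not target or registry not in SUPPORTED_REGISTRIES:
        raise ValueError(f"expected '<registry>:<name>', got {spec!r}")
    return spec


def build_audit_command(
    packages: Sequence[str] = (),
    dependency_files: Sequence[str] = (),
    trace: bool = False,
) -> List[str]:
    """Build an audit command for packages or for dependency files."""
    if bool(packages) == bool(dependency_files):
        raise ValueError("pass packages or dependency files (exactly one of the two)")
    # Assumption of this sketch: tracing installs the package, so run it in Docker.
    # Dependency files must then be visible inside the container; that is left
    # to the caller.
    command = list(DOCKER_PREFIX if trace else LOCAL_PREFIX) + ["audit"]
    if trace:
        command.append("--trace")
    if packages:
        command += ["-p", *(check_spec(p) for p in packages)]
    else:
        command += ["-f", *(check_spec(f) for f in dependency_files)]
    return command


if __name__ == "__main__":
    print(shlex.join(build_audit_command(packages=["npm:browserify"], trace=True)))
```

The result is a plain list of arguments, so it can be handed to `subprocess.run` without going through a shell.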

## Sandboxed package installation ##

Packj offers lightweight sandboxing for safe installation of a package. Specifically, it prevents malicious packages from exfiltrating sensitive data, accessing sensitive files (e.g., SSH keys), and persisting malware.

It sandboxes install-time scripts, including any native compilation. It uses **strace** (i.e., **NO** VM/Container required).

Please find details on the sandboxing mechanism and how to use at [Sandbox README](https://packj.dev/go?next=https://github.com/ossillate-inc/packj/blob/main/packj/sandbox/README.md).

<details>
    <summary><h4>Show example run/output</h4></summary>

    $ python3 main.py sandbox gem install overcommit
   
    Fetching: overcommit-0.59.1.gem (100%)
    Install hooks by running `overcommit --install` in your Git repository
    Successfully installed overcommit-0.59.1
    Parsing documentation for overcommit-0.59.1
    Installing ri documentation for overcommit-0.59.1
   
    #############################
    # Review summarized activity
    #############################
   
    [+] Network connections
        [+] DNS (1 IPv4 addresses) at port 53 [rule: ALLOW]
        [+] rubygems.org (4 IPv6 addresses) at port 443 [rule: IPv6 rules not supported]
        [+] rubygems.org (4 IPv4 addresses) at port 443 [rule: ALLOW]
    [+] Filesystem changes
    /
    └── home
        └── ubuntu
            └── .ruby
                ├── gems
                │   ├── iniparse-1.5.0 [new: DIR, 15 files, 46.6K bytes]
                │   ├── rexml-3.2.5 [new: DIR, 77 files, 455.6K bytes]
                │   ├── overcommit-0.59.1 [new: DIR, 252 files, 432.7K bytes]
                │   └── childprocess-4.1.0 [new: DIR, 57 files, 141.2K bytes]
                ├── cache
                │   ├── iniparse-1.5.0.gem [new: FILE, 16.4K bytes]
                │   ├── rexml-3.2.5.gem [new: FILE, 93.2K bytes]
                │   ├── childprocess-4.1.0.gem [new: FILE, 34.3K bytes]
                │   └── overcommit-0.59.1.gem [new: FILE, 84K bytes]
                ├── specifications
                │   ├── rexml-3.2.5.gemspec [new: FILE, 2.7K bytes]
                │   ├── overcommit-0.59.1.gemspec [new: FILE, 1.7K bytes]
                │   ├── childprocess-4.1.0.gemspec [new: FILE, 1.8K bytes]
                │   └── iniparse-1.5.0.gemspec [new: FILE, 1.3K bytes]
                ├── bin
                │   └── overcommit [new: FILE, 622 bytes]
                └── doc
                    ├── iniparse-1.5.0
                    │   └── ri [new: DIR, 119 files, 131.7K bytes]
                    ├── rexml-3.2.5
                    │   └── ri [new: DIR, 836 files, 841K bytes]
                    ├── overcommit-0.59.1
                    │   └── ri [new: DIR, 1046 files, 1.5M bytes]
                    └── childprocess-4.1.0
                        └── ri [new: DIR, 272 files, 297.8K bytes]

    [C]ommit all changes, [Q|q]uit & discard changes, [L|l]ist details:
</details>

# Our story

**TL;DR** Packj started as a PhD research project. It is backed by various government grants.

<details>
	<summary><h4>Show long answer</h4></summary>

Packj started as an academic research project. Specifically, the static code analysis techniques used by Packj are based on cutting-edge Cybersecurity research: [MalOSS](https://packj.dev/go?next=https://github.com/osssanitizer/maloss) project by our research [group](https://packj.dev/go?next=http://cyfi.ece.gatech.edu) at Georgia Tech.

<a href="https://packj.dev/go?next=https://arxiv.org/pdf/2002.01139v1.pdf" target="_blank">
	<img src="https://drive.google.com/uc?export=view&id=1L03-kFTdNDFvGLWt_zJ-Qe8PPX75ICqo" width="300" alt="academic paper">
</a>

Packj is backed by generous grants from [NSF](https://www.sbir.gov/node/2083473), [GRA](https://gra.org/company/227/OSSPolice.html), and [ALInnovate](https://innovatealabama.org).

</details>

# Why Packj

**TL;DR** State-of-the-art open-source vulnerability scanners assume **TRUSTED** code. Therefore, all of them **ONLY** scan for CVEs. Packj, in contrast, not only scans for CVEs but also carries out deep code analysis to flag hidden malware and "risky" code behavior, such as spawning a shell, use of SSH keys, and a mismatch between GitHub code and packaged code (provenance). Such risky behaviors and attributes do not qualify as vulnerabilities (CVEs), which is why none of the existing tools can flag them.

<details>
    <summary><h4>Show long answer</h4></summary>

Security vulnerabilities (a.k.a. CVEs) are the result of accidental programming bugs (e.g., Log4J, HeartBleed). A typical example is a missing bounds check on user input, which makes the program vulnerable to buffer overflow attacks. Attackers need to develop an exploit to trigger such security vulnerabilities (e.g., a crafted TCP/IP packet in the case of HeartBleed, or a numerically high input to cause a buffer overflow). Such CVEs can be fixed by patching or upgrading to a newer version of the library (e.g., a newer version of Log4J fixes the CVE).

In contrast, malware is purposefully bad. Moreover, malware itself is an exploit and cannot be patched or fixed by upgrading to a newer version. For example, the [dependency confusion attack](https://packj.dev/go?next=https://medium.com/@alex.birsan/dependency-confusion-4a5d60fec610) was intentionally malicious; it did not exploit any accidental programming bug in the code. Similarly, the author of a popular package sabotaging their own code to [protest](https://packj.dev/go?next=https://en.wikipedia.org/wiki/Peacenotwar) against the war is very much intentional and does not exploit any CVEs. Typo-squatting is another attack vector that bad actors use to propagate malware in popular open-source package registries: it exploits [typos and inexperience of devs](https://packj.dev/go?next=https://discuss.python.org/t/improving-risks-and-consequences-against-typosquatting-on-pypi/5090), not accidental programming bugs or CVEs in the code.
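
To make the typo-squatting idea concrete, here is a minimal sketch that compares a package name against a short list of well-known names using Levenshtein edit distance. It is not Packj's detection logic: the `POPULAR` list, the distance threshold of 2, and the function names are assumptions chosen purely for illustration.

```python
from typing import List, Sequence


def edit_distance(a: str, b: str) -> int:
    """Levenshtein distance computed row by row with dynamic programming."""
    previous = list(range(len(b) + 1))
    for i, char_a in enumerate(a, start=1):
        current = [i]
        for j, char_b in enumerate(b, start=1):
            cost = 0 if char_a == char_b else 1
            current.append(
                min(
                    previous[j] + 1,         # delete char_a
                    current[j - 1] + 1,      # insert char_b
                    previous[j - 1] + cost,  # substitute (or keep)
                )
            )
        previous = current
    return previous[-1]


# Illustrative list only; a real check would use registry download statistics.
POPULAR = ["requests", "numpy", "browserify"]


def possible_typosquats(
    name: str, popular: Sequence[str] = POPULAR, max_distance: int = 2
) -> List[str]:
    """Return popular names that `name` is suspiciously close to."""
    return [
        candidate
        for candidate in popular
        if candidate != name and edit_distance(name, candidate) <= max_distance
    ]


if __name__ == "__main__":
    print(possible_typosquats("reqeusts"))
```

A name that is close to a popular package is only a hint; legitimate packages can have similar names, so a check like this is best treated as one signal among several.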

Existing scanners **DO NOT** detect malware or intentionally bad code because they assume that the third-party open-source code is benign. As such, these tools simply scan the source code for open-source dependencies, compile a list of all dependencies being used, and look each `<dependency-NAME, dependency-VERSION>` pair up in a database (e.g., NVD) to report if the source code uses any vulnerable package versions (e.g., a vulnerable version of Log4J, or a LibSSL version affected by HeartBleed).
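
The lookup described above can be sketched in a few lines. The database here is a tiny hand-written stand-in for something like NVD, with a single well-known entry (Log4j 2.14.1 and CVE-2021-44228); the names `KNOWN_VULNERABLE` and `cve_scan` are hypothetical. The point of the example is that a package with no CVE record, such as the malicious `krisqian` 0.0.7 discussed later in this README, passes such a scan untouched.

```python
from typing import Dict, Iterable, List, Tuple

Dependency = Tuple[str, str]  # (name, version)

# Stand-in for a vulnerability database; real scanners query NVD or similar.
KNOWN_VULNERABLE: Dict[Dependency, List[str]] = {
    ("log4j-core", "2.14.1"): ["CVE-2021-44228"],
}


def cve_scan(dependencies: Iterable[Dependency]) -> Dict[Dependency, List[str]]:
    """Report only dependencies whose exact (name, version) has a CVE record."""
    findings: Dict[Dependency, List[str]] = {}
    for name, version in dependencies:
        cves = KNOWN_VULNERABLE.get((name.lower(), version), [])
        if cves:
            findings[(name, version)] = cves
    return findings


if __name__ == "__main__":
    deps = [("log4j-core", "2.14.1"), ("krisqian", "0.0.7")]
    # Only log4j-core is reported; the lookup never inspects package code.
    print(cve_scan(deps))
```

Because the scan never reads the package code, nothing about install-time network access or code generation can influence its result.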

Packj uses static code analysis, runtime tracing (dynamic analysis), and metadata checks to audit the programmatic behavior of the package. Please read more at [Audit README](https://packj.dev/go?next=https://github.com/ossillate-inc/packj/blob/main/packj/audit/README.md#faq).
</details>

# Customization #

Packj can be easily customized (zero noise) to your threat model. Simply add a [.packj.yaml](https://packj.dev/go?next=https://github.com/ossillate-inc/packj/blob/main/.packj.yaml) file in the top dir of your repo/project and reduce alert fatigue by commenting out unwanted attributes.

# Malware found #

Using this tool, we found over 40 and 20 malicious packages on PyPI and RubyGems, respectively. A number of them have been taken down. Refer to an example below:

<details>
    <summary><h4>Show example malware</h4></summary>

    $ python3 main.py audit pypi:krisqian

    [+] Fetching 'krisqian' from pypi...OK [ver 0.0.7]
    [+] Checking version...OK [256 days old]
    [+] Checking release history...OK [7 version(s)]
    [+] Checking release time gap...OK [1 days since last release]
    [+] Checking author...OK [KrisWuQian@baidu.com]
        [+] Checking email/domain validity...OK [KrisWuQian@baidu.com]
    [+] Checking readme...ALERT [no readme]
    [+] Checking homepage...OK [https://www.bilibili.com/bangumi/media/md140632]
    [+] Checking downloads...OK [13 weekly]
    [+] Checking repo_url URL...OK [None]
    [+] Checking for CVEs...OK [none found]
    [+] Checking dependencies...OK [none found]
    [+] Downloading package 'KrisQian' (ver 0.0.7) from pypi...OK [1.94 KB]
    [+] Analyzing code...ALERT [needs 3 perms: process,network,file]
    [+] Checking files/funcs...OK [9 files (2 .py), 6 funcs, LoC: 184]
    =============================================
    [+] 6 risk(s) found, package is undesirable!
    {
        "undesirable": [
            "no readme",
            "only 45 weekly downloads",
            "no source repo found",
            "generates new code at runtime",
            "fetches data over the network: ['KrisQian-0.0.7/setup.py:40', 'KrisQian-0.0.7/setup.py:50']",
            "reads files and dirs: ['KrisQian-0.0.7/setup.py:59', 'KrisQian-0.0.7/setup.py:70']"
        ]
    }
    => Complete report: pypi-KrisQian-0.0.7.json
    => View pre-vetted package report at https://packj.dev/package/PyPi/KrisQian/0.0.7
</details>


Packj flagged KrisQian (v0.0.7) as suspicious due to absence of source repo and use of sensitive APIs (network, code generation) during package installation time (in setup.py). We decided to take a deeper look, and found the package malicious. Please find our detailed analysis at [https://packj.dev/malware/krisqian](https://packj.dev/go?next=https://packj.dev/malware/krisqian).

More examples of malware we found are listed at [https://packj.dev/malware](https://packj.dev/go?next=https://packj.dev/malware). Please reach out to us at [oss@ossillate.com](mailto:oss@ossillate.com) for the full list.

# Resources #

To learn more about the Packj tool or open-source software supply chain attacks, refer to our talks and publications:

[![PyConUS'22 Video](https://img.youtube.com/vi/Rcuqn56uCDk/hqdefault.jpg)](https://packj.dev/go?next=https://www.youtube.com/watch?v=Rcuqn56uCDk)
[![OSSEU'22 Video](https://img.youtube.com/vi/a7BfDGeW_jY/hqdefault.jpg)](https://packj.dev/go?next=https://www.youtube.com/watch?v=a7BfDGeW_jY)

- PyConUS'22 [talk](https://packj.dev/go?next=https://www.youtube.com/watch?v=Rcuqn56uCDk) and [slides](https://packj.dev/go?next=https://speakerdeck.com/ashishbijlani/pyconus22-slides).
- BlackHAT Asia'22 Arsenal [presentation](https://packj.dev/go?next=https://www.blackhat.com/asia-22/arsenal/schedule/#mitigating-open-source-software-supply-chain-attacks-26241)
- PackagingCon'21 [talk](https://packj.dev/go?next=https://www.youtube.com/watch?v=PHfN-NrUCoo) and [slides](https://packj.dev/go?next=https://speakerdeck.com/ashishbijlani/mitigating-open-source-software-supply-chain-attacks)
- BlackHat USA'22 Arsenal talk [Detecting typo-squatting, backdoored, abandoned, and other "risky" open-source packages using Packj](https://www.blackhat.com/us-22/arsenal/schedule/#detecting-typo-squatting-backdoored-abandoned-and-other-risky-open-source-packages-using-packj-28075)
- Academic [dissertation](https://packj.dev/go?next=https://cyfi.ece.gatech.edu/publications/DUAN-DISSERTATION-2019.pdf) on open-source software security and the [paper](https://packj.dev/go?next=https://www.ndss-symposium.org/wp-content/uploads/ndss2021_1B-1_23055_paper.pdf) from our group at Georgia Tech that started this research.
- Open Source Summit, Europe'22 talk [Scoring dependencies to detect “weak links” in your open-source software supply chain](https://packj.dev/go?next=https://osseu2022.sched.com/overview/type/SupplyChainSecurityCon) - presentation video on [YouTube](https://packj.dev/go?next=https://www.youtube.com/watch?v=a7BfDGeW_jY)
- NullCon'22 talk [Unearthing Malicious And Other “Risky” Open-Source Packages Using Packj](https://packj.dev/go?next=https://nullcon.net/goa-2022/unearthing-malicious-and-other-risky-open-source-packages-using-packj)

# Feature roadmap #

* Add support for other language ecosystems. Rust is a work in progress, and will be available in December '22.
* Add functionality to detect additional "risky" code and metadata attributes.

Have a feature or support request? Please visit our [GitHub discussion page](https://packj.dev/go?next=https://github.com/ossillate-inc/packj/discussions/) or join our [discord community](https://discord.gg/qFcqaV2wYa) for discussion and requests.

# Team #

Packj has been developed by Cybersecurity researchers at [Ossillate Inc.](https://packj.dev/go?next=https://ossillate.com/team) and external collaborators to help developers mitigate risks of supply chain attacks when sourcing untrusted third-party open-source software dependencies. We thank our developers and collaborators.

We welcome code contributions with open arms. See [CONTRIBUTING.md](CONTRIBUTING.md) guidelines. Found a bug? Please open an issue. Refer to our [SECURITY.md](SECURITY.md) guidelines to report a security issue.

# FAQ #

<details>
	<summary><b>What Package Managers (Registries) are supported?</b></summary>

Packj can currently vet NPM, PyPI, and RubyGems packages for "risky" attributes. We are adding support for Rust.
	
</details>

<details>
	<summary><b>What techniques does Packj employ to detect risky/malicious packages?</b></summary>

Packj uses static code analysis, dynamic tracing, and metadata analysis for comprehensive auditing. Static analysis alone is not sufficient to flag sophisticated malware that hides itself using code obfuscation. Dynamic analysis is performed by installing the package under `strace` and monitoring its runtime behavior. Please read more at [Audit README](https://packj.dev/go?next=https://github.com/ossillate-inc/packj/blob/main/packj/audit/README.md).
	
</details>
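
As a rough illustration of what monitoring under `strace` can yield, the sketch below groups syscalls from textual strace output into process, file, and network categories, similar in spirit to the `found 5 process,1130 files,22 network syscalls` summary in the audit example. It is not Packj's implementation: the syscall-to-category mapping, the `/.ssh/` heuristic, and the function names are assumptions, and the sample trace is hand-written.

```python
import re
from collections import Counter
from typing import List

# Matches lines like: openat(AT_FDCWD, "/etc/hosts", O_RDONLY) = 3
# An optional "[pid N]" or "N" prefix (as printed with strace -f) is skipped.
SYSCALL_RE = re.compile(
    r"^(?:\[pid\s+\d+\]\s+|\d+\s+)?(?P<name>[a-z_0-9]+)\((?P<args>.*)\)\s+=\s+(?P<ret>-?\d+|\?)"
)

# Assumed mapping for this sketch; a real tool would cover many more syscalls.
CATEGORIES = {
    "execve": "process", "fork": "process", "vfork": "process", "clone": "process",
    "open": "files", "openat": "files",
    "connect": "network", "sendto": "network",
}


def summarize(trace: str) -> Counter:
    """Count traced syscalls per category."""
    counts: Counter = Counter()
    for line in trace.splitlines():
        match = SYSCALL_RE.match(line.strip())
        if match and match["name"] in CATEGORIES:
            counts[CATEGORIES[match["name"]]] += 1
    return counts


def ssh_key_accesses(trace: str) -> List[str]:
    """Return paths under a .ssh directory that were opened."""
    paths = []
    for line in trace.splitlines():
        match = SYSCALL_RE.match(line.strip())
        if match and CATEGORIES.get(match["name"]) == "files":
            quoted = re.search(r'"([^"]*)"', match["args"])
            if quoted and "/.ssh/" in quoted.group(1):
                paths.append(quoted.group(1))
    return paths


SAMPLE_TRACE = """
execve("/bin/sh", ["sh", "-c", "id"], 0x7ffc1 /* 20 vars */) = 0
openat(AT_FDCWD, "/home/user/.ssh/id_rsa", O_RDONLY) = 3
[pid 4242] connect(3, {sa_family=AF_INET, sin_port=htons(443), sin_addr=inet_addr("93.184.216.34")}, 16) = 0
read(3, "..."..., 4096) = 12
"""

if __name__ == "__main__":
    print(dict(summarize(SAMPLE_TRACE)))
    print(ssh_key_accesses(SAMPLE_TRACE))
```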

<details>
	<summary><b>Does it work on obfuscated calls? For example, a base64-encoded string that gets decoded and then passed to a shell?</b></summary>

This is a very common malicious behavior. Packj detects code obfuscation as well as spawning of shell commands (exec system call). For example, Packj can flag use of the `getattr()` and `eval()` APIs, as they indicate "runtime code generation"; a developer can then take a deeper look. See [main.py](https://packj.dev/go?next=https://github.com/ossillate-inc/packj/blob/main/packj/audit/main.py#L512) for details.
	
</details>
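
The kind of static check described in the last answer can be illustrated with Python's standard `ast` module. This is a minimal sketch and not the code in Packj's `main.py`: `getattr` and `eval` come from the answer above, while `exec` and `compile` in `SUSPICIOUS_CALLS`, as well as the function name, are assumptions added for the example. The sample source is only parsed, never executed.

```python
import ast
from typing import List, Tuple

# Built-in calls treated as hints of runtime code generation in this sketch.
SUSPICIOUS_CALLS = {"eval", "getattr", "exec", "compile"}


def find_suspicious_calls(source: str) -> List[Tuple[int, str]]:
    """Return (line number, function name) for each direct suspicious call."""
    hits = []
    for node in ast.walk(ast.parse(source)):
        # Only direct calls by name are matched, e.g. eval(...); calls made
        # through attributes or aliases are out of scope for this sketch.
        if isinstance(node, ast.Call) and isinstance(node.func, ast.Name):
            if node.func.id in SUSPICIOUS_CALLS:
                hits.append((node.lineno, node.func.id))
    return sorted(hits)


SAMPLE_SOURCE = '''import base64
payload = base64.b64decode("cHJpbnQoJ2hpJyk=")
eval(payload)
runner = getattr(__import__("os"), "system")
'''

if __name__ == "__main__":
    for lineno, name in find_suspicious_calls(SAMPLE_SOURCE):
        print(f"line {lineno}: call to {name}()")
```

A match is a prompt for manual review rather than proof of malice, since plenty of benign code calls `getattr()`.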



            

Raw data

            {
    "_id": null,
    "home_page": "https://github.com/ossillate-inc/packj",
    "name": "packj",
    "maintainer": "",
    "docs_url": null,
    "requires_python": ">=3.4",
    "maintainer_email": "",
    "keywords": "software supply chain,malware,typo-squatting,vulnerability,open-source software,software composition analysis",
    "author": "Ossillate Inc.",
    "author_email": "oss@ossillate.com",
    "download_url": "https://files.pythonhosted.org/packages/cc/7b/97b29a95665af1a0ca293d67ce7b1920fe10392adc99bcafc2321f5f8627/packj-0.15.tar.gz",
    "platform": null,
    "description": "# <img src=\"https://packj.dev/static/img/icons/package.svg\" width=\"45\"/>&nbsp;<span style=\"font-size: 42px\"> Packj flags malicious/risky open-source packages</span> \n\n*Packj* (pronounced package) is a tool to mitigate software supply chain attacks. It can detect malicious, vulnerable, abandoned, typo-squatting, and other \"risky\" packages from popular open-source package registries, such as NPM, RubyGems, and PyPI. It can be easily customized to minimize noise. Packj started as a PhD research project and is currently being developed under various govt grants.\n\n[![GitHub Stars](https://img.shields.io/github/stars/ossillate-inc/packj?style=social)](https://github.com/ossillate-inc/packj/stargazers) ![](https://img.shields.io/badge/status-beta-yellow) [![Prs Welcome](https://img.shields.io/badge/PRs-welcome-brightgreen.svg?style=shields)](https://github.com/ossillate-inc/packj/blob/main/CONTRIBUTING.md) ![Github Commit Activity](https://img.shields.io/github/commit-activity/m/ossillate-inc/packj) [![Discord](https://img.shields.io/discord/910733124558802974?label=Discord)](https://discord.gg/qFcqaV2wYa)  [![License: AGPL v3](https://img.shields.io/badge/License-AGPL_v3-blue.svg)](https://www.gnu.org/licenses/agpl-3.0) [![Docker](https://badgen.net/badge/icon/docker?icon=docker&label)](https://hub.docker.com/r/ossillate/packj/tags)\n[![PyPI - Downloads](https://img.shields.io/pypi/dm/packj?label=PyPI%20Downloads)](https://pypistats.org/packages/packj)\n\n![demo video](https://drive.google.com/uc?export=view&id=1QfA73i_ihgqo2JbNXoxaGSZ2Wa02RZNq)\n\n# Contents #\n\n* [Get started](#get-started) - available as Docker image, GitHub Action, Python PyPI package\n* [Functionality](#functionality) - deep static/dynamic code analysis and sandboxing\n* [Our story](#our-story) - started as a PhD research project and is backed by govt grants\n* [Why Packj](#why-packj) - existing CVE scanners ASSUME code is BENIGN and not analyze its behavior\n* 
[Customization](#customization) - turn off alerts as per your threat model to reduce noise\n* [Malware found](#malware-found) - reported over 70 malicious PyPI and RubyGems packages\n* [Talks and videos](#resources) - presentations from PyCon, OpenSourceSummit, BlackHAT\n* [Project roadmap](#feature-roadmap) - view or suggest new features; join our [discord channel](https://discord.gg/qFcqaV2wYa)\n* [Team and collaboration](#team) - expert Cybersecurity researchers from academia/industry\n* [FAQ](#faq) - supported package managers, commonly asked questions on techniques, and more\n\n# Get started #\n\nWe support multiple deployment models:\n\n### 1. GitHub runner \n\nUse Packj to audit dependencies in pull requests.\n\n```yaml\n- name: Packj Security Audit\n  uses: ossillate-inc/packj-github-action@0.0.4-beta\n  with:\n    # TODO: replace with your dependency files in the repo\n    DEPENDENCY_FILES: pypi:requirements.txt,npm:package.json,rubygems:Gemfile\n    REPO_TOKEN: ${{ secrets.GITHUB_TOKEN }}\n```\n\nView on GitHub [marketplace](https://packj.dev/go?next=https://github.com/marketplace/actions/packj-security-audit). Example [PR run](https://packj.dev/go?next=https://github.com/ossillate-inc/packj-github-action-demo/pull/3#issuecomment-1274797138).\n\n###  2. PyPI package\n\nThe quickest way to try/test Packj is using the PyPI package.\n\n>\n> **Warning**: Packj only works on Linux.\n>\n\n```\npip3 install packj\n```\n\nAuditing RubyGems require additional dependencies\n\n```\nbundle install\n```\n\n### 3. Docker image (recommended)\n\nUse Docker or Podman for containerized (isolated) runs.\n\n```\ndocker run -v /tmp:/tmp/packj -it ossillate/packj:latest --help\n```\n\n### 4. 
Source repo\n\nClone this repo, \n\n```\nhttps://github.com/ossillate-inc/packj.git && cd packj\n```\n\nInstall dependencies\n\n```\nbundle install && pip3 install -r requirements.txt\n```\n\nStart with help:\n\n```\npython3 main.py --help \n```\n\n# Functionality #\n\nPackj offers the following tools: \n\n* [Audit](#auditing-a-package) - to vet a package for \"risky\" attributes.\n* [Sandbox](#sandboxed-package-installation) - for safe installation of a package. \n\n## Auditing a package ##\n\nPackj audits open-source software packages for \"risky\" attributes that make them vulnerable to supply chain attacks. For instance, packages with expired email domains (lacking 2FA), large release time gap, sensitive APIs or access permissions, etc. are flagged as risky. \n\nAuditing the following is supported:\n\n- multiple packages: `python3 main.py audit -p pypi:requests rubygems:overcommit`\n- dependency files: `python3 main.py audit -f npm:package.json pypi:requirements.txt`\n\nBy default, `audit` only performs static code analysis to detect risky code. You can paas `-t` or `--trace` flag to perform dynamic code analysis as well, which will install all requested packages under strace and monitor install-time behavior of packages. 
Please see the example output below.\n\n<details>\n    <summary><h4>Show example run/output</h4></summary>\n\n    $ docker run -v /tmp:/tmp/packj -it ossillate/packj:latest audit --trace -p npm:browserify\n\n    [+] Fetching 'browserify' from npm..........PASS [ver 17.0.0]\n    [+]    Checking package description.........PASS [browser-side require() the node way]\n    [+]    Checking release history.............PASS [484 version(s)]\n    [+] Checking version........................RISK [702 days old]\n    [+]    Checking release time gap............PASS [68 days since last release]\n    [+] Checking author.........................PASS [mail@substack.net]\n    [+]    Checking email/domain validity.......RISK [expired author email domain]\n    [+] Checking readme.........................PASS [26838 bytes]\n    [+] Checking homepage.......................PASS [https://github.com/browserify/browserify#readme]\n    [+] Checking downloads......................PASS [2M weekly]\n    [+] Checking repo URL.......................PASS [https://github.com/browserify/browserify]\n    [+]    Checking repo data...................PASS [stars: 14189, forks: 1244]\n    [+]    Checking if repo is a forked copy....PASS [original, not forked]\n    [+]    Checking repo description............PASS [browser-side require() the node.js way]\n    [+]    Checking repo activity...............PASS [commits: 2290, contributors: 207, tags: 413]\n    [+] Checking for CVEs.......................PASS [none found]\n    [+] Checking dependencies...................RISK [48 found]\n    [+] Downloading package from npm............PASS [163.83 KB]\n    [+] Analyzing code..........................RISK [needs 3 perm(s): decode,codegen,file]\n    [+] Checking files/funcs....................PASS [429 files (383 .js), 744 funcs, LoC: 9.7K]\n    [+] Installing package and tracing code.....PASS [found 5 process,1130 files,22 network syscalls]\n    =============================================\n    [+] 5 risk(s) 
found, package is undesirable!\n    => Complete report: /tmp/packj_54rbjhgm/report_npm-browserify-17.0.0_hlr1rhcz.json\n    {\n        \"undesirable\": [\n            \"old package: 702 days old\",\n            \"invalid or no author email: expired author email domain\",\n            \"generates new code at runtime\",\n            \"reads files and dirs\",\n            \"forks or exits OS processes\",\n        ]\n    }\n</details>\n\n> WARNING: since packages could execute malicious code during installation, it is recommended to ONLY use `-t` or `--trace` when running inside a Docker container or a Virtual Machine.\n\nAudit can also be performed in Docker/Podman containers. Please find details on risky attributes and how to use at [Audit README](https://packj.dev/go?next=https://github.com/ossillate-inc/packj/blob/main/packj/audit/README.md).\n\n## Sandboxed package installation ##\n\nPackj offers a lightweight sandboxing for `safe installation` of a package. Specifically, it prevents malicious packages from exfiltrating sensitive data, accessing sensitive files (e.g., SSH keys), and persisting malware. \n\nIt sandboxes install-time scripts, including any native compliation. 
It uses **strace** (i.e., **NO** VM/Container required).\n\nPlease find details on the sandboxing mechanism and how to use at [Sandbox README](https://packj.dev/go?next=https://github.com/ossillate-inc/packj/blob/main/packj/sandbox/README.md).\n\n<details>\n    <summary><h4>Show example run/output</h4></summary>\n\n    $ python3 main.py sandbox gem install overcommit\n   \n    Fetching: overcommit-0.59.1.gem (100%)\n    Install hooks by running `overcommit --install` in your Git repository\n    Successfully installed overcommit-0.59.1\n    Parsing documentation for overcommit-0.59.1\n    Installing ri documentation for overcommit-0.59.1\n   \n    #############################\n    # Review summarized activity\n    #############################\n   \n    [+] Network connections\n        [+] DNS (1 IPv4 addresses) at port 53 [rule: ALLOW]\n        [+] rubygems.org (4 IPv6 addresses) at port 443 [rule: IPv6 rules not supported]\n        [+] rubygems.org (4 IPv4 addresses) at port 443 [rule: ALLOW]\n    [+] Filesystem changes\n    /\n    \u2514\u2500\u2500 home\n        \u2514\u2500\u2500 ubuntu\n            \u2514\u2500\u2500 .ruby\n                \u251c\u2500\u2500 gems\n                \u2502   \u251c\u2500\u2500 iniparse-1.5.0 [new: DIR, 15 files, 46.6K bytes]\n                \u2502   \u251c\u2500\u2500 rexml-3.2.5 [new: DIR, 77 files, 455.6K bytes]\n                \u2502   \u251c\u2500\u2500 overcommit-0.59.1 [new: DIR, 252 files, 432.7K bytes]\n                \u2502   \u2514\u2500\u2500 childprocess-4.1.0 [new: DIR, 57 files, 141.2K bytes]\n                \u251c\u2500\u2500 cache\n                \u2502   \u251c\u2500\u2500 iniparse-1.5.0.gem [new: FILE, 16.4K bytes]\n                \u2502   \u251c\u2500\u2500 rexml-3.2.5.gem [new: FILE, 93.2K bytes]\n                \u2502   \u251c\u2500\u2500 childprocess-4.1.0.gem [new: FILE, 34.3K bytes]\n                \u2502   \u2514\u2500\u2500 overcommit-0.59.1.gem [new: FILE, 84K bytes]\n                
\u251c\u2500\u2500 specifications\n                \u2502   \u251c\u2500\u2500 rexml-3.2.5.gemspec [new: FILE, 2.7K bytes]\n                \u2502   \u251c\u2500\u2500 overcommit-0.59.1.gemspec [new: FILE, 1.7K bytes]\n                \u2502   \u251c\u2500\u2500 childprocess-4.1.0.gemspec [new: FILE, 1.8K bytes]\n                \u2502   \u2514\u2500\u2500 iniparse-1.5.0.gemspec [new: FILE, 1.3K bytes]\n                \u251c\u2500\u2500 bin\n                \u2502   \u2514\u2500\u2500 overcommit [new: FILE, 622 bytes]\n                \u2514\u2500\u2500 doc\n                    \u251c\u2500\u2500 iniparse-1.5.0\n                    \u2502   \u2514\u2500\u2500 ri [new: DIR, 119 files, 131.7K bytes]\n                    \u251c\u2500\u2500 rexml-3.2.5\n                    \u2502   \u2514\u2500\u2500 ri [new: DIR, 836 files, 841K bytes]\n                    \u251c\u2500\u2500 overcommit-0.59.1\n                    \u2502   \u2514\u2500\u2500 ri [new: DIR, 1046 files, 1.5M bytes]\n                    \u2514\u2500\u2500 childprocess-4.1.0\n                        \u2514\u2500\u2500 ri [new: DIR, 272 files, 297.8K bytes]\n\n    [C]ommit all changes, [Q|q]uit & discard changes, [L|l]ist details:\n</details>\n\n# Our story\n\n**TL;DR** Packj started as a PhD research project. It is backed by various government grants.\n\n<details>\n\t<summary><h4>Show long answer</h4></summary>\n\nPackj started as an academic research project. 
Specifically, the static code analysis techniques used by Packj are based on cutting-edge cybersecurity research: the [MalOSS](https://packj.dev/go?next=https://github.com/osssanitizer/maloss) project by our research [group](https://packj.dev/go?next=http://cyfi.ece.gatech.edu) at Georgia Tech.\n\n<a href=\"https://packj.dev/go?next=https://arxiv.org/pdf/2002.01139v1.pdf\" target=\"_blank\">\n\t<img src=\"https://drive.google.com/uc?export=view&id=1L03-kFTdNDFvGLWt_zJ-Qe8PPX75ICqo\" width=\"300\" alt=\"academic paper\">\n</a>\n\nPackj is backed by generous grants from [NSF](https://www.sbir.gov/node/2083473), [GRA](https://gra.org/company/227/OSSPolice.html), and [ALInnovate](https://innovatealabama.org).\n\n</details>\n\n# Why Packj\n\n**TL;DR** State-of-the-art open-source vulnerability scanners assume **TRUSTED** code. Therefore, all of them **ONLY** scan for CVEs. Packj, however, not only scans for CVEs but also carries out deep code analysis to flag hidden malware and \"risky\" code behavior, such as spawning a shell, using SSH keys, and a mismatch between GitHub code and packaged code (provenance). Such risky behaviors and attributes do not qualify as vulnerabilities (CVEs), which is why none of the existing tools can flag them. \n\n<details>\n    <summary><h4>Show long answer</h4></summary>\n\nSecurity vulnerabilities (a.k.a. CVEs) are the result of accidental programming bugs (e.g., Log4J, HeartBleed). A typical example is a missing bounds check on user input, which makes the program vulnerable to buffer overflow attacks. Attackers need to develop an exploit to trigger such security vulnerabilities (e.g., a crafted TCP/IP packet in the case of HeartBleed, or a numerically high input to cause a buffer overflow). Such CVEs can be fixed by patching or upgrading to a newer version of the library (e.g., a newer version of Log4J fixes the CVE). \n\nIn contrast, malware is purposefully bad. 
Moreover, malware itself is an exploit and cannot be patched or fixed by upgrading to a newer version. For example, the [dependency confusion attack](https://packj.dev/go?next=https://medium.com/@alex.birsan/dependency-confusion-4a5d60fec610) was intentionally malicious; it did not exploit any accidental programming bug in the code. Similarly, the author of a popular package sabotaging their own code to [protest](https://packj.dev/go?next=https://en.wikipedia.org/wiki/Peacenotwar) against the war is very much intentional and does not exploit any CVEs. Typo-squatting is another attack vector that bad actors use to propagate malware in popular open-source package registries: it exploits [typos and inexperience of devs](https://packj.dev/go?next=https://discuss.python.org/t/improving-risks-and-consequences-against-typosquatting-on-pypi/5090), not accidental programming bugs or CVEs in the code.\n\nExisting scanners **DO NOT** detect malware or intentionally bad code because they assume that third-party open-source code is benign. As such, these tools simply scan the source code for open-source dependencies, compile a list of all dependencies being used, and look up each <dependency-NAME, dependency-VERSION> pair in a database (e.g., NVD) to report whether the source code uses any vulnerable package versions (e.g., a vulnerable version of Log4J, or a LibSSL version affected by HeartBleed).\n\nPackj uses static code analysis, runtime tracing (dynamic analysis), and metadata checks to audit the programmatic behavior of a package. Please read more at [Audit README](https://packj.dev/go?next=https://github.com/ossillate-inc/packj/blob/main/packj/audit/README.md#faq).\n</details>\n\n# Customization #\n\nPackj can be easily customized to your threat model to minimize noise. 
Simply add a [.packj.yaml](https://packj.dev/go?next=https://github.com/ossillate-inc/packj/blob/main/.packj.yaml) file to the top-level directory of your repo/project and reduce alert fatigue by commenting out unwanted attributes.\n\n# Malware found #\n\nUsing this tool, we found over 40 malicious packages on PyPI and over 20 on RubyGems. A number of them have been taken down. See the example below:\n\n<details>\n    <summary><h4>Show example malware</h4></summary>\n\n    $ python3 main.py audit pypi:krisqian\n\n    [+] Fetching 'krisqian' from pypi...OK [ver 0.0.7]\n    [+] Checking version...OK [256 days old]\n    [+] Checking release history...OK [7 version(s)]\n    [+] Checking release time gap...OK [1 days since last release]\n    [+] Checking author...OK [KrisWuQian@baidu.com]\n        [+] Checking email/domain validity...OK [KrisWuQian@baidu.com]\n    [+] Checking readme...ALERT [no readme]\n    [+] Checking homepage...OK [https://www.bilibili.com/bangumi/media/md140632]\n    [+] Checking downloads...OK [13 weekly]\n    [+] Checking repo_url URL...OK [None]\n    [+] Checking for CVEs...OK [none found]\n    [+] Checking dependencies...OK [none found]\n    [+] Downloading package 'KrisQian' (ver 0.0.7) from pypi...OK [1.94 KB]\n    [+] Analyzing code...ALERT [needs 3 perms: process,network,file]\n    [+] Checking files/funcs...OK [9 files (2 .py), 6 funcs, LoC: 184]\n    =============================================\n    [+] 6 risk(s) found, package is undesirable!\n    {\n        \"undesirable\": [\n            \"no readme\",\n            \"only 45 weekly downloads\",\n            \"no source repo found\",\n            \"generates new code at runtime\",\n            \"fetches data over the network: ['KrisQian-0.0.7/setup.py:40', 'KrisQian-0.0.7/setup.py:50']\",\n            \"reads files and dirs: ['KrisQian-0.0.7/setup.py:59', 'KrisQian-0.0.7/setup.py:70']\"\n        ]\n    }\n    => Complete report: pypi-KrisQian-0.0.7.json\n    => View pre-vetted package 
report at https://packj.dev/package/PyPi/KrisQian/0.0.7\n</details>\n\n\nPackj flagged KrisQian (v0.0.7) as suspicious due to the absence of a source repo and the use of sensitive APIs (network, code generation) at package installation time (in setup.py). We took a deeper look and found the package to be malicious. Please find our detailed analysis at [https://packj.dev/malware/krisqian](https://packj.dev/go?next=https://packj.dev/malware/krisqian).\n\nMore examples of malware we found are listed at [https://packj.dev/malware](https://packj.dev/go?next=https://packj.dev/malware). Please reach out to us at [oss@ossillate.com](mailto:oss@ossillate.com) for the full list.\n\n# Resources #\n\nTo learn more about the Packj tool or open-source software supply chain attacks, refer to the following:\n\n[![PyConUS'22 Video](https://img.youtube.com/vi/Rcuqn56uCDk/hqdefault.jpg)](https://packj.dev/go?next=https://www.youtube.com/watch?v=Rcuqn56uCDk)\n[![OSSEU'22 Video](https://img.youtube.com/vi/a7BfDGeW_jY/hqdefault.jpg)](https://packj.dev/go?next=https://www.youtube.com/watch?v=a7BfDGeW_jY)\n\n- PyConUS'22 [talk](https://packj.dev/go?next=https://www.youtube.com/watch?v=Rcuqn56uCDk) and [slides](https://packj.dev/go?next=https://speakerdeck.com/ashishbijlani/pyconus22-slides).\n- BlackHat Asia'22 Arsenal [presentation](https://packj.dev/go?next=https://www.blackhat.com/asia-22/arsenal/schedule/#mitigating-open-source-software-supply-chain-attacks-26241)\n- PackagingCon'21 [talk](https://packj.dev/go?next=https://www.youtube.com/watch?v=PHfN-NrUCoo) and [slides](https://packj.dev/go?next=https://speakerdeck.com/ashishbijlani/mitigating-open-source-software-supply-chain-attacks)\n- BlackHat USA'22 Arsenal talk [Detecting typo-squatting, backdoored, abandoned, and other \"risky\" open-source packages using Packj](https://www.blackhat.com/us-22/arsenal/schedule/#detecting-typo-squatting-backdoored-abandoned-and-other-risky-open-source-packages-using-packj-28075)\n- Academic 
[dissertation](https://packj.dev/go?next=https://cyfi.ece.gatech.edu/publications/DUAN-DISSERTATION-2019.pdf) on open-source software security and the [paper](https://packj.dev/go?next=https://www.ndss-symposium.org/wp-content/uploads/ndss2021_1B-1_23055_paper.pdf) from our group at Georgia Tech that started this research.\n- Open Source Summit, Europe'22 talk [Scoring dependencies to detect \u201cweak links\u201d in your open-source software supply chain](https://packj.dev/go?next=https://osseu2022.sched.com/overview/type/SupplyChainSecurityCon) - presentation video on [YouTube](https://packj.dev/go?next=https://www.youtube.com/watch?v=a7BfDGeW_jY)\n- NullCon'22 talk [Unearthing Malicious And Other \u201cRisky\u201d Open-Source Packages Using Packj](https://packj.dev/go?next=https://nullcon.net/goa-2022/unearthing-malicious-and-other-risky-open-source-packages-using-packj)\n\n# Feature roadmap #\n\n* Add support for other language ecosystems. Rust is a work in progress, and will be available in December '22.\n* Add functionality to detect several other \"risky\" code as well as metadata attributes.\n\nHave a feature or support request? Please visit our [GitHub discussion page](https://packj.dev/go?next=https://github.com/ossillate-inc/packj/discussions/) or join our [discord community](https://discord.gg/qFcqaV2wYa) for discussion and requests.\n\n# Team #\n\nPackj has been developed by Cybersecurity researchers at [Ossillate Inc.](https://packj.dev/go?next=https://ossillate.com/team) and external collaborators to help developers mitigate risks of supply chain attacks when sourcing untrusted third-party open-source software dependencies. We thank our developers and collaborators.\n\nWe welcome code contributions with open arms. See [CONTRIBUTING.md](CONTRIBUTING.md) guidelines. Found a bug? Please open an issue. 
Refer to our [SECURITY.md](SECURITY.md) guidelines to report a security issue.\n\n# FAQ #\n\n<details>\n\t<summary><b>What Package Managers (Registries) are supported?</b></summary>\n\nPackj can currently vet NPM, PyPI, and RubyGems packages for \"risky\" attributes. We are adding support for Rust.\n\t\n</details>\n\n<details>\n\t<summary><b>What techniques does Packj employ to detect risky/malicious packages?</b></summary>\n\nPackj uses static code analysis, dynamic tracing, and metadata analysis for comprehensive auditing. Static analysis alone is not sufficient to flag sophisticated malware that hides itself using code obfuscation. Dynamic analysis is performed by installing the package under `strace` and monitoring its runtime behavior. Please read more at [Audit README](https://packj.dev/go?next=https://github.com/ossillate-inc/packj/blob/main/packj/audit/README.md).\n\t\n</details>\n\n<details>\n\t<summary><b>Does it work on obfuscated calls? For example, a Base64-encoded string that gets decoded and then passed to a shell?</b></summary>\n\nThis is a very common malicious behavior. Packj detects code obfuscation as well as spawning of shell commands (exec system call). For example, Packj can flag use of the `getattr()` and `eval()` APIs, as they indicate \"runtime code generation\"; a developer can then take a deeper look. See [main.py](https://packj.dev/go?next=https://github.com/ossillate-inc/packj/blob/main/packj/audit/main.py#L512) for details.\n\t\n</details>\n\n\n",
    "bugtrack_url": null,
    "license": "GNU AGPLv3",
    "summary": "Packj flags \"risky\" open-source packages in your software supply chain",
    "version": "0.15",
    "split_keywords": [
        "software supply chain",
        "malware",
        "typo-squatting",
        "vulnerability",
        "open-source software",
        "software composition analysis"
    ],
    "urls": [
        {
            "comment_text": "",
            "digests": {
                "blake2b_256": "cc7b97b29a95665af1a0ca293d67ce7b1920fe10392adc99bcafc2321f5f8627",
                "md5": "b38f6e9693c122a9e02e0afacc7c962b",
                "sha256": "c21f25a1dd1e1d673e141e60a9b5a2bb8e2c95314b12c3af23d7cbed80bd2987"
            },
            "downloads": -1,
            "filename": "packj-0.15.tar.gz",
            "has_sig": false,
            "md5_digest": "b38f6e9693c122a9e02e0afacc7c962b",
            "packagetype": "sdist",
            "python_version": "source",
            "requires_python": ">=3.4",
            "size": 219700,
            "upload_time": "2023-02-01T18:14:25",
            "upload_time_iso_8601": "2023-02-01T18:14:25.690654Z",
            "url": "https://files.pythonhosted.org/packages/cc/7b/97b29a95665af1a0ca293d67ce7b1920fe10392adc99bcafc2321f5f8627/packj-0.15.tar.gz",
            "yanked": false,
            "yanked_reason": null
        }
    ],
    "upload_time": "2023-02-01 18:14:25",
    "github": true,
    "gitlab": false,
    "bitbucket": false,
    "github_user": "ossillate-inc",
    "github_project": "packj",
    "travis_ci": false,
    "coveralls": false,
    "github_actions": true,
    "requirements": [],
    "lcname": "packj"
}
        