nb-clean


| Field           | Value                                   |
| --------------- | --------------------------------------- |
| Name            | nb-clean                                |
| Version         | 3.2.0                                   |
| Home page       | https://github.com/srstevenson/nb-clean |
| Summary         | Clean Jupyter notebooks for versioning  |
| Upload time     | 2023-12-18 15:36:55                     |
| Author          | Scott Stevenson                         |
| Requires Python | `>=3.8,<4.0`                            |
| License         | ISC                                     |
| Keywords        | jupyter, notebook, clean, filter, git   |

<p align="center"><img src="images/nb-clean.png" width=300></p>

[![License](https://img.shields.io/github/license/srstevenson/nb-clean?label=License&color=blue)](https://github.com/srstevenson/nb-clean/blob/main/LICENSE)
[![GitHub release](https://img.shields.io/github/v/release/srstevenson/nb-clean?label=GitHub)](https://github.com/srstevenson/nb-clean)
[![PyPI version](https://img.shields.io/pypi/v/nb-clean?label=PyPI)](https://pypi.org/project/nb-clean/)
[![Python versions](https://img.shields.io/pypi/pyversions/nb-clean?label=Python)](https://pypi.org/project/nb-clean/)
[![CI status](https://github.com/srstevenson/nb-clean/workflows/CI/badge.svg)](https://github.com/srstevenson/nb-clean/actions)
[![Coverage](https://img.shields.io/codecov/c/gh/srstevenson/nb-clean?label=Coverage)](https://app.codecov.io/gh/srstevenson/nb-clean)

`nb-clean` cleans Jupyter notebooks of cell execution counts, metadata, outputs,
and (optionally) empty cells, preparing them for committing to version control.
It provides both a Git filter and pre-commit hook to automatically clean
notebooks before they're staged, and can also be used with other version control
systems, as a command line tool, and as a Python library. It can determine if a
notebook is clean or not, which can be used as a check in your continuous
integration pipelines.

> [!NOTE]
>
> `nb-clean` 2.0.0 introduced a new command line interface to make cleaning
> notebooks in place easier. If you upgrade from a previous release, you'll need
> to migrate to the new interface as described under
> [Migrating to `nb-clean` 2](#migrating-to-nb-clean-2).

## Installation

To install the latest release from [PyPI], use [pip]:

```bash
python3 -m pip install nb-clean
```

`nb-clean` can also be installed with [Conda]:

```bash
conda install -c conda-forge nb-clean
```

In Python projects using [Poetry] or [PDM] for dependency management, add
`nb-clean` as a development dependency with `poetry add --group dev nb-clean` or
`pdm add --dev nb-clean`. `nb-clean` requires Python 3.8 or later.

## Usage

### Checking

You can check if a notebook is clean with:

```bash
nb-clean check notebook.ipynb
```

or by passing the notebook contents on standard input:

```bash
nb-clean check < notebook.ipynb
```

To also check for empty cells, add the `-e`/`--remove-empty-cells` flag. To
ignore cell metadata, add the `-m`/`--preserve-cell-metadata` flag, optionally
with a selection of metadata fields to ignore. To ignore cell outputs, add the
`-o`/`--preserve-cell-outputs` flag. To ignore cell execution counts, add the
`-c`/`--preserve-execution-counts` flag. To ignore notebook metadata, such as
language version, add the `-n`/`--preserve-notebook-metadata` flag.

`nb-clean` will exit with status code 0 if the notebook is clean, and status
code 1 if it is not. `nb-clean` will also print details of cell execution
counts, metadata, outputs, and empty cells it finds.
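
For continuous integration, the exit status is all a wrapper script needs. The sketch below shows one way to use it from Python, assuming `nb-clean` is available on `PATH`. The helper names `build_check_command` and `find_dirty_notebooks` are illustrative and not part of `nb-clean`; the `run` parameter exists only so the command runner can be swapped out, for example in tests.

```python
from __future__ import annotations

import subprocess
from typing import Callable, Iterable, Sequence


def build_check_command(notebook: str, flags: Sequence[str] = ()) -> list[str]:
    """Build an `nb-clean check` command line for one notebook.

    The notebook path comes first so that options accepting several
    values (such as --preserve-cell-metadata) cannot swallow it.
    """
    return ["nb-clean", "check", notebook, *flags]


def find_dirty_notebooks(
    notebooks: Iterable[str],
    flags: Sequence[str] = (),
    run: Callable[..., subprocess.CompletedProcess] = subprocess.run,
) -> list[str]:
    """Return the notebooks for which `nb-clean check` exits non-zero."""
    dirty = []
    for notebook in notebooks:
        result = run(
            build_check_command(notebook, flags),
            capture_output=True,
            text=True,
        )
        # Status 0 means clean; any other status is treated as not clean here.
        if result.returncode != 0:
            dirty.append(notebook)
    return dirty
```

In a CI job, exiting with a non-zero status whenever `find_dirty_notebooks(...)` returns a non-empty list would fail the build on any dirty notebook.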

### Cleaning (interactive)

You can clean a Jupyter notebook with:

```bash
nb-clean clean notebook.ipynb
```

This cleans the notebook in place. You can also pass the notebook content on
standard input, in which case the cleaned notebook is written to standard
output:

```bash
nb-clean clean < original.ipynb > cleaned.ipynb
```

To also remove empty cells, add the `-e`/`--remove-empty-cells` flag. To
preserve cell metadata, add the `-m`/`--preserve-cell-metadata` flag, optionally
with a selection of metadata fields to preserve. To preserve cell outputs, add
the `-o`/`--preserve-cell-outputs` flag. To preserve cell execution counts, add
the `-c`/`--preserve-execution-counts` flag. To preserve notebook metadata, such
as language version, add the `-n`/`--preserve-notebook-metadata` flag.
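
To make concrete what cleaning removes, here is a minimal sketch that operates directly on notebook JSON using only the standard library. It is an illustration and not `nb-clean`'s implementation: the function name `strip_notebook` and its keyword arguments are hypothetical stand-ins for the flags above, the sample notebook is made up, and notebook-level metadata is left untouched for brevity.

```python
import copy
import json


def strip_notebook(
    notebook: dict,
    *,
    remove_empty_cells: bool = False,
    preserve_cell_metadata: bool = False,
    preserve_cell_outputs: bool = False,
    preserve_execution_counts: bool = False,
) -> dict:
    """Return a cleaned copy of a notebook given as a parsed JSON dict."""
    cleaned = copy.deepcopy(notebook)
    if remove_empty_cells:
        # Treat a cell as empty when its source has no non-whitespace text.
        cleaned["cells"] = [
            cell for cell in cleaned["cells"] if "".join(cell["source"]).strip()
        ]
    for cell in cleaned["cells"]:
        if not preserve_cell_metadata:
            cell["metadata"] = {}
        if cell["cell_type"] == "code":
            if not preserve_cell_outputs:
                cell["outputs"] = []
            if not preserve_execution_counts:
                cell["execution_count"] = None
    return cleaned


# A made-up two-cell notebook: one executed code cell, one empty markdown cell.
example = {
    "cells": [
        {
            "cell_type": "code",
            "execution_count": 3,
            "metadata": {"tags": ["demo"]},
            "outputs": [
                {"output_type": "stream", "name": "stdout", "text": ["hi\n"]}
            ],
            "source": ["print('hi')"],
        },
        {"cell_type": "markdown", "metadata": {}, "source": []},
    ],
    "metadata": {},
    "nbformat": 4,
    "nbformat_minor": 5,
}

print(json.dumps(strip_notebook(example, remove_empty_cells=True), indent=1))
```

The function works on a deep copy, so the input stays unchanged; this mirrors the standard input to standard output mode rather than in-place cleaning.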

### Cleaning (Git filter)

To add a filter to an existing Git repository to automatically clean notebooks
when they're staged, run the following from the working tree:

```bash
nb-clean add-filter
```

This will configure a filter to remove cell execution counts, metadata, and
outputs. To also remove empty cells, use:

```bash
nb-clean add-filter --remove-empty-cells
```

To preserve cell metadata, for example metadata required by tools such as
[papermill], use:

```bash
nb-clean add-filter --preserve-cell-metadata
```

To preserve only specific cell metadata, e.g., `tags` and `special`, use:

```bash
nb-clean add-filter --preserve-cell-metadata tags special
```

To preserve cell outputs, use:

```bash
nb-clean add-filter --preserve-cell-outputs
```

To preserve cell execution counts, use:

```bash
nb-clean add-filter --preserve-execution-counts
```

To preserve notebook metadata, such as language version, use:

```bash
nb-clean add-filter --preserve-notebook-metadata
```

`nb-clean` will configure a filter in the Git repository in which it is run, and
won't mutate your global or system Git configuration. To remove the filter, run:

```bash
nb-clean remove-filter
```

### Cleaning (pre-commit hook)

`nb-clean` can also be used as a [pre-commit] hook. You may prefer this to the
Git filter if your project already uses the pre-commit framework.

Note that the Git filter and pre-commit hook work differently, with different
effects on your working directory. The pre-commit hook operates on the notebook
on disk, cleaning the copy in your working directory. The Git filter cleans
notebooks as they are added to the index, leaving the copy in your working
directory dirty. This means cell outputs are still visible to you in your local
Jupyter instance when using the Git filter, but not when using the pre-commit
hook.

After installing [pre-commit], add the `nb-clean` hook by adding the following
snippet to `.pre-commit-config.yaml` in the root of your repository:

```yaml
repos:
  - repo: https://github.com/srstevenson/nb-clean
    rev: 3.2.0
    hooks:
      - id: nb-clean
```

You can pass additional arguments to `nb-clean` with an `args` array. The
following example shows how to preserve only two specific metadata fields. Note
that, in the example, the final item `--` in the arg list is mandatory. The
option `--preserve-cell-metadata` may take an arbitrary number of field
arguments, and the `--` argument is needed to separate them from notebook
filenames, which `pre-commit` will append to the list of arguments.

```yaml
repos:
  - repo: https://github.com/srstevenson/nb-clean
    rev: 3.2.0
    hooks:
      - id: nb-clean
        args:
          - --remove-empty-cells
          - --preserve-cell-metadata
          - tags
          - slideshow
          - --
```
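
The need for `--` can be reproduced with a small `argparse` parser. This is a sketch under the assumption that `nb-clean`'s command line behaves like an option taking zero or more values followed by positional filenames; the parser below is a stand-in written for illustration, not `nb-clean`'s own, and `analysis.ipynb` is a made-up filename.

```python
import argparse

# Stand-in parser: one multi-value option plus positional notebook filenames.
parser = argparse.ArgumentParser(prog="demo")
parser.add_argument("inputs", nargs="*", help="notebook filenames")
parser.add_argument("-e", "--remove-empty-cells", action="store_true")
parser.add_argument("-m", "--preserve-cell-metadata", nargs="*", default=None)

hook_args = ["--remove-empty-cells", "--preserve-cell-metadata", "tags", "slideshow"]

# Without the separator, the appended filename is read as a metadata field.
without_separator = parser.parse_args(hook_args + ["analysis.ipynb"])

# With the separator, the option's value list ends before the filename.
with_separator = parser.parse_args(hook_args + ["--", "analysis.ipynb"])

print(without_separator.preserve_cell_metadata, without_separator.inputs)
print(with_separator.preserve_cell_metadata, with_separator.inputs)
```

Only the second call leaves the filename in `inputs`, which is why the hook configuration ends its `args` list with `--`.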

Run `pre-commit install` to ensure the hook is installed, and
`pre-commit autoupdate` to update the hook to the latest release of `nb-clean`.

### Preserving all nbformat metadata

To ignore (when checking) or preserve (when cleaning) exactly the cell metadata
fields defined in the
[`nbformat` documentation](https://nbformat.readthedocs.io/en/latest/format_description.html#cell-metadata),
list them explicitly:
`--preserve-cell-metadata collapsed scrolled deletable editable format name tags jupyter execution`.
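
The effect of that field list on a single cell can be pictured as a dictionary filter. The sketch below is illustrative rather than `nb-clean`'s implementation; `NBFORMAT_CELL_FIELDS` and `keep_fields` are hypothetical names, and the sample metadata is made up.

```python
# The field names listed in the option above.
NBFORMAT_CELL_FIELDS = (
    "collapsed", "scrolled", "deletable", "editable", "format",
    "name", "tags", "jupyter", "execution",
)


def keep_fields(metadata: dict, fields=NBFORMAT_CELL_FIELDS) -> dict:
    """Keep only the listed top-level keys of a cell's metadata."""
    return {key: value for key, value in metadata.items() if key in fields}


cell_metadata = {
    "tags": ["parameters"],
    "collapsed": True,
    "custom_extension": {"widget_state": 1},  # made-up, non-nbformat key
}

print(keep_fields(cell_metadata))
# {'tags': ['parameters'], 'collapsed': True}
```

Keys outside the list, such as the made-up `custom_extension` entry, are dropped, while the listed fields pass through unchanged.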

### Migrating to `nb-clean` 2

The following table maps from the command line interface of `nb-clean` 1.6.0 to
that of `nb-clean` >=2.0.0.

| Description                             | `nb-clean` 1.6.0                                                    | `nb-clean` >=2.0.0                                          |
| --------------------------------------- | ------------------------------------------------------------------- | ----------------------------------------------------------- |
| Clean notebook                          | `nb-clean clean -i/--input notebook.ipynb \| sponge notebook.ipynb` | `nb-clean clean notebook.ipynb`                             |
| Clean notebook (remove empty cells)     | `nb-clean clean -i/--input notebook.ipynb -e/--remove-empty`        | `nb-clean clean notebook.ipynb -e/--remove-empty-cells`     |
| Clean notebook (preserve cell metadata) | `nb-clean clean -i/--input notebook.ipynb -m/--preserve-metadata`   | `nb-clean clean notebook.ipynb -m/--preserve-cell-metadata` |
| Check notebook                          | `nb-clean check -i/--input notebook.ipynb`                          | `nb-clean check notebook.ipynb`                             |
| Check notebook (check for empty cells)  | `nb-clean check -i/--input notebook.ipynb -e/--remove-empty`        | `nb-clean check notebook.ipynb -e/--remove-empty-cells`     |
| Check notebook (ignore cell metadata)   | `nb-clean check -i/--input notebook.ipynb -m/--preserve-metadata`   | `nb-clean check notebook.ipynb -m/--preserve-cell-metadata` |
| Add Git filter to clean notebooks       | `nb-clean configure-git`                                            | `nb-clean add-filter`                                       |
| Remove Git filter                       | `nb-clean unconfigure-git`                                          | `nb-clean remove-filter`                                    |

## Copyright

Copyright © [Scott Stevenson].

`nb-clean` is distributed under the terms of the [ISC license].

[conda]: https://docs.conda.io/
[isc license]: https://opensource.org/licenses/ISC
[papermill]: https://papermill.readthedocs.io/
[pdm]: https://pdm.fming.dev/
[pip]: https://pip.pypa.io/
[poetry]: https://python-poetry.org/
[pre-commit]: https://pre-commit.com/
[pypi]: https://pypi.org/project/nb-clean/
[scott stevenson]: https://scott.stevenson.io

            

Raw data

            {
    "_id": null,
    "home_page": "https://github.com/srstevenson/nb-clean",
    "name": "nb-clean",
    "maintainer": "",
    "docs_url": null,
    "requires_python": ">=3.8,<4.0",
    "maintainer_email": "",
    "keywords": "jupyter,notebook,clean,filter,git",
    "author": "Scott Stevenson",
    "author_email": "scott@stevenson.io",
    "download_url": "https://files.pythonhosted.org/packages/ad/f9/67c6b22e173b5985f96352467bd57461e1d5ca948c970b42ead64459a29a/nb_clean-3.2.0.tar.gz",
    "platform": null,
    "bugtrack_url": null,
    "license": "ISC",
    "summary": "Clean Jupyter notebooks for versioning",
    "version": "3.2.0",
    "project_urls": {
        "Homepage": "https://github.com/srstevenson/nb-clean",
        "Repository": "https://github.com/srstevenson/nb-clean"
    },
    "split_keywords": [
        "jupyter",
        "notebook",
        "clean",
        "filter",
        "git"
    ],
    "urls": [
        {
            "comment_text": "",
            "digests": {
                "blake2b_256": "a005690f96452dbbece5bc9e61d868b0cc650aa95fb5fd805c1967536f7c5428",
                "md5": "ac165e3c9af8621f317177a9d5e8a06a",
                "sha256": "60be514c0d5cb3b87bbbdb0e528823497c553e73b266840eb25d1600812b51a7"
            },
            "downloads": -1,
            "filename": "nb_clean-3.2.0-py3-none-any.whl",
            "has_sig": false,
            "md5_digest": "ac165e3c9af8621f317177a9d5e8a06a",
            "packagetype": "bdist_wheel",
            "python_version": "py3",
            "requires_python": ">=3.8,<4.0",
            "size": 20703,
            "upload_time": "2023-12-18T15:36:53",
            "upload_time_iso_8601": "2023-12-18T15:36:53.107372Z",
            "url": "https://files.pythonhosted.org/packages/a0/05/690f96452dbbece5bc9e61d868b0cc650aa95fb5fd805c1967536f7c5428/nb_clean-3.2.0-py3-none-any.whl",
            "yanked": false,
            "yanked_reason": null
        },
        {
            "comment_text": "",
            "digests": {
                "blake2b_256": "adf967c6b22e173b5985f96352467bd57461e1d5ca948c970b42ead64459a29a",
                "md5": "8a8822510c82ebc258416e090ed467f3",
                "sha256": "614d8bec635e35aac2992a1ed5c0dd3dbf1d4bfd41993851136e32b3e35960c5"
            },
            "downloads": -1,
            "filename": "nb_clean-3.2.0.tar.gz",
            "has_sig": false,
            "md5_digest": "8a8822510c82ebc258416e090ed467f3",
            "packagetype": "sdist",
            "python_version": "source",
            "requires_python": ">=3.8,<4.0",
            "size": 23315,
            "upload_time": "2023-12-18T15:36:55",
            "upload_time_iso_8601": "2023-12-18T15:36:55.171407Z",
            "url": "https://files.pythonhosted.org/packages/ad/f9/67c6b22e173b5985f96352467bd57461e1d5ca948c970b42ead64459a29a/nb_clean-3.2.0.tar.gz",
            "yanked": false,
            "yanked_reason": null
        }
    ],
    "upload_time": "2023-12-18 15:36:55",
    "github": true,
    "gitlab": false,
    "bitbucket": false,
    "codeberg": false,
    "github_user": "srstevenson",
    "github_project": "nb-clean",
    "travis_ci": false,
    "coveralls": false,
    "github_actions": true,
    "lcname": "nb-clean"
}
        