papermill


- Name: papermill
- Version: 2.5.0
- Home page: https://github.com/nteract/papermill
- Summary: Parameterize and run Jupyter and nteract Notebooks
- Upload time: 2023-11-01 22:59:29
- Author: nteract contributors
- Requires Python: >=3.8
- License: BSD
- Keywords: jupyter, mapreduce, nteract, pipeline, notebook
- Requirements: No requirements were recorded.

# <a href="https://github.com/nteract/papermill"><img src="https://media.githubusercontent.com/media/nteract/logos/master/nteract_papermill/exports/images/png/papermill_logo_wide.png" height="48px" /></a>

<!---(binder links generated at https://mybinder.readthedocs.io/en/latest/howto/badges.html and compressed at https://tinyurl.com) -->

[![CI](https://github.com/nteract/papermill/actions/workflows/ci.yml/badge.svg)](https://github.com/nteract/papermill/actions/workflows/ci.yml)
[![CI](https://github.com/nteract/papermill/actions/workflows/ci.yml/badge.svg?branch=main)](https://github.com/nteract/papermill/actions/workflows/ci.yml)
[![image](https://codecov.io/github/nteract/papermill/coverage.svg?branch=main)](https://codecov.io/github/nteract/papermill?branch=main)
[![Documentation Status](https://readthedocs.org/projects/papermill/badge/?version=latest)](http://papermill.readthedocs.io/en/latest/?badge=latest)
[![badge](https://tinyurl.com/ybwovtw2)](https://mybinder.org/v2/gh/nteract/papermill/main?filepath=binder%2Fprocess_highlight_dates.ipynb)
[![badge](https://tinyurl.com/y7uz2eh9)](https://mybinder.org/v2/gh/nteract/papermill/main?)
[![PyPI - Python Version](https://img.shields.io/pypi/pyversions/papermill)](https://pypi.org/project/papermill/)
[![Code style: black](https://img.shields.io/badge/code%20style-black-000000.svg)](https://github.com/ambv/black)
[![papermill](https://snyk.io/advisor/python/papermill/badge.svg)](https://snyk.io/advisor/python/papermill)
[![Anaconda-Server Badge](https://anaconda.org/conda-forge/papermill/badges/downloads.svg)](https://anaconda.org/conda-forge/papermill)

**papermill** is a tool for parameterizing, executing, and analyzing
Jupyter Notebooks.

Papermill lets you:

- **parameterize** notebooks
- **execute** notebooks

This opens up new opportunities for how notebooks can be used. For
example:

- Perhaps you have a financial report that you wish to run with
  different values on the first or last day of a month, or at the
  beginning or end of the year. **Using parameters** makes this task
  easier.
- Do you want to run a notebook and, depending on its results, choose a
  particular notebook to run next? You can now programmatically
  **execute a workflow** without having to copy and paste from
  notebook to notebook manually.

Papermill takes an *opinionated* approach to notebook parameterization and
execution based on our experiences using notebooks at scale in data
pipelines.

## Installation

From the command line:

```{.sourceCode .bash}
pip install papermill
```

To install optional I/O dependencies, you can specify individual extras
such as `s3` or `azure`, or use `all` to install every one of them. To have Black format the injected parameters, add the `black` extra.

```{.sourceCode .bash}
pip install papermill[all]
```

## Python Version Support

This library currently supports Python 3.8 and later. As minor Python
versions are officially sunset by the Python org, papermill will similarly
drop support for them.

## Usage

### Parameterizing a Notebook

To parameterize your notebook, designate a cell with the tag `parameters`.

![enable parameters in Jupyter](docs/img/enable_parameters.gif)
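
For readers who prefer to see what the tag looks like on disk, here is a minimal sketch of how a tagged cell is stored in a notebook file, assuming the standard nbformat 4 layout in which cell tags live under `metadata.tags`. The cell contents, the `id` value, and the helper `tagged_cells` are hypothetical and only for illustration.

```python
import json

# A minimal nbformat-4 style notebook with one code cell that holds defaults.
# The cell id and the default values are made up for this illustration.
notebook = {
    "nbformat": 4,
    "nbformat_minor": 5,
    "metadata": {},
    "cells": [
        {
            "cell_type": "code",
            "execution_count": None,
            "id": "defaults",
            "metadata": {"tags": ["parameters"]},  # the tag described above
            "outputs": [],
            "source": "alpha = 0.1\nratio = 0.5",
        }
    ],
}


def tagged_cells(nb, tag):
    """Return every cell whose metadata carries the given tag."""
    return [c for c in nb["cells"] if tag in c.get("metadata", {}).get("tags", [])]


# Show the metadata of the tagged cell as it would appear in the .ipynb JSON.
print(json.dumps(tagged_cells(notebook, "parameters")[0]["metadata"]))
```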

Papermill looks for the `parameters` cell and treats its contents as defaults for the parameters passed in at execution time. Papermill adds a new cell tagged `injected-parameters`, containing the input parameters, so that they overwrite the values in `parameters`. If no cell is tagged with `parameters`, the injected cell is inserted at the top of the notebook.

Additionally, if you rerun a notebook through papermill, it will reuse the `injected-parameters` cell from the prior run. In this case, Papermill replaces the old `injected-parameters` cell with the new run's inputs.

![image](docs/img/parameters.png)
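
The effect of the injected cell can be sketched as plain Python, assuming a Python kernel. The default values below are hypothetical; the point is only that the injected assignments run after the defaults, so later cells see the injected values.

```python
# --- cell tagged `parameters`: defaults written by the notebook author ---
alpha = 0.1  # hypothetical default
ratio = 0.5  # hypothetical default

# --- cell tagged `injected-parameters`: added for this run ---
alpha = 0.6
ratio = 0.1

# --- rest of the notebook: sees the injected values, not the defaults ---
print(alpha, ratio)
```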

### Executing a Notebook

The two ways to execute the notebook with parameters are: (1) through
the Python API and (2) through the command line interface.

#### Execute via the Python API

```{.sourceCode .python}
import papermill as pm

pm.execute_notebook(
    'path/to/input.ipynb',   # notebook to run
    'path/to/output.ipynb',  # where the executed copy is written
    parameters=dict(alpha=0.6, ratio=0.1),  # values injected for this run
)
```
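
To run the same notebook with several parameter sets, the call above can be wrapped in a loop. The following is a sketch under stated assumptions: `plan_runs` and `execute_all` are hypothetical helpers, and the output naming scheme is made up. Only `pm.execute_notebook` with its `parameters` argument comes from the example above.

```python
from pathlib import PurePosixPath


def plan_runs(input_path, output_dir, parameter_sets):
    """Pair each parameter set with its own output path (hypothetical naming)."""
    stem = PurePosixPath(input_path).stem
    runs = []
    for index, params in enumerate(parameter_sets):
        output_path = str(PurePosixPath(output_dir) / f"{stem}_run{index}.ipynb")
        runs.append((input_path, output_path, params))
    return runs


def execute_all(runs):
    """Execute every planned run; requires papermill to be installed."""
    import papermill as pm  # imported lazily so planning works without it

    for input_path, output_path, params in runs:
        pm.execute_notebook(input_path, output_path, parameters=params)


runs = plan_runs(
    "path/to/input.ipynb",
    "path/to/out",
    [dict(alpha=0.6, ratio=0.1), dict(alpha=0.9, ratio=0.2)],
)
for run in runs:
    print(run)
# Call execute_all(runs) once the notebook paths exist.
```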

#### Execute via CLI

Here's an example of a local notebook being executed, with its output
written to an Amazon S3 bucket:

```{.sourceCode .bash}
$ papermill local/input.ipynb s3://bkt/output.ipynb -p alpha 0.6 -p l1_ratio 0.1
```

**NOTE:**
If you use multiple AWS accounts, and you have [properly configured your AWS credentials](https://boto3.amazonaws.com/v1/documentation/api/latest/guide/configuration.html), then you can specify which account to use by setting the `AWS_PROFILE` environment variable at the command line. For example:

```{.sourceCode .bash}
$ AWS_PROFILE=dev_account papermill local/input.ipynb s3://bkt/output.ipynb -p alpha 0.6 -p l1_ratio 0.1
```

In the above example, two parameters, `alpha` and `l1_ratio`, are set using `-p` (`--parameters` also works). Parameter values that look like booleans or numbers will be interpreted as such. Here are the other ways users may set parameters:

```{.sourceCode .bash}
$ papermill local/input.ipynb s3://bkt/output.ipynb -r version 1.0
```

Using `-r` or `--parameters_raw`, users can set parameters one by one. However, unlike `-p`, the parameter will remain a string, even if it could be interpreted as a number or boolean.
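
The difference between `-p` and `-r` can be illustrated with a small sketch. This is not papermill's own parsing code: `interpret_like_p` and `keep_raw` are hypothetical helpers, and the exact spellings recognized as booleans are an assumption.

```python
def interpret_like_p(value: str):
    """Rough illustration of `-p`: values that look like booleans or numbers are converted."""
    if value == "True":  # assumed spelling; papermill's accepted forms may differ
        return True
    if value == "False":
        return False
    for cast in (int, float):
        try:
            return cast(value)
        except ValueError:
            continue
    return value  # anything else stays a string


def keep_raw(value: str) -> str:
    """Rough illustration of `-r`: the value is passed through unchanged."""
    return value


print(repr(interpret_like_p("1.0")))  # a float
print(repr(keep_raw("1.0")))          # still a string
```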

```{.sourceCode .bash}
$ papermill local/input.ipynb s3://bkt/output.ipynb -f parameters.yaml
```

Using `-f` or `--parameters_file`, users can provide a YAML file from which parameter values should be read.
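
As an illustration, a parameters file for the `alpha` and `l1_ratio` values used earlier could be written from Python as follows. Writing to a temporary directory is a choice made for this sketch only; in the command above the file is simply `parameters.yaml`.

```python
import tempfile
from pathlib import Path

# YAML mapping of parameter names to values, matching the earlier examples.
yaml_text = "alpha: 0.6\nl1_ratio: 0.1\n"

# Write the file into a throwaway directory so the sketch has no side effects.
params_path = Path(tempfile.mkdtemp()) / "parameters.yaml"
params_path.write_text(yaml_text, encoding="utf-8")

print(params_path.read_text(encoding="utf-8"))
```

The resulting path is what would be passed to `-f`.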

```{.sourceCode .bash}
$ papermill local/input.ipynb s3://bkt/output.ipynb -y "
alpha: 0.6
l1_ratio: 0.1"
```

Using `-y` or `--parameters_yaml`, users can directly provide a YAML string containing parameter values.

```{.sourceCode .bash}
$ papermill local/input.ipynb s3://bkt/output.ipynb -b YWxwaGE6IDAuNgpsMV9yYXRpbzogMC4xCg==
```

Using `-b` or `--parameters_base64`, users can provide a YAML string, base64-encoded, containing parameter values.
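
The encoded string in the example above can be reproduced with Python's standard `base64` module. This short sketch encodes the same two-parameter YAML text (including its trailing newline):

```python
import base64

# The same YAML text as in the `-y` example, with a trailing newline.
yaml_text = "alpha: 0.6\nl1_ratio: 0.1\n"

encoded = base64.b64encode(yaml_text.encode("utf-8")).decode("ascii")
print(encoded)  # YWxwaGE6IDAuNgpsMV9yYXRpbzogMC4xCg==
```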

When passing parameters as YAML through `-y`, `-b`, or `-f`, parameter values can be arrays or dictionaries:

```{.sourceCode .bash}
$ papermill local/input.ipynb s3://bkt/output.ipynb -y "
x:
    - 0.0
    - 1.0
    - 2.0
    - 3.0
linear_function:
    slope: 3.0
    intercept: 1.0"
```
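
Inside a notebook run with a Python kernel, parameters like these would be available as an ordinary list and dictionary. The sketch below assumes that mapping and shows one way a later cell might use them; the computation of `y` is hypothetical.

```python
# Values as they would appear after injection (assuming a Python kernel).
x = [0.0, 1.0, 2.0, 3.0]
linear_function = {"slope": 3.0, "intercept": 1.0}

# A later cell can use them like any other Python objects.
y = [linear_function["slope"] * xi + linear_function["intercept"] for xi in x]
print(y)  # [1.0, 4.0, 7.0, 10.0]
```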

#### Supported Name Handlers

Papermill supports the following name handlers for input and output paths during execution:

- Local file system: `local`

- HTTP, HTTPS protocol: `http://`, `https://`

- Amazon Web Services: [AWS S3](https://aws.amazon.com/s3/) `s3://`

- Azure: [Azure DataLake Store](https://docs.microsoft.com/en-us/azure/data-lake-store/data-lake-store-overview), [Azure Blob Store](https://docs.microsoft.com/en-us/azure/storage/blobs/storage-blobs-overview) `adl://`, `abs://`

- Google Cloud: [Google Cloud Storage](https://cloud.google.com/storage/) `gs://`
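
The list above amounts to a lookup from path prefix to storage backend. The sketch below makes that explicit; `SCHEMES` and `classify_path` are hypothetical names built from the list, not papermill's own handler registry.

```python
# Hypothetical lookup table built from the list above.
SCHEMES = {
    "http://": "HTTP",
    "https://": "HTTPS",
    "s3://": "AWS S3",
    "adl://": "Azure DataLake Store",
    "abs://": "Azure Blob Store",
    "gs://": "Google Cloud Storage",
}


def classify_path(path: str) -> str:
    """Return the backend a path would map to; unprefixed paths are treated as local."""
    for prefix, backend in SCHEMES.items():
        if path.startswith(prefix):
            return backend
    return "Local file system"


print(classify_path("local/input.ipynb"))
print(classify_path("s3://bkt/output.ipynb"))
```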

## Development Guide

Read [CONTRIBUTING.md](./CONTRIBUTING.md) for guidelines on how to set up a local development environment and contribute code changes back to Papermill.

For development guidelines, see the [DEVELOPMENT_GUIDE.md](./DEVELOPMENT_GUIDE.md) file. It explains how to make particular additions to the code base.

## Documentation

We host the [Papermill documentation](http://papermill.readthedocs.io)
on ReadTheDocs.

            

Raw data

{
    "_id": null,
    "home_page": "https://github.com/nteract/papermill",
    "name": "papermill",
    "maintainer": "",
    "docs_url": null,
    "requires_python": ">=3.8",
    "maintainer_email": "",
    "keywords": "jupyter mapreduce nteract pipeline notebook",
    "author": "nteract contributors",
    "author_email": "nteract@googlegroups.com",
    "download_url": "https://files.pythonhosted.org/packages/45/f2/9a978a1791d031e20f1fceb6633fac939e019716f5d0b24d305de08df4c9/papermill-2.5.0.tar.gz",
    "platform": null,
    "bugtrack_url": null,
    "license": "BSD",
    "summary": "Parameterize and run Jupyter and nteract Notebooks",
    "version": "2.5.0",
    "project_urls": {
        "Documentation": "https://papermill.readthedocs.io",
        "Funding": "https://nteract.io",
        "Homepage": "https://github.com/nteract/papermill",
        "Source": "https://github.com/nteract/papermill/",
        "Tracker": "https://github.com/nteract/papermill/issues"
    },
    "split_keywords": [
        "jupyter",
        "mapreduce",
        "nteract",
        "pipeline",
        "notebook"
    ],
    "urls": [
        {
            "comment_text": "",
            "digests": {
                "blake2b_256": "efdfa81912b5c70852db6cce7f4f7671ecfa3ac457b59d20062b06ac6c89b938",
                "md5": "53ffc4a1d6b93abad18cfceaccd89494",
                "sha256": "c42303afb92e482a60ae1df2577be59a5b7a64c5cd52d37c74c7f74e36085708"
            },
            "downloads": -1,
            "filename": "papermill-2.5.0-py3-none-any.whl",
            "has_sig": false,
            "md5_digest": "53ffc4a1d6b93abad18cfceaccd89494",
            "packagetype": "bdist_wheel",
            "python_version": "py3",
            "requires_python": ">=3.8",
            "size": 38556,
            "upload_time": "2023-11-01T22:59:26",
            "upload_time_iso_8601": "2023-11-01T22:59:26.785452Z",
            "url": "https://files.pythonhosted.org/packages/ef/df/a81912b5c70852db6cce7f4f7671ecfa3ac457b59d20062b06ac6c89b938/papermill-2.5.0-py3-none-any.whl",
            "yanked": false,
            "yanked_reason": null
        },
        {
            "comment_text": "",
            "digests": {
                "blake2b_256": "45f29a978a1791d031e20f1fceb6633fac939e019716f5d0b24d305de08df4c9",
                "md5": "affb2fc1915e1d7bd8f03df43786f866",
                "sha256": "ea7b70c0553f56fe91b0fa9cc5e17012cd699320a8b015373e7870c5e6086c72"
            },
            "downloads": -1,
            "filename": "papermill-2.5.0.tar.gz",
            "has_sig": false,
            "md5_digest": "affb2fc1915e1d7bd8f03df43786f866",
            "packagetype": "sdist",
            "python_version": "source",
            "requires_python": ">=3.8",
            "size": 79118,
            "upload_time": "2023-11-01T22:59:29",
            "upload_time_iso_8601": "2023-11-01T22:59:29.047644Z",
            "url": "https://files.pythonhosted.org/packages/45/f2/9a978a1791d031e20f1fceb6633fac939e019716f5d0b24d305de08df4c9/papermill-2.5.0.tar.gz",
            "yanked": false,
            "yanked_reason": null
        }
    ],
    "upload_time": "2023-11-01 22:59:29",
    "github": true,
    "gitlab": false,
    "bitbucket": false,
    "codeberg": false,
    "github_user": "nteract",
    "github_project": "papermill",
    "travis_ci": false,
    "coveralls": false,
    "github_actions": true,
    "requirements": [],
    "tox": true,
    "lcname": "papermill"
}
        