airflow-remote-jupyter-notebook


- Name: airflow-remote-jupyter-notebook
- Version: 0.0.3
- Summary: Airflow plugin to execute Jupyter Notebook remotely
- Upload time: 2024-10-06 13:15:16
- Author: Marcelo Vinicius

# Airflow run Jupyter Notebook Remote

- [Airflow run Jupyter Notebook Remote](#airflow-run-jupyter-notebook-remote)
  - [What is it?](#what-is-it)
  - [Would you mind buying me a coffee?](#would-you-mind-buying-me-a-coffee)
  - [Dependencies](#dependencies)
  - [Installation](#installation)
    - [Via PyPI package](#via-pypi-package)
    - [Manually](#manually)
  - [Airflow plugin dependencies](#airflow-plugin-dependencies)
  - [Test dependencies](#test-dependencies)
  - [How to contribute](#how-to-contribute)
  - [Credits](#credits)
  - [Run remote Jupyter notebook using Airflow](#run-remote-jupyter-notebook-using-airflow)
  - [Plugin Usage](#plugin-usage)
  - [Run tests](#run-tests)

## What is it?

![architecture](https://raw.githubusercontent.com/marcelo225/airflow-remote-jupyter-notebook/main/architecture.png)

This plugin executes Jupyter notebooks remotely from within an Airflow DAG. With it, you can integrate and manage notebook workflows as part of your Airflow pipelines, so that data analysis or machine learning code is orchestrated and run automatically by the DAG scheduler.

The plugin uses the Jupyter API to communicate with a Jupyter server for operations such as starting a kernel, running notebook cells, and managing sessions. It uses HTTP requests for session and kernel management, and WebSocket connections for sending the code to execute inside the notebooks.
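
The split between HTTP and WebSocket traffic can be sketched with the general Jupyter Server conventions. This is an illustration only, not the plugin's actual code: the helper names `kernel_ws_url` and `execute_request` and the `airflow` username are assumptions made for this example, while the `/api/kernels/<id>/channels` endpoint and the `execute_request` message layout come from the Jupyter Server API and the Jupyter messaging protocol.

```python
import json
import uuid
from urllib.parse import quote, urlsplit, urlunsplit


def kernel_ws_url(jupyter_url: str, kernel_id: str, token: str) -> str:
    """Map an http(s) Jupyter base URL to the kernel's WebSocket channels URL."""
    parts = urlsplit(jupyter_url.rstrip("/"))
    scheme = "wss" if parts.scheme == "https" else "ws"
    path = f"{parts.path}/api/kernels/{kernel_id}/channels"
    query = f"token={quote(token, safe='')}"
    return urlunsplit((scheme, parts.netloc, path, query, ""))


def execute_request(code: str, session_id: str) -> str:
    """Build the JSON text of an execute_request message for the shell channel."""
    message = {
        "header": {
            "msg_id": uuid.uuid4().hex,
            "username": "airflow",  # hypothetical value, any string works
            "session": session_id,
            "msg_type": "execute_request",
            "version": "5.3",
        },
        "parent_header": {},
        "metadata": {},
        "content": {
            "code": code,
            "silent": False,
            "store_history": True,
            "user_expressions": {},
            "allow_stdin": False,
            "stop_on_error": True,
        },
        "channel": "shell",
    }
    return json.dumps(message)


if __name__ == "__main__":
    print(kernel_ws_url("http://localhost:8888", "my-kernel-id", "my-token"))
    print(execute_request("print('hello')", uuid.uuid4().hex))
```

In a real client, the kernel would first be started with an HTTP `POST` to `/api/kernels`, and the message text would then be sent over a WebSocket connection opened on the URL above.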

Package link: https://pypi.org/project/airflow-remote-jupyter-notebook/

## Would you mind buying me a coffee?

If you find this library helpful, consider buying me a coffee! Your support helps maintain and improve the project, allowing me to dedicate more time to developing new features, fixing bugs, and providing updates.

![coffee](https://raw.githubusercontent.com/marcelo225/airflow-remote-jupyter-notebook/main/qr_code.png)

## Dependencies

- [Python 3](https://www.python.org/)
- [Requests](https://pypi.org/project/requests/)
- [Websockets](https://pypi.org/project/websockets/)
- [Asyncio](https://pypi.org/project/asyncio/)

## Installation

### Via PyPI package

```bash
$ pip install airflow-remote-jupyter-notebook
```

### Manually

```bash
# start the Airflow and Jupyter Notebook containers with docker-compose
$ docker-compose up
```

## Airflow plugin dependencies

- See [requirements.txt](airflow/requirements.txt)

## Test dependencies

- [pytest](https://docs.pytest.org)

## How to contribute

Please report bugs and feature requests at
https://github.com/marcelo225/airflow-remote-jupyter-notebook/issues

## Credits

Lead Developer - Marcelo Vinicius

## Run remote Jupyter notebook using Airflow

```bash
# from the project root folder
$ docker-compose up
```

- Open [http://localhost:8080](http://localhost:8080) in your web browser to access Airflow
- Open [http://localhost:8888](http://localhost:8888) in your web browser to access Jupyter Notebook, if you need it
- Run `test_dag` in Airflow

## Plugin Usage

```python
from jupyter_plugin.plugin import JupyterDAG  # how to import this plugin
from airflow.models import Variable
import datetime

# Standard DAG arguments are passed alongside the three jupyter_* settings,
# which are read here from Airflow Variables.
with JupyterDAG(
    'test_dag',
    jupyter_url=Variable.get('jupyter_url'),
    jupyter_token=Variable.get('jupyter_token'),
    jupyter_base_path=Variable.get('jupyter_base_path'),
    max_active_runs=1,
    default_args={
        'owner': 'Marcelo Vinicius',
        'depends_on_past': False,
        'start_date': datetime.datetime(2021, 1, 1),
        'email_on_failure': False,
        'email_on_retry': False,
        'retries': 2
    },
    description='DAG test to run some remote Jupyter Notebook file.',
    schedule=2,
    catchup=False
) as dag:

    # Each task executes one notebook on the remote Jupyter server.
    test1 = dag.create_jupyter_remote_operator(task_id="test1", notebook_path="notebooks/test1.ipynb")
    test2 = dag.create_jupyter_remote_operator(task_id="test2", notebook_path="notebooks/test2.ipynb")
    test3 = dag.create_jupyter_remote_operator(task_id="test3", notebook_path="notebooks/test3.ipynb")

# Run the notebooks one after another.
test1 >> test2 >> test3
```

| **DAG Attributes**  | **Description**                                                         |
|---------------------|-------------------------------------------------------------------------|
| `jupyter_url`       | URL of the Jupyter server, including the `http://` or `https://` scheme |
| `jupyter_token`     | Jupyter authentication token                                            |
| `jupyter_base_path` | Base path where your Jupyter notebooks are stored                       |


| **Task Creation**                | **Explanation**                                                                                                     |
|----------------------------------|---------------------------------------------------------------------------------------------------------------------|
| `create_jupyter_remote_operator` | Method from the `JupyterDAG` class that creates a task to execute a specified Jupyter notebook on a remote server.  |
| `task_id`                        | A unique identifier for the task, used for tracking and logging within Airflow.                                     |
| `notebook_path`                  | Specifies the path to the Jupyter notebook to be executed, relative to the base path.                               |

## Run tests

To run the test suite inside the Airflow environment, use the following command.
It runs all tests located in the **/home/airflow/tests** directory inside the container:

```bash
$ docker-compose exec airflow pytest /home/airflow/tests
```

            
