nbrefactor


| Field | Value |
| --- | --- |
| Name | nbrefactor |
| Version | 0.1.2 |
| Home page | https://github.com/ThunderStruct/nbrefactor |
| Summary | An automation tool to refactor Jupyter Notebooks to Python modules, with code dependency analysis. |
| Upload time | 2024-09-15 19:09:23 |
| Maintainer | None |
| Docs URL | None |
| Author | Mohamed Shahawy |
| Requires Python | >=3.7 |
| License | None |
| Keywords | jupyter, notebook, refactor, python, cli |
| Requirements | attrs, fastjsonschema, graphviz, importlib-metadata, importlib-resources, jsonschema, jupyter_core, nbformat, pkgutil_resolve_name, pyrsistent, tqdm, traitlets, typing_extensions, zipp |
| Travis CI | None |
| Coveralls test coverage | None |

            <p align="center">
    <img src="https://i.imgur.com/ukBP39X.png" alt="nbrefactor Logo" width="420">
</p>

<br />

<div align="center">

<a href="https://github.com/ThunderStruct/nbrefactor">![Platform](https://img.shields.io/badge/python-v3.7-green)</a>
<a href="https://pypi.org/project/nbrefactor/">![pypi](https://img.shields.io/badge/pypi%20package-0.1.2-lightgrey.svg)</a>
<a href="https://github.com/ThunderStruct/nbrefactor/blob/master/LICENSE">![License](https://img.shields.io/badge/license-MIT-orange)</a>
<a href="https://nbrefactor.readthedocs.io/en/latest/">![Read the Docs](https://readthedocs.org/projects/nbrefactor/badge/?version=latest)</a>

</div>

<p align="center">
An automation tool to refactor Jupyter Notebooks to Python packages and modules.
</p>

---

# Overview (The "What")

**nbrefactor** refactors Jupyter Notebooks into structured Python packages and modules. Using Markdown headers and/or custom commands in a notebook's Markdown/text cells, nbrefactor automatically creates a hierarchical module structure that reflects the notebook's content.

# Motivation (The "Why")

With the growing dependence on cloud-based IPython platforms (primarily [Google Colab](https://colab.research.google.com/)), developing projects directly in the browser has become more common. Having suffered through the pain of manually refactoring entire projects from Jupyter Notebooks into Python packages/modules to facilitate PyPI publication (and proper source control), we developed this tool to automate the refactoring process.

# Approach (The "How")

This project does _not_ just create a hierarchy based on the level of Markdown headers (how many `#` there are); this is just a single step in the refactoring process.

Since we are generating modules that potentially depend on context from previous cells in the notebook, dependency analysis is required. Furthermore, we also need to track the generated modules and all globally accessible identifiers throughout the notebook to generate relative import statements as needed.

For instance, if a class is refactored to a generated module `./package/sub_package/module.py`, both the definition and the module path need to be tracked so that the class can be imported relatively wherever it appears in later cells or modules. _Scope-Awareness_ and _Identifier-Shadowing_ posed a challenge as well, and are also handled in the dependency analysis phase of the refactoring.
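
To make the relative-import idea concrete, here is a minimal sketch of how an import statement between two generated modules could be derived from their paths. The function name `relative_import` and its signature are hypothetical and are not part of nbrefactor's API; the sketch also assumes both paths are given relative to the output root, which is itself treated as a package.

```python
from pathlib import PurePosixPath


def relative_import(current_module: str, target_module: str, name: str) -> str:
    """Build a `from ... import name` statement from one generated module to another.

    Both paths are POSIX-style and relative to the output root,
    e.g. "package/sub_package/module.py". (Hypothetical helper.)
    """
    current = PurePosixPath(current_module).with_suffix("").parts
    target = PurePosixPath(target_module).with_suffix("").parts

    # Packages containing each module (drop the module's own name).
    current_pkg, target_pkg = current[:-1], target[:-1]

    # Length of the package prefix shared by both modules.
    common = 0
    for a, b in zip(current_pkg, target_pkg):
        if a != b:
            break
        common += 1

    # One leading dot means "the current package"; each extra dot climbs one level.
    dots = "." * (len(current_pkg) - common + 1)
    remainder = ".".join(target_pkg[common:] + (target[-1],))
    return f"from {dots}{remainder} import {name}"


print(relative_import("package/other/consumer.py",
                      "package/sub_package/module.py",
                      "MyClass"))
```

For a consumer in `package/other/`, the class defined in `package/sub_package/module.py` is reached by climbing one package level (`..`) and descending into `sub_package.module`.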

## Module Hierarchy Generation

Convert markdown headers in notebooks into a corresponding folder and file structure.

![refactoring_examples](https://i.imgur.com/bBgHJay.png)
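
As a rough illustration of the header-to-hierarchy step, the sketch below maps Markdown header lines to nested paths using a stack of open headers. The helper names (`slugify`, `header_paths`) and the slug rules are assumptions made for this example; nbrefactor's actual naming rules and custom Markdown commands are described in its documentation.

```python
import re
from typing import List, Tuple

HEADER_RE = re.compile(r"^(#{1,6})\s+(.*\S)\s*$")


def slugify(title: str) -> str:
    """Turn a header title into a lowercase, identifier-like name (assumed rule)."""
    slug = re.sub(r"\W+", "_", title.strip().lower()).strip("_")
    return slug or "untitled"


def header_paths(markdown_lines: List[str]) -> List[str]:
    """Map each Markdown header to a slash-separated path reflecting its nesting."""
    stack: List[Tuple[int, str]] = []  # (header level, slug) of the open ancestors
    paths: List[str] = []
    for line in markdown_lines:
        match = HEADER_RE.match(line)
        if not match:
            continue  # not a header line
        level = len(match.group(1))
        # Close every open header at the same or a deeper level.
        while stack and stack[-1][0] >= level:
            stack.pop()
        stack.append((level, slugify(match.group(2))))
        paths.append("/".join(slug for _, slug in stack))
    return paths


print(header_paths(["# Data", "## Loaders", "some text", "## Transforms", "# Models"]))
```

Each header closes any previously open header of equal or greater depth, so siblings share a parent path while deeper headers nest beneath it.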

## Code Dependency Analyzer (CDA)

The core of **nbrefactor**'s functionality lies in the Code Dependency Analyzer (CDA). The CDA is responsible for parsing code cells, tracking declared definitions, and analyzing dependencies across the generated modules. This module tackles challenges that were raised during the inception of the refactoring-automation process (primarily handling relative imports dynamically as we generate the modules, identifier shadowing, and non-redundant dependency injection).

1. **IPython Magic Command Removal**: clean the source code by omitting IPython magic commands (to ensure that the code can be parsed by Python's `ast`).
2. **AST Parsing**: parse the sanitized code into an Abstract Syntax Tree.
3. **Import Statement Stripping**: extract and strip import statements from the parsed code, and add them to a global (across all cells) tracker.
4. **Global Definition Tracking**: track all encountered definitions (declared functions and classes) globally. This inherently handles identifier shadowing.
5. **Dependency Analysis**: analyze identifier usages in a given code block.
6. **Dynamic Relative Import Resolution**: resolve local import statements dynamically depending on the current and target modules' positions in the tree.
7. **Dependency Generation and Resolution**: generate the respective import statements (based on the analysis in steps 5 and 6) to be injected during the file-writing phase.
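
The steps above can be sketched with Python's standard `ast` module. This is a simplified illustration, not nbrefactor's implementation: the function names are hypothetical, the magic-command filter is deliberately naive (it drops any line starting with `%` or `!`), only top-level definitions are tracked, and `ast.unparse` requires Python 3.9 or later.

```python
import ast
from typing import Dict, List, Set, Tuple


def strip_magics(source: str) -> str:
    """Drop IPython magics (%...) and shell escapes (!...) so `ast` can parse the cell."""
    kept = [ln for ln in source.splitlines() if not ln.lstrip().startswith(("%", "!"))]
    return "\n".join(kept)


def analyze_cell(source: str) -> Tuple[List[str], Set[str], Set[str]]:
    """Return (top-level imports, top-level definitions, loaded names minus those definitions)."""
    tree = ast.parse(strip_magics(source))
    imports: List[str] = []
    defined: Set[str] = set()
    used: Set[str] = set()
    for node in tree.body:
        if isinstance(node, (ast.Import, ast.ImportFrom)):
            imports.append(ast.unparse(node))
        elif isinstance(node, (ast.FunctionDef, ast.AsyncFunctionDef, ast.ClassDef)):
            defined.add(node.name)
    for node in ast.walk(tree):
        if isinstance(node, ast.Name) and isinstance(node.ctx, ast.Load):
            used.add(node.id)
    return imports, defined, used - defined


def resolve_dependencies(cells: List[Tuple[str, str]]) -> Dict[str, Dict[str, str]]:
    """For each module, map every externally defined name it uses to the defining module.

    `cells` is a list of (module_path, source) pairs in notebook order.
    """
    definitions: Dict[str, str] = {}  # name -> module that most recently defined it
    needed: Dict[str, Dict[str, str]] = {}
    for module_path, source in cells:
        _, defined, used = analyze_cell(source)
        deps = needed.setdefault(module_path, {})
        for name in used:
            owner = definitions.get(name)
            if owner is not None and owner != module_path:
                deps[name] = owner
        # Later definitions overwrite earlier ones, which models identifier shadowing.
        for name in defined:
            definitions[name] = module_path
    return needed


cells = [
    ("pkg/shapes.py", "%matplotlib inline\nimport math\nclass Circle:\n    pass\n"),
    ("pkg/app.py", "c = Circle()\n"),
]
print(resolve_dependencies(cells))
```

Because the definition tracker is updated in notebook order, a name redefined in a later cell resolves to its most recent module, which is one simple way to account for shadowing.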

# Installation

## PyPI (recommended)

The Python package is hosted on the [Python Package Index (PyPI)](https://pypi.org/project/nbrefactor/).

The latest published version of **nbrefactor** can be installed using:

```bash
pip install nbrefactor
```

## Manual Installation

Simply clone the repo and extract the files in the `nbrefactor` folder,
then run:

```bash
pip install -r requirements.txt
pip install -e .
```

Or use one of the scripts below:

### GIT

- `cd` into your project directory
- Use `sparse-checkout` to pull the library files only into your project directory
  ```bash
  git init nbrefactor
  cd nbrefactor
  git remote add -f origin https://github.com/ThunderStruct/nbrefactor.git
  git config core.sparseCheckout true
  echo "nbrefactor/*" >> .git/info/sparse-checkout
  git pull --depth=1 origin master
  pip install -r requirements.txt
  pip install -e .
  ```

### SVN

- `cd` into your project directory
- `checkout` the library files
  ```bash
  svn checkout https://github.com/ThunderStruct/nbrefactor/trunk/nbrefactor
  pip install -r requirements.txt
  pip install -e .
  ```

# Usage

Refer to the [documentation](https://nbrefactor.readthedocs.io/en/latest/) for the complete command reference. Basic usage is shown below.

## Command Line Interface

`nbrefactor` provides a CLI to easily refactor notebooks into a structured project hierarchy.

### Basic CLI Usage

To use the CLI, run the following command:

```bash
jupyter nbrefactor <notebook_path> <output_path> [OPTIONS]
```

- `<notebook_path>`: Path to the Jupyter notebook file you want to refactor.
- `<output_path>`: Directory where the refactored Python modules will be saved.
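
If you prefer to drive the CLI from a Python script (for example, in a build step), a small wrapper along these lines may be convenient. It uses only the two positional arguments documented above; the helper names `build_command` and `refactor`, and the `.ipynb` suffix check, are assumptions added for this sketch rather than part of nbrefactor.

```python
import subprocess
from pathlib import Path
from typing import List


def build_command(notebook_path: str, output_path: str) -> List[str]:
    """Assemble the documented `jupyter nbrefactor` invocation (no options)."""
    notebook = Path(notebook_path)
    # Sanity check added for this sketch; the CLI may perform its own validation.
    if notebook.suffix != ".ipynb":
        raise ValueError(f"expected a .ipynb file, got: {notebook_path}")
    return ["jupyter", "nbrefactor", str(notebook), str(Path(output_path))]


def refactor(notebook_path: str, output_path: str) -> None:
    """Run the CLI and raise CalledProcessError on a non-zero exit status."""
    subprocess.run(build_command(notebook_path, output_path), check=True)


print(build_command("demo.ipynb", "out"))
```

Calling `refactor("demo.ipynb", "out")` requires **nbrefactor** to be installed in the active environment so that the `jupyter nbrefactor` subcommand is available.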


## Demo

There are several example notebooks provided to showcase **nbrefactor**'s capabilities.

- [_Primary Demo Notebook_](src/demo/examples/sample_primary_demo.ipynb): this notebook contains several examples of the core nbrefactor features, including all Markdown commands.
- [_CS231n Notebook_](src/demo/examples/sample_CS231n_colab.ipynb): the official CS231n Colab notebook.
- [_HiveNAS Notebook_](src/demo/examples/sample_HiveNAS.ipynb): a larger project with a more complex folder structure.
- [_Markdown-only Notebook_](src/demo/examples/sample_markdown_only.ipynb): a Markdown-only notebook to illustrate the directory-refactoring abilities of nbrefactor.

### Interactive Demo

An interactive Notebook-based demo can be found [here](src/demo/demo.ipynb), which can be used to run the example projects discussed above.

# Change Log

Consult the [CHANGELOG](CHANGELOG.md) for the latest updates.

# Contributing

PRs are welcome (and encouraged)! If you'd like to contribute to **nbrefactor**, please read the [CONTRIBUTING](CONTRIBUTING.md) guidelines. 

The [TODO](TODO.md) list delineates some potential future implementations and improvements. 

## PR Submission

In addition to following the [contribution guidelines](CONTRIBUTING.md), please complete the steps below before submitting a PR:

- The [CHANGELOG](CHANGELOG.md) is updated according to the given structure
- The [README](README.md) and [TODO](TODO.md) are updated (if applicable)

# License

**nbrefactor** is licensed under the MIT License. See the [LICENSE](LICENSE) file for more details.

---

            

Raw data

            {
    "_id": null,
    "home_page": "https://github.com/ThunderStruct/nbrefactor",
    "name": "nbrefactor",
    "maintainer": null,
    "docs_url": null,
    "requires_python": ">=3.7",
    "maintainer_email": null,
    "keywords": "jupyter, notebook, refactor, python, cli",
    "author": "Mohamed Shahawy",
    "author_email": "envious-citizen.0s@icloud.com",
    "download_url": "https://files.pythonhosted.org/packages/26/6b/47187f12c84f67cbb6e8f85bb2baefc47893308689c989245202801d020f/nbrefactor-0.1.2.tar.gz",
    "platform": null,
    "bugtrack_url": null,
    "license": null,
    "summary": "An automation tool to refactor Jupyter Notebooks to Python modules, with code dependency analysis.",
    "version": "0.1.2",
    "project_urls": {
        "Bug Reports": "https://github.com/ThunderStruct/nbrefactor/issues",
        "Homepage": "https://github.com/ThunderStruct/nbrefactor",
        "Source": "https://github.com/ThunderStruct/nbrefactor"
    },
    "split_keywords": [
        "jupyter",
        " notebook",
        " refactor",
        " python",
        " cli"
    ],
    "urls": [
        {
            "comment_text": "",
            "digests": {
                "blake2b_256": "c1d4148d18f65741aa997cc005b2befced80f4041ca7f1248d3cbf8504853036",
                "md5": "8008366e40e2117536256071e686d481",
                "sha256": "9eade8460ea3720d22db31c8c148876d0d1d25001bbbaa81c435bda014408ce9"
            },
            "downloads": -1,
            "filename": "nbrefactor-0.1.2-py3-none-any.whl",
            "has_sig": false,
            "md5_digest": "8008366e40e2117536256071e686d481",
            "packagetype": "bdist_wheel",
            "python_version": "py3",
            "requires_python": ">=3.7",
            "size": 27775,
            "upload_time": "2024-09-15T19:09:21",
            "upload_time_iso_8601": "2024-09-15T19:09:21.757988Z",
            "url": "https://files.pythonhosted.org/packages/c1/d4/148d18f65741aa997cc005b2befced80f4041ca7f1248d3cbf8504853036/nbrefactor-0.1.2-py3-none-any.whl",
            "yanked": false,
            "yanked_reason": null
        },
        {
            "comment_text": "",
            "digests": {
                "blake2b_256": "266b47187f12c84f67cbb6e8f85bb2baefc47893308689c989245202801d020f",
                "md5": "ab617660e4ff78f9dd362632311288e1",
                "sha256": "b73eb3a3fbd550eb154669acf92f331de9f70dc565f129c07dd460d1224d6302"
            },
            "downloads": -1,
            "filename": "nbrefactor-0.1.2.tar.gz",
            "has_sig": false,
            "md5_digest": "ab617660e4ff78f9dd362632311288e1",
            "packagetype": "sdist",
            "python_version": "source",
            "requires_python": ">=3.7",
            "size": 25900,
            "upload_time": "2024-09-15T19:09:23",
            "upload_time_iso_8601": "2024-09-15T19:09:23.298253Z",
            "url": "https://files.pythonhosted.org/packages/26/6b/47187f12c84f67cbb6e8f85bb2baefc47893308689c989245202801d020f/nbrefactor-0.1.2.tar.gz",
            "yanked": false,
            "yanked_reason": null
        }
    ],
    "upload_time": "2024-09-15 19:09:23",
    "github": true,
    "gitlab": false,
    "bitbucket": false,
    "codeberg": false,
    "github_user": "ThunderStruct",
    "github_project": "nbrefactor",
    "travis_ci": false,
    "coveralls": false,
    "github_actions": true,
    "requirements": [
        {
            "name": "attrs",
            "specs": [
                [
                    "==",
                    "24.2.0"
                ]
            ]
        },
        {
            "name": "fastjsonschema",
            "specs": [
                [
                    "==",
                    "2.20.0"
                ]
            ]
        },
        {
            "name": "graphviz",
            "specs": [
                [
                    "==",
                    "0.20.1"
                ]
            ]
        },
        {
            "name": "importlib-metadata",
            "specs": [
                [
                    "==",
                    "6.7.0"
                ]
            ]
        },
        {
            "name": "importlib-resources",
            "specs": [
                [
                    "==",
                    "5.12.0"
                ]
            ]
        },
        {
            "name": "jsonschema",
            "specs": [
                [
                    "==",
                    "4.17.3"
                ]
            ]
        },
        {
            "name": "jupyter_core",
            "specs": [
                [
                    "==",
                    "4.12.0"
                ]
            ]
        },
        {
            "name": "nbformat",
            "specs": [
                [
                    "==",
                    "5.8.0"
                ]
            ]
        },
        {
            "name": "pkgutil_resolve_name",
            "specs": [
                [
                    "==",
                    "1.3.10"
                ]
            ]
        },
        {
            "name": "pyrsistent",
            "specs": [
                [
                    "==",
                    "0.19.3"
                ]
            ]
        },
        {
            "name": "tqdm",
            "specs": [
                [
                    "==",
                    "4.66.5"
                ]
            ]
        },
        {
            "name": "traitlets",
            "specs": [
                [
                    "==",
                    "5.9.0"
                ]
            ]
        },
        {
            "name": "typing_extensions",
            "specs": [
                [
                    "==",
                    "4.7.1"
                ]
            ]
        },
        {
            "name": "zipp",
            "specs": [
                [
                    "==",
                    "3.15.0"
                ]
            ]
        }
    ],
    "lcname": "nbrefactor"
}
        