nbdump


| Field | Value |
|---|---|
| Name | nbdump |
| Version | 0.0.3 |
| Summary | Dump files to Jupyter notebook. |
| upload_time | 2023-12-07 04:48:37 |
| docs_url | None |
| requires_python | >=3.9 |
| license | MIT License Copyright (c) 2023 evanarlian Permission is hereby granted, free of charge, to any person obtaining a copy of this software and associated documentation files (the "Software"), to deal in the Software without restriction, including without limitation the rights to use, copy, modify, merge, publish, distribute, sublicense, and/or sell copies of the Software, and to permit persons to whom the Software is furnished to do so, subject to the following conditions: The above copyright notice and this permission notice shall be included in all copies or substantial portions of the Software. THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE SOFTWARE. |
| keywords | jupyter, notebook, kaggle |
| requirements | No requirements were recorded. |

# nbdump
Dump files into a Jupyter notebook. Restore them by running the notebook. Optionally append extra commands to run.

# Installation
```bash
# user
pip install -U nbdump

# development
pip install -e .
pip install -r tests/requirements.txt
pytest
```

# Usage
This demo uses `src_example/` as a stand-in for a repository that you want to import into a notebook.

## CLI
```bash
# see help
nbdump -h

# basic usage, this will dump entire `src_example/` to `nb1.ipynb`
nbdump src_example -o nb1.ipynb

# use shell expansion, this will come in handy later
nbdump src_example/**/*.py -o nb2.ipynb

# handle multiple files/dirs, will be deduplicated
nbdump src_example src_example/main.py -o nb3.ipynb

# append extra code cell, e.g. running the `src_example/main.py`
nbdump src_example -c '%run src_example/main.py' -o nb4.ipynb

# extra cells can be more than one
nbdump src_example \
    -c '%run src_example/main.py' \
    -c '!git status' \
    -o nb5.ipynb

# use fd to skip ignored files and hidden files
nbdump $(fd -t f . src_example) -o nb6.ipynb

# clone metadata from another notebook
nbdump src_example/**/*.py -o nb7.ipynb -m tests/kaggle/modified/modified-notebook.ipynb
```
There is a catch: `nbdump` does not respect gitignore, because its core functionality is simply converting a set of files into notebook cells. With the first example (`nb1.ipynb`), `nbdump` tries to convert every file under the directory recursively, regardless of file format. This becomes a problem when `src_example/` contains binary files such as pictures, or cache directories such as `__pycache__/*`.

Shell expansion can be used to select only the relevant files, as in the `nb2.ipynb` example (in bash, enable globstar with `shopt -s globstar` so that `**` recurses into subdirectories). Another solution is to use a tool like [fd](https://github.com/sharkdp/fd) to list the files while respecting gitignore and skipping hidden files automatically.
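
If neither globstar nor `fd` is available, the same filtering can be approximated in Python before handing the paths to `nbdump`. The following is a minimal sketch, not part of `nbdump` itself: `is_probably_text` and `select_text_files` are hypothetical helper names, and the "text file" check is only a heuristic (no NUL bytes and a UTF-8-decodable first chunk). It does not read `.gitignore`.

```python
import codecs
from pathlib import Path


def is_probably_text(path, sample_size=4096):
    """Heuristic: a file is text if its first bytes have no NUL and decode as UTF-8."""
    with Path(path).open("rb") as f:
        sample = f.read(sample_size)
    if b"\x00" in sample:
        return False
    try:
        # final=False tolerates a multi-byte character cut off at the sample boundary
        codecs.getincrementaldecoder("utf-8")().decode(sample, final=False)
    except UnicodeDecodeError:
        return False
    return True


def select_text_files(root):
    """Return text files under root, skipping hidden paths and __pycache__ (hypothetical helper)."""
    root = Path(root)
    selected = []
    for p in sorted(root.rglob("*")):
        rel_parts = p.relative_to(root).parts
        if not p.is_file():
            continue
        if "__pycache__" in rel_parts:
            continue
        if any(part.startswith(".") for part in rel_parts):
            continue
        if is_probably_text(p):
            selected.append(p)
    return selected
```

The resulting list can presumably be passed as the file list to `nbdump.dump`, in the same way as `target_files` in the Library section.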

## Library
```python
from pathlib import Path
import nbdump


target_files = list(Path("src_example").rglob("*.py"))
codes = ["!ls -lah", "!git log --oneline", "%run src_example/main.py"]
metadata_notebook = "tests/kaggle/modified/modified-notebook.ipynb"

# save to disk
with open("nb8.ipynb", "w") as f:
    nbdump.dump(f, target_files, codes, metadata_notebook)

# save as string
ipynb = nbdump.dumps(target_files, codes, metadata_notebook)
print(ipynb[:50])
```
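
To check what ended up in the generated notebook, the string form can be inspected with the standard library. This sketch assumes the string returned by `nbdump.dumps` is ordinary notebook JSON (nbformat v4, with a top-level `cells` list); `summarize_notebook` is a hypothetical helper, not part of the library.

```python
import json


def summarize_notebook(ipynb):
    """Return (cell_type, first source line) for each cell of a notebook JSON string."""
    nb = json.loads(ipynb)
    summary = []
    for cell in nb.get("cells", []):
        source = cell.get("source", "")
        # nbformat allows source to be a single string or a list of lines
        if isinstance(source, list):
            source = "".join(source)
        first_line = source.splitlines()[0] if source else ""
        summary.append((cell.get("cell_type", ""), first_line))
    return summary


# Usage, assuming `ipynb` is the string produced above:
# for cell_type, first_line in summarize_notebook(ipynb):
#     print(cell_type, first_line)
```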

# Why?
A Kaggle kernel in a *code competition* with internet access disabled cannot run `git clone` inside the notebook. `nbdump` lets you work in a standard environment and then export the final result as a single notebook that still preserves the filesystem tree.

This differs from simply zipping and unzipping: because each file is written through `%%writefile`, you can view and edit its contents inside the notebook, even after the notebook has been created.
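
The idea can be illustrated with a small standalone sketch. This is not `nbdump`'s actual implementation: `code_cell` and `build_notebook` are hypothetical names, and the notebook is built as a plain nbformat v4 dictionary. Each file becomes one code cell starting with `%%writefile <path>`, so running the notebook top to bottom recreates the files. As a precaution, the sketch adds a first cell that creates the parent directories, since `%%writefile` is not expected to create them.

```python
import json
from pathlib import PurePosixPath
from typing import Sequence


def code_cell(source: str) -> dict:
    """Build a minimal nbformat v4 code cell."""
    return {
        "cell_type": "code",
        "execution_count": None,
        "metadata": {},
        "outputs": [],
        "source": source,
    }


def build_notebook(files: dict, extra_codes: Sequence[str] = ()) -> dict:
    """Map {relative path: file content} to a notebook that rewrites those files."""
    cells = []
    # Create parent directories first (assumption: %%writefile will not create them)
    dirs = sorted({str(PurePosixPath(p).parent) for p in files} - {"."})
    if dirs:
        lines = ["import os"] + [f"os.makedirs({d!r}, exist_ok=True)" for d in dirs]
        cells.append(code_cell("\n".join(lines)))
    # One %%writefile cell per file keeps the content visible and editable
    for path, content in files.items():
        cells.append(code_cell(f"%%writefile {path}\n{content}"))
    # Extra cells, such as running the restored entry point
    cells.extend(code_cell(code) for code in extra_codes)
    return {"cells": cells, "metadata": {}, "nbformat": 4, "nbformat_minor": 4}


notebook = build_notebook(
    {"src_example/main.py": "print('hello')\n"},
    extra_codes=["%run src_example/main.py"],
)
notebook_json = json.dumps(notebook, indent=1)
```

Because every file lives in its own visible cell, a quick fix can be made directly in the notebook and takes effect the next time the cells are run, which is not possible with an opaque zip archive.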

            

Raw data

            {
    "_id": null,
    "home_page": "",
    "name": "nbdump",
    "maintainer": "",
    "docs_url": null,
    "requires_python": ">=3.9",
    "maintainer_email": "",
    "keywords": "jupyter,notebook,kaggle",
    "author": "",
    "author_email": "Evan Arlian <evanarlian2000@gmail.com>",
    "download_url": "https://files.pythonhosted.org/packages/26/74/4edc6a6de5447235facad758034acbb54a07f88bcd8de61ad628458a9113/nbdump-0.0.3.tar.gz",
    "platform": null,
    "bugtrack_url": null,
    "license": "MIT License  Copyright (c) 2023 evanarlian  Permission is hereby granted, free of charge, to any person obtaining a copy of this software and associated documentation files (the \"Software\"), to deal in the Software without restriction, including without limitation the rights to use, copy, modify, merge, publish, distribute, sublicense, and/or sell copies of the Software, and to permit persons to whom the Software is furnished to do so, subject to the following conditions:  The above copyright notice and this permission notice shall be included in all copies or substantial portions of the Software.  THE SOFTWARE IS PROVIDED \"AS IS\", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE SOFTWARE. ",
    "summary": "Dump files to Jupyter notebook.",
    "version": "0.0.3",
    "project_urls": {
        "Homepage": "https://github.com/evanarlian/nbdump"
    },
    "split_keywords": [
        "jupyter",
        "notebook",
        "kaggle"
    ],
    "urls": [
        {
            "comment_text": "",
            "digests": {
                "blake2b_256": "be07a1fad87fed92da3d4a794001909cbe8bea8c2fd2dad2c98587bf1dd94353",
                "md5": "482dc4dd5e9fc40eed79cbe45fbf803c",
                "sha256": "a6be5e11d68621d4971946acb236dc3759428ef74b67afba415f472ac98ee938"
            },
            "downloads": -1,
            "filename": "nbdump-0.0.3-py3-none-any.whl",
            "has_sig": false,
            "md5_digest": "482dc4dd5e9fc40eed79cbe45fbf803c",
            "packagetype": "bdist_wheel",
            "python_version": "py3",
            "requires_python": ">=3.9",
            "size": 6449,
            "upload_time": "2023-12-07T04:48:36",
            "upload_time_iso_8601": "2023-12-07T04:48:36.194538Z",
            "url": "https://files.pythonhosted.org/packages/be/07/a1fad87fed92da3d4a794001909cbe8bea8c2fd2dad2c98587bf1dd94353/nbdump-0.0.3-py3-none-any.whl",
            "yanked": false,
            "yanked_reason": null
        },
        {
            "comment_text": "",
            "digests": {
                "blake2b_256": "26744edc6a6de5447235facad758034acbb54a07f88bcd8de61ad628458a9113",
                "md5": "f53927e03af1cd7c61819e4253b6fc8b",
                "sha256": "29fe4fb6ea0038490cef09f62a1f3251b8c95d791779efb9e73ecb36cd68f911"
            },
            "downloads": -1,
            "filename": "nbdump-0.0.3.tar.gz",
            "has_sig": false,
            "md5_digest": "f53927e03af1cd7c61819e4253b6fc8b",
            "packagetype": "sdist",
            "python_version": "source",
            "requires_python": ">=3.9",
            "size": 5993,
            "upload_time": "2023-12-07T04:48:37",
            "upload_time_iso_8601": "2023-12-07T04:48:37.624420Z",
            "url": "https://files.pythonhosted.org/packages/26/74/4edc6a6de5447235facad758034acbb54a07f88bcd8de61ad628458a9113/nbdump-0.0.3.tar.gz",
            "yanked": false,
            "yanked_reason": null
        }
    ],
    "upload_time": "2023-12-07 04:48:37",
    "github": true,
    "gitlab": false,
    "bitbucket": false,
    "codeberg": false,
    "github_user": "evanarlian",
    "github_project": "nbdump",
    "travis_ci": false,
    "coveralls": false,
    "github_actions": false,
    "lcname": "nbdump"
}
        