mnist-ae

Name	mnist-ae
Version	0.0.4
home_page	https://github.com/ofgarzon2662/mnist_ae
Summary	MNIST auto-encoder
upload_time	2025-08-05 00:35:40
maintainer	None
docs_url	None
author	Fernando Garzon
requires_python	>=3.9
license	Apache Software License 2.0
keywords	nbdev jupyter notebook python
VCS
bugtrack_url
requirements	No requirements were recorded.
Travis-CI	No Travis.
coveralls test coverage	No coveralls.

# mnist_ae – From Notebook to Python Package

This guide walks you **step-by-step** through turning the `CIML25_MNIST_Intro_v6.ipynb` notebook into a distributable Python package that you can install anywhere (even on TSCC).  It assumes you already know how to run a Jupyter notebook and that you have **Python ≥ 3.9** available, the minimum the package metadata declares (Python 3.11 recommended).

## 0  Clone the repository

```bash
git clone https://github.com/<your-username>/mnist_ae.git
cd mnist_ae
```

Feel free to fork the project first if you want your own remote.

---

## 1  Set up a clean Python environment

### Windows (PowerShell or cmd)
```powershell
# create & activate a virtual-env in the project root
python -m venv .venv
.\.venv\Scripts\Activate.ps1    # PowerShell
# in cmd.exe, activate with this instead:  .venv\Scripts\activate.bat
```

### macOS / Linux ( bash / zsh )
```bash
python3 -m venv .venv
source .venv/bin/activate
```

Upgrade pip & install build-time tools:
```bash
python -m pip install --upgrade pip nbdev build wheel twine
```

### Install project requirements (to run the notebook)
The notebook itself depends on **PyTorch** and **torchvision** (plus NumPy, etc.).  The easiest way is to use the pinned list that comes with the repo:

```bash
pip install -r requirements.txt      # installs CPU wheels by default
```

If you already have GPU-enabled PyTorch, feel free to skip this step or install only the libraries you are missing:

```bash
pip install torch torchvision
```
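
Before opening the notebook, it can help to check which of these libraries are already importable in the active environment. The following is a minimal sketch; the helper name `missing_packages` is made up for illustration and is not part of `mnist_ae`.

```python
from importlib.util import find_spec


def missing_packages(names):
    """Return the top-level package names that cannot be imported."""
    # find_spec returns None when no importable module of that name exists.
    return [name for name in names if find_spec(name) is None]


if __name__ == "__main__":
    missing = missing_packages(["torch", "torchvision", "numpy"])
    print("missing:", missing or "nothing")
```

Anything reported as missing can then be installed individually with `pip install <name>`.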

> 🗒️ **Why a venv?**  Keeping build tools isolated avoids polluting your base Python and makes the process reproducible.
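
If you are unsure whether the virtual environment is actually active, comparing the interpreter prefixes is a quick check. This is a small sketch; `in_virtualenv` is a hypothetical helper name.

```python
import sys


def in_virtualenv() -> bool:
    """Return True when the running interpreter belongs to a venv."""
    # Inside a venv, sys.prefix points at the environment directory while
    # sys.base_prefix still points at the base Python installation.
    return sys.prefix != sys.base_prefix


if __name__ == "__main__":
    print("venv active:", in_virtualenv())
    print("interpreter:", sys.executable)
```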

---

## 1½  Place the notebook in `nbs/`

If your starting file is `CIML25_MNIST_Intro_v6.ipynb`, move (or copy) it into the `nbs/` directory **and** rename it to the more compact `01_mnist_intro.ipynb` so nbdev can pick it up.

### Windows
```powershell
move CIML25_MNIST_Intro_v6.ipynb nbs\01_mnist_intro.ipynb
```

### macOS / Linux
```bash
mv CIML25_MNIST_Intro_v6.ipynb nbs/01_mnist_intro.ipynb
```

> nbdev scans all notebooks inside `nbs/`. The numeric prefix (`01_`, `02_`, …) also sets the order of the generated documentation.

## 2  Run & explore the notebook

```bash
jupyter notebook nbs/01_mnist_intro.ipynb
```

Execute a few cells to verify the model trains as expected (each epoch should take only a few seconds on CPU).

---

## 3  Export code with nbdev

nbdev turns specially-marked cells into a Python module.  The two directives you need to know are:

* `#| default_exp mnist_training` – appears once, tells nbdev *which module file* to create (`mnist_training.py`).
* `#| export` – placed on any cell whose code you want included in the library.

The **intro notebook already contains** these directives, so exporting is a one-liner:

```bash
nbdev_export            # generates mnist_ae/mnist_training.py
```
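
For orientation, this is roughly how the two directives look inside notebook cells. It is an illustrative sketch only: `flatten_image` is a made-up function, not code from the intro notebook, and the `# ---` lines stand in for cell boundaries.

```python
#| default_exp mnist_training

# --- a later cell: marked for export, so it ends up in the module ---
#| export
def flatten_image(pixels):
    """Flatten a 2-D list of pixel rows into one flat list."""
    return [value for row in pixels for value in row]

# --- an unmarked cell: stays in the notebook and is not exported ---
print(flatten_image([[0, 1], [2, 3]]))  # prints [0, 1, 2, 3]
```

To plain Python the directives are ordinary comments, so the cells run unchanged in Jupyter.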

(Optional) update metadata in `settings.ini` – package name, version, runtime requirements, author, etc.  nbdev will read this file when we build the wheel.

---

### 3½  Sync metadata & version (optional but recommended)

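Before building, open `settings.ini` and update the keys below. Keep comments on their own lines, because an inline `# ...` after a value may be read as part of the value:

```
# bump each release
version      = 0.0.2
# runtime deps only
requirements = torch torchvision
```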

Then run

```bash
nbdev_prepare      # export modules, run the notebooks as tests, clean notebooks, refresh the README
```
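
To double-check the values in `settings.ini` before building, you can parse the file with the standard library. The sketch below assumes the keys live in a `[DEFAULT]` section and uses a made-up excerpt; `read_release_info` is a hypothetical helper, and a real `settings.ini` contains many more keys.

```python
import configparser

# Hypothetical excerpt used for illustration only.
SAMPLE = """
[DEFAULT]
lib_name = mnist_ae
version = 0.0.2
requirements = torch torchvision
"""


def read_release_info(text):
    """Return (version, list of runtime requirements) from settings.ini text."""
    parser = configparser.ConfigParser(delimiters=["="])
    parser.read_string(text)
    defaults = parser["DEFAULT"]
    # Requirements are written as a space-separated list.
    return defaults["version"], defaults.get("requirements", "").split()


if __name__ == "__main__":
    version, requirements = read_release_info(SAMPLE)
    print(version, requirements)
```

To check the real file, pass in its contents, for example `read_release_info(open("settings.ini").read())`.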

### Inspect what nbdev generated
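`nbdev_prepare` chains `nbdev_export`, `nbdev_test`, `nbdev_clean`, and `nbdev_readme`: it regenerates the modules under `mnist_ae/`, runs the notebooks as tests, strips notebook metadata, and refreshes the README. **Open the `mnist_ae/` folder** and look at the newly-created or updated modules.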

**Recommendations:**
1. **Do *not* mark long training loops or plotting cells with `#| export`.**  Keep exploratory code in the notebook; only export reusable library functions and models. Exported cells become top-level module code, so a heavy loop would run every time someone imports the package and can waste GPU/CPU hours.
2. The exported file can be a single, monolithic script – notebooks aren’t always written with clean architecture in mind.  After export, audit the code yourself or ask a capable LLM (the author suggests o3 in ChatGPT, Gemini 2.5, or another reasoning model) and refactor it into small, SOLID-compliant modules.

Use this starter prompt to guide the refactor:
```text
You are a senior Python engineer. Rewrite the file `mnist_ae/mnist_training.py` so that:
• Each class/function has one clear responsibility (Single-Responsibility Principle).
• Related functionality is grouped into modules (e.g. data, model, training, cli).
• Internal helpers are made private (_prefix).
• No global execution at import-time; provide a `main()` entry point.
• Add type hints and docstrings.
Return the full, refactored code as a valid Python package structure.
```
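
As a rough picture of what "no global execution at import-time" means in practice, here is a minimal sketch of a `main()` entry point. The flag names `--epochs` and `--batch_size` match the commands used in this guide; the default values and the `train` placeholder are assumptions, not the package's actual code.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    """Create the command-line parser (defaults here are illustrative)."""
    parser = argparse.ArgumentParser(description="Train the MNIST auto-encoder.")
    parser.add_argument("--epochs", type=int, default=5)
    parser.add_argument("--batch_size", type=int, default=128)
    return parser


def train(epochs: int, batch_size: int) -> None:
    """Placeholder standing in for the real training loop."""
    print(f"would train for {epochs} epoch(s) with batch size {batch_size}")


def main(argv=None) -> None:
    """Parse arguments and start training; nothing runs on plain import."""
    args = build_parser().parse_args(argv)
    train(args.epochs, args.batch_size)


if __name__ == "__main__":
    main()
```

Because the work happens only under the `__main__` guard, importing the module stays cheap, while `python -m <package>.<module>` still launches training.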

**What is SOLID?**  It’s a set of five design guidelines for maintainable OO code:

* **S — Single Responsibility:** each module/class/function does one job.
* **O — Open/Closed:** code is open for extension but closed for modification.
* **L — Liskov Substitution:** derived classes can stand in for their base without breaking behaviour.
* **I — Interface Segregation:** prefer many small, specific interfaces over one large general-purpose interface.
* **D — Dependency Inversion:** depend on abstractions (interfaces), not concrete implementations.

Spend some time on this step; clean structure pays off later.
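
To make the Dependency Inversion idea concrete, the sketch below lets an evaluation function depend on a small interface rather than on a specific model class. All names (`Reconstructor`, `IdentityModel`, `mean_squared_error`) are hypothetical and chosen for illustration; the real package will look different.

```python
from typing import Iterable, Protocol, Sequence


class Reconstructor(Protocol):
    """Anything that can map an image to its reconstruction."""

    def reconstruct(self, image: Sequence[float]) -> Sequence[float]: ...


class IdentityModel:
    """Trivial stand-in model that returns its input unchanged."""

    def reconstruct(self, image: Sequence[float]) -> Sequence[float]:
        return list(image)


def mean_squared_error(model: Reconstructor, images: Iterable[Sequence[float]]) -> float:
    """Average squared pixel error of the model over all images."""
    total, count = 0.0, 0
    for image in images:
        output = model.reconstruct(image)
        total += sum((a - b) ** 2 for a, b in zip(image, output))
        count += len(image)
    return total / count if count else 0.0


if __name__ == "__main__":
    print(mean_squared_error(IdentityModel(), [[0.0, 0.5], [1.0, 0.25]]))
```

Any object with a matching `reconstruct` method can be passed in, so the evaluation code does not need to change when the model does.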

---

## 4  Build the wheel (binary package)

```bash
python -m build --wheel        # produces dist/mnist_ae-<version>-py3-none-any.whl
```

The file inside `dist/` is a **portable package** that can be installed with `pip install <file>.whl` on any machine that has Python ≥ the minimum you set.
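
The wheel's filename already says why it is portable: its dash-separated parts are the distribution name, version, an optional build tag, and the Python, ABI, and platform tags. The sketch below splits a filename into those parts; `parse_wheel_filename` is a hypothetical helper written for this guide.

```python
def parse_wheel_filename(filename: str) -> dict:
    """Split a wheel filename into its dash-separated components."""
    if not filename.endswith(".whl"):
        raise ValueError(f"not a wheel: {filename}")
    parts = filename[: -len(".whl")].split("-")
    # Five parts normally, six when an optional build tag is present.
    if len(parts) not in (5, 6):
        raise ValueError(f"unexpected wheel filename: {filename}")
    python_tag, abi_tag, platform_tag = parts[-3:]
    return {
        "name": parts[0],
        "version": parts[1],
        "python": python_tag,
        "abi": abi_tag,
        "platform": platform_tag,
    }


if __name__ == "__main__":
    print(parse_wheel_filename("mnist_ae-0.0.4-py3-none-any.whl"))
```

The `py3-none-any` tags indicate a pure-Python wheel with no compiled extensions, which is why the same file installs on Windows, macOS, and Linux.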

### 4½  Test the wheel locally

Install the freshly built wheel and do a short training run:

```bash
pip install --force-reinstall dist/mnist_ae-*.whl
python -m mnist_ae.mnist_training --epochs 1 --batch_size 128  # quick sanity run
```

### 4¾  Run unit tests from source
If you’re working from the cloned repo rather than the installed wheel, install the package in *editable* mode so Python can find it:

```bash
pip install -e ".[dev]"   # or just `pip install -e .` if you skipped dev extras
pytest --cov=mnist_ae -q  # run tests and show coverage % (needs pytest-cov)
```

If `mnist_ae` is not importable you’ll get a `ModuleNotFoundError`; the editable install (or adding the repo root to `PYTHONPATH`) solves that.
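
To confirm which version of the package actually ended up in the environment after an install, you can query the installed metadata. This is a small sketch using the standard library; `installed_version` is a hypothetical helper name.

```python
from importlib.metadata import PackageNotFoundError, version


def installed_version(dist_name: str):
    """Return the installed version string, or None if it is not installed."""
    try:
        return version(dist_name)
    except PackageNotFoundError:
        return None


if __name__ == "__main__":
    print("mnist_ae:", installed_version("mnist_ae") or "not installed")
```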

---

## 5  Publish to (Test)PyPI  
*(skip if you only need a local wheel)*

1. Create an account on [pypi.org](https://pypi.org) (and on [test.pypi.org](https://test.pypi.org) for dry-runs).
2. Generate an **API token**:  *Settings → API tokens → New token*.
3. Upload:

```bash
# one-time: store credentials safely or export as env-vars
export TWINE_USERNAME="__token__"
export TWINE_PASSWORD="pypi-********************************"

# upload to TestPyPI first (it uses its own account and API token)
python -m twine upload --repository testpypi dist/*

# if everything looks good, push to the real PyPI
python -m twine upload dist/*
```
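
After uploading, you may want to compare the files in `dist/` against the hashes shown on the project's download page. The sketch below computes SHA-256 digests locally; `sha256_of` is a hypothetical helper and the `dist/mnist_ae-*` pattern assumes the default build output location.

```python
import hashlib
from pathlib import Path


def sha256_of(path) -> str:
    """Return the hex SHA-256 digest of a file, read in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(65536), b""):
            digest.update(chunk)
    return digest.hexdigest()


if __name__ == "__main__":
    # Prints one line per built artifact (wheel and/or sdist), if any exist.
    for artifact in sorted(Path("dist").glob("mnist_ae-*")):
        print(sha256_of(artifact), artifact.name)
```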

Once published, anyone can install with
```bash
pip install mnist_ae      # replace with the final project name
```

---

## 6  Install & run on TSCC (or any HPC)

```bash
# inside a job script or interactive srun session
module load python3 cuda            # adjust to cluster versions
python -m venv ~/mnist_env && source ~/mnist_env/bin/activate


# now install your package from PyPI
pip install mnist_ae

# (alternative) install a local wheel; copy it to TSCC first, e.g. with scp
# pip install ~/dist/mnist_ae-<version>-py3-none-any.whl

# launch training
python -m mnist_ae.mnist_training --epochs 5 --batch_size 256
```

Check how long these 5 epochs take and compare with your local run. Do you spot a significant difference?
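
For a like-for-like comparison between the cluster and your own machine, it helps to time each epoch the same way in both places. The following sketch shows one way to do that; `time_epochs` and the `run_epoch` callable are hypothetical, standing in for whatever function runs a single training epoch in your code.

```python
import time


def time_epochs(run_epoch, epochs):
    """Call run_epoch() `epochs` times and return each duration in seconds."""
    durations = []
    for _ in range(epochs):
        start = time.perf_counter()
        run_epoch()
        durations.append(time.perf_counter() - start)
    return durations


if __name__ == "__main__":
    # Stand-in workload; replace the lambda with a real single-epoch function.
    timings = time_epochs(lambda: sum(range(100_000)), epochs=5)
    for index, seconds in enumerate(timings, start=1):
        print(f"epoch {index}: {seconds:.4f} s")
```

The first epoch is often slower than the rest because of data loading and warm-up, so compare the later epochs as well as the total.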

---

## Appendix – Common commands (Windows vs Unix)

| Task                     | Windows (PowerShell)                         | macOS / Linux (bash)            |
|--------------------------|----------------------------------------------|---------------------------------|
| Activate venv            | `.\.venv\Scripts\Activate.ps1`              | `source .venv/bin/activate`     |
| Deactivate venv          | `deactivate`                                 | `deactivate`                    |
| Upgrade pip              | `python -m pip install --upgrade pip`        | `pip install --upgrade pip`     |
| Run nbdev export         | `nbdev_export`                               | `nbdev_export`                  |
| Build wheel              | `python -m build --wheel`                    | `python -m build --wheel`       |
| Upload with twine        | `python -m twine upload dist/*`              | same                            |
| Install wheel            | `pip install (Get-Item dist\mnist_ae-*.whl).FullName` | `pip install dist/mnist_ae-*.whl` |

That’s it!  You’ve gone from a Jupyter notebook to a published, pip-installable Python package 🎉
