- **Name:** dmlx
- **Version:** 0.1.0
- **Summary:** Declarative machine learning experiments.
- **Upload time:** 2025-07-28 07:26:04
- **Author:** 3h
- **Requires Python:** >=3.10
- **Keywords:** argument, cli, command line interface, component, declarative,
    experiment, machine learning, management, option, parameter

# dmlx

> Declarative Machine Learning eXperiments

## Introduction

`dmlx` is a declarative framework for machine learning (ML) experiments.
Typically, ML codebases use the Python standard library module `argparse` to
parse parameters from the command line and then pass these parameters deep into
the models and other components. `dmlx` standardizes this process and provides a
framework for experiment declaration and basic management, with the following
main features:

- **Declarative Experiment Components:** Declarative interfaces are provided
    for defining reusable and reproducible experiment components and
    hyperparameters, such as model path, dataset getter and random seed.
- **`click`-powered Command Line Interface:**
    [`click`](https://click.palletsprojects.com/) is integrated to provide
    powerful command line functionalities, including parameter properties.
- **Automatic Parameter Collection:** Parameter properties will be wired with
    command line inputs and collected for experiment reproducibility.
- **Experiment Archive Management:** Archive directories will be automatically
    created to hold experiment data for further analysis.
- **ML Framework Independent:** `dmlx` is independent of ML frameworks, so you
    can use whatever ML framework you like (PyTorch/TensorFlow/scikit-learn/...).
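
For contrast, the hand-wired pattern that the introduction describes might look
like the minimal sketch below. It uses only the standard library; the `Model`
class and the `--alpha`, `--beta` and `--epochs` flags are illustrative
assumptions and are not part of `dmlx`.

```python
import argparse
from typing import List, Optional, Tuple


class Model:
    """Stand-in model; `alpha` and `beta` are illustrative hyperparameters."""

    def __init__(self, alpha: float, beta: float) -> None:
        self.alpha = alpha
        self.beta = beta


def build_parser() -> argparse.ArgumentParser:
    """Declare every command line parameter in one central place."""
    parser = argparse.ArgumentParser(description="Hand-wired experiment CLI")
    parser.add_argument("--alpha", type=float, default=0.1)
    parser.add_argument("--beta", type=float, default=0.9)
    parser.add_argument("-e", "--epochs", type=int, default=800)
    return parser


def main(argv: Optional[List[str]] = None) -> Tuple[Model, int]:
    args = build_parser().parse_args(argv)
    # Each hyperparameter is forwarded by hand to the component that needs
    # it; this plumbing grows with every new option or component.
    model = Model(alpha=args.alpha, beta=args.beta)
    return model, args.epochs


# Pass an explicit argument list so the sketch does not read sys.argv.
model, epochs = main(["--alpha", "0.5", "--epochs", "500"])
print(model.alpha, model.beta, epochs)
```

Here the parser and the component constructor must be kept in sync manually,
which is the duplication that declaring parameters next to the components is
meant to avoid.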

## Example

An example ML codebase using `dmlx` is illustrated below:

- `my_innovative_approach/`
    - `model/`
        - `baseline.py`
        - `ours.py`
    - `dataset/`
        - `dataset_foo.py`
        - `dataset_bar.py`
    - `experiments/`
        - ...
    - `approach.py`
    - `train.py`
    - `analyze.py`

1. Firstly, models are defined as submodules of the `model` module, and dataset
    loaders are defined as submodules of the `dataset` module. These components
    should accept ordinary Python arguments; the component factories declared
    later with `component()` parse the command line parameters and pass the
    resulting arguments to the actual components.

    ```python
    # model/xxx.py

    class Model:
        def __init__(self, alpha: float, beta: float, ...) -> None: ...
    ```

    ```python
    # dataset/dataset_yyy.py

    def get_dataset_yyy(...): ...
    ```

2. Secondly, the components (models/datasets) and other parameters can be
    declared as properties on a composed approach using `dmlx`. The parameter
    properties, declared by `argument()` and `option()`, will define
    corresponding command line parameters and store them as instance attributes.
    The component properties, declared by `component()`, will create the actual
    component objects and store them as instance attributes.

    ```python
    # approach.py

    from dmlx.context import argument, option, component


    class Approach:
        model = component(
            argument("model_locator", default="ours"),  # click argument
            "model",  # module base
            "Model",  # default factory name
        )
        dataset = component(
            option("dataset_locator", "-d", "--dataset"),  # click option
            "dataset",  # module base
        )
        epochs = option("-e", "--epochs", type=int, default=800)  # click option

        def run(self):
            for epoch in range(self.epochs):
                for x, y_true in self.dataset:
                    y_pred = self.model(x)
                    yield x, y_true, y_pred
    ```

3. Thirdly, `dmlx.experiment.Experiment` can be used to declare your experiment.
    The experiment object will create an underlying `click` command, and the
    experiment context will collect the parameters (`model_locator`,
    `dataset_locator` and `epochs`) and wire them with command line inputs.

    ```python
    # train.py

    from dmlx.experiment import Experiment

    experiment = Experiment()

    with experiment.context():
        from approach import Approach

    @experiment.main
    def main(**args):
        experiment.init()

        approach = Approach()
        with (experiment.path / "train.log").open("w") as log_file:
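            for x, y_true, y_pred in approach.run():
                # `compute_metrics` is assumed to be a user-defined helper;
                # it is not imported from `dmlx` in this example.
                metrics = compute_metrics(y_pred, y_true)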
                log_file.write(repr(metrics) + "\n")

        approach.model.save(experiment.path / "model.bin")

    experiment.run()
    ```

4. Next, you can invoke `train.py` from the command line to actually run the
    experiment. Component parameters accept string locators of the form
    `path.to.module[:factory_name][?[k_0=v_0][;k_n=v_n...]]`, with values
    parsed by `json.loads`.

    ```shell
    python train.py 'ours?alpha=0.1' \
        --dataset 'dataset_foo:get_dataset_foo?
            version = "2.0";
            shots = 5;
            # ...
        ' \
        --epochs 500
    ```

5. After calling `experiment.init()`, an experiment directory will be created in
    `experiments/`, to which `experiment.path` will point, and the experiment
    meta will be dumped into `meta.json` in that directory. Extra data can also
    be saved to the experiment directory, as shown in `train.py`, where a log
    file `train.log` holding epoch metrics and a model archive `model.bin` are
    created. This experiment archive can then be loaded to perform extensive
    inspections, such as visualization and further statistical analysis, where
    properties defined on `Approach` will be automatically restored:

    ```python
    # analyze.py

    from dmlx.experiment import Experiment

    experiment = Experiment()

    with experiment.context():
        from approach import Approach


    @experiment.main
    def main(**args):
        print("Loaded args:", args)
        print("Loaded meta:", experiment.meta)

        approach = Approach()
        approach.model.load(experiment.path / "model.bin")

        # Now, `args`, `approach.model`, `approach.dataset` and other properties
        # are all restored, ready for extensive inspections.


    experiment.load("/path/to/the/experiment")
    ```
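
The locator format from step 4 can be illustrated with a small standalone
parser. This is a sketch of the documented syntax under stated assumptions, not
`dmlx`'s actual implementation: the helper names `parse_locator` and `resolve`
are hypothetical, joining the module base and the locator path with a dot is an
assumption, and values whose JSON strings contain `;` are not handled.
Standard-library modules stand in for the `model` and `dataset` packages so
that the example runs on its own.

```python
import importlib
import json
from typing import Any, Dict, Optional, Tuple


def parse_locator(locator: str) -> Tuple[str, Optional[str], Dict[str, Any]]:
    """Split 'module[:factory][?k=v;...]' into (module, factory, kwargs)."""
    target, _, query = locator.partition("?")
    module_path, _, factory_name = target.strip().partition(":")
    kwargs: Dict[str, Any] = {}
    for item in query.split(";"):
        item = item.strip()
        if not item:
            continue  # tolerate empty segments such as a trailing ';'
        key, sep, raw_value = item.partition("=")
        if not sep:
            raise ValueError(f"expected 'key=value', got {item!r}")
        # Values are decoded as JSON, so 0.1 is a float and "2.0" a string.
        kwargs[key.strip()] = json.loads(raw_value)
    return module_path.strip(), factory_name.strip() or None, kwargs


def resolve(locator: str, base: str, default_factory: Optional[str] = None) -> Any:
    """Import '<base>.<module>' and call the named factory with the kwargs."""
    module_path, factory_name, kwargs = parse_locator(locator)
    factory_name = factory_name or default_factory
    if factory_name is None:
        raise ValueError("no factory name given and no default configured")
    # Assumption: the module base and the locator path are joined with a dot.
    module = importlib.import_module(f"{base}.{module_path}")
    return getattr(module, factory_name)(**kwargs)


print(parse_locator("ours?alpha=0.1"))
print(parse_locator('dataset_foo:get_dataset_foo? version = "2.0"; shots = 5;'))

# The stdlib `json` package plays the role of a module base here.
decoder = resolve("decoder:JSONDecoder?strict=false", base="json")
encoder = resolve("encoder?indent=2", base="json", default_factory="JSONEncoder")
print(type(decoder).__name__, type(encoder).__name__)
```

In this sketch a missing factory name falls back to a default, mirroring the
"default factory name" argument passed to `component()` in `approach.py`.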

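The archive behaviour described in step 5 can be pictured with the following
standard-library sketch. It is only an illustration under stated assumptions:
the helper names `init_archive` and `load_archive`, the timestamp-based
directory name and the `{"params": ...}` layout of `meta.json` are all
hypothetical, and `dmlx` may name directories and structure its metadata
differently.

```python
import json
import tempfile
from datetime import datetime
from pathlib import Path
from typing import Any, Dict


def init_archive(root: Path, params: Dict[str, Any]) -> Path:
    """Create a fresh experiment directory under `root` and dump meta.json."""
    # Hypothetical naming scheme: one directory per timestamp.
    path = root / datetime.now().strftime("%Y%m%d-%H%M%S-%f")
    path.mkdir(parents=True, exist_ok=False)
    meta = {"params": params}  # hypothetical layout, not dmlx's schema
    (path / "meta.json").write_text(json.dumps(meta, indent=2), encoding="utf-8")
    return path


def load_archive(path: Path) -> Dict[str, Any]:
    """Read back the metadata that `init_archive` stored."""
    return json.loads((path / "meta.json").read_text(encoding="utf-8"))


with tempfile.TemporaryDirectory() as tmp:
    params = {"model_locator": "ours?alpha=0.1", "epochs": 500}
    archive = init_archive(Path(tmp) / "experiments", params)
    # Extra artifacts, such as a training log, live next to meta.json.
    (archive / "train.log").write_text("{'loss': 0.25}\n", encoding="utf-8")
    restored = load_archive(archive)
    print(sorted(p.name for p in archive.iterdir()))
    print(restored["params"] == params)
```

Storing the collected parameters next to the artifacts is what lets a later
analysis script rebuild the same components from the archive alone.
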
## Links

- [License (ISC)](./LICENSE)
- [API Reference](https://github.com/huang2002/dmlx/wiki)

            
