# dmlx
> Declarative Machine Learning eXperiments
## Introduction
`dmlx` is a declarative framework for machine learning (ML) experiments.
Typically, ML codebases use the standard Python library `argparse` to parse
parameters from the command line and pass them deep into the models and
other components. `dmlx` standardizes this process and provides an elegant
framework for experiment declaration and basic management, with the
following main features:
- **Declarative Experiment Components:** Declarative interfaces are provided
for defining reusable and reproducible experiment components and
hyperparameters, such as model paths, dataset getters and random seeds.
- **`click`-powered Command Line Interface:**
[`click`](https://click.palletsprojects.com/) is integrated to provide
powerful command line functionalities, including parameter properties.
- **Automatic Parameter Collection:** Parameter properties will be wired with
command line inputs and collected for experiment reproducibility.
- **Experiment Archive Management:** Archive directories will be automatically
created to hold experiment data for further analysis.
- **ML Framework Independent:** `dmlx` is independent of ML frameworks, so you
can use whichever ML framework you like (PyTorch/TensorFlow/scikit-learn/...).
## Example
An example ML codebase using `dmlx` is illustrated below:
- `my_innovative_approach/`
  - `model/`
    - `baseline.py`
    - `ours.py`
  - `dataset/`
    - `dataset_foo.py`
    - `dataset_bar.py`
  - `experiments/`
    - ...
  - `approach.py`
  - `train.py`
  - `analyze.py`
1. Firstly, models are defined as submodules of the `model` module, and dataset
loaders are defined as submodules of the `dataset` module. These components
should expect normal Python arguments; the component factories defined
later using `component()` will parse command line parameters and pass the
resulting arguments to the real components.
```python
# model/xxx.py

class Model:
    def __init__(self, alpha: float, beta: float, ...) -> None: ...
```
```python
# dataset/dataset_yyy.py

def get_dataset_yyy(...): ...
```
2. Secondly, the components (models/datasets) and other parameters can be
declared as properties on a composed approach using `dmlx`. The parameter
properties, declared by `argument()` and `option()`, will define
corresponding command line parameters and store them as instance attributes.
The component properties, declared by `component()`, will create the actual
component objects and store them as instance attributes.
```python
# approach.py

from dmlx.context import argument, option, component


class Approach:
    model = component(
        argument("model_locator", default="ours"),  # click argument
        "model",  # module base
        "Model",  # default factory name
    )
    dataset = component(
        option("dataset_locator", "-d", "--dataset"),  # click option
        "dataset",  # module base
    )
    epochs = option("-e", "--epochs", type=int, default=800)  # click option

    def run(self):
        for epoch in range(self.epochs):
            for x, y_true in self.dataset:
                y_pred = self.model(x)
                yield x, y_true, y_pred
```
3. Thirdly, `dmlx.experiment.Experiment` can be used to declare your experiment.
The experiment object will create an underlying `click` command, and the
experiment context will collect the parameters (`model_locator`,
`dataset_locator` and `epochs`) and wire them with command line inputs.
```python
# train.py

from dmlx.experiment import Experiment

experiment = Experiment()

with experiment.context():
    from approach import Approach


@experiment.main
def main(**args):
    experiment.init()

    approach = Approach()
    with (experiment.path / "train.log").open("w") as log_file:
        for x, y_true, y_pred in approach.run():
            metrics = compute_metrics(y_pred, y_true)
            log_file.write(repr(metrics) + "\n")

    approach.model.save(experiment.path / "model.bin")


experiment.run()
```
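Note that `compute_metrics` in `train.py` is not provided by `dmlx`; it stands for whatever metric computation your project needs. A minimal hypothetical sketch for numeric predictions (mean squared error) might look like:

```python
# Hypothetical helper assumed by train.py above; dmlx does not provide it.
# Computes mean squared error for paired numeric predictions and targets.
def compute_metrics(y_pred, y_true) -> dict[str, float]:
    pairs = list(zip(y_pred, y_true))
    mse = sum((p - t) ** 2 for p, t in pairs) / len(pairs)
    return {"mse": mse}


print(compute_metrics([1.0, 2.0], [1.0, 4.0]))  # → {'mse': 2.0}
```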
4. Finally, you can invoke `train.py` in the command line to actually conduct
the experiment, where component params accept string locators in the form
of `path.to.module[:factory_name][?[k_0=v_0][;k_n=v_n...]]` with values
parsed by `json.loads`.
```shell
python train.py 'ours?alpha=0.1' \
    --dataset 'dataset_foo:get_dataset_foo?
        version = "2.0";
        shots = 5;
        # ...
    ' \
    --epochs 500
```
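For illustration, the locator grammar above could be parsed roughly as follows. This `parse_locator` function is a hypothetical sketch, not `dmlx`'s actual parser, which may differ in detail (e.g. in how it treats comment lines like the `# ...` in the shell example):

```python
import json


# Hypothetical sketch of parsing a component locator of the form
# path.to.module[:factory_name][?[k_0=v_0][;k_n=v_n...]], with each
# value parsed by json.loads as described above.
def parse_locator(locator: str):
    path, _, query = locator.partition("?")
    module, _, factory = path.partition(":")
    kwargs = {}
    for pair in query.split(";"):
        pair = pair.strip()
        if not pair:
            continue
        key, _, value = pair.partition("=")
        kwargs[key.strip()] = json.loads(value.strip())
    return module, factory or None, kwargs


print(parse_locator('dataset_foo:get_dataset_foo?version="2.0";shots=5'))
# → ('dataset_foo', 'get_dataset_foo', {'version': '2.0', 'shots': 5})
```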
5. After calling `experiment.init()`, an experiment directory will be created in
`experiments/`, to which `experiment.path` will point, and the experiment
meta will be dumped into `meta.json` in that directory. Extra data can also
be saved to the experiment directory, as shown in `train.py`, where a log
file `train.log` holding epoch metrics and a model archive `model.bin` are
created. This experiment archive can then be loaded to perform extensive
inspections, such as visualization and further statistical analysis, where
properties defined on `Approach` will be automatically restored:
```python
# analyze.py

from dmlx.experiment import Experiment

experiment = Experiment()

with experiment.context():
    from approach import Approach


@experiment.main
def main(**args):
    print("Loaded args:", args)
    print("Loaded meta:", experiment.meta)

    approach = Approach()
    approach.model.load(experiment.path / "model.bin")

    # Now, `args`, `approach.model`, `approach.dataset` and other properties
    # are all restored, ready for extensive inspections.


experiment.load("/path/to/the/experiment")
```
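Since `train.py` writes one `repr()`-ed metrics dict per line, an analysis script can read the log back with `ast.literal_eval`. This `read_metrics_log` helper is a convention of this example, not a `dmlx` API:

```python
import ast


# Parses a train.log produced by writing repr(metrics) + "\n" per epoch,
# as done in train.py above. Not part of dmlx; just a matching convention.
def read_metrics_log(text: str) -> list[dict]:
    return [ast.literal_eval(line) for line in text.splitlines() if line.strip()]


print(read_metrics_log("{'mse': 2.0}\n{'mse': 1.5}\n"))
# → [{'mse': 2.0}, {'mse': 1.5}]
```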
## Links
- [License (ISC)](./LICENSE)
- [API Reference](https://github.com/huang2002/dmlx/wiki)