:Name: mlforge
:Version: 0.2.2
:Home page: https://github.com/renero/mlforge
:Summary: A package to design and run sequential ML pipelines
:Upload time: 2024-11-06 10:11:34
:Author: J. Renero
:Requires Python: >=3.6
:Requirements: numpy, pandas, rich, setuptools, tqdm, mkdocs, mkdocstrings,
   mkdocs-material, sphinx, sphinx_gallery, sphinx_rtd_theme, furo

MLForge
=======

|build-status| |coverage| |wheel| |documentation|

**MLForge** is a small package for writing simple pipelines of calls (to
methods, classes, …). The documentation is available at
`ReadTheDocs <https://mlforge.readthedocs.io/en/latest/>`__.

It arose from the need to execute several steps in a row and to easily
add or remove steps in the pipeline.

This is a work in progress.

Installation
------------

To use MLForge, first install it using pip:

.. code:: bash

   (.venv) $ pip install mlforge

Basic Usage
-----------

MLForge executes a pipeline of tasks. You define the tasks in a
configuration file or within your code, and the pipeline runs them in
the order in which they are defined.

A pipeline is normally created with a host object. The host may contain
some of the methods that the pipeline calls, but its primary role is to
store the results of those calls. If you don’t provide a host object,
the pipeline stores the results in an internal dictionary, from which
you can retrieve them with ``get_attribute``.

.. code:: python

   from mlforge import Pipeline

   # ClassName is a placeholder for one of your own classes.
   my_stages = [
       ('method1'),
       ('method2', {'param1': 'value2'}),
       ('method3', ClassName, {'param1': 'value1'}),
       ('new_attribute', 'method4', ClassName, {'param1': 'value1'}),
   ]
   pipeline = Pipeline().from_list(my_stages)
   pipeline.run()

This pipeline executes the following tasks:

1. Call the method ``method1``, which must be available in the host
   object or in globals.
2. Call the method ``method2``, which must be available in the host
   object or in globals, passing the parameter ``param1`` with the value
   ``value2``.
3. Call the method ``method3`` of the class ``ClassName``, passing the
   parameter ``param1`` with the value ``value1``.
4. Call the method ``method4`` of the class ``ClassName``, passing the
   parameter ``param1`` with the value ``value1``, and store the result
   in a new attribute ``new_attribute``. To access the attribute, call
   ``pipeline.get_attribute('new_attribute')``.
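
To make the result-storage behaviour concrete, here is a minimal,
self-contained sketch of a sequential runner that keeps results in an
internal dictionary. It is not MLForge's implementation:
``TinyPipeline``, ``load`` and ``scale`` are hypothetical names, and the
sketch only accepts plain callables rather than the tuple syntax shown
above.

.. code:: python

   class TinyPipeline:
       """Toy sequential runner; NOT the MLForge Pipeline class."""

       def __init__(self):
           self._stages = []    # (attribute name or None, callable, kwargs)
           self._results = {}   # internal store used instead of a host object

       def add(self, func, attribute=None, **kwargs):
           """Register a callable; return self so calls can be chained."""
           self._stages.append((attribute, func, kwargs))
           return self

       def run(self):
           """Call every stage in the order it was added."""
           for attribute, func, kwargs in self._stages:
               result = func(**kwargs)
               if attribute is not None:
                   self._results[attribute] = result

       def get_attribute(self, name):
           """Return a result stored by an earlier stage."""
           return self._results[name]


   def load():
       return [1, 2, 3]


   def scale(factor):
       return [x * factor for x in load()]


   pipeline = (
       TinyPipeline()
       .add(load, attribute='data')
       .add(scale, attribute='scaled', factor=10)
   )
   pipeline.run()
   print(pipeline.get_attribute('scaled'))  # [10, 20, 30]

Stages added without an ``attribute`` are still executed, but their
return value is discarded, which mirrors the first three stages in the
list above.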

If you prefer to specify the stages in a separate YAML configuration
file, you can use MLForge as follows:

.. code:: python

   from mlforge import Pipeline

   pipeline = Pipeline().from_config('path/to/config.yaml')
   pipeline.run()

The configuration file is a YAML file that defines the tasks to be
executed. The following is an example YAML configuration file:

.. code:: yaml

   step1:
       method: method
       class: SampleClass
   step2:
       attribute: object
       class: SampleClass
   step3:
       attribute: result1
       method: method
       class: SampleClass
       arguments:
           param2: there!

For each stage of the pipeline (specified in order), you can define the
method to be executed, the class that contains the method, the arguments
to be passed to the method, and the attribute in which to store the
result. Method arguments are specified as key-value pairs in the
``arguments`` section.
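
As a rough illustration of how the YAML fields line up with the tuple
syntax, the sketch below starts from the nested dictionary that a YAML
loader would typically produce for the example file and orders each
stage's fields as ``(attribute, method, class, arguments)``. The helper
``stage_to_tuple`` is hypothetical and is not part of MLForge; in
particular, it leaves the class as a name string and does not resolve it
to a class object.

.. code:: python

   # Stand-in for the parsed example file (plain dicts keep insertion
   # order in Python 3.7+, so the stages stay in their declared order).
   config = {
       'step1': {'method': 'method', 'class': 'SampleClass'},
       'step2': {'attribute': 'object', 'class': 'SampleClass'},
       'step3': {
           'attribute': 'result1',
           'method': 'method',
           'class': 'SampleClass',
           'arguments': {'param2': 'there!'},
       },
   }


   def stage_to_tuple(spec):
       """Keep only the fields that are present, in a fixed order."""
       keys = ('attribute', 'method', 'class', 'arguments')
       return tuple(spec[key] for key in keys if key in spec)


   stages = [stage_to_tuple(spec) for spec in config.values()]
   for stage in stages:
       print(stage)
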

Alternatively, you can build ``Stage`` objects explicitly in your code
and add them to the pipeline:

.. code:: python

   from mlforge import Pipeline, Stage

   stage1 = Stage(
       attribute_name='result',
       method_name='my_module.my_function',
       arguments={'arg1': 'value1'})
   stage2 = Stage(
       attribute_name='result2',
       method_name='my_module.my_function2',
       arguments={'arg1': 'result'})

   pipeline = Pipeline().add_stages([stage1, stage2])
   pipeline.run()

Syntax for the stages of the pipeline
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

In your code, define a list with the stages to be added to the pipeline.
Each of the stages can be specified as any of the following options:

Simply call a method of the host object:

.. code:: python

   'method_name',

The same, wrapped in parentheses (in Python, ``('method_name')`` without
a trailing comma inside the parentheses is still a plain string):

.. code:: python

   ('method_name'),

Call the constructor of a class:

.. code:: python

   (ClassHolder),

Call a method of a class:

.. code:: python

   ('method_name', ClassHolder),

Call a method of the host object, and keep the result in a new
attribute:

.. code:: python

   ('new_attribute', 'method_name'),

Call the constructor of a class, and keep the result in a new attribute:

.. code:: python

   ('new_attribute', ClassHolder),

Call a method of the host object, with specific parameters, and keep the
result in a new attribute:

.. code:: python

   ('new_attribute', 'method_name', {'param1': 'value1', 'param2': 'value2'}),

Call a method of a class, and keep the result in a new attribute:

.. code:: python

   ('new_attribute', 'method_name', ClassHolder),

Call a method of the host object, with specific parameters:

.. code:: python

   ('method_name', {'param1': 'value1', 'param2': 'value2'}),

Call a method of a specific class, with specific parameters:

.. code:: python

   ('method_name', ClassHolder, {'param1': 'value1'}),

Call a method of a specific class, with specific parameters, and keep
the result in a new attribute:

.. code:: python

   ('new_attribute', 'method_name', ClassHolder, {'param1': 'value1'}),
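
The forms above differ only in which elements are present, so one way to
read them is to classify each element by its type. The sketch below does
that with a hypothetical helper, ``describe_stage``; it is not MLForge's
parser. It assumes that a single name next to a class is a method name
when the class defines a callable with that name, and an attribute name
otherwise, which is just one plausible way to tell
``('method_name', ClassHolder)`` apart from
``('new_attribute', ClassHolder)``.

.. code:: python

   import inspect


   def describe_stage(stage):
       """Split a stage spec into attribute, method, class and arguments."""
       items = stage if isinstance(stage, tuple) else (stage,)
       names = [item for item in items if isinstance(item, str)]
       classes = [item for item in items if inspect.isclass(item)]
       dicts = [item for item in items if isinstance(item, dict)]

       parsed = {
           'attribute': None,
           'method': None,
           'class': classes[0] if classes else None,
           'arguments': dicts[0] if dicts else {},
       }
       if len(names) == 2:
           # Two names: the first is the attribute, the second the method.
           parsed['attribute'], parsed['method'] = names
       elif len(names) == 1:
           name, cls = names[0], parsed['class']
           if cls is not None and not callable(getattr(cls, name, None)):
               # Assumption: unknown name + class means "construct and store".
               parsed['attribute'] = name
           else:
               parsed['method'] = name
       return parsed


   class ClassHolder:
       """Placeholder class used only for this example."""

       def method_name(self, param1=None):
           return param1


   print(describe_stage('method_name'))
   print(describe_stage(('new_attribute', ClassHolder)))
   print(describe_stage(
       ('new_attribute', 'method_name', ClassHolder, {'param1': 'value1'})))
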

To do
-----

-  Add a way to add a step at a specific position
-  Add a way to remove a step
-  Add a way to replace a step
-  Add a way to add a step before or after another step
-  And many other things…


.. |build-status| image:: https://github.com/renero/mlforge/actions/workflows/python-test.yml/badge.svg
    :target: https://github.com/renero/mlforge/actions/workflows/python-test.yml
    :alt: Tests Status

.. |coverage| image:: https://codecov.io/gh/renero/mlforge/graph/badge.svg?token=HRZAE9GS0I
    :target: https://codecov.io/gh/renero/mlforge
    :alt: Code Coverage

.. |wheel| image:: https://github.com/renero/mlforge/actions/workflows/python-publish.yml/badge.svg
    :target: https://pypi.org/project/mlforge/
    :alt: PyPi Publish Status

.. |documentation| image:: https://readthedocs.org/projects/mlforge/badge/?version=latest
    :target: https://mlforge.readthedocs.io/en/latest/?badge=latest
    :alt: Documentation Status

            
