mlblocks

Name	mlblocks
Version	0.6.1
home_page	https://github.com/MLBazaar/MLBlocks
Summary	Pipelines and primitives for machine learning and data science.
upload_time	2023-09-26 17:42:09
maintainer
docs_url	None
author	MIT Data To AI Lab
requires_python	>=3.6,<3.12
license	MIT license
keywords	auto machine learning classification regression data science pipeline
VCS
bugtrack_url
requirements	No requirements were recorded.
Travis-CI	No Travis.
coveralls test coverage	No coveralls.

<p align="left">
  <a href="https://dai.lids.mit.edu">
    <img width=15% src="https://dai.lids.mit.edu/wp-content/uploads/2018/06/Logo_DAI_highres.png" alt="DAI-Lab" />
  </a>
  <i>An Open Source Project from the <a href="https://dai.lids.mit.edu">Data to AI Lab, at MIT</a></i>
</p>

<p align="left">
<img width=20% src="https://dai.lids.mit.edu/wp-content/uploads/2018/06/mlblocks-icon.png" alt="MLBlocks" />
</p>

<p align="left">
Pipelines and Primitives for Machine Learning and Data Science.
</p>

[![Development Status](https://img.shields.io/badge/Development%20Status-2%20--%20Pre--Alpha-yellow)](https://pypi.org/search/?c=Development+Status+%3A%3A+2+-+Pre-Alpha)
[![PyPi](https://img.shields.io/pypi/v/mlblocks.svg)](https://pypi.python.org/pypi/mlblocks)
[![Tests](https://github.com/MLBazaar/MLBlocks/workflows/Run%20Tests/badge.svg)](https://github.com/MLBazaar/MLBlocks/actions?query=workflow%3A%22Run+Tests%22+branch%3Amaster)
[![CodeCov](https://codecov.io/gh/MLBazaar/MLBlocks/branch/master/graph/badge.svg)](https://codecov.io/gh/MLBazaar/MLBlocks)
[![Downloads](https://pepy.tech/badge/mlblocks)](https://pepy.tech/project/mlblocks)
[![Binder](https://mybinder.org/badge_logo.svg)](https://mybinder.org/v2/gh/MLBazaar/MLBlocks/master?filepath=examples/tutorials)

<br>

# MLBlocks

* Documentation: https://mlbazaar.github.io/MLBlocks
* GitHub: https://github.com/MLBazaar/MLBlocks
* License: [MIT](https://github.com/MLBazaar/MLBlocks/blob/master/LICENSE)
* Development Status: [Pre-Alpha](https://pypi.org/search/?c=Development+Status+%3A%3A+2+-+Pre-Alpha)

## Overview

MLBlocks is a simple framework for composing end-to-end tunable Machine Learning Pipelines by
seamlessly combining tools from any Python library through a simple, common and uniform interface.

Features include:

* Build Machine Learning Pipelines combining **any Machine Learning Library in Python**.
* Access a repository with hundreds of primitives and pipelines, carefully curated by Machine
  Learning and domain experts and ready to be used with little to no Python code.
* Extract machine-readable information about which hyperparameters can be tuned and within
  which ranges, allowing automated integration with Hyperparameter Optimization tools like
  [BTB](https://github.com/MLBazaar/BTB).
* Build complex multi-branch pipelines and DAG configurations, with an unlimited number of inputs
  and outputs per primitive.
* Easily save and load pipelines using JSON annotations.
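
The hyperparameter and JSON annotation features above can be illustrated with a small standard-library sketch. The annotation below is a simplified, hypothetical example written for this illustration (its fields and values are assumptions, not copied from MLPrimitives), and `tunable_ranges` is a helper invented here, not part of the MLBlocks API.

```python
import json

# Hypothetical, simplified primitive annotation (illustrative values only).
ANNOTATION = """
{
    "name": "sklearn.impute.SimpleImputer",
    "primitive": "sklearn.impute.SimpleImputer",
    "hyperparameters": {
        "fixed": {
            "copy": {"type": "bool", "default": true}
        },
        "tunable": {
            "strategy": {
                "type": "str",
                "default": "mean",
                "values": ["mean", "median", "most_frequent"]
            }
        }
    }
}
"""


def tunable_ranges(annotation):
    """Map each tunable hyperparameter name to its allowed values or range."""
    tunable = annotation.get("hyperparameters", {}).get("tunable", {})
    return {
        name: spec.get("values", spec.get("range"))
        for name, spec in tunable.items()
    }


annotation = json.loads(ANNOTATION)
print(tunable_ranges(annotation))

# Round trip: serializing and re-parsing leaves the annotation unchanged.
restored = json.loads(json.dumps(annotation))
assert restored == annotation
```

Because the annotation is plain JSON, a tuner only needs the `tunable` section to build its search space, while the whole document can be stored and reloaded unchanged.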

# Install

## Requirements

**MLBlocks** has been developed and tested on [Python 3.6, 3.7, 3.8, 3.9, 3.10, and 3.11](https://www.python.org/downloads/).
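
A quick way to confirm that an interpreter falls inside the supported range is a check like the one below. It is only a sketch: the bounds are taken from the package metadata (`>=3.6,<3.12`), and the helper name `is_supported` is made up for this example.

```python
import sys

# Bounds from the package metadata: requires_python ">=3.6,<3.12".
MIN_VERSION = (3, 6)
MAX_VERSION_EXCLUSIVE = (3, 12)


def is_supported(version=None):
    """Return True if the (major, minor) version is within the declared range."""
    major_minor = tuple((version or sys.version_info)[:2])
    return MIN_VERSION <= major_minor < MAX_VERSION_EXCLUSIVE


if not is_supported():
    print("Warning: this Python version is outside the range declared by mlblocks 0.6.1")
```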

## Install with `pip`

The easiest and recommended way to install **MLBlocks** is using [pip](
https://pip.pypa.io/en/stable/):

```bash
pip install mlblocks
```

This will pull and install the latest stable release from [PyPI](https://pypi.org/).

If you want to install from source or contribute to the project please read the
[Contributing Guide](https://mlbazaar.github.io/MLBlocks/contributing.html#get-started).

## MLPrimitives

MLBlocks requires a compatible primitives library in order to be usable.

The official library, [MLPrimitives](https://github.com/MLBazaar/MLPrimitives), is required
to follow the Quickstart below. You can install it with this command:

```bash
pip install mlprimitives
```

# Quickstart

Below is a short example of how to use **MLBlocks** to solve the [Adult Census
Dataset](https://archive.ics.uci.edu/ml/datasets/Adult) classification problem using a
pipeline that combines primitives from [MLPrimitives](https://github.com/MLBazaar/MLPrimitives),
[scikit-learn](https://scikit-learn.org/) and [xgboost](https://xgboost.readthedocs.io/).

```python3
import pandas as pd
from mlblocks import MLPipeline
from sklearn.model_selection import train_test_split
from sklearn.metrics import accuracy_score

# Load the census data and separate the target column from the features.
dataset = pd.read_csv('http://mlblocks.s3.amazonaws.com/census.csv')
label = dataset.pop('label')

# A stratified split keeps similar class proportions in train and test.
X_train, X_test, y_train, y_test = train_test_split(dataset, label, stratify=label)

# The primitives are applied in the order in which they are listed.
primitives = [
    'mlprimitives.custom.preprocessing.ClassEncoder',
    'mlprimitives.custom.feature_extraction.CategoricalEncoder',
    'sklearn.impute.SimpleImputer',
    'xgboost.XGBClassifier',
    'mlprimitives.custom.preprocessing.ClassDecoder'
]
pipeline = MLPipeline(primitives)

# Fit on the training split, then predict on the held-out split.
pipeline.fit(X_train, y_train)
predictions = pipeline.predict(X_test)

# Fraction of test rows whose predicted label matches the true label.
accuracy_score(y_test, predictions)
```

# What's Next?

If you want to learn more about how to tune the pipeline hyperparameters, save and load
the pipelines using JSON annotations or build complex multi-branched pipelines, please
check our [documentation site](https://mlbazaar.github.io/MLBlocks).

Also do not forget to have a look at the [notebook tutorials](
https://github.com/MLBazaar/MLBlocks/tree/master/examples/tutorials)!
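
As a taste of the tuning workflow mentioned above, the sketch below draws one random candidate from a per-block search space. It is a standard-library illustration, not MLBlocks code: the `TUNABLES` dictionary is an invented sample (block names, ranges and defaults are assumptions), and `flatten` and `sample` are helpers made up for this example.

```python
import random

# Invented sample of a per-block tunable specification (illustrative only).
TUNABLES = {
    "xgboost.XGBClassifier#1": {
        "n_estimators": {"type": "int", "default": 100, "range": [10, 1000]},
        "learning_rate": {"type": "float", "default": 0.1, "range": [0.0, 1.0]},
    },
    "sklearn.impute.SimpleImputer#1": {
        "strategy": {
            "type": "str",
            "default": "mean",
            "values": ["mean", "median", "most_frequent"],
        },
    },
}


def flatten(nested):
    """Turn {block: {name: spec}} into {(block, name): spec}."""
    return {
        (block, name): spec
        for block, params in nested.items()
        for name, spec in params.items()
    }


def sample(spec, rng):
    """Draw one random value that respects the spec's values or range."""
    if "values" in spec:
        return rng.choice(spec["values"])
    low, high = spec["range"]
    if spec["type"] == "int":
        return rng.randint(low, high)  # both bounds are inclusive
    return rng.uniform(low, high)


rng = random.Random(0)  # fixed seed so the run is repeatable
candidate = {key: sample(spec, rng) for key, spec in flatten(TUNABLES).items()}
for key, value in candidate.items():
    print(key, value)
```

In practice a dedicated tuner such as [BTB](https://github.com/MLBazaar/BTB) would propose candidates instead of sampling them at random; the documentation site describes the actual workflow.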

# Citing MLBlocks

If you use MLBlocks for your research, please consider citing our related papers.

For the current design of MLBlocks and its usage within the larger *Machine Learning Bazaar* project at
the MIT Data To AI Lab, please see:

Micah J. Smith, Carles Sala, James Max Kanter, and Kalyan Veeramachaneni. ["The Machine Learning Bazaar:
Harnessing the ML Ecosystem for Effective System Development."](https://arxiv.org/abs/1905.08942) arXiv
Preprint 1905.08942. 2019.

```bibtex
@article{smith2019mlbazaar,
  author = {Smith, Micah J. and Sala, Carles and Kanter, James Max and Veeramachaneni, Kalyan},
  title = {The Machine Learning Bazaar: Harnessing the ML Ecosystem for Effective System Development},
  journal = {arXiv e-prints},
  year = {2019},
  eid = {arXiv:1905.08942},
  pages = {arXiv:1905.08942},
  archivePrefix = {arXiv},
  eprint = {1905.08942},
}
```

For the first MLBlocks version from 2015, designed only for multi-table, multi-entity temporal data, please
refer to Bryan Collazo’s thesis:

* [Machine learning blocks](https://dai.lids.mit.edu/wp-content/uploads/2018/06/Mlblocks_Bryan.pdf).
  Bryan Collazo. Master's thesis, MIT EECS, 2015.

With the recent availability of a multitude of libraries and tools, we decided it was time to integrate
them and to expand the library to address other data types (images, text, graphs and time series) and
to integrate with deep learning libraries.


Changelog
=========

0.6.1 - 2023-09-26
------------------

* Add Python 3.11 to MLBlocks - [Issue #143](https://github.com/MLBazaar/MLBlocks/issues/143) by @sarahmish

0.6.0 - 2023-04-14
------------------

* Support Python 3.9 and 3.10 - [Issue #141](https://github.com/MLBazaar/MLBlocks/issues/141) by @sarahmish

0.5.0 - 2023-01-22
------------------

* Update `numpy` dependency and isolate tests - [Issue #139](https://github.com/MLBazaar/MLBlocks/issues/139) by @sarahmish

0.4.1 - 2021-10-08
------------------

* Update NumPy dependency - [Issue #136](https://github.com/MLBazaar/MLBlocks/issues/136) by @sarahmish
* Support dynamic inputs and outputs - [Issue #134](https://github.com/MLBazaar/MLBlocks/issues/134) by @pvk-developer

0.4.0 - 2021-01-09
------------------

* Stop pipeline fitting after the last block - [Issue #131](https://github.com/MLBazaar/MLBlocks/issues/131) by @sarahmish
* Add memory debug and profiling - [Issue #130](https://github.com/MLBazaar/MLBlocks/issues/130) by @pvk-developer
* Update Python support - [Issue #129](https://github.com/MLBazaar/MLBlocks/issues/129) by @csala
* Get execution time for each block - [Issue #127](https://github.com/MLBazaar/MLBlocks/issues/127) by @sarahmish
* Allow loading a primitive or pipeline directly from the JSON path - [Issue #114](https://github.com/MLBazaar/MLBlocks/issues/114) by @csala
* Pipeline Diagrams - [Issue #113](https://github.com/MLBazaar/MLBlocks/issues/113) by @erica-chiu
* Get Pipeline Inputs - [Issue #112](https://github.com/MLBazaar/MLBlocks/issues/112) by @erica-chiu

0.3.4 - 2019-11-01
------------------

* Ability to return intermediate context - [Issue #110](https://github.com/MLBazaar/MLBlocks/issues/110) by @csala
* Support for static or class methods - [Issue #107](https://github.com/MLBazaar/MLBlocks/issues/107) by @csala

0.3.3 - 2019-09-09
------------------

* Improved intermediate outputs management - [Issue #105](https://github.com/MLBazaar/MLBlocks/issues/105) by @csala

0.3.2 - 2019-08-12
------------------

* Allow passing fit and produce arguments as `init_params` - [Issue #96](https://github.com/MLBazaar/MLBlocks/issues/96) by @csala
* Support optional fit and produce args and arg defaults - [Issue #95](https://github.com/MLBazaar/MLBlocks/issues/95) by @csala
* Isolate primitives from their hyperparameters dictionary - [Issue #94](https://github.com/MLBazaar/MLBlocks/issues/94) by @csala
* Add functions to explore the available primitives and pipelines - [Issue #90](https://github.com/MLBazaar/MLBlocks/issues/90) by @csala
* Add primitive caching - [Issue #22](https://github.com/MLBazaar/MLBlocks/issues/22) by @csala

0.3.1 - Pipelines Discovery
---------------------------

* Support flat hyperparameter dictionaries - [Issue #92](https://github.com/MLBazaar/MLBlocks/issues/92) by @csala
* Load pipelines by name and register them as `entry_points` - [Issue #88](https://github.com/MLBazaar/MLBlocks/issues/88) by @csala
* Implement partial re-fit - [Issue #61](https://github.com/MLBazaar/MLBlocks/issues/61) by @csala
* Move argument parsing to MLBlock - [Issue #86](https://github.com/MLBazaar/MLBlocks/issues/86) by @csala
* Allow getting intermediate outputs - [Issue #58](https://github.com/MLBazaar/MLBlocks/issues/58) by @csala

0.3.0 - New Primitives Discovery
--------------------------------

* New primitives discovery system based on `entry_points`.
* Conditional Hyperparameters filtering in MLBlock initialization.
* Improved logging and exception reporting.

0.2.4 - New Datasets and Unit Tests
-----------------------------------

* Add a new multi-table dataset.
* Add Unit Tests up to 50% coverage.
* Improve documentation.
* Fix minor bug in newsgroups dataset.

0.2.3 - Demo Datasets
---------------------

* Add new methods to Dataset class.
* Add documentation for the datasets module.

0.2.2 - MLPipeline Load/Save
----------------------------

* Implement save and load methods for MLPipelines
* Add more datasets

0.2.1 - New Documentation
-------------------------

* Add mlblocks.datasets module with demo data download functions.
* Extensive documentation, including multiple pipeline examples.

0.2.0 - New MLBlocks API
------------------------

A new MLBlocks API and Primitive format.

This is a summary of the changes:

* Primitive JSONs and Python code have been moved to a different repository, called MLPrimitives.
* Optional usage of multiple JSON primitive folders.
* JSON format has been changed to allow more flexibility and features:
    * input and output arguments, as well as argument types, can be specified for each method
    * both classes and functions are supported as primitives
    * multitype and conditional hyperparameters fully supported
    * data modalities and primitive classifiers introduced
    * metadata such as documentation, description and author fields added
* Parsers are removed, and now the MLBlock class is responsible for loading and reading the
  JSON primitive.
* Multiple blocks of the same primitive are supported within the same pipeline.
* Arbitrary inputs and outputs for both pipelines and blocks are allowed.
* Shared variables during pipeline execution, usable by multiple blocks.

0.1.9 - Bugfix Release
----------------------

* Disable some NetworkX functions for incompatibilities with some types of graphs.

0.1.8 - New primitives and some improvements
--------------------------------------------

* Improve the NetworkX primitives.
* Add String Vectorization and Datetime Featurization primitives.
* Refactor some Keras primitives to work with single dimension `y` arrays and be compatible with `pickle`.
* Add XGBClassifier and XGBRegressor primitives.
* Add some `keras.applications` pretrained networks as preprocessing primitives.
* Add helper class to allow function primitives.

0.1.7 - Nested hyperparams dicts
--------------------------------

* Support passing hyperparams as nested dicts.

0.1.6 - Text and Graph Pipelines
--------------------------------

* Add LSTM classifier and regressor primitives.
* Add OneHotEncoder and MultiLabelEncoder primitives.
* Add several NetworkX graph featurization primitives.
* Add `community.best_partition` primitive.

0.1.5 - Collaborative Filtering Pipelines
-----------------------------------------

* Add LightFM primitive.

0.1.4 - Image pipelines improved
--------------------------------

* Allow passing `init_params` on `MLPipeline` creation.
* Fix bug with MLHyperparam types and Keras.
* Rename `produce_params` as `predict_params`.
* Add SingleCNN Classifier and Regressor primitives.
* Simplify and improve Trivial Predictor

0.1.3 - Multi Table pipelines improved
--------------------------------------

* Improve RandomForest primitive ranges
* Improve DFS primitive
* Add Tree Based Feature Selection primitives
* Fix bugs in TrivialPredictor
* Improved documentation

0.1.2 - Bugfix release
----------------------

* Fix bug in TrivialMedianPredictor
* Fix bug in OneHotLabelEncoder

0.1.1 - Single Table pipelines improved
---------------------------------------

* New project structure and primitives for integration into MIT-TA2.
* MIT-TA2 default pipelines and single table pipelines fully working.

0.1.0
-----

* First release on PyPI.

Raw data

            {
    "_id": null,
    "home_page": "https://github.com/MLBazaar/MLBlocks",
    "name": "mlblocks",
    "maintainer": "",
    "docs_url": null,
    "requires_python": ">=3.6,<3.12",
    "maintainer_email": "",
    "keywords": "auto machine learning classification regression data science pipeline",
    "author": "MIT Data To AI Lab",
    "author_email": "dailabmit@gmail.com",
    "download_url": "https://files.pythonhosted.org/packages/f7/13/44962c047cc6bb022ae23a0b6566da2a5ee06599e2fb26d2ece5843aa9c0/mlblocks-0.6.1.tar.gz",
    "platform": null,
    "bugtrack_url": null,
    "license": "MIT license",
    "summary": "Pipelines and primitives for machine learning and data science.",
    "version": "0.6.1",
    "project_urls": {
        "Homepage": "https://github.com/MLBazaar/MLBlocks"
    },
    "split_keywords": [
        "auto",
        "machine",
        "learning",
        "classification",
        "regression",
        "data",
        "science",
        "pipeline"
    ],
    "urls": [
        {
            "comment_text": "",
            "digests": {
                "blake2b_256": "5b502523eeb6552941fef61f087ec0896ae0364c9a0db062a8dcfc2c97af3499",
                "md5": "bb16cf374766a6a69abcf846a1e7e9f1",
                "sha256": "176a5c17dea315342510822026a48a90584a4f39773ba6df989d0bee7b40f801"
            },
            "downloads": -1,
            "filename": "mlblocks-0.6.1-py2.py3-none-any.whl",
            "has_sig": false,
            "md5_digest": "bb16cf374766a6a69abcf846a1e7e9f1",
            "packagetype": "bdist_wheel",
            "python_version": "py2.py3",
            "requires_python": ">=3.6,<3.12",
            "size": 25728,
            "upload_time": "2023-09-26T17:42:06",
            "upload_time_iso_8601": "2023-09-26T17:42:06.849870Z",
            "url": "https://files.pythonhosted.org/packages/5b/50/2523eeb6552941fef61f087ec0896ae0364c9a0db062a8dcfc2c97af3499/mlblocks-0.6.1-py2.py3-none-any.whl",
            "yanked": false,
            "yanked_reason": null
        },
        {
            "comment_text": "",
            "digests": {
                "blake2b_256": "f71344962c047cc6bb022ae23a0b6566da2a5ee06599e2fb26d2ece5843aa9c0",
                "md5": "491daf6eba24bae9ea1bb72cdfd04ab1",
                "sha256": "8a67ea025858cc8c317c31d14aa558872f2a2bc233c4e94ae353e7a90e589737"
            },
            "downloads": -1,
            "filename": "mlblocks-0.6.1.tar.gz",
            "has_sig": false,
            "md5_digest": "491daf6eba24bae9ea1bb72cdfd04ab1",
            "packagetype": "sdist",
            "python_version": "source",
            "requires_python": ">=3.6,<3.12",
            "size": 79362,
            "upload_time": "2023-09-26T17:42:09",
            "upload_time_iso_8601": "2023-09-26T17:42:09.252211Z",
            "url": "https://files.pythonhosted.org/packages/f7/13/44962c047cc6bb022ae23a0b6566da2a5ee06599e2fb26d2ece5843aa9c0/mlblocks-0.6.1.tar.gz",
            "yanked": false,
            "yanked_reason": null
        }
    ],
    "upload_time": "2023-09-26 17:42:09",
    "github": true,
    "gitlab": false,
    "bitbucket": false,
    "codeberg": false,
    "github_user": "MLBazaar",
    "github_project": "MLBlocks",
    "travis_ci": false,
    "coveralls": false,
    "github_actions": true,
    "requirements": [],
    "tox": true,
    "lcname": "mlblocks"
}
