<p align="left">
<a href="https://dai.lids.mit.edu">
<img width=15% src="https://dai.lids.mit.edu/wp-content/uploads/2018/06/Logo_DAI_highres.png" alt="DAI-Lab" />
</a>
<i>An Open Source Project from the <a href="https://dai.lids.mit.edu">Data to AI Lab, at MIT</a></i>
</p>
<p align="left">
<img width=20% src="https://dai.lids.mit.edu/wp-content/uploads/2018/06/mlblocks-icon.png" alt="MLBlocks" />
</p>
<p align="left">
Pipelines and Primitives for Machine Learning and Data Science.
</p>
[![Development Status](https://img.shields.io/badge/Development%20Status-2%20--%20Pre--Alpha-yellow)](https://pypi.org/search/?c=Development+Status+%3A%3A+2+-+Pre-Alpha)
[![PyPi](https://img.shields.io/pypi/v/mlblocks.svg)](https://pypi.python.org/pypi/mlblocks)
[![Tests](https://github.com/MLBazaar/MLBlocks/workflows/Run%20Tests/badge.svg)](https://github.com/MLBazaar/MLBlocks/actions?query=workflow%3A%22Run+Tests%22+branch%3Amaster)
[![CodeCov](https://codecov.io/gh/MLBazaar/MLBlocks/branch/master/graph/badge.svg)](https://codecov.io/gh/MLBazaar/MLBlocks)
[![Downloads](https://pepy.tech/badge/mlblocks)](https://pepy.tech/project/mlblocks)
[![Binder](https://mybinder.org/badge_logo.svg)](https://mybinder.org/v2/gh/MLBazaar/MLBlocks/master?filepath=examples/tutorials)
<br>
# MLBlocks
* Documentation: https://mlbazaar.github.io/MLBlocks
* GitHub: https://github.com/MLBazaar/MLBlocks
* License: [MIT](https://github.com/MLBazaar/MLBlocks/blob/master/LICENSE)
* Development Status: [Pre-Alpha](https://pypi.org/search/?c=Development+Status+%3A%3A+2+-+Pre-Alpha)
## Overview
MLBlocks is a simple framework for composing end-to-end tunable Machine Learning Pipelines by
seamlessly combining tools from any Python library with a simple, common and uniform interface.
Features include:
* Build Machine Learning Pipelines combining **any Machine Learning Library in Python**.
* Access a repository with hundreds of primitives and pipelines, carefully curated by Machine
  Learning and Domain experts, that are ready to use with little to no Python code.
* Extract machine-readable information about which hyperparameters can be tuned and within
which ranges, allowing automated integration with Hyperparameter Optimization tools like
[BTB](https://github.com/MLBazaar/BTB).
* Build complex multi-branch pipelines and DAG configurations, with an unlimited number of inputs
  and outputs per primitive.
* Easily save and load pipelines using JSON annotations.
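
To give a rough idea of what machine-readable hyperparameter information can look like, here is a minimal sketch that reads tunable ranges from a JSON annotation using only the standard library. The annotation is a made-up example: the primitive name and the field names (`hyperparameters`, `fixed`, `tunable`, `range`, `default`) are assumptions for illustration, so please check the MLPrimitives documentation for the actual schema.

```python
import json

# Hypothetical annotation, written for illustration only.
annotation_json = """
{
    "name": "example.HypotheticalClassifier",
    "hyperparameters": {
        "fixed": {"n_jobs": {"type": "int", "default": -1}},
        "tunable": {
            "n_estimators": {"type": "int", "default": 100, "range": [10, 1000]},
            "learning_rate": {"type": "float", "default": 0.1, "range": [0.0, 1.0]}
        }
    }
}
"""


def tunable_ranges(annotation):
    """Map each tunable hyperparameter name to its (low, high) range."""
    tunable = annotation["hyperparameters"]["tunable"]
    return {name: tuple(spec["range"]) for name, spec in tunable.items()}


annotation = json.loads(annotation_json)
ranges = tunable_ranges(annotation)
print(ranges)
```

A hyperparameter optimization tool could consume a dictionary like `ranges` to decide which values to propose next.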
# Install
## Requirements
**MLBlocks** has been developed and tested on [Python 3.6, 3.7, 3.8, 3.9, 3.10, and 3.11](https://www.python.org/downloads/).
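
If you are unsure whether your interpreter qualifies, a quick check along these lines may help. It is only a sketch: the bounds assume that Python 3.6 through 3.11 are supported, which is what the 0.6.1 changelog entry suggests, and the helper name `is_supported` is made up for this example.

```python
import sys

# Assumed supported range: Python 3.6 up to and including 3.11.
MIN_SUPPORTED = (3, 6)
MAX_SUPPORTED = (3, 11)


def is_supported(version_info=sys.version_info):
    """Return True if the (major, minor) version is within the assumed range."""
    return MIN_SUPPORTED <= tuple(version_info[:2]) <= MAX_SUPPORTED


print(is_supported())
```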
## Install with `pip`
The easiest and recommended way to install **MLBlocks** is using [pip](
https://pip.pypa.io/en/stable/):
```bash
pip install mlblocks
```
This will pull and install the latest stable release from [PyPI](https://pypi.org/).
If you want to install from source or contribute to the project please read the
[Contributing Guide](https://mlbazaar.github.io/MLBlocks/contributing.html#get-started).
## MLPrimitives
To be usable, MLBlocks requires a compatible primitives library.
The official one, which is needed to follow the tutorial below, is
[MLPrimitives](https://github.com/MLBazaar/MLPrimitives). You can install it
with this command:
```bash
pip install mlprimitives
```
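
To confirm that both packages can be imported before starting the tutorial, you can run a small check like the one below. This is a sketch that relies only on the standard library; the helper name `missing_packages` is made up for this example.

```python
from importlib.util import find_spec


def missing_packages(names=("mlblocks", "mlprimitives")):
    """Return the top-level package names that cannot be found for import."""
    return [name for name in names if find_spec(name) is None]


missing = missing_packages()
if missing:
    print("Missing packages:", ", ".join(missing))
else:
    print("MLBlocks and MLPrimitives are both importable.")
```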
# Quickstart
Below is a short example of how to use **MLBlocks** to solve the [Adult Census
Dataset](https://archive.ics.uci.edu/ml/datasets/Adult) classification problem using a
pipeline which combines primitives from [MLPrimitives](https://github.com/MLBazaar/MLPrimitives),
[scikit-learn](https://scikit-learn.org/) and [xgboost](https://xgboost.readthedocs.io/).
```python3
import pandas as pd
from mlblocks import MLPipeline
from sklearn.model_selection import train_test_split
from sklearn.metrics import accuracy_score

# Load the census data and separate the target column from the features.
dataset = pd.read_csv('http://mlblocks.s3.amazonaws.com/census.csv')
label = dataset.pop('label')

# Hold out a test set, keeping the class proportions in both splits.
X_train, X_test, y_train, y_test = train_test_split(dataset, label, stratify=label)

# The pipeline is built from the primitives listed here, in this order.
primitives = [
    'mlprimitives.custom.preprocessing.ClassEncoder',
    'mlprimitives.custom.feature_extraction.CategoricalEncoder',
    'sklearn.impute.SimpleImputer',
    'xgboost.XGBClassifier',
    'mlprimitives.custom.preprocessing.ClassDecoder'
]
pipeline = MLPipeline(primitives)

# Fit on the training split, then predict on the held-out split.
pipeline.fit(X_train, y_train)
predictions = pipeline.predict(X_test)

accuracy_score(y_test, predictions)
```
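
The first and last primitives in the list bracket the classifier: class labels are encoded before fitting, and predictions are decoded back afterwards. The toy sketch below illustrates that encode, fit, predict and decode flow using only the standard library. It is not how MLBlocks or MLPrimitives implement it; `ToyClassEncoder`, `ToyMajorityClassifier` and `fit_predict` are hypothetical stand-ins, and the sample labels are made up.

```python
from collections import Counter


class ToyClassEncoder:
    """Hypothetical stand-in: maps class labels to integers and back."""

    def fit(self, y):
        self.classes = sorted(set(y))
        self._index = {label: code for code, label in enumerate(self.classes)}

    def encode(self, y):
        return [self._index[label] for label in y]

    def decode(self, codes):
        return [self.classes[code] for code in codes]


class ToyMajorityClassifier:
    """Hypothetical stand-in model: always predicts the most frequent class."""

    def fit(self, X, y):
        self.majority = Counter(y).most_common(1)[0][0]

    def predict(self, X):
        return [self.majority for _ in X]


def fit_predict(X_train, y_train, X_test):
    # Encode the labels, fit the model on the integer codes,
    # then decode the predicted codes back into the original labels.
    encoder = ToyClassEncoder()
    encoder.fit(y_train)
    model = ToyMajorityClassifier()
    model.fit(X_train, encoder.encode(y_train))
    return encoder.decode(model.predict(X_test))


predictions = fit_predict([[1], [2], [3]], ['<=50K', '>50K', '<=50K'], [[4], [5]])
print(predictions)
```

Because the decoding step runs last, the caller receives predictions in the same label vocabulary that was passed in, even though the model itself only sees integer codes.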
# What's Next?
If you want to learn more about how to tune the pipeline hyperparameters, save and load
the pipelines using JSON annotations or build complex multi-branched pipelines, please
check our [documentation site](https://mlbazaar.github.io/MLBlocks).
Also do not forget to have a look at the [notebook tutorials](
https://github.com/MLBazaar/MLBlocks/tree/master/examples/tutorials)!
# Citing MLBlocks
If you use MLBlocks for your research, please consider citing our related papers.
For the current design of MLBlocks and its usage within the larger *Machine Learning Bazaar* project at
the MIT Data To AI Lab, please see:
Micah J. Smith, Carles Sala, James Max Kanter, and Kalyan Veeramachaneni. ["The Machine Learning Bazaar:
Harnessing the ML Ecosystem for Effective System Development."](https://arxiv.org/abs/1905.08942) arXiv
Preprint 1905.08942. 2019.
```bibtex
@article{smith2019mlbazaar,
author = {Smith, Micah J. and Sala, Carles and Kanter, James Max and Veeramachaneni, Kalyan},
title = {The Machine Learning Bazaar: Harnessing the ML Ecosystem for Effective System Development},
journal = {arXiv e-prints},
year = {2019},
eid = {arXiv:1905.08942},
pages = {arXiv:1905.08942},
archivePrefix = {arXiv},
eprint = {1905.08942},
}
```
For the first MLBlocks version from 2015, designed only for multi-table, multi-entity temporal data, please
refer to Bryan Collazo’s thesis:
* [Machine learning blocks](https://dai.lids.mit.edu/wp-content/uploads/2018/06/Mlblocks_Bryan.pdf).
  Bryan Collazo. Master's thesis, MIT EECS, 2015.

With the recent availability of a multitude of libraries and tools, we decided it was time to integrate
them and expand the library to address other data types (images, text, graph and time series) and to
integrate with deep learning libraries.
Changelog
=========
0.6.1 - 2023-09-26
------------------
* Add python 3.11 to MLBlocks - [Issue #143](https://github.com/MLBazaar/MLBlocks/issues/143) by @sarahmish
0.6.0 - 2023-04-14
------------------
* Support python 3.9 and 3.10 - [Issue #141](https://github.com/MLBazaar/MLBlocks/issues/141) by @sarahmish
0.5.0 - 2023-01-22
------------------
* Update `numpy` dependency and isolate tests - [Issue #139](https://github.com/MLBazaar/MLBlocks/issues/139) by @sarahmish
0.4.1 - 2021-10-08
------------------
* Update NumPy dependency - [Issue #136](https://github.com/MLBazaar/MLBlocks/issues/136) by @sarahmish
* Support dynamic inputs and outputs - [Issue #134](https://github.com/MLBazaar/MLBlocks/issues/134) by @pvk-developer
0.4.0 - 2021-01-09
------------------
* Stop pipeline fitting after the last block - [Issue #131](https://github.com/MLBazaar/MLBlocks/issues/131) by @sarahmish
* Add memory debug and profiling - [Issue #130](https://github.com/MLBazaar/MLBlocks/issues/130) by @pvk-developer
* Update Python support - [Issue #129](https://github.com/MLBazaar/MLBlocks/issues/129) by @csala
* Get execution time for each block - [Issue #127](https://github.com/MLBazaar/MLBlocks/issues/127) by @sarahmish
* Allow loading a primitive or pipeline directly from the JSON path - [Issue #114](https://github.com/MLBazaar/MLBlocks/issues/114) by @csala
* Pipeline Diagrams - [Issue #113](https://github.com/MLBazaar/MLBlocks/issues/113) by @erica-chiu
* Get Pipeline Inputs - [Issue #112](https://github.com/MLBazaar/MLBlocks/issues/112) by @erica-chiu
0.3.4 - 2019-11-01
------------------
* Ability to return intermediate context - [Issue #110](https://github.com/MLBazaar/MLBlocks/issues/110) by @csala
* Support for static or class methods - [Issue #107](https://github.com/MLBazaar/MLBlocks/issues/107) by @csala
0.3.3 - 2019-09-09
------------------
* Improved intermediate outputs management - [Issue #105](https://github.com/MLBazaar/MLBlocks/issues/105) by @csala
0.3.2 - 2019-08-12
------------------
* Allow passing fit and produce arguments as `init_params` - [Issue #96](https://github.com/MLBazaar/MLBlocks/issues/96) by @csala
* Support optional fit and produce args and arg defaults - [Issue #95](https://github.com/MLBazaar/MLBlocks/issues/95) by @csala
* Isolate primitives from their hyperparameters dictionary - [Issue #94](https://github.com/MLBazaar/MLBlocks/issues/94) by @csala
* Add functions to explore the available primitives and pipelines - [Issue #90](https://github.com/MLBazaar/MLBlocks/issues/90) by @csala
* Add primitive caching - [Issue #22](https://github.com/MLBazaar/MLBlocks/issues/22) by @csala
0.3.1 - Pipelines Discovery
---------------------------
* Support flat hyperparameter dictionaries - [Issue #92](https://github.com/MLBazaar/MLBlocks/issues/92) by @csala
* Load pipelines by name and register them as `entry_points` - [Issue #88](https://github.com/MLBazaar/MLBlocks/issues/88) by @csala
* Implement partial re-fit - [Issue #61](https://github.com/MLBazaar/MLBlocks/issues/61) by @csala
* Move argument parsing to MLBlock - [Issue #86](https://github.com/MLBazaar/MLBlocks/issues/86) by @csala
* Allow getting intermediate outputs - [Issue #58](https://github.com/MLBazaar/MLBlocks/issues/58) by @csala
0.3.0 - New Primitives Discovery
--------------------------------
* New primitives discovery system based on `entry_points`.
* Conditional Hyperparameters filtering in MLBlock initialization.
* Improved logging and exception reporting.
0.2.4 - New Datasets and Unit Tests
-----------------------------------
* Add a new multi-table dataset.
* Add Unit Tests up to 50% coverage.
* Improve documentation.
* Fix minor bug in newsgroups dataset.
0.2.3 - Demo Datasets
---------------------
* Add new methods to Dataset class.
* Add documentation for the datasets module.
0.2.2 - MLPipeline Load/Save
----------------------------
* Implement save and load methods for MLPipelines
* Add more datasets
0.2.1 - New Documentation
-------------------------
* Add mlblocks.datasets module with demo data download functions.
* Extensive documentation, including multiple pipeline examples.
0.2.0 - New MLBlocks API
------------------------
A new MLBlocks API and Primitive format.
This is a summary of the changes:
* Primitive JSONs and Python code have been moved to a different repository, called MLPrimitives.
* Optional usage of multiple JSON primitive folders.
* JSON format has been changed to allow more flexibility and features:
  * input and output arguments, as well as argument types, can be specified for each method
  * both classes and functions are supported as primitives
  * multitype and conditional hyperparameters fully supported
  * data modalities and primitive classifiers introduced
  * metadata such as documentation, description and author fields added
* Parsers are removed, and now the MLBlock class is responsible for loading and reading the
  JSON primitive.
* Multiple blocks of the same primitive are supported within the same pipeline.
* Arbitrary inputs and outputs for both pipelines and blocks are allowed.
* Shared variables during pipeline execution, usable by multiple blocks.
0.1.9 - Bugfix Release
----------------------
* Disable some NetworkX functions for incompatibilities with some types of graphs.
0.1.8 - New primitives and some improvements
--------------------------------------------
* Improve the NetworkX primitives.
* Add String Vectorization and Datetime Featurization primitives.
* Refactor some Keras primitives to work with single dimension `y` arrays and be compatible with `pickle`.
* Add XGBClassifier and XGBRegressor primitives.
* Add some `keras.applications` pretrained networks as preprocessing primitives.
* Add helper class to allow function primitives.
0.1.7 - Nested hyperparams dicts
--------------------------------
* Support passing hyperparams as nested dicts.
0.1.6 - Text and Graph Pipelines
--------------------------------
* Add LSTM classifier and regressor primitives.
* Add OneHotEncoder and MultiLabelEncoder primitives.
* Add several NetworkX graph featurization primitives.
* Add `community.best_partition` primitive.
0.1.5 - Collaborative Filtering Pipelines
-----------------------------------------
* Add LightFM primitive.
0.1.4 - Image pipelines improved
--------------------------------
* Allow passing `init_params` on `MLPipeline` creation.
* Fix bug with MLHyperparam types and Keras.
* Rename `produce_params` as `predict_params`.
* Add SingleCNN Classifier and Regressor primitives.
* Simplify and improve Trivial Predictor
0.1.3 - Multi Table pipelines improved
--------------------------------------
* Improve RandomForest primitive ranges
* Improve DFS primitive
* Add Tree Based Feature Selection primitives
* Fix bugs in TrivialPredictor
* Improved documentation
0.1.2 - Bugfix release
----------------------
* Fix bug in TrivialMedianPredictor
* Fix bug in OneHotLabelEncoder
0.1.1 - Single Table pipelines improved
---------------------------------------
* New project structure and primitives for integration into MIT-TA2.
* MIT-TA2 default pipelines and single table pipelines fully working.
0.1.0
-----
* First release on PyPI.