feature-fabrica


- **Name**: feature-fabrica
- **Version**: 1.3.1
- **Summary**: Open-source Python library designed to improve engineering practices and transparency in feature engineering.
- **Upload time**: 2024-09-22 07:02:33
- **Author**: Chingis Oinar
- **Requires Python**: <4.0,>=3.10

            <h4 align="center">
    <img alt="Feature Fabrica logo" src="https://raw.githubusercontent.com/cowana-ai/feature-fabrica/main/media/current_logo.png" style="width: 100%;">
</h4>
<h2>
    <p align="center">
     ⚙️ The Framework to Simplify and Scale Feature Engineering ⚙️
    </p>
</h2>

<p align="center">
    <a href="https://colab.research.google.com/drive/1O9i-g3vmxyazwdadTVjgBlY1GFN4f7Xt?usp=sharing">
        <img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open in Colab"/>
    </a>
</p>

<p align="center">
    <img src="https://img.shields.io/pypi/v/feature-fabrica?style=flat-square" alt="PyPI version"/>
    <img src="https://img.shields.io/github/stars/cowana-ai/feature-fabrica?style=flat-square" alt="Stars"/>
    <img src="https://img.shields.io/github/issues/cowana-ai/feature-fabrica?style=flat-square" alt="Issues"/>
    <img src="https://img.shields.io/github/license/cowana-ai/feature-fabrica?style=flat-square" alt="License"/>
    <img src="https://img.shields.io/github/contributors/cowana-ai/feature-fabrica?style=flat-square" alt="Contributors"/>
    <img src="https://app.codacy.com/project/badge/Grade/5df9f22c8a2d49a08058bf8a660b086c" alt="Code Quality"/>

</p>

For **data scientists, ML engineers**, and **AI researchers** who want to simplify feature engineering, manage complex dependencies, and boost productivity.

______________________________________________________________________

## Introduction

**Feature Fabrica** is an open-source Python library designed to improve engineering practices and transparency in feature engineering. It allows users to define features declaratively using YAML, manage dependencies between features, and apply complex transformations in a scalable and convenient manner.

By providing a structured approach to feature engineering, Feature Fabrica aims to save time, reduce errors, and enhance the transparency and reproducibility of your machine learning workflows. Whether you're working on small projects or managing large-scale pipelines, **Feature Fabrica** is designed to meet your needs.

## **Key Features**

- **📝 Declarative Feature Definitions**: Define features, data types, and dependencies using a simple YAML configuration.
- **🔄 Transformations**: Apply custom transformations to raw features to derive new features.
- **🔗 Dependency Management**: Automatically handle dependencies between features.
- **✔️ Pydantic Validation**: Ensure data types and values conform to expected formats.
- **🛡️ Fail-Fast with Beartype**: Catch type-related errors instantly during development, ensuring your transformations are robust.
- **🚀 Scalability**: Designed to scale from small projects to large machine learning pipelines.
- **🔧 Hydra Integration**: Leverage Hydra for configuration management, enabling flexible and dynamic configuration of transformations.
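
To make the dependency-management idea concrete, here is a minimal sketch of how features could be ordered so that each one is computed after the features it depends on. It uses the standard library's `graphlib` and a hypothetical dependency map; it is an illustration under those assumptions, not a description of how Feature Fabrica resolves dependencies internally.

```python
from graphlib import TopologicalSorter

# Hypothetical dependency map: feature name -> names it depends on.
dependencies = {
    "feature_a": [],
    "feature_b": [],
    "feature_c": ["feature_a", "feature_b"],
}


def evaluation_order(deps: dict[str, list[str]]) -> list[str]:
    """Return feature names so that every feature follows its dependencies."""
    # TopologicalSorter expects node -> predecessors and raises
    # graphlib.CycleError if the dependencies form a cycle.
    return list(TopologicalSorter(deps).static_order())


order = evaluation_order(dependencies)
print(order)  # raw features come first, feature_c comes last
```

A topological order also surfaces circular definitions early, since a cycle makes the ordering impossible.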

______________________________________________________________________

## 🛠️ Quick Start

### Installation

To install **Feature Fabrica**, simply run:

```bash
pip install feature-fabrica
```

### **Defining Features in YAML**

Features are defined in a YAML file. Here's an example:

```yaml
feature_a:
  description: "Raw feature A"
  data_type: "float32"

feature_b:
  description: "Raw feature B"
  data_type: "float32"

feature_c:
  description: "Derived feature C"
  data_type: "float32"
  dependencies: ["feature_a", "feature_b"]
  transformation:
    sum_fn:
      _target_: feature_fabrica.transform.SumReduce
      iterable: ["feature_a", "feature_b"]
    scale_feature:
      _target_: feature_fabrica.transform.ScaleFeature
      factor: 0.5

```
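
The `_target_` keys above follow the Hydra convention of naming a class by its dotted import path, with the remaining keys passed as constructor arguments. The sketch below imitates that idea with the standard library only. The `instantiate` helper here is a simplified, hypothetical stand-in (Hydra ships its own `hydra.utils.instantiate` with many more features), and `fractions.Fraction` is used as the target purely so the example runs without extra packages.

```python
import importlib
from typing import Any


def instantiate(config: dict[str, Any]) -> Any:
    """Build an object from a mapping that carries a dotted `_target_` path."""
    params = dict(config)  # copy so the caller's config is left untouched
    module_path, _, class_name = params.pop("_target_").rpartition(".")
    cls = getattr(importlib.import_module(module_path), class_name)
    # Every remaining key becomes a keyword argument of the constructor.
    return cls(**params)


# Stand-in target from the standard library (an assumption for illustration).
half = instantiate(
    {"_target_": "fractions.Fraction", "numerator": 1, "denominator": 2}
)
print(half)  # 1/2
```

Read this way, the `scale_feature` entry in the YAML amounts to constructing `ScaleFeature` with `factor=0.5`.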

### **Creating and Using Transformations**

You can define custom transformations by subclassing the `Transformation` class and implementing its `execute` method:

```python
import numpy as np
from beartype import beartype
from feature_fabrica.transform import Transformation
from feature_fabrica.transform.utils import NumericArray, NumericValue


class ScaleFeature(Transformation):
    """Multiply the incoming value or array by a constant factor."""

    def __init__(self, factor: float):
        super().__init__()
        self.factor = factor

    # beartype checks the argument and return types when execute is called.
    @beartype
    def execute(self, data: NumericArray | NumericValue) -> NumericArray | NumericValue:
        return np.multiply(data, self.factor)
```
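
The following sketch shows the general pattern behind chaining such classes: each step exposes `execute`, and the output of one step feeds the next. The `Transformation` base class, `SumValues`, `ScaleValue`, and `run_chain` below are hypothetical stand-ins written for this illustration so it runs without the library; they are not Feature Fabrica's own classes.

```python
from abc import ABC, abstractmethod


class Transformation(ABC):
    """Stand-in base class for illustration; NOT the library's implementation."""

    @abstractmethod
    def execute(self, data):
        ...


class SumValues(Transformation):
    """Hypothetical step: add up a list of numbers."""

    def execute(self, data):
        return sum(data)


class ScaleValue(Transformation):
    """Hypothetical step: multiply a number by a constant factor."""

    def __init__(self, factor: float):
        self.factor = factor

    def execute(self, data):
        return data * self.factor


def run_chain(steps, data):
    """Feed the output of each step into the next one."""
    for step in steps:
        data = step.execute(data)
    return data


result = run_chain([SumValues(), ScaleValue(factor=0.5)], [10.0, 20.0])
print(result)  # 15.0
```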

### **Compiling and Executing Features**

To compile and execute features:

```python
import numpy as np
from feature_fabrica.core import FeatureManager

data = {
    "feature_a": np.array([10.0], dtype=np.float32),
    "feature_b": np.array([20.0], dtype=np.float32),
}
feature_manager = FeatureManager(
    config_path="../examples", config_name="basic_features"
)
results = feature_manager.compute_features(data)
print(results["feature_c"])  # 0.5 * (10 + 20) = 15.0
print(results.feature_c)  # same value, read through attribute access
```

### Visualize Features and Dependencies

Track and trace transformation chains:

```python
import numpy as np
from feature_fabrica.core import FeatureManager

data = {
    "feature_a": np.array([10.0], dtype=np.float32),
    "feature_b": np.array([20.0], dtype=np.float32),
}
feature_manager = FeatureManager(
    config_path="../examples", config_name="basic_features"
)
results = feature_manager.compute_features(data)
print(feature_manager.features.feature_c.get_transformation_chain())
# Transformation Chain: (Transformation: sum_fn, Value: 30.0 Time taken: 9.5367431640625e-07 seconds) -> (Transformation: scale_feature, Value: 15.0, Time taken:  9.5367431640625e-07 seconds)
```
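
As a rough picture of what such a trace involves, the sketch below applies named steps in order and records each intermediate value together with its duration. The `run_traced` helper and the step functions are hypothetical and use only the standard library; the library's own bookkeeping may differ.

```python
import time


def run_traced(steps, data):
    """Apply (name, function) steps in order, recording value and duration."""
    trace = []
    for name, fn in steps:
        start = time.perf_counter()
        data = fn(data)
        trace.append((name, data, time.perf_counter() - start))
    return data, trace


# Hypothetical steps mirroring sum_fn and scale_feature from the YAML example.
steps = [("sum_fn", sum), ("scale_feature", lambda total: total * 0.5)]
value, trace = run_traced(steps, [10.0, 20.0])

chain = " -> ".join(
    f"(Transformation: {name}, Value: {val}, Time taken: {secs:.2e} seconds)"
    for name, val, secs in trace
)
print(chain)
```

Keeping the intermediate values alongside the timings makes it easier to see which step produced an unexpected result.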

Visualize dependencies:

```python
from feature_fabrica.core import FeatureManager

feature_manager = FeatureManager(
    config_path="../examples", config_name="basic_features"
)
feature_manager.get_visual_dependency_graph()
```

![image.png](media/example.png)
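
For readers who want to inspect a dependency graph as text, one possible approach is to emit Graphviz DOT from a dependency map. The `to_dot` helper and the map below are assumptions made for this sketch; how `get_visual_dependency_graph` renders its output is not covered here.

```python
def to_dot(deps: dict[str, list[str]]) -> str:
    """Render a feature -> dependencies map as Graphviz DOT text."""
    lines = ["digraph features {"]
    for feature, parents in deps.items():
        lines.append(f'    "{feature}";')
        for parent in parents:
            # Edges point from a dependency to the feature derived from it.
            lines.append(f'    "{parent}" -> "{feature}";')
    lines.append("}")
    return "\n".join(lines)


# Hypothetical map matching the YAML example.
dot = to_dot(
    {
        "feature_a": [],
        "feature_b": [],
        "feature_c": ["feature_a", "feature_b"],
    }
)
print(dot)
```

The resulting text can be pasted into any Graphviz-compatible viewer.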

## **Contributing**

We welcome contributions to **Feature Fabrica**! If you have ideas for new features or improvements, or would like to report an issue, feel free to open a pull request or an issue on GitHub.

### How to Contribute

1. **Fork** the repository to your own GitHub account.
2. **Clone** your fork locally.
3. **Create a new branch** for your feature or fix.
4. **Commit your changes** with a clear and concise message.
5. **Push** the branch to your fork.
6. **Open a pull request** from your fork to the original repository.

We look forward to your contributions! 😄


            

Raw data

            {
    "_id": null,
    "home_page": null,
    "name": "feature-fabrica",
    "maintainer": null,
    "docs_url": null,
    "requires_python": "<4.0,>=3.10",
    "maintainer_email": null,
    "keywords": null,
    "author": "Chingis Oinar",
    "author_email": "chingisoinar@gmail.com",
    "download_url": "https://files.pythonhosted.org/packages/cd/e9/e0326a0198014801fc0388f3b367cd7eb52eb5106f53d65573b4cff2d39c/feature_fabrica-1.3.1.tar.gz",
    "platform": null,
    "bugtrack_url": null,
    "license": null,
    "summary": "Open-source Python library designed to improve engineering practices and transparency in feature engineering.",
    "version": "1.3.1",
    "project_urls": null,
    "split_keywords": [],
    "urls": [
        {
            "comment_text": "",
            "digests": {
                "blake2b_256": "9a32e2e697191e40d0f605bf1b4eea2781177af9246bdf4d91cc027f6819fa0e",
                "md5": "b427be1a85898d628b164930000d3ac7",
                "sha256": "575271d1b25a345602414cfe11cc3fddab2c857f62330899a64325c4ef918458"
            },
            "downloads": -1,
            "filename": "feature_fabrica-1.3.1-py3-none-any.whl",
            "has_sig": false,
            "md5_digest": "b427be1a85898d628b164930000d3ac7",
            "packagetype": "bdist_wheel",
            "python_version": "py3",
            "requires_python": "<4.0,>=3.10",
            "size": 25417,
            "upload_time": "2024-09-22T07:02:31",
            "upload_time_iso_8601": "2024-09-22T07:02:31.978805Z",
            "url": "https://files.pythonhosted.org/packages/9a/32/e2e697191e40d0f605bf1b4eea2781177af9246bdf4d91cc027f6819fa0e/feature_fabrica-1.3.1-py3-none-any.whl",
            "yanked": false,
            "yanked_reason": null
        },
        {
            "comment_text": "",
            "digests": {
                "blake2b_256": "cde9e0326a0198014801fc0388f3b367cd7eb52eb5106f53d65573b4cff2d39c",
                "md5": "0c6effa605107ce5491d64445bd4151f",
                "sha256": "fc33f7ecfd00e318c94acd52aab6cc7ebe505f62d7093922e622293de8c6d817"
            },
            "downloads": -1,
            "filename": "feature_fabrica-1.3.1.tar.gz",
            "has_sig": false,
            "md5_digest": "0c6effa605107ce5491d64445bd4151f",
            "packagetype": "sdist",
            "python_version": "source",
            "requires_python": "<4.0,>=3.10",
            "size": 21489,
            "upload_time": "2024-09-22T07:02:33",
            "upload_time_iso_8601": "2024-09-22T07:02:33.962951Z",
            "url": "https://files.pythonhosted.org/packages/cd/e9/e0326a0198014801fc0388f3b367cd7eb52eb5106f53d65573b4cff2d39c/feature_fabrica-1.3.1.tar.gz",
            "yanked": false,
            "yanked_reason": null
        }
    ],
    "upload_time": "2024-09-22 07:02:33",
    "github": false,
    "gitlab": false,
    "bitbucket": false,
    "codeberg": false,
    "lcname": "feature-fabrica"
}
        