FeatureRefiner


| Field | Value |
| --- | --- |
| Name | FeatureRefiner |
| Version | 1.1 |
| Home page | https://github.com/ambilynanjilath/FeatureRefiner |
| Summary | A no-code solution for performing data transformations like imputation, encoding, scaling, and feature creation, with an intuitive interface for interactive DataFrame manipulation and easy CSV export. |
| Upload time | 2024-09-11 12:15:41 |
| Maintainer | None |
| Docs URL | None |
| Author | Ambily Biju |
| Requires Python | >=3.8 |
| License | None |
| Keywords | data transformation, imputation, encoding, scaling, feature creation, machine learning, data preprocessing, pandas, scikit-learn, feature engineering, data science, python |
| Requirements | No requirements were recorded. |
| Travis-CI | No Travis. |
| Coveralls test coverage | No coveralls. |
# FeatureRefiner

<img src="FeatureRefiner/scripts/logo.jpeg" alt="FeatureRefiner logo" width="200"/>

![PyPI](https://img.shields.io/pypi/v/FeatureRefiner?color=blue&label=pypi&logo=pypi)
![License](https://img.shields.io/github/license/ambilynanjilath/FeatureRefiner)
![Python Versions](https://img.shields.io/pypi/pyversions/FeatureRefiner)

FeatureRefiner is a Python package for feature engineering that provides a set of tools for data transformation, imputation, encoding, scaling, and feature creation. This package comes with an interactive Streamlit interface that allows users to easily apply these transformations to their datasets.

## Features

- Create polynomial features
- Handle and extract date-time features
- Encode categorical data using various encoding techniques
- Impute missing values with different strategies
- Normalize and scale data using multiple scaling methods
- Interactive Streamlit interface for easy usage

## Installation

It's recommended to install `FeatureRefiner` in a virtual environment to manage dependencies effectively and avoid conflicts with other projects.

### 1. Set Up a Virtual Environment

**For Python 3.8 and above (the minimum version FeatureRefiner supports):**

1. **Create a Virtual Environment:**

    ```bash
    python -m venv env
    ```

    Replace `env` with your preferred name for the virtual environment.

2. **Activate the Virtual Environment:**

    - **On Windows:**
      ```bash
      env\Scripts\activate
      ```

    - **On macOS/Linux:**
      ```bash
      source env/bin/activate
      ```

### 2. Install FeatureRefiner

Once the virtual environment is activated, you can install `FeatureRefiner` using `pip`:

```bash
pip install FeatureRefiner
```

## Quick Start
After installing the package, run the FeatureRefiner interface using:

```bash
run-FeatureRefiner
```
This will open a Streamlit app where you can upload your dataset and start applying transformations.

## Usage

### Command-Line Interface
To launch the Streamlit app, simply use the command:
```bash
run-FeatureRefiner
```
### Importing Modules in Python
You can also use FeatureRefiner modules directly in your Python scripts:
```python
from FeatureRefiner.imputation import MissingValueImputation
from FeatureRefiner.encoding import FeatureEncoding
from FeatureRefiner.scaling import DataNormalize
from FeatureRefiner.date_time_features import DateTimeExtractor
from FeatureRefiner.create_features import PolynomialFeaturesTransformer
```

## Modules Overview

The `FeatureRefiner` package provides several modules for different data transformation tasks:

- **create_features.py** - Generate polynomial features.
- **date_time_features.py** - Extract and handle date-time related features.
- **encoding.py** - Encode categorical features using techniques like Label Encoding and One-Hot Encoding.
- **imputation.py** - Handle missing values with multiple imputation strategies.
- **scaling.py** - Normalize and scale numerical features.

Each of these modules is described in detail below.

### 1. `create_features.py`
The `create_features.py` module provides functionality to generate polynomial features from numeric columns in a pandas DataFrame. The `PolynomialFeaturesTransformer` class supports creating polynomial combinations of the input features up to a specified degree, enhancing the feature set for predictive modeling.

#### Key Features
- **Degree Specification:** Allows setting the degree of polynomial features during initialization or transformation.
- **Numeric Column Filtering:** Automatically filters and processes only the numeric columns in the DataFrame.
- **Error Handling:** Provides robust error handling for invalid inputs, including non-numeric data and improper degree values.

#### Supported Transformations
- **Polynomial Feature Creation:** Generates polynomial combinations of input features based on the specified degree.


#### Example Usage:
```python
from FeatureRefiner.create_features import PolynomialFeaturesTransformer
import pandas as pd

# Example DataFrame
data = {'feature1': [1, 2, 3], 'feature2': [4, 5, 6]}
df = pd.DataFrame(data)

# Initialize the PolynomialFeaturesTransformer object
transformer = PolynomialFeaturesTransformer(degree=2)

# Transform the DataFrame to include polynomial features
transformed_df = transformer.fit_transform(df)

print(transformed_df)
```
#### Methods


- **`__init__(degree)`**: Initializes the transformer with the specified degree of polynomial features.

- **`fit_transform(df, degree=None)`**: Fits the transformer to the numeric columns of the DataFrame and generates polynomial features. Optionally, updates the polynomial degree.

- **`_validate_input(df)`**: Validates the input DataFrame, ensuring it contains only numeric columns and no categorical data.
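
To make "polynomial combinations up to a specified degree" concrete, here is a minimal, dependency-free sketch of the expansion itself. It is not FeatureRefiner's implementation: the helper name `polynomial_features` and the `*`-joined column labels are assumptions made for illustration, and the package's actual output columns (for example, whether a constant bias column is included) may differ.

```python
from itertools import combinations_with_replacement
from math import prod


def polynomial_features(rows, names, degree=2):
    """Expand each row into every monomial of the inputs up to `degree`.

    Hypothetical helper for illustration only; no bias (constant) column is added.
    """
    if not isinstance(degree, int) or degree < 1:
        raise ValueError("degree must be a positive integer")

    # Each term is a tuple of column indices, e.g. (0, 1) means names[0] * names[1].
    terms = [
        combo
        for d in range(1, degree + 1)
        for combo in combinations_with_replacement(range(len(names)), d)
    ]
    header = ["*".join(names[i] for i in combo) for combo in terms]
    expanded = [[prod(row[i] for i in combo) for combo in terms] for row in rows]
    return header, expanded


header, expanded = polynomial_features(
    [[1, 4], [2, 5], [3, 6]], ["feature1", "feature2"], degree=2
)
print(header)
print(expanded[1])  # [2, 5, 4, 10, 25]
```

With two input columns and degree 2, each row grows from two values to five: the originals, both squares, and the cross term.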

### 2. `date_time_features.py`

The `date_time_features.py` module provides functionality to extract and parse datetime components from a specified column in a pandas DataFrame. The `DateTimeExtractor` class supports extracting year, month, day, and day of the week from a datetime column.

#### Key Features
- **Date Parsing:** Handles multiple date formats for parsing datetime data.
- **Component Extraction:** Extracts year, month, day, and day of the week from a datetime column.

#### Supported Extractors
- **Year Extraction:** Adds a new column `year` with the extracted year.
- **Month Extraction:** Adds a new column `month` with the extracted month.
- **Day Extraction:** Adds a new column `day` with the extracted day.
- **Day of Week Extraction:** Adds a new column `day_of_week` with the extracted day of the week.

#### Example Usage
```python
from FeatureRefiner.date_time_features import DateTimeExtractor
import pandas as pd

# Example DataFrame
data = {'date': ['2024-01-01', '2024-02-14', '2024-03-21']}
df = pd.DataFrame(data)

# Initialize the DateTimeExtractor object
extractor = DateTimeExtractor(df, datetime_col='date')

# Extract all datetime components
result_df = extractor.extract_all()

print(result_df)
```
#### Methods

- **`_parse_date(date_str)`**: Tries to parse a date string using multiple formats.

- **`extract_year()`**: Extracts the year from the datetime column and adds it as a new column named `year`.

- **`extract_month()`**: Extracts the month from the datetime column and adds it as a new column named `month`.

- **`extract_day()`**: Extracts the day from the datetime column and adds it as a new column named `day`.

- **`extract_day_of_week()`**: Extracts the day of the week from the datetime column and adds it as a new column named `day_of_week`.

- **`extract_all()`**: Extracts year, month, day, and day of the week from the datetime column and adds them as new columns.
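
The parse-then-extract flow described above can be sketched with the standard library alone. This is an illustration under stated assumptions, not the package's code: the list of candidate formats, the helper names, and the choice to report the day of the week as an English name (rather than an integer) are all hypothetical.

```python
from datetime import datetime

# Candidate formats tried in order; this particular list is an assumption.
FORMATS = ("%Y-%m-%d", "%d/%m/%Y", "%Y/%m/%d")
DAY_NAMES = ("Monday", "Tuesday", "Wednesday", "Thursday",
             "Friday", "Saturday", "Sunday")


def parse_date(text):
    """Return the first successful parse of `text`, or raise ValueError."""
    for fmt in FORMATS:
        try:
            return datetime.strptime(text, fmt)
        except ValueError:
            continue
    raise ValueError(f"unrecognized date format: {text!r}")


def extract_all(dates):
    """Build one record per date with year, month, day, and day_of_week."""
    records = []
    for text in dates:
        parsed = parse_date(text)
        records.append({
            "date": text,
            "year": parsed.year,
            "month": parsed.month,
            "day": parsed.day,
            "day_of_week": DAY_NAMES[parsed.weekday()],  # weekday(): Monday == 0
        })
    return records


records = extract_all(["2024-01-01", "14/02/2024", "2024/03/21"])
for record in records:
    print(record)
```

Trying formats in a fixed order means an ambiguous string such as `01/02/2024` is resolved by whichever format is listed first, so the order is part of the behavior.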

### 3. `encoding.py`
The `encoding.py` module provides functionality to encode categorical features in a pandas DataFrame using Label Encoding and One-Hot Encoding. The `FeatureEncoding` class in this module offers methods for converting categorical data into a numerical format suitable for machine learning algorithms.

#### Key Features:
- **Label Encoding**: Converts categorical text data into numerical data by assigning a unique integer to each category.
- **One-Hot Encoding**: Converts categorical data into a binary matrix, creating a new column for each unique category.

#### Supported Encoders:
- **LabelEncoder**: Converts each category to a unique integer.
- **OneHotEncoder**: Converts categorical data into a binary matrix, with an option to drop the first category to avoid multicollinearity.

#### Example Usage:
```python
from FeatureRefiner.encoding import FeatureEncoding
import pandas as pd

# Example DataFrame
data = {'Color': ['Red', 'Blue', 'Green'], 'Size': ['S', 'M', 'L']}
df = pd.DataFrame(data)

# Initialize the FeatureEncoding object
encoder = FeatureEncoding(df)

# Apply Label Encoding
df_label_encoded = encoder.label_encode(['Color'])

# Apply One-Hot Encoding
df_one_hot_encoded = encoder.one_hot_encode(['Size'])

print(df_label_encoded)
print(df_one_hot_encoded)
```
#### Methods:
- **`label_encode(columns: list) -> pd.DataFrame`**: Apply Label Encoding to the specified columns.
- **`one_hot_encode(columns: list) -> pd.DataFrame`**: Apply One-Hot Encoding to the specified columns, concatenate the encoded columns with the original DataFrame, and drop the original columns.
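
The difference between the two encodings is easiest to see on a small column. The following is a plain-Python sketch of what each one computes, not FeatureRefiner's code; the function names, the `drop_first` parameter name, and the `Column_Category` naming of the output columns are assumptions. Categories are sorted before codes are assigned, which matches how scikit-learn's `LabelEncoder` orders its classes.

```python
def label_encode(values):
    """Map each category to an integer code, with categories in sorted order."""
    categories = sorted(set(values))
    mapping = {category: code for code, category in enumerate(categories)}
    return [mapping[v] for v in values], mapping


def one_hot_encode(values, name, drop_first=False):
    """Return one 0/1 indicator column per category, keyed by 'name_category'."""
    categories = sorted(set(values))
    if drop_first:
        # Dropping one category removes the redundant column that causes
        # perfect multicollinearity in linear models.
        categories = categories[1:]
    return {
        f"{name}_{category}": [1 if v == category else 0 for v in values]
        for category in categories
    }


codes, mapping = label_encode(["Red", "Blue", "Green"])
print(codes)    # [2, 0, 1]
print(mapping)  # {'Blue': 0, 'Green': 1, 'Red': 2}

size_columns = one_hot_encode(["S", "M", "L"], "Size")
print(size_columns)
```

Label encoding keeps a single column but implies an ordering between categories, while one-hot encoding avoids that implied order at the cost of one column per category.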

### 4. `imputation.py`
The `imputation.py` module provides functionality for handling missing values in a pandas DataFrame using various imputation strategies. The `MissingValueImputation` class in this module offers methods to fill missing values based on the specified strategies.

#### Key Features:
- **Flexible Imputation**: Allows for multiple imputation strategies such as mean, median, mode, or custom values.
- **Column-Specific Strategies**: Supports different strategies for different columns.
- **Fit and Transform**: Includes methods for fitting the imputation model and transforming data in a single step.

#### Supported Strategies:
- **Mean**: Fills missing values with the mean of the column (only applicable to numeric columns).
- **Median**: Fills missing values with the median of the column (only applicable to numeric columns).
- **Mode**: Fills missing values with the mode of the column.
- **Custom Values**: Allows specifying a custom value for imputation.

#### Example Usage:

```python
from FeatureRefiner.imputation import MissingValueImputation
import numpy as np
import pandas as pd

# Example DataFrame
data = {'A': [1, 2, np.nan, 4, 5], 'B': [10, np.nan, 30, np.nan, 50]}
df = pd.DataFrame(data)

# Define imputation strategies
strategies = {
    'A': 'mean',
    'B': 25
}

# Initialize the MissingValueImputation object
imputer = MissingValueImputation(strategies=strategies)

# Fit and transform the DataFrame
imputed_df = imputer.fit_transform(df)

print(imputed_df)
```
#### Methods:
- **`_compute_fill_value(df: pd.DataFrame, column: str, strategy: Union[str, int, float]) -> Union[float, str]`**: Computes the fill value based on the imputation strategy for a given column.
- **`fit(df: pd.DataFrame) -> 'MissingValueImputation'`**: Computes the fill values for missing data based on the provided strategies.
- **`transform(df: pd.DataFrame) -> pd.DataFrame`**: Applies the imputation to the DataFrame using the computed fill values.
- **`fit_transform(df: pd.DataFrame) -> pd.DataFrame`**: Computes the fill values and applies the imputation to the DataFrame in one step.
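
The split between `fit` (compute fill values) and `transform` (apply them) can be sketched without pandas. The class below is a hypothetical stand-in named `ColumnImputer`, operating on a dict of column lists; it illustrates the pattern and the four strategies, and is not the package's `MissingValueImputation` implementation.

```python
import math
from statistics import mean, median, mode


def is_missing(value):
    """Treat None and float NaN as missing."""
    return value is None or (isinstance(value, float) and math.isnan(value))


class ColumnImputer:
    """Hypothetical per-column imputer over a dict of column lists."""

    def __init__(self, strategies):
        self.strategies = strategies  # e.g. {"A": "mean", "B": 25}
        self.fill_values = {}

    def fit(self, table):
        # Compute one fill value per configured column from the observed values.
        for column, strategy in self.strategies.items():
            observed = [v for v in table[column] if not is_missing(v)]
            if strategy == "mean":
                fill = mean(observed)
            elif strategy == "median":
                fill = median(observed)
            elif strategy == "mode":
                fill = mode(observed)
            else:
                fill = strategy  # anything else is used as a custom constant
            self.fill_values[column] = fill
        return self

    def transform(self, table):
        # Replace missing entries using the fill values learned in fit().
        result = {}
        for column, values in table.items():
            fill = self.fill_values.get(column)
            if column in self.fill_values:
                values = [fill if is_missing(v) else v for v in values]
            result[column] = list(values)
        return result

    def fit_transform(self, table):
        return self.fit(table).transform(table)


nan = float("nan")
table = {"A": [1, 2, nan, 4, 5], "B": [10, nan, 30, nan, 50]}
imputed = ColumnImputer({"A": "mean", "B": 25}).fit_transform(table)
print(imputed)
```

Keeping `fit` separate matters when the fill values should come from training data and then be reused unchanged on new data.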

### 5. `scaling.py`
The `scaling.py` module scales and normalizes numerical data in a pandas DataFrame. Its `DataNormalize` class wraps several `scikit-learn` scalers, including `StandardScaler`, `MinMaxScaler`, and `RobustScaler`, and applies them either to the whole DataFrame or to selected columns.

#### Key Features:
- **General Data Scaling**: Scales all numerical columns in the DataFrame.
- **Column-Specific Scaling**: Allows scaling specific columns within the DataFrame.
- **Multiple Scalers Supported**: Supports different scaling methods such as standardization, normalization, robust scaling, and more.

#### Supported Scalers:
- **StandardScaler** (`standard`): Scales data to have zero mean and unit variance.
- **MinMaxScaler** (`minmax`): Scales data to a specified range (default is 0 to 1).
- **RobustScaler** (`robust`): Scales data using statistics that are robust to outliers.
- **MaxAbsScaler** (`maxabs`): Scales data to the range [-1, 1] based on the maximum absolute value.
- **Normalizer** (`l2`): Scales each sample individually to have unit norm (L2 norm).
- **QuantileTransformer** (`quantile`): Transforms features to follow a uniform or normal distribution.
- **PowerTransformer** (`power`): Applies a power transformation to make data more Gaussian-like.

#### Example Usage:

```python
from FeatureRefiner.scaling import DataNormalize
import pandas as pd

# Example DataFrame
data = {'A': [1, 2, 3, 4, 5], 'B': [10, 20, 30, 40, 50]}
df = pd.DataFrame(data)

# Initialize the DataNormalize object
scaler = DataNormalize()

# Scale the entire DataFrame using MinMaxScaler
scaled_df = scaler.scale(df, method='minmax')

print(scaled_df)
```
#### Methods:
- **`scale(df: pd.DataFrame, method: str = 'standard') -> pd.DataFrame`**: Scales the entire DataFrame using the specified method.
- **`scale_columns(df: pd.DataFrame, columns: list, method: str = 'standard') -> pd.DataFrame`**: Scales specific columns of the DataFrame using the specified method.
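
For reference, the arithmetic behind four of the methods listed above (`standard`, `minmax`, `maxabs`, `l2`) is short enough to write out. This is a dependency-free sketch with hypothetical function names, not the package's code; FeatureRefiner delegates to scikit-learn, whose `StandardScaler` uses the population standard deviation, as `pstdev` does here.

```python
import math
from statistics import mean, pstdev


def standard_scale(column):
    """Center to zero mean and divide by the population standard deviation."""
    mu = mean(column)
    sigma = pstdev(column) or 1.0  # constant column: avoid division by zero
    return [(x - mu) / sigma for x in column]


def minmax_scale(column, low=0.0, high=1.0):
    """Map the column linearly so its minimum is `low` and its maximum is `high`."""
    lo, hi = min(column), max(column)
    span = (hi - lo) or 1.0
    return [low + (x - lo) / span * (high - low) for x in column]


def maxabs_scale(column):
    """Divide by the largest absolute value, giving results in [-1, 1]."""
    peak = max(abs(x) for x in column) or 1.0
    return [x / peak for x in column]


def l2_normalize(row):
    """Scale one sample (a row, not a column) to unit Euclidean length."""
    norm = math.sqrt(sum(x * x for x in row)) or 1.0
    return [x / norm for x in row]


a = [1, 2, 3, 4, 5]
print(minmax_scale(a))  # [0.0, 0.25, 0.5, 0.75, 1.0]
print(standard_scale(a))
print(maxabs_scale([-10, 5, 2]))
print(l2_normalize([3, 4]))
```

Note that the first three functions work column by column, while L2 normalization works row by row, so it changes the relationship between features within each sample.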


## Requirements

FeatureRefiner requires the following:

- Python >= 3.8
- Streamlit
- Pandas
- NumPy
- scikit-learn
- st-aggrid

For more detailed information, see the `requirements.txt` file.

## Contributing

We welcome contributions! Please read our Contributing Guidelines for more details.

## License

This project is licensed under the MIT License - see the [LICENSE.md](LICENSE.md) file for details.

## Acknowledgements

Special thanks to all the libraries and frameworks that have helped in developing this package, including:

- [Streamlit](https://www.streamlit.io/)
- [Pandas](https://pandas.pydata.org/)
- [NumPy](https://numpy.org/)
- [scikit-learn](https://scikit-learn.org/stable/)



            
