kalman-labs


- Name: kalman-labs
- Version: 1.0.322
- Summary: The Global Kalman Package
- Upload time: 2023-07-07 17:07:07
- Author: Aditya
- Requirements: No requirements were recorded.
# kalman-labs

The `kalman-labs` package provides tools for audio signal processing and machine learning. It covers audio feature extraction, machine learning model training, and deep learning model training.

## Features

### 1. Audio Feature Extraction

The `audio_features` module extracts features from audio files, including MFCC, Mel spectrogram, and chromagram features, among others. These features can be used as inputs to machine learning and deep learning models.

Example usage:

```python
from kalman.audio_features import extract_features

audio_file = 'path/to/audio/file.wav'
features = extract_features(audio_file)

# The 'features' variable contains a dictionary of extracted audio features
```
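To make the "dictionary of features" idea concrete, here is a minimal standard-library sketch. It does not reproduce the package's MFCC, Mel spectrogram, or chromagram extraction; it swaps in two much simpler time-domain features (RMS energy and zero-crossing rate) purely to illustrate the shape of such a result. The function name `simple_time_domain_features` is hypothetical and not part of `kalman-labs`.

```python
import math


def simple_time_domain_features(samples):
    """Return a dictionary with two basic features for a list of float samples."""
    if not samples:
        raise ValueError("samples must not be empty")
    # Root-mean-square energy of the signal.
    rms = math.sqrt(sum(s * s for s in samples) / len(samples))
    # Fraction of adjacent sample pairs whose sign differs.
    crossings = sum(
        1 for a, b in zip(samples, samples[1:]) if (a >= 0) != (b >= 0)
    )
    zcr = crossings / (len(samples) - 1) if len(samples) > 1 else 0.0
    return {"rms": rms, "zero_crossing_rate": zcr}


# One second of a 440 Hz sine wave sampled at 8 kHz (synthetic test signal).
sample_rate = 8000
signal = [math.sin(2 * math.pi * 440 * n / sample_rate) for n in range(sample_rate)]
features = simple_time_domain_features(signal)
print(features)
```

A real extractor would return spectral features instead, but the calling pattern (one audio input, one dictionary of named features) stays the same.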

- `generate_feature_file(folder_path, scaler=None, label_folder_map=None)`: Generates a feature file by extracting audio features from the audio files in the specified folder. It returns a DataFrame containing the extracted features and a dictionary mapping labels to folder names.

  **Parameters:**
  
  - `folder_path` (str): The path to the folder containing the audio files. This folder should have subfolders representing different classes, and each subfolder should contain audio files corresponding to that class.
  
  - `scaler` (str or None, optional): The scaler to use for feature normalization. Supported options are "standard" and "minmax". If set to None, no scaling will be applied. Default is None.
  
  - `label_folder_map` (dict or None, optional): A dictionary mapping labels to folder names. This allows you to override the default folder names as class labels. If set to None, the folder names will be used as labels. Default is None.
  
  **Example:**
  
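  ```python
  # generate_feature_file is assumed to be imported already;
  # the source does not show which module it lives in.
  folder_path = "audio_data"
  scaler = "standard"
  label_folder_map = {"class1": "folder1", "class2": "folder2"}
  
  audio_df, label_name_dict = generate_feature_file(folder_path, scaler, label_folder_map)
  ```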
  
  In this example, audio features are extracted from the audio files in the "audio_data" folder and normalized with the "standard" scaler. Labels are mapped according to the provided `label_folder_map`.
  
Note: The `audio_feature_extraction` parameter and the `generate_feature_file` function apply to both `train_ml_model` and `train_dl_model`.
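The folder layout that `generate_feature_file` expects (one subfolder per class) can be sketched with the standard library alone. The helper `collect_labeled_files` below is hypothetical and only illustrates how folder names, or a `label_folder_map` override, could turn into class labels; it is not the package's implementation and performs no feature extraction.

```python
import tempfile
from pathlib import Path


def collect_labeled_files(folder_path, label_folder_map=None):
    """Pair every .wav file under folder_path with a class label.

    Each subfolder is treated as one class. label_folder_map maps
    label -> folder name; without it the folder name is the label.
    """
    folder_to_label = {
        folder: label for label, folder in (label_folder_map or {}).items()
    }
    rows = []
    for class_dir in sorted(p for p in Path(folder_path).iterdir() if p.is_dir()):
        label = folder_to_label.get(class_dir.name, class_dir.name)
        for audio_file in sorted(class_dir.glob("*.wav")):
            rows.append((audio_file.name, label))
    return rows


# Build a throwaway folder tree to demonstrate the expected layout.
with tempfile.TemporaryDirectory() as root:
    layout = {"folder1": ["a.wav", "b.wav"], "folder2": ["c.wav"]}
    for folder, names in layout.items():
        (Path(root) / folder).mkdir()
        for name in names:
            (Path(root) / folder / name).touch()

    default_rows = collect_labeled_files(root)
    mapped_rows = collect_labeled_files(
        root, {"class1": "folder1", "class2": "folder2"}
    )

print(default_rows)
print(mapped_rows)
```

Without a map, the labels are `folder1` and `folder2`; with the map, the same files are labeled `class1` and `class2`.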

### 2. Machine Learning Model Training

The package can train machine learning models on audio data. Supported algorithms include Random Forest, Support Vector Machine (SVM), K-Nearest Neighbors (KNN), and Logistic Regression; the full list of options appears under `ml_model` below.

Example usage:

```python
from kalman.machine_learning_training import train_ml_model

ml_model = 'random_forest'
folder_path = 'path/to/audio/files'  # Path to the folder containing audio files
x_train = ...   # Provide the training data
y_train = ...   # Provide the training labels
x_test = ...   # Provide the testing data (optional)
y_test = ...   # Provide the testing labels (optional)

# Example usage with additional parameters:
undersampling = True
oversampling = 'smote'
scaler = 'standard'
label_folder_map = {'class_1': 'folder_1', 'class_2': 'folder_2'}

model_details, classification_report = train_ml_model(ml_model, folder_path=folder_path, x_train=x_train, y_train=y_train,
                                                     x_test=x_test, y_test=y_test, undersampling=undersampling,
                                                     oversampling=oversampling, scaler=scaler,
                                                     label_folder_map=label_folder_map)

# The resulting model_details dictionary contains information about the trained model
# The classification_report contains precision, recall, f1-score, and support for each class
```
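For readers unfamiliar with the metrics named in the comments above, the following sketch computes precision, recall, F1-score, and support per class from scratch. The function `per_class_report` and its dictionary layout are assumptions made for illustration; the exact structure of the report returned by `train_ml_model` is not documented here.

```python
from collections import Counter


def per_class_report(y_true, y_pred):
    """Compute precision, recall, F1 and support for every class in y_true."""
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")
    support = Counter(y_true)      # how often each class truly occurs
    pred_count = Counter(y_pred)   # how often each class was predicted
    true_pos = Counter(t for t, p in zip(y_true, y_pred) if t == p)
    report = {}
    for label in sorted(support):
        tp = true_pos[label]
        precision = tp / pred_count[label] if pred_count[label] else 0.0
        recall = tp / support[label]
        denom = precision + recall
        f1 = 2 * precision * recall / denom if denom else 0.0
        report[label] = {
            "precision": precision,
            "recall": recall,
            "f1-score": f1,
            "support": support[label],
        }
    return report


y_true = ["dog", "dog", "dog", "cat", "cat"]
y_pred = ["dog", "dog", "cat", "cat", "cat"]
report = per_class_report(y_true, y_pred)
for label, metrics in report.items():
    print(label, metrics)
```

Here one `dog` sample is misclassified as `cat`, so `dog` has perfect precision but a recall of 2/3, while `cat` has perfect recall but a precision of 2/3.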

**Parameter Descriptions:**

- `ml_model` (str): The machine learning model to train. Supported options are: "random_forest", "svm", "knn", "logistic_regression", "gradient_boosting", "adaboost", "xgboost".
- `folder_path` (str): Path to the folder containing audio files. This parameter should be used when the audio data is stored in separate files.
- `x_train` (array-like): Training data features. This should be a 2D array-like object.
- `y_train` (array-like): Training data labels. This should be a 1D array-like object.
- `x_test` (array-like, optional): Testing data features. This should be a 2D array-like object. (default: None)
- `y_test` (array-like, optional): Testing data labels. This should be a 1D array-like object. (default: None)
- `test_size` (float, optional): The proportion of the testing data when `x_test` and `y_test` are not provided. This parameter is used for splitting the training data into training and testing sets. (default: 0.2)
- `undersampling` (bool, optional): Whether to perform undersampling to balance the class distribution. (default: False)
- `oversampling` (str, optional): The oversampling technique to use. Supported options are: "smote", "adasyn". (default: None)
- `scaler` (str, optional): The scaler to apply to the data. Supported options are: "standard", "minmax", "robust". (default: None)
- `label_folder_map` (dict, optional): A mapping of class labels to folder names in the case of separate audio files. This is required when `folder_path` is used. (default: None)
- `testing` (bool, optional): Whether to perform testing and return evaluation results. (default: False)
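As an illustration of what the "standard" and "minmax" scaler options usually mean, here is a small standard-library sketch that scales a single feature column. The helper `scale_column` is hypothetical; how the package applies scaling internally is not stated in this README, and the "robust" option is left out of the sketch.

```python
import math


def scale_column(values, scaler=None):
    """Scale one feature column; scaler is "standard", "minmax", or None."""
    if scaler is None:
        return list(values)
    if scaler == "standard":
        mean = sum(values) / len(values)
        # Population standard deviation (divide by N, not N - 1).
        std = math.sqrt(sum((v - mean) ** 2 for v in values) / len(values))
        return [(v - mean) / std if std else 0.0 for v in values]
    if scaler == "minmax":
        lo, hi = min(values), max(values)
        return [(v - lo) / (hi - lo) if hi > lo else 0.0 for v in values]
    raise ValueError(f"unsupported scaler: {scaler!r}")


column = [2.0, 4.0, 6.0]
standard = scale_column(column, "standard")
minmax = scale_column(column, "minmax")
print(standard)
print(minmax)
```

Standard scaling centers the column at zero with unit variance, while min-max scaling maps it onto the range 0 to 1.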



### 3. Deep Learning Model Training

The package can train deep learning models on audio data. Supported architectures include Deep Neural Networks (DNN), Convolutional Neural Networks (CNN), Bidirectional LSTM, and Convolutional LSTM.

Example usage:

```python
from kalman.deep_learning_training import train_dl_model

dl_model = 'DNN'
x_train = ...   # Provide the training data
y_train = ...   # Provide the training labels
x_val = ...     # Provide the validation data
y_val = ...     # Provide the validation labels

# Example usage with additional parameters:
oversampling = 'smote'  # cannot be combined with undersampling=True
batch_size = 48
epochs = 100

# folder_path is omitted because x_train and y_train are passed directly;
# val_size is omitted because x_val and y_val are provided.
results = train_dl_model(dl_model, x_train=x_train, y_train=y_train,
                         x_val=x_val, y_val=y_val, oversampling=oversampling,
                         batch_size=batch_size, epochs=epochs)

# The resulting results dictionary contains evaluation metrics such as accuracy, precision, recall, and AUC
```

**Parameter Descriptions:**

- `dl_model` (str): The deep learning model to train. Supported options are: "DNN", "DNN-CNN", "DNN-BiLSTM", "DNN-convLSTM".
- `folder_path` (str): Path to the folder containing audio files. This parameter should be used when the audio data is stored in separate files.
- `x_train` (array-like): Training data features. This should be a 2D array-like object.
- `y_train` (array-like): Training data labels. This should be a 1D array-like object.
- `x_val` (array-like): Validation data features. This should be a 2D array-like object.
- `y_val` (array-like): Validation data labels. This should be a 1D array-like object.
- `val_size` (float, optional): The proportion of the validation data when `x_val` and `y_val` are not provided. (default: 0.3)
- `oversampling` (str, optional): The oversampling technique to use. Supported options are: "smote", "adasyn". (default: None)
- `undersampling` (bool, optional): Whether to perform undersampling to balance the class distribution. (default: False)
- `batch_size` (int, optional): The batch size for training the deep learning models. (default: 48)
- `epochs` (int, optional): The number of epochs for training the deep learning models. (default: 100)
- `testing` (bool, optional): Whether to perform testing and return evaluation results. (default: False)
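The `undersampling` option is described only as balancing the class distribution. One common way to do that is random undersampling, sketched below with the standard library. Whether the package uses this exact strategy is an assumption, and `random_undersample` is a hypothetical helper name.

```python
import random
from collections import Counter, defaultdict


def random_undersample(x, y, seed=0):
    """Randomly drop samples so every class keeps as many as the rarest class."""
    by_label = defaultdict(list)
    for index, label in enumerate(y):
        by_label[label].append(index)
    target = min(len(indices) for indices in by_label.values())
    rng = random.Random(seed)
    kept = []
    for indices in by_label.values():
        kept.extend(rng.sample(indices, target))
    kept.sort()  # preserve the original sample order
    return [x[i] for i in kept], [y[i] for i in kept]


x = [[0.1], [0.2], [0.3], [0.4], [0.5], [0.6]]
y = ["a", "a", "a", "a", "b", "b"]
x_balanced, y_balanced = random_undersample(x, y)
print(Counter(y_balanced))
```

After undersampling, both classes have two samples each. Oversampling techniques such as SMOTE take the opposite approach and synthesize extra minority-class samples instead of discarding data.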

**Parameter combinations:**
- `folder_path` must not be combined with `audio_feature_extraction`, `x_train`, and `y_train`. Provide the training data either through `folder_path` or through `audio_feature_extraction` together with `x_train` and `y_train`.
- `x_val` and `y_val` should be provided together. If not provided, the validation data will be split from the training data based on `val_size`.
- `oversampling` and `undersampling` cannot be enabled at the same time. Choose either oversampling or undersampling.
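The combination rules above can be expressed as a small pre-flight check. This is a sketch under assumptions: `check_dl_arguments` is a hypothetical helper, not part of the package, and it leaves out `audio_feature_extraction` because the README does not document that parameter's type.

```python
def check_dl_arguments(folder_path=None, x_train=None, y_train=None,
                       x_val=None, y_val=None, oversampling=None,
                       undersampling=False):
    """Raise ValueError if the arguments break the combination rules."""
    has_arrays = x_train is not None or y_train is not None
    if folder_path is not None and has_arrays:
        raise ValueError("pass either folder_path or x_train/y_train, not both")
    if folder_path is None and (x_train is None or y_train is None):
        raise ValueError("x_train and y_train must be provided together")
    if (x_val is None) != (y_val is None):
        raise ValueError("x_val and y_val must be provided together")
    if oversampling is not None and undersampling:
        raise ValueError("choose either oversampling or undersampling")


# A valid combination passes silently.
check_dl_arguments(x_train=[[0.0]], y_train=[0], oversampling="smote")

# An invalid combination is reported before any training starts.
try:
    check_dl_arguments(x_train=[[0.0]], y_train=[0],
                       oversampling="smote", undersampling=True)
except ValueError as error:
    print(error)
```

Running such a check before calling `train_dl_model` surfaces conflicting arguments early instead of partway through training.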

If the `testing` parameter is set to `True` in `train_dl_model`, the function performs testing by splitting `x_val` and `y_val` into 70% validation data and 30% testing data.
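A 70/30 split of the validation data can be sketched as follows. The helper `split_validation_for_testing` is hypothetical, and shuffling before the split is an assumption; the README does not say whether the package shuffles or stratifies.

```python
import random


def split_validation_for_testing(x_val, y_val, test_fraction=0.3, seed=0):
    """Split validation data into (x_val, y_val, x_test, y_test) parts."""
    if len(x_val) != len(y_val):
        raise ValueError("x_val and y_val must have the same length")
    indices = list(range(len(x_val)))
    random.Random(seed).shuffle(indices)  # shuffling is an assumption
    n_test = round(len(indices) * test_fraction)
    test_idx, keep_idx = indices[:n_test], indices[n_test:]
    return (
        [x_val[i] for i in keep_idx],
        [y_val[i] for i in keep_idx],
        [x_val[i] for i in test_idx],
        [y_val[i] for i in test_idx],
    )


x_val = [[float(i)] for i in range(10)]
y_val = [i % 2 for i in range(10)]
x_val_part, y_val_part, x_test, y_test = split_validation_for_testing(x_val, y_val)
print(len(x_val_part), len(x_test))
```

With ten validation samples, seven remain for validation and three are held out for testing.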


The evaluation results will be included in the `results` dictionary, which will contain metrics such as accuracy, precision, recall, and AUC.

The right parameters depend on your requirements and the nature of your audio data. Choose combinations that respect the rules listed above.

            

Raw data

{
    "_id": null,
    "home_page": "",
    "name": "kalman-labs",
    "maintainer": "",
    "docs_url": null,
    "requires_python": "",
    "maintainer_email": "",
    "keywords": "",
    "author": "Aditya",
    "author_email": "aditya@kalman.in",
    "download_url": "https://files.pythonhosted.org/packages/b0/1f/2cd22b466e29f6f0ec99f37b886b2bcab5365a644d97fce101c3ecf56b56/kalman-labs-1.0.322.tar.gz",
    "platform": null,
    "bugtrack_url": null,
    "license": "",
    "summary": "The Global Kalman Package",
    "version": "1.0.322",
    "project_urls": null,
    "split_keywords": [],
    "urls": [
        {
            "comment_text": "",
            "digests": {
                "blake2b_256": "b01f2cd22b466e29f6f0ec99f37b886b2bcab5365a644d97fce101c3ecf56b56",
                "md5": "a88666ea6d647b73e18437c8f45f2e2f",
                "sha256": "34a534253ba54d0f2b0d98d1662ae3360f172d1dab0681783590341881fca547"
            },
            "downloads": -1,
            "filename": "kalman-labs-1.0.322.tar.gz",
            "has_sig": false,
            "md5_digest": "a88666ea6d647b73e18437c8f45f2e2f",
            "packagetype": "sdist",
            "python_version": "source",
            "requires_python": null,
            "size": 11551,
            "upload_time": "2023-07-07T17:07:07",
            "upload_time_iso_8601": "2023-07-07T17:07:07.579296Z",
            "url": "https://files.pythonhosted.org/packages/b0/1f/2cd22b466e29f6f0ec99f37b886b2bcab5365a644d97fce101c3ecf56b56/kalman-labs-1.0.322.tar.gz",
            "yanked": false,
            "yanked_reason": null
        }
    ],
    "upload_time": "2023-07-07 17:07:07",
    "github": false,
    "gitlab": false,
    "bitbucket": false,
    "codeberg": false,
    "lcname": "kalman-labs"
}
        