# MLModelAnalysis
**MLModelAnalysis** is a reusable Python class that streamlines training, evaluation, and prediction for a range of machine learning regression models. It lets you switch between models seamlessly while keeping data preprocessing, evaluation, and prediction consistent, which makes it adaptable to many regression tasks.
## Supported Models
- Linear Regression (`linear_regression`)
- Decision Tree Regressor (`decision_tree`)
- Random Forest Regressor (`random_forest`)
- Support Vector Machine (`svm`)
- Gradient Boosting Regressor (`gradient_boosting`)
- K-Nearest Neighbors (`knn`)
- AdaBoost Regressor (`ada_boost`)
- Neural Network (MLP Regressor) (`mlp`)
- XGBoost Regressor (`xgboost`)
## Installation
To use **MLModelAnalysis**, install the following dependencies:
```bash
pip install scikit-learn pandas numpy plotly xgboost
```
## Usage
### 1. Initializing the Model
Initialize the **MLModelAnalysis** class by specifying the `model_type` parameter, which sets the machine learning model you wish to use.
```python
from ml_model_analysis import MLModelAnalysis
# Initialize with Linear Regression
analysis = MLModelAnalysis(model_type='linear_regression')
# Initialize with Random Forest
analysis = MLModelAnalysis(model_type='random_forest')
# Initialize with XGBoost
analysis = MLModelAnalysis(model_type='xgboost')
```
### 2. Training and Evaluating the Model
The `train_and_evaluate` method handles data preprocessing, model training, and metric evaluation. Optionally, it can save the trained model, scaler, and encoders for later use.
#### Parameters
- `csv_file`: Path to the CSV file containing the dataset.
- `x_elements`: List of feature columns.
- `y_element`: Name of the target column.
- `model_save_path` (Optional): Path to save the trained model, scaler, and encoders.
#### Example
```python
# Set the parameters
csv_file = 'data.csv' # Path to the data file
x_elements = ['feature1', 'feature2'] # Feature columns
y_element = 'target' # Target column
# Initialize the model
analysis = MLModelAnalysis(model_type='random_forest')
# Train and evaluate the model
analysis.train_and_evaluate(csv_file=csv_file, x_elements=x_elements, y_element=y_element, model_save_path='random_forest_model.pkl')
```
After running this code, the method reports R-squared and Mean Squared Error (MSE) metrics for both the training and test sets. If `model_save_path` is specified, the model is saved for future predictions.
### 3. Loading the Model and Making Predictions
The `load_model_and_predict` method allows you to load a saved model and make predictions on new input data.
#### Parameters
- `model_path`: Path to the saved model file.
- `input_data`: Dictionary containing feature names and values for prediction.
#### Example
```python
# Define input data for prediction
input_data = {
    'feature1': 5.1,
    'feature2': 2.3
}
# Load the model and make a prediction
prediction = analysis.load_model_and_predict(model_path='random_forest_model.pkl', input_data=input_data)
print(f'Prediction: {prediction}')
```
### 4. Visualization
For `linear_regression` or `svm` models with only one feature, the `train_and_evaluate` method will automatically generate a Plotly plot of actual vs. predicted values for quick visualization.
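For instance, a single-feature SVM run (assuming `data.csv` contains a numeric `feature1` column) would trigger the plot automatically:

```python
# One feature only, so train_and_evaluate also renders the
# actual-vs-predicted Plotly chart after reporting the metrics.
analysis = MLModelAnalysis(model_type='svm')
analysis.train_and_evaluate(csv_file='data.csv', x_elements=['feature1'], y_element='target')
```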
#### Example Use Cases
- **Regression Analysis with Random Forest**
```python
analysis = MLModelAnalysis(model_type='random_forest')
analysis.train_and_evaluate(csv_file='data.csv', x_elements=['feature1', 'feature2'], y_element='target', model_save_path='random_forest_model.pkl')
```
- **Quick Prediction with a Pre-trained Model**
```python
prediction = analysis.load_model_and_predict(model_path='random_forest_model.pkl', input_data={'feature1': 5.1, 'feature2': 2.3})
print(f'Prediction: {prediction}')
```
- **Effortless Model Switching**
```python
# Specify a new model type to use a different algorithm
analysis = MLModelAnalysis(model_type='xgboost')
```
## Additional Notes
- **Plotting**: Visualizations are supported for linear models and SVM with single-feature datasets.
- **Model Saving**: The `model_save_path` parameter in `train_and_evaluate` stores the model, scaler, and encoders, allowing consistent predictions when the model is reloaded later (a pattern sketch follows these notes).
- **Dependencies**: Ensure required libraries are installed (`scikit-learn`, `pandas`, `numpy`, `plotly`, and `xgboost`).
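On the **Model Saving** note above: the exact serialization is internal to the class, but the underlying pattern, bundling model, scaler, and encoders into one artifact so preprocessing stays consistent at prediction time, can be sketched as follows. This is an illustration of the pattern with `joblib`, not gurulearn's actual code:

```python
import joblib
from sklearn.ensemble import RandomForestRegressor
from sklearn.preprocessing import StandardScaler

# Hypothetical bundle: everything needed to reproduce preprocessing at
# prediction time travels with the model in a single file.
X, y = [[0.0], [1.0], [2.0]], [0.0, 1.0, 2.0]
scaler = StandardScaler().fit(X)
bundle = {
    'model': RandomForestRegressor(n_estimators=10).fit(scaler.transform(X), y),
    'scaler': scaler,
    'encoders': {},  # per-column encoders for categorical features, if any
}
joblib.dump(bundle, 'bundle_sketch.pkl')

# Reloading restores all three parts together, so new inputs can be
# scaled and encoded exactly as during training.
bundle = joblib.load('bundle_sketch.pkl')
```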
## License
This project is licensed under the MIT License.
### ImageClassifier ###
The `ImageClassifier` class in `gurulearn` provides an extensive set of tools for image classification, supporting both custom CNNs and pre-trained models for transfer learning. It includes utilities for data loading, model selection based on dataset size, and model training and evaluation.
#### Key Features
- **Flexible Model Selection**: Automatically selects a model based on dataset size, or allows users to force a specific model (custom CNNs or pre-trained models like VGG16, ResNet50, MobileNet, etc.).
- **Data Loading Options**: Supports loading images from directories or CSV files.
- **Transfer Learning with Fine-Tuning**: Offers optional fine-tuning of pre-trained models for enhanced accuracy.
- **Custom CNN Architectures**: Provides a variety of custom CNN models (`cnn1` to `cnn10`) for different levels of complexity.
- **Evaluation Tools**: Built-in functions for visualizing training accuracy and displaying confusion matrices.
#### Usage
To use the `ImageClassifier` class, follow these steps:
#### 1. Importing and Initializing
```python
from gurulearn import ImageClassifier
# Initialize the image classifier
image_classifier = ImageClassifier()
```
#### 2. Training the Model
The `img_train` method trains an image classification model using either a directory of images or a CSV file with image paths and labels.
```python
image_classifier.img_train(
train_dir="path/to/train/data", # or specify csv_file, img_column, label_column for CSV data
test_dir="path/to/test/data", # Optional, only if test data is in a separate directory
epochs=10,
device="cpu", # Set to "cuda" if using a GPU
force="vgg16", # Force specific model choice (optional)
finetune=True # Fine-tune pre-trained models (optional)
)
```
Parameters:
- **train_dir**: Directory containing training images organized in subdirectories by class (omit this and use `csv_file` instead when loading from CSV; see the sketch after this list).
- **test_dir**: Directory containing test images (optional, use if separate from `train_dir`).
- **csv_file**: Path to CSV file if loading data from CSV.
- **img_column**: Column name in the CSV containing image paths.
- **label_column**: Column name in the CSV containing labels.
- **epochs**: Number of training epochs (default: 10).
- **device**: Device to use for training, either `"cpu"` or `"cuda"` (default: `"cpu"`).
- **force**: Specify a particular model (options include `"simple_cnn"`, `"vgg16"`, `"resnet50"`, etc.).
- **finetune**: Whether to fine-tune pre-trained models (default: `False`).
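For CSV-based data, the same call uses the CSV parameters instead of `train_dir` (a sketch assuming a hypothetical `labels.csv` with `filepath` and `label` columns):

```python
image_classifier.img_train(
    csv_file="labels.csv",    # CSV listing the dataset (hypothetical file)
    img_column="filepath",    # column holding image paths
    label_column="label",     # column holding class labels
    epochs=10,
    device="cpu"
)
```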
#### Supported Models
The `ImageClassifier` class can select models based on dataset size or through forced selection. Models include:
- **Custom CNNs**: `cnn1` to `cnn10` (e.g., simple CNN, ResNet-inspired, Inception-inspired).
- **Pre-trained Models**: `"vgg16"`, `"resnet50"`, `"mobilenet"`, `"inceptionv3"`, `"densenet"`, `"efficientnet"`, `"xception"`, `"nasnetmobile"`, `"inceptionresnetv2"`.
- **Model Auto-Selection**: Based on dataset size, the class can automatically select the appropriate model.
#### Example Workflow
```python
# Initialize the classifier
image_classifier = ImageClassifier()
# Train the model using images organized in directories
image_classifier.img_train(
train_dir="data/train_images",
test_dir="data/test_images",
epochs=20,
device="cuda", # Use GPU if available
force="resnet50", # Force ResNet50 model
finetune=True # Enable fine-tuning
)
```
#### 3. Plotting Training Accuracy
The `plot_accuracy` method displays the training and validation accuracy across epochs.
```python
history = image_classifier.img_train(train_dir="data/train_images", epochs=10)
image_classifier.plot_accuracy(history)
```
#### 4. Displaying Confusion Matrix
After training, you can plot a confusion matrix to evaluate the model's predictions on validation data.
```python
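# 'model' and 'validation_generator' are assumed to be the artifacts from the training run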
image_classifier.plot_confusion_matrix(model, validation_generator)
```
#### Files Created
- **Model File**: `selected_model.h5` - The trained model is saved for future use.
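Because `img_train` saves a standard `.h5` file, the model can presumably be reloaded directly with Keras:

```python
from tensorflow.keras.models import load_model

# Reload the classifier saved by img_train for further evaluation or inference.
model = load_model("selected_model.h5")
model.summary()
```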
### Model Selection Guidelines
The `_select_model` method automatically chooses a model based on dataset size if no specific model is forced. For smaller datasets, simpler models (like `simple_cnn` or `vgg16`) are preferred, while for larger datasets, deeper models (like `resnet50`) are selected for improved accuracy.
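The exact cutoffs are internal to `_select_model`, but the decision has this general shape (an illustrative sketch; the threshold values are assumptions, not the library's actual numbers):

```python
def select_model_sketch(num_images: int) -> str:
    """Illustrative size-based model choice with hypothetical thresholds."""
    if num_images < 1_000:
        return "simple_cnn"  # small dataset: low-capacity model resists overfitting
    if num_images < 10_000:
        return "vgg16"       # medium dataset: transfer learning helps
    return "resnet50"        # large dataset: a deeper model pays off
```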
#### Model Architectures
Each custom CNN model (from `cnn1` to `cnn10`) and pre-trained model architecture (VGG16, ResNet50, etc.) provides a unique structure optimized for specific types of datasets and computational capacities.
# CTScanProcessor
**CTScanProcessor** is a Python class designed for advanced processing and quality evaluation of CT scan images. This tool is highly beneficial for applications in medical imaging, data science, and deep learning, providing noise reduction, contrast enhancement, detail preservation, and quality evaluation.
## Features
- **Sharpening**: Enhances image details by applying a sharpening filter.
- **Median Denoising**: Reduces noise while preserving edges using a median filter.
- **Contrast Enhancement**: Enhances contrast using CLAHE (Contrast Limited Adaptive Histogram Equalization).
- **Quality Metrics**: Calculates image quality metrics such as MSE, PSNR, SNR, and Detail Preservation Ratio to evaluate the effectiveness of processing.
- **Image Comparison**: Creates side-by-side comparisons of original and processed images.
## Installation
This class requires the following libraries:
- OpenCV
- NumPy
- SciPy
To install the required dependencies, use:
```bash
pip install opencv-python-headless numpy scipy
```
## Usage
1. **Initialize the Processor**
```python
from ct_scan_processor import CTScanProcessor
processor = CTScanProcessor(kernel_size=5, clip_limit=2.0, tile_grid_size=(8, 8))
```
2. **Process a CT Scan**
Use the `process_ct_scan` method to process a CT scan image and get quality metrics.
```python
denoised, metrics = processor.process_ct_scan("path_to_ct_scan.jpg", "output_folder", compare=True)
```
3. **Quality Metrics**
After processing, the class returns metrics such as Mean Squared Error (MSE), Peak Signal-to-Noise Ratio (PSNR), Signal-to-Noise Ratio (SNR), and Detail Preservation Ratio.
4. **Compare Images**
If `compare=True`, a side-by-side comparison image is saved in the specified comparison folder.
### Example
```python
if __name__ == "__main__":
    processor = CTScanProcessor()
    denoised, metrics = processor.process_ct_scan("path_to_ct_scan.jpg", "output_folder", compare=True)
```
## Quality Metrics
The following metrics are calculated to evaluate the quality of the denoised image (reference definitions are sketched after the list):
- **MSE**: Mean Squared Error between the original and processed images.
- **PSNR**: Peak Signal-to-Noise Ratio to measure image quality.
- **SNR**: Signal-to-Noise Ratio to measure signal strength relative to noise.
- **Detail Preservation**: Percentage of preserved details after processing.
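For reference when interpreting these numbers, the standard definitions of the first three metrics can be sketched in NumPy (the class's own implementation, including its detail-preservation measure, may differ in details):

```python
import numpy as np

def mse(original, processed):
    """Mean squared error between two same-shaped grayscale images."""
    diff = original.astype(np.float64) - processed.astype(np.float64)
    return np.mean(diff ** 2)

def psnr(original, processed, max_val=255.0):
    """Peak signal-to-noise ratio in dB (higher is better)."""
    m = mse(original, processed)
    return float('inf') if m == 0 else 10 * np.log10(max_val ** 2 / m)

def snr(original, processed):
    """Signal power relative to the power of the difference, in dB."""
    noise = original.astype(np.float64) - processed.astype(np.float64)
    return 10 * np.log10(np.sum(original.astype(np.float64) ** 2) / np.sum(noise ** 2))
```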
## Methods
- `sharpen(image)`: Sharpens the input image.
- `median_denoise(image)`: Denoises the input image using a median filter.
- `enhance_contrast(image)`: Enhances contrast using CLAHE.
- `enhanced_denoise(image_path)`: Processes a CT scan image with denoising, contrast enhancement, and sharpening (see the step-by-step sketch after this list).
- `evaluate_quality(original, denoised)`: Computes MSE, PSNR, SNR, and Detail Preservation.
- `compare_images(original, processed, output_path)`: Saves a side-by-side comparison of the original and processed images.
- `process_ct_scan(input_path, output_folder, comparison_folder="comparison", compare=False)`: Runs the full CT scan processing pipeline and saves the results.
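The steps can also be run individually to inspect intermediate results, following the order used by `enhanced_denoise` (a sketch assuming a grayscale image loaded with OpenCV):

```python
import cv2

image = cv2.imread("path_to_ct_scan.jpg", cv2.IMREAD_GRAYSCALE)

denoised = processor.median_denoise(image)          # reduce noise first
contrasted = processor.enhance_contrast(denoised)   # then apply CLAHE
sharpened = processor.sharpen(contrasted)           # finally recover edge detail

metrics = processor.evaluate_quality(image, sharpened)
print(metrics)
```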
## License
This project is licensed under the MIT License.
## Contributions
Contributions are welcome! Feel free to submit pull requests or open issues.
### AudioRecognition ###
The `AudioRecognition` class in `gurulearn` provides tools for audio data augmentation, feature extraction, model training, and prediction, making it suitable for tasks like audio classification and speech recognition.
#### Key Features
- **Data Augmentation**: Supports time-stretching, pitch-shifting, and noise addition for audio data augmentation.
- **Feature Extraction**: Extracts MFCCs, chroma, and spectral contrast features from audio signals.
- **Model Training**: Trains a deep learning model for audio classification using a Conv1D and BiLSTM-based architecture.
- **Prediction**: Predicts the class of a given audio file based on a trained model.
#### Usage
To use the `AudioRecognition` class, follow these steps:
#### 1. Importing and Initializing
```python
from gurulearn import AudioRecognition
# Initialize the audio recognition class
audio_recognition = AudioRecognition()
```
#### 2. Loading Data with Augmentation
The `load_data_with_augmentation` method loads audio data from a specified directory and performs augmentation to improve model generalization.
```python
data_dir = "path/to/audio/data"
X, y = audio_recognition.load_data_with_augmentation(data_dir)
```
This method returns feature vectors (`X`) and labels (`y`) for training.
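The documented feature set maps onto standard librosa extractors; a per-file feature vector along those lines could be sketched as follows (illustrative only, not the class's exact implementation):

```python
import librosa
import numpy as np

def extract_features_sketch(path):
    """Mean-pooled MFCC, chroma, and spectral-contrast features for one file."""
    y, sr = librosa.load(path, sr=None)
    mfcc = np.mean(librosa.feature.mfcc(y=y, sr=sr, n_mfcc=13), axis=1)
    chroma = np.mean(librosa.feature.chroma_stft(y=y, sr=sr), axis=1)
    contrast = np.mean(librosa.feature.spectral_contrast(y=y, sr=sr), axis=1)
    return np.concatenate([mfcc, chroma, contrast])
```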
#### 3. Training the Model
The `audiotrain` method trains an audio classification model. This method also generates a confusion matrix and training history plot, which are saved in the specified model directory.
```python
audio_recognition.audiotrain(
data_path="path/to/audio/data",
epochs=50,
batch_size=32,
test_size=0.2,
learning_rate=0.001,
model_dir='model_folder'
)
```
Parameters:
- **data_path**: Directory path where audio data is stored (organized by class label).
- **epochs**: Number of training epochs (default: 50).
- **batch_size**: Training batch size (default: 32).
- **test_size**: Proportion of data to use for testing (default: 0.2).
- **learning_rate**: Initial learning rate for model training (default: 0.001).
- **model_dir**: Directory where the model and label mappings will be saved.
#### 4. Predicting the Class of an Audio File
After training, you can predict the class of a new audio file using the `predict` or `predict_class` methods.
```python
# Path to the input audio file
input_wav = "path/to/audio/file.wav"
# Predict the label of the audio file
predicted_label = audio_recognition.predict(input_wav)
print(f"Predicted Label: {predicted_label}")
```
The `predict` method returns the predicted label (text), while `predict_class` returns the numeric class index.
#### Example Workflow
```python
# Initialize the audio recognition instance
audio_recognition = AudioRecognition()
# Load data and perform augmentation
X, y = audio_recognition.load_data_with_augmentation('data/audio_files')
# Train the model on the audio dataset
audio_recognition.audiotrain(
    data_path='data/audio_files',
    epochs=30,
    batch_size=32,
    learning_rate=0.001
)
# Predict the class of a new audio sample
predicted_label = audio_recognition.predict('data/test_audio.wav')
print("Predicted Label:", predicted_label)
```
#### Files Created
- **Confusion Matrix**: `confusion_matrix.png` - Saved in the current directory after training.
- **Training History**: `training_history.png` - Contains plots for model accuracy and loss.
- **Model**: `audio_recognition_model.h5` - Saved in the specified model directory.
- **Label Mapping**: `label_mapping.json` - Contains mappings of class indices to labels.
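The label mapping can be read back to relate `predict_class` indices to label names (a sketch; the index-to-label schema is an assumption):

```python
import json

# Load the mapping saved by audiotrain (path depends on model_dir).
with open("model_folder/label_mapping.json") as f:
    label_mapping = json.load(f)

print(label_mapping)  # e.g. {"0": "class_a", "1": "class_b"} (hypothetical)
```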
## Introducing FlowBot
`FlowBot` is a flexible framework for creating **dynamic, guided interactions** (chatbots, booking systems, surveys) that adapt to user input and filter datasets in real time. Perfect for travel booking, customer support, or personalized recommendations!
---
## Installation
```bash
pip install gurulearn
```
---
## Quick Start
**Build a Travel Booking Bot in 5 Steps**:
```python
import pandas as pd
from gurulearn import FlowBot
# Sample dataset
hotels = pd.DataFrame({
    'destination': ['Paris', 'Tokyo', 'New York'],
    'price_range': ['$$$', '$$', '$'],
    'hotel_name': ['Luxury Palace', 'Mountain View', 'Downtown Inn']
})
# Initialize FlowBot
bot = FlowBot(hotels)
# Collect user email first
bot.add_personal_info("email", "Please enter your email:")
# Define workflow
bot.add("destination", "Where would you like to go?", required=True)
bot.add("price_range", "Choose your budget:", required=False)
bot.finish("hotel_name", "price_range") # Final output columns
# Simulate user interaction
response = bot.process("user123", "") # Start flow!
print(response['message']) # "Where would you like to go?"
print(response['suggestions']) # ["Paris", "Tokyo", "New York"]
# User selects 'Paris'
response = bot.process("user123", "Paris")
print(response['message']) # "Choose your budget:"
print(response['suggestions']) # ["$$$", "$$"]
```
---
## Key Features
### 1. **Dynamic Suggestions**
Auto-filter valid options based on prior choices:
```python
bot.add("activity", "Choose an activity:", required=True)
# Suggests only activities available in the selected destination
```
### 2. **Personalized Data Collection**
```python
bot.add_personal_info("phone", "Your phone number:", required=True)
```
### 3. **Session Management**
Resume progress or reset conversations:
```python
bot.reset_session("user123") # Restart workflow
```
### 4. **Save Results**
User data and chat history are auto-saved to JSON (e.g. `user_data/user123.json`):
```json
{
  "personal_info": {"email": "user@example.com"},
  "chat_history": [...]
}
```
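Because sessions persist as plain JSON, collected data can be read back with the standard library:

```python
import json

# Load a saved session (path format shown above).
with open("user_data/user123.json") as f:
    session = json.load(f)

print(session["personal_info"])  # {'email': 'user@example.com'}
```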
---
## Detailed Usage
### Initialize FlowBot
```python
bot = FlowBot(
    data=df,  # your pandas DataFrame
)
```
### Add Workflow Steps
```python
bot.add(
field="room_type", # DataFrame column to filter
prompt="Select room type:", # User prompt
required=True # Force valid input
)
```
### Get Final Results
```python
results = response['results'] # Filtered DataFrame rows as dicts
# Example: [{'hotel_name': 'Luxury Palace', 'price_range': '$$$'}]
```
---
## 🔧 Dependencies
- Python 3.7+
- `pandas`
---
## 📜 License
[MIT License](LICENSE)
---
## Get Help
Found a bug? Open an [issue](https://github.com/guru-dharsan-git/gurulearn/issues).
---
**Happy Building!**
*Tag your projects with #gurulearn to share them with the community!*
---