# StadisticsML
**StadisticsML** is a Python library that simplifies training machine learning models for data prediction and analysis, specifically neural networks and Support Vector Regression (SVR). It lets you build customizable models, tune their hyperparameters, and obtain predictions together with performance metrics.
## Features
- **Customizable Neural Network**: Build neural networks with multiple hidden layers, choosing the activation function of each layer, the optimizer, and the number of training epochs.
- **SVR**: Implementation of Support Vector Regression, with easy adjustment of parameters like `C`, `gamma`, and `epsilon`.
- **Model Evaluation**: Generates performance metrics such as MSE and RMSE to assess the accuracy of trained models.
- **User-Friendly Interface**: Easy-to-use functions to train models and make predictions on new data.
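
As a reminder of what the MSE and RMSE metrics listed above measure, here is a minimal pure-Python sketch. The helper names `mse` and `rmse` are illustrative and are not taken from the library's source.

```python
import math

def mse(y_true, y_pred):
    """Mean squared error: the average of the squared residuals."""
    if len(y_true) != len(y_pred) or not y_true:
        raise ValueError("inputs must be non-empty and the same length")
    return sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true)

def rmse(y_true, y_pred):
    """Root mean squared error: square root of the MSE, in the units of y."""
    return math.sqrt(mse(y_true, y_pred))

y_true = [1.0, 2.0, 3.0]
y_pred = [1.0, 2.0, 5.0]
print(mse(y_true, y_pred))   # squared errors are 0, 0 and 4, averaged over 3 samples
print(rmse(y_true, y_pred))
```

Because RMSE is expressed in the same units as the target, it is usually the easier of the two to interpret.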
## Installation
To install the library, use the following `pip` command:
```bash
pip install StadisticsML
```
## Requirements
- `numpy`
- `scipy`
- `scikit-learn`
- `tensorflow`
These packages will be installed automatically as dependencies when you install **StadisticsML**.
## Functions
### 1. **Customizable Neural Network** (`train_neural_network`)
This function creates and trains a neural network for regression. The adjustable parameters are:
- **X**: Training input data.
- **y**: Labels or expected outcomes.
- **hidden_layers_config**: Configuration of the hidden layers with the number of neurons and activation functions.
- **output_neurons**: Number of neurons in the output layer.
- **output_activation**: Activation function for the output layer.
- **optimizer**: Optimizer to use (e.g., `adam`, `sgd`).
- **loss**: Loss function to use (e.g., `mean_squared_error`).
- **metrics**: Additional metrics for model evaluation (e.g., `mae`, `mse`).
- **epochs**: Number of training epochs.
- **test_size**: Fraction of the data held out for testing (e.g., `0.2` for 20%).
- **random_state**: Seed for randomization.
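
To make the `hidden_layers_config` format more concrete, the sketch below counts the trainable parameters such a configuration would imply. It assumes every layer is fully connected with a bias term, which the README does not state; `dense_parameter_count` is a hypothetical helper, not part of the library.

```python
def dense_parameter_count(n_features, hidden_layers_config, output_neurons=1):
    """Count the weights and biases of a fully connected network.

    Assumption: every layer is dense and has a bias vector.
    """
    sizes = [n_features] + [neurons for neurons, _activation in hidden_layers_config]
    sizes.append(output_neurons)
    total = 0
    for fan_in, fan_out in zip(sizes, sizes[1:]):
        total += fan_in * fan_out + fan_out  # weight matrix + bias vector
    return total

# One input feature, two hidden layers, one output neuron
hidden_layers_config = [(12, 'relu'), (8, 'relu')]
print(dense_parameter_count(1, hidden_layers_config))  # 24 + 104 + 9 = 137
```

Under that assumption, even a small configuration has far more parameters than a five-sample toy dataset can constrain, so the examples here are best read as API demonstrations rather than meaningful fits.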
### Example:
```python
from StadisticsML import train_neural_network
# Example data
X = [[1], [2], [3], [4], [5]]
y = [1.1, 2.0, 2.9, 4.0, 5.1]
# Hidden layers configuration: [(neurons, activation function)]
hidden_layers_config = [(12, 'relu'), (8, 'relu')]
# Train the neural network
model, history, test_loss, predictions = train_neural_network(
    X=X, y=y,
    hidden_layers_config=hidden_layers_config,
    epochs=100,
    test_size=0.2,
    random_state=42
)
print("Predictions:", predictions)
print("Test Loss:", test_loss)
```
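
The `test_size` and `random_state` arguments describe a seeded hold-out split. The following pure-Python sketch shows the idea; `holdout_split` is a hypothetical helper, and rounding the test-set size up is an assumption, so the library's actual split may differ.

```python
import math
import random

def holdout_split(X, y, test_size=0.2, random_state=None):
    """Shuffle the samples and hold out a fraction of them for testing."""
    n_samples = len(X)
    n_test = math.ceil(test_size * n_samples)  # assumption: round up
    indices = list(range(n_samples))
    random.Random(random_state).shuffle(indices)  # same seed, same order
    test_idx, train_idx = indices[:n_test], indices[n_test:]
    X_train = [X[i] for i in train_idx]
    X_test = [X[i] for i in test_idx]
    y_train = [y[i] for i in train_idx]
    y_test = [y[i] for i in test_idx]
    return X_train, X_test, y_train, y_test

X = [[1], [2], [3], [4], [5]]
y = [1.1, 2.0, 2.9, 4.0, 5.1]
X_train, X_test, y_train, y_test = holdout_split(X, y, test_size=0.2, random_state=42)
print(len(X_train), len(X_test))  # 4 training samples, 1 test sample
```

With only five samples, a test fraction of `0.2` leaves a single test point, so the reported test loss reflects one prediction.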
### 2. **Support Vector Regression (SVR)** (`train_svr`)
This function trains a support vector regression model using customizable parameters.
- **X**: Training input data.
- **y**: Labels or expected outcomes.
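- **C**: Regularization (penalty) parameter; larger values penalize training errors more heavily.
- **gamma**: Kernel coefficient.
- **epsilon**: Tolerance margin; errors smaller than `epsilon` are not penalized.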
- **test_size**: Fraction of the data held out for testing (e.g., `0.2` for 20%).
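
The roles of `gamma` and `epsilon` can be illustrated with a short sketch. It assumes an RBF kernel, which the README does not confirm, and the function names are illustrative only.

```python
import math

def rbf_kernel(x, z, gamma):
    """RBF kernel exp(-gamma * ||x - z||^2); a smaller gamma gives a wider kernel."""
    squared_distance = sum((a - b) ** 2 for a, b in zip(x, z))
    return math.exp(-gamma * squared_distance)

def epsilon_insensitive_loss(y_true, y_pred, epsilon):
    """Errors inside the epsilon tube cost nothing; larger ones grow linearly."""
    return max(0.0, abs(y_true - y_pred) - epsilon)

print(rbf_kernel([1.0], [1.0], gamma=0.001))              # identical points give 1.0
print(rbf_kernel([1.0], [5.0], gamma=0.001))              # similarity decays with distance
print(epsilon_insensitive_loss(2.0, 2.05, epsilon=0.1))   # inside the tube: 0.0
print(epsilon_insensitive_loss(2.0, 2.5, epsilon=0.1))    # outside the tube: 0.5 - 0.1
```

`C` then sets how strongly the errors that fall outside the tube are penalized relative to keeping the model smooth.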
### Example:
```python
from StadisticsML import train_svr
# Example data
X = [[1], [2], [3], [4], [5]]
y = [1.1, 2.0, 2.9, 4.0, 5.1]
# Train the SVR model
mse, rmse, predictions = train_svr(X=X, y=y, C=100, gamma=0.001, epsilon=0.001, test_size=0.2)
print("SVR Predictions:", predictions)
print("RMSE:", rmse)
```
### 3. **Cross-Validation for Model Optimization** (`cross_validate_model`)
This function performs cross-validation to tune the hyperparameters of a neural network or SVR model.
- **model_type**: Choose between `'nn'` (neural network) or `'svr'` (Support Vector Regression).
- **X**: Input data for training.
- **y**: Expected outcomes.
- **hyperparameters**: A dictionary of hyperparameters to tune, such as `C`, `gamma`, `epochs`, etc.
- **cv_folds**: Number of cross-validation folds.
- **random_state**: Seed for reproducibility.
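
A `hyperparameters` dictionary of this shape is commonly expanded into every combination of the listed values. The sketch below assumes such an exhaustive grid search; the README does not say how the library chooses candidates, and `expand_grid` is a hypothetical helper.

```python
from itertools import product

def expand_grid(hyperparameters):
    """Yield one dict per combination of the listed hyperparameter values."""
    names = list(hyperparameters)
    for values in product(*(hyperparameters[name] for name in names)):
        yield dict(zip(names, values))

hyperparameters = {'C': [0.1, 1, 10], 'gamma': [0.001, 0.01, 0.1]}
candidates = list(expand_grid(hyperparameters))
print(len(candidates))   # 3 values of C x 3 values of gamma = 9
print(candidates[0])     # {'C': 0.1, 'gamma': 0.001}
```

Under this assumption the number of candidates grows multiplicatively with each added hyperparameter, and each candidate is trained once per fold.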
### Example:
```python
from StadisticsML import cross_validate_model
# Example data
X = [[1], [2], [3], [4], [5]]
y = [1.1, 2.0, 2.9, 4.0, 5.1]
# Hyperparameters to tune
hyperparameters = {'C': [0.1, 1, 10], 'gamma': [0.001, 0.01, 0.1]}
# Perform cross-validation for SVR
best_params, mean_score = cross_validate_model(
    model_type='svr', X=X, y=y, hyperparameters=hyperparameters, cv_folds=5, random_state=42
)
print("Best parameters:", best_params)
print("Mean score from cross-validation:", mean_score)
```
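
To show what `cv_folds` and `random_state` imply for the data, here is a minimal sketch of shuffled k-fold index generation. `k_fold_indices` is a hypothetical helper, and the library's own fold assignment may differ.

```python
import random

def k_fold_indices(n_samples, cv_folds, random_state=None):
    """Return (train_indices, validation_indices) pairs for shuffled k-fold CV."""
    if not 2 <= cv_folds <= n_samples:
        raise ValueError("cv_folds must be between 2 and the number of samples")
    indices = list(range(n_samples))
    random.Random(random_state).shuffle(indices)
    folds = [indices[i::cv_folds] for i in range(cv_folds)]  # near-equal sizes
    return [
        ([i for j, fold in enumerate(folds) if j != k for i in fold], folds[k])
        for k in range(cv_folds)
    ]

splits = k_fold_indices(n_samples=5, cv_folds=5, random_state=42)
for train_idx, val_idx in splits:
    print(len(train_idx), len(val_idx))  # 4 1 on every line
```

With the five-sample toy data, `cv_folds=5` leaves a single validation sample per fold, so an error metric such as MSE can still be computed but a score such as R² is not defined on one point.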
## Contribution
If you want to contribute to this project, please follow these steps:
1. Fork the repository.
2. Create a new branch (`git checkout -b feature/new-feature`).
3. Make your changes and commit them (`git commit -am 'Add new feature'`).
4. Push the branch (`git push origin feature/new-feature`).
5. Open a pull request.
**Contributions especially welcome:** implementations of additional machine learning models.
## License
This project is licensed under the **MIT** License. For more details, please refer to the [LICENSE](LICENSE) file.
## Contact
- **Author**: Jorge Eduardo Londoño Arango
- **Email**: [joelondonoar@unal.edu.co](mailto:joelondonoar@unal.edu.co) - [jorge.nebulanoir@gmail.com](mailto:jorge.nebulanoir@gmail.com)
Raw data
{
"_id": null,
"home_page": "https://github.com/Guacen/StadisticsML.git",
"name": "StadisticsML",
"maintainer": null,
"docs_url": null,
"requires_python": null,
"maintainer_email": null,
"keywords": null,
"author": "Jorge Eduardo Londo\u00f1o Arango",
"author_email": "joelondonoar@unal.edu.co",
"download_url": "https://files.pythonhosted.org/packages/5a/cf/e9c0c7411fe8dca81741f358d87ec815665ca78172863a78dc90aef3b447/StadisticsML-1.0.0.tar.gz",
"platform": null,
"bugtrack_url": null,
"license": "MIT",
"summary": "Librer\u00eda para obtener valores estad\u00edsticos para pruebas de distribuci\u00f3n de signos, determinaci\u00f3n de outliers y pruebas Durbin-Watson",
"version": "1.0.0",
"project_urls": {
"Homepage": "https://github.com/Guacen/StadisticsML.git"
},
"split_keywords": [],
"urls": [
{
"comment_text": "",
"digests": {
"blake2b_256": "427a4a672a1a50f09d9fdaab2a7df4c5cdb88516080b92f8f4fbc0abc19aaf68",
"md5": "e1623df6db9463e2638a79b6757779ad",
"sha256": "2de6553d48030d0baad4663888c9932932e47d6ea8ba7cfdc784e49127739104"
},
"downloads": -1,
"filename": "StadisticsML-1.0.0-py3-none-any.whl",
"has_sig": false,
"md5_digest": "e1623df6db9463e2638a79b6757779ad",
"packagetype": "bdist_wheel",
"python_version": "py3",
"requires_python": null,
"size": 6596,
"upload_time": "2024-11-15T01:13:46",
"upload_time_iso_8601": "2024-11-15T01:13:46.875287Z",
"url": "https://files.pythonhosted.org/packages/42/7a/4a672a1a50f09d9fdaab2a7df4c5cdb88516080b92f8f4fbc0abc19aaf68/StadisticsML-1.0.0-py3-none-any.whl",
"yanked": false,
"yanked_reason": null
},
{
"comment_text": "",
"digests": {
"blake2b_256": "5acfe9c0c7411fe8dca81741f358d87ec815665ca78172863a78dc90aef3b447",
"md5": "63f3eaada2852b28f0b7a78bc07694d9",
"sha256": "34643a6247aa5f30fd511f68fc44ed72d4d9b78c9f8ed3078303d7a6b4c05d85"
},
"downloads": -1,
"filename": "StadisticsML-1.0.0.tar.gz",
"has_sig": false,
"md5_digest": "63f3eaada2852b28f0b7a78bc07694d9",
"packagetype": "sdist",
"python_version": "source",
"requires_python": null,
"size": 6660,
"upload_time": "2024-11-15T01:13:49",
"upload_time_iso_8601": "2024-11-15T01:13:49.863768Z",
"url": "https://files.pythonhosted.org/packages/5a/cf/e9c0c7411fe8dca81741f358d87ec815665ca78172863a78dc90aef3b447/StadisticsML-1.0.0.tar.gz",
"yanked": false,
"yanked_reason": null
}
],
"upload_time": "2024-11-15 01:13:49",
"github": true,
"gitlab": false,
"bitbucket": false,
"codeberg": false,
"github_user": "Guacen",
"github_project": "StadisticsML",
"github_fetch_exception": true,
"lcname": "stadisticsml"
}