# Search in a Third
Search in a Third is a Python package for optimizing neural network hyperparameters with moderate computational cost. It uses a greedy algorithm guided by heuristics, so it can reach a good model configuration without traversing the entire multidimensional hyperparameter space.
## Features
- **Efficient Hyperparameter Optimization**: Focuses on reducing the number of models trained while still achieving optimal results.
- **Greedy Algorithm with Heuristic Guidance**: Narrows down the search space intelligently to find the best model configurations without exhaustive search.
- **Optimized Computational Use**: Designed to make the most out of available computational resources, avoiding unnecessary model training.
## Installation
You can install the package using pip:
```bash
pip install search_in_a_third
```
## Implementation Guide
### Introduction
This guide walks you through using Search in a Third, from loading your data to reading the search results.
### Installation and Library Import
To install the library, execute the following command:
```pip install search_in_a_third```
To import the library, add the following line to your code:
```import search_in_a_third as siat```
### Data Loading and Processing
The first step is to prepare the data. Here are some tips about the source files that will make working with search_in_a_third easier:
- Prefer files in csv format.
- Do not include column names (a header row) in your file.
- Make sure the attribute separator does not appear inside the column values.
- Handle null values before loading the file; the library performs only very basic null handling.
Once you have your file ready, you can include the following code in your program:
```data_model = siat.build_data_model(url, ['col1', 'col2', 'class'], 'class', 0.3, 'csv', ',')```
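The file tips above can be applied before the file is handed to the library. The following is a minimal sketch that uses only Python's standard library; the helper name `clean_rows` and the sample rows are hypothetical and are not part of search_in_a_third. It drops rows with missing values, rejects values that contain the separator, and writes the result without a header row.

```python
import csv
import io


def clean_rows(rows, separator=","):
    """Return rows that are safe to write as a header-less csv file.

    Hypothetical helper, not part of search_in_a_third.
    """
    cleaned = []
    for row in rows:
        fields = [str(field).strip() for field in row]
        # Drop rows with missing values instead of relying on the library.
        if any(f == "" or f.lower() in ("nan", "null", "none") for f in fields):
            continue
        # The separator must not appear inside any column value.
        for field in fields:
            if separator in field:
                raise ValueError(f"separator {separator!r} found in {field!r}")
        cleaned.append(fields)
    return cleaned


# Made-up sample data: two feature columns and one class column.
raw_rows = [
    [5.1, 3.5, "setosa"],
    [4.9, "", "setosa"],       # missing value, this row is dropped
    [6.2, 2.9, "versicolor"],
]

rows = clean_rows(raw_rows)

# Write the rows without column names, as the tips recommend.
buffer = io.StringIO()
csv.writer(buffer, delimiter=",", lineterminator="\n").writerows(rows)
csv_text = buffer.getvalue()
print(csv_text)
```

Dropping incomplete rows is only one way to handle nulls; imputing values may suit your data better.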
### Search Configuration
Preparing the configuration object is the most tedious part of using Search in a Third. Take care in this step, because the configuration is how the algorithm knows what you want it to search.
The configuration object is a Python dictionary that must have the following mandatory elements:
- loss which must be a list with at least one valid loss function.
- optimizer which must be a list with at least one valid optimization function.
- output_layer which must be a list with at least one valid activation function.
- learning_rate which must be a list of two numbers between 0 and 1.
- middle_layers_configuration with at least one Layer type element.
In turn, each element in middle_layers_configuration is a list that must have the following mandatory elements:
- At index 0 a GUID identifier, given as a string.
- At index 1 a list of two integers that represent the minimum and maximum number of neurons to search for.
- At index 2 a list of two decimal numbers between 0 and 1 that represent the minimum and maximum dropout value to search for.
- At index 3 a list of valid activation functions.
An example of a configuration object is as follows:
```python
configuration_json = {
    'loss': ['categorical_crossentropy'],
    'optimizer': ['Adam'],
    'output_layer': ['sigmoid'],
    'learning_rate': [0.0001, 0.1],
    'middle_layers_configuration': [
        ['51e2ebaa-1f44-41cd-b057-d56aaa42a13c', [1, 100], [0, 0.25], ['relu', 'sigmoid']],
        ['cdec3ae7-cd55-42bb-8894-2f11952488fc', [1, 100], [0, 0.25], ['relu', 'sigmoid']],
    ],
}
```
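Because a malformed configuration is easy to write, it can help to check the dictionary before starting a search. The sketch below encodes only the rules listed in this guide; `validate_configuration` and `_is_pair` are hypothetical helpers, not functions of search_in_a_third, and the sketch uses the `optimizer` key as it appears in the example configuration. It does not check whether the loss, optimizer or activation names are accepted by the library.

```python
import uuid


def _is_pair(value, number_types, low=None, high=None):
    """True if value is a list of two numbers, optionally within bounds."""
    if not isinstance(value, list) or len(value) != 2:
        return False
    if not all(isinstance(v, number_types) for v in value):
        return False
    if low is not None and not all(low <= v <= high for v in value):
        return False
    return True


def validate_configuration(config):
    """Return a list of problems found (empty when none are found).

    Hypothetical helper based only on the rules listed in this guide.
    """
    problems = []
    for key in ("loss", "optimizer", "output_layer"):
        value = config.get(key)
        if not isinstance(value, list) or not value:
            problems.append(f"{key} must be a non-empty list")

    if not _is_pair(config.get("learning_rate"), (int, float), 0, 1):
        problems.append("learning_rate must be two numbers between 0 and 1")

    layers = config.get("middle_layers_configuration")
    if not isinstance(layers, list) or not layers:
        problems.append("middle_layers_configuration needs at least one layer")
        layers = []

    for position, layer in enumerate(layers):
        if not isinstance(layer, list) or len(layer) != 4:
            problems.append(f"layer {position} must have four elements")
            continue
        layer_id, units, dropout, activations = layer
        try:
            uuid.UUID(str(layer_id))  # index 0: GUID identifier
        except ValueError:
            problems.append(f"layer {position}: index 0 is not a valid GUID")
        if not _is_pair(units, int):  # index 1: neuron range
            problems.append(f"layer {position}: index 1 must be two integers")
        if not _is_pair(dropout, (int, float), 0, 1):  # index 2: dropout range
            problems.append(f"layer {position}: index 2 must be two numbers in [0, 1]")
        if not isinstance(activations, list) or not activations:  # index 3
            problems.append(f"layer {position}: index 3 must be a non-empty list")
    return problems


example_configuration = {
    "loss": ["categorical_crossentropy"],
    "optimizer": ["Adam"],
    "output_layer": ["sigmoid"],
    "learning_rate": [0.0001, 0.1],
    "middle_layers_configuration": [
        ["51e2ebaa-1f44-41cd-b057-d56aaa42a13c", [1, 100], [0, 0.25], ["relu", "sigmoid"]],
    ],
}

print(validate_configuration(example_configuration))
```

An empty list means none of the listed rules were violated; it does not guarantee that the library will accept the configuration.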
### Execution of the Search
To run the search for optimal hyperparameter configurations, add the following line to your program. Here configuration_json is the configuration object you created earlier, data_model is the data object you built earlier, and n is the number of iterations you want the algorithm to perform (I recommend starting with 3):
```iterations = siat.search(configuration_json, data_model, n)```
### Interpreting the Results
The iterations variable holds a dictionary with one element per iteration. Each key is the iteration index converted to a string, for example: '0', '1', '2', ...
Each iteration is itself a dictionary whose 'results' element holds a list of result objects like the following:
```python
[
    {
        'id': '831865d9-cd6a-492b-a461-c823cce8644e',
        'loss': 'categorical_crossentropy',
        'model': '',
        'result': 0.976190447807312,
        'optimizer': 'Adam',
        'output_layer': 'sigmoid',
        'learning_rate': 0.06670000000000001,
        'middle_layers': [
            {'id': '87b14a0a-eb15-42b5-a998-5f452d1ae622', 'units': 34, 'dropout': 0, 'activation': 'relu'},
        ],
        'x_characteristics': 4,
    },
    # … remaining result objects
]
```
The example shows a model with an accuracy of 0.976 that uses the categorical cross-entropy loss function, the Adam optimizer with a learning rate of 0.0667, an output layer activated by the sigmoid function, and a single hidden layer with 34 neurons, no dropout, and ReLU activation.
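A list of result objects in this shape can be ranked and summarized with plain Python. The sketch below is illustrative only: the `describe` helper is hypothetical, the ids and the two lower-scoring entries are made-up sample data, and it assumes that a higher 'result' value is better because the guide describes it as accuracy.

```python
def describe(result):
    """Format one result object as a single readable line (hypothetical helper)."""
    layers = ", ".join(
        f"{layer['units']} units/{layer['activation']}/dropout {layer['dropout']}"
        for layer in result["middle_layers"]
    )
    return (
        f"{result['result']:.3f}  {result['optimizer']} "
        f"lr={result['learning_rate']:.4f}  [{layers}]"
    )


# Sample data in the documented shape, trimmed to the keys used here.
results = [
    {
        "id": "sample-a",
        "result": 0.91,
        "optimizer": "Adam",
        "learning_rate": 0.0334,
        "middle_layers": [{"units": 67, "dropout": 0.25, "activation": "sigmoid"}],
    },
    {
        "id": "sample-b",
        "result": 0.976190447807312,
        "optimizer": "Adam",
        "learning_rate": 0.06670000000000001,
        "middle_layers": [{"units": 34, "dropout": 0, "activation": "relu"}],
    },
    {
        "id": "sample-c",
        "result": 0.88,
        "optimizer": "Adam",
        "learning_rate": 0.1,
        "middle_layers": [{"units": 100, "dropout": 0, "activation": "relu"}],
    },
]

# Assumption: 'result' is an accuracy, so larger values rank first.
ranked = sorted(results, key=lambda r: r["result"], reverse=True)

for result in ranked:
    print(describe(result))
```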
To get the element of this list that corresponds to the best-performing model, use the following line of code, where i is the integer index of the iteration. Because the keys of iterations are strings, the index must be converted with str:
```best_performer = iterations[str(i)]['best_performer']```
To get the most promising search space that an iteration proposes for the next iteration, run the following code, where i is again the integer index of the iteration:
```next_configuration = iterations[str(i)]['next_configuration']```
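The same lookup can be repeated across all iterations to see how the best result changes from one iteration to the next. This is a minimal sketch on a mocked iterations dictionary: the ids and numbers are made up, only the keys used below are mocked, and it assumes that each 'best_performer' is a result object with a 'result' key where higher is better.

```python
# Mocked stand-in for the dictionary returned by the search.
iterations = {
    "0": {"best_performer": {"id": "sample-a", "result": 0.91}},
    "1": {"best_performer": {"id": "sample-b", "result": 0.95}},
    "2": {"best_performer": {"id": "sample-c", "result": 0.93}},
}


def best_per_iteration(iterations):
    """Return (iteration index, best result) pairs in iteration order."""
    summary = []
    # Keys are strings such as '0', '1', '2', so sort them numerically.
    for key in sorted(iterations, key=int):
        best = iterations[key]["best_performer"]
        summary.append((int(key), best["result"]))
    return summary


summary = best_per_iteration(iterations)
for index, score in summary:
    print(f"iteration {index}: best result {score:.3f}")

# Pick the iteration whose best performer scored highest overall.
best_index, best_score = max(summary, key=lambda pair: pair[1])
print(f"overall best: iteration {best_index} ({best_score:.3f})")
```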
## License
This project is licensed under the MIT License - see the [LICENSE](LICENSE) file for details.
## Author
- Diego Larriera
For more information, please contact proflarriera@gmail.com.
Raw data
{
"_id": null,
"home_page": "https://github.com/proflarriera/searchinathird",
"name": "search-in-a-third",
"maintainer": null,
"docs_url": null,
"requires_python": null,
"maintainer_email": null,
"keywords": "hyperparameter optimization, neural networks, machine learning, greedy algorithm, heuristic",
"author": "Diego Larriera",
"author_email": "proflarriera@gmail.com",
"download_url": "https://files.pythonhosted.org/packages/d8/47/4c0b7dc879c18cd976ca5cc1a551f6af51a8726dd96ab703baa266c7af9b/search_in_a_third-0.1.1.tar.gz",
"platform": null,
"bugtrack_url": null,
"license": "MIT",
"summary": "A Python package for efficient hyperparameter optimization in neural networks, using a greedy algorithm guided by heuristic directions.",
"version": "0.1.1",
"project_urls": {
"Homepage": "https://github.com/proflarriera/searchinathird"
},
"split_keywords": [
"hyperparameter optimization",
" neural networks",
" machine learning",
" greedy algorithm",
" heuristic"
],
"urls": [
{
"comment_text": "",
"digests": {
"blake2b_256": "4db2f4bb9203cde3f1d9375cb9802fdfb34f5b7e40260da135ba8c0cdde17b29",
"md5": "5920865e8a4e10f4ce4ac29b59ef9e9b",
"sha256": "ccb8109009368f0e212c24e2e1d4802eb23502bf8e23c798b2b3a0a4dad3fc2a"
},
"downloads": -1,
"filename": "search_in_a_third-0.1.1-py3-none-any.whl",
"has_sig": false,
"md5_digest": "5920865e8a4e10f4ce4ac29b59ef9e9b",
"packagetype": "bdist_wheel",
"python_version": "py3",
"requires_python": null,
"size": 11863,
"upload_time": "2024-04-01T10:38:05",
"upload_time_iso_8601": "2024-04-01T10:38:05.345315Z",
"url": "https://files.pythonhosted.org/packages/4d/b2/f4bb9203cde3f1d9375cb9802fdfb34f5b7e40260da135ba8c0cdde17b29/search_in_a_third-0.1.1-py3-none-any.whl",
"yanked": false,
"yanked_reason": null
},
{
"comment_text": "",
"digests": {
"blake2b_256": "d8474c0b7dc879c18cd976ca5cc1a551f6af51a8726dd96ab703baa266c7af9b",
"md5": "89494904f887ce21295bab9b759036f6",
"sha256": "4f491195e6df39c8b34006b95d9e9d173c5725a1721bfcb75e5a642ec8348645"
},
"downloads": -1,
"filename": "search_in_a_third-0.1.1.tar.gz",
"has_sig": false,
"md5_digest": "89494904f887ce21295bab9b759036f6",
"packagetype": "sdist",
"python_version": "source",
"requires_python": null,
"size": 14020,
"upload_time": "2024-04-01T10:38:06",
"upload_time_iso_8601": "2024-04-01T10:38:06.775085Z",
"url": "https://files.pythonhosted.org/packages/d8/47/4c0b7dc879c18cd976ca5cc1a551f6af51a8726dd96ab703baa266c7af9b/search_in_a_third-0.1.1.tar.gz",
"yanked": false,
"yanked_reason": null
}
],
"upload_time": "2024-04-01 10:38:06",
"github": true,
"gitlab": false,
"bitbucket": false,
"codeberg": false,
"github_user": "proflarriera",
"github_project": "searchinathird",
"travis_ci": false,
"coveralls": false,
"github_actions": true,
"requirements": [
{
"name": "tensorflow",
"specs": []
},
{
"name": "scikit-learn",
"specs": []
},
{
"name": "pandas",
"specs": []
}
],
"lcname": "search-in-a-third"
}