search-in-a-third

- Name: search-in-a-third
- Version: 0.1.1
- Home page: https://github.com/proflarriera/searchinathird
- Summary: A Python package for efficient hyperparameter optimization in neural networks, using a greedy algorithm guided by heuristic directions.
- Upload time: 2024-04-01 10:38:06
- Author: Diego Larriera
- License: MIT
- Keywords: hyperparameter optimization, neural networks, machine learning, greedy algorithm, heuristic
- Requirements: tensorflow, scikit-learn, pandas

# Search in a Third

Search in a Third is a Python package for optimizing neural network hyperparameters with moderate computational cost. It uses a greedy algorithm guided by heuristic directions, which avoids traversing the entire multidimensional hyperparameter space and so reaches an optimal model configuration efficiently.

## Features

- **Efficient Hyperparameter Optimization**: Focuses on reducing the number of models trained while still achieving optimal results.
- **Greedy Algorithm with Heuristic Guidance**: Narrows down the search space intelligently to find the best model configurations without exhaustive search.
- **Optimized Computational Use**: Designed to make the most out of available computational resources, avoiding unnecessary model training.

## Installation

You can install the package using pip:

```bash
pip install search_in_a_third
```

## Implementation Guide

### Introduction
This guide walks you quickly through using Search in a Third in your own program.

### Installation and Library Import
To install the library, execute the following command:
```pip install search_in_a_third```

To import the library, you should include in your code:
```import search_in_a_third as siat```

### Data Loading and Processing
The first step is to prepare the data. Here are some tips about the source files that will make working with search_in_a_third easier:
- Prefer files in CSV format.
- Do not include the column names in your file.
- Make sure the attribute separator does not appear inside the column values.
- Handle nulls before loading the file; the library performs only very basic null handling.

Once you have your file ready, you can include the following code in your program:
```data_model = siat.build_data_model(url, ['col1', 'col2', 'class'], 'class', 0.3, 'csv', ',')```
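
The file tips above can be checked before the data is loaded. The following is a minimal sketch using only the Python standard library; `write_headerless_csv` is a hypothetical helper and not part of search_in_a_third, and dropping incomplete rows is only one possible way to handle nulls.

```python
import csv


def write_headerless_csv(rows, path, separator=","):
    """Write rows to a CSV file without a header row (hypothetical helper).

    Rows containing None or empty strings are dropped, and a ValueError is
    raised if any value contains the separator character.
    """
    kept = []
    for row in rows:
        if any(value is None or value == "" for value in row):
            continue  # simple null policy (an assumption): skip incomplete rows
        if any(separator in str(value) for value in row):
            raise ValueError(f"separator {separator!r} found in row {row!r}")
        kept.append(row)
    with open(path, "w", newline="") as handle:
        csv.writer(handle, delimiter=separator).writerows(kept)
    return len(kept)  # number of rows actually written
```

The returned count makes it easy to see how many rows were discarded compared with the input.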

### Search Configuration
Preparing the configuration object is the most tedious part of using Search in a Third. Be very careful in this step so that the algorithm understands what you want it to do.

The configuration object is a Python dictionary that must contain the following mandatory elements:
- loss, a list with at least one valid loss function.
- optimizer, a list with at least one valid optimization function.
- output_layer, a list with at least one valid activation function.
- learning_rate, a list of two numbers between 0 and 1.
- middle_layers_configuration, a list with at least one Layer type element.

In turn, each element in middle_layers_configuration is a list that must have the following mandatory elements:
- At index 0 a GUID identifier.
- At index 1 a list of two integers that represent the minimum and maximum number of neurons to search for.
- At index 2 a list of two decimal numbers between 0 and 1 that represent the minimum and maximum dropout value to search for.
- At index 3 a list of valid activation functions.

An example of a configuration object is as follows:

```python
configuration_json = {
    'loss': ['categorical_crossentropy'],
    'optimizer': ['Adam'],
    'output_layer': ['sigmoid'],
    'learning_rate': [0.0001, 0.1],
    'middle_layers_configuration': [
        ['51e2ebaa-1f44-41cd-b057-d56aaa42a13c', [1, 100], [0, 0.25], ['relu', 'sigmoid']],
        ['cdec3ae7-cd55-42bb-8894-2f11952488fc', [1, 100], [0, 0.25], ['relu', 'sigmoid']],
    ],
}
```
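
Because the configuration is easy to get wrong, it can help to check it before starting a search. The sketch below encodes only the rules listed in this section; `make_layer` and `validate_configuration` are hypothetical helpers that are not part of search_in_a_third, and the key name `optimizer` follows the example configuration. It does not check whether the loss, optimizer, or activation names are actually valid.

```python
import uuid

REQUIRED_KEYS = ("loss", "optimizer", "output_layer", "learning_rate",
                 "middle_layers_configuration")


def make_layer(units_range, dropout_range, activations):
    """Build one middle-layer entry with a fresh GUID at index 0."""
    return [str(uuid.uuid4()), list(units_range), list(dropout_range), list(activations)]


def validate_configuration(config):
    """Return a list of problems found; an empty list means the checks passed."""
    problems = [f"missing key: {key}" for key in REQUIRED_KEYS if key not in config]
    if problems:
        return problems
    for key in ("loss", "optimizer", "output_layer"):
        if not isinstance(config[key], list) or not config[key]:
            problems.append(f"{key} must be a non-empty list")
    rate = config["learning_rate"]
    if len(rate) != 2 or not all(0 <= value <= 1 for value in rate):
        problems.append("learning_rate must be two numbers between 0 and 1")
    layers = config["middle_layers_configuration"]
    if not layers:
        problems.append("middle_layers_configuration needs at least one layer")
    for index, layer in enumerate(layers):
        if len(layer) != 4:
            problems.append(f"layer {index} must have 4 elements")
            continue
        _, units, dropout, activations = layer
        if len(units) != 2 or not all(isinstance(u, int) for u in units):
            problems.append(f"layer {index}: units must be two integers")
        if len(dropout) != 2 or not all(0 <= d <= 1 for d in dropout):
            problems.append(f"layer {index}: dropout must be two numbers in [0, 1]")
        if not activations:
            problems.append(f"layer {index}: needs at least one activation")
    return problems
```

Returning a list of problems rather than raising on the first one lets you fix every issue in a single pass.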

### Execution of the Search

To run the search for optimal hyperparameter configurations, add the following line to your program. Here configuration_json is the configuration object you built earlier, data_model is the data object you built earlier, and n is the number of iterations you want the algorithm to perform (I recommend starting with 3):


```iterations = siat.search(configuration_json, data_model, n)```

### Interpreting the Results
The iterations variable holds a dictionary with one element per iteration. Each key is the iteration index as a string, for example: '0', '1', '2', ...

Each iteration is itself a dictionary whose 'results' element holds a list of result objects like the following:


```python
[
    {
        'id': '831865d9-cd6a-492b-a461-c823cce8644e',
        'loss': 'categorical_crossentropy',
        'model': '',
        'result': 0.976190447807312,
        'optimizer': 'Adam',
        'output_layer': 'sigmoid',
        'learning_rate': 0.06670000000000001,
        'middle_layers': [
            {'id': '87b14a0a-eb15-42b5-a998-5f452d1ae622', 'units': 34, 'dropout': 0, 'activation': 'relu'},
        ],
        'x_characteristics': 4,
    },
    # ... (remaining result objects elided)
]
```


In this example, the model reached an accuracy of 0.976 using the categorical cross-entropy loss function, the Adam optimizer with a learning rate of 0.0667, a sigmoid output activation, and a single hidden layer of 34 neurons with ReLU activation and no dropout.
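
Reading a result object by eye gets tedious with many models. As a minimal sketch, the hypothetical helper `describe_result` below (not part of search_in_a_third) turns one result object into a single summary line, using only the fields shown in the example above.

```python
def describe_result(result):
    """Render one result object as a short human-readable line."""
    layers = ", ".join(
        f"{layer['units']} units/{layer['activation']}/dropout {layer['dropout']}"
        for layer in result["middle_layers"]
    )
    return (
        f"result={result['result']:.3f} loss={result['loss']} "
        f"optimizer={result['optimizer']} lr={result['learning_rate']:.4f} "
        f"output={result['output_layer']} layers=[{layers}]"
    )


# The result object from the example in this section.
example = {
    'id': '831865d9-cd6a-492b-a461-c823cce8644e',
    'loss': 'categorical_crossentropy',
    'model': '',
    'result': 0.976190447807312,
    'optimizer': 'Adam',
    'output_layer': 'sigmoid',
    'learning_rate': 0.06670000000000001,
    'middle_layers': [
        {'id': '87b14a0a-eb15-42b5-a998-5f452d1ae622', 'units': 34,
         'dropout': 0, 'activation': 'relu'},
    ],
    'x_characteristics': 4,
}
print(describe_result(example))
```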

To get the best-performing model of an iteration, read the iteration's 'best_performer' element. In the following line, i is the integer index of the iteration, converted to a string because the iteration keys are strings:


```best_performer = iterations[str(i)]['best_performer']```

To get the most promising search space that an iteration proposes for the next iteration, read its 'next_configuration' element, where i is again the integer index of the iteration:

```next_configuration = iterations[str(i)]['next_configuration']```
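
To compare models across all iterations rather than within one, the 'results' lists can be scanned directly. The sketch below is a hypothetical helper, not part of search_in_a_third; it assumes a higher 'result' value is better (as with accuracy), and the sample dictionary contains placeholder values for illustration only, not real search output.

```python
def best_across_iterations(iterations):
    """Return (iteration_key, result_object) with the highest 'result' value."""
    best_key, best_result = None, None
    for key in sorted(iterations, key=int):  # keys are '0', '1', '2', ...
        for result in iterations[key]["results"]:
            if best_result is None or result["result"] > best_result["result"]:
                best_key, best_result = key, result
    return best_key, best_result


# Placeholder data in the documented shape (values are illustrative only).
sample_iterations = {
    "0": {"results": [{"id": "a", "result": 0.90}, {"id": "b", "result": 0.93}]},
    "1": {"results": [{"id": "c", "result": 0.97}, {"id": "d", "result": 0.95}]},
}
key, result = best_across_iterations(sample_iterations)
print(key, result["id"], result["result"])
```

Sorting the keys with `key=int` keeps the iterations in numeric order even when there are ten or more of them.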



## License

This project is licensed under the MIT License - see the [LICENSE](LICENSE) file for details.

## Author

- Diego Larriera

For more information, please contact proflarriera@gmail.com.

            

Raw data

{
    "_id": null,
    "home_page": "https://github.com/proflarriera/searchinathird",
    "name": "search-in-a-third",
    "maintainer": null,
    "docs_url": null,
    "requires_python": null,
    "maintainer_email": null,
    "keywords": "hyperparameter optimization, neural networks, machine learning, greedy algorithm, heuristic",
    "author": "Diego Larriera",
    "author_email": "proflarriera@gmail.com",
    "download_url": "https://files.pythonhosted.org/packages/d8/47/4c0b7dc879c18cd976ca5cc1a551f6af51a8726dd96ab703baa266c7af9b/search_in_a_third-0.1.1.tar.gz",
    "platform": null,
    "bugtrack_url": null,
    "license": "MIT",
    "summary": "A Python package for efficient hyperparameter optimization in neural networks, using a greedy algorithm guided by heuristic directions.",
    "version": "0.1.1",
    "project_urls": {
        "Homepage": "https://github.com/proflarriera/searchinathird"
    },
    "split_keywords": [
        "hyperparameter optimization",
        " neural networks",
        " machine learning",
        " greedy algorithm",
        " heuristic"
    ],
    "urls": [
        {
            "comment_text": "",
            "digests": {
                "blake2b_256": "4db2f4bb9203cde3f1d9375cb9802fdfb34f5b7e40260da135ba8c0cdde17b29",
                "md5": "5920865e8a4e10f4ce4ac29b59ef9e9b",
                "sha256": "ccb8109009368f0e212c24e2e1d4802eb23502bf8e23c798b2b3a0a4dad3fc2a"
            },
            "downloads": -1,
            "filename": "search_in_a_third-0.1.1-py3-none-any.whl",
            "has_sig": false,
            "md5_digest": "5920865e8a4e10f4ce4ac29b59ef9e9b",
            "packagetype": "bdist_wheel",
            "python_version": "py3",
            "requires_python": null,
            "size": 11863,
            "upload_time": "2024-04-01T10:38:05",
            "upload_time_iso_8601": "2024-04-01T10:38:05.345315Z",
            "url": "https://files.pythonhosted.org/packages/4d/b2/f4bb9203cde3f1d9375cb9802fdfb34f5b7e40260da135ba8c0cdde17b29/search_in_a_third-0.1.1-py3-none-any.whl",
            "yanked": false,
            "yanked_reason": null
        },
        {
            "comment_text": "",
            "digests": {
                "blake2b_256": "d8474c0b7dc879c18cd976ca5cc1a551f6af51a8726dd96ab703baa266c7af9b",
                "md5": "89494904f887ce21295bab9b759036f6",
                "sha256": "4f491195e6df39c8b34006b95d9e9d173c5725a1721bfcb75e5a642ec8348645"
            },
            "downloads": -1,
            "filename": "search_in_a_third-0.1.1.tar.gz",
            "has_sig": false,
            "md5_digest": "89494904f887ce21295bab9b759036f6",
            "packagetype": "sdist",
            "python_version": "source",
            "requires_python": null,
            "size": 14020,
            "upload_time": "2024-04-01T10:38:06",
            "upload_time_iso_8601": "2024-04-01T10:38:06.775085Z",
            "url": "https://files.pythonhosted.org/packages/d8/47/4c0b7dc879c18cd976ca5cc1a551f6af51a8726dd96ab703baa266c7af9b/search_in_a_third-0.1.1.tar.gz",
            "yanked": false,
            "yanked_reason": null
        }
    ],
    "upload_time": "2024-04-01 10:38:06",
    "github": true,
    "gitlab": false,
    "bitbucket": false,
    "codeberg": false,
    "github_user": "proflarriera",
    "github_project": "searchinathird",
    "travis_ci": false,
    "coveralls": false,
    "github_actions": true,
    "requirements": [
        {
            "name": "tensorflow",
            "specs": []
        },
        {
            "name": "scikit-learn",
            "specs": []
        },
        {
            "name": "pandas",
            "specs": []
        }
    ],
    "lcname": "search-in-a-third"
}