# FSRLearning - Python Library
[![Downloads](https://static.pepy.tech/badge/FSRLearning)](https://pepy.tech/project/FSRLearning)
[![Downloads](https://static.pepy.tech/badge/FSRLearning/month)](https://pepy.tech/project/FSRLearning)
FSRLearning is a Python library for feature selection using reinforcement learning. It's designed to be easy to use and efficient, particularly for selecting the most relevant features from a very large set.
## Installation
Install FSRLearning using pip:
```bash
pip install FSRLearning
```
## Example usage
### Data Pre-processing
#### The Dataset
In this example, we're using the Australian credit approval dataset. It has 14 features that have been intentionally anonymized. The goal is to predict whether the label is 0 or 1. We're using this dataset to demonstrate how to use the library, but the model can work with any dataset. You can find more details about the dataset [here](https://archive.ics.uci.edu/dataset/143/statlog+australian+credit+approval).
#### The process
The first step is to pre-process the data. The feature selection method takes two pandas objects as input: X, the dataset with all the features to evaluate, and y, the label to predict. **It is highly recommended to create a mapping between the features and a list of numbers**, so that each feature is associated with one number. Here is an example of the pre-processing step on a dataset with 14 features and 1 label column.
```python
import pandas as pd
# Get the pandas DataFrame
australian_data = pd.read_csv('australian_data.csv', header=None)
# Get the dataset with the features
X = australian_data.drop(14, axis=1)
# Get the dataset with the label values
y = australian_data[14]
```
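The recommended mapping between features and numbers can be sketched as follows. This is only a minimal illustration for a dataset with named columns: the column names are hypothetical (the Australian dataset is anonymized and already uses integer column labels when loaded with `header=None`).

```python
# Hypothetical column names, used only for illustration.
feature_names = ["age", "income", "debt", "years_employed"]

# Map each feature name to an integer, and keep the inverse mapping
# so that selection results can be translated back to readable names.
name_to_index = {name: i for i, name in enumerate(feature_names)}
index_to_name = {i: name for name, i in name_to_index.items()}

print(name_to_index)
print(index_to_name)

# With a pandas DataFrame, the mapping could then be applied with:
#   X = X.rename(columns=name_to_index)
```

Keeping the inverse dictionary makes it easy to report the selected features by name once the selection has run.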
After this step we can simply run a feature selection and ranking process that maximises a metric.
```python
from FSRLearning import FeatureSelectorRL
# Create the object of feature selection with RL
fsrl_obj = FeatureSelectorRL(14, nb_iter=200)
# Returns the results of the selection and the ranking
results = fsrl_obj.fit_predict(X, y)
results
```
The `FeatureSelectorRL` class has several parameters that can be tuned. Here are all of them and the values they can take.
- feature_number (integer): number of features in the DataFrame X
- feature_structure (dictionary, optional): dictionary for the graph implementation
- eps (float [0; 1], optional): probability of choosing a random next state; 0 gives a purely greedy algorithm and 1 a purely random one
- alpha (float [0; 1], optional): controls the rate of updates; 0 means the values are not updated and 1 means they are updated as much as possible
- gamma (float [0; 1], optional): moderation factor applied to the observation of the next state; 0 gives shortsighted behavior and 1 farsighted behavior
- nb_iter (int, optional): number of sequences to go through the graph
- starting_state ("empty" or "random", optional): if "empty", the algorithm starts from the empty state; if "random", it starts from a random state in the graph
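To give an intuition for `eps`, `alpha` and `gamma`, here is a minimal sketch of an epsilon-greedy choice and a temporal-difference update in their generic textbook form. It is not taken from the library's source code: the function names `choose_action` and `td_update` are hypothetical, and the library's internal update rule may differ.

```python
import random

def choose_action(values, eps, rng=random):
    """Epsilon-greedy choice: random index with probability eps, else the best one."""
    if rng.random() < eps:
        return rng.randrange(len(values))
    return max(range(len(values)), key=lambda a: values[a])

def td_update(value, reward, next_value, alpha, gamma):
    """Move `value` toward the target `reward + gamma * next_value` at rate alpha."""
    target = reward + gamma * next_value
    return value + alpha * (target - value)

# eps = 0 is purely greedy: the index of the largest value is always chosen.
print(choose_action([0.1, 0.7, 0.3], eps=0.0))

# alpha = 0 leaves the value unchanged, alpha = 1 replaces it with the target.
print(td_update(0.5, reward=1.0, next_value=2.0, alpha=0.0, gamma=0.9))
print(td_update(0.5, reward=1.0, next_value=2.0, alpha=1.0, gamma=0.9))

# gamma = 0 ignores the next state entirely (shortsighted behavior).
print(td_update(0.5, reward=1.0, next_value=2.0, alpha=1.0, gamma=0.0))
```

In this formulation, a larger `eps` favors exploration of new states, while `alpha` and `gamma` trade off how quickly and how far ahead the estimates are adjusted.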
The output of the selection process is a 5-tuple object.
- Index of the features that have been sorted
- Number of times that each feature has been chosen
- Mean reward brought by each feature
- Ranking of the features from the least important to the most important
- Number of states visited
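Since the output is a plain 5-tuple, it can help to give its elements names. The sketch below wraps a tuple in a `namedtuple`, following the order listed above. The field names are hypothetical labels, and the placeholder values are made up for illustration only; they are not real output of the library, and the actual type of each element may differ.

```python
from collections import namedtuple

# Hypothetical field names for the five documented outputs, in order.
SelectionResults = namedtuple(
    "SelectionResults",
    [
        "sorted_feature_index",
        "times_chosen",
        "mean_reward",
        "ranking",
        "nb_states_visited",
    ],
)

# Placeholder values for illustration only, not real results.
placeholder = ([2, 0, 1], [5, 3, 1], [0.12, 0.08, 0.01], [1, 0, 2], 42)

named = SelectionResults(*placeholder)
print(named.ranking)
print(named.nb_states_visited)
```

In practice, `SelectionResults(*results)` would be applied to the tuple returned by `fit_predict`, assuming it has exactly five elements as described.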
## Existing methods
- Compare the performance of the FSRLearning library with RFE from scikit-learn:
```python
fsrl_obj.compare_with_benchmark(X, y, results)
```
Returns some comparisons and plots a graph with the metric for each set of selected features. It is useful for parameter tuning.
- Get the evolution of the number of states visited for the first time and of states already visited:
```python
fsrl_obj.get_plot_ratio_exploration()
```
Returns a plot. It is useful to get an overview of how the graph is browsed and to tune the epsilon parameter (exploration parameter).
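The quantity behind such a plot can be sketched in a few lines. This is an illustration of the idea rather than the library's implementation: the function `exploration_counts` is hypothetical, and representing a state as a `frozenset` of feature indices is an assumption.

```python
def exploration_counts(visited_states):
    """Return cumulative (first_visits, revisits) after each step of a trajectory."""
    seen = set()
    first_visits, revisits = 0, 0
    history = []
    for state in visited_states:
        if state in seen:
            revisits += 1
        else:
            seen.add(state)
            first_visits += 1
        history.append((first_visits, revisits))
    return history

# Hypothetical trajectory: each state is the set of selected feature indices.
trajectory = [
    frozenset(),
    frozenset({0}),
    frozenset({0, 3}),
    frozenset({0}),
    frozenset(),
]

for step, counts in enumerate(exploration_counts(trajectory)):
    print(step, counts)
```

A trajectory in which revisits grow much faster than first visits suggests that a larger epsilon could be worth trying.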
- Get an overview of the relative impact of each feature on the model:
```python
fsrl_obj.get_feature_strengh(results)
```
Returns a bar plot.
- Get an overview of the action of the stop conditions:
```python
fsrl_obj.get_depth_of_visited_states()
```
Returns a plot. It is useful to see how deep the Markov Decision Process goes in the graph.
## Your contribution is welcome!
- Automate the data processing step and generalize the input data format and type
- Distribute the computation of each reward to make the algorithm faster
- Add more visualization and feedback methods
## References
This library has been implemented with the help of these two articles:
- Sali Rasoul, Sodiq Adewole and Alphonse Akakpo, FEATURE SELECTION USING REINFORCEMENT LEARNING (2021)
- Seyed Mehdin Hazrati Fard, Ali Hamzeh and Sattar Hashemi, USING REINFORCEMENT LEARNING TO FIND AN OPTIMAL SET OF FEATURES (2013)
Raw data
{
"_id": null,
"home_page": "https://github.com/blefo/FSRLearning",
"name": "FSRLearning",
"maintainer": null,
"docs_url": null,
"requires_python": null,
"maintainer_email": null,
"keywords": "feature, selection, reinforcement learning, large dataset, ai",
"author": "Baptiste Lefort",
"author_email": "lefort.baptiste@icloud.com",
"download_url": "https://files.pythonhosted.org/packages/2d/04/633d5ce611f2d6f96b07fde08c45b25eba9e4deab43f24036a4368805518/fsrlearning-1.0.7.tar.gz",
"platform": null,
"bugtrack_url": null,
"license": "MIT",
"summary": "The first feature selection method based on reinforcement learning - Python library available on pip for a fast deployment.",
"version": "1.0.7",
"project_urls": {
"Homepage": "https://github.com/blefo/FSRLearning"
},
"split_keywords": [
"feature",
" selection",
" reinforcement learning",
" large dataset",
" ai"
],
"urls": [
{
"comment_text": "",
"digests": {
"blake2b_256": "29f7f42240f7a71d681080707fd2847e4ea50d14cd17d3b96ba1db419c4c4f5b",
"md5": "9df39ab88953595b321ca6565c9dfecf",
"sha256": "de8e60c8e5146ab9355d719324bab2d177a7fee804da8a7638847356e1f3e228"
},
"downloads": -1,
"filename": "FSRLearning-1.0.7-py3-none-any.whl",
"has_sig": false,
"md5_digest": "9df39ab88953595b321ca6565c9dfecf",
"packagetype": "bdist_wheel",
"python_version": "py3",
"requires_python": null,
"size": 12081,
"upload_time": "2024-06-17T22:25:47",
"upload_time_iso_8601": "2024-06-17T22:25:47.166491Z",
"url": "https://files.pythonhosted.org/packages/29/f7/f42240f7a71d681080707fd2847e4ea50d14cd17d3b96ba1db419c4c4f5b/FSRLearning-1.0.7-py3-none-any.whl",
"yanked": false,
"yanked_reason": null
},
{
"comment_text": "",
"digests": {
"blake2b_256": "2d04633d5ce611f2d6f96b07fde08c45b25eba9e4deab43f24036a4368805518",
"md5": "544133ba393d27afa7e524549d7519c3",
"sha256": "e3b975a2fe0513c402babe64393e67666418801ecd78019d432c12e0dc4d4b6b"
},
"downloads": -1,
"filename": "fsrlearning-1.0.7.tar.gz",
"has_sig": false,
"md5_digest": "544133ba393d27afa7e524549d7519c3",
"packagetype": "sdist",
"python_version": "source",
"requires_python": null,
"size": 10301,
"upload_time": "2024-06-17T22:25:48",
"upload_time_iso_8601": "2024-06-17T22:25:48.651414Z",
"url": "https://files.pythonhosted.org/packages/2d/04/633d5ce611f2d6f96b07fde08c45b25eba9e4deab43f24036a4368805518/fsrlearning-1.0.7.tar.gz",
"yanked": false,
"yanked_reason": null
}
],
"upload_time": "2024-06-17 22:25:48",
"github": true,
"gitlab": false,
"bitbucket": false,
"codeberg": false,
"github_user": "blefo",
"github_project": "FSRLearning",
"travis_ci": false,
"coveralls": false,
"github_actions": true,
"requirements": [],
"lcname": "fsrlearning"
}