# kydavra
Kydavra is a Python package for feature selection, inspired by scikit-learn. It uses statistical methods to extract, from plain pandas DataFrames, the columns that are related to the column your model should predict.
This version of kydavra provides the following feature selection methods:
* ANOVA test selector (ANOVASelector)
* Chi squared selector (ChiSquaredSelector)
* Genetic Algorithm selector (GeneticAlgorithmSelector)
* Kendall Correlation selector (KendallCorrelationSelector)
* Lasso selector (LassoSelector)
* Pearson Correlation selector (PearsonCorrelationSelector)
* Point-Biserial selector (PointBiserialCorrSelector)
* P-value selector (PValueSelector)
* Spearman Correlation selector (SpermanCorrelationSelector)
* Shannon selector (ShannonSelector)
* ElasticNet Selector (ElasticNetSelector)
* M3U Selector (M3USelector)
* MUSE Selector (MUSESelector)
* Mixer Selector (MixerSelector)
* PCA Filter (PCAFilter)
* PCA Reducer (PCAReducer)
* LDA Reducer (LDAReducer)
* Bregman Divergence selector (BregmanDivergenceSelector)
* Fisher Selector (FisherSelector)
* ICA Reducer (ICAReducer)
* ICA Filter (ICAFilter)
* Itakura-Saito Divergence selector (ItakuraSaitoSelector)
* Jensen-Shannon Divergence selector (JensenShannonSelector)
* Kullback-Leibler selector (KullbackLeiblerSelector)
* MultiSURF selector (MultiSURFSelector)
* Phik selector (PhikSelector)
* ReliefF selector (ReliefFSelector)
All these methods take a pandas DataFrame and the name of the y column, and select from the remaining columns in the DataFrame.

## How to use kydavra
To use a selector from kydavra, import it from the package as follows:
```python
from kydavra import PValueSelector
```
Class names are written above in parentheses.

Next, create an object of this algorithm (I will use the p-value method as an example).
```python
method = PValueSelector()
```
To get the features that the method considers best, call the `select` function, passing the pandas DataFrame and the name of the column that you want your model to predict.
```python
selected_columns = method.select(df, 'target')
```
The returned value is a list of the columns selected by the algorithm.
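To make the contract of `select` concrete (a table plus the name of the target column goes in, a list of column names comes out), here is a minimal pure-Python sketch of a correlation-based filter. It is not kydavra's implementation: `pearson`, `select_by_pearson`, the `min_corr` threshold, and the toy data are all assumptions made for illustration, and a dict of lists stands in for the DataFrame so the sketch has no dependencies.

```python
import math


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length numeric sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        return 0.0  # a constant column carries no linear signal
    return cov / math.sqrt(var_x * var_y)


def select_by_pearson(table, y_column, min_corr=0.5):
    """Return the names of columns whose |correlation| with y_column is at least min_corr.

    Hypothetical helper: the threshold and the criterion are illustrative only.
    """
    target = table[y_column]
    return [
        name
        for name, values in table.items()
        if name != y_column and abs(pearson(values, target)) >= min_corr
    ]


# Toy data: "size" tracks "price" closely, "noise" does not.
table = {
    "size": [50, 60, 80, 100, 120],
    "noise": [3, 1, 3, 1, 3],
    "price": [100, 125, 160, 210, 240],
}

selected_columns = select_by_pearson(table, "price")
print(selected_columns)
```

With a real DataFrame, the list returned by a kydavra selector can be used the same way, for example to keep only the selected columns before training a model.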
Some methods can plot the process of selecting the best features. In these plots, the dotted features are the ones that were not selected by the method.

*ChiSquaredSelector*
```python
method.plot_chi2()
```
This plots the chi-squared values. To save the plot to a file, use
```python
method.plot_chi2(save=True, file_path='FILE/PATH.png')
```
To plot the p-values, use
```python
method.plot_p_value()
```

*LassoSelector*
```python
method.plot_process()
```
You can also save the plot using the same parameters.

*PValueSelector*
```python
method.plot_process()
```
Some advice:
* Use ChiSquaredSelector for categorical features.
* Use LassoSelector and PValueSelector for regression problems.
* Use PointBiserialCorrSelector for binary classification problems.
* Use ShannonSelector to decide whether to keep the NaN values (as another value), and to drop columns with a lot of NaN values.
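
As a rough illustration of the last point, the sketch below flags columns in which most values are missing. This README does not describe the criterion ShannonSelector actually applies, so the code uses a plain missing-value fraction as a stand-in. The helpers `is_missing`, `nan_fraction`, `columns_to_drop`, and the `max_nan_fraction` threshold are hypothetical, and a dict of lists again stands in for the DataFrame.

```python
import math


def is_missing(value):
    """True for None and for float NaN."""
    return value is None or (isinstance(value, float) and math.isnan(value))


def nan_fraction(values):
    """Share of missing entries in a column."""
    return sum(is_missing(v) for v in values) / len(values)


def columns_to_drop(table, max_nan_fraction=0.5):
    """Names of columns whose missing-value share exceeds the (assumed) threshold."""
    return [
        name
        for name, values in table.items()
        if nan_fraction(values) > max_nan_fraction
    ]


nan = float("nan")
table = {
    "age": [31, nan, 45, 52],          # 25% missing: kept
    "cabin": [nan, nan, "C85", nan],   # 75% missing: flagged
    "fare": [7.25, 71.3, 8.05, 53.1],  # nothing missing: kept
}

to_drop = columns_to_drop(table)
print(to_drop)
```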
With love from Sigmoid.
We are open to feedback. Please send your impressions to vladimir.stojoc@gmail.com
Raw data
{
"_id": null,
"home_page": "https://github.com/SigmoidAI/kydavra",
"name": "kydavra",
"maintainer": "",
"docs_url": null,
"requires_python": "",
"maintainer_email": "",
"keywords": "ml,machine learning,feature selection,python",
"author": "SigmoidAI - P\u0103p\u0103lu\u021b\u0103 Vasile, Stojoc Vladimir",
"author_email": "vladimir.stojoc@gmail.com",
"download_url": "https://files.pythonhosted.org/packages/fc/0a/d2e796ac41845872dcacfdecb1fe3de60eb4458af9512aa37b8ae384b6db/kydavra-0.3.4.tar.gz",
"platform": null,
"bugtrack_url": null,
"license": "MIT",
"summary": "Kydavra is a sci-kit learn inspired python library with feature selection methods for Data Science and Macine Learning Model development",
"version": "0.3.4",
"split_keywords": [
"ml",
"machine learning",
"feature selection",
"python"
],
"urls": [
{
"comment_text": "",
"digests": {
"blake2b_256": "1b6e00488814fd1a9637a1e95e5c6b2a2e06bf06de58cc472e8c8e3f9b845f2b",
"md5": "ea67fc9c0446d4dba4b5dc2f0b75133e",
"sha256": "9fdfc25b85a4b0a4de5b580fa1bcec71000ff4a366072f8ca2d2d424ce2c2b05"
},
"downloads": -1,
"filename": "kydavra-0.3.4-py3-none-any.whl",
"has_sig": false,
"md5_digest": "ea67fc9c0446d4dba4b5dc2f0b75133e",
"packagetype": "bdist_wheel",
"python_version": "py3",
"requires_python": null,
"size": 47144,
"upload_time": "2023-02-06T19:33:23",
"upload_time_iso_8601": "2023-02-06T19:33:23.012034Z",
"url": "https://files.pythonhosted.org/packages/1b/6e/00488814fd1a9637a1e95e5c6b2a2e06bf06de58cc472e8c8e3f9b845f2b/kydavra-0.3.4-py3-none-any.whl",
"yanked": false,
"yanked_reason": null
},
{
"comment_text": "",
"digests": {
"blake2b_256": "fc0ad2e796ac41845872dcacfdecb1fe3de60eb4458af9512aa37b8ae384b6db",
"md5": "673c702add11953cc619a87a28eb884f",
"sha256": "cf45e1fe2492c54d0d38c3f5752c127e116e96c58f4142286d046d7aa5004c44"
},
"downloads": -1,
"filename": "kydavra-0.3.4.tar.gz",
"has_sig": false,
"md5_digest": "673c702add11953cc619a87a28eb884f",
"packagetype": "sdist",
"python_version": "source",
"requires_python": null,
"size": 28965,
"upload_time": "2023-02-06T19:33:24",
"upload_time_iso_8601": "2023-02-06T19:33:24.450872Z",
"url": "https://files.pythonhosted.org/packages/fc/0a/d2e796ac41845872dcacfdecb1fe3de60eb4458af9512aa37b8ae384b6db/kydavra-0.3.4.tar.gz",
"yanked": false,
"yanked_reason": null
}
],
"upload_time": "2023-02-06 19:33:24",
"github": true,
"gitlab": false,
"bitbucket": false,
"github_user": "SigmoidAI",
"github_project": "kydavra",
"travis_ci": false,
"coveralls": false,
"github_actions": false,
"lcname": "kydavra"
}