#### Statistical Processing of attributes via Recursive Cross Elimination
*SPARCE*
sparce is a statistical machine learning package that automates
feature selection in genomics data files. It was originally outfitted
for general use with genetics, transcriptomics, methylomics and ATAC-seq data.
Installation
```console
conda create -n sparce pip
conda activate sparce
```
```console
pip install sparce
```
***HOW TO RUN***
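```{python}
'''
Run inside a script
'''
import sparce
import pandas as pd
from sklearn.preprocessing import OrdinalEncoder

def preprocess(file):
    # `file` is the path to a CSV file containing the features and the class column
    X = pd.read_csv(file)
    target_col = 'a column in X'  # placeholder: name of the class column
    # OrdinalEncoder expects 2-D input, so select the column with double brackets
    enc = OrdinalEncoder()
    X[target_col] = enc.fit_transform(X[[target_col]]).ravel()
    # Separate the encoded class labels from the feature matrix
    y = X[target_col]
    X = X.drop(target_col, axis=1)
    return X, y

file = 'path/to/data.csv'  # placeholder: replace with your own file
X, y = preprocess(file)
nFeatures = 5
nJobs = 10
CV = sparce.feature_selection.grade_features(X=X, y=y, nFeatures=nFeatures, nJobs=nJobs)
```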
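To see what the preprocessing step produces before anything is passed to sparce, the encoding can be tried on a small in-memory table. This is only a sketch: the column names (`g1`, `g2`, `g3`, `label`) and the values are made up for illustration, and no sparce function is called.

```python
import io

import pandas as pd
from sklearn.preprocessing import OrdinalEncoder

# Toy CSV with made-up values: four samples, three numeric features, one class column
csv_text = """g1,g2,g3,label
0.1,1.2,3.4,tumor
0.3,0.9,2.8,normal
0.2,1.1,3.1,tumor
0.4,0.8,2.9,normal
"""

df = pd.read_csv(io.StringIO(csv_text))

# OrdinalEncoder works on 2-D input, hence the double brackets;
# ravel() flattens the (n, 1) result back into a single column
enc = OrdinalEncoder()
df["label"] = enc.fit_transform(df[["label"]]).ravel()

# Split the encoded labels from the feature matrix
y = df["label"]
X = df.drop("label", axis=1)

print(list(enc.categories_[0]))  # ['normal', 'tumor']
print(y.tolist())                # [1.0, 0.0, 1.0, 0.0]
print(X.shape)                   # (4, 3)
```

The encoder sorts the class names, so `normal` maps to `0.0` and `tumor` to `1.0`; `enc.categories_` keeps the mapping for translating results back to the original labels.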
# CLI
To use the command-line interface, clone the repository, import `args_parse` into `sparce.py`,
and re-invoke the main function.
The script is then ready to run from the CLI.
```console
python sparce.py -x <file> -y <target> -nFeatures <int> -nJobs <int>
conda deactivate
```
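For readers who want to see how flags of this shape can be parsed, here is a minimal sketch using the standard-library `argparse`. It is an assumption about how such a parser could look, not the repository's actual `args_parse` module; the function name `build_parser` and the help strings are hypothetical.

```python
import argparse


def build_parser():
    # Hypothetical parser mirroring the flags in the command above;
    # the real args_parse module in the repository may differ.
    parser = argparse.ArgumentParser(prog="sparce.py")
    parser.add_argument("-x", required=True, help="input data file (<file>)")
    parser.add_argument("-y", required=True, help="target (<target>)")
    parser.add_argument("-nFeatures", type=int, required=True,
                        help="number of features (<int>)")
    parser.add_argument("-nJobs", type=int, required=True,
                        help="number of jobs (<int>)")
    return parser


# Parse an explicit argument list instead of sys.argv so the sketch runs anywhere
args = build_parser().parse_args(
    ["-x", "data.csv", "-y", "target", "-nFeatures", "5", "-nJobs", "10"]
)
print(args.x, args.y, args.nFeatures, args.nJobs)  # data.csv target 5 10
```

`argparse` accepts single-dash multi-letter options such as `-nFeatures` and derives the attribute name by stripping the leading dash, which is why the values are available as `args.nFeatures` and `args.nJobs`.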
sparce assumptions
The data are in tidy format (features x samples) with a column labeled "target".
The features are continuous attributes in a classification problem.
The classes are mutually exclusive.
nFeatures > nSamples: the goal is to reduce the dimensionality of the problem until nSamples > nFeatures.
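The assumptions above can be checked before running a selection. The sketch below is a hypothetical helper, not part of sparce: the name `check_sparce_assumptions` is invented, and it assumes one row per sample and one column per feature, matching the script example where the class column is dropped with `axis=1`.

```python
import pandas as pd
from pandas.api.types import is_numeric_dtype


def check_sparce_assumptions(df, target_col="target"):
    """Return a list of problems found; an empty list means none were detected.

    Hypothetical helper, not part of sparce. Assumes rows are samples and
    columns are features plus the target column.
    """
    if target_col not in df.columns:
        return [f"missing column {target_col!r}"]

    problems = []
    features = df.drop(columns=[target_col])

    # Features are expected to be continuous, so every feature column must be numeric
    non_numeric = [c for c in features.columns if not is_numeric_dtype(features[c])]
    if non_numeric:
        problems.append(f"non-numeric feature columns: {non_numeric}")

    # Mutually exclusive classes: every sample carries exactly one label
    if df[target_col].isna().any():
        problems.append("some samples have no class label")

    # The package targets the case with more features than samples
    n_samples, n_features = features.shape
    if n_features <= n_samples:
        problems.append(
            f"expected more features than samples, got {n_features} features "
            f"and {n_samples} samples"
        )
    return problems


# Toy example with made-up values: 3 samples, 5 numeric features, one label each
toy = pd.DataFrame(
    [[0.1, 1.2, 3.4, 0.0, 2.2, "tumor"],
     [0.3, 0.9, 2.8, 0.1, 2.0, "normal"],
     [0.2, 1.1, 3.1, 0.0, 2.5, "tumor"]],
    columns=["g1", "g2", "g3", "g4", "g5", "target"],
)
print(check_sparce_assumptions(toy))  # []
```

Returning a list of messages rather than raising on the first failure lets all violations be reported at once.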
Raw data
{
"_id": null,
"home_page": "https://github.com/michaelSkaro/sparce/",
"name": "sparce",
"maintainer": "",
"docs_url": null,
"requires_python": ">=3.6",
"maintainer_email": "",
"keywords": "",
"author": "Michael Skaro",
"author_email": "mskaro.ms@gmail.com",
"download_url": "https://files.pythonhosted.org/packages/5d/7f/1e587a4a1a08b9aa17fcf9465510e9e28e55cfc56eee9ce60b454eff1628/sparce-0.1.14.tar.gz",
"platform": null,
"bugtrack_url": null,
"license": "",
"summary": "A python package for automated feature selection",
"version": "0.1.14",
"project_urls": {
"Bug Tracker": "https://github.com/michaelSkaro/sparce/issues",
"Homepage": "https://github.com/michaelSkaro/sparce/"
},
"split_keywords": [],
"urls": [
{
"comment_text": "",
"digests": {
"blake2b_256": "340e96f2abd5687710ee44553250f9b0e35830864811080347c29ea390d42221",
"md5": "b27c9bf3a9fcebe83c51a53a6d84364e",
"sha256": "9fe8dea7c3a5ebc29242840f19b4c72141b47413de1d63ffbd2c974a283623d2"
},
"downloads": -1,
"filename": "sparce-0.1.14-py3-none-any.whl",
"has_sig": false,
"md5_digest": "b27c9bf3a9fcebe83c51a53a6d84364e",
"packagetype": "bdist_wheel",
"python_version": "py3",
"requires_python": ">=3.6",
"size": 20507,
"upload_time": "2023-11-04T16:34:33",
"upload_time_iso_8601": "2023-11-04T16:34:33.375460Z",
"url": "https://files.pythonhosted.org/packages/34/0e/96f2abd5687710ee44553250f9b0e35830864811080347c29ea390d42221/sparce-0.1.14-py3-none-any.whl",
"yanked": false,
"yanked_reason": null
},
{
"comment_text": "",
"digests": {
"blake2b_256": "5d7f1e587a4a1a08b9aa17fcf9465510e9e28e55cfc56eee9ce60b454eff1628",
"md5": "7ef07d6043c7a374c0d026d7b1c91c0c",
"sha256": "9967298e1f75b15aee455e94452e54ac6708137e947e9b2615a0e38a6a1a907a"
},
"downloads": -1,
"filename": "sparce-0.1.14.tar.gz",
"has_sig": false,
"md5_digest": "7ef07d6043c7a374c0d026d7b1c91c0c",
"packagetype": "sdist",
"python_version": "source",
"requires_python": ">=3.6",
"size": 11509,
"upload_time": "2023-11-04T16:34:35",
"upload_time_iso_8601": "2023-11-04T16:34:35.115901Z",
"url": "https://files.pythonhosted.org/packages/5d/7f/1e587a4a1a08b9aa17fcf9465510e9e28e55cfc56eee9ce60b454eff1628/sparce-0.1.14.tar.gz",
"yanked": false,
"yanked_reason": null
}
],
"upload_time": "2023-11-04 16:34:35",
"github": true,
"gitlab": false,
"bitbucket": false,
"codeberg": false,
"github_user": "michaelSkaro",
"github_project": "sparce",
"travis_ci": false,
"coveralls": false,
"github_actions": false,
"requirements": [
{
"name": "pandas",
"specs": [
[
"==",
"1.1.4"
]
]
},
{
"name": "scikit-learn",
"specs": [
[
"==",
"0.23.2"
]
]
},
{
"name": "scipy",
"specs": [
[
"==",
"1.5.4"
]
]
},
{
"name": "numpy",
"specs": [
[
"==",
"1.19.4"
]
]
}
],
"lcname": "sparce"
}