# pyLLS
### A Python library for imputing missing gene values with the local least squares algorithm
The local least squares (LLS) algorithm is particularly effective at imputing missing values.<br>
We developed pyLLS by implementing LLS in Python.<br>
pyLLS offers more options than the LLS implementation in R and is significantly faster.<br>
pyLLS is released under the MIT License, which permits free use, including commercial use, by industry and the research community.<br>
Please report bugs to Sejin Oh at <sejin.oh<at>theragenbio.com> or at
<https://github.com/Theragen-Bio/pyLLS/issues>.
## Installation
You can install pyLLS from
[PyPI](https://pypi.org/project/pyLLS/) with:
``` sh
pip install pyLLS
```
## Example
If you call `impute_missing_gene()` without any arguments,<br>
it returns its description.
``` python
>>> import pyLLS
>>> pyLLS.impute_missing_gene()
This function estimates missing values of the specified target probes.
# parameters
ref (pd.DataFrame): reference data. gene x sample (n x p) DataFrame.
target (pd.DataFrame) : Target table containing missing values. gene x sample (i x k) DataFrame.
metric (str) : ['correlation'(default),'L1','L2']
Similarity metric to prioritize the probes for each target.
maxK : maximum number of probe to be used in missing value estimation.
useKneedle : It determines whether Kneedle algorithm should be used (True) or not (False).
If useKneedle==False, then maxK probes will be used to estimate missing values.
verbose : If True, progress is reported. Otherwise, no progress is reported.
n_jobs : Use all threads ('all') or speicified number of threads (int)
addK = Intenger that added to Kneedle's K to prevent underfitting.
This will use K+addK probes to estimate missing values of a gene. (default=1)
return_probes = if true, 'target-table and mgcp' will be returned else 'target' will be returned.
# Return
* target : table with estimated values of missing genes that are not present in original target table.
matrix shape will be (n x k).
* mgcp : missing gene correlative probes. If useKneedle == True, mgcp will have R2-square column.
# tutorial
<------omit-------->
```
## Parameters
``` text
ref (pd.DataFrame) : Reference data. gene x sample (n x p) DataFrame.
target (pd.DataFrame) : Target table containing missing values. gene x sample (i x k) DataFrame.
metric (str) : One of ['correlation' (default), 'L1', 'L2'].
    Similarity metric used to prioritize the probes for each target.
maxK : Maximum number of probes used in missing value estimation.
useKneedle : Whether the Kneedle algorithm is used (True) or not (False).
    If useKneedle == False, maxK probes are used to estimate missing values.
verbose : If True, progress is reported. Otherwise, no progress is reported.
n_jobs : Use all threads ('all') or a specified number of threads (int).
addK : Integer added to Kneedle's K to prevent underfitting.
    K + addK probes are used to estimate the missing values of a gene (default=1).
return_probes : If True, both the target table and mgcp are returned; otherwise only target is returned.
```
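The interplay of `maxK`, `useKneedle`, and `addK` can be illustrated with a small sketch. This is not pyLLS code: `choose_k_sketch` is a hypothetical helper, the knee rule is a simplified form of the Kneedle idea (largest gap between the normalized curve and the diagonal), and capping the result at `max_k` is an assumption.

``` python
import numpy as np


def choose_k_sketch(scores, add_k=1, max_k=None):
    """Hypothetical helper (not part of pyLLS).

    scores[j] is a fit-quality score obtained with j + 1 probes and is
    assumed to rise quickly and then flatten out.
    """
    y = np.asarray(scores, dtype=float)
    if max_k is None:
        max_k = len(y)
    span = y.max() - y.min()
    if span == 0:
        knee = 1  # flat curve: one probe is as good as many
    else:
        x = np.linspace(0.0, 1.0, len(y))   # normalized probe count
        y_norm = (y - y.min()) / span        # normalized score
        # Simplified knee rule: point farthest above the diagonal.
        knee = int(np.argmax(y_norm - x)) + 1
    # Add a safety margin against underfitting, capped at max_k (assumption).
    return min(knee + add_k, max_k)


scores = [0.0, 0.5, 0.8, 0.9, 0.95, 0.97, 0.98, 0.99, 0.995, 1.0]
print(choose_k_sketch(scores, add_k=1))  # knee at K=3, plus addK=1 -> 4
```

With `useKneedle == False`, the parameter description says that `maxK` probes are used directly, so no knee search of this kind would be involved.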
## Returns
``` text
* target : Table with estimated values for the genes that are missing from the original target table.
    The matrix shape is (n x k).
* mgcp : Missing gene correlative probes. If useKneedle == True, mgcp has an R-squared column.
```
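To make the (n x k) shape of the returned table concrete, here is a minimal NumPy/pandas sketch of the local least squares idea. It is an illustration under stated assumptions, not the pyLLS implementation: `lls_impute_sketch` is a hypothetical helper, it uses only the correlation metric with a fixed number of probes, and it fits without an intercept.

``` python
import numpy as np
import pandas as pd


def lls_impute_sketch(ref, target, max_k=5):
    """Hypothetical helper (not part of pyLLS): estimate genes absent from target."""
    shared = target.index.intersection(ref.index)   # probes present in both tables
    missing = ref.index.difference(target.index)    # genes to estimate
    # Pearson correlation between each missing gene and each shared probe,
    # computed across the reference samples.
    corr = np.corrcoef(ref.loc[missing].to_numpy(dtype=float),
                       ref.loc[shared].to_numpy(dtype=float))
    corr = corr[:len(missing), len(missing):]
    rows = {}
    for i, gene in enumerate(missing):
        # Keep the max_k probes most correlated (in absolute value) with the gene.
        top = shared[np.argsort(-np.abs(corr[i]))[:max_k]]
        X = ref.loc[top].to_numpy(dtype=float).T     # reference samples x probes
        y = ref.loc[gene].to_numpy(dtype=float)
        coef, *_ = np.linalg.lstsq(X, y, rcond=None)  # least squares fit in ref
        # Apply the fitted weights to the same probes in the target samples.
        rows[gene] = target.loc[top].to_numpy(dtype=float).T @ coef
    imputed = pd.DataFrame.from_dict(rows, orient="index", columns=target.columns)
    return pd.concat([target, imputed]).loc[ref.index]


rng = np.random.default_rng(0)
ref = pd.DataFrame(rng.normal(size=(20, 10)),
                   index=[f"g{i}" for i in range(20)],
                   columns=[f"s{j}" for j in range(10)])
target = ref.iloc[:15, :5]
print(lls_impute_sketch(ref, target).shape)  # (20, 5): n genes x k target samples
```

The output has one row per gene in `ref` and one column per sample in `target`, which matches the (n x k) shape described above.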
## Tutorial
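You can run the following tutorial code as is.
``` python
import random

import numpy as np
import pandas as pd
import pyLLS

# Reference table: 100 genes x 10 samples, filled with a random permutation of 0..999.
tmp = pd.DataFrame(np.array(random.sample(range(1000), 1000)).reshape(100, 10))
tmp.index = ['g' + str(i) for i in tmp.index]
tmp.columns = ['s' + str(i) for i in tmp.columns]
# Target table: the first 90 genes and first 5 samples, so g90..g99 are missing.
tmp2 = tmp.iloc[:90, :5]
# Estimate the missing genes in the target from the reference.
tmp3 = pyLLS.impute_missing_gene(ref=tmp, target=tmp2)
```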
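Before calling pyLLS, it can help to check which genes the target table lacks and what output shape to expect. The sketch below uses only pandas and NumPy and rebuilds tables with the same layout as the tutorial. The names `ref`, `target`, and `missing_genes` are illustrative, and the expected shape follows the (n x k) rule from the Returns section rather than an actual pyLLS run.

``` python
import numpy as np
import pandas as pd

# Same layout as the tutorial: 100 genes x 10 samples of distinct integers.
rng = np.random.default_rng(0)
ref = pd.DataFrame(rng.permutation(1000).reshape(100, 10),
                   index=[f"g{i}" for i in range(100)],
                   columns=[f"s{j}" for j in range(10)])
target = ref.iloc[:90, :5]

# Genes present in the reference but absent from the target.
missing_genes = ref.index.difference(target.index)
print(len(missing_genes))   # 10 (g90 to g99)

# Shape the imputed table should have according to the Returns section.
expected_shape = (ref.shape[0], target.shape[1])
print(expected_shape)       # (100, 5)
```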
For a more detailed tutorial,<br>please refer to the notebook.
Raw data
{
"_id": null,
"home_page": "https://github.com/osj118/pyLLS",
"name": "pyLLS",
"maintainer": "Sejin Oh",
"docs_url": null,
"requires_python": ">=3.7",
"maintainer_email": "",
"keywords": "missing value imputation,local least square",
"author": "Sejin Oh",
"author_email": "Sejin Oh <agicic@naver.com>",
"download_url": "https://files.pythonhosted.org/packages/78/43/1dc6636692c69e986052a3545bb537c0ba94492644a175000dffc0103cc9/pyLLS-0.5.tar.gz",
"platform": null,
"bugtrack_url": null,
"license": "MIT",
"summary": "Missing value imputation with the local least square algorithm in python",
"version": "0.5",
"project_urls": {
"Homepage": "https://github.com/osj118/pyLLS"
},
"split_keywords": [
"missing value imputation",
"local least square"
],
"urls": [
{
"comment_text": "",
"digests": {
"blake2b_256": "e57b6ec2808a002a1ff2b0955446de5730182ffaadd70baed09d54a4a1a5ad31",
"md5": "27bff8058e09854f5b7697e81d2b693d",
"sha256": "7488e820cc33f5deb45cddba8a08b337a561fd45268d46a9413517f38bdcaf13"
},
"downloads": -1,
"filename": "pyLLS-0.5-py2.py3-none-any.whl",
"has_sig": false,
"md5_digest": "27bff8058e09854f5b7697e81d2b693d",
"packagetype": "bdist_wheel",
"python_version": "py2.py3",
"requires_python": ">=3.7",
"size": 7561,
"upload_time": "2023-09-08T00:58:19",
"upload_time_iso_8601": "2023-09-08T00:58:19.524864Z",
"url": "https://files.pythonhosted.org/packages/e5/7b/6ec2808a002a1ff2b0955446de5730182ffaadd70baed09d54a4a1a5ad31/pyLLS-0.5-py2.py3-none-any.whl",
"yanked": false,
"yanked_reason": null
},
{
"comment_text": "",
"digests": {
"blake2b_256": "78431dc6636692c69e986052a3545bb537c0ba94492644a175000dffc0103cc9",
"md5": "c6d819fe1e9f5cb42342191cdae83a57",
"sha256": "72a8cb21ca879d921fb3d0a93f860d1fab4c851258330c9e3a3c6effbea2bae0"
},
"downloads": -1,
"filename": "pyLLS-0.5.tar.gz",
"has_sig": false,
"md5_digest": "c6d819fe1e9f5cb42342191cdae83a57",
"packagetype": "sdist",
"python_version": "source",
"requires_python": ">=3.7",
"size": 7586,
"upload_time": "2023-09-08T00:58:21",
"upload_time_iso_8601": "2023-09-08T00:58:21.105680Z",
"url": "https://files.pythonhosted.org/packages/78/43/1dc6636692c69e986052a3545bb537c0ba94492644a175000dffc0103cc9/pyLLS-0.5.tar.gz",
"yanked": false,
"yanked_reason": null
}
],
"upload_time": "2023-09-08 00:58:21",
"github": true,
"gitlab": false,
"bitbucket": false,
"codeberg": false,
"github_user": "osj118",
"github_project": "pyLLS",
"travis_ci": false,
"coveralls": false,
"github_actions": false,
"lcname": "pylls"
}