pyLLS


* Name: pyLLS
* Version: 0.5
* Home page: https://github.com/osj118/pyLLS
* Summary: Missing value imputation with the local least square algorithm in python
* Upload time: 2023-09-08 00:58:21
* Maintainer: Sejin Oh
* Author: Sejin Oh
* Requires Python: >=3.7
* License: MIT
* Keywords: missing value imputation, local least square

# pyLLS
### A Python library for missing gene value imputation using the local least squares algorithm

The Local Least Squares (LLS) algorithm is particularly effective at imputing missing values.<br>
We developed pyLLS by implementing LLS in Python.<br>
pyLLS offers more options and is significantly faster than the LLS implementation in R.<br>
pyLLS is released under the MIT License, which permits free use, including commercial use, by industry as well as the research community.<br>

Please report bugs to Sejin Oh, at <sejin.oh<at>theragenbio.com> or at
<https://github.com/Theragen-Bio/pyLLS/issues>.

## Installation

You can install pyLLS from
[PyPI](https://pypi.org/project/pyLLS/) with:

``` sh
pip install pyLLS
```

## Example

If you call `impute_missing_gene()` without any arguments,<br>
it returns its description.

``` python
>>> import pyLLS
>>> pyLLS.impute_missing_gene()

            This function estimates missing values of the specified target probes.
            # parameters
            ref (pd.DataFrame): reference data. gene x sample (n x p) DataFrame.
            target (pd.DataFrame) : Target table containing missing values. gene x sample (i x k) DataFrame.
            metric (str) : ['correlation'(default),'L1','L2']
                           Similarity metric to prioritize the probes for each target.
            maxK : maximum number of probe to be used in missing value estimation.
            useKneedle : It determines whether Kneedle algorithm should be used (True) or not (False).
                         If useKneedle==False, then maxK probes will be used to estimate missing values.
            verbose : If True, progress is reported. Otherwise, no progress is reported.
            n_jobs : Use all threads ('all') or speicified number of threads (int)
            addK = Intenger that added to Kneedle's K to prevent underfitting.
                   This will use K+addK probes to estimate missing values of a gene. (default=1)
            return_probes = if true, 'target-table and mgcp' will be returned else 'target' will be returned.
            # Return
            * target : table with estimated values of missing genes that are not present in original target table.
            matrix shape will be (n x k).
            * mgcp : missing gene correlative probes. If useKneedle == True, mgcp will have R2-square column.
            # tutorial
            <------omit-------->
```

## Parameters
* `ref` (pd.DataFrame): Reference data. Gene x sample (n x p) DataFrame.
* `target` (pd.DataFrame): Target table containing missing values. Gene x sample (i x k) DataFrame.
* `metric` (str): One of `'correlation'` (default), `'L1'`, or `'L2'`. Similarity metric used to prioritize the probes for each target.
* `maxK`: Maximum number of probes to use in missing value estimation.
* `useKneedle`: Whether the Kneedle algorithm is used (`True`) or not (`False`). If `useKneedle == False`, `maxK` probes are used to estimate missing values.
* `verbose`: If `True`, progress is reported. Otherwise, no progress is reported.
* `n_jobs`: Use all threads (`'all'`) or a specified number of threads (int).
* `addK`: Integer added to Kneedle's K to prevent underfitting. K + `addK` probes are used to estimate the missing values of a gene (default = 1).
* `return_probes`: If `True`, both the target table and `mgcp` are returned. Otherwise, only `target` is returned.

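
To make the roles of `metric` and `maxK` more concrete, here is a minimal NumPy sketch of the general local least squares idea: rank the candidate probes by similarity to the gene in the reference data, fit a least squares model on the top-ranked probes, and apply that model to the target samples. This is not pyLLS's implementation. The function names `rank_probes` and `lls_estimate`, the use of absolute Pearson correlation, and the intercept term are all assumptions made for illustration.

```python
import numpy as np


def rank_probes(ref_probes, ref_gene, metric="correlation"):
    """Return probe row indices ordered from most to least similar to ref_gene.

    Hypothetical helper: ref_probes is (probes x samples), ref_gene is (samples,).
    """
    if metric == "correlation":
        # Absolute Pearson correlation; negated so that argsort puts the best first.
        centered = ref_probes - ref_probes.mean(axis=1, keepdims=True)
        gene_centered = ref_gene - ref_gene.mean()
        denom = np.linalg.norm(centered, axis=1) * np.linalg.norm(gene_centered)
        score = -np.abs(centered @ gene_centered / denom)
    elif metric == "L1":
        score = np.abs(ref_probes - ref_gene).sum(axis=1)
    elif metric == "L2":
        score = np.sqrt(((ref_probes - ref_gene) ** 2).sum(axis=1))
    else:
        raise ValueError(f"unknown metric: {metric}")
    return np.argsort(score, kind="stable")


def lls_estimate(ref_probes, ref_gene, target_probes, k, metric="correlation"):
    """Estimate a gene that is absent from the target using the k most similar probes."""
    top = rank_probes(ref_probes, ref_gene, metric)[:k]
    # Fit ref_gene ~ intercept + weighted sum of the selected probes (reference samples).
    design = np.column_stack([np.ones(ref_probes.shape[1]), ref_probes[top].T])
    coef, *_ = np.linalg.lstsq(design, ref_gene, rcond=None)
    # Apply the fitted weights to the same probes measured in the target samples.
    target_design = np.column_stack(
        [np.ones(target_probes.shape[1]), target_probes[top].T]
    )
    return target_design @ coef


# Toy data: 6 probes, 20 reference samples, 5 target samples.
rng = np.random.default_rng(0)
ref_probes = rng.normal(size=(6, 20))
ref_gene = 2.0 * ref_probes[1] - 0.5 * ref_probes[4] + 3.0
target_probes = rng.normal(size=(6, 5))
estimate = lls_estimate(ref_probes, ref_gene, target_probes, k=3)
print(estimate.shape)
```

In this sketch, `k` plays the role that `maxK` plays when `useKneedle` is `False`: a fixed number of top-ranked probes enters the regression.
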
## Returns
* `target`: Table with estimated values for the missing genes that are not present in the original target table. The matrix shape is (n x k).
* `mgcp`: Missing gene correlative probes. If `useKneedle == True`, `mgcp` has an R-squared (R2) column.
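
The interaction between `useKneedle`, `addK`, and `maxK` can be illustrated with a simplified knee search. The sketch below is not the Kneedle algorithm as pyLLS runs it: it skips smoothing, assumes a non-decreasing curve of fit quality versus number of probes, and the function name `find_knee`, the example values, and the capping at `max_k` are assumptions made for illustration.

```python
def find_knee(values):
    """Return the index of the knee of a non-decreasing, flattening curve.

    Simplified idea behind knee detection: scale both axes to [0, 1] and pick
    the point that lies farthest above the diagonal joining the end points.
    """
    n = len(values)
    if n < 3:
        return n - 1
    lo, hi = min(values), max(values)
    if hi == lo:
        return 0
    best_index, best_gap = 0, float("-inf")
    for i, v in enumerate(values):
        x = i / (n - 1)            # position along the curve, scaled to [0, 1]
        y = (v - lo) / (hi - lo)   # curve value, scaled to [0, 1]
        gap = y - x                # height above the diagonal
        if gap > best_gap:
            best_index, best_gap = i, gap
    return best_index


# Hypothetical fit quality when using 1, 2, 3, ... probes.
fit_by_k = [0.40, 0.70, 0.85, 0.90, 0.92, 0.93, 0.935]
knee_k = find_knee(fit_by_k) + 1   # convert index to a probe count
add_k = 1                          # same role as the addK parameter
max_k = 100                        # same role as the maxK parameter
probes_used = min(knee_k + add_k, max_k)  # capping at max_k is an assumption
print(knee_k, probes_used)
```

The curve flattens after the third point, so the knee lands at three probes and one extra probe is added on top, which mirrors the "K + addK" description above.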

## Tutorial
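You can run the following tutorial code.
``` python
import random

import numpy as np
import pandas as pd
import pyLLS

# Reference table: 100 genes x 10 samples filled with the shuffled integers 0-999.
tmp = pd.DataFrame(np.array(random.sample(range(1000), 1000)).reshape(100, 10))
tmp.index = ['g' + str(i) for i in tmp.index]
tmp.columns = ['s' + str(i) for i in tmp.columns]
# Target table: the first 90 genes and first 5 samples, so g90-g99 are missing.
tmp2 = tmp.iloc[:90, :5]
# Estimate the missing genes of the target from the reference.
tmp3 = pyLLS.impute_missing_gene(ref=tmp, target=tmp2)
```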
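
Before running the imputation, it can help to check which genes are actually missing from the target. The pandas-only sketch below rebuilds tables with the same layout as the tutorial (using `np.arange` instead of shuffled values so the result is reproducible) and lists the genes to impute. The variable names are illustrative, and the expected output shape is taken from the Returns section rather than from running pyLLS.

```python
import numpy as np
import pandas as pd

# Same layout as the tutorial: 100 genes x 10 samples in the reference.
ref = pd.DataFrame(
    np.arange(1000).reshape(100, 10),
    index=[f"g{i}" for i in range(100)],
    columns=[f"s{i}" for i in range(10)],
)
# The target keeps the first 90 genes and the first 5 samples.
target = ref.iloc[:90, :5]

# Genes present in the reference but absent from the target are the ones to impute.
missing_genes = ref.index.difference(target.index)
# Probes usable as predictors must be measured in both tables.
shared_genes = ref.index.intersection(target.index)

# Per the Returns section, the imputed table is (n x k):
# all reference genes by all target samples.
expected_shape = (ref.shape[0], target.shape[1])

print(len(missing_genes), len(shared_genes), expected_shape)
```
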
For a more detailed tutorial,<br>please refer to the notebook.

            

Raw data

{
    "_id": null,
    "home_page": "https://github.com/osj118/pyLLS",
    "name": "pyLLS",
    "maintainer": "Sejin Oh",
    "docs_url": null,
    "requires_python": ">=3.7",
    "maintainer_email": "",
    "keywords": "missing value imputation,local least square",
    "author": "Sejin Oh",
    "author_email": "Sejin Oh <agicic@naver.com>",
    "download_url": "https://files.pythonhosted.org/packages/78/43/1dc6636692c69e986052a3545bb537c0ba94492644a175000dffc0103cc9/pyLLS-0.5.tar.gz",
    "platform": null,
    "bugtrack_url": null,
    "license": "MIT",
    "summary": "Missing value imputation with the local least square algorithm in python",
    "version": "0.5",
    "project_urls": {
        "Homepage": "https://github.com/osj118/pyLLS"
    },
    "split_keywords": [
        "missing value imputation",
        "local least square"
    ],
    "urls": [
        {
            "comment_text": "",
            "digests": {
                "blake2b_256": "e57b6ec2808a002a1ff2b0955446de5730182ffaadd70baed09d54a4a1a5ad31",
                "md5": "27bff8058e09854f5b7697e81d2b693d",
                "sha256": "7488e820cc33f5deb45cddba8a08b337a561fd45268d46a9413517f38bdcaf13"
            },
            "downloads": -1,
            "filename": "pyLLS-0.5-py2.py3-none-any.whl",
            "has_sig": false,
            "md5_digest": "27bff8058e09854f5b7697e81d2b693d",
            "packagetype": "bdist_wheel",
            "python_version": "py2.py3",
            "requires_python": ">=3.7",
            "size": 7561,
            "upload_time": "2023-09-08T00:58:19",
            "upload_time_iso_8601": "2023-09-08T00:58:19.524864Z",
            "url": "https://files.pythonhosted.org/packages/e5/7b/6ec2808a002a1ff2b0955446de5730182ffaadd70baed09d54a4a1a5ad31/pyLLS-0.5-py2.py3-none-any.whl",
            "yanked": false,
            "yanked_reason": null
        },
        {
            "comment_text": "",
            "digests": {
                "blake2b_256": "78431dc6636692c69e986052a3545bb537c0ba94492644a175000dffc0103cc9",
                "md5": "c6d819fe1e9f5cb42342191cdae83a57",
                "sha256": "72a8cb21ca879d921fb3d0a93f860d1fab4c851258330c9e3a3c6effbea2bae0"
            },
            "downloads": -1,
            "filename": "pyLLS-0.5.tar.gz",
            "has_sig": false,
            "md5_digest": "c6d819fe1e9f5cb42342191cdae83a57",
            "packagetype": "sdist",
            "python_version": "source",
            "requires_python": ">=3.7",
            "size": 7586,
            "upload_time": "2023-09-08T00:58:21",
            "upload_time_iso_8601": "2023-09-08T00:58:21.105680Z",
            "url": "https://files.pythonhosted.org/packages/78/43/1dc6636692c69e986052a3545bb537c0ba94492644a175000dffc0103cc9/pyLLS-0.5.tar.gz",
            "yanked": false,
            "yanked_reason": null
        }
    ],
    "upload_time": "2023-09-08 00:58:21",
    "github": true,
    "gitlab": false,
    "bitbucket": false,
    "codeberg": false,
    "github_user": "osj118",
    "github_project": "pyLLS",
    "travis_ci": false,
    "coveralls": false,
    "github_actions": false,
    "lcname": "pylls"
}
        