pycombat


Name: pycombat
Version: 0.20
Home page: https://github.com/CoAxLab/pycombat
Summary: Python version of data harmonisation technique Combat
Upload time: 2022-01-06 23:13:25
Author: CoAxLab
Requires Python: >=3.6
Requirements: No requirements were recorded.

# pycombat

Python version of the data harmonisation technique Combat. In addition to batch effects, this package also allows covariate effects to be removed from the data.

Combat is a technique for data harmonisation based on a linear mixed model in which location and scale random effects across batches are adjusted using a Bayesian approach (Johnson et al., 2007):

<img src="images/eq1.png" align="center"/>

The original Combat technique keeps the baseline effects alpha and the effects of interest beta by reintroducing them after harmonisation:

<img src="images/eq2.png" align="center"/>

One extension in this Python package is the possibility of removing the effect of unwanted variables by not reintroducing it after harmonisation. Using the same linear mixed model as above, we now separate the sources of covariation *C* from the sources of effects of interest *X*:

<img src="images/eq3.png" align="center"/>

In this case, the Combat adjustment is given by:

<img src="images/eq4.png" align="center"/>

This straightforward modification of Combat was recently proposed by Wachinger et al. (2020).
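
In case the equation images do not render, the four relations can be written out roughly as follows. This is a transcription in the notation commonly used for Combat (Johnson et al., 2007; Fortin et al., 2017), so the exact symbols are an assumption and may differ from those in the images. Here *i* indexes batches, *j* observations and *g* features; starred quantities are the empirical Bayes estimates of the batch effects.

```latex
% (1) Model with additive (gamma) and multiplicative (delta) batch effects
Y_{ijg} = \alpha_g + X_{ij}\beta_g + \gamma_{ig} + \delta_{ig}\,\varepsilon_{ijg}

% (2) Original adjustment: alpha and X*beta are reintroduced
Y^{\mathrm{combat}}_{ijg} =
  \frac{Y_{ijg} - \hat{\alpha}_g - X_{ij}\hat{\beta}_g - \gamma^{*}_{ig}}{\delta^{*}_{ig}}
  + \hat{\alpha}_g + X_{ij}\hat{\beta}_g

% (3) Model with effects of interest X and covariates to remove C
Y_{ijg} = \alpha_g + X_{ij}\beta^{x}_g + C_{ij}\beta^{c}_g + \gamma_{ig} + \delta_{ig}\,\varepsilon_{ijg}

% (4) Adjustment that reintroduces alpha and X*beta^x but not C*beta^c
Y^{\mathrm{combat}}_{ijg} =
  \frac{Y_{ijg} - \hat{\alpha}_g - X_{ij}\hat{\beta}^{x}_g - C_{ij}\hat{\beta}^{c}_g - \gamma^{*}_{ig}}{\delta^{*}_{ig}}
  + \hat{\alpha}_g + X_{ij}\hat{\beta}^{x}_g
```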

*References*:

- W. Evan Johnson, Cheng Li, Ariel Rabinovic, Adjusting batch effects in microarray expression data using empirical Bayes methods, Biostatistics, Volume 8, Issue 1, January 2007, Pages 118–127, https://doi.org/10.1093/biostatistics/kxj037

- L. Dyrskjot, M. Kruhoffer, T. Thykjaer, N. Marcussen, J. L. Jensen, K. Moller, and T. F. Orntoft. Gene expression in the urinary bladder: a common carcinoma in situ gene expression signature exists disregarding histopathological classification. Cancer Res., 64:4040–4048, Jun 2004.

- Christian Wachinger, Anna Rieckmann, Sebastian Pölsterl. Detect and Correct Bias in Multi-Site Neuroimaging Datasets. arXiv:2002.05049

- Fortin, J. P., N. Cullen, Y. I. Sheline, W. D. Taylor, I. Aselcioglu, P. A. Cook, P. Adams, C. Cooper, M. Fava, P. J. McGrath, M. McInnis, M. L. Phillips, M. H. Trivedi, M. M. Weissman and R. T. Shinohara (2017). "Harmonization of cortical thickness measurements across scanners and sites." Neuroimage 167: 104-120.

# Install

    pip install pycombat

# Usage

Following the spirit of scikit-learn, Combat is a class with a **fit** method, which estimates the parameters of the linear mixed model, and a **transform** method, which uses the previously learned parameters to adjust the data. There is also a **fit_transform** method, which chains both steps.

So, the first thing you need to do is to define an instance of this class:

    from pycombat import Combat  # import path assumed from the package name

    combat = Combat()

When creating the Combat instance, you can pass it the following parameters:

  - method: either "p" for parametric or "np" for non-parametric (not implemented yet)
  - conv: the criterion to decide when to stop the EB optimization step (default value = 0.0001)
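
To give an idea of what a convergence criterion like `conv` controls, here is a minimal sketch of the iterative parametric empirical Bayes updates described by Johnson et al. (2007), written for a single batch with data laid out as [observations x features]. It is not pycombat's actual code: the function name `eb_iterate`, its arguments and the `max_iter` safeguard are all assumptions made for illustration.

```python
import numpy as np


def eb_iterate(z, gamma_hat, delta_hat, gamma_bar, tau2, a, b,
               conv=0.0001, max_iter=1000):
    """Iterate the parametric EB updates for one batch until convergence.

    z               : standardised data of one batch, [n_samples, n_features]
    gamma_hat       : per-feature sample mean of z (location estimate)
    delta_hat       : per-feature sample variance of z (scale estimate)
    gamma_bar, tau2 : mean and variance of the normal prior on gamma
    a, b            : shape and scale of the inverse-gamma prior on delta
    max_iter        : hypothetical safeguard, not a pycombat parameter
    """
    n = z.shape[0]
    g_old, d_old = gamma_hat.copy(), delta_hat.copy()
    change, n_iter = np.inf, 0
    while change > conv and n_iter < max_iter:
        # Posterior mean of the location effect given the current scale
        g_new = (tau2 * n * gamma_hat + d_old * gamma_bar) / (tau2 * n + d_old)
        # Posterior mean of the scale effect given the updated location
        sum_sq = ((z - g_new) ** 2).sum(axis=0)
        d_new = (0.5 * sum_sq + b) / (0.5 * n + a - 1.0)
        # Largest relative change over all features, compared against conv
        change = max(
            np.max(np.abs(g_new - g_old) / np.abs(g_old)),
            np.max(np.abs(d_new - d_old) / d_old),
        )
        g_old, d_old = g_new, d_new
        n_iter += 1
    return g_new, d_new, n_iter


# Synthetic standardised data for one batch: 20 observations, 50 features
rng = np.random.default_rng(0)
z = rng.normal(0.5, 1.5, size=(20, 50))
gamma_hat = z.mean(axis=0)
delta_hat = z.var(axis=0, ddof=1)

# Method-of-moments estimates of the prior hyperparameters
gamma_bar, tau2 = gamma_hat.mean(), gamma_hat.var(ddof=1)
m, s2 = delta_hat.mean(), delta_hat.var(ddof=1)
a_prior = (2 * s2 + m ** 2) / s2
b_prior = (m * s2 + m ** 3) / s2

gamma_star, delta_star, n_iter = eb_iterate(
    z, gamma_hat, delta_hat, gamma_bar, tau2, a_prior, b_prior, conv=0.0001
)
```

A smaller `conv` makes the loop run longer and the estimates slightly more precise; a larger value stops earlier.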

Now call the **fit** method, passing it the data:

    combat.fit(Y=Y, b=b, X=X, C=C)

The input data consist of the following:

  - Y: The matrix of response variables, with dimensions [observations x features]
  - b: The array of batch labels, one per observation. In principle, these can be numbers or strings.
  - X: The matrix of effects of interest to keep, with dimensions [observations x features_interest]
  - C: The matrix of covariates to remove, with dimensions [observations x features_covariates]
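
As an illustration of the expected shapes, the inputs listed above could be built from synthetic data as in the sketch below. The variable names, sizes and the simulated batch shifts are assumptions chosen only for this example.

```python
import numpy as np

rng = np.random.default_rng(0)
n_obs, n_features = 60, 5

# b: one batch label per observation (strings here, integers would also do)
b = np.repeat(["site_A", "site_B", "site_C"], n_obs // 3)

# X: effects of interest to keep, here one continuous variable (e.g. age)
X = rng.normal(50, 10, size=(n_obs, 1))

# C: covariates to remove, here one continuous nuisance variable
C = rng.normal(0, 1, size=(n_obs, 1))

# Y: response matrix [observations x features] with a batch-specific shift
batch_shift = {"site_A": 0.0, "site_B": 2.0, "site_C": -1.0}
shift = np.array([batch_shift[label] for label in b])[:, None]
Y = 0.05 * X + 0.5 * C + shift + rng.normal(0, 1, size=(n_obs, n_features))

print(Y.shape, b.shape, X.shape, C.shape)
```

All four inputs share the same number of rows, one per observation.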

***Important:***  If you have effects of interest or covariates that involve categorical features, make sure that you drop the first level of these categories when building the independent matrices, otherwise they will be singular. You can easily accomplish this with pandas, using **pd.get_dummies** with the option *drop_first* set to True.
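
A minimal sketch of this encoding step, assuming a small hypothetical data frame with one continuous and one categorical variable:

```python
import pandas as pd

# Hypothetical covariate table: one continuous and one categorical column
df = pd.DataFrame({
    "age": [34, 51, 47, 29],
    "sex": ["F", "M", "F", "M"],
})

# One-hot encode "sex" and drop its first level ("F"), so that the
# dummy columns are not collinear with the baseline term of the model.
X_df = pd.get_dummies(df, columns=["sex"], drop_first=True, dtype=float)
print(list(X_df.columns))

# Convert to a plain [observations x features_interest] array
X = X_df.to_numpy(dtype=float)
```

With two levels in `sex`, only one dummy column (`sex_M`) remains next to `age`.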

After fitting the data, you can adjust it by calling the **transform** method:

    Y_adjusted = combat.transform(Y=Y, b=b, X=X, C=C)

Alternatively, you can combine both steps by just calling the method **fit_transform**:

    Y_adjusted = combat.fit_transform(Y=Y, b=b, X=X, C=C)
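
To make the fit/transform split more concrete, the sketch below implements a toy harmoniser with the same three methods. It is not pycombat's implementation: the class `ToyHarmoniser` and its attributes are hypothetical, it uses plain least squares and per-batch means and standard deviations, and it skips the empirical Bayes shrinkage entirely. It only illustrates how parameters learned in **fit** are reused in **transform**, and how *C* is removed while *X* is kept.

```python
import numpy as np


class ToyHarmoniser:
    """Location/scale batch adjustment WITHOUT empirical Bayes shrinkage."""

    def fit(self, Y, b, X, C):
        Y, b = np.asarray(Y, dtype=float), np.asarray(b)
        self.batches_ = np.unique(b)
        # Design matrix: one indicator per batch, then X, then C
        onehot = (b[:, None] == self.batches_[None, :]).astype(float)
        design = np.hstack([onehot, X, C])
        coef, *_ = np.linalg.lstsq(design, Y, rcond=None)
        k, p = onehot.shape[1], X.shape[1]
        # Baseline alpha: batch intercepts averaged with batch-size weights
        self.alpha_ = onehot.mean(axis=0) @ coef[:k]
        self.beta_x_ = coef[k:k + p]
        self.beta_c_ = coef[k + p:]
        resid = Y - design @ coef
        self.sigma_ = np.sqrt((resid ** 2).mean(axis=0))
        # Per-batch location (gamma) and scale (delta) of standardised data
        Z = self._standardise(Y, X, C)
        self.gamma_ = np.vstack(
            [Z[b == lab].mean(axis=0) for lab in self.batches_])
        self.delta_ = np.vstack(
            [Z[b == lab].std(axis=0, ddof=1) for lab in self.batches_])
        return self

    def _standardise(self, Y, X, C):
        fitted = self.alpha_ + X @ self.beta_x_ + C @ self.beta_c_
        return (Y - fitted) / self.sigma_

    def transform(self, Y, b, X, C):
        Y, b = np.asarray(Y, dtype=float), np.asarray(b)
        Z = self._standardise(Y, X, C)
        idx = np.searchsorted(self.batches_, b)  # row -> batch index
        Z_adj = (Z - self.gamma_[idx]) / self.delta_[idx]
        # Reintroduce the baseline and the effects of interest, but not C
        return Z_adj * self.sigma_ + self.alpha_ + X @ self.beta_x_

    def fit_transform(self, Y, b, X, C):
        return self.fit(Y, b, X, C).transform(Y, b, X, C)


# Synthetic example: three batches with different shifts
rng = np.random.default_rng(0)
b = np.repeat(["A", "B", "C"], 30)
X = rng.normal(size=(90, 1))
C = rng.normal(size=(90, 1))
shift = np.repeat([0.0, 2.0, -1.0], 30)[:, None]
Y = X + 0.5 * C + shift + rng.normal(size=(90, 4))

model = ToyHarmoniser()
Y_adjusted = model.fit_transform(Y, b, X, C)
```

After the adjustment, the standardised residuals of every batch have zero mean and unit standard deviation, which is the location/scale alignment that Combat refines with empirical Bayes estimates.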



            
