# Nunchaku: Optimally partitioning data into piece-wise segments
`nunchaku` is a statistically rigorous, Bayesian algorithm to infer the optimal partitioning of a data set into contiguous piece-wise segments.
## Who might find this useful?
Scientists and engineers who wish to detect change points within a dataset, at which the dependency of one variable on the other changes.
For example, if `y`'s underlying function is a piece-wise linear function of `x`, `nunchaku` will find the points at which the gradient and the intercept change.
## What does it do?
Given a dataset with two variables (e.g. a 1D time series), it infers the piece-wise function that best approximates the dataset. The function can be a piece-wise constant function, a piece-wise linear function, or a piece-wise function described by linear combinations of arbitrary basis functions (e.g. polynomials, sines).
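To illustrate the kind of data `nunchaku` is designed for, here is a minimal NumPy sketch (not part of the package) that generates a noisy piece-wise linear signal with a single change point, where both the gradient and the intercept change:

```python
import numpy as np

# Synthetic data with a change point at x = 5:
# gradient 2 before the change point, gradient -1 after it.
rng = np.random.default_rng(0)
x = np.linspace(0, 10, 101)
y_true = np.where(x < 5, 2 * x, 10 - (x - 5))
y = y_true + rng.normal(0, 0.2, size=x.size)
```

Given `x` and `y` like these, `nunchaku` infers both the location of the change point and the linear model within each segment.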
For piece-wise linear functions, it provides statistics for each segment, from which users select the segment(s) of most interest, for example, the one with the largest gradient or the one with the largest $R^2$.
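For example, once the per-segment statistics are in a pandas DataFrame, selecting a segment of interest is a one-liner. The column names below (`gradient`, `rsquared`) are hypothetical illustrations, not necessarily the names `nunchaku` returns — consult the documentation for the actual schema:

```python
import pandas as pd

# Hypothetical per-segment summary table; the actual columns produced
# by nunchaku's get_info may differ -- check the documentation.
info_df = pd.DataFrame({
    "start": [0, 40, 70],
    "end": [39, 69, 99],
    "gradient": [0.1, 2.3, -0.5],
    "rsquared": [0.99, 0.91, 0.87],
})

# Segment with the steepest gradient, and segment with the best fit.
steepest = info_df.loc[info_df["gradient"].idxmax()]
best_fit = info_df.loc[info_df["rsquared"].idxmax()]
```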
For details about how it works, please refer to our [paper](https://doi.org/10.1093/bioinformatics/btad688), freely available in *Bioinformatics*.
## Installation
To install via PyPI, type in Terminal (for Linux/Mac OS users) or Anaconda Prompt (for Windows users with [Anaconda](https://docs.anaconda.com/free/anaconda/install/windows/) installed):
```
> pip install nunchaku
```
For developers, create a virtual environment, install Poetry, and then install `nunchaku` with Poetry:
```
> git clone https://git.ecdf.ed.ac.uk/s1856140/nunchaku.git
> cd nunchaku
> poetry install --with dev
```
## Quick start
The data `x` is a list or a 1D NumPy array, sorted in ascending order; the data `y` is a list, a 1D NumPy array, or a 2D NumPy array with each row being one replicate of the measurement.
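A short NumPy sketch of the accepted input shapes, assuming synthetic data:

```python
import numpy as np

# x must be 1D and sorted in ascending order.
x = np.linspace(0, 9, 10)

# y can be 1D (a single measurement) ...
y_single = np.sin(x)

# ... or 2D, with one replicate per row (here, 3 noisy replicates).
rng = np.random.default_rng(1)
y_reps = np.sin(x) + rng.normal(0, 0.1, size=(3, x.size))
```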
Below is a script to analyse the built-in example data.
```
>>> from nunchaku import Nunchaku, get_example_data
>>> x, y = get_example_data()
>>> # load data and set the prior of the gradient
>>> nc = Nunchaku(x, y, prior=[-5, 5])
>>> # compare models with 1, 2, 3 and 4 linear segments
>>> numseg, evidences = nc.get_number(num_range=(1, 4))
>>> # get the mean and standard deviation of the boundary points
>>> bds, bds_std = nc.get_iboundaries(numseg)
>>> # get the information of all segments
>>> info_df = nc.get_info(bds)
>>> # plot the data and the segments
>>> nc.plot(info_df)
>>> # get the underlying piece-wise function (for piece-wise linear functions only)
>>> y_prediction = nc.predict(info_df)
```
More detailed examples are provided in [a Jupyter Notebook in our repository](https://git.ecdf.ed.ac.uk/s1856140/nunchaku/-/blob/master/examples.ipynb).
## Documentation
Detailed documentation is available on [Readthedocs](https://nunchaku.readthedocs.io/en/latest/).
## Development history
+ `v0.15.0`: supports detection of piece-wise functions described by a linear combination of arbitrary basis functions; supports Python 3.11.
+ `v0.14.0`: supports detection of linear segments.
## Similar packages
+ The [`NOT`](https://cran.r-project.org/web/packages/not/index.html) package written in R.
+ The [`beast`](https://cran.r-project.org/web/packages/beast/index.html) package written in R.
## Citation
If you find this useful, please cite our paper:
Huo, Y., Li, H., Wang, X., Du, X., & Swain, P. S. (2023). Nunchaku: Optimally partitioning data into piece-wise linear segments. *Bioinformatics*. https://doi.org/10.1093/bioinformatics/btad688