| Field           | Value                                                                              |
|-----------------|------------------------------------------------------------------------------------|
| Name            | pyfish                                                                             |
| Version         | 1.0.3                                                                              |
| Home page       | https://bitbucket.org/schwarzlab/pyfish                                            |
| Summary         | Plotting tool for evolutionary population dynamics. Creates a Fish (Muller) plot.  |
| Upload time     | 2023-03-14 00:29:41                                                                |
| Authors         | Adam Streck, Tom L. Kaufmann                                                       |
| Requires Python | >=3.8                                                                              |
| License         | MIT                                                                                |
| Keywords        | plot, genomics, visualization                                                      |

# PyFish


[![PyPI](https://img.shields.io/pypi/v/pyfish?color=green)](https://pypi.org/project/pyfish/)
[![Conda](https://img.shields.io/conda/v/bioconda/pyfish?color=green)](https://anaconda.org/bioconda/pyfish)

PyFish is a Python 3 package for creating [Fish (Muller) plots](https://en.wikipedia.org/wiki/Muller_plot) like the one below.

### Primary features
* polynomial interpolation
* curve smoothing
* high performance
* works with low and high density data

PyFish can be used either as a stand-alone tool or as a plotting library.

<img src="https://bytebucket.org/schwarzlab/pyfish/raw/main/doc/fish.png" width="600" />

## Installation

PyFish requires Python >= 3.8.

The package can be installed using Conda (from the bioconda channel):

`conda install -c bioconda pyfish`

or pip:

`pip install pyfish`



## Input

The program takes two tables:
* one describing the size of individual subgroups at given points in time, referred to as _populations_,
* one describing the parent-child relationships between the subgroups, referred to as _parent tree_.

### Populations

The populations table has the schema `(Id: +int, Step: +int, Pop: +int)`, where:
* `Id` is a numerical identifier of a subgroup,
* `Step` is a natural ordinal describing the logical time when the population is measured,
* `Pop` is the size of the population of the subgroup at the given step.

An example populations table:

| Id  | Step | Pop |
|-----|------|-----|
| 0   | 0    | 100 |
| 0   | 1    | 40  |
| 0   | 2    | 20  |
| 0   | 3    | 0   |
| 1   | 0    | 10  |
| 1   | 3    | 50  |
| 1   | 5    | 100 |
| 2   | 4    | 20  |
| 2   | 5    | 50  |
| 3   | 0    | 10  |
| 3   | 1    | 20  |
| 3   | 5    | 10  |

### Parent Tree

The parent tree has the schema `(ParentId: +int, ChildId: +int)`, where:
* `ParentId` is an id matching the population table,
* `ChildId` is an id matching the population table describing the direct progeny of the parent.

An example parent tree:

| ParentId | ChildId | 
|----------|---------|
| 0        | 1       |
| 1        | 2       |
| 0        | 3       | 

**Note: there must be exactly one node in the parent tree that has no parent. This is the root (0 in the example above).**
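
The two constraints above (ids in the parent tree must match the populations table, and exactly one node has no parent) can be checked before plotting. The following is a minimal sketch in plain Python, not part of PyFish: `find_root` is a hypothetical helper name, and the rows are written as tuples in the column order of the schemas above.

```python
def find_root(populations, parent_tree):
    """Return the single root id, or raise ValueError if the tables disagree.

    Hypothetical helper for illustration; PyFish does not export it.
    """
    known_ids = {row_id for row_id, _step, _pop in populations}
    parents = {parent for parent, _child in parent_tree}
    children = {child for _parent, child in parent_tree}

    # Every id used in the parent tree must appear in the populations table.
    unknown = (parents | children) - known_ids
    if unknown:
        raise ValueError(f"ids missing from the populations table: {sorted(unknown)}")

    # A root is a node that appears as a parent but never as a child.
    roots = parents - children
    if len(roots) != 1:
        raise ValueError(f"expected exactly one root, found {sorted(roots)}")
    return roots.pop()


populations = [(0, 0, 100), (0, 1, 40), (0, 2, 20), (0, 3, 0), (1, 0, 10), (1, 3, 50),
               (1, 5, 100), (2, 4, 20), (2, 5, 50), (3, 0, 10), (3, 1, 20), (3, 5, 10)]
parent_tree = [(0, 1), (1, 2), (0, 3)]

root = find_root(populations, parent_tree)
print(root)  # 0
```

The check is deliberately small: it does not detect a child listed under two parents, which would also break the tree structure.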


## Tool 

We provide example data. From the root folder of the project call: 

`pyfish tests/populations.csv tests/parent_tree.csv out.png`

This will create a plot called `out.png` in that folder.

Additional execution parameters are described below.
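
To run the tool on your own data, the two tables have to be saved as files first. The sketch below writes the example tables from the Input section with Python's standard `csv` module. It assumes that the tool reads comma-separated files whose header row uses the schema's column names (`Id`, `Step`, `Pop` and `ParentId`, `ChildId`); this README does not specify the file layout, so compare with `tests/populations.csv` before relying on it. `write_table` is a hypothetical helper name.

```python
import csv
import tempfile
from pathlib import Path


def write_table(path, header, rows):
    """Write rows to a comma-separated file with a header line."""
    with open(path, "w", newline="") as handle:
        writer = csv.writer(handle)
        writer.writerow(header)
        writer.writerows(rows)


populations = [(0, 0, 100), (0, 1, 40), (0, 2, 20), (0, 3, 0), (1, 0, 10), (1, 3, 50),
               (1, 5, 100), (2, 4, 20), (2, 5, 50), (3, 0, 10), (3, 1, 20), (3, 5, 10)]
parent_tree = [(0, 1), (1, 2), (0, 3)]

out_dir = Path(tempfile.mkdtemp())  # any writable folder works
populations_csv = out_dir / "populations.csv"
parent_tree_csv = out_dir / "parent_tree.csv"

# Assumed layout: header row with the schema's column names.
write_table(populations_csv, ["Id", "Step", "Pop"], populations)
write_table(parent_tree_csv, ["ParentId", "ChildId"], parent_tree)

# The command to run afterwards, in the form shown above.
print(f"pyfish {populations_csv} {parent_tree_csv} out.png")
```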

## Library

The populations and parent_tree tables can be constructed directly as dataframes.

The library contains three public functions:

* `process_data`: takes the input data and parameters and creates data suitable for plotting. Additional arguments match the parameters described below.
* `setup_figure`: resizes the figure and adds labels for the axes.
* `fish_plot`: calls the plotting function on the input parameters.

### Example:
```python
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd
from pyfish import fish_plot, process_data, setup_figure

populations = np.array([[0, 0, 100], [0, 1, 40], [0, 2, 20], [0, 3, 0], [1, 0, 10], [1, 3, 50], 
    [1, 5, 100], [2, 4, 20], [2, 5, 50], [3, 0, 10], [3, 1, 20], [3, 5, 10]])
parent_tree = np.array([[0, 1], [1, 2], [0, 3]])
populations_df = pd.DataFrame(populations, columns=["Id", "Step", "Pop"])
parent_tree_df = pd.DataFrame(parent_tree, columns=["ParentId", "ChildId"])
data = process_data(populations_df, parent_tree_df)
setup_figure()
fish_plot(*data)
plt.show()
```

Calling the above code displays the following image:

<img src="https://bytebucket.org/schwarzlab/pyfish/raw/main/doc/test.png" width="350" />

## Parameters

### `-a, --absolute`

Plots absolute population counts at each step.

| Base                          | --absolute                       |
|-------------------------------|----------------------------------|
| ![Base plot](https://bytebucket.org/schwarzlab/pyfish/raw/main/doc/base.png) | ![Absolute plot](https://bytebucket.org/schwarzlab/pyfish/raw/main/doc/abs.png) |

### `-I, --interpolate int`

Fills in missing values by interpolating with a polynomial of the given degree.
If a value is not given, each population is set to 0 at the first and last step.

| Base                          | --interpolate 2                                |
|-------------------------------|------------------------------------------------|
| ![Base plot](https://bytebucket.org/schwarzlab/pyfish/raw/main/doc/test.png) | ![Interpolated plot](https://bytebucket.org/schwarzlab/pyfish/raw/main/doc/interpolation.png) |
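
To illustrate what filling in missing values means, the sketch below interpolates one subgroup from the example populations table in plain Python. It covers only the degree 1 (linear) case and is not PyFish's implementation; `interpolate_linear` is a hypothetical helper name.

```python
def interpolate_linear(known):
    """Fill every step between the first and last known step.

    `known` maps step -> population. Values between two measured steps
    are placed on the straight line connecting them (degree 1).
    """
    steps = sorted(known)
    filled = {}
    for left, right in zip(steps, steps[1:]):
        for step in range(left, right):
            weight = (step - left) / (right - left)
            filled[step] = known[left] + weight * (known[right] - known[left])
    filled[steps[-1]] = float(known[steps[-1]])
    return [filled[step] for step in range(steps[0], steps[-1] + 1)]


# Subgroup 1 from the example table: measured at steps 0, 3 and 5 only.
subgroup_1 = {0: 10, 3: 50, 5: 100}
series = interpolate_linear(subgroup_1)
print([round(value, 1) for value in series])
# [10.0, 23.3, 36.7, 50.0, 75.0, 100.0]
```

Higher degrees replace the straight segments with curved ones, which avoids the visible kinks at the measured steps.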

### `-S, --smooth float`

Smooths the graph using a Gaussian filter.
The parameter value is the standard deviation of the kernel.
The larger the population, the larger the value should be.

**NOTE: If the population values are sparse, using smoothing without interpolation might lead to misleading population sizes.**

| Base                          | --smooth 50                         |
|-------------------------------|-------------------------------------|
| ![Base plot](https://bytebucket.org/schwarzlab/pyfish/raw/main/doc/base.png) | ![Smoothed plot](https://bytebucket.org/schwarzlab/pyfish/raw/main/doc/smooth.png) |
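
The effect of the standard deviation can be seen on a single series. The sketch below is a plain-Python Gaussian filter, written for illustration only: it is not PyFish's implementation, `gaussian_smooth` is a hypothetical name, and both the choice to truncate the kernel at three standard deviations and the way edges are handled are assumptions.

```python
import math


def gaussian_smooth(values, sigma):
    """Smooth a sequence with a Gaussian kernel of standard deviation `sigma` (> 0).

    The kernel is truncated at 3 * sigma. Near the edges the weights are
    renormalized over the samples that exist, one of several possible conventions.
    """
    radius = max(1, int(3 * sigma))
    smoothed = []
    for i in range(len(values)):
        total = 0.0
        weight_sum = 0.0
        for j in range(max(0, i - radius), min(len(values), i + radius + 1)):
            weight = math.exp(-((j - i) ** 2) / (2 * sigma ** 2))
            total += weight * values[j]
            weight_sum += weight
        smoothed.append(total / weight_sum)
    return smoothed


# A single spike is spread over its neighbours; a larger sigma spreads it further.
series = [0, 0, 0, 100, 0, 0, 0]
smoothed = gaussian_smooth(series, sigma=1.0)
print([round(value, 1) for value in smoothed])
```

This also shows why sparse data needs care: the spike is flattened well below its measured value of 100, so the smoothed curve no longer passes through the original point.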

### `-F, --first int+`, `-L, --last int+`

Limits the steps to the range `[first, last]`, inclusive.

| Base                          | --first 4000 --last 4500           |
|-------------------------------|------------------------------------|
| ![Base plot](https://bytebucket.org/schwarzlab/pyfish/raw/main/doc/base.png) | ![Bounded plot](https://bytebucket.org/schwarzlab/pyfish/raw/main/doc/bound.png) |

### `-M, --cmap string`

Use the specified [matplotlib colormap](https://matplotlib.org/stable/tutorials/colors/colormaps.html). 

The default colormap is `rainbow`.

| Base                          | --cmap viridis                   |
|-------------------------------|----------------------------------|
| ![Base plot](https://bytebucket.org/schwarzlab/pyfish/raw/main/doc/base.png) | ![Colormap plot](https://bytebucket.org/schwarzlab/pyfish/raw/main/doc/map.png) |

### `-C, --color_by string`

Colors the ids based on a separate column in the populations.csv file.
The first value of that column is selected per id, so the value should be constant for all entries with the same id.

Best combined with a sequential colormap using `--cmap`.

| Base                          | --color-by Feature --cmap viridis |
|-------------------------------|-----------------------------------|
| ![Base plot](https://bytebucket.org/schwarzlab/pyfish/raw/main/doc/base.png) | ![Color-by plot](https://bytebucket.org/schwarzlab/pyfish/raw/main/doc/color_by.png) |
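
Because only the first value per id is used, it can help to check that the extra column really is constant for each id. The sketch below shows that rule in plain Python. Both function names are hypothetical and not part of PyFish, and the rows are reduced to `(Id, Feature)` pairs, with `Feature` standing for the extra column.

```python
def feature_per_id(rows):
    """Return {id: feature}, taking the first feature value seen for each id."""
    features = {}
    for row_id, feature in rows:
        features.setdefault(row_id, feature)
    return features


def inconsistent_ids(rows):
    """Return the ids whose feature value is not constant across rows."""
    first = feature_per_id(rows)
    return sorted({row_id for row_id, feature in rows if feature != first[row_id]})


# Id 1 carries two different values, so only the first (0.5) would be used.
rows = [(0, 0.1), (0, 0.1), (1, 0.5), (1, 0.7), (2, 0.9)]
print(feature_per_id(rows))    # {0: 0.1, 1: 0.5, 2: 0.9}
print(inconsistent_ids(rows))  # [1]
```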


### `-R, --seed int+`

Specifies the seed for the randomization of colors.

| Base                          | --seed 2022                       |
|-------------------------------|-----------------------------------|
| ![Base plot](https://bytebucket.org/schwarzlab/pyfish/raw/main/doc/base.png) | ![Seeded plot](https://bytebucket.org/schwarzlab/pyfish/raw/main/doc/seed.png) |

### `-W, --width int+`, `-H, --height int+`

Specifies the dimensions of the output image. The size includes the axis labels.

## Citation
Please cite as: *Adam Streck, Tom L Kaufmann, Roland F Schwarz, SMITH: Spatially Constrained Stochastic Model for Simulation of Intra-Tumour Heterogeneity, Bioinformatics, 2023; https://doi.org/10.1093/bioinformatics/btad102*

## Contact
Email questions, feature requests and bug reports to Adam Streck, `adam.streck@mdc-berlin.de`.

## License
PyFish is available under the MIT License.

## Development
To actively develop the package, we recommend installing pyfish in development mode using pip: `pip install -e . --user`.
To run the main routine from the command line without installing it first, run `python -m pyfish.main -- tests/populations.csv tests/parent_tree.csv out.png`.

To trigger testing, run `pytest -v .`.



            
