picasso-phylo


- **Name**: picasso-phylo
- **Version**: 0.1.2
- **Summary**: Phylogenetic Inference of Copy number Alterations in Single-cell Sequencing data Optimization
- **Upload time**: 2025-08-28 17:24:12
- **Requires Python**: >=3.10
- **License**: MIT
- **Keywords**: phylogenetics, single-cell, genomics, copy-number-alterations, cancer, tumor-evolution, scrna-seq
- **Requirements**: none recorded

# PICASSO: Phylogenetic Inference of Copy number Alterations in Single-cell Sequencing data Optimization

[![PyPI version](https://badge.fury.io/py/picasso-phylo.svg)](https://badge.fury.io/py/picasso-phylo)
[![Python 3.10+](https://img.shields.io/badge/python-3.10+-blue.svg)](https://www.python.org/downloads/)
[![License: MIT](https://img.shields.io/badge/License-MIT-yellow.svg)](https://opensource.org/licenses/MIT)

PICASSO is a computational method for reconstructing tumor phylogenies from noisy, inferred copy number alteration (CNA) data derived from single-cell RNA sequencing (scRNA-seq). Unlike methods designed for direct scDNA-seq data, PICASSO specifically handles the uncertainty and noise inherent in CNA profiles inferred from gene expression data.

## Key Features

- **Noise-aware phylogenetic inference**: Uses probabilistic models to handle uncertainty in scRNA-seq-inferred CNAs
- **Confidence-based termination**: Prevents over-fitting to noise through assignment confidence thresholds  
- **Comprehensive visualization**: Integrated plotting and iTOL export capabilities
- **Scalable implementation**: Handles datasets with hundreds to thousands of cells
- **Well-documented**: Extensive documentation with a focus on handling noisy data

## Installation

### PyPI (recommended)
```bash
pip install picasso-phylo
```

### Conda
The package is not yet available on conda-forge because of unresolved dependency issues. To use it in a conda or mamba environment, install it with pip inside that environment:
```bash
conda create -n picasso_env python=3.10
conda activate picasso_env
pip install picasso-phylo
```

### Development Installation
```bash
git clone https://github.com/dpeerlab/picasso
cd picasso
pip install -e ".[dev]"
```

## Requirements

- **Python**: ≥ 3.10
- **Core dependencies**: numpy, pandas, pomegranate, ete3, matplotlib, seaborn, tqdm, scipy
- **Optional**: jupyter (notebooks), pyqt5 (advanced visualization)

## Quick Start

```python
from picasso import Picasso, CloneTree, load_data

# Load example CNA data
cna_data = load_data()

# Initialize PICASSO with noise-appropriate parameters
picasso = Picasso(cna_data,
                 min_clone_size=10,  # Larger for noisy data
                 assignment_confidence_threshold=0.8,
                 terminate_by='probability')

# Reconstruct phylogeny
picasso.fit()

# Extract results
phylogeny = picasso.get_phylogeny()
assignments = picasso.get_clone_assignments()

# Create integrated analysis object
clone_tree = CloneTree(phylogeny, assignments, cna_data)
clone_tree.plot_alterations(save_as='cna_heatmap.pdf')
```

### For Very Noisy scRNA-seq Data

```python
# Use stricter parameters for very noisy data
picasso_strict = Picasso(cna_data,
                        min_clone_size=50,
                        max_depth=8,  # Limit depth
                        assignment_confidence_threshold=0.9,
                        assignment_confidence_proportion=0.95,
                        bic_penalty_strength=1.5)
picasso_strict.fit()
```

## Features

### Data Processing
- Load and process copy number alteration (CNA) data
- Encode CNAs as ternary values for more meaningful similarity measures
- Feature selection to remove non-informative regions
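
As a rough illustration of the two preprocessing ideas above, the sketch below maps copy numbers to ternary values and drops regions that never vary. It is not PICASSO's own encoding routine, which may work differently. The function names, the diploid baseline of 2, and the toy matrix are all assumptions made for the example.

```python
def encode_ternary(copy_numbers, neutral=2):
    """Map absolute copy numbers to -1 (loss), 0 (neutral) or +1 (gain).

    `neutral=2` assumes a diploid baseline; this is an illustrative choice.
    """
    return [[(cn > neutral) - (cn < neutral) for cn in row] for row in copy_numbers]


def informative_columns(matrix):
    """Return indices of columns (regions) that take more than one value across cells."""
    return [j for j, col in enumerate(zip(*matrix)) if len(set(col)) > 1]


# Toy matrix: rows are cells, columns are genomic regions (hypothetical values).
cells = [
    [2, 3, 1, 2],
    [2, 4, 2, 2],
    [2, 2, 0, 2],
]

ternary = encode_ternary(cells)
keep = informative_columns(ternary)
filtered = [[row[j] for j in keep] for row in ternary]

print(ternary)   # [[0, 1, -1, 0], [0, 1, 0, 0], [0, 0, -1, 0]]
print(keep)      # [1, 2]
print(filtered)  # [[1, -1], [1, 0], [0, -1]]
```

In this toy encoding a gain of one copy and a gain of two copies become the same value, which is one way a ternary code can make noisy profiles easier to compare.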

### Tree Construction
- Construct phylogenetic trees using the PICASSO algorithm
- Flexible tree manipulation and rooting options
- Support for both clone-level and sample-level phylogenies

### Visualization
- Basic tree visualization
- Clone size plotting
- Alteration plotting
- Integration with iTOL for advanced visualization
- Support for:
  - Heatmaps
  - Colorstrips
  - Stacked bar charts

## Advanced Usage

### Tree Construction and Manipulation

```python
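from picasso import CloneTree

# `phylogeny` and `clone_assignments` come from a fitted Picasso model (see Quick Start).
# `filtered_matrix` is assumed here to be the CNA matrix after feature selection.
tree = CloneTree(phylogeny, clone_assignments, filtered_matrix, clone_aggregation='mode')

# Root the tree on the clone identified as most ancestral
outgroup = tree.get_most_ancestral_clone()
tree.root_tree(outgroup)

# Get different tree representations
clone_tree = tree.get_clone_phylogeny()  # clone-level phylogeny
cell_tree = tree.get_sample_phylogeny()  # sample-level phylogeny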
```

### iTOL Visualization

```python
# Import the submodule by name: in the Quick Start, `picasso` is rebound to a model instance
from picasso import itol

# `character_matrix`, `data_series`, `proportions_df` and `color_map` are user-supplied inputs

# Generate heatmap of copy number changes
heatmap_annot = itol.dataframe_to_itol_heatmap(character_matrix)
with open('heatmap_annotation.txt', 'w') as f:
    f.write(heatmap_annot)

# Generate colorstrip annotation
colorstrip_annot = itol.dataframe_to_itol_colorstrip(
    data_series,
    color_map,
    dataset_label='Label'
)

# Generate stacked bar visualization
stackedbar_annot = itol.dataframe_to_itol_stackedbar(
    proportions_df,
    color_map,
    dataset_label='Label'
)
```
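
For orientation, an iTOL annotation is a plain-text file with a short header followed by a `DATA` section. The sketch below builds a colorstrip file by hand to show that layout. It is not the implementation behind `dataframe_to_itol_colorstrip`, whose exact output may differ. The function name `colorstrip_annotation`, the leaf names, and the colors are hypothetical.

```python
def colorstrip_annotation(leaf_to_group, color_map, dataset_label="Label"):
    """Build the text of an iTOL DATASET_COLORSTRIP annotation file.

    leaf_to_group: dict mapping tree leaf names to a group label
    color_map:     dict mapping each group label to a hex color
    """
    lines = [
        "DATASET_COLORSTRIP",
        "SEPARATOR TAB",
        f"DATASET_LABEL\t{dataset_label}",
        "COLOR\t#000000",  # color used for this dataset in the iTOL legend
        "DATA",
    ]
    # One row per leaf: leaf name, strip color, label
    for leaf, group in leaf_to_group.items():
        lines.append(f"{leaf}\t{color_map[group]}\t{group}")
    return "\n".join(lines) + "\n"


# Hypothetical leaves and clone labels
leaf_to_group = {"cell_1": "clone_A", "cell_2": "clone_B"}
color_map = {"clone_A": "#1f77b4", "clone_B": "#ff7f0e"}

annotation = colorstrip_annotation(leaf_to_group, color_map, dataset_label="Clone")
print(annotation)
```

A file with this content can be added to a tree in iTOL as a dataset. The leaf names in the `DATA` section must match the leaf names in the uploaded Newick tree.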

## API Reference

### Picasso Class Parameters

- `min_depth`: Minimum depth of the phylogenetic tree
- `max_depth`: Maximum depth of the tree (None for unlimited)
- `min_clone_size`: Minimum number of samples in a clone
- `terminate_by`: Criterion for terminating tree growth (the Quick Start uses `'probability'`)
- `assignment_confidence_threshold`: Confidence threshold for sample assignment
- `assignment_confidence_proportion`: Required proportion of samples meeting confidence threshold
- `bic_penalty_strength`: Strength of the BIC penalty term. Values above 1.0 favor simpler models, which helps prevent over-fitting on noisy data.
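
The sketch below shows one plausible reading of how the two confidence parameters and the BIC penalty could act. It is an illustration built on assumptions, not PICASSO's internal code. The function names, the input layout (one list of clone-membership probabilities per cell), and the scaled-BIC formula are all hypothetical.

```python
import math


def split_is_confident(posteriors, threshold=0.8, proportion=0.95):
    """Accept a split only if enough cells are assigned with high confidence.

    posteriors: one list of clone-membership probabilities per cell.
    A cell counts as confident when its largest probability is >= threshold;
    the split passes when the confident fraction is >= proportion.
    """
    if not posteriors:
        return False
    confident = sum(1 for probs in posteriors if max(probs) >= threshold)
    return confident / len(posteriors) >= proportion


def penalized_bic(log_likelihood, n_params, n_samples, penalty_strength=1.0):
    """Standard BIC with the complexity term scaled by `penalty_strength`."""
    return penalty_strength * n_params * math.log(n_samples) - 2.0 * log_likelihood


# Hypothetical posteriors for four cells over two candidate clones
posteriors = [[0.95, 0.05], [0.90, 0.10], [0.55, 0.45], [0.10, 0.90]]

# Three of four cells reach the 0.8 threshold (75%)
print(split_is_confident(posteriors, threshold=0.8, proportion=0.70))  # True
print(split_is_confident(posteriors, threshold=0.8, proportion=0.95))  # False

# A larger penalty strength raises the BIC of the same model, so extra parameters cost more
base = penalized_bic(log_likelihood=-100.0, n_params=5, n_samples=200)
strict = penalized_bic(log_likelihood=-100.0, n_params=5, n_samples=200, penalty_strength=1.5)
print(strict > base)  # True
```

Under this reading, raising either confidence parameter or the penalty strength makes the procedure stop splitting earlier, which matches the stricter settings suggested for very noisy data.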

## Visualization

For detailed tree visualization, we recommend the [iTOL website/application](https://itol.embl.de/), which accepts Newick strings as input and supports extensive customization. PICASSO provides convenience functions that generate iTOL annotation files for displaying metadata on the tree.
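
For readers unfamiliar with the Newick format, the sketch below serializes a small nested tree into a Newick string by hand. It only illustrates the format and is independent of PICASSO, which depends on ete3 for its trees. The tree structure and the clone names are hypothetical.

```python
def _subtree(node):
    """Serialize one (name, children) node without the trailing semicolon."""
    name, children = node
    if not children:
        return name
    return "(" + ",".join(_subtree(child) for child in children) + ")" + name


def to_newick(root):
    """Return the Newick string for a tree given as nested (name, children) tuples."""
    return _subtree(root) + ";"


# Hypothetical clone tree: the root splits into clone_A and an internal node anc_1
tree = ("root", [
    ("clone_A", []),
    ("anc_1", [("clone_B", []), ("clone_C", [])]),
])

newick = to_newick(tree)
print(newick)  # (clone_A,(clone_B,clone_C)anc_1)root;
```

Parentheses group sibling subtrees, a name after a closing parenthesis labels the internal node, and the semicolon ends the tree.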

## Support

If you encounter any problems, please open an issue with a detailed description of the problem.

## License

This project is licensed under the MIT License:

```
MIT License

Copyright (c) 2024 [Pe'er Lab]

Permission is hereby granted, free of charge, to any person obtaining a copy
of this software and associated documentation files (the "Software"), to deal
in the Software without restriction, including without limitation the rights
to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
copies of the Software, and to permit persons to whom the Software is
furnished to do so, subject to the following conditions:

The above copyright notice and this permission notice shall be included in all
copies or substantial portions of the Software.

THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
SOFTWARE.
```

## Citation

If you use PICASSO in your research, please cite our paper.


            

Raw data

            {
    "_id": null,
    "home_page": null,
    "name": "picasso-phylo",
    "maintainer": null,
    "docs_url": null,
    "requires_python": ">=3.10",
    "maintainer_email": "Sitara Persad <sitara.persad@columbia.edu>",
    "keywords": "phylogenetics, single-cell, genomics, copy-number-alterations, cancer, tumor-evolution, scRNA-seq",
    "author": null,
    "author_email": "Sitara Persad <sitara.persad@columbia.edu>",
    "download_url": "https://files.pythonhosted.org/packages/f7/33/358cef1a58a825a95fe2c98bc83269f85f276bf3b946d4cbd56eb02d9820/picasso_phylo-0.1.2.tar.gz",
    "platform": null,
    "bugtrack_url": null,
    "license": "MIT",
    "summary": "Phylogenetic Inference of Copy number Alterations in Single-cell Sequencing data Optimization",
    "version": "0.1.2",
    "project_urls": {
        "Bug Tracker": "https://github.com/sitarapersad/picasso_phylo/issues",
        "Changelog": "https://github.com/sitarapersad/picasso_phylo/blob/main/CHANGELOG.md",
        "Documentation": "https://picasso-phylo.readthedocs.io/",
        "Homepage": "https://github.com/sitarapersad/picasso_phylo",
        "Repository": "https://github.com/sitarapersad/picasso_phylo.git"
    },
    "split_keywords": [
        "phylogenetics",
        " single-cell",
        " genomics",
        " copy-number-alterations",
        " cancer",
        " tumor-evolution",
        " scrna-seq"
    ],
    "urls": [
        {
            "comment_text": null,
            "digests": {
                "blake2b_256": "79904ad6c173b469b26bd9db7b2601235454cf061c5bc05f8182094bbea9ed74",
                "md5": "ae3eafc764255b8d591f4eea47078ff5",
                "sha256": "c8b5c42f5ea2b40327e3f7a719ad432f88ff51075d5ec13c4c3229a8db241980"
            },
            "downloads": -1,
            "filename": "picasso_phylo-0.1.2-py3-none-any.whl",
            "has_sig": false,
            "md5_digest": "ae3eafc764255b8d591f4eea47078ff5",
            "packagetype": "bdist_wheel",
            "python_version": "py3",
            "requires_python": ">=3.10",
            "size": 263021,
            "upload_time": "2025-08-28T17:24:10",
            "upload_time_iso_8601": "2025-08-28T17:24:10.860209Z",
            "url": "https://files.pythonhosted.org/packages/79/90/4ad6c173b469b26bd9db7b2601235454cf061c5bc05f8182094bbea9ed74/picasso_phylo-0.1.2-py3-none-any.whl",
            "yanked": false,
            "yanked_reason": null
        },
        {
            "comment_text": null,
            "digests": {
                "blake2b_256": "f733358cef1a58a825a95fe2c98bc83269f85f276bf3b946d4cbd56eb02d9820",
                "md5": "e318cfbddf20594b16aaafe95b374d57",
                "sha256": "76d0d93b46803fbfcf86e9bc8d9fadc97681d490393615fd9f1b0f6b012eba1d"
            },
            "downloads": -1,
            "filename": "picasso_phylo-0.1.2.tar.gz",
            "has_sig": false,
            "md5_digest": "e318cfbddf20594b16aaafe95b374d57",
            "packagetype": "sdist",
            "python_version": "source",
            "requires_python": ">=3.10",
            "size": 1167257,
            "upload_time": "2025-08-28T17:24:12",
            "upload_time_iso_8601": "2025-08-28T17:24:12.348050Z",
            "url": "https://files.pythonhosted.org/packages/f7/33/358cef1a58a825a95fe2c98bc83269f85f276bf3b946d4cbd56eb02d9820/picasso_phylo-0.1.2.tar.gz",
            "yanked": false,
            "yanked_reason": null
        }
    ],
    "upload_time": "2025-08-28 17:24:12",
    "github": true,
    "gitlab": false,
    "bitbucket": false,
    "codeberg": false,
    "github_user": "sitarapersad",
    "github_project": "picasso_phylo",
    "github_not_found": true,
    "lcname": "picasso-phylo"
}
        