cellproportion

Name: cellproportion
Version: 1.0.0
Home page: https://github.com/ankitpatel/cellproportion
Summary: Cell type proportion analysis for single-cell and spatial transcriptomics data
Upload time: 2025-08-19 22:05:02
Author: Ankit Patel
Requires Python: >=3.8
License: MIT
Keywords: single-cell, spatial-transcriptomics, cell-type-analysis, proportion-analysis, bioinformatics, genomics, scrna-seq

# CellProportion

![PyPI version](https://upload.wikimedia.org/wikipedia/commons/thumb/6/64/PyPI_logo.svg/2560px-PyPI_logo.svg.png)
![Python 3.8+](https://upload.wikimedia.org/wikipedia/commons/thumb/a/a5/Blue_Python_3.8_Shield_Badge.svg/2560px-Blue_Python_3.8_Shield_Badge.svg.png)
![License: MIT](https://upload.wikimedia.org/wikipedia/commons/thumb/2/2e/MIT_Logo_New.svg/1200px-MIT_Logo_New.svg.png)

A Python package for comparing cell type proportions between experimental groups in **single-cell RNA-seq** and **spatial transcriptomics** data.

## 🚀 Features

- **Flexible Input**: Works directly with AnnData objects or Pandas DataFrames
- **Multiple Statistical Methods**: 
  - `signed_r2` – Signed R² from regression (captures direction + fit strength)
  - `mean_diff` – Simple difference in mean proportions
  - `log2_fc` – Log₂ fold-change for multiplicative differences
  - `corr` – Pearson correlation with group labels
- **Spatial Analysis**: Compare proportions within tissue regions/spatial domains
- **Statistical Testing**: Mann-Whitney U tests with significance categorization
- **Visualization Ready**: Built-in color schemes and customizable color mapping
- **Robust Error Handling**: Comprehensive validation and informative warnings
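
The methods above compare cell type proportions between groups, presumably computed per sample. As a rough illustration of what such a proportion table looks like, here is a minimal pandas sketch. The column names (`patient_id`, `cell_type`) and the toy values are assumptions for illustration, and this is not taken from the package's source:

```python
import pandas as pd

# Toy per-cell metadata (hypothetical values): 4 cells from P1, 5 cells from P2
cells = pd.DataFrame({
    "patient_id": ["P1"] * 4 + ["P2"] * 5,
    "cell_type": ["T_cell", "T_cell", "B_cell", "B_cell",
                  "T_cell", "B_cell", "B_cell", "B_cell", "B_cell"],
})

# Rows are samples, columns are cell types; normalize="index" makes each row sum to 1
proportions = pd.crosstab(cells["patient_id"], cells["cell_type"], normalize="index")
print(proportions)
```

Each row of `proportions` sums to 1, so a shift in one cell type necessarily changes the share of the others within that sample.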

## 📦 Installation

```bash
pip install cellproportion
```

## 🔧 Quick Start

### Single-Cell Analysis

```python
import pandas as pd
import scanpy as sc
from cellproportion import cell_type_abundance

# Load your single-cell data
adata = sc.read_h5ad("your_data.h5ad")

# Compare cell type proportions between conditions
results = cell_type_abundance(
    adata,                          # AnnData object
    annotation="cell_type",         # Column with cell type labels
    sample_types="condition",       # Column with experimental conditions
    sample_ID="patient_id",         # Column with sample/patient IDs
    sample_types_1="tumor",         # First group for comparison
    sample_types_2="normal",        # Second group for comparison
    method="signed_r2",             # Statistical method
    signed_r2_cutoff=0.15,          # Optional: cutoff for significance
    explain=True                    # Print method explanation
)

print(results.head())
```

### Spatial Transcriptomics Analysis

```python
from cellproportion.spatial import spatial_cell_type_abundance

# Analyze proportions within each spatial region
spatial_results = spatial_cell_type_abundance(
    adata,                          # AnnData with spatial information
    region_col="tissue_region",     # Column with spatial region labels
    annotation="cell_type",
    sample_types="condition", 
    sample_ID="patient_id",
    sample_types_1="tumor",
    sample_types_2="normal",
    method="signed_r2"
)

print(f"Analyzed {spatial_results['region'].nunique()} spatial regions")
print(spatial_results.head())
```
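
Once per-region results are available, a common follow-up is to pick out the strongest effect in each region. The sketch below does this on a hand-built stand-in for `spatial_results`. It assumes the spatial output carries the same `V1`/`V2`/`V3` columns as the single-cell output plus the `region` column used above, and all values are hypothetical:

```python
import pandas as pd

# Hand-built stand-in for spatial_results (hypothetical values)
spatial_results = pd.DataFrame({
    "region": ["core", "core", "margin", "margin"],
    "V1": ["T_cell", "B_cell", "T_cell", "B_cell"],
    "V2": [0.40, -0.10, 0.05, -0.30],
    "V3": [0.01, 0.40, 0.60, 0.03],
})

# Row label of the largest absolute metric value within each region
idx = spatial_results["V2"].abs().groupby(spatial_results["region"]).idxmax()

# One row per region: the cell type with the strongest effect in either direction
strongest = spatial_results.loc[idx]
print(strongest)
```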

### Using DataFrames

```python
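# Works with any DataFrame containing the required columns.
# Reuses pd and cell_type_abundance imported in the Single-Cell Analysis example.
metadata_df = pd.DataFrame({
    'cell_type': ['T_cell', 'B_cell', 'Macrophage'] * 100,
    'condition': ['tumor', 'normal'] * 150,
    # Four patients, so each patient falls in exactly one condition
    'patient_id': ['P1', 'P2', 'P3', 'P4'] * 75,
    # ... other columns
})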

results = cell_type_abundance(
    metadata_df,
    annotation="cell_type",
    sample_types="condition",
    sample_ID="patient_id",
    sample_types_1="tumor",
    sample_types_2="normal"
)
```
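
Before passing a DataFrame, it can help to confirm that the required columns exist and, if your design assigns each sample to a single group, that no sample ID appears under more than one group. The helper below is a hypothetical sketch (`check_metadata` is not part of the package), using the column names from the examples above:

```python
import pandas as pd

def check_metadata(df, annotation, sample_types, sample_ID):
    """Raise ValueError if a required column is missing or a sample spans several groups."""
    missing = [col for col in (annotation, sample_types, sample_ID) if col not in df.columns]
    if missing:
        raise ValueError(f"Missing columns: {missing}")
    # Count how many distinct groups each sample ID is assigned to
    groups_per_sample = df.groupby(sample_ID)[sample_types].nunique()
    mixed = groups_per_sample[groups_per_sample > 1]
    if not mixed.empty:
        raise ValueError(f"Samples assigned to more than one group: {list(mixed.index)}")

# Toy metadata (hypothetical values) that passes both checks
toy = pd.DataFrame({
    "cell_type": ["T_cell", "B_cell", "T_cell", "B_cell"],
    "condition": ["tumor", "tumor", "normal", "normal"],
    "patient_id": ["P1", "P1", "P2", "P2"],
})
check_metadata(toy, "cell_type", "condition", "patient_id")
```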

## 📊 Understanding the Results

The output DataFrame contains the following columns:

- **V1**: Cell type annotation
- **V2**: Statistical metric value (depends on method chosen)
- **V3**: P-value from Mann-Whitney U test
- **sig_p**: Significance category (`p<0.01`, `p<0.05`, `p<0.1`, `p<0.5`, `p>0.5`)
- **color**: Color code for visualization
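
To illustrate the kind of test behind `V3`, the sketch below runs SciPy's `mannwhitneyu` on toy per-sample proportions for a single cell type. The values and the two-sided alternative are assumptions; the package may configure the test differently:

```python
from scipy.stats import mannwhitneyu

# Toy per-sample proportions of one cell type (hypothetical values)
tumor = [0.30, 0.35, 0.40]
normal = [0.10, 0.15, 0.20]

# Two-sided test of whether the two groups' proportions differ in distribution
result = mannwhitneyu(tumor, normal, alternative="two-sided")
print(result.statistic, result.pvalue)
```

With only three samples per group, an exact two-sided Mann-Whitney test cannot produce a p-value below 0.1, so small cohorts will rarely reach the stricter significance categories.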

```python
# Example output
print(results[['V1', 'V2', 'V3', 'sig_p']].head())
#         V1        V2      V3   sig_p
# 0   T_cell  0.234567  0.0123  p<0.05
# 1   B_cell -0.123456  0.2341  p<0.5
# 2  NK_cell  0.456789  0.0001  p<0.01
```
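
The `sig_p` labels in the rows above are consistent with a simple threshold rule. A minimal sketch of such a rule follows; the strict `<` comparisons and the handling of a p-value of exactly 0.5 are assumptions, and `categorize_p` is a hypothetical helper rather than a package function:

```python
def categorize_p(p):
    """Map a p-value to one of the sig_p labels (assumed strict thresholds)."""
    for threshold in (0.01, 0.05, 0.1, 0.5):
        if p < threshold:
            return f"p<{threshold}"
    # Assumption: anything at or above 0.5 falls in the last category
    return "p>0.5"

for p in (0.0123, 0.2341, 0.0001):
    print(p, categorize_p(p))
```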

## 🎨 Custom Color Mapping

Create a TSV file with your preferred colors:

```tsv
annotation	color
T_cell	#E41A1C
B_cell	#377EB8
NK_cell	#4DAF4A
Macrophage	#984EA3
```

```python
results = cell_type_abundance(
    adata,
    # ... other parameters
    colours_file="my_colors.tsv"
)
```
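
If the colors already live in a Python dictionary, the TSV can be generated rather than typed by hand. This sketch uses only the standard library to write the two-column layout shown above and read it back as a check:

```python
import csv

colors = {
    "T_cell": "#E41A1C",
    "B_cell": "#377EB8",
    "NK_cell": "#4DAF4A",
    "Macrophage": "#984EA3",
}

# Write a tab-separated file with the header row shown above
with open("my_colors.tsv", "w", newline="") as handle:
    writer = csv.writer(handle, delimiter="\t")
    writer.writerow(["annotation", "color"])
    writer.writerows(colors.items())

# Read it back to confirm the round trip
with open("my_colors.tsv", newline="") as handle:
    loaded = {row["annotation"]: row["color"] for row in csv.DictReader(handle, delimiter="\t")}
print(loaded)
```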

## 📈 Statistical Methods Explained

```python
from cellproportion.methods import explain_v2

# Get detailed explanations of all methods
explanations = explain_v2()
for method, info in explanations.items():
    print(f"\n{method.upper()}:")
    for key, value in info.items():
        print(f"  {key}: {value}")
```

### Method Comparison

| Method | Best For | Pros | Cons |
|--------|----------|------|------|
| `signed_r2` | Linear relationships | Direction + fit strength | Assumes linearity |
| `mean_diff` | Simple comparisons | Easy interpretation | Ignores variance |
| `log2_fc` | Multiplicative changes | Ratio-based | Sensitive to low values |
| `corr` | Association strength | Scale-invariant | Only linear association |
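
To make the table concrete, the sketch below computes each metric on toy per-sample proportions with NumPy. The formulas are assumptions about what each name conventionally means (for example, signed R² as the squared Pearson correlation carrying the sign of the association, and a pseudocount in the fold-change). The package's own definitions may differ, so treat `explain_v2()` as the reference:

```python
import numpy as np

# Toy per-sample proportions of one cell type (hypothetical values)
group_1 = np.array([0.30, 0.35, 0.40])  # e.g. tumor samples
group_2 = np.array([0.10, 0.15, 0.20])  # e.g. normal samples

values = np.concatenate([group_1, group_2])
# Group labels coded as 1 (group 1) and 0 (group 2)
labels = np.concatenate([np.ones(len(group_1)), np.zeros(len(group_2))])

# mean_diff: difference in mean proportions
mean_diff = group_1.mean() - group_2.mean()

# log2_fc: log2 ratio of means; the pseudocount is an assumed guard against division by zero
pseudocount = 1e-6
log2_fc = np.log2((group_1.mean() + pseudocount) / (group_2.mean() + pseudocount))

# corr: Pearson correlation between proportions and group labels
corr = np.corrcoef(labels, values)[0, 1]

# signed_r2: with a single binary predictor, R² of a linear fit equals corr squared
signed_r2 = np.sign(corr) * corr ** 2

print(mean_diff, log2_fc, corr, signed_r2)
```

Under these definitions, `corr` and `signed_r2` always share a sign and rank cell types identically, while `mean_diff` and `log2_fc` can disagree when a cell type is rare in both groups.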

## 🔬 Advanced Usage

### Batch Processing Multiple Datasets

```python
datasets = ["dataset1.h5ad", "dataset2.h5ad", "dataset3.h5ad"]
all_results = []

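for dataset_path in datasets:
    adata = sc.read_h5ad(dataset_path)
    # Pass the same column and group arguments as in the Quick Start
    results = cell_type_abundance(
        adata, annotation="cell_type", sample_types="condition",
        sample_ID="patient_id", sample_types_1="tumor",
        sample_types_2="normal", method="signed_r2",
    )
    results['dataset'] = dataset_path  # Record which file each row came from
    all_results.append(results)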

combined_results = pd.concat(all_results, ignore_index=True)
```
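
A combined table in this long format is easy to reshape for side-by-side comparison. The sketch below pivots a hand-built stand-in for `combined_results` (hypothetical values) so that rows are cell types and columns are datasets:

```python
import pandas as pd

# Hand-built stand-in for combined_results (hypothetical values)
combined_results = pd.DataFrame({
    "V1": ["T_cell", "B_cell", "T_cell", "B_cell"],
    "V2": [0.23, -0.12, 0.31, -0.05],
    "dataset": ["dataset1.h5ad", "dataset1.h5ad", "dataset2.h5ad", "dataset2.h5ad"],
})

# Rows are cell types, columns are datasets, cells hold the metric value
by_dataset = combined_results.pivot(index="V1", columns="dataset", values="V2")
print(by_dataset)
```

`pivot` raises an error if a cell type appears more than once within a dataset, which is a useful sanity check here. Cell types missing from a dataset show up as `NaN`.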

### Comparing Methods on the Same Dataset

```python
methods = ["signed_r2", "mean_diff", "log2_fc", "corr"]
method_comparison = {}

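# Column and group arguments shared by every call (same as in the Quick Start)
common_args = dict(
    annotation="cell_type", sample_types="condition", sample_ID="patient_id",
    sample_types_1="tumor", sample_types_2="normal",
)

for method in methods:
    results = cell_type_abundance(adata, method=method, **common_args)
    method_comparison[method] = results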

# Compare results across methods
comparison_df = pd.DataFrame({
    method: method_comparison[method].set_index('V1')['V2'] 
    for method in methods
})
```
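
Because the four metrics live on different scales, comparing how they rank cell types is often more informative than comparing raw values. One possible approach, sketched on a hand-built stand-in for `comparison_df` with hypothetical values, is a Spearman rank correlation between the method columns:

```python
import pandas as pd

# Hand-built stand-in for comparison_df (hypothetical values)
comparison_df = pd.DataFrame(
    {
        "signed_r2": [0.23, -0.12, 0.46],
        "mean_diff": [0.05, -0.02, 0.10],
        "log2_fc": [0.8, -0.3, 0.4],
        "corr": [0.48, -0.35, 0.68],
    },
    index=["T_cell", "B_cell", "NK_cell"],
)

# Pairwise Spearman correlation: 1.0 means two methods rank cell types identically
rank_agreement = comparison_df.corr(method="spearman")
print(rank_agreement)
```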

## 📝 Citation

If you use CellProportion in your research, please cite:

```bibtex
@software{cellproportion2024,
  author = {Patel, Ankit},
  title = {CellProportion: Cell type proportion analysis for single-cell and spatial transcriptomics},
  url = {https://github.com/avpatel18/cellproportion},
  version = {1.0.0},
  year = {2025}
}
```

## 🤝 Contributing

Contributions are welcome! Please feel free to submit a Pull Request. For major changes, please open an issue first to discuss what you would like to change.

## 📄 License

This project is licensed under the MIT License - see the [LICENSE](LICENSE) file for details.

## 🐛 Bug Reports

If you encounter any bugs or have feature requests, please file an issue on [GitHub Issues](https://github.com/avpatel18/cellproportion/issues).

## 📧 Contact

- **Author**: Ankit Patel
- **Email**: ankit.patel@qmul.ac.uk
- **GitHub**: [@avpatel18](https://github.com/avpatel18)

            

Raw data

            {
    "_id": null,
    "home_page": "https://github.com/ankitpatel/cellproportion",
    "name": "cellproportion",
    "maintainer": null,
    "docs_url": null,
    "requires_python": ">=3.8",
    "maintainer_email": null,
    "keywords": "single-cell, spatial-transcriptomics, cell-type-analysis, proportion-analysis, bioinformatics, genomics, scRNA-seq",
    "author": "Ankit Patel",
    "author_email": "Ankit Patel <ankit.patel@qmul.ac.uk>",
    "download_url": "https://files.pythonhosted.org/packages/1d/cc/8e1ddb3a9d9dc70b501d0f8c9974749a7c9703f1bd4e59b9947a77824738/cellproportion-1.0.0.tar.gz",
    "platform": null,
    "bugtrack_url": null,
    "license": "MIT",
    "summary": "Cell type proportion analysis for single-cell and spatial transcriptomics data",
    "version": "1.0.0",
    "project_urls": {
        "Bug Reports": "https://github.com/ankitpatel/cellproportion/issues",
        "Documentation": "https://cellproportion.readthedocs.io/",
        "Homepage": "https://github.com/ankitpatel/cellproportion",
        "Source": "https://github.com/ankitpatel/cellproportion"
    },
    "split_keywords": [
        "single-cell",
        " spatial-transcriptomics",
        " cell-type-analysis",
        " proportion-analysis",
        " bioinformatics",
        " genomics",
        " scrna-seq"
    ],
    "urls": [
        {
            "comment_text": null,
            "digests": {
                "blake2b_256": "716e2d4db9ab3b078351017cd07298c7642ea45b84b21c4ab1abe0884fbc6ccb",
                "md5": "54fac6efe790bebdbf25b5f8a5bd5426",
                "sha256": "e0cfe2198cbc49bd9464572cf4c6bfe31ebfd1c15deb337c46f40e466a5a64e3"
            },
            "downloads": -1,
            "filename": "cellproportion-1.0.0-py3-none-any.whl",
            "has_sig": false,
            "md5_digest": "54fac6efe790bebdbf25b5f8a5bd5426",
            "packagetype": "bdist_wheel",
            "python_version": "py3",
            "requires_python": ">=3.8",
            "size": 14764,
            "upload_time": "2025-08-19T22:05:01",
            "upload_time_iso_8601": "2025-08-19T22:05:01.324797Z",
            "url": "https://files.pythonhosted.org/packages/71/6e/2d4db9ab3b078351017cd07298c7642ea45b84b21c4ab1abe0884fbc6ccb/cellproportion-1.0.0-py3-none-any.whl",
            "yanked": false,
            "yanked_reason": null
        },
        {
            "comment_text": null,
            "digests": {
                "blake2b_256": "1dcc8e1ddb3a9d9dc70b501d0f8c9974749a7c9703f1bd4e59b9947a77824738",
                "md5": "83a3bbee2f124c97f68416e2ae15de30",
                "sha256": "716cee634a3a0ac53afe9f0e9ff1511209a8b026a50bb9e05f74d23be2c0ac24"
            },
            "downloads": -1,
            "filename": "cellproportion-1.0.0.tar.gz",
            "has_sig": false,
            "md5_digest": "83a3bbee2f124c97f68416e2ae15de30",
            "packagetype": "sdist",
            "python_version": "source",
            "requires_python": ">=3.8",
            "size": 15200,
            "upload_time": "2025-08-19T22:05:02",
            "upload_time_iso_8601": "2025-08-19T22:05:02.511852Z",
            "url": "https://files.pythonhosted.org/packages/1d/cc/8e1ddb3a9d9dc70b501d0f8c9974749a7c9703f1bd4e59b9947a77824738/cellproportion-1.0.0.tar.gz",
            "yanked": false,
            "yanked_reason": null
        }
    ],
    "upload_time": "2025-08-19 22:05:02",
    "github": true,
    "gitlab": false,
    "bitbucket": false,
    "codeberg": false,
    "github_user": "ankitpatel",
    "github_project": "cellproportion",
    "github_not_found": true,
    "lcname": "cellproportion"
}
        