# CellProportion
A Python package for comparing cell type proportions between experimental groups in **single-cell RNA-seq** and **spatial transcriptomics** data.
## 🚀 Features
- **Flexible Input**: Works directly with AnnData objects or Pandas DataFrames
- **Multiple Statistical Methods**:
  - `signed_r2` – Signed R² from regression (captures direction + fit strength)
  - `mean_diff` – Simple difference in mean proportions
  - `log2_fc` – Log₂ fold-change for multiplicative differences
  - `corr` – Pearson correlation with group labels
- **Spatial Analysis**: Compare proportions within tissue regions/spatial domains
- **Statistical Testing**: Mann-Whitney U tests with significance categorization
- **Visualization Ready**: Built-in color schemes and customizable color mapping
- **Robust Error Handling**: Comprehensive validation and informative warnings
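To make the four metric names concrete, here is a minimal sketch that computes each one for a single cell type from per-sample proportions. It reflects an assumption about what the names mean, not the package's actual implementation; the function `sketch_metrics`, the 1/0 group coding, and the `pseudocount` argument are hypothetical.

```python
import numpy as np

def sketch_metrics(group1, group2, pseudocount=1e-6):
    """Illustrative versions of the four metrics for one cell type.

    group1, group2: per-sample proportions of the cell type in each group.
    """
    g1 = np.asarray(group1, dtype=float)
    g2 = np.asarray(group2, dtype=float)

    # Code group membership as 1 (group1) and 0 (group2).
    labels = np.concatenate([np.ones(len(g1)), np.zeros(len(g2))])
    props = np.concatenate([g1, g2])

    # Pearson correlation between proportions and group labels.
    corr = np.corrcoef(labels, props)[0, 1]

    # With one binary predictor, R^2 of the least-squares fit equals corr^2;
    # the sign of the correlation supplies the direction.
    signed_r2 = np.sign(corr) * corr ** 2

    mean_diff = g1.mean() - g2.mean()
    log2_fc = np.log2((g1.mean() + pseudocount) / (g2.mean() + pseudocount))

    return {
        "signed_r2": signed_r2,
        "mean_diff": mean_diff,
        "log2_fc": log2_fc,
        "corr": corr,
    }

metrics = sketch_metrics([0.30, 0.35, 0.40], [0.10, 0.15, 0.20])
print(metrics)
```

All four values are positive here because the cell type is more abundant in the first group.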
## 📦 Installation
CellProportion 1.0.0 requires Python 3.8 or later. Install it from PyPI:
```bash
pip install cellproportion
```
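To confirm that the install succeeded, you can query the installed version with the standard library. This is a generic check, not a CellProportion feature; the helper name `installed_version` is hypothetical.

```python
from importlib import metadata

def installed_version(package_name):
    """Return the installed version string, or None if the package is missing."""
    try:
        return metadata.version(package_name)
    except metadata.PackageNotFoundError:
        return None

print(installed_version("cellproportion"))
```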
## 🔧 Quick Start
### Single-Cell Analysis
```python
import pandas as pd
import scanpy as sc
from cellproportion import cell_type_abundance

# Load your single-cell data
adata = sc.read_h5ad("your_data.h5ad")

# Compare cell type proportions between conditions
results = cell_type_abundance(
    adata,                      # AnnData object
    annotation="cell_type",     # Column with cell type labels
    sample_types="condition",   # Column with experimental conditions
    sample_ID="patient_id",     # Column with sample/patient IDs
    sample_types_1="tumor",     # First group for comparison
    sample_types_2="normal",    # Second group for comparison
    method="signed_r2",         # Statistical method
    signed_r2_cutoff=0.15,      # Optional: cutoff for significance
    explain=True                # Print method explanation
)

print(results.head())
```
### Spatial Transcriptomics Analysis
```python
from cellproportion.spatial import spatial_cell_type_abundance
# Analyze proportions within each spatial region
spatial_results = spatial_cell_type_abundance(
    adata,                        # AnnData with spatial information
    region_col="tissue_region",   # Column with spatial region labels
    annotation="cell_type",
    sample_types="condition",
    sample_ID="patient_id",
    sample_types_1="tumor",
    sample_types_2="normal",
    method="signed_r2"
)

print(f"Analyzed {spatial_results['region'].nunique()} spatial regions")
print(spatial_results.head())
```
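Conceptually, a per-region analysis repeats the group comparison inside each spatial region. The sketch below shows that idea with plain pandas, using a difference in mean proportions as a stand-in metric. It does not reproduce `spatial_cell_type_abundance`; the function `per_region_mean_diff` and the toy table are hypothetical. With real data, the input table would be `adata.obs`.

```python
import pandas as pd

def per_region_mean_diff(obs, region_col, annotation, sample_types,
                         sample_id, group1, group2):
    """For each region and cell type, difference in mean per-sample proportion."""
    rows = []
    for region, sub in obs.groupby(region_col):
        # Per-sample proportions of each cell type within this region.
        props = pd.crosstab(sub[sample_id], sub[annotation], normalize="index")
        # Map each sample to its group label (assumes one group per sample).
        groups = sub.drop_duplicates(sample_id).set_index(sample_id)[sample_types]
        groups = groups.reindex(props.index)
        diff = props[groups == group1].mean() - props[groups == group2].mean()
        for cell_type, value in diff.items():
            rows.append({"region": region, "cell_type": cell_type, "mean_diff": value})
    return pd.DataFrame(rows)

# Invented toy data: two regions, two patients, two cell types.
cells = ["T_cell", "T_cell", "T_cell", "B_cell", "T_cell", "B_cell", "B_cell", "B_cell"]
obs = pd.DataFrame({
    "tissue_region": ["core"] * 8 + ["edge"] * 8,
    "cell_type": cells * 2,
    "patient_id": (["P1"] * 4 + ["P2"] * 4) * 2,
    "condition": (["tumor"] * 4 + ["normal"] * 4) * 2,
})

region_diffs = per_region_mean_diff(
    obs, "tissue_region", "cell_type", "condition", "patient_id", "tumor", "normal"
)
print(region_diffs)
```

A region that lacks samples from one of the two groups yields `NaN` in this sketch, which is a useful reminder to check group coverage per region before interpreting results.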
### Using DataFrames
```python
# Works with any DataFrame containing the required columns
metadata_df = pd.DataFrame({
    'cell_type': ['T_cell', 'B_cell', 'Macrophage'] * 100,
    'condition': ['tumor', 'normal'] * 150,
    'patient_id': ['P1', 'P2', 'P3'] * 100,
    # ... other columns
})

results = cell_type_abundance(
    metadata_df,
    annotation="cell_type",
    sample_types="condition",
    sample_ID="patient_id",
    sample_types_1="tumor",
    sample_types_2="normal"
)
```
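Before running the comparison on a DataFrame, it can help to inspect the per-sample proportions and confirm that each sample belongs to exactly one condition. The following sketch uses only pandas. It assumes that per-sample proportions are the quantity being compared and is not a view into the package's internals; the toy table is invented.

```python
import pandas as pd

toy_metadata = pd.DataFrame({
    "cell_type": ["T_cell", "T_cell", "B_cell", "Macrophage",
                  "T_cell", "B_cell", "B_cell", "Macrophage"],
    "condition": ["tumor"] * 4 + ["normal"] * 4,
    "patient_id": ["P1"] * 4 + ["P2"] * 4,
})

# Rows are samples, columns are cell types, and each row sums to 1.
proportions = pd.crosstab(
    toy_metadata["patient_id"], toy_metadata["cell_type"], normalize="index"
)

# Number of distinct conditions per sample; a value above 1 signals a
# sample ID that is shared between groups.
conditions_per_sample = toy_metadata.groupby("patient_id")["condition"].nunique()

print(proportions)
print(conditions_per_sample)
```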
## 📊 Understanding the Results
The output DataFrame contains:
- **V1**: Cell type annotation
- **V2**: Statistical metric value (depends on method chosen)
- **V3**: P-value from Mann-Whitney U test
- **sig_p**: Significance category (`p<0.01`, `p<0.05`, `p<0.1`, `p<0.5`, `p>0.5`)
- **color**: Color code for visualization
```python
# Example output
print(results[['V1', 'V2', 'V3', 'sig_p']].head())
#         V1         V2      V3   sig_p
# 0   T_cell   0.234567  0.0123  p<0.05
# 1   B_cell  -0.123456  0.2341   p<0.5
# 2  NK_cell   0.456789  0.0001  p<0.01
```
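The `sig_p` labels can be read as the smallest listed threshold that a p-value falls under. A minimal sketch of that mapping follows; the function `categorize_p` is hypothetical, and the handling of values exactly on a threshold is an assumption.

```python
def categorize_p(p):
    """Map a p-value to one of the sig_p category labels."""
    for cutoff in (0.01, 0.05, 0.1, 0.5):
        if p < cutoff:
            return f"p<{cutoff}"
    return "p>0.5"

for p in (0.0123, 0.2341, 0.0001, 0.9):
    print(p, categorize_p(p))
```

Applied to the p-values in the example output, this gives `p<0.05`, `p<0.5`, and `p<0.01`, matching the `sig_p` column shown there.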
## 🎨 Custom Color Mapping
Create a tab-separated (TSV) file with an `annotation` column and a `color` column, using a single tab character between the two fields:
```tsv
annotation color
T_cell #E41A1C
B_cell #377EB8
NK_cell #4DAF4A
Macrophage #984EA3
```
```python
results = cell_type_abundance(
    adata,
    # ... other parameters
    colours_file="my_colors.tsv"
)
```
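A quick way to check a color file before passing it to `colours_file` is to parse it yourself and validate the hex codes. This sketch uses only the standard library and reads from an in-memory string so that it runs without a file on disk; `read_color_map` and `invalid_colors` are hypothetical helpers, not part of the package.

```python
import csv
import io
import re

tsv_text = (
    "annotation\tcolor\n"
    "T_cell\t#E41A1C\n"
    "B_cell\t#377EB8\n"
    "NK_cell\t#4DAF4A\n"
    "Macrophage\t#984EA3\n"
)

def read_color_map(handle):
    """Read an annotation-to-color mapping from a tab-separated file handle."""
    reader = csv.DictReader(handle, delimiter="\t")
    return {row["annotation"]: row["color"] for row in reader}

def invalid_colors(color_map):
    """Return annotations whose color is not a six-digit hex code."""
    pattern = re.compile(r"#[0-9A-Fa-f]{6}")
    return [name for name, color in color_map.items() if not pattern.fullmatch(color)]

color_map = read_color_map(io.StringIO(tsv_text))
print(color_map)
print(invalid_colors(color_map))
```

For a file on disk, replace the `io.StringIO` object with `open("my_colors.tsv", newline="")`.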
## 📈 Statistical Methods Explained
```python
from cellproportion.methods import explain_v2
# Get detailed explanations of all methods
explanations = explain_v2()
for method, info in explanations.items():
    print(f"\n{method.upper()}:")
    for key, value in info.items():
        print(f"  {key}: {value}")
```
### Method Comparison
| Method | Best For | Pros | Cons |
|--------|----------|------|------|
| `signed_r2` | Linear relationships | Direction + fit strength | Assumes linearity |
| `mean_diff` | Simple comparisons | Easy interpretation | Ignores variance |
| `log2_fc` | Multiplicative changes | Ratio-based | Sensitive to low values |
| `corr` | Association strength | Scale-invariant | Only linear association |
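
The "sensitive to low values" caveat for `log2_fc` is easiest to see with numbers. In the sketch below, the same absolute difference in mean proportion (0.01) gives a small fold-change for an abundant cell type and a large one for a rare cell type. The function `log2_fold_change` and its `pseudocount` argument are hypothetical and are not taken from the package.

```python
import math

def log2_fold_change(mean1, mean2, pseudocount=0.0):
    """Log2 ratio of two mean proportions, with an optional pseudocount."""
    return math.log2((mean1 + pseudocount) / (mean2 + pseudocount))

# Same absolute difference of 0.01 in both cases.
abundant = log2_fold_change(0.31, 0.30)      # close to zero
rare = log2_fold_change(0.011, 0.001)        # an 11-fold ratio

# A pseudocount damps the ratio for rare cell types.
rare_damped = log2_fold_change(0.011, 0.001, pseudocount=0.01)

print(abundant, rare, rare_damped)
```

For rare cell types, it is therefore worth reading `log2_fc` alongside `mean_diff` rather than on its own.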
## 🔬 Advanced Usage
### Batch Processing Multiple Datasets
```python
datasets = ["dataset1.h5ad", "dataset2.h5ad", "dataset3.h5ad"]
all_results = []

for dataset_path in datasets:
    adata = sc.read_h5ad(dataset_path)
    # Add the column arguments from Quick Start here if your data needs them
    results = cell_type_abundance(adata, method="signed_r2")
    results['dataset'] = dataset_path
    all_results.append(results)

combined_results = pd.concat(all_results, ignore_index=True)
```
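Once results from several datasets are concatenated, a wide table with one column per dataset makes it easy to see which cell types change in the same direction everywhere. The sketch below uses an invented stand-in for `combined_results` with the `V1`, `V2`, and `dataset` columns described above.

```python
import pandas as pd

# Toy stand-in for combined_results; the values are invented for illustration.
toy_combined = pd.DataFrame({
    "V1": ["T_cell", "B_cell", "T_cell", "B_cell"],
    "V2": [0.40, -0.10, 0.35, 0.05],
    "dataset": ["dataset1.h5ad", "dataset1.h5ad", "dataset2.h5ad", "dataset2.h5ad"],
})

# One row per cell type, one column per dataset.
per_dataset = toy_combined.pivot(index="V1", columns="dataset", values="V2")

# True where the metric has the same sign in every dataset.
consistent_sign = (per_dataset > 0).all(axis=1) | (per_dataset < 0).all(axis=1)

print(per_dataset)
print(consistent_sign)
```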
### Comparing Methods on the Same Data
```python
methods = ["signed_r2", "mean_diff", "log2_fc", "corr"]
method_comparison = {}

for method in methods:
    results = cell_type_abundance(adata, method=method)
    method_comparison[method] = results

# Compare results across methods
comparison_df = pd.DataFrame({
    method: method_comparison[method].set_index('V1')['V2']
    for method in methods
})
```
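Because the four metrics live on different scales, comparing them by sign and by rank is usually more informative than comparing raw values. This sketch uses an invented stand-in for `comparison_df` (rows are cell types, columns are methods) and checks direction agreement and rank agreement.

```python
import pandas as pd

# Toy stand-in for comparison_df; the values are invented for illustration.
toy_comparison = pd.DataFrame(
    {
        "signed_r2": [0.60, -0.20, 0.05],
        "mean_diff": [0.15, -0.04, 0.01],
        "log2_fc": [1.10, -0.50, 0.08],
        "corr": [0.77, -0.45, 0.22],
    },
    index=["T_cell", "B_cell", "NK_cell"],
)

# Direction agreement: do all methods give the same sign for a cell type?
same_direction = toy_comparison.apply(lambda row: row.gt(0).nunique() == 1, axis=1)

# Rank agreement: Pearson correlation of within-method ranks,
# which is equivalent to Spearman correlation between methods.
rank_agreement = toy_comparison.rank().corr()

print(same_direction)
print(rank_agreement)
```

Cell types where `same_direction` is `False` are the ones whose interpretation depends on the choice of method.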
## 📝 Citation
If you use CellProportion in your research, please cite:
```bibtex
@software{cellproportion2024,
author = {Patel, Ankit},
title = {CellProportion: Cell type proportion analysis for single-cell and spatial transcriptomics},
url = {https://github.com/avpatel18/cellproportion},
version = {1.0.0},
year = {2025}
}
```
## 🤝 Contributing
Contributions are welcome! Please feel free to submit a Pull Request. For major changes, please open an issue first to discuss what you would like to change.
## 📄 License
This project is licensed under the MIT License - see the [LICENSE](LICENSE) file for details.
## 🐛 Bug Reports
If you encounter any bugs or have feature requests, please file an issue on [GitHub Issues](https://github.com/avpatel18/cellproportion/issues).
## 📧 Contact
- **Author**: Ankit Patel
- **Email**: ankit.patel@qmul.ac.uk
- **GitHub**: [@avpatel18](https://github.com/avpatel18)