# CN-SBM: Categorical Block Modelling For Primary and Residual Copy Number Variation
This repository contains the implementation of the model described in the paper: **CN-SBM: Categorical Block Modelling For Primary and Residual Copy Number Variation** ([arXiv:2506.22963](https://arxiv.org/abs/2506.22963)), to appear in MLCB 2025.
# Note on NumPy Version
If you plan to use saved pickle files in an environment with NumPy < 2, install NumPy < 2 (for example, `pip install "numpy<2"`) for compatibility. Otherwise, NumPy 2 or newer is fine.
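As a quick check before saving or loading model pickles, the installed NumPy major version can be read without importing NumPy itself. The sketch below uses only the standard library; the helper names `major_version` and `installed_numpy_major` are hypothetical and not part of `cnsbm`.

```python
from importlib import metadata


def major_version(version: str) -> int:
    """Return the leading integer of a version string such as '1.26.4'."""
    return int(version.split(".")[0])


def installed_numpy_major():
    """Return the installed NumPy major version, or None if NumPy is absent."""
    try:
        return major_version(metadata.version("numpy"))
    except metadata.PackageNotFoundError:
        return None


numpy_major = installed_numpy_major()
if numpy_major is None:
    print("NumPy is not installed in this environment.")
elif numpy_major < 2:
    print("NumPy 1.x detected: use NumPy < 2 wherever the pickles are created or loaded.")
else:
    print("NumPy 2 or newer detected.")
```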
# Installation
## Option 1: Install from PyPI (Recommended)
```bash
pip install cnsbm
```
## Option 2: Install from source
```bash
git clone https://github.com/lamke07/CNSBM.git
cd CNSBM
pip install .
```
## Option 3: Development installation
```bash
git clone https://github.com/lamke07/CNSBM.git
cd CNSBM
pip install -e .
```
## Option 4: Using Conda (Alternative)
```bash
# Run from the root of the cloned CNSBM repository (see Option 2)
conda env create -f environment.yml
conda activate cnsbm
pip install .
```
### GPU Support (Optional)
For GPU acceleration with JAX:
```bash
# Quotes keep shells such as zsh from expanding the square brackets
pip install "cnsbm[gpu]"
```
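After installing, one way to confirm that JAX can see a GPU is to list its devices. This is a minimal sketch using `jax.default_backend()` and `jax.devices()`; the helper name `describe_jax_backend` is hypothetical, and the exact device strings depend on your hardware and JAX build.

```python
def describe_jax_backend():
    """Return (backend_name, device_list), or None if JAX is not importable."""
    try:
        import jax
    except ImportError:
        return None
    return jax.default_backend(), [str(d) for d in jax.devices()]


info = describe_jax_backend()
if info is None:
    print("JAX is not installed.")
else:
    backend, devices = info
    # backend is 'cpu' when no accelerator is visible to JAX
    print(f"backend={backend}, devices={devices}")
```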
# Simple usage
```python
import os
import jax.numpy as jnp
from cnsbm import CNSBM
cwd = os.getcwd()

# C is a categorical matrix (integer-encoded categories starting from 0);
# missing values are encoded as -1. The number of categories is inferred from C.max().
# C must already be defined at this point. For an example of how to construct and use C,
# see cn_vi-simple.ipynb in this repository.
C = jnp.asarray(C)
K, L = 15, 10

# Initialize the JAX model
sbm_test = CNSBM(C, K, L, rand_init='spectral_bi', fill_na=2)
# Run batch variational inference
_ = sbm_test.batch_vi(75, batch_print=1, fitted=False, tol=1e-6)

# Plot the reordered output and get summary information
sbm_test.plt_blocks(plt_init=True)
sbm_test.summary()
_ = sbm_test.ICL(verbose=True, slow=True)

# Save model outputs and export cluster labels / probabilities
os.makedirs(os.path.join(cwd, 'output'), exist_ok=True)
sbm_test.export_outputs_csv(os.path.join(cwd, 'output'), model_name='test_sbm')
sbm_test.save_jax_model(os.path.join(cwd, 'output', 'test_sbm.pickle'))
```
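The usage example assumes `C` already exists. As a rough illustration of the encoding described in its comments (categories as integers starting from 0, missing values as -1), the sketch below builds such a matrix from labelled data in plain Python. The helper `encode_categories`, the labels, and the toy data are all hypothetical; see `cn_vi-simple.ipynb` for the authors' own construction.

```python
def encode_categories(rows, categories, missing=None):
    """Map labels to 0-based integer codes; entries equal to `missing` become -1."""
    code = {label: i for i, label in enumerate(categories)}
    return [[-1 if x == missing else code[x] for x in row] for row in rows]


# Hypothetical copy number states, ordered so that codes start from 0
categories = ["loss", "neutral", "gain"]
raw = [
    ["neutral", "gain", None],
    ["loss", "neutral", "neutral"],
]

C = encode_categories(raw, categories)
print(C)  # [[1, 2, -1], [0, 1, 1]]
```

A nested list like this can be passed to `jnp.asarray(C)` as in the usage example. Since the number of categories is inferred from `C.max()`, the highest category code should actually occur somewhere in the data.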