# CLUMPS-PTM
An algorithm for identifying 3D clusters ("clumps") of post-translational modifications (PTMs). Developed for the Clinical Proteomic Tumor Analysis Consortium ([CPTAC](https://proteomics.cancer.gov/programs/cptac)). The full repository for the pan-cancer project can be found [here](https://github.com/getzlab/CPTAC_PanCan_2021).
__Author__: Shankara Anand
__Email__: sanand@broadinstitute.org
_Requires Python 3.6.0 or higher._
## Installation
##### PIP
`pip3 install clumps-ptm`
or
##### Git Clone
```
git clone git@github.com:getzlab/CLUMPS-PTM.git
cd CLUMPS-PTM
pip3 install -e .
```
## Use
CLUMPS-PTM has 3 general phases of analysis:
1. __Mapping__: taking input PTM proteomic data and mapping them onto PDB structural data.
Mapping depends on the source data and involves programmatic calls to `blastp+`, depending on the source database, to map to UniProt and ultimately to PDB structures. An example notebook that walks through the mapping and demonstrates how to run these steps programmatically with the `clumps-ptm` API can be found [here](https://github.com/getzlab/CLUMPS-PTM/blob/main/examples/CPTAC_Mapping_Workflow.ipynb). Once the mapping has been generated for a new dataset, the mapping file is passed to the `clumpsptm` command via the `--maps` flag (below).
2. __CLUMPS__: running the algorithm for identifying statistically significant clustering of PTM sites.
CLUMPS-PTM was designed for use with differential expression proteomic data. Because of drop-out in mass spectrometry data, we use broad changes in PTM levels across sample groups to interrogate "clumping" of modifications. The input is therefore the output of a Limma-Voom differential expression analysis.
```
usage: clumpsptm [-h] -i INPUT -m MAPS -w WEIGHT -s PDBSTORE [-o OUTPUT_DIR]
                 [-x XPO] [--threads THREADS] [-v]
                 [-f [FEATURES [FEATURES ...]]] [-g GROUPING] [-q]
                 [--min_sites MIN_SITES] [--subset {positive,negative}]
                 [--protein_id PROTEIN_ID] [--site_id SITE_ID] [--alphafold]
                 [--alphafold_threshold ALPHAFOLD_THRESHOLD]

Run CLUMPS-PTM.

optional arguments:
  -h, --help            show this help message and exit
  -i INPUT, --input INPUT
                        <Required> Input file.
  -m MAPS, --maps MAPS  <Required> Mapping with index as indices that overlap
                        input.
  -w WEIGHT, --weight WEIGHT
                        <Required> Weighting for CLUMPS-PTM (ex. logFC).
  -s PDBSTORE, --pdbstore PDBSTORE
                        <Required> path to PDBStore directory.
  -o OUTPUT_DIR, --output_dir OUTPUT_DIR
                        Output directory.
  -x XPO, --xpo XPO     Soft threshold parameter for truncated Gaussian.
  --threads THREADS     Number of threads for sampling.
  -v, --verbose         Verbosity.
  -f [FEATURES [FEATURES ...]], --features [FEATURES [FEATURES ...]]
                        Assays to subset for.
  -g GROUPING, --grouping GROUPING
                        DE group to use.
  -q, --use_only_significant_sites
                        Only use significant sites for CLUMPS-PTM.
  --min_sites MIN_SITES
                        Minimum number of sites.
  --subset {positive,negative}
                        Subset sites.
  --protein_id PROTEIN_ID
                        Unique protein id in input.
  --site_id SITE_ID     Unique site id in input.
  --alphafold           Run using alphafold structures.
  --alphafold_threshold ALPHAFOLD_THRESHOLD
                        Threshold confidence level for alphafold sites.
```
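As a rough illustration of how the flags above fit together, the sketch below assembles a `clumpsptm` command line from Python. It uses only flags listed in the help text. The helper name `build_clumpsptm_command` and the file paths (`de_results.tsv`, `mapping.tsv`, `pdbstore/`, `clumpsptm_output/`) are hypothetical placeholders, not files shipped with the package.

```python
import shlex


def build_clumpsptm_command(input_path, maps_path, weight, pdbstore,
                            output_dir=None, threads=None, subset=None,
                            use_only_significant_sites=False):
    """Assemble an argv list for the `clumpsptm` CLI (hypothetical helper)."""
    # The four arguments marked <Required> in the help text.
    argv = [
        "clumpsptm",
        "--input", input_path,
        "--maps", maps_path,
        "--weight", weight,
        "--pdbstore", pdbstore,
    ]
    # Optional flags are appended only when the caller sets them.
    if output_dir is not None:
        argv += ["--output_dir", output_dir]
    if threads is not None:
        argv += ["--threads", str(threads)]
    if subset is not None:
        if subset not in ("positive", "negative"):
            raise ValueError("subset must be 'positive' or 'negative'")
        argv += ["--subset", subset]
    if use_only_significant_sites:
        argv.append("--use_only_significant_sites")
    return argv


# All paths below are placeholders chosen for illustration.
cmd = build_clumpsptm_command(
    input_path="de_results.tsv",
    maps_path="mapping.tsv",
    weight="logFC",
    pdbstore="pdbstore/",
    output_dir="clumpsptm_output/",
    threads=8,
    subset="positive",
)
print(" ".join(shlex.quote(arg) for arg in cmd))
```

The printed string can be pasted into a shell, or the list can be handed to `subprocess.run(cmd, check=True)` once `clumps-ptm` is installed and the input files exist.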
3. __Post-Processing__: post-processing (FDR correction) & visualization in PyMOL.
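
The README does not say which FDR procedure the post-processing step uses. As a minimal sketch, assuming the common Benjamini-Hochberg procedure, per-structure p-values could be adjusted as follows. The function name and the example p-values are illustrative only and are not taken from the `clumps-ptm` API or its output.

```python
def benjamini_hochberg(pvalues):
    """Return Benjamini-Hochberg adjusted p-values (q-values), in input order.

    Assumption: BH is used here as a stand-in, since the FDR method
    applied by CLUMPS-PTM is not named in this document.
    """
    m = len(pvalues)
    # Indices of the p-values sorted from smallest to largest.
    order = sorted(range(m), key=lambda i: pvalues[i])
    qvalues = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down, enforcing monotone q-values.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, pvalues[i] * m / rank)
        qvalues[i] = running_min
    return qvalues


# Made-up p-values for illustration.
example_pvalues = [0.01, 0.04, 0.03, 0.005]
example_qvalues = benjamini_hochberg(example_pvalues)
significant = [q <= 0.05 for q in example_qvalues]
print(example_qvalues, significant)
```

Walking the ranks from largest to smallest keeps the adjusted values monotone, so a site with a smaller p-value never receives a larger q-value than one with a larger p-value.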