pydssp


- Name: pydssp
- Version: 0.9.0
- Home page: https://github.com/ShintaroMinami/PyDSSP
- Summary: A simplified implementation of DSSP algorithm for PyTorch and NumPy
- Upload time: 2022-11-15 02:04:06
- Author: Shintaro Minami
- License: MIT
- Keywords: dssp, secondary structure, protein structure
- Requirements: torch (>= 1.12.1), numpy (>= 1.20.3), einops (>= 0.4.1), tqdm (>= 4.60)

# PyDSSP
A simplified implementation of DSSP algorithm for PyTorch and NumPy

# What's this?
DSSP (Dictionary of Secondary Structure of Proteins) is a popular algorithm for assigning secondary structure to a protein backbone structure. [<a href="https://onlinelibrary.wiley.com/doi/abs/10.1002/bip.360221211">Wolfgang Kabsch and Christian Sander (1983)</a>] This repository is a Python implementation of the DSSP algorithm that simplifies some parts of the algorithm.

# General Info
- It's NOT a complete implementation of the original DSSP, as some parts have been simplified (some more details [here](#differences-from-the-original-dssp)). However, an average of over 97% of secondary structure determinations agree with the original.
- The algorithm used to identify hydrogen bonded residue pairs is exactly the same as the original DSSP algorithm, but is extended to output the hydrogen-bond-pair-matrix as continuous values in the range [0,1].
- With the continuous-valued extension above, the hydrogen-bond-pair-matrix is differentiable when a torch.Tensor is given as input.

# Install
## install through PyPI
``` bash
pip install pydssp
```
## install by git clone
``` bash
git clone https://github.com/ShintaroMinami/PyDSSP.git
cd PyDSSP
python setup.py install
```

# How to use
## To use pydssp script
Once pydssp is installed, the pydssp command should be available.
``` bash
pydssp  input_01.pdb input_02.pdb ... input_N.pdb -o output.result
```
The output.result file is plain text and looks like the following:
``` bash
-EEEEE-E--EEEEEE---EEEE-HHHH--EEEE--------- input_01.pdb
-HHHHHHHHHHHHHH----HHHHHHHHHHHHHHHHHHH--- input_02.pdb
-EEEE-----EEEE----EEEE--E---EEE-----EEE-EEE-- input_03.pdb
...
```
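
The result file can be read back with a few lines of Python. The sketch below assumes each line holds the secondary-structure string followed by whitespace and the file name, as in the sample above; the helper names `parse_pydssp_result` and `composition` are hypothetical and not part of pydssp.

``` python
from collections import Counter

def parse_pydssp_result(text):
    """Map each file name to its secondary-structure string.

    Assumes the file name is the last whitespace-separated token on the line.
    """
    result = {}
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        ss, name = line.rsplit(maxsplit=1)
        result[name] = ss
    return result

def composition(ss):
    """Fraction of residues labelled loop ('-'), helix ('H') and strand ('E')."""
    counts = Counter(ss)
    return {label: counts.get(label, 0) / len(ss) for label in "-HE"}

sample = """\
-EEEEE-E--EEEEEE---EEEE-HHHH--EEEE--------- input_01.pdb
-HHHHHHHHHHHHHH----HHHHHHHHHHHHHHHHHHH--- input_02.pdb
"""

parsed = parse_pydssp_result(sample)
for name, ss in parsed.items():
    print(name, len(ss), composition(ss))
```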

## To use as python module
### Import & test coordinates
``` python
# Import
import torch
import pydssp

# Sample coordinates
batch, length, atoms, xyz = 10, 100, 4, 3
## atoms should be 4 (N, CA, C, O) or 5 (N, CA, C, O, H)
coord = torch.randn([batch, length, atoms, xyz]) # batch-dim is optional
```

### To get hydrogen-bond matrix: ```pydssp.get_hbond_map()```
``` python
hbond_matrix = pydssp.get_hbond_map(coord)

print(hbond_matrix.shape) # should be (batch, length, length)
```
- For hbond_matrix[b, i, j], index 'i' is for the donor (N-H) and 'j' is for the acceptor (C=O).
- The output matrix consists of continuous values in the range [0,1], defined as follows.

$HbondMat(i,j) = (1+\sin((-0.5-E(i,j)-margin)/margin*\pi/2))/2$

Here $E$ is the electrostatic energy defined by Kabsch and Sander (1983), and $margin$ (= 1.0) is introduced to control smoothness.
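
The formula can be illustrated for a single donor-acceptor pair in plain Python. This is a sketch, not pydssp's code: the energy follows the Kabsch and Sander (1983) definition (partial charges 0.42e and 0.20e, factor 332, distances in angstroms, energy in kcal/mol), and the clamp to $[-margin, margin]$ is an assumption made here so that the sine stays monotonic and the result stays within [0,1]. The function names and the example distances are hypothetical.

``` python
import math

Q1Q2 = 0.42 * 0.20  # product of partial charges (Kabsch and Sander 1983)
F = 332.0           # dimensional factor; distances in angstroms give kcal/mol

def hbond_energy(r_on, r_ch, r_oh, r_cn):
    """Electrostatic energy E between a C=O group and an N-H group."""
    return Q1Q2 * F * (1 / r_on + 1 / r_ch - 1 / r_oh - 1 / r_cn)

def hbond_value(energy, cutoff=-0.5, margin=1.0):
    """Smooth value in [0, 1] following the HbondMat formula above."""
    x = cutoff - margin - energy
    x = max(-margin, min(margin, x))  # assumed clamp, see lead-in
    return (1 + math.sin(x / margin * math.pi / 2)) / 2

# Values at the start, middle and end of the smooth transition
print(hbond_value(-0.5), hbond_value(-1.5), hbond_value(-2.5))  # 0.0 0.5 1.0

# Illustrative distances for a well-formed hydrogen bond (not from the source)
energy = hbond_energy(r_on=2.9, r_ch=3.1, r_oh=1.9, r_cn=4.1)
print(energy < -0.5, hbond_value(energy))
```

With $margin = 1.0$, the value rises from 0 at $E = -0.5$ to 1 at $E = -2.5$, replacing the hard $E < -0.5$ threshold of the original DSSP with a smooth ramp.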

### To get secondary structure assignment: ```pydssp.assign()```
``` python
dssp = pydssp.assign(coord, out_type='c3')
## output is batched np.ndarray of C3 annotation, like ['-', 'H', 'H', ..., 'E', '-']

# To get secondary str. as index
dssp = pydssp.assign(coord, out_type='index')
## 0: loop,  1: alpha-helix,  2: beta-strand

# To get secondary str. as onehot representation
dssp = pydssp.assign(coord, out_type='onehot')
## dim-0: loop,  dim-1: alpha-helix,  dim-2: beta-strand
```
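
The three output types carry the same information, so one can be converted to another. The NumPy sketch below only mirrors the label order given in the comments above (0: loop, 1: alpha-helix, 2: beta-strand); the helper names are hypothetical and this is not how pydssp implements the conversion internally.

``` python
import numpy as np

# Index 0: loop '-', index 1: alpha-helix 'H', index 2: beta-strand 'E'
C3_LABELS = np.array(['-', 'H', 'E'])

def index_to_c3(index):
    """Convert integer labels to C3 characters."""
    return C3_LABELS[np.asarray(index)]

def index_to_onehot(index):
    """Convert integer labels to a one-hot array with a trailing dimension of 3."""
    return np.eye(3, dtype=int)[np.asarray(index)]

def c3_to_string(c3):
    """Join a 1-D array of C3 characters into a single string."""
    return ''.join(c3)

index = np.array([0, 1, 1, 1, 2, 2, 0])
c3 = index_to_c3(index)
onehot = index_to_onehot(index)

print(c3_to_string(c3))  # -HHHEE-
print(onehot.shape)      # (7, 3)
```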

# Differences from the original DSSP
This implementation is simplified from the original DSSP algorithm. The differences from the original DSSP are as follows:
- β-bulge annotation is omitted, so a β-bulge is assigned as a loop instead of a β-strand.
- Parameters for adding hydrogen atoms are slightly different from the original DSSP, which may cause small differences in hydrogen bond annotation.
- Only C3-type assignment ('-', 'H', and 'E') is supported, not the C8 type (B, E, G, H, I, S, T, and ' ').

Despite the above simplifications, the C3-type annotation still matches the original DSSP for more than 97% of residues on average.
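
A per-residue agreement like the one quoted above could be measured as the fraction of positions where two C3 strings carry the same label. The sketch below is one plausible way to compute it, not the procedure used to obtain the 97% figure; `c3_agreement` is a hypothetical helper, and both strings are assumed to be C3 annotations of equal length.

``` python
def c3_agreement(predicted, reference):
    """Fraction of residues with identical C3 labels in both strings."""
    if len(predicted) != len(reference):
        raise ValueError("annotations must have the same length")
    if not predicted:
        raise ValueError("annotations must not be empty")
    matches = sum(p == r for p, r in zip(predicted, reference))
    return matches / len(predicted)

# Toy example: the two strings differ at one of eight positions
print(c3_agreement("-HHHHEE-", "-HHH-EE-"))  # 0.875
```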

## Reference
``` bibtex
@article{kabsch1983dictionary,
  title={Dictionary of protein secondary structure: pattern recognition of hydrogen-bonded and geometrical features},
  author={Kabsch, Wolfgang and Sander, Christian},
  journal={Biopolymers: Original Research on Biomolecules},
  volume={22},
  number={12},
  pages={2577--2637},
  year={1983},
  publisher={Wiley Online Library}
}
```


            
