# PPSS - Python Protein Subunit Syntax
A simple Python package for defining protein subunit structures.
## Installation
The simplest way to install PPSS is with pip:
```
pip install ppss
```
Alternatively, install the Poetry packaging and dependency tool, clone this repository, and install with Poetry:
```
pipx install poetry
git clone https://github.com/sandyjmacdonald/ppss
cd ppss
poetry install
```
## Usage
The simple example below defines a protein composed of either an S1 *plus* S2 subunit, or just an S3 subunit. The bar (`|`) symbol represents an OR condition, or "alternation", while `+` represents an AND condition, or "concatenation". Subunit IDs can be any combination of uppercase and lowercase letters and digits.
```
from ppss import ProteinParser
# Instantiate the parser
parser = ProteinParser()
# Define a protein structure
protein_definition = "(S1 + S2) | S3"
# Parse the protein structure
structures = parser.parse(protein_definition)
# Display the structures
for structure in structures:
    print(structure)
```
The two possible structures are printed:
```
S1 + S2
S3
```
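To make the two operators concrete, here is a minimal standalone sketch of how alternation and concatenation combine. It does not use ppss and is not a description of its internals: the helper names `subunit`, `alternation`, and `concatenation` are hypothetical, and each structure is modelled as a tuple of subunit IDs.

```python
from itertools import product


def subunit(name):
    """A single subunit yields exactly one structure containing itself."""
    return [(name,)]


def alternation(*options):
    """OR: collect every structure offered by any of the options."""
    return [structure for option in options for structure in option]


def concatenation(*parts):
    """AND: pick one structure from each part and join them in order."""
    return [sum(combo, ()) for combo in product(*parts)]


# Equivalent of the definition "(S1 + S2) | S3"
structures = alternation(
    concatenation(subunit("S1"), subunit("S2")),
    subunit("S3"),
)

for structure in structures:
    print(" + ".join(structure))
```

Under these assumptions, alternation behaves like a union of possibilities and concatenation like a Cartesian product, which is why the definition above yields two structures.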
The full grammar of the Lark parser is as follows:
```
?start: protein
protein: alternation
alternation: concatenation ("|" concatenation)*
concatenation: required_term ("+" term)*
?term: required_term
     | optional_term
required_term: multiplicity
             | subunit
             | "(" alternation ")"
optional_term: optional
multiplicity: subunit "{" DIGIT+ "}"
            | "(" alternation ")" "{" DIGIT+ "}"
optional: "[" alternation "]"
subunit: SUBUNIT
SUBUNIT: /[A-Za-z0-9]+/
DIGIT: "0" | "1" | "2" | "3" | "4" | "5" | "6" | "7" | "8" | "9"
%import common.WS
%ignore WS
```
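The grammar also covers multiplicity (`{n}`) and optional terms (`[...]`). The sketch below is a hand-written recursive-descent reading of the grammar using only the standard library. It is not the ppss implementation and does not use Lark. The names `tokenize`, `Expander`, and `expand` are hypothetical, and the semantics are assumptions: `X{n}` is taken to mean `X` repeated `n` times, and `[X]` to mean `X` either present or absent.

```python
import re
from itertools import product

# One token is a subunit ID / number, or a single punctuation character.
TOKEN = re.compile(r"\s*([A-Za-z0-9]+|[|+(){}\[\]])")


def tokenize(text):
    tokens, pos = [], 0
    text = text.rstrip()
    while pos < len(text):
        match = TOKEN.match(text, pos)
        if not match:
            raise ValueError(f"unexpected character near position {pos}")
        tokens.append(match.group(1))
        pos = match.end()
    return tokens


class Expander:
    """Each method returns a list of structures (tuples of subunit IDs)."""

    def __init__(self, tokens):
        self.tokens, self.i = tokens, 0

    def peek(self):
        return self.tokens[self.i] if self.i < len(self.tokens) else None

    def take(self, expected=None):
        tok = self.peek()
        if tok is None or (expected is not None and tok != expected):
            raise ValueError(f"expected {expected or 'a token'}, got {tok!r}")
        self.i += 1
        return tok

    def alternation(self):
        result = self.concatenation()
        while self.peek() == "|":
            self.take("|")
            result += self.concatenation()
        return result

    def concatenation(self):
        # The first term must be required; later terms may be optional.
        parts = [self.required_term()]
        while self.peek() == "+":
            self.take("+")
            parts.append(self.optional() if self.peek() == "[" else self.required_term())
        return [sum(combo, ()) for combo in product(*parts)]

    def optional(self):
        self.take("[")
        inner = self.alternation()
        self.take("]")
        return [()] + inner  # assumed meaning: absent, or any inner structure

    def required_term(self):
        if self.peek() == "(":
            self.take("(")
            base = self.alternation()
            self.take(")")
        else:
            tok = self.take()
            if not tok.isalnum():
                raise ValueError(f"expected a subunit ID, got {tok!r}")
            base = [(tok,)]
        if self.peek() == "{":
            self.take("{")
            count = self.take()
            if not count.isdigit():
                raise ValueError(f"expected a number, got {count!r}")
            self.take("}")
            # Assumed meaning: repeat the term `count` times.
            base = [sum(combo, ()) for combo in product(base, repeat=int(count))]
        return base


def expand(definition):
    parser = Expander(tokenize(definition))
    structures = parser.alternation()
    if parser.peek() is not None:
        raise ValueError(f"unexpected token {parser.peek()!r}")
    return [" + ".join(structure) for structure in structures]


for line in expand("S1{2} + [S2 | S3]"):
    print(line)
```

With these assumed semantics, `S1{2} + [S2 | S3]` expands to `S1 + S1`, `S1 + S1 + S2`, and `S1 + S1 + S3`. The output of ppss itself for the same definition may be ordered or formatted differently.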