[![Tests](https://github.com/oxfordmmm/piezo/actions/workflows/tests.yaml/badge.svg)](https://github.com/oxfordmmm/piezo/actions/workflows/tests.yaml)
[![codecov](https://codecov.io/gh/oxfordmmm/piezo/branch/master/graph/badge.svg)](https://codecov.io/gh/oxfordmmm/piezo)
[![Docs](https://github.com/oxfordmmm/piezo/actions/workflows/docs.yaml/badge.svg)](https://oxfordmmm.github.io/piezo/)
[![PyPI version](https://badge.fury.io/py/piezo.svg)](https://badge.fury.io/py/piezo)
# piezo
Predict the effect of a genetic mutation on the action of an antibiotic using a supplied antimicrobial resistance (AMR) catalogue.
This code was developed as part of the [CRyPTIC](http://www.crypticproject.org) international tuberculosis consortium. If you would like to use the software commercially, please consult the LICENCE file.
## Installation
### Using `pip`
This will install the most recent release on PyPI.
```
pip install piezo
```
### From GitHub
This will install the current version from GitHub and therefore may be ahead of the PyPI version.
```
git clone https://github.com/oxfordmmm/piezo
cd piezo
pip install .
```
The prerequisites are all fairly standard and are listed in `setup.cfg`, so they will be installed automatically.
## Documentation
API documentation for developers can be found here: https://oxfordmmm.github.io/piezo/
## Included files
```
$ ls tests/test-catalogue/
NC_004148.2.gbk NC_004148.2_TEST_GM1_RFUS_v1.0.csv
```
NC_004148 is the reference genome of the human metapneumovirus and is used primarily for unit testing since it is small and fast to parse.
## Design of AMR catalogue
`piezo` is written so as to be extendable in the future to other ways of describing genetic variation with respect to a reference. It includes the concept of a `grammar` which specifies how the genetic variation is described.
At present only a single grammar, `GARC1`, is supported. `GARC` is short for Grammar for Antimicrobial Resistance Catalogues. This grammar is described in more detail [elsewhere](http://fowlerlab.org/2018/11/25/goarc-a-general-ontology-for-antimicrobial-resistance-catalogues/); in brief, it takes a gene-centric view and therefore has no way of describing genetic variation that lies outside a coding region, other than as a 'promoter' mutation.

All mutations start with the gene (or locus) name, which must match the name of a gene (or locus) in the relevant GenBank file. It is the user's responsibility to ensure this, although the `gumpy` package, for example, can be used to perform such sanity checks. The mutation is separated from the gene by an `@` symbol, and within the mutation `_` is used as a field separator between the different components.

All variation is described as either a `SNP` or an `INDEL`. `SNP`s within a coding region are specified by their effect on the amino acids, which are always in UPPERCASE, e.g. `rpoB@S450L`. `SNP`s in the assumed promoter region are specified by the nucleotide change and position, e.g. `fabG1@c-15t`. Nucleotides are always in lowercase. `INDEL`s can be specified at different levels of granularity: `rpoB@1250_indel` means 'any insertion or deletion at this position', whereas the more specific `rpoB@1250_ins_cta` means 'an insertion of cta at this position'. There is also the special case of frameshifting mutations, which are described by `fs`.
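The notation described above can be illustrated with a short parser. This is only a sketch for the examples given in this README, not the parser `piezo` itself uses: the function name `parse_mutation`, the regular expressions and the returned dictionary keys are all hypothetical, and the `del` keyword is assumed by analogy with `ins`.

```python
import re

# Hypothetical patterns covering only the examples in this README.
CODING_SNP = re.compile(r"^([A-Z!])(\d+)([A-Z!])$")        # e.g. S450L
PROMOTER_SNP = re.compile(r"^([acgt])(-\d+)([acgtzx])$")   # e.g. c-15t
# 'del' is an assumption; the README only shows 'indel', 'ins' and 'fs'.
INDEL = re.compile(r"^(-?\d+)_(indel|ins|del|fs)(?:_(\w+))?$")


def parse_mutation(text):
    """Split 'gene@mutation' and classify the mutation as a SNP or an INDEL."""
    gene, sep, mutation = text.partition("@")
    if not (gene and sep and mutation):
        raise ValueError(f"expected 'gene@mutation', got {text!r}")

    if m := CODING_SNP.match(mutation):
        return {"gene": gene, "type": "SNP", "region": "coding",
                "ref": m[1], "position": int(m[2]), "alt": m[3]}
    if m := PROMOTER_SNP.match(mutation):
        return {"gene": gene, "type": "SNP", "region": "promoter",
                "ref": m[1], "position": int(m[2]), "alt": m[3]}
    if m := INDEL.match(mutation):
        # The optional third field is a length or a sequence, e.g. '3' or 'cta'.
        return {"gene": gene, "type": "INDEL", "position": int(m[1]),
                "kind": m[2], "detail": m[3]}
    raise ValueError(f"unrecognised mutation {mutation!r}")


if __name__ == "__main__":
    for example in ("rpoB@S450L", "fabG1@c-15t", "rpoB@1250_indel", "rpoB@1250_ins_cta"):
        print(example, parse_mutation(example))
```

Keeping the amino acid and nucleotide patterns separate mirrors the case convention in the grammar: uppercase for residues, lowercase for bases.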
Wildcards are also supported. Hence `rpoB@*?` means 'any non-synonymous mutation in the coding region of the protein'. To avoid confusion, the stop codon is represented by `!`, which is non-standard. Het calls are, at present, represented by a `Z` or `z` depending on whether they occur in the coding or promoter regions. This may be extended in the future. Likewise, null calls are represented by an `X` or `x`.
The general principle is that each mutation can 'hit' multiple rules in the catalogue, but it is the most specific rule that is followed. Consider a toy example, again from TB:
```
rpoB@*? RIF U any non-synonymous mutation in the coding region has an unknown effect on RIF
rpoB@S450? RIF R any non-synonymous mutation at Ser450 confers resistance
rpoB@S450Z RIF F a het call at Ser450 should be reported as an F (fail).
```
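The 'most specific rule wins' principle can be sketched in a few lines of Python using the three toy rules above. This is an illustration rather than the logic `piezo` actually implements: the function names, the numeric specificity scores, the choice to let `?` match a het call, and the default of `S` when no rule is hit are all assumptions made for this example.

```python
import re

# The three toy rules from the README: (rule, drug, prediction).
RULES = [
    ("rpoB@*?", "RIF", "U"),
    ("rpoB@S450?", "RIF", "R"),
    ("rpoB@S450Z", "RIF", "F"),
]

SNP = re.compile(r"^([A-Z!])(\d+)([A-Z!])$")


def specificity(rule, mutation):
    """Return a score if `rule` covers `mutation` (higher is more specific), else None."""
    rule_gene, _, rule_body = rule.partition("@")
    gene, _, body = mutation.partition("@")
    m = SNP.match(body)
    if rule_gene != gene or m is None:
        return None
    ref, pos, alt = m.groups()
    if ref == alt:
        return None  # synonymous: none of the toy rules apply
    if rule_body == "*?":
        return 0     # any non-synonymous mutation in the gene
    if rule_body == f"{ref}{pos}?":
        return 1     # any non-synonymous mutation at this position
    if rule_body == body:
        return 2     # exact match
    return None


def predict(mutation, drug, rules=RULES, default="S"):
    """Collect every rule the mutation hits and follow the most specific one."""
    hits = []
    for rule, rule_drug, prediction in rules:
        if rule_drug != drug:
            continue
        score = specificity(rule, mutation)
        if score is not None:
            hits.append((score, prediction))
    if not hits:
        return default  # assumption: no hit is reported as susceptible
    return max(hits, key=lambda hit: hit[0])[1]


if __name__ == "__main__":
    for mutation in ("rpoB@D435V", "rpoB@S450L", "rpoB@S450Z"):
        print(mutation, predict(mutation, "RIF"))
```

Here `rpoB@S450Z` hits all three rules, but the exact rule has the highest score, so the het call is reported as `F` rather than `R` or `U`.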
## Example
A demonstration script called `piezo-predict.py` can be found in the `bin/` folder of the repository; following installation it should be in your `$PATH`. A made-up catalogue for testing purposes can be found in `tests/test-catalogue/NC_004148.2_TEST_v1.0_GARC1_RFUS.csv`. It is based on the human metapneumovirus, but the entries are fictitious. It contains two drugs and a series of mutations in the *M2* gene.
```
$ piezo-predict.py --catalogue tests/test-catalogue/NC_004148.2_TEST_v1.0_GARC1_RFUS.csv --mutation M2@L73L
{'DRUG_B': 'S', 'DRUG_A': 'S'}
$ piezo-predict.py --catalogue tests/test-catalogue/NC_004148.2_TEST_v1.0_GARC1_RFUS.csv --mutation M2@L73R
{'DRUG_A': 'R', 'DRUG_B': 'U'}
$ piezo-predict.py --catalogue tests/test-catalogue/NC_004148.2_TEST_v1.0_GARC1_RFUS.csv --mutation M2@L73Z
{'DRUG_B': 'S', 'DRUG_A': 'F'}
$ piezo-predict.py --catalogue tests/test-catalogue/NC_004148.2_TEST_v1.0_GARC1_RFUS.csv --mutation M2@300_indel
{'DRUG_B': 'U', 'DRUG_A': 'U'}
$ piezo-predict.py --catalogue tests/test-catalogue/NC_004148.2_TEST_v1.0_GARC1_RFUS.csv --mutation M2@300_ins
{'DRUG_B': 'U', 'DRUG_A': 'U'}
$ piezo-predict.py --catalogue tests/test-catalogue/NC_004148.2_TEST_v1.0_GARC1_RFUS.csv --mutation M2@300_ins_2
{'DRUG_B': 'U', 'DRUG_A': 'U'}
$ piezo-predict.py --catalogue tests/test-catalogue/NC_004148.2_TEST_v1.0_GARC1_RFUS.csv --mutation M2@300_ins_3
{'DRUG_A': 'U', 'DRUG_B': 'R'}
$ piezo-predict.py --catalogue tests/test-catalogue/NC_004148.2_TEST_v1.0_GARC1_RFUS.csv --mutation M2@300_ins_4
{'DRUG_B': 'U', 'DRUG_A': 'U'}
$ piezo-predict.py --catalogue tests/test-catalogue/NC_004148.2_TEST_v1.0_GARC1_RFUS.csv --mutation M2@300_ins_cta
{'DRUG_B': 'R', 'DRUG_A': 'U'}
```