gnomonicus

Name	gnomonicus
Version	3.0.7
home_page	https://github.com/oxfordmmm/gnomonicus
Summary	Python code to integrate results of tb-pipeline and provide an antibiogram, mutations and variants
upload_time	2025-02-10 14:55:31
maintainer	None
docs_url	None
author	Philip W Fowler, Jeremy Westhead
requires_python	>=3.10
license	University of Oxford License, see LICENSE
keywords	gnomonicus piezo lodestone clockwork tb
requirements	No requirements were recorded.

[![Tests](https://github.com/oxfordmmm/gnomonicus/actions/workflows/tests.yaml/badge.svg)](https://github.com/oxfordmmm/gnomonicus/actions/workflows/tests.yaml)
[![Build and release Docker](https://github.com/oxfordmmm/gnomonicus/actions/workflows/build.yaml/badge.svg)](https://github.com/oxfordmmm/gnomonicus/actions/workflows/build.yaml) 
[![PyPI version](https://badge.fury.io/py/gnomonicus.svg)](https://badge.fury.io/py/gnomonicus)
[![Docs](https://github.com/oxfordmmm/gnomonicus/actions/workflows/docs.yaml/badge.svg)](https://oxfordmmm.github.io/gnomonicus/)

# gnomonicus
Python code to integrate the results of tb-pipeline and provide an antibiogram, mutations and variants.

Provides a library of functions for use within scripts, as well as a CLI tool that links these functions together to produce output.

## Documentation
The API reference for developers and the CLI instructions can be found at https://oxfordmmm.github.io/gnomonicus/

## Usage
```
usage: gnomonicus [-h] [-v] --vcf_file VCF_FILE --genome_object GENOME_OBJECT [--catalogue_file CATALOGUE_FILE] [--ignore_vcf_filter] [--output_dir OUTPUT_DIR] [--json] [--csvs CSVS [CSVS ...]] [--debug]
                  [--resistance_genes] --min_dp MIN_DP

options:
  -h, --help            show this help message and exit
  -v, --version         show program's version number and exit
  --vcf_file VCF_FILE   the path to a single VCF file
  --genome_object GENOME_OBJECT
                        the path to a genbank file
  --catalogue_file CATALOGUE_FILE
                        the path to the resistance catalogue
  --ignore_vcf_filter   whether to ignore the FILTER field in the vcf (e.g. necessary for some versions of Clockwork VCFs)
  --output_dir OUTPUT_DIR
                        Directory to save output files to. Defaults to wherever the script is run from.
  --json                Flag to create a single JSON output as well as the CSVs
  --csvs CSVS [CSVS ...]
                        Types of CSV to produce. Accepted values are [variants, mutations, effects, predictions, all]. `all` produces all of the CSVs
  --debug               Whether to log debugging messages to the log. Defaults to False
  --resistance_genes    Flag to filter mutations and variants to only include genes present in the resistance catalogue
  --min_dp MIN_DP       Minimum depth for a variant to be considered in the VCF. Below this value, rows are interpreted as null calls.
```
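
As an illustration of how the flags above combine, here is a minimal sketch that assembles a `gnomonicus` command line from Python. The helper name `build_gnomonicus_command` and the input paths (`sample.vcf`, `reference.gbk`, `catalogue.csv`, `results`) are assumptions made for this example and are not part of the package; only the flags listed in the usage text are used.

```python
import shlex
from typing import List, Optional, Sequence


def build_gnomonicus_command(
    vcf_file: str,
    genome_object: str,
    min_dp: int,
    catalogue_file: Optional[str] = None,
    output_dir: Optional[str] = None,
    csvs: Optional[Sequence[str]] = None,
    json_output: bool = False,
    ignore_vcf_filter: bool = False,
    resistance_genes: bool = False,
    debug: bool = False,
) -> List[str]:
    """Hypothetical helper: build an argument list from the documented flags."""
    # --vcf_file, --genome_object and --min_dp are shown as required in the usage text.
    cmd = [
        "gnomonicus",
        "--vcf_file", vcf_file,
        "--genome_object", genome_object,
        "--min_dp", str(min_dp),
    ]
    # Optional arguments that take a value.
    if catalogue_file is not None:
        cmd += ["--catalogue_file", catalogue_file]
    if output_dir is not None:
        cmd += ["--output_dir", output_dir]
    if csvs:
        cmd += ["--csvs", *csvs]
    # Optional boolean flags.
    if json_output:
        cmd.append("--json")
    if ignore_vcf_filter:
        cmd.append("--ignore_vcf_filter")
    if resistance_genes:
        cmd.append("--resistance_genes")
    if debug:
        cmd.append("--debug")
    return cmd


# Hypothetical input paths, for illustration only.
example = build_gnomonicus_command(
    "sample.vcf",
    "reference.gbk",
    min_dp=3,
    catalogue_file="catalogue.csv",
    output_dir="results",
    csvs=["all"],
    json_output=True,
)
# Print the command as a single shell-quoted string.
print(shlex.join(example))
```

The resulting list can be passed to `subprocess.run(example, check=True)` once `gnomonicus` is installed and the input files exist.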

## Install
Install the latest release with pip:
```
pip install gnomonicus
```

Install from source:
```
git clone https://github.com/oxfordmmm/gnomonicus.git
cd gnomonicus
pip install -e .
```
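
To confirm that an install succeeded, a short check along these lines may help. It is a sketch that relies only on the Python standard library; the helper name `installed_gnomonicus_version` is made up for this example, and the Python version check reflects the `>=3.10` requirement stated in the package metadata.

```python
import sys
from importlib.metadata import PackageNotFoundError, version
from typing import Optional


def installed_gnomonicus_version() -> Optional[str]:
    """Return the installed gnomonicus version string, or None if it is not installed."""
    try:
        return version("gnomonicus")
    except PackageNotFoundError:
        return None


# The package metadata declares requires_python >= 3.10.
python_ok = sys.version_info >= (3, 10)
found = installed_gnomonicus_version()

if not python_ok:
    print("Python 3.10 or newer is required")
elif found is None:
    print("gnomonicus is not installed in this environment")
else:
    print(f"gnomonicus {found} is installed")
```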

## Docker
A Docker image should be built on releases. To open a shell with gnomonicus installed:
```
docker run -it oxfordmmm/gnomonicus:latest
```

## Notes
When generating mutations, a synonymous amino acid mutation is reported together with the nucleotide changes that underlie it. This can lead to a mix of nucleotide and amino acid mutations for coding genes, but the nucleotide mutations are excluded from generating effects unless they are specified in the catalogue. This means that the default rule `gene@*= --> S` still applies despite the introduced `gene@*?`, which would otherwise take precedence. For example:
```
  'MUTATIONS': [
      {
          'MUTATION': 'F2F',
          'GENE': 'S',
          'GENE_POSITION': 2
      },
      {
          'MUTATION': 't6c',
          'GENE': 'S',
          'GENE_POSITION': 6
      },
  ],
  'EFFECTS': {
      'AAA': [
          {
              'GENE': 'S',
              'MUTATION': 'F2F',
              'PREDICTION': 'S'
          },
          {
              'PHENOTYPE': 'S'
          }
      ],
  }
```
The nucleotide variation is included in the `MUTATIONS`, but explicitly removed from the `EFFECTS` unless it is specified within the catalogue.
For this variation to be included, the catalogue would need to contain a line for `S@F2F&S@t6c`.
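
The filtering rule described above can be sketched as follows. This is an illustration of the behaviour, not the code gnomonicus uses: the function names, the regular expression and the representation of catalogue rules as plain strings are all assumptions, and the sketch assumes every gene in the input is a coding gene with mutations inside the coding region.

```python
import re
from typing import Dict, List, Sequence

# Assumed pattern for a nucleotide substitution such as 't6c':
# lowercase reference base, positive position, lowercase alternate base.
NUCLEOTIDE_SNP = re.compile(r"^[acgt]\d+[acgt]$")


def is_nucleotide_snp(mutation: str) -> bool:
    """True for nucleotide-level substitutions, False for amino acid mutations like 'F2F'."""
    return NUCLEOTIDE_SNP.match(mutation) is not None


def mutations_eligible_for_effects(
    mutations: Sequence[Dict[str, object]],
    catalogue_rules: Sequence[str],
) -> List[str]:
    """Keep amino acid mutations; keep nucleotide ones only if a catalogue rule names them."""
    # Collect every gene@mutation named in a rule, splitting multi-mutation rules on '&'.
    named = set()
    for rule in catalogue_rules:
        named.update(rule.split("&"))

    eligible = []
    for item in mutations:
        full_name = f"{item['GENE']}@{item['MUTATION']}"
        if is_nucleotide_snp(str(item["MUTATION"])) and full_name not in named:
            continue  # nucleotide change not named in the catalogue: no effect generated
        eligible.append(full_name)
    return eligible


# The two mutations from the example above.
mutations = [
    {"MUTATION": "F2F", "GENE": "S", "GENE_POSITION": 2},
    {"MUTATION": "t6c", "GENE": "S", "GENE_POSITION": 6},
]

without_rule = mutations_eligible_for_effects(mutations, [])
with_rule = mutations_eligible_for_effects(mutations, ["S@F2F&S@t6c"])
print(without_rule)
print(with_rule)
```

With an empty rule list only `S@F2F` remains, so the default synonymous rule decides the prediction; with the `S@F2F&S@t6c` rule both mutations are kept.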

## User stories

1. As a bioinformatician, I want to be able to run `gnomonicus` on the command line, passing it (i) a GenBank file ~~(or pickled `gumpy.Genome` object)~~, (ii) a resistance catalogue and (iii) a VCF file, and get back `pandas.DataFrames` of the genetic variants, mutations, effects and predictions/antibiogram. The latter is for all the drugs described in the passed resistance catalogue.

2. As a GPAS developer, I want to be able to embed `gnomonicus` in a Docker image/NextFlow pipeline that consumes the outputs of [tb-pipeline](https://github.com/Pathogen-Genomics-Cymru/tb-pipeline) and emits a structured, well-designed `JSON` object describing the genetic variants, mutations, effects and predictions/antibiogram.

3. In general, I would also like the option to output fixed- and variable-length FASTA files (the latter takes into account insertions and deletions described in any input VCF file).

## Unit testing

For speed, rather than use NC_000962.3 (i.e. H37Rv *M. tuberculosis*), the tests use SARS-CoV-2 with a fictitious drug resistance catalogue, along with some `vcf` files and the expected outputs in `tests/`.

The tests can be run with `pytest -vv`.

Raw data

            {
    "_id": null,
    "home_page": "https://github.com/oxfordmmm/gnomonicus",
    "name": "gnomonicus",
    "maintainer": null,
    "docs_url": null,
    "requires_python": ">=3.10",
    "maintainer_email": null,
    "keywords": "gnomonicus, piezo, lodestone, clockwork, TB",
    "author": "Philip W Fowler, Jeremy Westhead",
    "author_email": "philip.fowler@ndm.ox.ac.uk",
    "download_url": "https://files.pythonhosted.org/packages/df/82/3de65ea4fc7bd757e149a36f22a8ecf843a414a592a9d8b61c7ab23adb2f/gnomonicus-3.0.7.tar.gz",
    "platform": null,
    "bugtrack_url": null,
    "license": "University of Oxford License, see LICENSE",
    "summary": "Python code to integrate results of tb-pipeline and provide an antibiogram, mutations and variants",
    "version": "3.0.7",
    "project_urls": {
        "Homepage": "https://github.com/oxfordmmm/gnomonicus"
    },
    "split_keywords": [
        "gnomonicus",
        " piezo",
        " lodestone",
        " clockwork",
        " tb"
    ],
    "urls": [
        {
            "comment_text": null,
            "digests": {
                "blake2b_256": "e6e8a61989053540d33bc3129403a86f04288d398fb81434129d19caf739b87e",
                "md5": "7999eab114ca461688ec529440afbe63",
                "sha256": "56aa2f0a9a9c92b88b966b6c35890a599c16f87afc85d25e716ff7952bea0875"
            },
            "downloads": -1,
            "filename": "gnomonicus-3.0.7-py3-none-any.whl",
            "has_sig": false,
            "md5_digest": "7999eab114ca461688ec529440afbe63",
            "packagetype": "bdist_wheel",
            "python_version": "py3",
            "requires_python": ">=3.10",
            "size": 23325,
            "upload_time": "2025-02-10T14:55:29",
            "upload_time_iso_8601": "2025-02-10T14:55:29.524435Z",
            "url": "https://files.pythonhosted.org/packages/e6/e8/a61989053540d33bc3129403a86f04288d398fb81434129d19caf739b87e/gnomonicus-3.0.7-py3-none-any.whl",
            "yanked": false,
            "yanked_reason": null
        },
        {
            "comment_text": null,
            "digests": {
                "blake2b_256": "df823de65ea4fc7bd757e149a36f22a8ecf843a414a592a9d8b61c7ab23adb2f",
                "md5": "75c69c7c19d81726e3aeb16963630d1b",
                "sha256": "c5b2f96850abe79d81496eeada45dcf5108e4e2a1cfe297d9d6ecbadc0616b89"
            },
            "downloads": -1,
            "filename": "gnomonicus-3.0.7.tar.gz",
            "has_sig": false,
            "md5_digest": "75c69c7c19d81726e3aeb16963630d1b",
            "packagetype": "sdist",
            "python_version": "source",
            "requires_python": ">=3.10",
            "size": 23278,
            "upload_time": "2025-02-10T14:55:31",
            "upload_time_iso_8601": "2025-02-10T14:55:31.493538Z",
            "url": "https://files.pythonhosted.org/packages/df/82/3de65ea4fc7bd757e149a36f22a8ecf843a414a592a9d8b61c7ab23adb2f/gnomonicus-3.0.7.tar.gz",
            "yanked": false,
            "yanked_reason": null
        }
    ],
    "upload_time": "2025-02-10 14:55:31",
    "github": true,
    "gitlab": false,
    "bitbucket": false,
    "codeberg": false,
    "github_user": "oxfordmmm",
    "github_project": "gnomonicus",
    "travis_ci": false,
    "coveralls": false,
    "github_actions": true,
    "lcname": "gnomonicus"
}
