cazy-parser

- Name: cazy-parser
- Version: 2.0.3
- Summary: A way to extract specific information from CAZy
- Upload time: 2023-10-12 10:13:21
- Author: Rodrigo V. Honorato
- Requires Python: >=3.9,<4.0
- License: GPLv3
- Keywords: cazy, database, datamining
- Requirements: No requirements were recorded.

# cazy-parser

_A way to extract specific information from the Carbohydrate-Active enZYmes (CAZy) database._

[![Downloads](https://pepy.tech/badge/cazy-parser)](https://pepy.tech/project/cazy-parser)
[![status](http://joss.theoj.org/papers/f709afe5d720fc6eee82fca277942a46/status.svg)](http://joss.theoj.org/papers/f709afe5d720fc6eee82fca277942a46)
[![unittests](https://github.com/rvhonorato/cazy-parser/actions/workflows/unittests.yml/badge.svg?branch=main)](https://github.com/rvhonorato/cazy-parser/actions/workflows/unittests.yml)
[![Codacy Badge](https://app.codacy.com/project/badge/Grade/33f087332ec24da689268a13d2f4ca23)](https://www.codacy.com/gh/rvhonorato/cazy-parser/dashboard?utm_source=github.com&utm_medium=referral&utm_content=rvhonorato/cazy-parser&utm_campaign=Badge_Grade)
[![Codacy Badge](https://app.codacy.com/project/badge/Coverage/33f087332ec24da689268a13d2f4ca23)](https://www.codacy.com/gh/rvhonorato/cazy-parser/dashboard?utm_source=github.com&utm_medium=referral&utm_content=rvhonorato/cazy-parser&utm_campaign=Badge_Coverage)

Make sure to visit and cite the CAZy website!

- <http://www.cazy.org/>
- Lombard V, Golaconda Ramulu H, Drula E, Coutinho PM,
  Henrissat B (2014) The Carbohydrate-active enzymes database
  (CAZy) in 2013. **Nucleic Acids Res** 42:D490–D495. [PMID: [24270786](http://www.ncbi.nlm.nih.gov/sites/entrez?db=pubmed&cmd=search&term=24270786)].

License: [GNU GPLv3](https://www.gnu.org/licenses/gpl-3.0.html)

RV Honorato. CAZy-parser: a way to extract information from
the Carbohydrate-Active enZYmes Database.
_The Journal of Open Source Software_, 1(8), Dec 2016.
[10.21105/joss.00053](https://github.com/openjournals/joss-papers/blob/master/joss.00053/10.21105.joss.00053.pdf)

## Introduction

_cazy-parser_ is a tool that extracts information from
[CAZy](http://www.cazy.org/) in a more usable and readable format.
First, a script reads the HTML structure and creates a mirror of the
database as a tab-delimited file. Second, information is extracted from
the database according to user-supplied parameters and presented to the user
as a set of accession codes.
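
The second step can be pictured as a filter over a tab-delimited table. The sketch below is an illustration only: the column names (`class`, `family`, `subfamily`, `accession`), the sample rows, and the `select_accessions` helper are assumptions made for this example and are not taken from cazy-parser's actual file layout or code.

```python
import csv
import io

# Hypothetical tab-delimited mirror; column names and rows are invented
# for illustration and do not reflect cazy-parser's real file layout.
SAMPLE_TSV = (
    "class\tfamily\tsubfamily\taccession\n"
    "GH\t43\t1\tACC0001\n"
    "GH\t43\t2\tACC0002\n"
    "GT\t2\t\tACC0003\n"
)


def select_accessions(handle, enzyme_class, family=None, subfamily=None):
    """Return accession codes from rows matching the given filters."""
    reader = csv.DictReader(handle, delimiter="\t")
    selected = []
    for row in reader:
        if row["class"] != enzyme_class:
            continue
        # Filters left as None are not applied.
        if family is not None and row["family"] != str(family):
            continue
        if subfamily is not None and row["subfamily"] != str(subfamily):
            continue
        selected.append(row["accession"])
    return selected


if __name__ == "__main__":
    handle = io.StringIO(SAMPLE_TSV)
    print(select_accessions(handle, "GH", family=43, subfamily=1))
```

Leaving `family` or `subfamily` unset widens the selection, which mirrors how the command-line options are optional.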

## Install / Upgrade

```text
pip install --upgrade cazy-parser
```

## Usage (internet connection required)

```text
cazy-parser -h
usage: cazy-parser [-h] [-f FAMILY] [-s SUBFAMILY] [-c CHARACTERIZED] [-v] {GH,GT,PL,CA,AA}

positional arguments:
  {GH,GT,PL,CA,AA}

optional arguments:
  -h, --help            show this help message and exit
  -f FAMILY, --family FAMILY
  -s SUBFAMILY, --subfamily SUBFAMILY
  -c CHARACTERIZED, --characterized CHARACTERIZED
  -v, --version         show version
```
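
For scripted use, the command line can be assembled from Python. This is a sketch that relies only on the options listed in the help text above; `build_command` is a hypothetical helper, not part of cazy-parser, and the `-c` option is left out because its expected value is not described here.

```python
import shlex

# Enzyme classes accepted as the positional argument, per the help text.
ENZYME_CLASSES = ("GH", "GT", "PL", "CA", "AA")


def build_command(enzyme_class, family=None, subfamily=None):
    """Assemble a cazy-parser argument list (hypothetical helper)."""
    if enzyme_class not in ENZYME_CLASSES:
        raise ValueError(f"unknown enzyme class: {enzyme_class!r}")
    command = ["cazy-parser", enzyme_class]
    if family is not None:
        command += ["-f", str(family)]
    if subfamily is not None:
        command += ["-s", str(subfamily)]
    return command


if __name__ == "__main__":
    # Only prints the command; nothing is executed here.
    print(shlex.join(build_command("GH", family=43, subfamily=1)))
```

The resulting list can be passed to `subprocess.run(command, check=True)` on a machine where cazy-parser is installed and an internet connection is available.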

### Example

Extract all FASTA sequences from subfamily 1 of Glycoside Hydrolase family 43:

```text
$ cazy-parser GH -f 43 -s 1
 [2022-05-26 16:39:21,511 91 INFO] ------------------------------------------
 [2022-05-26 16:39:21,511 92 INFO]
 [2022-05-26 16:39:21,511 93 INFO] ┌─┐┌─┐┌─┐┬ ┬   ┌─┐┌─┐┬─┐┌─┐┌─┐┬─┐
 [2022-05-26 16:39:21,511 94 INFO] │  ├─┤┌─┘└┬┘───├─┘├─┤├┬┘└─┐├┤ ├┬┘
 [2022-05-26 16:39:21,511 95 INFO] └─┘┴ ┴└─┘ ┴    ┴  ┴ ┴┴└─└─┘└─┘┴└─ v2.0.1
 [2022-05-26 16:39:21,511 96 INFO]
 [2022-05-26 16:39:21,511 97 INFO] ------------------------------------------
 [2022-05-26 16:39:21,511 183 INFO] Fetching links for Glycoside-Hydrolases, url: http://www.cazy.org/Glycoside-Hydrolases.html
 [2022-05-26 16:39:22,454 189 INFO] Only using links of family 43 subfamily 1
 [2022-05-26 16:39:23,029 26 INFO] Dowloading 1415 fasta sequences...
 [2022-05-26 16:40:32,187 51 INFO] Dumping fasta sequences to file GH43_1_26052022.fasta
```

This generates the file `GH43_1_DDMMYYYY.fasta`, where `DDMMYYYY` is the
date of the run, containing the FASTA sequences.
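
The output file can be read with a few lines of Python. The sketch below assumes the standard FASTA layout (a `>` header line followed by one or more sequence lines) and that the date suffix follows the day-month-year pattern seen in the log above. `expected_output_name` and `read_fasta` are hypothetical helpers, not part of cazy-parser, and the sample records are invented.

```python
import io
from datetime import date


def expected_output_name(enzyme_class, family, subfamily, day):
    """Build a name like GH43_1_26052022.fasta (pattern inferred from the log)."""
    return f"{enzyme_class}{family}_{subfamily}_{day:%d%m%Y}.fasta"


def read_fasta(handle):
    """Yield (header, sequence) tuples from a FASTA-formatted text handle."""
    header, chunks = None, []
    for line in handle:
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            # A new header closes the previous record, if there was one.
            if header is not None:
                yield header, "".join(chunks)
            header, chunks = line[1:], []
        elif header is not None:
            chunks.append(line)
    if header is not None:
        yield header, "".join(chunks)


if __name__ == "__main__":
    print(expected_output_name("GH", 43, 1, date(2022, 5, 26)))
    # Invented records standing in for the downloaded file.
    sample = io.StringIO(">seq1 invented example\nMKT\nAYI\n>seq2\nGGS\n")
    for header, sequence in read_fasta(sample):
        print(header, len(sequence))
```

To read a real download, replace the `io.StringIO` sample with `open(...)` on the generated file.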

## To-do and how to contribute

Please refer to [CONTRIBUTING](CONTRIBUTING.md) 🤓

            
