# cvmcgmlst


cvmcgmlst is a tool for core genome MLST (cgMLST) analysis, built on [cvmmlst](https://github.com/hbucqp/cvmmlst).
```shell
Usage: cvmcgmlst -i <genome assemble directory> -o <output_directory> -db database_name
Author: Qingpo Cui(SZQ Lab, China Agricultural University)
options:
-h, --help show this help message and exit
-i I <input_file>: the PATH of assembled genome file
-db DB <database_path>: name of cgMLST database
-o O <output_directory>: output PATH
-minid MINID <minimum threshold of identity>, default=95
-mincov MINCOV <minimum threshold of coverage>, default=90
-t T <number of threads>: default=8
-v, --version Display version
cvmcgmlst subcommand:
{show_db,init,create_db}
show_db <show the list of all available database>
init <initialize the reference database>
create_db <add custome database, use cvmcgmlst createdb -h for help>
```
## Installation
### Using pip
```shell
pip3 install cvmcgmlst
```
## Dependency
- BLAST+ >2.7.0
**The BLAST+ executables must be on your PATH.**
## Blast installation
### Windows
Follow this tutorial:
[Add BLAST to your Windows PATH](http://82.157.185.121:22300/shares/BevQrP0j8EXn76p7CwfheA)
### Linux/Mac
The easiest way to install BLAST is with conda:
```
conda install -c bioconda blast
```
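After installing, it can help to confirm that the BLAST+ executables are reachable from your shell. The sketch below uses a hypothetical helper, `have_tools`, that is not part of cvmcgmlst. `blastn` and `makeblastdb` both ship with BLAST+, but that cvmcgmlst calls exactly these two programs is an assumption.

```shell
# Report whether each named executable can be found on the PATH.
# Returns 0 only if all of them are found.
have_tools() {
    missing=0
    for tool in "$@"; do
        if command -v "$tool" >/dev/null 2>&1; then
            echo "found: $tool"
        else
            echo "missing: $tool"
            missing=1
        fi
    done
    return "$missing"
}

# Assumption: cvmcgmlst relies on blastn and makeblastdb from BLAST+.
if have_tools blastn makeblastdb; then
    blastn -version   # compare the reported version against the requirement above
fi
```

If either program is reported as missing, revisit the installation step or your PATH settings before running cvmcgmlst.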
## Usage
### 1. Create a reference cgMLST database
You can create your own core genome database. All you need is a FASTA file of nucleotide sequences. Each sequence ID should have the format `>locus_allelenumber`, where **locus** is the locus name and **allelenumber** is the number of that allele.
The curated core genome FASTA file should look like this:
```shell
>GBAA_RS00015_1
TTGGAAAACATCTCTGATTTATGGAACAGCGCCTTAAAAGAACTCGAAAAAAAGGTCAGT
AAACCAAGTTATGAAACATGGTTAAAATCAACAACCGCACATAATTTAAAGAAAGATGTA
AAGTCAGTTGCCTTTCCTCGCCAAATTGCAATGTATTTGTCACGCGAACTGACAGATTCC
TCCTTACCTAAAATAGGTGAAGAATTTGGTGGACGTGATCATACAACCGTTATCCATGCC
CATGAAAAAATTTCTAAGCTACTTAAGACGGATACGCAATTACAAAAACAAGTTGAAGAA
ATTAACGATATTTTAAAGTAG
>GBAA_RS00015_2
TTGGAAAACATCTCTGATTTATGGAACAGCGCCTTAAAAGAACTCGAAAAAAAGGTCAGT
AAACCAAGTTATGAAACATGGTTAAAATCAACAACCGCACATAATTTAAAGAAAGATGTA
TTAACAATTACGGCTCCAAATGAATTCGCCCGTGATTGGTTAGAATCTCATTATTCAGAG
CTAATTTCGGAAACACTTTATGATTTAACGGGGGCAAAATTAGCTATTCGCTTTATTATT
GCTAAAGCATATAATCCCCTCTTTATTTATGGGGGAGTTGGACTTGGAAAAACCCATTTA
>GBAA_RS00015_3
ATGCTTTATATCGCAAATCAAATCGATTCAAATATTCGTGAACTAGAAGGTGCACTCATC
CGCGTTGTAGCTTATTCATCTTTAATTAACAAGGATATTAATGCTGATTTAGCAGCTGAA
AAAGCTGTTGGAGATGTTTATCAAGTAAAATTAGAAGATTTCAAGGCGAAAAAGCGCACA
AAGTCAGTTGCCTTTCCTCGCCAAATTGCAATGTATTTGTCACGCGAACTGACAGATTCC
CATGAAAAAATTTCTAAGCTACTTAAGACGGATACGCAATTACAAAAACAAGTTGAAGAA
ATTAACGATATTTTAAAGTAG
...
```
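Before building a database, you may want to check that every header follows the `>locus_allelenumber` format. The following is a minimal sketch using `awk`. The helper names `check_headers` and `count_alleles` are hypothetical and not part of cvmcgmlst, and the check assumes each header line contains only the ID, with no trailing description or whitespace.

```shell
# Print every FASTA header that does not end in "_" followed by digits.
# Returns status 1 if any such header is found, 0 otherwise.
check_headers() {
    awk '
        /^>/ && $0 !~ /^>.+_[0-9]+$/ { print FILENAME ":" FNR ": " $0; bad = 1 }
        END { exit (bad ? 1 : 0) }
    ' "$1"
}

# Count the alleles listed for each locus, where the locus name is
# the header text before the final "_<number>" suffix.
count_alleles() {
    awk '
        /^>/ {
            name = substr($0, 2)          # drop the leading ">"
            sub(/_[0-9]+$/, "", name)     # strip the allele number
            n[name]++
        }
        END { for (locus in n) print locus "\t" n[locus] }
    ' "$1" | sort
}
```

For example, `check_headers YOUR_REF.fasta && count_alleles YOUR_REF.fasta` lists the allele count per locus only when all headers pass the check.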
After installation, first build the reference database from your FASTA file with the following command, where `YOUR_REF.fasta` is the curated file and `DBNAME` is the name you choose for the database:
```shell
cvmcgmlst create_db -file YOUR_REF.fasta -name DBNAME
```
### 2. Show available databases
You can list all available databases using the `show_db` subcommand.
```shell
cvmcgmlst show_db
```
The command prints the available databases:

|DB_name|No. of seqs|Update_date|
|---|---|---|
|demo|1|2025-02-25|
|DBNAME|Number of locus|Date|
### 3. Run with your genome
```shell
# Single Genome Mode
cvmcgmlst -i /PATH_TO_ASSEMBLED_GENOME/sample.fa -db DBNAME -o PATH_TO_OUTPUT
```
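The command above handles one genome at a time. To process several assemblies, one option is to wrap it in a shell loop. The sketch below is an illustration, not a feature of cvmcgmlst: `run_batch` and the `CVMCGMLST` variable are hypothetical names, the `.fa` extension is assumed, and each sample gets its own output subdirectory because it is not documented here how the tool names its output files.

```shell
# Run cvmcgmlst once per *.fa assembly in a directory.
# Usage: run_batch GENOME_DIR DBNAME OUT_ROOT [extra cvmcgmlst options...]
# Extra options (for example -minid 95 -mincov 90 -t 8) are passed through.
# Set CVMCGMLST to use a different executable name or path.
run_batch() {
    genome_dir=$1
    db_name=$2
    out_root=$3
    shift 3
    for genome in "$genome_dir"/*.fa; do
        [ -e "$genome" ] || continue            # skip if no .fa files matched
        sample=$(basename "$genome" .fa)
        mkdir -p "$out_root/$sample"            # one output directory per sample
        "${CVMCGMLST:-cvmcgmlst}" -i "$genome" -db "$db_name" \
            -o "$out_root/$sample" "$@"
    done
}
```

A call such as `run_batch /PATH_TO_ASSEMBLED_GENOME DBNAME PATH_TO_OUTPUT -t 8` would then run each assembly in turn with 8 threads.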
Raw data
{
"_id": null,
"home_page": "https://github.com/hbucqp/cvmcgmlst",
"name": "cvmcgmlst",
"maintainer": null,
"docs_url": null,
"requires_python": null,
"maintainer_email": null,
"keywords": "wgs, cgmlst",
"author": "Qingpo Cui",
"author_email": "cqp@cau.edu.cn",
"download_url": "https://files.pythonhosted.org/packages/4b/2e/09f893a1a8b8d9890c17dbef52729e2c34cc0c30eb28b9c2cf8fa54c9016/cvmcgmlst-0.2.4.tar.gz",
"platform": "any",
"bugtrack_url": null,
"license": "MIT Licence",
"summary": "cgMLST analysis tool",
"version": "0.2.4",
"project_urls": {
"Homepage": "https://github.com/hbucqp/cvmcgmlst"
},
"split_keywords": [
"wgs",
" cgmlst"
],
"urls": [
{
"comment_text": "",
"digests": {
"blake2b_256": "4c017454c85984e9f7204fa1e2aec54d8246f4db06b9be55ae1489b1059ce8fb",
"md5": "112e785bfc09152ac493c9de6cabe7e4",
"sha256": "d33adf295b5c41016e6a74d8625679112e3c96ea24aad2810340338f5d1d9376"
},
"downloads": -1,
"filename": "cvmcgmlst-0.2.4-py3-none-any.whl",
"has_sig": false,
"md5_digest": "112e785bfc09152ac493c9de6cabe7e4",
"packagetype": "bdist_wheel",
"python_version": "py3",
"requires_python": null,
"size": 8110,
"upload_time": "2025-02-26T06:13:05",
"upload_time_iso_8601": "2025-02-26T06:13:05.990860Z",
"url": "https://files.pythonhosted.org/packages/4c/01/7454c85984e9f7204fa1e2aec54d8246f4db06b9be55ae1489b1059ce8fb/cvmcgmlst-0.2.4-py3-none-any.whl",
"yanked": false,
"yanked_reason": null
},
{
"comment_text": "",
"digests": {
"blake2b_256": "4b2e09f893a1a8b8d9890c17dbef52729e2c34cc0c30eb28b9c2cf8fa54c9016",
"md5": "feb9e2dd1344c5ca6d9c21edd47f2d18",
"sha256": "65b04717a1ad2ec738cde422c25e60433a5f29ce1715eccd7f3eb983b3526187"
},
"downloads": -1,
"filename": "cvmcgmlst-0.2.4.tar.gz",
"has_sig": false,
"md5_digest": "feb9e2dd1344c5ca6d9c21edd47f2d18",
"packagetype": "sdist",
"python_version": "source",
"requires_python": null,
"size": 6648,
"upload_time": "2025-02-26T06:13:07",
"upload_time_iso_8601": "2025-02-26T06:13:07.803029Z",
"url": "https://files.pythonhosted.org/packages/4b/2e/09f893a1a8b8d9890c17dbef52729e2c34cc0c30eb28b9c2cf8fa54c9016/cvmcgmlst-0.2.4.tar.gz",
"yanked": false,
"yanked_reason": null
}
],
"upload_time": "2025-02-26 06:13:07",
"github": true,
"gitlab": false,
"bitbucket": false,
"codeberg": false,
"github_user": "hbucqp",
"github_project": "cvmcgmlst",
"travis_ci": false,
"coveralls": false,
"github_actions": false,
"requirements": [
{
"name": "Bio",
"specs": [
[
"==",
"1.7.1"
]
]
},
{
"name": "cvmblaster",
"specs": [
[
"==",
"0.4.4"
]
]
},
{
"name": "cvmcore",
"specs": [
[
"==",
"0.1.9"
]
]
},
{
"name": "pandas",
"specs": [
[
"==",
"1.4.0"
]
]
},
{
"name": "setuptools",
"specs": [
[
"==",
"58.1.0"
]
]
},
{
"name": "tabulate",
"specs": [
[
"==",
"0.9.0"
]
]
}
],
"lcname": "cvmcgmlst"
}