# ERVdetective: an efficient pipeline for identification and annotation of endogenous retroviruses (ERVs)
![](https://img.shields.io/badge/System-Windows/Linux/MacOS-green.svg)
![](https://img.shields.io/pypi/pyversions/ervdetective)
![](https://img.shields.io/pypi/wheel/ervdetective)
![](https://img.shields.io/pypi/dm/ervdetective)
## 1. Download and install
ervdetective is a command-line program written in ```Python 3```. It can be installed in any of the following ways.
### 1.1. pip method
ervdetective is published on ```PyPI``` (https://pypi.org/project/ervdetective/) and can be installed with ```pip```.
First, download and install ```Python3``` (https://www.python.org/) together with ```pip```, then run:
```
pip install ervdetective
ervdetective -h
```
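The installation can also be checked from inside Python. The following is a minimal sketch that uses only the standard-library ```importlib.metadata``` module; the helper name ```installed_version``` is hypothetical and is not part of ervdetective.

```python
from importlib.metadata import PackageNotFoundError, version
from typing import Optional


def installed_version(package: str = "ervdetective") -> Optional[str]:
    """Return the installed version of `package`, or None if it is absent.

    Hypothetical helper for illustration; not provided by ervdetective.
    """
    try:
        return version(package)
    except PackageNotFoundError:
        return None


if __name__ == "__main__":
    found = installed_version()
    if found is None:
        print("ervdetective is not installed in this environment")
    else:
        print(f"ervdetective {found} is installed")
```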
### 1.2. Or local installation
ervdetective can also be installed manually using the file ```setup.py```.
First, download this repository, then run:
```
python setup.py install
ervdetective -h
```
### 1.3. Or run the source code directly
You can also run the source code of ervdetective directly, without installing it. First, download this repository, then install the Python packages that ervdetective requires:
```
pip install -r requirements.txt -i https://pypi.tuna.tsinghua.edu.cn/simple
```
Finally, run ervdetective through the file ```main.py```. The help documentation can be viewed with ```python main.py -h```.
## 2. Software dependencies
```ervdetective``` relies on the following software:
+ [blast](https://ftp.ncbi.nlm.nih.gov/blast/executables/blast+/) (version >=2.9.0+), which must provide ```makeblastdb``` and ```tblastn```.
+ [genometools](http://genometools.org) (version >=1.6.1), which must provide ```ltrharvest```.
+ [hmmer](http://hmmer.org/) (version >=3.0), which must provide ```hmmpress``` and ```hmmscan```.
<b>Note</b>: these dependencies must be installed and added to the system (or user) environment variables beforehand, because ervdetective calls them directly from the environment.
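Before starting a long run, it can help to confirm that the required executables are reachable from the environment. The sketch below uses the standard-library ```shutil.which```; the function name ```missing_tools``` is hypothetical, and the list assumes that ```ltrharvest``` is reached through the GenomeTools ```gt``` binary (GenomeTools ships it as a subcommand). How ervdetective itself invokes each tool is not shown in this document.

```python
import shutil
from typing import Callable, Iterable, List, Optional

# Executables named in the dependency list above.
# Assumption: ltrharvest is run as `gt ltrharvest`, so `gt` is what must be on PATH.
REQUIRED_TOOLS = ("makeblastdb", "tblastn", "gt", "hmmpress", "hmmscan")


def missing_tools(
    tools: Iterable[str] = REQUIRED_TOOLS,
    which: Callable[[str], Optional[str]] = shutil.which,
) -> List[str]:
    """Return the names in `tools` that cannot be found on PATH.

    `which` is injectable so the check can be exercised without the real tools.
    """
    return [name for name in tools if which(name) is None]


if __name__ == "__main__":
    absent = missing_tools()
    if absent:
        print("Not found on PATH: " + ", ".join(absent))
    else:
        print("All required executables were found on PATH")
```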
## 3. Getting help
The help documentation can be displayed by entering ```ervdetective -h``` or ```ervdetective --help```.
| Parameter | Description |
| --- | --- |
|-h, --help | show this help message and exit|
|-i HOST | The file-path of host genome sequence, the suffix is generally *.fna, *.fas, *.fasta.|
|-eb EBLAST | Specify threshold of e-value for BLAST search, default: 1e-5.|
|-f FLANK | The length of extended flank sequence on either side of the blast hit-site, default: 15000.|
|-l1 MINLTR | Specify minimum length of LTR, default: 100.|
|-l2 MAXLTR | Specify maximum length of LTR, default: 1000.|
|-s LTRSIMILAR | Specify threshold(%) of the similarity of paired LTRs, default: 80.|
|-d1 MINDISTLTR | The minimum interval of paired-LTRs start-positions, default: 1000.|
|-d2 MAXDISTLTR | The maximum interval of paired-LTRs start-positions, default: 15000.|
|-t1 MINTSD | The minimum length for each TSD site, default: 4.|
|-t2 MAXTSD | The maximum length for each TSD site, default: 6.|
|-motif MOTIF | Specify start-motif (2 nucleotides) and end-motif (2 nucleotides), default string: TGCA.|
|-mis MISMOTIF | The maximum number of mismatched nucleotides in the motif, default: 1.|
|-ed EHMMER | The e-value threshold used for the HMMER search, default: 1e-6.|
|-n THREAD | The number of threads used, default: 1.|
|-p PREFIX | The prefix of the output files, default: 'host'.|
|-o OUTPUT | The path of output folder to store all the results.|
|--gag GAG_LENGTH | The threshold of length of GAG protein in HMMER search, default: 250 aa.|
|--pro PRO_LENGTH | The threshold of length of PRO protein in HMMER search, default: 50 aa.|
|--rt RT_LENGTH | The threshold of length of RT protein in HMMER search, default: 150 aa.|
|--rh RNASEH_LENGTH | The threshold of length of RNaseH protein in HMMER search, default: 65 aa.|
|--int INT_LENGTH | The threshold of length of INT protein in HMMER search, default: 150 aa.|
|--env ENV_LENGTH | The threshold of length of ENV protein in HMMER search, default: 250 aa.|
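Several parameters in the table come in minimum/maximum pairs, and the motif is described as two start nucleotides followed by two end nucleotides. The following sketch shows one way to sanity-check a set of values before launching a run. It is not part of ervdetective: the function ```check_parameters``` and the lowercase key names are hypothetical, and only the default values are taken from the table above.

```python
from typing import Dict, List, Union

# Defaults copied from the parameter table above; key names are hypothetical.
DEFAULTS: Dict[str, Union[int, str]] = {
    "minltr": 100,
    "maxltr": 1000,
    "mindistltr": 1000,
    "maxdistltr": 15000,
    "mintsd": 4,
    "maxtsd": 6,
    "motif": "TGCA",
}

# Each pair is (minimum, maximum) and should satisfy minimum <= maximum.
RANGE_PAIRS = (
    ("minltr", "maxltr"),
    ("mindistltr", "maxdistltr"),
    ("mintsd", "maxtsd"),
)


def check_parameters(params: Dict[str, Union[int, str]]) -> List[str]:
    """Return a list of human-readable problems; an empty list means none found."""
    merged = {**DEFAULTS, **params}
    problems: List[str] = []
    for low, high in RANGE_PAIRS:
        if merged[low] > merged[high]:
            problems.append(f"{low} ({merged[low]}) exceeds {high} ({merged[high]})")
    motif = str(merged["motif"]).upper()
    # The table describes the motif as 2 start + 2 end nucleotides.
    if len(motif) != 4 or set(motif) - set("ACGT"):
        problems.append(f"motif {merged['motif']!r} is not 4 nucleotides")
    return problems


if __name__ == "__main__":
    for message in check_parameters({"minltr": 2000, "motif": "TGC"}):
        print(message)
```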
## 4. Example of usage
This example uses the genome of the bat *Myotis myotis* (GCA_004026985.1), which was downloaded from https://www.ncbi.nlm.nih.gov/datasets/taxonomy/51298/
Then, run:
```
ervdetective -i GCA_004026985.1_MyoMyo_v1_BIUU_genomic.fna -p myotis_myotis -o output
```
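When many genomes are processed, the same command can be assembled from a script. The sketch below builds the argument list for the example above and accepts extra flags from the parameter table; the function ```build_command``` is hypothetical and is not provided by ervdetective. The command is only printed here; passing the list to ```subprocess.run(cmd, check=True)``` would execute it.

```python
import shlex
from typing import Dict, List, Optional, Union


def build_command(
    genome: str,
    output: str,
    prefix: str = "host",
    options: Optional[Dict[str, Union[int, float, str]]] = None,
) -> List[str]:
    """Assemble an ervdetective command line as an argument list.

    `options` maps flags from the parameter table (for example "-n" or "-eb")
    to their values; flags are passed through unchanged, without validation.
    """
    cmd = ["ervdetective", "-i", genome, "-p", prefix, "-o", output]
    for flag, value in (options or {}).items():
        cmd += [flag, str(value)]
    return cmd


if __name__ == "__main__":
    cmd = build_command(
        "GCA_004026985.1_MyoMyo_v1_BIUU_genomic.fna",
        "output",
        prefix="myotis_myotis",
        options={"-n": 8},
    )
    # Print a shell-safe rendering of the command instead of running it.
    print(shlex.join(cmd))
```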
Raw data
{
"_id": null,
"home_page": "https://github.com/ZhijianZhou01/ervdetective",
"name": "ervdetective",
"maintainer": null,
"docs_url": null,
"requires_python": null,
"maintainer_email": null,
"keywords": "endogenous retroviruses, virus, evolution",
"author": "Zhi-Jian Zhou",
"author_email": "zjzhou@hnu.edu.cn",
"download_url": "https://files.pythonhosted.org/packages/51/1c/f82a4ef583a48799e029cfe0d671866cce847c251572afcb8cef93782dde/ervdetective-1.0.8.tar.gz",
"platform": null,
"bugtrack_url": null,
"license": null,
"summary": "An efficient pipeline for identification and annotation of endogenous retroviruses (ERVs)",
"version": "1.0.8",
"project_urls": {
"Homepage": "https://github.com/ZhijianZhou01/ervdetective"
},
"split_keywords": [
"endogenous retroviruses",
" virus",
" evolution"
],
"urls": [
{
"comment_text": "",
"digests": {
"blake2b_256": "6de4d44645cd746b07130ecb86acff485f0851ab426673535c9e9d4b0e349843",
"md5": "2412df0d5c30b1101b14f7c7ae170067",
"sha256": "6c4b543b03e387705637e829ebf38fa480cd5fba3b26bd22fb7c2c84c4a21672"
},
"downloads": -1,
"filename": "ervdetective-1.0.8-py3-none-any.whl",
"has_sig": false,
"md5_digest": "2412df0d5c30b1101b14f7c7ae170067",
"packagetype": "bdist_wheel",
"python_version": "py3",
"requires_python": null,
"size": 1000111,
"upload_time": "2024-06-02T10:00:02",
"upload_time_iso_8601": "2024-06-02T10:00:02.748147Z",
"url": "https://files.pythonhosted.org/packages/6d/e4/d44645cd746b07130ecb86acff485f0851ab426673535c9e9d4b0e349843/ervdetective-1.0.8-py3-none-any.whl",
"yanked": false,
"yanked_reason": null
},
{
"comment_text": "",
"digests": {
"blake2b_256": "511cf82a4ef583a48799e029cfe0d671866cce847c251572afcb8cef93782dde",
"md5": "99bd67a847d5f48dfdc74ed52f8bf5c6",
"sha256": "3438eaf219dc78c51f7711dc420b6f7cc7d71ec4b6608573a11df1dac8f407d4"
},
"downloads": -1,
"filename": "ervdetective-1.0.8.tar.gz",
"has_sig": false,
"md5_digest": "99bd67a847d5f48dfdc74ed52f8bf5c6",
"packagetype": "sdist",
"python_version": "source",
"requires_python": null,
"size": 999138,
"upload_time": "2024-06-02T10:00:05",
"upload_time_iso_8601": "2024-06-02T10:00:05.361378Z",
"url": "https://files.pythonhosted.org/packages/51/1c/f82a4ef583a48799e029cfe0d671866cce847c251572afcb8cef93782dde/ervdetective-1.0.8.tar.gz",
"yanked": false,
"yanked_reason": null
}
],
"upload_time": "2024-06-02 10:00:05",
"github": true,
"gitlab": false,
"bitbucket": false,
"codeberg": false,
"github_user": "ZhijianZhou01",
"github_project": "ervdetective",
"travis_ci": false,
"coveralls": false,
"github_actions": false,
"requirements": [
{
"name": "biopython",
"specs": [
[
">=",
"1.78"
]
]
},
{
"name": "psutil",
"specs": [
[
">=",
"5.9.1"
]
]
},
{
"name": "colorama",
"specs": [
[
">=",
"0.4.6"
]
]
}
],
"lcname": "ervdetective"
}