autophylo


- Name: autophylo
- Version: 1.0.4
- home_page: None
- Summary: Autophylo is used to generate phylogenetic trees automatically.
- upload_time: 2024-09-02 14:32:29
- maintainer: None
- docs_url: None
- author: None
- requires_python: >=3.8
- license: Copyright (c) 2024 Amogelang R. Raphenya (raphenar@mcmaster.ca) Permission is hereby granted, free of charge, to any person obtaining a copy of this software and associated documentation files (the "Software"), to deal in the Software without restriction, including without limitation the rights to use, copy, modify, merge, publish, distribute, sublicense, and/or sell copies of the Software, and to permit persons to whom the Software is furnished to do so, subject to the following conditions: The above copyright notice and this permission notice shall be included in all copies or substantial portions of the Software. THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE SOFTWARE.
- keywords: phylogentic autophy automatic autophylo
- requirements: No requirements were recorded.
- Travis-CI: No Travis.
- coveralls test coverage: No coveralls.

# The Automatic Phylogenetic Tree Builder

The **Auto**matic **Phylo**genetic Tree Builder (AutoPhylo) generates phylogenetic trees automatically by sampling a selected database (e.g., NCBI nr). It performs the tasks associated with traditional phylogenetic tree building, including trimming, dropping overly similar sequences, and generating a maximum likelihood (ML) tree using RAxML.

The user provides one or more protein sequences, and BLAST databases are used to sample sequences from the phyla selected by the user. For example, the following phyla found in the human gut can be used: *Actinobacteriota*, *Bacteroidota*, *Desulfobacterota*, *Firmicutes*, *Proteobacteria*, *Synergistota*, *Verrucomicrobiota*, *Fusobacteria*.

![autophylo overview](https://github.com/raphenya/autophylo/blob/main/docs/images/Phylogenetic_Tree_Building.trimmed.png?raw=true)
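
To make the "dropping overly similar sequences" step concrete, here is a minimal sketch of a greedy identity filter over already-aligned sequences. This is not AutoPhylo's implementation; the example config in this README suggests the real pipeline delegates clustering to external tools such as USEARCH. The function names and the 0.95 default threshold are hypothetical and chosen only for illustration.

```python
def identity(a: str, b: str) -> float:
    """Fraction of identical columns between two aligned sequences of equal length."""
    if len(a) != len(b):
        raise ValueError("sequences must be aligned to the same length")
    # Skip columns where both sequences have a gap; they carry no information.
    pairs = [(x, y) for x, y in zip(a, b) if not (x == "-" and y == "-")]
    if not pairs:
        return 0.0
    return sum(x == y for x, y in pairs) / len(pairs)


def drop_similar(seqs: dict, threshold: float = 0.95) -> dict:
    """Greedily keep a sequence only if it is below `threshold` identity to every kept one.

    `threshold` is a hypothetical default, not a value taken from AutoPhylo.
    """
    kept = {}
    for name, seq in seqs.items():
        if all(identity(seq, other) < threshold for other in kept.values()):
            kept[name] = seq
    return kept
```

The result depends on input order, because the first member of each group of near-duplicates is the one that survives. Dedicated clustering tools make a more careful choice of representative.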

# Installation

The tool requires Python >= 3.11 and conda >= 4.12.0. The latest release can be installed with pip or from this repository.

```
pip install autophylo
```

Alternatively, create a conda environment from the `environment.yml` file, which installs all the dependencies (listed below):

```
conda env create -f environment.yml
```

# Usage

Activate the `autophylo` conda environment before running the tool:

```
conda activate autophylo
```

# Install autophylo using tarball

Alternatively, install the `autophylo` application from a tarball inside the `autophylo` conda environment created above.

```
python3 -m pip install /path/to/autophylo-1.0.0.tar.gz
```

# Dependencies

The following dependencies are required:

- NCBI BLAST 2.15.0
- BLAST databases (version 5)
- FastTree 2.1.11
- MUSCLE 5.1.0
- RAxML 8.2.13
- Gblocks 0.91b
- USEARCH 12.0_beta
- seqkit 2.8.2
- trimAl 1.5.0
- biopython 1.84
- joblib 1.4.2
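
Before running the tool, it can help to confirm that the external programs are reachable. The sketch below checks `PATH` with `shutil.which`. The executable names are taken from the example `config` file later in this README, except `blastp` and `seqkit`, which are assumed names for the BLAST+ and seqkit binaries. The helper name `find_missing` is hypothetical.

```python
import shutil

# Names from the example config in this README, plus `blastp` and `seqkit`
# (assumed executable names for NCBI BLAST+ and seqkit).
REQUIRED_TOOLS = [
    "blastp",
    "muscle",
    "trimal",
    "FastTree",
    "raxmlHPC-PTHREADS-AVX",
    "Gblocks",
    "usearch",
    "seqkit",
]


def find_missing(tools=REQUIRED_TOOLS):
    """Return the subset of `tools` that cannot be found on PATH."""
    return [tool for tool in tools if shutil.which(tool) is None]


if __name__ == "__main__":
    missing = find_missing()
    if missing:
        print("Not found on PATH:", ", ".join(missing))
    else:
        print("All required tools were found on PATH.")
```

This only verifies that each program can be located, not that the installed version matches the list above.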



# Download pre-formatted BLAST databases

See the NCBI README for details: https://ftp.ncbi.nlm.nih.gov/blast/db/v5/README

- The pre-formatted databases offer the following advantages:
    * Pre-formatting removes the need to run makeblastdb;
    * Species-level taxonomy ids are included for each database entry;
    * Databases are broken into smaller-sized volumes and are therefore easier
      to download;
    * Sequences in FASTA format can be generated from the pre-formatted databases
      by using the blastdbcmd utility;
    * A convenient script (update_blastdb.pl) is available in the blast+ package
      to download the pre-formatted databases.

# Download the nr database

```
update_blastdb.pl --source ncbi --decompress --blastdb_version 5  --verbose 2 --num_threads 30 nr > log.nr 2>&1
```

# Update `config` file

Obtain the path to the dependency programs using the `$CONDA_PREFIX` variable.

After activating the `autophylo` environment, run the following command to get the path, then use it to update the `config` file:

```
echo $CONDA_PREFIX
```

# Example `$CONDA_PREFIX` output

```
(autophylo) amos@Amogelangs-MacBook-Pro autophylo % echo $CONDA_PREFIX
/Users/amos/miniconda3/envs/autophylo
```
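
Given a prefix like the one printed above, the tool paths can be generated rather than typed by hand. The sketch below renders the `[ALIGNMENT]`, `[TREE]`, and `[CLUSTERING]` sections with `configparser`. The section names, keys, and binary names are copied from the example `config` file in this README; the function name `render_tool_sections` is hypothetical, and the layout `<prefix>/bin/<executable>` is an assumption based on that example.

```python
import configparser
import io
import os

# Section -> {config key: executable name}, copied from the example config.
TOOLS = {
    "ALIGNMENT": {"MUSCLE": "muscle", "TRIMAL": "trimal"},
    "TREE": {
        "FastTree": "FastTree",
        "RAxML_PTHREADS": "raxmlHPC-PTHREADS-AVX",
        "RAxML_HYBRID": "raxmlHPC-HYBRID-AVX",
    },
    "CLUSTERING": {"GBLOCKS": "Gblocks", "USEARCH": "usearch"},
}


def render_tool_sections(prefix: str) -> str:
    """Return INI text with every tool path set to <prefix>/bin/<executable>."""
    cfg = configparser.ConfigParser()
    # Keep key case (e.g. FastTree); configparser lowercases keys by default.
    cfg.optionxform = str
    for section, tools in TOOLS.items():
        cfg[section] = {key: f"{prefix}/bin/{exe}" for key, exe in tools.items()}
    buf = io.StringIO()
    cfg.write(buf, space_around_delimiters=False)
    return buf.getvalue()


if __name__ == "__main__":
    # Falls back to a placeholder when no conda environment is active.
    print(render_tool_sections(os.environ.get("CONDA_PREFIX", "/path/to/conda/env")))
```

The `[DATABASES]` section is left out on purpose, since the databases can live anywhere on the filesystem and are not tied to the conda prefix.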

# Updated `config` file

NOTE: The databases can be placed anywhere in the filesystem; in this example they are in `/Users/amos/datalake`.

```
[DEFAULT]
ServerAliveInterval = 45
Compression = yes
CompressionLevel = 9
ForwardX11 = yes

[ALIGNMENT]
MUSCLE=/Users/amos/miniconda3/envs/autophylo/bin/muscle
TRIMAL=/Users/amos/miniconda3/envs/autophylo/bin/trimal

[TREE]
FastTree=/Users/amos/miniconda3/envs/autophylo/bin/FastTree
RAxML_PTHREADS=/Users/amos/miniconda3/envs/autophylo/bin/raxmlHPC-PTHREADS-AVX
RAxML_HYBRID=/Users/amos/miniconda3/envs/autophylo/bin/raxmlHPC-HYBRID-AVX

[CLUSTERING]
GBLOCKS=/Users/amos/miniconda3/envs/autophylo/bin/Gblocks
USEARCH=/Users/amos/miniconda3/envs/autophylo/bin/usearch

[DATABASES]
BLASTDB=/Users/amos/datalake/BLASTDB/NR/nr
TAXIDS=/Users/amos/datalake/BLASTDB/NR/taxids
```
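
A quick way to catch typos in the file above is to parse it and check that each tool path exists. The sketch below assumes the file is INI-style and readable by Python's `configparser`; whether AutoPhylo itself reads it this way is not stated in this README. The function names are hypothetical.

```python
import configparser
from pathlib import Path


def load_paths(config_text: str) -> dict:
    """Parse INI text into {section: {key: value}}, leaving out [DEFAULT] keys."""
    cfg = configparser.ConfigParser()
    cfg.optionxform = str  # keep key case such as FastTree
    cfg.read_string(config_text)
    defaults = cfg.defaults()
    # configparser copies [DEFAULT] keys into every section, so filter them out.
    # (A section key that reuses a [DEFAULT] key name would be filtered too.)
    return {
        section: {k: v for k, v in cfg[section].items() if k not in defaults}
        for section in cfg.sections()
    }


def missing_paths(paths: dict) -> list:
    """Return (section, key, path) for tool entries that are not existing files.

    [DATABASES] is skipped: BLASTDB is a database name prefix, not a single file.
    """
    return [
        (section, key, path)
        for section, entries in paths.items()
        if section != "DATABASES"
        for key, path in entries.items()
        if not Path(path).is_file()
    ]
```

For example, `missing_paths(load_paths(Path("config").read_text()))` returns an empty list when every tool path resolves to a file, assuming the file is named `config` in the current directory.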

            

Raw data

            {
    "_id": null,
    "home_page": null,
    "name": "autophylo",
    "maintainer": null,
    "docs_url": null,
    "requires_python": ">=3.8",
    "maintainer_email": "\"Amogelang R. Raphenya\" <raphenar@mcmaster.ca>",
    "keywords": "phylogentic, autophy, automatic, autophylo",
    "author": null,
    "author_email": "\"Amogelang R. Raphenya\" <raphenar@mcmaster.ca>",
    "download_url": "https://files.pythonhosted.org/packages/ec/d3/25e081dfde8ad80a73ef63754cd129731323df406d6338a43ec894fd0174/autophylo-1.0.4.tar.gz",
    "platform": null,
    "bugtrack_url": null,
    "license": "Copyright (c) 2024 Amogelang R. Raphenya (raphenar@mcmaster.ca)  Permission is hereby granted, free of charge, to any person obtaining a copy of this software and associated documentation files (the \"Software\"), to deal in the Software without restriction, including without limitation the rights to use, copy, modify, merge, publish, distribute, sublicense, and/or sell copies of the Software, and to permit persons to whom the Software is furnished to do so, subject to the following conditions:  The above copyright notice and this permission notice shall be included in all copies or substantial portions of the Software.  THE SOFTWARE IS PROVIDED \"AS IS\", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE SOFTWARE.",
    "summary": "Autophylo is used to generate phylogentic trees automatically.",
    "version": "1.0.4",
    "project_urls": {
        "Homepage": "https://github.com/raphenya/autophylo",
        "Issues": "https://github.com/raphenya/autophylo/issues"
    },
    "split_keywords": [
        "phylogentic",
        " autophy",
        " automatic",
        " autophylo"
    ],
    "urls": [
        {
            "comment_text": "",
            "digests": {
                "blake2b_256": "a5de8c7d845fc5b4709758125547410175e21d33a17a803f92b5461b3b6b74ae",
                "md5": "7b3dd9cec125f1b03a60872b07b43797",
                "sha256": "9b9e9080b02a9d12813156743ce807a6769594232cfaa305ac355df2e53e4d13"
            },
            "downloads": -1,
            "filename": "autophylo-1.0.4-py3-none-any.whl",
            "has_sig": false,
            "md5_digest": "7b3dd9cec125f1b03a60872b07b43797",
            "packagetype": "bdist_wheel",
            "python_version": "py3",
            "requires_python": ">=3.8",
            "size": 17098,
            "upload_time": "2024-09-02T14:32:27",
            "upload_time_iso_8601": "2024-09-02T14:32:27.812028Z",
            "url": "https://files.pythonhosted.org/packages/a5/de/8c7d845fc5b4709758125547410175e21d33a17a803f92b5461b3b6b74ae/autophylo-1.0.4-py3-none-any.whl",
            "yanked": false,
            "yanked_reason": null
        },
        {
            "comment_text": "",
            "digests": {
                "blake2b_256": "ecd325e081dfde8ad80a73ef63754cd129731323df406d6338a43ec894fd0174",
                "md5": "be1c744fb118a5c5e28d2cc977567eef",
                "sha256": "01cec39ba113dcf833e9c1567bba1b5b653f7f680e6eaf890d31c4627c654f1b"
            },
            "downloads": -1,
            "filename": "autophylo-1.0.4.tar.gz",
            "has_sig": false,
            "md5_digest": "be1c744fb118a5c5e28d2cc977567eef",
            "packagetype": "sdist",
            "python_version": "source",
            "requires_python": ">=3.8",
            "size": 501180,
            "upload_time": "2024-09-02T14:32:29",
            "upload_time_iso_8601": "2024-09-02T14:32:29.195666Z",
            "url": "https://files.pythonhosted.org/packages/ec/d3/25e081dfde8ad80a73ef63754cd129731323df406d6338a43ec894fd0174/autophylo-1.0.4.tar.gz",
            "yanked": false,
            "yanked_reason": null
        }
    ],
    "upload_time": "2024-09-02 14:32:29",
    "github": true,
    "gitlab": false,
    "bitbucket": false,
    "codeberg": false,
    "github_user": "raphenya",
    "github_project": "autophylo",
    "travis_ci": false,
    "coveralls": false,
    "github_actions": true,
    "tox": true,
    "lcname": "autophylo"
}