pyabpoa

Name	pyabpoa
Version	1.5.3
home_page	https://github.com/yangao07/abPOA
Summary	pyabpoa: SIMD-based partial order alignment using adaptive band
upload_time	2024-09-18 19:35:59
author	Yan Gao
license	MIT
keywords	multiple-sequence-alignment partial-order-graph-alignment

# pyabpoa: abPOA Python interface
## Introduction
pyabpoa provides an easy-to-use Python interface to [abPOA](https://github.com/yangao07/abPOA). It exposes all the APIs needed to perform multiple sequence alignment (MSA) on a set of sequences and to call consensus sequences from the final alignment graph.

## Installation

### Install pyabpoa with pip

pyabpoa can be installed with pip:

```
pip install pyabpoa
```

### Install pyabpoa from source
Alternatively, you can install pyabpoa from source (Cython is required):
```
git clone --recursive https://github.com/yangao07/abPOA.git
cd abPOA
make install_py
```

## Examples
The following code illustrates how to use pyabpoa.
```
import pyabpoa as pa
a = pa.msa_aligner()
seqs=[
'CCGAAGA',
'CCGAACTCGA',
'CCCGGAAGA',
'CCGAAGA'
]
res=a.msa(seqs, out_cons=True, out_msa=True) # perform multiple sequence alignment 

for seq in res.cons_seq:
    print(seq)  # print consensus sequence

res.print_msa() # print row-column multiple sequence alignment in PIR format
```
You can also try the example script provided in the source folder:
```
python ./python/example.py
```
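
In practice the input sequences often come from a file rather than a hard-coded list. The following is a minimal sketch, assuming a plain FASTA file whose records may span several lines; `read_fasta_sequences` and `consensus_from_fasta` are hypothetical helper names, not part of pyabpoa. Only the `msa_aligner()` constructor and the `msa(seqs, out_cons=..., out_msa=...)` call shown above are taken from pyabpoa itself.

```python
def read_fasta_sequences(path):
    """Return the sequences of a FASTA file as a list of strings, in file order."""
    seqs, chunks = [], []
    with open(path) as handle:
        for line in handle:
            line = line.strip()
            if not line:
                continue  # skip blank lines
            if line.startswith(">"):
                # A new header closes the record collected so far.
                if chunks:
                    seqs.append("".join(chunks))
                    chunks = []
            else:
                chunks.append(line)
    if chunks:
        seqs.append("".join(chunks))
    return seqs


def consensus_from_fasta(path):
    """Align every sequence in a FASTA file and return the consensus sequence(s)."""
    # Imported lazily so the FASTA reader stays usable without pyabpoa installed.
    import pyabpoa as pa

    aligner = pa.msa_aligner()
    res = aligner.msa(read_fasta_sequences(path), out_cons=True, out_msa=False)
    return list(res.cons_seq)
```

Record names are discarded here because `msa` only takes the sequences themselves; keep a parallel list of names if you need to map MSA rows back to reads.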


## APIs

### Class pyabpoa.msa_aligner
```
pyabpoa.msa_aligner(aln_mode='g', ...)
```
This constructs a multiple sequence alignment handler. It accepts the following arguments:

* **aln_mode**: alignment mode. 'g': global, 'l': local, 'e': extension; default: **'g'**
* **is_aa**: input is amino acid sequence; default: **False**
* **match**: match score; default: **2**
* **mismatch**: mismatch penalty; default: **4**
* **score_matrix**: scoring matrix file, **match** and **mismatch** are not used when **score_matrix** is used; default: **''**
* **gap_open1**: first gap opening penalty; default: **4**
* **gap_ext1**: first gap extension penalty; default: **2**
* **gap_open2**: second gap opening penalty; default: **24**
* **gap_ext2**: second gap extension penalty; default: **1**
* **extra_b**: first adaptive banding parameter; set as < 0 to disable adaptive banded DP; default: **10**
* **extra_f**: second adaptive banding parameter; the number of extra bases added on both sides of the band is *b+f\*L*, where *b* is **extra_b**, *f* is **extra_f**, and *L* is the length of the aligned sequence; default: **0.01**
* **cons_algrm**: consensus calling algorithm. 'HB': heaviest bundling, 'MF': most frequent bases; default: **'HB'**
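
To get a feel for how the gap and banding parameters interact, the sketch below evaluates two quantities in plain Python. The band term *b+f\*L* comes directly from the description of **extra_f** above. The gap cost, taken as the smaller of `gap_open1 + k*gap_ext1` and `gap_open2 + k*gap_ext2` for a gap of length `k`, is an assumption based on the convex gap penalty described for the abPOA command-line tool; it is not stated in this document. Both function names are hypothetical and not part of pyabpoa, and any rounding abPOA applies internally is not modeled.

```python
def convex_gap_cost(gap_len, gap_open1=4, gap_ext1=2, gap_open2=24, gap_ext2=1):
    """Cost of a gap of gap_len bases under an assumed two-piece (convex) penalty."""
    if gap_len <= 0:
        return 0
    first = gap_open1 + gap_len * gap_ext1    # cheaper for short gaps
    second = gap_open2 + gap_len * gap_ext2   # cheaper for long gaps
    return min(first, second)


def extra_band_bases(seq_len, extra_b=10, extra_f=0.01):
    """Extra bases added on each side of the band: b + f * L.

    Returns None when extra_b < 0, which the list above documents as
    disabling adaptive banded DP.
    """
    if extra_b < 0:
        return None
    return extra_b + extra_f * seq_len


if __name__ == "__main__":
    for k in (1, 10, 20, 30):
        print("gap length", k, "cost", convex_gap_cost(k))
    print("extra band bases for a 1000-base sequence:", extra_band_bases(1000))
```

With the default values, the two gap pieces cost the same at a gap length of 20; under this assumed model, longer gaps are charged at the lower extension penalty.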

The `msa_aligner` handler provides one method, `msa`, which performs multiple sequence alignment and takes the following arguments:
```
pyabpoa.msa_aligner.msa(seqs, out_cons, out_msa, max_n_cons=1, min_freq=0.3, out_pog='', incr_fn='')
```

* **seqs**: a list variable containing a set of input sequences; **positional**
* **out_cons**: a bool variable to ask pyabpoa to generate consensus sequence; **positional**
* **out_msa**: a bool variable to ask pyabpoa to generate RC-MSA; **positional**
* **max_n_cons**: maximum number of consensus sequences to generate; default: **1**
* **min_freq**: minimum frequency of each consensus to output (effective when **max_n_cons** > 1); default: **0.3**
* **out_pog**: name of a file (`.png` or `.pdf`) to store the plot of the final alignment graph; default: **''**
* **incr_fn**: name of an existing graph (GFA) or MSA (FASTA) file; the input sequences are incrementally aligned to this graph/MSA; default: **''**
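
The optional arguments can be passed by keyword. The following is a minimal sketch of a wrapper that checks a few of them before calling `msa`; `align_with_options` is a hypothetical helper, and its checks (a `min_freq` between 0 and 1, an **out_pog** name ending in `.png` or `.pdf`) are assumptions derived from the descriptions above, not validation that pyabpoa is known to perform.

```python
def align_with_options(aligner, seqs, max_n_cons=1, min_freq=0.3, out_pog=""):
    """Call aligner.msa() with keyword arguments after some basic sanity checks.

    `aligner` is expected to be a pyabpoa.msa_aligner instance (or any object
    that offers a compatible msa() method).
    """
    if max_n_cons < 1:
        raise ValueError("max_n_cons must be at least 1")
    if not 0.0 <= min_freq <= 1.0:
        raise ValueError("min_freq is treated here as a fraction between 0 and 1")
    if out_pog and not out_pog.lower().endswith((".png", ".pdf")):
        raise ValueError("out_pog should name a .png or .pdf file")

    kwargs = {
        "out_cons": True,
        "out_msa": True,
        "max_n_cons": max_n_cons,
        "min_freq": min_freq,
    }
    if out_pog:
        # Only request a plot when a file name was actually given.
        kwargs["out_pog"] = out_pog
    return aligner.msa(seqs, **kwargs)
```

For example, `align_with_options(pa.msa_aligner(), seqs, max_n_cons=2)` asks for up to two consensus sequences, each of which must reach the minimum frequency to be reported.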

### Class pyabpoa.msa_result
```
pyabpoa.msa_result(seq_n, cons_n, cons_len, ...)
```
This class describes the information of the generated consensus sequence and the RC-MSA. The returned result of `pyabpoa.msa_aligner.msa()` is an object of this class that has the following properties:

* **n_seq**: number of input aligned sequences
* **n_cons**: number of generated consensus sequences (generally 1, could be 2 or more if **max_n_cons** is set as > 1)
* **clu_n_seq**: an array of sequence cluster sizes
* **cons_len**: an array of consensus sequence length(s)
* **cons_seq**: an array of consensus sequence(s)
* **cons_cov**: an array of consensus sequence coverage for each base
* **msa_len**: size of each row in the RC-MSA
* **msa_seq**: an array containing `n_seq`+`n_cons` strings that represent the RC-MSA, each consisting of one sequence and `-` characters indicating the alignment gaps.
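
These properties can be combined into a per-consensus summary. The sketch below assumes that **cons_seq**, **cons_len**, **clu_n_seq** and **cons_cov** are index-aligned (entry `i` of each describes consensus `i`) and that each entry of **cons_cov** is itself a sequence of per-base coverage values; neither assumption is spelled out above. `summarize_consensus` is a hypothetical helper that accepts any object with these attributes.

```python
def summarize_consensus(res):
    """Return one dict per consensus with its sequence, length, cluster size
    and mean per-base coverage."""
    summary = []
    for i in range(res.n_cons):
        cov = list(res.cons_cov[i])
        summary.append({
            "seq": res.cons_seq[i],
            "length": res.cons_len[i],
            "cluster_size": res.clu_n_seq[i],
            # Guard against an empty coverage list to avoid dividing by zero.
            "mean_cov": sum(cov) / len(cov) if cov else 0.0,
        })
    return summary
```

Because the function only reads attributes, it can be exercised with a stand-in object (for instance a `types.SimpleNamespace`) when pyabpoa is not installed.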

`pyabpoa.msa_result` provides a `print_msa` method, which prints the RC-MSA to the screen.

```
pyabpoa.msa_result().print_msa()
```
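
When the alignment is needed as a string (for example, to write it to a file) rather than printed, it can be assembled from **msa_seq** directly. This is a minimal sketch, assuming the first `n_seq` rows of **msa_seq** are the input sequences and the remaining rows are the consensus sequences; the record names and the `msa_rows_as_fasta` helper are hypothetical and are not guaranteed to match what `print_msa` emits.

```python
def msa_rows_as_fasta(res):
    """Format the rows of res.msa_seq as FASTA-style text and return it."""
    lines = []
    for i, row in enumerate(res.msa_seq):
        if i < res.n_seq:
            name = "Seq_{}".format(i + 1)                    # input sequence row
        else:
            name = "Consensus_{}".format(i - res.n_seq + 1)  # consensus row
        lines.append(">" + name)
        lines.append(row)
    return "\n".join(lines) + "\n"


def ungapped(row):
    """Recover the original sequence of an MSA row by dropping the gap characters."""
    return row.replace("-", "")
```

The returned text can then be written out with an ordinary `open(path, "w")` call.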
