rsbio-seq


| Field | Value |
| --- | --- |
| Name | rsbio-seq |
| Version | 0.1.3 |
| Summary | RSBio-Seq is a fast and light-weight sequence reading library (built on top of rust bio crate). |
| Upload time | 2024-09-07 10:36:50 |
| Author | Anuradha Wickramarachchi <anuradhawick@gmail.com>, Vijini Mallawaarachchi <viji.mallawaarachchi@gmail.com> |
| Requires Python | >=3.9 |
| Keywords | bioinformatics, genomics |
| Requirements | No requirements were recorded. |

# RSBio-Seq

[![Cargo tests](https://github.com/anuradhawick/rsbio-seq/actions/workflows/rust_test.yml/badge.svg)](https://github.com/anuradhawick/rsbio-seq/actions/workflows/rust_test.yml)
[![Downloads](https://static.pepy.tech/badge/rsbio-seq)](https://pepy.tech/project/rsbio-seq)
[![PyPI - Version](https://img.shields.io/pypi/v/rsbio-seq)](https://pypi.org/project/rsbio-seq/)
[![Upload to PyPI](https://github.com/anuradhawick/rsbio-seq/actions/workflows/pypi.yml/badge.svg)](https://github.com/anuradhawick/rsbio-seq/actions/workflows/pypi.yml)
[![License: GPL v3](https://img.shields.io/badge/License-GPLv3-blue.svg)](https://www.gnu.org/licenses/gpl-3.0)

<div align="center">
<pre>
██████  ███████ ██████  ██  ██████        ███████ ███████  ██████  
██   ██ ██      ██   ██ ██ ██    ██       ██      ██      ██    ██ 
██████  ███████ ██████  ██ ██    ██ █████ ███████ █████   ██    ██ 
██   ██      ██ ██   ██ ██ ██    ██            ██ ██      ██ ▄▄ ██ 
██   ██ ███████ ██████  ██  ██████        ███████ ███████  ██████  
                                                              ▀▀   
</pre>
</div>

RSBio-Seq reads and writes common sequence formats (FASTA/FASTQ), both uncompressed (`fasta`, `fa`, `fna`, `fastq`, `fq`) and gzip-compressed (`.gz`).

## Installation

### 1. From PyPI (Recommended)

Use the following command to install from PyPI.

```bash
pip install rsbio-seq
```

### 2. Build and install from source

To build from source, make sure you have the following programs installed.

- Rust - https://www.rust-lang.org/tools/install
- Maturin - https://www.maturin.rs/installation
- Python environment with Python >=3.9 - https://www.python.org/downloads/

To build the wheel and install it directly into the current environment, use one of these commands.

```bash
maturin develop # installs a development (debug) build in the env
maturin develop --release # installs an optimized release build in the env
```

To build a release mode wheel for installation, use this command.

```bash
maturin build --release
```

You will find the `whl` file inside the `target/wheels` directory. Its name reflects your Python environment and CPU architecture. Install the built wheel with this command.

```bash
pip install target/wheels/*.whl
```

## Usage

Once installed, import the library and use it as follows.

### Reading

```python
from rsbio_seq import SeqReader, Sequence, ascii_to_phred

# each seq entry is of type Sequence
seq: Sequence

for seq in SeqReader("path/to/seq.fasta.gz"):
    print(seq.id)
    print(seq.seq)
    # for fastq quality line
    print(seq.qual) # prints IIII
    print(ascii_to_phred(seq.qual)) # prints [40, 40, 40, 40]
    # optional description attribute
    print(seq.desc)
```
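
As a minimal sketch of consuming the reader, the helper below tallies the number of records and the total number of bases. `summarize` is a hypothetical name and not part of RSBio-Seq; it assumes only that each record exposes a `seq` attribute, as in the example above. Stand-in records are used so the sketch runs without a sequence file.

```python
from types import SimpleNamespace


def summarize(records):
    """Return (record_count, total_bases) for an iterable of records.

    Hypothetical helper: assumes each record has a `seq` attribute,
    like the Sequence objects yielded by SeqReader.
    """
    count = 0
    total = 0
    for record in records:
        count += 1
        total += len(record.seq)
    return count, total


# Stand-in records; with the library installed you would pass
# SeqReader("path/to/seq.fasta.gz") instead.
demo = [
    SimpleNamespace(id="r1", seq="ACGT"),
    SimpleNamespace(id="r2", seq="ACGTAC"),
]
print(summarize(demo))
```

Because the helper iterates once and keeps only two counters, it should work on a streaming reader without holding the records in memory.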

### Writing

```python
from rsbio_seq import SeqWriter, Sequence, phred_to_ascii

# writing fasta
seq = Sequence("id", "desc", "ACGT") # id, description, sequence
writer = SeqWriter("out.fasta")
writer.write(seq)
writer.close()

# writing fastq
seq = Sequence("id", "desc", "ACGT", "IIII") # id, description, sequence, quality
writer = SeqWriter("out.fastq")
writer.write(seq)
writer.close()

# writing gzipped
seq = Sequence("id", "desc", "ACGT", "IIII") # id, description, sequence, quality
writer = SeqWriter("out.fq.gz")
writer.write(seq)
writer.close()

# writing gzipped with phred score translation
qual = phred_to_ascii([40, 40, 40, 40])
seq = Sequence("id", "desc", "ACGT", qual) # id, description, sequence, quality
writer = SeqWriter("out.fq.gz")
writer.write(seq)
writer.close()
```

Note: `close()` is only required if you want to read the file again in the same function/code scope. Closing opened files is a good practice either way.
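
One way to make sure `close()` is always called is to wrap the writer in `contextlib.closing` from the Python standard library. The sketch below assumes only that the writer has `write` and `close` methods, as `SeqWriter` does in the examples above. `write_records` and `ListWriter` are hypothetical names used for illustration; `ListWriter` is a stand-in so the sketch runs without the library.

```python
from contextlib import closing


def write_records(writer, records):
    """Write every record, then close the writer even if a write fails."""
    with closing(writer):
        for record in records:
            writer.write(record)


class ListWriter:
    """Stand-in for a writer object; collects records in a list."""

    def __init__(self):
        self.items = []
        self.closed = False

    def write(self, record):
        self.items.append(record)

    def close(self):
        self.closed = True


# With the library installed you would pass SeqWriter("out.fq.gz")
# and a list of Sequence objects instead.
demo_writer = ListWriter()
write_records(demo_writer, ["record-1", "record-2"])
print(demo_writer.items, demo_writer.closed)
```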

We provide two utility functions for your convenience.

* `phred_to_ascii` - converts a list of Phred scores to a quality string
* `ascii_to_phred` - converts a quality string to a list of Phred scores

RSBio-Seq reads and writes quality strings in ASCII format only. Use these helper functions to translate between the quality string and numeric scores.
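
The examples above map `IIII` to `[40, 40, 40, 40]`, which matches the common Phred+33 encoding, where each score is stored as the character with code `score + 33`. The pure-Python sketch below illustrates that mapping. It is not the library's implementation; `scores_to_string`, `string_to_scores`, and `PHRED_OFFSET` are hypothetical names, and the offset of 33 is an assumption inferred from the example values.

```python
PHRED_OFFSET = 33  # assumed Phred+33 offset, inferred from "I" <-> 40


def scores_to_string(scores):
    """Encode a list of integer Phred scores as a quality string."""
    return "".join(chr(score + PHRED_OFFSET) for score in scores)


def string_to_scores(quality):
    """Decode a quality string into a list of integer Phred scores."""
    return [ord(char) - PHRED_OFFSET for char in quality]


print(scores_to_string([40, 40, 40, 40]))  # IIII
print(string_to_scores("IIII"))  # [40, 40, 40, 40]
```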

## Authors

- Anuradha Wickramarachchi [https://anuradhawick.com](https://anuradhawick.com)
- Vijini Mallawaarachchi [https://vijinimallawaarachchi.com](https://vijinimallawaarachchi.com)

## Support and contributions

Please get in touch via author websites or GitHub issues. Thanks!


            

Raw data

            {
    "_id": null,
    "home_page": null,
    "name": "rsbio-seq",
    "maintainer": null,
    "docs_url": null,
    "requires_python": ">=3.9",
    "maintainer_email": null,
    "keywords": "bioinformatics, genomics",
    "author": "Anuradha Wickramarachchi <anuradhawick@gmail.com>, Vijini Mallawaarachchi <viji.mallawaarachchi@gmail.com>",
    "author_email": "Anuradha Wickramarachchi <anuradhawick@gmail.com>, Vijini Mallawaarachchi <viji.mallawaarachchi@gmail.com>",
    "download_url": null,
    "platform": null,
    "bugtrack_url": null,
    "license": null,
    "summary": "RSBio-Seq is a fast and light-weight sequence reading library (built on top of rust bio crate).",
    "version": "0.1.3",
    "project_urls": {
        "Bug Tracker": "https://github.com/anuradhawick/rsbio-seq/issues",
        "Documentation": "https://github.com/anuradhawick/rsbio-seq/",
        "Source Code": "https://github.com/anuradhawick/rsbio-seq/"
    },
    "split_keywords": [
        "bioinformatics",
        " genomics"
    ],
    "urls": [
        {
            "comment_text": null,
            "digests": {
                "blake2b_256": "6913768ace9b8b0d82c7773ea3006e7db01890d4a6e68c2b5a557b3171e7b5c0",
                "md5": "99e0030dde55dd09ec77bb986c05428f",
                "sha256": "e683fdbb0c10979115bbfc9cbec8fc3e6d167f1b0fbb02d1fe698cc10a5fd260"
            },
            "downloads": -1,
            "filename": "rsbio_seq-0.1.3-cp39-abi3-macosx_10_12_x86_64.whl",
            "has_sig": false,
            "md5_digest": "99e0030dde55dd09ec77bb986c05428f",
            "packagetype": "bdist_wheel",
            "python_version": "cp39",
            "requires_python": ">=3.9",
            "size": 280297,
            "upload_time": "2024-09-07T10:36:50",
            "upload_time_iso_8601": "2024-09-07T10:36:50.760715Z",
            "url": "https://files.pythonhosted.org/packages/69/13/768ace9b8b0d82c7773ea3006e7db01890d4a6e68c2b5a557b3171e7b5c0/rsbio_seq-0.1.3-cp39-abi3-macosx_10_12_x86_64.whl",
            "yanked": false,
            "yanked_reason": null
        },
        {
            "comment_text": null,
            "digests": {
                "blake2b_256": "ef638317ce64b2f179d5674b334361acea044cbfead082a957cf1775e7696e9c",
                "md5": "7189a6a97b855311cfd292bfa93aada0",
                "sha256": "ac583b5b3423b5a2f52172876e3af132318dd31dae83873ef9f5a6097d15e9ee"
            },
            "downloads": -1,
            "filename": "rsbio_seq-0.1.3-cp39-abi3-macosx_11_0_arm64.whl",
            "has_sig": false,
            "md5_digest": "7189a6a97b855311cfd292bfa93aada0",
            "packagetype": "bdist_wheel",
            "python_version": "cp39",
            "requires_python": ">=3.9",
            "size": 269470,
            "upload_time": "2024-09-07T10:36:49",
            "upload_time_iso_8601": "2024-09-07T10:36:49.671216Z",
            "url": "https://files.pythonhosted.org/packages/ef/63/8317ce64b2f179d5674b334361acea044cbfead082a957cf1775e7696e9c/rsbio_seq-0.1.3-cp39-abi3-macosx_11_0_arm64.whl",
            "yanked": false,
            "yanked_reason": null
        },
        {
            "comment_text": null,
            "digests": {
                "blake2b_256": "7e5ccf9327f2a39c65b5769d49fb988fb13fba7f87c4e0abecdd9ee2270b5b10",
                "md5": "bb67ca5ca57ba7927c69eb2d5949e127",
                "sha256": "d139f5b3571c8fceaba67b8212126c3d5ba2a58d9f1192e672cbdb2908844348"
            },
            "downloads": -1,
            "filename": "rsbio_seq-0.1.3-cp39-abi3-manylinux_2_17_aarch64.manylinux2014_aarch64.whl",
            "has_sig": false,
            "md5_digest": "bb67ca5ca57ba7927c69eb2d5949e127",
            "packagetype": "bdist_wheel",
            "python_version": "cp39",
            "requires_python": ">=3.9",
            "size": 307528,
            "upload_time": "2024-09-07T10:36:46",
            "upload_time_iso_8601": "2024-09-07T10:36:46.416034Z",
            "url": "https://files.pythonhosted.org/packages/7e/5c/cf9327f2a39c65b5769d49fb988fb13fba7f87c4e0abecdd9ee2270b5b10/rsbio_seq-0.1.3-cp39-abi3-manylinux_2_17_aarch64.manylinux2014_aarch64.whl",
            "yanked": false,
            "yanked_reason": null
        },
        {
            "comment_text": null,
            "digests": {
                "blake2b_256": "98b86881fc4e3c00f8447c0ce6d72805ec81ef99d2dfdf4a65d2998453922e44",
                "md5": "40893aa8c5c21a0ca5ba9c6fc0284b30",
                "sha256": "6ebe2ee2a40e50bae4c5c061e8776995a311a43ae48d56c96adba6cd98a9baf8"
            },
            "downloads": -1,
            "filename": "rsbio_seq-0.1.3-cp39-abi3-manylinux_2_17_x86_64.manylinux2014_x86_64.whl",
            "has_sig": false,
            "md5_digest": "40893aa8c5c21a0ca5ba9c6fc0284b30",
            "packagetype": "bdist_wheel",
            "python_version": "cp39",
            "requires_python": ">=3.9",
            "size": 304788,
            "upload_time": "2024-09-07T10:36:48",
            "upload_time_iso_8601": "2024-09-07T10:36:48.062027Z",
            "url": "https://files.pythonhosted.org/packages/98/b8/6881fc4e3c00f8447c0ce6d72805ec81ef99d2dfdf4a65d2998453922e44/rsbio_seq-0.1.3-cp39-abi3-manylinux_2_17_x86_64.manylinux2014_x86_64.whl",
            "yanked": false,
            "yanked_reason": null
        },
        {
            "comment_text": null,
            "digests": {
                "blake2b_256": "809154961576f1df7a62f113c56f193c39a9aa074016dedede708128ff8ccd97",
                "md5": "999e5e49fb24b588b2f826217bd98cd9",
                "sha256": "e0d21f11cb7aa9a84fe02d46904a6c0857ccd8e9b27598de7b18ab722202130e"
            },
            "downloads": -1,
            "filename": "rsbio_seq-0.1.3-cp39-abi3-musllinux_1_2_aarch64.whl",
            "has_sig": false,
            "md5_digest": "999e5e49fb24b588b2f826217bd98cd9",
            "packagetype": "bdist_wheel",
            "python_version": "cp39",
            "requires_python": ">=3.9",
            "size": 481418,
            "upload_time": "2024-09-07T10:36:52",
            "upload_time_iso_8601": "2024-09-07T10:36:52.441718Z",
            "url": "https://files.pythonhosted.org/packages/80/91/54961576f1df7a62f113c56f193c39a9aa074016dedede708128ff8ccd97/rsbio_seq-0.1.3-cp39-abi3-musllinux_1_2_aarch64.whl",
            "yanked": false,
            "yanked_reason": null
        },
        {
            "comment_text": null,
            "digests": {
                "blake2b_256": "5bab5a19ae581d2c6ef2cbd6c0ac0126e7002ead90905f9799f45c0d33f4c769",
                "md5": "68d811e04c351c085fd10f718a9cb88a",
                "sha256": "279bf9f6219d214880ddf1ef05a5345107dba343f45d070390180a248232d51a"
            },
            "downloads": -1,
            "filename": "rsbio_seq-0.1.3-cp39-abi3-musllinux_1_2_x86_64.whl",
            "has_sig": false,
            "md5_digest": "68d811e04c351c085fd10f718a9cb88a",
            "packagetype": "bdist_wheel",
            "python_version": "cp39",
            "requires_python": ">=3.9",
            "size": 470251,
            "upload_time": "2024-09-07T10:36:53",
            "upload_time_iso_8601": "2024-09-07T10:36:53.572176Z",
            "url": "https://files.pythonhosted.org/packages/5b/ab/5a19ae581d2c6ef2cbd6c0ac0126e7002ead90905f9799f45c0d33f4c769/rsbio_seq-0.1.3-cp39-abi3-musllinux_1_2_x86_64.whl",
            "yanked": false,
            "yanked_reason": null
        },
        {
            "comment_text": null,
            "digests": {
                "blake2b_256": "3d312f3932845e9cd4ded1f0fbd8f1fddd50269dc063baa1bf76a84df45ae31d",
                "md5": "148826c7cc5c3e37ec0c4100d460daa0",
                "sha256": "1547c265fc8f1c46bb897ac8c6d8219f656821d79cdcc2c062e71c818c636f28"
            },
            "downloads": -1,
            "filename": "rsbio_seq-0.1.3-cp39-abi3-win_amd64.whl",
            "has_sig": false,
            "md5_digest": "148826c7cc5c3e37ec0c4100d460daa0",
            "packagetype": "bdist_wheel",
            "python_version": "cp39",
            "requires_python": ">=3.9",
            "size": 187831,
            "upload_time": "2024-09-07T10:36:55",
            "upload_time_iso_8601": "2024-09-07T10:36:55.063836Z",
            "url": "https://files.pythonhosted.org/packages/3d/31/2f3932845e9cd4ded1f0fbd8f1fddd50269dc063baa1bf76a84df45ae31d/rsbio_seq-0.1.3-cp39-abi3-win_amd64.whl",
            "yanked": false,
            "yanked_reason": null
        }
    ],
    "upload_time": "2024-09-07 10:36:50",
    "github": true,
    "gitlab": false,
    "bitbucket": false,
    "codeberg": false,
    "github_user": "anuradhawick",
    "github_project": "rsbio-seq",
    "travis_ci": false,
    "coveralls": false,
    "github_actions": true,
    "lcname": "rsbio-seq"
}
        