# kmertools: DNA Vectorisation Tool
[![License: GPL v3](https://img.shields.io/badge/License-GPLv3-blue.svg)](https://www.gnu.org/licenses/gpl-3.0)
[![Cargo tests](https://github.com/anuradhawick/kmertools/actions/workflows/rust_test.yml/badge.svg)](https://github.com/anuradhawick/kmertools/actions/workflows/rust_test.yml)
[![Clippy check](https://github.com/anuradhawick/kmertools/actions/workflows/clippy_check.yml/badge.svg)](https://github.com/anuradhawick/kmertools/actions/workflows/clippy_check.yml)
[![install with bioconda](https://img.shields.io/badge/install%20with-bioconda-brightgreen.svg?style=flat)](http://bioconda.github.io/recipes/kmertools/README.html)
[![Conda - Version](https://img.shields.io/conda/v/bioconda/kmertools)](https://anaconda.org/bioconda/kmertools)
[![Conda Downloads](https://img.shields.io/conda/dn/bioconda/kmertools)](https://anaconda.org/bioconda/kmertools)
[![PyPI Downloads](https://static.pepy.tech/badge/pykmertools)](https://pepy.tech/projects/pykmertools)
[![codecov](https://codecov.io/gh/anuradhawick/kmertools/graph/badge.svg?token=IDGRE54SSQ)](https://codecov.io/gh/anuradhawick/kmertools)
[![PyPI - Version](https://img.shields.io/pypi/v/pykmertools)](https://pypi.org/project/pykmertools/)
<div align="center">
<pre>
$$\ $$\ $$$$$$$$\ $$\
$$ | $$ | \__$$ __| $$ |
$$ |$$ / $$$$$$\$$$$\ $$$$$$\ $$$$$$\ $$ | $$$$$$\ $$$$$$\ $$ | $$$$$$$\
$$$$$ / $$ _$$ _$$\ $$ __$$\ $$ __$$\ $$ | $$ __$$\ $$ __$$\ $$ |$$ _____|
$$ $$< $$ / $$ / $$ |$$$$$$$$ |$$ | \__| $$ | $$ / $$ |$$ / $$ |$$ |\$$$$$$\
$$ |\$$\ $$ | $$ | $$ |$$ ____|$$ | $$ | $$ | $$ |$$ | $$ |$$ | \____$$\
$$ | \$$\ $$ | $$ | $$ |\$$$$$$$\ $$ | $$ | \$$$$$$ |\$$$$$$ |$$ |$$$$$$$ |
\__| \__|\__| \__| \__| \_______|\__| \__| \______/ \______/ \__|\_______/
</pre>
</div>
## Overview
`kmertools` is a k-mer based feature extraction tool designed to support metagenomics and other bioinformatics analytics. This tool leverages k-mer analysis to vectorize DNA sequences, facilitating the use of these vectors in various AI/ML applications.
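To make the idea of vectorising a sequence concrete, here is a minimal pure-Python sketch of a normalised k-mer frequency vector. It does not use the `kmertools` or `pykmertools` API; the function name `kmer_frequency_vector` and its parameters are hypothetical, and the sketch assumes plain counting over the `ACGT` alphabet without merging reverse complements, which may differ from what `kmertools` does.

```python
from collections import Counter
from itertools import product


def kmer_frequency_vector(seq: str, k: int = 3) -> list[float]:
    """Return normalised k-mer frequencies in lexicographic k-mer order.

    Hypothetical helper for illustration only; not part of kmertools.
    """
    seq = seq.upper()
    # All 4^k possible k-mers over the DNA alphabet, in a fixed order.
    kmers = ["".join(p) for p in product("ACGT", repeat=k)]
    # Count every window of length k; windows containing other symbols
    # (for example N) are skipped.
    counts = Counter(
        seq[i : i + k]
        for i in range(len(seq) - k + 1)
        if set(seq[i : i + k]) <= set("ACGT")
    )
    total = sum(counts.values())
    # Divide by the number of counted windows so the entries sum to 1.
    return [counts[kmer] / total if total else 0.0 for kmer in kmers]


vector = kmer_frequency_vector("ACGTACGT", k=2)
print(len(vector))  # one entry per possible 2-mer
```

A vector of this shape has a fixed length of 4^k regardless of the sequence length, which is what makes it convenient as input to downstream AI/ML models.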
## Features
- **Oligonucleotide Frequency Vectors:** Generate frequency vectors for oligonucleotides.
- **Minimiser Binning:** Efficiently bin sequences using minimisers to reduce data complexity.
- **Chaos Game Representation (CGR):** Compute CGR vectors for DNA sequences based on k-mers or whole sequence transformation.
- **Coverage Histograms:** Create coverage histograms to analyze the depth of sequencing reads.
- **Python Binding:** Import `kmertools` functionality in Python using `import pykmertools as kt`.
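The sketch below illustrates two of the ideas listed above, minimisers and whole-sequence Chaos Game Representation, in plain Python. It is not the `kmertools` implementation: the helper names `minimiser` and `cgr_points` are hypothetical, the minimiser is assumed to be the lexicographically smallest m-mer of the sequence, and the corner assignment for CGR is one common convention that may differ from the one `kmertools` uses.

```python
# Assumed corner layout of the unit square; other conventions exist.
CORNERS = {"A": (0.0, 0.0), "C": (0.0, 1.0), "G": (1.0, 1.0), "T": (1.0, 0.0)}


def minimiser(seq: str, m: int) -> str:
    """Return the lexicographically smallest m-mer of seq (len(seq) >= m)."""
    seq = seq.upper()
    return min(seq[i : i + m] for i in range(len(seq) - m + 1))


def cgr_points(seq: str) -> list[tuple[float, float]]:
    """Return the CGR trajectory of seq, one point per valid base."""
    x, y = 0.5, 0.5  # start at the centre of the unit square
    points = []
    for base in seq.upper():
        if base not in CORNERS:
            continue  # skip ambiguous symbols such as N
        cx, cy = CORNERS[base]
        # Move halfway from the current point towards the base's corner.
        x, y = (x + cx) / 2, (y + cy) / 2
        points.append((x, y))
    return points


print(minimiser("GATTACA", 3))
print(cgr_points("ACGT"))
```

Sequences that share a minimiser can be placed in the same bin, which is the intuition behind minimiser binning. The CGR points can be used directly or accumulated into a grid to obtain a fixed-size vector.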
## Installation
### Option 1: from bioconda (recommended)
You can install `kmertools` from Bioconda at https://anaconda.org/bioconda/kmertools. Make sure you have [conda](https://docs.conda.io/en/latest/) installed.
```bash
# create conda environment and install kmertools
conda create -n kmertools -c bioconda kmertools
# activate environment
conda activate kmertools
```
### Option 2: from PyPI
You can install `kmertools` from PyPI, where it is published as `pykmertools`, at https://pypi.org/project/pykmertools/.
```bash
pip install pykmertools
```
### Option 3: from sources
You can install `kmertools` directly from the source by cloning the repository and using Rust's package manager `cargo`.
```bash
git clone https://github.com/anuradhawick/kmertools.git
cd kmertools
cargo build --release
```
Now add the binary to your `PATH` (you may modify `~/.bashrc` or `~/.zshrc`).
```sh
# to add to current terminal
export PATH=$PATH:$(pwd)/target/release/
# to save to ~/.bashrc
echo "export PATH=\$PATH:$(pwd)/target/release/" >> ~/.bashrc
source ~/.bashrc
# to save to ~/.zshrc for Mac
echo "export PATH=\$PATH:$(pwd)/target/release/" >> ~/.zshrc
source ~/.zshrc
```
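To install the Python bindings, build a wheel with `maturin` from either the `pip` or the `conda` directory (choose one).
```bash
# option A: build from the pip directory
cd pip
maturin build --release
# option B: build from the conda directory instead
cd conda
maturin build --release
```
Now move back to the parent directory using `cd ..` and install the wheel that was built. The exact file name depends on your version and platform.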
```bash
pip install target/wheels/pykmertools-<VERSION>-cp39-abi3-manylinux_2_34_x86_64.whl
```
## Test the installation
After setting up, run the following command to print out the `kmertools` help message.
```bash
kmertools --help
```
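If you also installed the Python bindings, a similar check can be done from Python. The following is a minimal sketch using only the standard library; the function name `check_installation` is hypothetical, and it only tests whether the `kmertools` executable is on the `PATH` and whether a module named `pykmertools` can be found, without calling any `pykmertools` function.

```python
import importlib.util
import shutil


def check_installation() -> dict[str, bool]:
    """Report whether the CLI and the Python bindings can be located."""
    return {
        # True if an executable called kmertools is found on PATH.
        "cli_on_path": shutil.which("kmertools") is not None,
        # True if Python can locate the pykmertools module.
        "python_bindings": importlib.util.find_spec("pykmertools") is not None,
    }


status = check_installation()
print(status)
```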
## Help
Please read our comprehensive [Wiki](https://github.com/anuradhawick/kmertools/wiki).
## Authors
- Anuradha Wickramarachchi [https://anuradhawick.com](https://anuradhawick.com)
- Vijini Mallawaarachchi [https://vijinimallawaarachchi.com](https://vijinimallawaarachchi.com)
## Citation
If you use `kmertools`, please cite it as follows.
```bib
@software{Wickramarachchi_kmertools_DNA_Vectorisation,
author = {Wickramarachchi, Anuradha and Mallawaarachchi, Vijini},
title = {{kmertools: DNA Vectorisation Tool}},
url = {https://github.com/anuradhawick/kmertools},
version = {0.1.4}
}
```
Please refer to the [Wiki](https://github.com/anuradhawick/kmertools/wiki) for citations of relevant algorithms.
## Support and contributions
Please get in touch via author websites or GitHub issues. Thanks!