# minimizers
A Python package for extracting minimizers from sequence data.
## Requirements
The package requires Python 3.6 or later and runs on any operating system.
It also requires the [biopython](https://pypi.org/project/biopython/) package.
## Install
Install the package with `pip` by running the following command in your terminal:
```
pip install minimizers
```
## How to use it
Run `minimizers --help` for a list of available arguments:
```
usage: minimizers [-h] [-a] -i INPUT [-o OUTPUT] [-t {list,fasta}] -s SIZE -w
                  WINDOW [--report-counts] [--top-perc TOP_PERC]
                  [--top-num TOP_NUM] [-n NPROC] [--verbose] [-v]

Extract the set of minimizers from a sequence file

optional arguments:
  -h, --help            show this help message and exit
  -a, --aggregate       Aggregate record results (default: False)
  -i INPUT, --input INPUT
                        Path to the input sequence file in fasta format. It
                        can be Gzip compressed (default: None)
  -o OUTPUT, --output OUTPUT
                        Path to the output file with minimizers. Results are
                        printed on the stdout if no output is provided
                        (default: None)
  -t {list,fasta}, --output-type {list,fasta}
                        The output can be formatted as a list of kmers or as a
                        fasta file (default: list)
  -s SIZE, --size SIZE  Length of the minimizers (default: None)
  -w WINDOW, --window WINDOW
                        Size of the sliding window. It must be greater than
                        the minimizer size (default: None)
  --report-counts       Report the frequencies of the minimizers. This is
                        compatible with "--output-type list" only (default:
                        False)
  --top-perc TOP_PERC   Report the top percentage of minimizers based on their
                        frequency (default: None)
  --top-num TOP_NUM     Report the top number of minimizers based on their
                        frequency (default: None)
  -n NPROC, --nproc NPROC
                        Make it parallel (default: 1)
  --verbose             Print messages on the stdout (default: False)
  -v, --version         Print the "minimizers" version and exit
```
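To give a feel for how `--size`, `--window`, `--report-counts`, and `--top-num` relate, here is a minimal Python sketch of the general minimizer idea. It is not the package's implementation: the function names `extract_minimizers` and `top_minimizers` are hypothetical, the window is assumed to be measured in bases, the minimizer of a window is assumed to be its lexicographically smallest k-mer, and a minimizer's frequency is assumed to be the number of windows that select it. The package may use a different ordering or counting rule.

```python
from collections import Counter


def extract_minimizers(sequence, size, window):
    """Return the k-mer picked in each sliding window of `sequence`.

    Assumption (illustrative only): the minimizer of a window is its
    lexicographically smallest k-mer of length `size`.
    """
    # Mirrors the documented constraint on --window and --size.
    if window <= size:
        raise ValueError("window must be greater than the minimizer size")

    picked = []
    # Slide a window of `window` bases over the sequence, one base at a time.
    for start in range(len(sequence) - window + 1):
        chunk = sequence[start:start + window]
        # Enumerate every k-mer of length `size` inside the current window.
        kmers = [chunk[i:i + size] for i in range(window - size + 1)]
        picked.append(min(kmers))
    return picked


def top_minimizers(sequence, size, window, top_num=None):
    """Count how many windows pick each minimizer and return the most frequent.

    Assumption: frequency is the number of windows selecting the k-mer.
    With top_num=None all minimizers are returned, most frequent first.
    """
    counts = Counter(extract_minimizers(sequence, size, window))
    return counts.most_common(top_num)


if __name__ == "__main__":
    # Distinct minimizers of a toy sequence, then the single most frequent one.
    print(sorted(set(extract_minimizers("ACGTACGT", size=3, window=5))))
    print(top_minimizers("ACGTACGT", size=3, window=5, top_num=1))
```

Under these assumptions, keeping only the distinct values corresponds to the default list output, while the counting helper loosely mirrors what `--report-counts` combined with `--top-num` would report.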
Copyright © 2022 [Fabio Cumbo](https://github.com/cumbof). See [LICENSE](https://github.com/cumbof/minimizers/blob/main/LICENSE) for additional details.