Name | kmeruniq |
Version | 0.6 |
home_page | None |
Summary | Extract uniq kmer into an ondisk efficient datastructure and allows querying |
upload_time | 2025-07-25 21:28:08 |
maintainer | None |
docs_url | None |
author | None |
requires_python | >=3.9 |
license | None |
keywords | dna, kmer |
VCS | |
bugtrack_url | |
requirements | No requirements were recorded. |
# A tool to store kmers unique to a dataset
To install:
```
pip install kmeruniq
```
Two commands:
- `kmeruniq build`
- `kmeruniq query`
Check the help of each command with the `-h` flag.
## Usage example of the CLI
```
kmeruniq build -k 21 --fof my_file.fof --output path/to/index -e --shard 20
kmeruniq query --query_str ACGAAACGTACATTCACACACACACATAGAGAAGGAGAGCAGCACACACA --index-path path/to/index
kmeruniq query --query_file path/to/some/data.fa --index-path path/to/index
```
The output is the result of a Counter: one count for each value.
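To illustrate what a per-value count looks like, here is a minimal sketch using Python's `collections.Counter`. It is not the package's implementation: `toy_index`, `tally`, and the choice of skipping kmers absent from the index are assumptions made for the example, and a plain dict stands in for a real index.

```python
from collections import Counter

# Hypothetical stand-in for an index: kmer -> value (toy k = 3).
toy_index = {"ACG": "chr1", "CGT": "chr1", "GTA": "chr2"}


def tally(sequence, index, k):
    """Count how many kmers of `sequence` map to each value in `index`."""
    hits = (index.get(sequence[i:i + k]) for i in range(len(sequence) - k + 1))
    # Assumption: kmers missing from the index are ignored in the tally.
    return Counter(value for value in hits if value is not None)


counts = tally("ACGTAC", toy_index, 3)
print(counts)
```

Whether the real CLI reports missing kmers separately is not stated here, so check `kmeruniq query -h` before relying on that detail.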
## The fof format
A file of files (fof), that is, a line-separated list of file paths.
```
/path/to/foo.fa
/path/to/bar.fa
/path/to/barrr.fa.gz
```
Each line can also specify a label for the file. Two files with the same label are treated
as merged. For instance:
```
/path/to/foo.fa ;chr1
/path/to/bar.fa ;chr2
/path/to/barrr.fa.gz ;chr1
```
Here `foo.fa` and `barrr.fa.gz` will be merged together within the index.
Note that files can be gzip-compressed.
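The labelled format can be read with a few lines of Python. The sketch below is only an illustration of the grouping described above, not the parser used by `kmeruniq`: the function name `parse_fof` is hypothetical, and giving an unlabelled file its own path as label is an assumption.

```python
from collections import defaultdict


def parse_fof(text):
    """Group the file paths of a fof by label."""
    groups = defaultdict(list)
    for raw in text.splitlines():
        line = raw.strip()
        if not line:
            continue  # skip blank lines
        path, sep, label = line.partition(";")
        path = path.strip()
        # Assumption: an unlabelled file forms its own group, keyed by its path.
        label = label.strip() if sep else path
        groups[label].append(path)
    return dict(groups)


example = """\
/path/to/foo.fa ;chr1
/path/to/bar.fa ;chr2
/path/to/barrr.fa.gz ;chr1
"""
groups = parse_fof(example)
print(groups)
```

This sketch splits on the first `;`, so a path that itself contains a semicolon would be misread.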
## Usage example of the Python API
The `Kmer` and `DNA` datatypes are the ones provided by vizibridge.
```python
from kmeruniq.index import Index
from vizibridge import Kmer, DNA

idx = Index("path/to/my/index")
# idx is a dict-like object keyed by kmer and valued by the annotation

idx["ACG..ACG"]  # some kmer of the appropriate size

for kmer in idx:
    print(idx[kmer])  # print the value of each kmer

dna = DNA("ACG....ACGT")  # long sequence of DNA from somewhere

for kmer in dna.enum_canonical_kmer(idx.k):
    print(idx.get(kmer))  # print the value associated with kmer, or None if it is not in the index
```
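For readers unfamiliar with canonical kmers, the following pure-Python sketch shows one common convention: a kmer and its reverse complement are represented by whichever of the two is lexicographically smaller. This is an assumption for illustration only; the exact convention and encoding used by vizibridge's `enum_canonical_kmer` may differ, and the helper names below are hypothetical.

```python
# Complement table for the four DNA bases.
_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of a DNA string over A, C, G, T."""
    return seq.translate(_COMPLEMENT)[::-1]


def canonical_kmers(seq, k):
    """Yield, for each kmer of `seq`, the smaller of the kmer and its reverse complement."""
    for i in range(len(seq) - k + 1):
        kmer = seq[i:i + k]
        yield min(kmer, reverse_complement(kmer))


kmers = list(canonical_kmers("ACGTT", 3))
print(kmers)
```

With this convention a sequence and its reverse complement produce the same set of keys, which is why lookups are typically done on canonical kmers.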
Raw data
{
"_id": null,
"home_page": null,
"name": "kmeruniq",
"maintainer": null,
"docs_url": null,
"requires_python": ">=3.9",
"maintainer_email": null,
"keywords": "DNA, kmer",
"author": null,
"author_email": "Charles Paperman <charles.paperman@univ-lille.fr>, Camille Marchet <camille.marchet@univ-lille.fr>",
"download_url": "https://files.pythonhosted.org/packages/47/ab/4587cec256831742a400a28a0455b3604ccdf57c30ab076f8d0209a39a24/kmeruniq-0.6.tar.gz",
"platform": null,
"description": "# A tool to store kmer unique to dataset\n\nTo install:\n\n```\npip install kmeruniq \n```\n\nTwo commands:\n\n- `kmeruniq build`\n- `kmeruniq query` \n\nCheck the help with `-h` flag.\n\n## Usage example of the CLI\n\n```\nkmeruniq build -k 21 --fof my_file.fof --output path/to/index -e --shard 20 \nkmeruniq query --query_str ACGAAACGTACATTCACACACACACATAGAGAAGGAGAGCAGCACACACA --index-path path/to/index\nkmeruniq query --query_file path/to/some/data.fa --index-path path/to/index\n```\n\nThe output is a the result of a Counter for each value.\n\n\n## The fof format\n\nA file of file. That is a line separated list of file.\n\n```\n/path/to/foo.fa\n/path/to/bar.fa\n/path/to/barrr.fa.gz\n```\n\nEach line can also specify a label for the file. Two file with the same label will be considered\nas merge. For instance:\n\n\n```\n/path/to/foo.fa ;chr1\n/path/to/bar.fa ;chr2\n/path/to/barrr.fa.gz ;chr1\n```\n\nHere the `foo.fa` and `barr.fa.gz` will be merged together within the index.\n\nRemark that file can be gzip compressed.\n\n## Usage example of the Python API:\n\nThe Kmer and DNA datatype are the one used by vizibridge.\n\n```python\nfrom kmeruniq.index import Index\nfrom vizibridge import Kmer, DNA\n\nidx = Index(\"path/to/my/index\")\n# idx is a dict-like object keyed by kmer and valued by the annotation\n\nidx[\"ACG..ACG\"] # some kmer of the appropriate size. \n\nfor kmer in idx:\n print(idx[kmer]) # print the value of each kmer\n\ndna = DNA(\"ACG....ACGT\") # long sequence of DNA from somewhere\n\nfor kmer in dna.enum_canonical_kmer(idx.k):\n print(idx.get(kmer)) # print the value associated to kmer or None if kmer not inside.\n\n```\n\n",
"bugtrack_url": null,
"license": null,
"summary": "Extract uniq kmer into an ondisk efficient datastructure and allows querying",
"version": "0.6",
"project_urls": {
"Repository": "https://gitlab.inria.fr/vizisoft/kmeruniq"
},
"split_keywords": [
"dna",
" kmer"
],
"urls": [
{
"comment_text": null,
"digests": {
"blake2b_256": "3e2af8d207e3133fc9e054bba1ca9374b8b1241bf11c7ae396690358aaffc248",
"md5": "298934660dde2e3bd33c8700458a4627",
"sha256": "7dbe88cfc1fa1cf06e6533a38d93b6d65cd830d193a6a6a243682da4d54e8617"
},
"downloads": -1,
"filename": "kmeruniq-0.6-py3-none-any.whl",
"has_sig": false,
"md5_digest": "298934660dde2e3bd33c8700458a4627",
"packagetype": "bdist_wheel",
"python_version": "py3",
"requires_python": ">=3.9",
"size": 10979,
"upload_time": "2025-07-25T21:28:06",
"upload_time_iso_8601": "2025-07-25T21:28:06.612000Z",
"url": "https://files.pythonhosted.org/packages/3e/2a/f8d207e3133fc9e054bba1ca9374b8b1241bf11c7ae396690358aaffc248/kmeruniq-0.6-py3-none-any.whl",
"yanked": false,
"yanked_reason": null
},
{
"comment_text": null,
"digests": {
"blake2b_256": "47ab4587cec256831742a400a28a0455b3604ccdf57c30ab076f8d0209a39a24",
"md5": "29243c74b0a42c580c1a691acd0e9136",
"sha256": "defe9b44ffbf828d611c4b08b228c53ed1ff9d74e26980dc87257dde97b74007"
},
"downloads": -1,
"filename": "kmeruniq-0.6.tar.gz",
"has_sig": false,
"md5_digest": "29243c74b0a42c580c1a691acd0e9136",
"packagetype": "sdist",
"python_version": "source",
"requires_python": ">=3.9",
"size": 6647,
"upload_time": "2025-07-25T21:28:08",
"upload_time_iso_8601": "2025-07-25T21:28:08.884941Z",
"url": "https://files.pythonhosted.org/packages/47/ab/4587cec256831742a400a28a0455b3604ccdf57c30ab076f8d0209a39a24/kmeruniq-0.6.tar.gz",
"yanked": false,
"yanked_reason": null
}
],
"upload_time": "2025-07-25 21:28:08",
"github": false,
"gitlab": false,
"bitbucket": false,
"codeberg": false,
"lcname": "kmeruniq"
}