minimappers2


* Name: minimappers2
* Version: 0.1.7
* Summary: A Python wrapper for minimap2-rs
* Upload time: 2025-01-08 00:02:24
* Requires Python: >=3.7
* Keywords: minimap2, bioinformatics, alignment, mapping

Python bindings for [minimap2-rs](https://github.com/jguhlin/minimap2-rs/), the Rust FFI for the [minimap2](https://github.com/lh3/minimap2/) library. In development! Feedback appreciated!

# Why?
[PyO3](https://github.com/PyO3/pyo3) makes it very easy to create Python libraries via Rust. Further, we can use [Polars](https://github.com/pola-rs/polars) to export results as a dataframe (which can be used as-is, or converted to Pandas). Python allows for faster experimentation with novel algorithms and integration into machine learning pipelines, and it gives those unfamiliar with Rust or C/C++ a way to use minimap2.

# Current State
Very early alpha. Please try it, and open an issue for any missing features you need and for any bugs you find.

# How to use
## Requirements
Polars and PyArrow. Both should be installed when you install minimappers2.

## Creating an Aligner Instance
```python
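from minimappers2 import map_ont  # assumes the module shares the package name

aligner = map_ont()  # preset for Oxford Nanopore reads
aligner.threads(4)   # use 4 threads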
```

If you want a full alignment performed, rather than just matches, enable `.cigar()`:
```python
aligner = map_hifi()
aligner.cigar()
```

Please note that method chaining is **NOT** supported at this time, so the following syntax will not work:
```python
aligner = map_ont().threads(4).cigar()
```

## Creating an index
```python
aligner.index("ref.fa")
```

To build an index and save it for future use:
```python
aligner.index_and_save("ref.fa", "ref.mmi")
```

Next time, loading the saved index will be faster than rebuilding it from the reference:
```python
aligner.load_index("ref.mmi")
```
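
The build-once, load-afterwards pattern can be wrapped in a small helper. This is a minimal sketch, not part of minimappers2: `load_or_build_index` is a hypothetical name, and it relies only on the `index_and_save` and `load_index` methods shown above, assuming that `index_and_save` writes the index file to the path it is given.

```python
from pathlib import Path


def load_or_build_index(aligner, reference, index_path):
    """Load a saved index if it exists; otherwise build and save it.

    Hypothetical helper (not part of minimappers2). `aligner` is any object
    exposing the load_index() and index_and_save() methods shown above.
    Returns "loaded" or "built" so the caller can log what happened.
    """
    if Path(index_path).is_file():
        # A saved index is present: reuse it instead of re-indexing.
        aligner.load_index(str(index_path))
        return "loaded"
    # No saved index yet: build from the reference and save for next time.
    aligner.index_and_save(str(reference), str(index_path))
    return "built"
```

With an aligner created as above, `load_or_build_index(aligner, "ref.fa", "ref.mmi")` would build the index on the first run and reuse the saved file on later runs.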

## Aligning a Single Sequence
```python
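from minimappers2 import Sequence  # assumes the module shares the package name

# Example
seq_name = "MySeq"
seq = "CCAGAACGTACAAGGAAATATCCTCAAATTATCCCAAGAATTGTCCGCAGGAAATGGGGATAATTTCAGAAATGAGAG"

query = Sequence(seq_name, seq)
result = aligner.map1(query)  # the aligner must already have an index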
```

Where seq_name and seq are both strings. The output is a Polars DataFrame.

## Aligning Multiple Sequences
```python
# seq1 and seq2 are sequence strings
seqs = [Sequence("name of seq 1", seq1),
        Sequence("name of seq 2", seq2)]
result = aligner.map(seqs)
```

# Example Notebook
Please see the [example notebook](https://github.com/jguhlin/minimap2-rs/blob/main/minimappers2/example/Exampe.ipynb) for more examples.

## Mapping a file
Please [open an issue](https://github.com/jguhlin/minimap2-rs/issues/new) if you need to map files from this API.
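
Until file mapping is available, one possible workaround is to read the file in Python and pass the records to `aligner.map`. The following is a minimal sketch under stated assumptions: `read_fasta` is a hypothetical helper (not part of minimappers2), it handles plain uncompressed FASTA only, and the file name `reads.fa` in the usage comment is a placeholder.

```python
def read_fasta(lines):
    """Yield (name, sequence) tuples from an iterable of FASTA lines.

    Hypothetical helper, not part of minimappers2. The name is the header
    text up to the first whitespace; multi-line sequences are joined.
    """
    name, chunks = None, []
    for line in lines:
        line = line.strip()
        if not line:
            continue  # skip blank lines
        if line.startswith(">"):
            if name is not None:
                yield name, "".join(chunks)
            fields = line[1:].split()
            name, chunks = (fields[0] if fields else ""), []
        elif name is not None:
            chunks.append(line)
    if name is not None:
        yield name, "".join(chunks)


# Usage sketch (needs minimappers2 and an aligner with an index):
# with open("reads.fa") as handle:
#     seqs = [Sequence(name, seq) for name, seq in read_fasta(handle)]
# result = aligner.map(seqs)
```

This builds the whole list of sequences in memory, so for large files it may be preferable to map in batches.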

# Results
All results are returned as [Polars](https://github.com/pola-rs/polars) dataframes. You can convert Polars dataframes to Pandas dataframes with [.to_pandas()](https://pola-rs.github.io/polars/py-polars/html/reference/dataframe/api/polars.DataFrame.to_pandas.html#polars.DataFrame.to_pandas).

* Polars is among the fastest dataframe libraries in the Python ecosystem.
* Polars provides a nice data bridge between Rust and Python.

For more information, please see the [Polars User Guide](https://pola-rs.github.io/polars-book/user-guide/index.html) or the [Polars Guide for Pandas users](https://pola-rs.github.io/polars-book/user-guide/coming_from_pandas.html).

## Example of Results
Here is an image of the resulting dataframe:
![Resulting Dataframe Image](https://raw.githubusercontent.com/jguhlin/minimap2-rs/main/minimappers2/images/minimappers2_df.png)

**NOTE:** Mapq, Cigar, and some other columns will not show up unless `.cigar()` is enabled on the aligner itself.

# Errors
As this is a very early-stage library, error checking is not yet implemented. When things crash, you will likely need to restart your Python interpreter (or Jupyter kernel). Please [open an issue](https://github.com/jguhlin/minimap2-rs/issues/new) describing what happened, and I will get to it.
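
Because a crash takes the interpreter down with it, it may help to catch obviously bad inputs in Python before they reach the native code. The sketch below is illustrative only: `check_reference` and `check_query` are hypothetical helpers, the `ACGTN` alphabet is an assumption (widen it if your data uses other IUPAC codes), and which inputs actually trigger crashes is not documented here.

```python
from pathlib import Path

# Assumption: queries use the plain nucleotide alphabet, upper or lower case.
VALID_BASES = set("ACGTNacgtn")


def check_reference(path):
    """Raise FileNotFoundError if the reference or index file is missing."""
    p = Path(path)
    if not p.is_file():
        raise FileNotFoundError(f"Reference not found: {p}")
    return p


def check_query(name, seq):
    """Raise ValueError for an empty name, empty sequence, or odd characters."""
    if not name:
        raise ValueError("Sequence name must not be empty")
    if not seq:
        raise ValueError(f"Sequence {name!r} is empty")
    unexpected = set(seq) - VALID_BASES
    if unexpected:
        raise ValueError(
            f"Sequence {name!r} has unexpected characters: {sorted(unexpected)}"
        )
    return name, seq
```

Calling `check_reference("ref.fa")` before indexing and `check_query(seq_name, seq)` before building a `Sequence` turns these cases into ordinary Python exceptions.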

## Compatibility

* Linux: Yes
* Mac: Unknown
* Windows: Unlikely

* x86_64: Yes
* aarch64: Unknown (open an issue)
* neon: No (open an issue)

* Google Colab: Yes

# Performance
Effort has been made to make this as performant as possible, but if you need more performance, please use minimap2 directly and import the results.

# Citation
You should cite the minimap2 papers if you use this in your work.

> Li, H. (2018). Minimap2: pairwise alignment for nucleotide sequences.
> *Bioinformatics*, **34**:3094-3100. [doi:10.1093/bioinformatics/bty191](https://doi.org/10.1093/bioinformatics/bty191)

and/or:

> Li, H. (2021). New strategies to improve minimap2 alignment accuracy.
> *Bioinformatics*, **37**:4572-4574. [doi:10.1093/bioinformatics/btab705](https://doi.org/10.1093/bioinformatics/btab705)

# Changelog
## 0.1.5 
* Updated minimap2-rs, polars, pyo3 deps
* Add new presets

## 0.1.4 
* Update pyo3, polars, minimap2-rs, and mimalloc deps

## 0.1.1
* Update pyo3 and polars deps
* Add with_seq for indexing TODO

## 0.1.0
* Initial Functions implemented
* Return results as Polars dfs

# Funding
![Genomics Aotearoa](https://github.com/jguhlin/minimap2-rs/blob/main/info/genomics-aotearoa.png)


            
