minimappers2


* Name: minimappers2
* Version: 0.1.3
* Summary: A Python wrapper for minimap2-rs
* Upload time: 2023-01-22 01:09:21
* Requires Python: >=3.7
* Keywords: minimap2, bioinformatics, alignment, mapping

Python bindings for the [Rust FFI](https://github.com/jguhlin/minimap2-rs/) to the [minimap2](https://github.com/lh3/minimap2/) library. In development! Feedback appreciated!

# Why?
[PyO3](https://github.com/PyO3/pyo3) makes it very easy to create Python libraries via Rust. Further, we can use [Polars](https://github.com/pola-rs/polars) to export results as a dataframe (which can be used as-is, or converted to Pandas). Python allows faster experimentation with novel algorithms and easier integration into machine learning pipelines, and it gives those unfamiliar with Rust or C/C++ a way to use minimap2.

# Current State
Very early alpha. Please try it out, and open an issue for any missing features you need or any bugs you find.

# How to use
## Requirements
Polars and PyArrow. Both should be installed when you install minimappers2.

## Creating an Aligner Instance
```python
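from minimappers2 import map_ont  # assumes the functions are importable from the package root
aligner = map_ont()
aligner.threads(4)  # number of threads to use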
```

If you want a full alignment performed, rather than just matches, enable `.cigar()`:
```python
aligner = map_hifi()
aligner.cigar()
```

Please note that method chaining is **NOT** supported at this time, so the following syntax does not work:
```python
aligner = map_ont().threads(4).cigar()
```
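Because chaining is not supported, each option has to be set in its own statement. A small helper can keep that in one place. The sketch below assumes only what the examples above show, namely that `.threads(n)` and `.cigar()` configure the aligner in place; `configure_aligner` is a hypothetical name and not part of minimappers2.

```python
def configure_aligner(aligner, threads=None, cigar=False):
    """Apply options one statement at a time (no method chaining).

    `aligner` is any object exposing .threads(n) and .cigar(), such as the
    one returned by map_ont() or map_hifi(). Hypothetical helper, not part
    of minimappers2.
    """
    if threads is not None:
        aligner.threads(threads)  # separate statement; return value ignored
    if cigar:
        aligner.cigar()  # request full alignment
    return aligner  # the same object that was passed in
```

With this in place, `aligner = configure_aligner(map_ont(), threads=4, cigar=True)` reads almost like the chained form while still issuing the calls separately.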

## Creating an index
```python
aligner.index("ref.fa")
```

To build an index and save it for future use:
```python
aligner.index_and_save("ref.fa", "ref.mmi")
```

The next time you need the index, loading the saved file is faster than rebuilding it:
```python
aligner.load_index("ref.mmi")
```
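The build-once, load-afterwards pattern can be wrapped in a helper. This is a minimal sketch that relies only on the `index_and_save()` and `load_index()` methods shown above; `ensure_index` and its return values are hypothetical and not part of minimappers2.

```python
from pathlib import Path


def ensure_index(aligner, reference, index_path):
    """Load a saved index if present, otherwise build and save one.

    Hypothetical helper; `aligner` must expose load_index() and
    index_and_save(). Returns "loaded" or "built".
    """
    if Path(index_path).is_file():
        aligner.load_index(str(index_path))
        return "loaded"
    if not Path(reference).is_file():
        # Fail early in Python rather than inside the native library.
        raise FileNotFoundError(f"reference not found: {reference}")
    aligner.index_and_save(str(reference), str(index_path))
    return "built"
```

Calling `ensure_index(aligner, "ref.fa", "ref.mmi")` builds the index on the first run and loads `ref.mmi` on later runs.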

## Aligning a Single Sequence
```python
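from minimappers2 import Sequence  # assumes Sequence is importable from the package root

# General form: Sequence(seq_name, seq), then aligner.map1(query)
seq = "CCAGAACGTACAAGGAAATATCCTCAAATTATCCCAAGAATTGTCCGCAGGAAATGGGGATAATTTCAGAAATGAGAG"
query = Sequence("MySeq", seq)
result = aligner.map1(query)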
```

Here, `seq_name` and `seq` are both strings. The output is a Polars DataFrame.

## Aligning Multiple Sequences
```python
seqs = [Sequence("name of seq 1", seq1),
        Sequence("name of seq 2", seq2)]
result = aligner.map(seqs)
```

# Example Notebook
Please see the [example notebook](https://github.com/jguhlin/minimap2-rs/blob/main/minimappers2/example/Exampe.ipynb) for more examples.

## Mapping a file
Please [open an issue](https://github.com/jguhlin/minimap2-rs/issues/new) if you need to map files from this API.
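Until file mapping is available, one workaround is to read the sequences in Python and pass them to the aligner yourself. The reader below is a minimal sketch for plain-text FASTA only (no gzip, no FASTQ, no validation); `read_fasta` is a hypothetical helper and not part of minimappers2.

```python
def read_fasta(path):
    """Yield (name, sequence) tuples from a plain-text FASTA file."""
    name, chunks = None, []
    with open(path) as handle:
        for raw in handle:
            line = raw.strip()
            if not line:
                continue  # skip blank lines
            if line.startswith(">"):
                if name is not None:
                    yield name, "".join(chunks)
                # Keep only the ID: the header text up to the first whitespace.
                fields = line[1:].split()
                name = fields[0] if fields else ""
                chunks = []
            else:
                chunks.append(line)
    if name is not None:
        yield name, "".join(chunks)
```

The tuples can then be wrapped as described above, for example `seqs = [Sequence(name, seq) for name, seq in read_fasta("reads.fa")]` followed by `aligner.map(seqs)`. Note that this holds every sequence in memory at once, so it suits small inputs best.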

# Results
All results are returned as [Polars](https://github.com/pola-rs/polars) dataframes. You can convert Polars dataframes to Pandas dataframes with [.to_pandas()](https://pola-rs.github.io/polars/py-polars/html/reference/dataframe/api/polars.DataFrame.to_pandas.html#polars.DataFrame.to_pandas).

* Polars is one of the fastest dataframe libraries in the Python ecosystem.
* Polars provides a nice data bridge between Rust and Python.

For more information, please see the [Polars User Guide](https://pola-rs.github.io/polars-book/user-guide/index.html) or the [Polars Guide for Pandas users](https://pola-rs.github.io/polars-book/user-guide/coming_from_pandas.html).

## Example of Results
Here is an image of the resulting dataframe:
![Resulting Dataframe Image](https://raw.githubusercontent.com/jguhlin/minimap2-rs/main/minimappers2/images/minimappers2_df.png)

**NOTE:** Mapq, Cigar, and others will not show up unless `.cigar()` is enabled on the aligner itself.
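Once CIGAR output is enabled, the strings can be post-processed in plain Python. The sketch below assumes the Cigar column holds standard SAM-style CIGAR strings such as `10M2I5M`, which this README does not confirm; `parse_cigar` and `reference_span` are hypothetical helpers.

```python
import re

# One CIGAR operation: a length followed by a SAM operation code.
_CIGAR_OP = re.compile(r"(\d+)([MIDNSHP=X])")


def parse_cigar(cigar):
    """Split a SAM-style CIGAR string into (length, operation) tuples."""
    ops = [(int(n), op) for n, op in _CIGAR_OP.findall(cigar)]
    # Reject strings containing anything the pattern did not consume.
    if "".join(f"{n}{op}" for n, op in ops) != cigar:
        raise ValueError(f"not a valid CIGAR string: {cigar!r}")
    return ops


def reference_span(cigar):
    """Number of reference bases covered (operations M, D, N, = and X)."""
    return sum(n for n, op in parse_cigar(cigar) if op in "MDN=X")
```

For example, `reference_span("10M2I3D5M")` counts the matches and the deletion but not the insertion, since insertions consume query bases only.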

# Errors
As this is a very early-stage library, error checking is not yet implemented. When things crash, you will likely need to restart your Python interpreter (or Jupyter kernel). Please [open an issue](https://github.com/jguhlin/minimap2-rs/issues/new) describing what happened, and I will get to it.
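Given the missing error checking, it can help to reject obviously malformed input in Python before it reaches the native code. The following is only a defensive sketch: `check_sequence` is a hypothetical helper, the accepted alphabet (IUPAC nucleotide codes) is an assumption, and passing this check does not guarantee that the aligner will accept the sequence.

```python
# IUPAC nucleotide codes, including U for RNA and the ambiguity codes.
IUPAC_NUCLEOTIDES = frozenset("ACGTURYSWKMBDHVN")


def check_sequence(name, seq):
    """Raise an exception for inputs that are obviously malformed.

    Hypothetical pre-check; it only catches problems that are cheap to
    detect in Python.
    """
    if not isinstance(name, str) or not isinstance(seq, str):
        raise TypeError("name and sequence must both be strings")
    if not seq:
        raise ValueError(f"sequence {name!r} is empty")
    unexpected = set(seq.upper()) - IUPAC_NUCLEOTIDES
    if unexpected:
        raise ValueError(
            f"sequence {name!r} has unexpected characters: {sorted(unexpected)}"
        )
    return name, seq
```

A Python exception raised here can be caught and handled normally, whereas a crash inside the library may take the whole interpreter down.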

## Compatibility

* Windows: Unlikely
* Linux: Likely
* Mac: Unknown

* x86_64: Likely
* aarch64: Unknown
* neon: No (Open an issue)

* Google Colab: No, not sure why though.
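To see where the current machine falls in the lists above, the standard library's `platform` module is enough. This sketch simply looks up the likelihoods stated in this README; `support_outlook` is a hypothetical helper, and treating `AMD64` as `x86_64` and `arm64` as `aarch64` is an assumption based on how different operating systems name the same architectures.

```python
import platform

# Likelihoods copied from the compatibility lists above.
OS_SUPPORT = {"Windows": "Unlikely", "Linux": "Likely", "Darwin": "Unknown"}
ARCH_SUPPORT = {
    "x86_64": "Likely",
    "amd64": "Likely",    # Windows name for x86_64
    "aarch64": "Unknown",
    "arm64": "Unknown",   # macOS name for aarch64
}


def support_outlook(system=None, machine=None):
    """Return (os_outlook, arch_outlook) for the given or current platform."""
    system = system or platform.system()  # "Darwin" means macOS
    machine = (machine or platform.machine()).lower()
    return OS_SUPPORT.get(system, "Unknown"), ARCH_SUPPORT.get(machine, "Unknown")
```

Calling `support_outlook()` with no arguments reports on the machine the code runs on; anything not listed in the README comes back as `"Unknown"`.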

# Performance
Effort has been made to make this as performant as possible, but if you need more performance, please use minimap2 directly and import the results.
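One way to import results from minimap2 itself is to run the command-line tool, which writes PAF by default (for example `minimap2 -x map-ont ref.fa reads.fq > out.paf`, with placeholder file names), and parse the output. The sketch below reads the 12 mandatory PAF columns and ignores the optional tags; `PafRecord` and `parse_paf_line` are hypothetical names and not part of minimappers2.

```python
from typing import NamedTuple


class PafRecord(NamedTuple):
    """The 12 mandatory columns of a PAF line, in file order."""
    query_name: str
    query_length: int
    query_start: int
    query_end: int
    strand: str
    target_name: str
    target_length: int
    target_start: int
    target_end: int
    matches: int
    block_length: int
    mapq: int


def parse_paf_line(line):
    """Parse one PAF line; optional trailing tags are ignored."""
    f = line.rstrip("\n").split("\t")
    if len(f) < 12:
        raise ValueError(f"expected at least 12 tab-separated fields, got {len(f)}")
    return PafRecord(f[0], int(f[1]), int(f[2]), int(f[3]), f[4],
                     f[5], int(f[6]), int(f[7]), int(f[8]),
                     int(f[9]), int(f[10]), int(f[11]))
```

A list of these records can be passed to a dataframe constructor if you want to continue working with the results in tabular form.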

# Citation
You should cite the minimap2 papers if you use this in your work.

> Li, H. (2018). Minimap2: pairwise alignment for nucleotide sequences.
> *Bioinformatics*, **34**:3094-3100. [doi:10.1093/bioinformatics/bty191](https://doi.org/10.1093/bioinformatics/bty191)

and/or:

> Li, H. (2021). New strategies to improve minimap2 alignment accuracy.
> *Bioinformatics*, **37**:4572-4574. [doi:10.1093/bioinformatics/btab705](https://doi.org/10.1093/bioinformatics/btab705)

# Changelog
## 0.1.0
* Initial Functions implemented
* Return results as Polars dfs

# Funding
![Genomics Aotearoa](https://github.com/jguhlin/minimap2-rs/blob/main/info/genomics-aotearoa.png)


            

Raw data

            {
    "_id": null,
    "home_page": null,
    "name": "minimappers2",
    "maintainer": null,
    "docs_url": null,
    "requires_python": ">=3.7",
    "maintainer_email": null,
    "keywords": "minimap2,bioinformatics,alignment,mapping",
    "author": null,
    "author_email": null,
    "download_url": "https://files.pythonhosted.org/packages/67/55/225ee844665e2f3e7bdb4a2517f05b2d0bb9c7007f1f1dd88ce5a7037637/minimappers2-0.1.3.tar.gz",
    "platform": null,
    "bugtrack_url": null,
    "license": null,
    "summary": "A Python wrapper for minimap2-rs",
    "version": "0.1.3",
    "split_keywords": [
        "minimap2",
        "bioinformatics",
        "alignment",
        "mapping"
    ],
    "urls": [
        {
            "comment_text": null,
            "digests": {
                "blake2b_256": "6cb2953f1de1412cb6e2e98f2cc97200d05455919874794a31281cf2b3fbe4d7",
                "md5": "9482d4626b68b47f83672d8c3fe10d4b",
                "sha256": "859c45a80356870d61728e5da4d908409534bb5446631ea82b07188e88f2fa5d"
            },
            "downloads": -1,
            "filename": "minimappers2-0.1.3-cp37-abi3-manylinux_2_17_x86_64.manylinux2014_x86_64.whl",
            "has_sig": false,
            "md5_digest": "9482d4626b68b47f83672d8c3fe10d4b",
            "packagetype": "bdist_wheel",
            "python_version": "cp37",
            "requires_python": ">=3.7",
            "size": 2761389,
            "upload_time": "2023-01-22T01:09:18",
            "upload_time_iso_8601": "2023-01-22T01:09:18.878165Z",
            "url": "https://files.pythonhosted.org/packages/6c/b2/953f1de1412cb6e2e98f2cc97200d05455919874794a31281cf2b3fbe4d7/minimappers2-0.1.3-cp37-abi3-manylinux_2_17_x86_64.manylinux2014_x86_64.whl",
            "yanked": false,
            "yanked_reason": null
        },
        {
            "comment_text": null,
            "digests": {
                "blake2b_256": "6755225ee844665e2f3e7bdb4a2517f05b2d0bb9c7007f1f1dd88ce5a7037637",
                "md5": "74d6347479304617faf7e69590d39150",
                "sha256": "00567c75244ad1c9d2e280eb6c46b8fcedd07460ee7f71735f0d3d61171d43c4"
            },
            "downloads": -1,
            "filename": "minimappers2-0.1.3.tar.gz",
            "has_sig": false,
            "md5_digest": "74d6347479304617faf7e69590d39150",
            "packagetype": "sdist",
            "python_version": "source",
            "requires_python": ">=3.7",
            "size": 158584,
            "upload_time": "2023-01-22T01:09:21",
            "upload_time_iso_8601": "2023-01-22T01:09:21.242527Z",
            "url": "https://files.pythonhosted.org/packages/67/55/225ee844665e2f3e7bdb4a2517f05b2d0bb9c7007f1f1dd88ce5a7037637/minimappers2-0.1.3.tar.gz",
            "yanked": false,
            "yanked_reason": null
        }
    ],
    "upload_time": "2023-01-22 01:09:21",
    "github": false,
    "gitlab": false,
    "bitbucket": false,
    "lcname": "minimappers2"
}
        