helixerlite


- Name: helixerlite
- Version: 25.4.19
- Summary: helixerlite: simplified genome annotation with Helixer
- Upload time: 2025-04-20 04:43:41
- Authors: Tony Bolger <bolger.tony@gmail.com>, Jonathan Palmer <nextgenusfs@gmail.com>
- Requires Python: >=3.9.0
- Keywords: bioinformatics, genome, annotation

# Helixerlite: Simplified Gene Prediction using Helixer and HelixerPost

This is a lightweight, predict-only version of [Helixer](https://github.com/weberlab-hhu/Helixer) and [HelixerPost](https://github.com/TonyBolger/HelixerPost). Helixer is written in Python and includes many model-training utilities that end users who only want to predict genes in a genome do not need. For smaller eukaryotic genomes, prediction does not require a GPU. For an average Ascomycete fungal genome (~30 Mb), `helixerlite` should take less than 20 minutes to run.

HelixerPost is written in Rust and lives in a separate repository, which makes installing the two as a single tool cumbersome. Using `maturin` and `PyO3`, we wrap the Rust code in Python so that everything runs as a single command-line tool.

## Features

- Convert FASTA files to HDF5 format for Helixer
- Run gene prediction using a pre-trained Helixer model
- Convert predictions to GFF3 format
- Lightweight and easy to install
- No GPU required for smaller genomes
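
To judge whether a genome is in the "smaller" range where a CPU is enough, it can help to measure its total length first. The following is a minimal sketch in plain Python, not part of helixerlite; the function name `genome_size` is hypothetical, and it assumes an uncompressed FASTA file.

```python
def genome_size(fasta_path):
    """Return (number of sequences, total bases) for a plain-text FASTA file."""
    n_seqs = 0
    n_bases = 0
    with open(fasta_path) as handle:
        for line in handle:
            line = line.strip()
            if not line:
                continue  # skip blank lines
            if line.startswith(">"):
                n_seqs += 1  # header line starts a new record
            else:
                n_bases += len(line)  # sequence line
    return n_seqs, n_bases
```

For example, `seqs, bases = genome_size("genome.fasta")` followed by `print(f"{bases / 1e6:.1f} Mb")` reports the size in megabases, which can be compared against the ~30 Mb fungal genome used as a reference point above.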

## Installation

Installation can be done with `pip` or other tools able to install from PyPI, such as `uv`:

```bash
python -m pip install helixerlite
```

## Usage

### Command-line Interface

HelixerLite provides a simple command-line interface:

```bash
# Run prediction
helixerlite --fasta genome.fasta --lineage fungi --out output.gff3
```
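
When annotating several genomes from a script, the same command can be driven from Python with the standard `subprocess` module. This is a sketch under assumptions: the helper names `build_command` and `run_helixerlite` are hypothetical, and only the three flags shown above (`--fasta`, `--lineage`, `--out`) are used.

```python
import shutil
import subprocess


def build_command(fasta, lineage, out):
    """Assemble the helixerlite call using only the flags shown in the README."""
    return ["helixerlite", "--fasta", str(fasta), "--lineage", lineage, "--out", str(out)]


def run_helixerlite(fasta, lineage, out):
    """Run helixerlite as a subprocess; raise if it is missing or exits non-zero."""
    if shutil.which("helixerlite") is None:
        raise FileNotFoundError("helixerlite was not found on PATH")
    subprocess.run(build_command(fasta, lineage, out), check=True)
    return out
```

Passing the arguments as a list avoids shell quoting problems with unusual file names, and `check=True` turns a failed run into a Python exception rather than a silently missing output file.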

### Python API

You can also use HelixerLite as a Python library:

```python
from helixerlite import fasta2hdf5, preds2gff3
from helixerlite.hybrid_model import HybridModel

# Convert FASTA to HDF5
fasta2hdf5("genome.fasta", "genome.h5")

# Run prediction
model = HybridModel(["--load-model-path", "path/to/model",
                     "--test-data", "genome.h5",
                     "--prediction-output-path", "predictions.h5"])
model.run()

# Convert predictions to GFF3
preds2gff3("genome.h5", "predictions.h5", "output.gff3")
```
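
After a run, a quick sanity check is to count the feature types in the resulting GFF3 file. The sketch below relies only on the standard nine-column, tab-separated GFF3 layout; the function name `count_features` is hypothetical, and which feature types appear depends on the actual output.

```python
from collections import Counter


def count_features(gff3_path):
    """Count GFF3 records by feature type (column 3)."""
    counts = Counter()
    with open(gff3_path) as handle:
        for line in handle:
            if line.startswith("##FASTA"):
                break  # optional embedded sequence section; no more features
            if line.startswith("#") or not line.strip():
                continue  # comments, directives, blank lines
            fields = line.rstrip("\n").split("\t")
            if len(fields) == 9:
                counts[fields[2]] += 1
    return counts
```

Calling `count_features("output.gff3")` returns a `Counter` keyed by feature type; an empty result suggests the prediction or conversion step did not produce any records.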

## Requirements

- Python 3.9 or higher
- TensorFlow 2.10 or higher
- h5py
- pyfastx
- gfftk

## Development

### Setting up a development environment

```bash
# Clone the repository
git clone https://github.com/nextgenusfs/helixerlite.git
cd helixerlite

# Create a conda environment
conda create -n helixerlite python=3.10
conda activate helixerlite

# Install development dependencies
pip install -e ".[dev]"
```

### Running tests

```bash
python -m pytest
```

## Citation

If you use this repository, please cite the original Helixer authors, manuscript, and code:

Felix Holst, Anthony Bolger, Christopher Günther, Janina Maß, Sebastian Triesch, Felicitas Kindel, Niklas Kiel, Nima Saadat, Oliver Ebenhöh, Björn Usadel, Rainer Schwacke, Marie Bolger, Andreas P.M. Weber, Alisandra K. Denton. Helixer—de novo Prediction of Primary Eukaryotic Gene Models Combining Deep Learning and a Hidden Markov Model. bioRxiv 2023.02.06.527280; doi: https://doi.org/10.1101/2023.02.06.527280

            

Raw data

{
    "_id": null,
    "home_page": null,
    "name": "helixerlite",
    "maintainer": null,
    "docs_url": null,
    "requires_python": ">=3.9.0",
    "maintainer_email": null,
    "keywords": "bioinformatics, genome, annotation",
    "author": "Tony Bolger <bolger.tony@gmail.com>, Jonathan Palmer <nextgenusfs@gmail.com>",
    "author_email": "Jon Palmer <nextgenusfs@gmail.com>",
    "download_url": "https://files.pythonhosted.org/packages/95/2a/32f4962e07bd97d126724fc6f9d7101da78dd9c996f94df5fcfd1d34b2f1/helixerlite-25.4.19.tar.gz",
    "platform": null,
    "bugtrack_url": null,
    "license": null,
    "summary": "helixerlite: simplified genome annotation with Helixer",
    "version": "25.4.19",
    "project_urls": {
        "Homepage": "https://github.com/nextgenusfs/helixerlite",
        "Repository": "https://github.com/nextgenusfs/helixerlite.git"
    },
    "split_keywords": [
        "bioinformatics",
        " genome",
        " annotation"
    ],
    "urls": [
        {
            "comment_text": null,
            "digests": {
                "blake2b_256": "767cb3289a65db3e99d5a6e0225e920d98c533ba022e5ee26bd40a22e77fc00a",
                "md5": "d1993f4fcb36fb474e42cba9f7e9a66d",
                "sha256": "56f7d998130eb5f5b65520140476593f8fee1f14ce462dcfa913eaf1a6844cfc"
            },
            "downloads": -1,
            "filename": "helixerlite-25.4.19-cp310-cp310-macosx_11_0_arm64.whl",
            "has_sig": false,
            "md5_digest": "d1993f4fcb36fb474e42cba9f7e9a66d",
            "packagetype": "bdist_wheel",
            "python_version": "cp310",
            "requires_python": ">=3.9.0",
            "size": 1338307,
            "upload_time": "2025-04-20T04:43:30",
            "upload_time_iso_8601": "2025-04-20T04:43:30.503222Z",
            "url": "https://files.pythonhosted.org/packages/76/7c/b3289a65db3e99d5a6e0225e920d98c533ba022e5ee26bd40a22e77fc00a/helixerlite-25.4.19-cp310-cp310-macosx_11_0_arm64.whl",
            "yanked": false,
            "yanked_reason": null
        },
        {
            "comment_text": null,
            "digests": {
                "blake2b_256": "5dc29abe98b0e415819040ded90076bdec20687c17bc352f341c46dd24a414f2",
                "md5": "c3c85f961de755cc418237e37512ab04",
                "sha256": "42f2b2cb35480cbc1f87aac5c68970171f244395e350322a4eae0327b22a10a8"
            },
            "downloads": -1,
            "filename": "helixerlite-25.4.19-cp310-cp310-manylinux_2_17_x86_64.manylinux2014_x86_64.whl",
            "has_sig": false,
            "md5_digest": "c3c85f961de755cc418237e37512ab04",
            "packagetype": "bdist_wheel",
            "python_version": "cp310",
            "requires_python": ">=3.9.0",
            "size": 1811930,
            "upload_time": "2025-04-20T04:43:32",
            "upload_time_iso_8601": "2025-04-20T04:43:32.555726Z",
            "url": "https://files.pythonhosted.org/packages/5d/c2/9abe98b0e415819040ded90076bdec20687c17bc352f341c46dd24a414f2/helixerlite-25.4.19-cp310-cp310-manylinux_2_17_x86_64.manylinux2014_x86_64.whl",
            "yanked": false,
            "yanked_reason": null
        },
        {
            "comment_text": null,
            "digests": {
                "blake2b_256": "1abc684836b02eb6f468b81ed3257f32dd25eac7245e9e31051325586f589779",
                "md5": "3a96f709da92b766ceb49c61e666fab2",
                "sha256": "9f0646631e6fd43724e88e19be614003c9d98520db4cd04fdf8a1836c438392e"
            },
            "downloads": -1,
            "filename": "helixerlite-25.4.19-cp311-cp311-macosx_11_0_arm64.whl",
            "has_sig": false,
            "md5_digest": "3a96f709da92b766ceb49c61e666fab2",
            "packagetype": "bdist_wheel",
            "python_version": "cp311",
            "requires_python": ">=3.9.0",
            "size": 1337391,
            "upload_time": "2025-04-20T04:43:34",
            "upload_time_iso_8601": "2025-04-20T04:43:34.009018Z",
            "url": "https://files.pythonhosted.org/packages/1a/bc/684836b02eb6f468b81ed3257f32dd25eac7245e9e31051325586f589779/helixerlite-25.4.19-cp311-cp311-macosx_11_0_arm64.whl",
            "yanked": false,
            "yanked_reason": null
        },
        {
            "comment_text": null,
            "digests": {
                "blake2b_256": "e4d664129a81a293e058e20dbc39ec312bdf9414f336a18d11d802350e0c15e3",
                "md5": "8e35c10b4bd012026c14bbb97a146ac9",
                "sha256": "56611aaeec8111db2a75ebd9d04249adfb8064fe8ef0ae9992e1e19045ac9f6e"
            },
            "downloads": -1,
            "filename": "helixerlite-25.4.19-cp311-cp311-manylinux_2_17_x86_64.manylinux2014_x86_64.whl",
            "has_sig": false,
            "md5_digest": "8e35c10b4bd012026c14bbb97a146ac9",
            "packagetype": "bdist_wheel",
            "python_version": "cp311",
            "requires_python": ">=3.9.0",
            "size": 1811622,
            "upload_time": "2025-04-20T04:43:35",
            "upload_time_iso_8601": "2025-04-20T04:43:35.990075Z",
            "url": "https://files.pythonhosted.org/packages/e4/d6/64129a81a293e058e20dbc39ec312bdf9414f336a18d11d802350e0c15e3/helixerlite-25.4.19-cp311-cp311-manylinux_2_17_x86_64.manylinux2014_x86_64.whl",
            "yanked": false,
            "yanked_reason": null
        },
        {
            "comment_text": null,
            "digests": {
                "blake2b_256": "1d12ef0390c62cf91f69f20c052741d01d6c5db34c50978edc0ad4fef9e93f72",
                "md5": "a3f60e1e962c5bf657f15ccbab961844",
                "sha256": "77f7d1af3cb60927ac7863a99639c7aa3675e46a72e47b64629639778e8b8f69"
            },
            "downloads": -1,
            "filename": "helixerlite-25.4.19-cp39-cp39-macosx_11_0_arm64.whl",
            "has_sig": false,
            "md5_digest": "a3f60e1e962c5bf657f15ccbab961844",
            "packagetype": "bdist_wheel",
            "python_version": "cp39",
            "requires_python": ">=3.9.0",
            "size": 1337967,
            "upload_time": "2025-04-20T04:43:37",
            "upload_time_iso_8601": "2025-04-20T04:43:37.843504Z",
            "url": "https://files.pythonhosted.org/packages/1d/12/ef0390c62cf91f69f20c052741d01d6c5db34c50978edc0ad4fef9e93f72/helixerlite-25.4.19-cp39-cp39-macosx_11_0_arm64.whl",
            "yanked": false,
            "yanked_reason": null
        },
        {
            "comment_text": null,
            "digests": {
                "blake2b_256": "2f9f427d70bb65d19cdc5c777bd56459a68ee9aa9a6e14d9294e1085d9323303",
                "md5": "0d0cf1183fa5a3c3c7130329063e04db",
                "sha256": "d587a050d7b85c1038fbc56dac142072449eb84b759ab1e3d85b8bc0f7eeabdb"
            },
            "downloads": -1,
            "filename": "helixerlite-25.4.19-cp39-cp39-manylinux_2_17_x86_64.manylinux2014_x86_64.whl",
            "has_sig": false,
            "md5_digest": "0d0cf1183fa5a3c3c7130329063e04db",
            "packagetype": "bdist_wheel",
            "python_version": "cp39",
            "requires_python": ">=3.9.0",
            "size": 1811708,
            "upload_time": "2025-04-20T04:43:39",
            "upload_time_iso_8601": "2025-04-20T04:43:39.721387Z",
            "url": "https://files.pythonhosted.org/packages/2f/9f/427d70bb65d19cdc5c777bd56459a68ee9aa9a6e14d9294e1085d9323303/helixerlite-25.4.19-cp39-cp39-manylinux_2_17_x86_64.manylinux2014_x86_64.whl",
            "yanked": false,
            "yanked_reason": null
        },
        {
            "comment_text": null,
            "digests": {
                "blake2b_256": "952a32f4962e07bd97d126724fc6f9d7101da78dd9c996f94df5fcfd1d34b2f1",
                "md5": "f0661b8ad380782d07f2eac1e1b2cbf6",
                "sha256": "2d1df285b444548a9b6479522aa13cf66849250fc15fb194a357847143f40a9e"
            },
            "downloads": -1,
            "filename": "helixerlite-25.4.19.tar.gz",
            "has_sig": false,
            "md5_digest": "f0661b8ad380782d07f2eac1e1b2cbf6",
            "packagetype": "sdist",
            "python_version": "source",
            "requires_python": ">=3.9.0",
            "size": 109079,
            "upload_time": "2025-04-20T04:43:41",
            "upload_time_iso_8601": "2025-04-20T04:43:41.106106Z",
            "url": "https://files.pythonhosted.org/packages/95/2a/32f4962e07bd97d126724fc6f9d7101da78dd9c996f94df5fcfd1d34b2f1/helixerlite-25.4.19.tar.gz",
            "yanked": false,
            "yanked_reason": null
        }
    ],
    "upload_time": "2025-04-20 04:43:41",
    "github": true,
    "gitlab": false,
    "bitbucket": false,
    "codeberg": false,
    "github_user": "nextgenusfs",
    "github_project": "helixerlite",
    "travis_ci": false,
    "coveralls": false,
    "github_actions": true,
    "lcname": "helixerlite"
}
        