sequana-pipetools


:Name: sequana-pipetools
:Version: 1.1.1
:Home page: https://github.com/sequana/sequana_pipetools
:Summary: A set of tools to help building or using Sequana pipelines
:Upload time: 2024-11-21 22:38:49
:Author: Sequana Team
:Requires Python: <4.0,>=3.8
:License: BSD-3
:Keywords: snakemake, sequana, pipelines

.. image:: https://badge.fury.io/py/sequana-pipetools.svg
    :target: https://pypi.python.org/pypi/sequana_pipetools

.. image:: https://github.com/sequana/sequana_pipetools/actions/workflows/main.yml/badge.svg?branch=main
    :target: https://github.com/sequana/sequana_pipetools/actions/workflows/main.yml

.. image:: https://coveralls.io/repos/github/sequana/sequana_pipetools/badge.svg?branch=main
    :target: https://coveralls.io/github/sequana/sequana_pipetools?branch=main

.. image:: https://readthedocs.org/projects/sequana-pipetools/badge/?version=latest
    :target: https://sequana-pipetools.readthedocs.io/en/latest/?badge=latest
    :alt: Documentation Status

.. image:: https://app.codacy.com/project/badge/Grade/9031e4e4213e4e57a876fd5b792b5003
   :target: https://app.codacy.com/gh/sequana/sequana_pipetools/dashboard?utm_source=gh&utm_medium=referral&utm_content=&utm_campaign=Badge_grade

.. image:: http://joss.theoj.org/papers/10.21105/joss.00352/status.svg
   :target: http://joss.theoj.org/papers/10.21105/joss.00352
   :alt: JOSS (journal of open source software) DOI

:Overview: A set of tools to help build or use Sequana pipelines
:Status: Production
:Issues: Please file a report on `github <https://github.com/sequana/sequana_pipetools/issues>`__
:Python version: Python 3.8, 3.9, 3.10, 3.11
:Citation: Cokelaer et al. (2017), ‘Sequana’: a Set of Snakemake NGS pipelines, Journal of Open Source Software, 2(16), 352,  `JOSS DOI doi:10.21105/joss.00352 <http://www.doi2bib.org/bib/10.21105%2Fjoss.00352>`_

Installation
============

From PyPI::

    pip install sequana_pipetools

This package has no dependencies other than Python itself. In practice, it
is of little use outside a Sequana pipeline. It is installed automatically when you install
a Sequana pipeline. For example::

    pip install sequana_rnaseq
    pip install sequana_fastqc

See `Sequana <https://sequana.readthedocs.io>`_ for a list of pipelines ready for production.


Targeted audience
=================

This package is intended for `Sequana <https://sequana.readthedocs.io>`_ developers seeking to integrate Snakemake pipelines into the Sequana project. See below for more information. As a developer, you can also generate the reference documentation using Sphinx::

    make html
    browse build/html/index.html


What is sequana_pipetools?
==========================

**sequana_pipetools** is a collection of tools designed to facilitate the management of `Sequana <https://sequana.readthedocs.io>`_ pipelines, which include next-generation sequencing (NGS) pipelines such as RNA-seq, variant calling, ChIP-seq, and others.

The aim of this package is to streamline the deployment of `Sequana pipelines <https://sequana.readthedocs.io>`_ by
creating a pure Python library that includes commonly used tools for various pipelines.

Previously, the Sequana framework incorporated all bioinformatics, Snakemake rules,
pipelines, and pipeline management tools into a single library (Sequana) as illustrated
in **Fig 1** below.

.. figure:: https://raw.githubusercontent.com/sequana/sequana_pipetools/main/doc/veryold.png

    **Figure 1** Old Sequana framework with all pipelines and the Sequana library in the same
    place, including pipetools (this library).

Despite maintaining an 80% test coverage, whenever changes were introduced to the Sequana library, a comprehensive examination of the entire library was imperative. The complexity escalated further when incorporating new pipelines or dependencies. To address this challenge, we initially designed all pipelines to operate independently, as depicted in **Fig. 2**. This approach allowed modifications to pipelines without necessitating updates to Sequana and vice versa, resulting in a significant improvement.


.. figure:: https://raw.githubusercontent.com/sequana/sequana_pipetools/main/doc/old.png

    **Figure 2** v0.8 of Sequana moved the Snakemake pipelines into independent
    repositories. A `cookie cutter <https://github.com/sequana/sequana_pipeline_template>`_
    eases the creation of such pipelines.


Nevertheless, certain tools, including those utilized for user interface and input data sanity checks, were essential for all pipelines, as illustrated by the pipetools box in the figure. With the continuous addition of new pipelines each month, our goal was to enhance the modularity of both the pipelines and Sequana. As a result, we developed a pure Python library named sequana_pipetools, depicted in **Fig. 3**, to further empower the autonomy of the pipelines.



.. figure:: https://raw.githubusercontent.com/sequana/sequana_pipetools/main/doc/new.png

    **Figure 3** New Sequana framework. The new Sequana framework comprises the core library
    and bioinformatics tools, which are now separate from the pipelines. Moreover, the
    sequana_pipetools library provides essential tools for the creation and management
    of all pipelines, including a shared parser for options

As a final step, we separated the rules originally available in Sequana to create an independent package featuring a collection of Snakemake wrappers. These wrappers can be accessed at https://github.com/sequana/sequana-wrappers and offer the added benefit of being rigorously tested through continuous integration.

.. figure:: https://raw.githubusercontent.com/sequana/sequana_pipetools/main/doc/wrappers.png

    **Figure 4** New Sequana framework 2021. The library itself, with the core and the
    bioinformatics tools, is now fully independent of the pipelines.



Quick tour of the standalone
============================

The **sequana_pipetools** package provides a standalone application called **sequana_pipetools**. Here is a snapshot of the user interface:

.. figure:: https://raw.githubusercontent.com/sequana/sequana_pipetools/main/doc/UI.png

There are several applications. The first one is for Linux users under
bash, to obtain completion of a Sequana pipeline's command-line arguments::

    sequana_pipetools --completion fastqc

The second introspects SLURM log files and prints a summary of their
content::

    sequana_pipetools --slurm-diag

It searches for files matching the pattern **slurm** in the current directory and for slurm files in the ./logs directory.
This is used within the pipelines but can also be run manually to get a quick summary of common errors found in slurm files.
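
The file discovery described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the code used by ``sequana_pipetools --slurm-diag``: the helper name ``find_slurm_logs`` is hypothetical, and the recursive search inside ``logs/`` is an assumption.

```python
from pathlib import Path


def find_slurm_logs(workdir="."):
    """Return files whose name contains 'slurm' in workdir and in workdir/logs.

    Hypothetical helper: mimics the search described in the text, not the
    actual implementation of the --slurm-diag option.
    """
    root = Path(workdir)
    # files such as slurm-1234.out directly in the working directory
    candidates = [p for p in root.glob("*slurm*") if p.is_file()]
    logs = root / "logs"
    if logs.is_dir():
        # assumption: logs/ may contain one sub-directory per rule
        candidates += [p for p in logs.rglob("*slurm*") if p.is_file()]
    return sorted(candidates)
```

Calling ``find_slurm_logs("fastqc")`` on a pipeline working directory would return the list of candidate log files as ``pathlib.Path`` objects, which can then be opened and scanned for error messages.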

The following command provides statistics about Sequana pipelines installed on your system (number of rules, wrappers
used)::

    sequana_pipetools --stats

For developers, the following command quickly creates a schema file from a config file (experimental: developers still need to edit the schema, but it does 90% of the job)::

    sequana_pipetools --config-to-schema config.yaml > schema.yaml

You can also convert a dot file into a PNG file using::

    sequana_pipetools --dot2png dag.dot


For Sequana developers
======================

The library is intended to help Sequana developers design their pipelines.
See the `Sequana organization repository for examples <https://github.com/sequana>`_.
Beyond the standalone shown above, the main goal of **sequana_pipetools** is to provide utilities for Sequana developers.

First, let us create a pipeline.

Initiate a project (Sequana pipeline) with cookiecutter
-------------------------------------------------------

You can start a Sequana pipeline skeleton as follows::

    sequana_pipetools --init-new-pipeline

and then follow the instructions. You will be asked some questions such as the name of your pipeline (e.g., variant), a description, keywords and the *project_slug* (just press enter).

Update the main script
-----------------------

Go to sequana_pipelines/NAME and look at the main.py script.

We currently provide a set of Options classes that should be used to
design the API of your pipelines. For example,
sequana_pipetools.options.ClickSlurmOptions can be used as follows inside a standard
Python module (the calls to the manager inside ``main`` are where the magic happens)::


    import sys

    import rich_click as click
    from sequana_pipetools.options import *
    from sequana_pipetools import SequanaManager

    NAME = "fastqc"
    help = init_click(
        NAME,
        groups={
            "Pipeline Specific": ["--method", "--skip-multiqc"],
        },
    )

    @click.command(context_settings=help)
    @include_options_from(ClickSnakemakeOptions, working_directory=NAME)
    @include_options_from(ClickSlurmOptions)
    @include_options_from(ClickInputOptions, add_input_readtag=False)
    @include_options_from(ClickGeneralOptions)
    @click.option("--method", default="fastqc", type=click.Choice(["fastqc", "falco"]), help="your msg")
    def main(**options):

        # the real stuff is here
        manager = SequanaManager(options, NAME)
        manager.setup()

        # just two aliases
        options = manager.options
        cfg = manager.config.config

        # fills input_data, input_directory, input_readtag
        manager.fill_data_options()

        # fill specific options.
        # create a function for a given option (here --method)
        def fill_method():
            # any extra sanity checks
            cfg["method"] = options["method"]

        if options["from-project"]:
            # with --from-project, the config is already pre-filled, so we
            # only fill the method if --method was provided explicitly
            if "--method" in sys.argv:
                fill_method()
        else:
            # in normal mode, we always want to fill the user-provided option
            fill_method()

        # finalise the command and save it; copy the snakemake. update the config
        # file and save it.
        manager.teardown()

    if __name__ == "__main__":
        main()



Developers should look at the sequana_pipetools.options module
for the API reference, and at one of the official Sequana pipelines (e.g.,
https://github.com/sequana/sequana_variant_calling) for worked examples.

The Options classes provided can be used and combined to design pipelines.
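
The ``--from-project`` logic used in the script above can be isolated in a small pure function, which makes it easier to test. The sketch below is illustrative only: ``should_override`` is a hypothetical helper, not part of sequana_pipetools. Unlike the plain ``"--method" in sys.argv`` test, it also recognises the ``--method=value`` spelling.

```python
def should_override(flag, argv, from_project):
    """Tell whether a command-line value should overwrite the config entry.

    Hypothetical helper illustrating the logic of the main script above.
    """
    if not from_project:
        # normal mode: the user-provided (or default) value always wins
        return True
    # --from-project mode: the config is pre-filled, so only override when
    # the flag was typed explicitly, as "--flag value" or "--flag=value"
    return any(arg == flag or arg.startswith(flag + "=") for arg in argv)
```

For instance, ``should_override("--method", sys.argv, bool(options["from-project"]))`` could guard the call to ``fill_method()``.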


How to use sequana pipetools within your Pipeline
--------------------------------------------------

For FastQ files (paired or not), the config file should look like::

    sequana_wrappers: "v0.15.1"

    input_directory: "."
    input_readtag: "_R[12]_"
    input_pattern: "*fastq.gz"


    apptainers:
        fastqc: "https://zenodo.org/record/7923780/files/fastqc_0.12.1.img"

    section1:
        key1: value1
        key2: value2

And your pipeline could make use of this as follows::

    configfile: "config.yaml"

    from sequana_pipetools import PipelineManager
    manager = PipelineManager("fastqc", config)

    # you can then figure out whether it is paired or not:
    manager.paired

    # get the wrapper version to be used within a rule:
    manager.wrappers

    # the raw data (with a wild card) for the first rule
    manager.getrawdata()

    # add a Makefile to clean things at the end
    manager.teardown()
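
To see how a read tag such as ``_R[12]_`` can be used to tell paired from single-end data, here is a minimal sketch. It is not the implementation used by ``PipelineManager``; the helper names ``group_by_sample`` and ``is_paired`` are hypothetical, and taking the text before the read tag as the sample name is an assumption.

```python
import re
from collections import defaultdict


def group_by_sample(filenames, readtag="_R[12]_"):
    """Group FastQ file names by sample, using the read tag as a regex."""
    pattern = re.compile(readtag)
    samples = defaultdict(list)
    for name in sorted(filenames):
        match = pattern.search(name)
        if match is None:
            # files without a read tag are ignored in this sketch
            continue
        # assumption: the sample name is everything before the read tag
        samples[name[: match.start()]].append(name)
    return dict(samples)


def is_paired(samples):
    """Data are considered paired if every sample has exactly two files."""
    return bool(samples) and all(len(files) == 2 for files in samples.values())
```

With files ``A_R1_001.fastq.gz`` and ``A_R2_001.fastq.gz``, the two names are grouped under sample ``A`` and ``is_paired`` returns ``True``.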


Setting up and Running Sequana pipelines
-----------------------------------------


When you execute a sequana pipeline, e.g.::

    sequana_fastqc --input-directory data

a working directory is created (named after the pipeline; here fastqc). The working directory
contains a shell script that wraps the snakemake command. This snakemake command makes use
of the Sequana wrappers and uses the official Sequana GitHub repository by default
(https://github.com/sequana/sequana-wrappers). This can be overridden; for instance, you may use a local clone. To do
so, set an environment variable::

    export SEQUANA_WRAPPERS="git+file:///home/user/github/sequana-wrappers"

If you decide to use singularity/apptainer, one common error on a cluster is that non-standard paths are not found. You can bind them using the -B option but a more general set up is to create this environment variable::

    export SINGULARITY_BINDPATH="/path_to_bind"

For an Apptainer setup::

    export APPTAINER_BINDPATH="/path_to_bind"
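
The override mechanism described above boils down to reading an environment variable with a fallback value. The sketch below illustrates the idea only; ``wrappers_location`` is a hypothetical helper and the exact default used by sequana_pipetools is an assumption taken from the text.

```python
import os

# assumption: default location as stated in the text above
DEFAULT_WRAPPERS = "https://github.com/sequana/sequana-wrappers"


def wrappers_location(environ=None):
    """Return the wrappers location, honouring SEQUANA_WRAPPERS if set."""
    environ = os.environ if environ is None else environ
    return environ.get("SEQUANA_WRAPPERS", DEFAULT_WRAPPERS)
```

Passing a plain dictionary as ``environ`` makes the function easy to test without touching the real environment.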



What is Sequana?
================

**Sequana** is a versatile tool that provides

#. A Python library dedicated to NGS analysis (e.g., tools to visualise standard NGS formats).
#. A set of Pipelines dedicated to NGS in the form of Snakefiles
   (Makefile-like with Python syntax based on snakemake framework) with more
   common wrappers.
#. Standalone applications such as sequana_coverage and sequana_taxonomy.

See the `sequana home page <https://sequana.readthedocs.io>`_ for details.


To join the project, please let us know on `github <https://github.com/sequana/sequana/issues/306>`_.



Changelog
=========

========= ======================================================================
Version   Description
========= ======================================================================
1.1.1     * symlink creation on apptainers skipped if permission error (file
            is probably already present and created by another user, e.g.
            the system admin)
          * add --init-new-pipeline argument in sequana_pipetools standalone
1.1.0     * add exclude_pattern in input data section
1.0.6     * add py3.12, slight updates wrt slurm
1.0.5     * introspect slurm files to extract stats
1.0.4     * add utility function to download and untar a tar.gz file
1.0.3     * add levenshtein function. some typo corrections.
1.0.2     * add the dot2png command. pin docutils <0.21 due to pip error
1.0.1     * hot fix in the profile creation (regression)
1.0.0     * Stable release
0.17.3    * remove useless code and fix a requirement
0.17.2    * simpler logging
0.17.1    * remove the --use-singularity (replaced by --use-apptainer in
            previous release)
          * slight updates on logging and slight update on slurm module
0.17.0    * Remove deprecated options and deprecated functions. More tests.
0.16.9    * Fix slurm sys exit (replaced by print)
          * update doc
          * more tests
0.16.8    * stats command add the number of rules per pipeline
          * better slurm parsing using profile tree directory (slurm in logs/)
0.16.7    * add missing --trimming-quality option in list of TrimmingOption
          * set default to cutadapt if no fastp available
          * better UI for the completion script.
0.16.6    * Set default value for the option trimming to 20
          * Fix issue https://github.com/sequana/sequana_pipetools/issues/85
0.16.5    * merge completion standalone into main sequana_pipetools application
          * add application to create schema given a config file
          * add application to get basic stats about the pipelines
          * add precommit and applied black/isort on all files
          * remove some useless code
          * update completion to use click instead of argparse
          * Rename Module into Pipeline (remove rules so Module are only made
            of pipelines hence the renaming)
0.16.4    * fix Trimming options (click) for the quality option
0.16.3    * add class to handle multiplex entry for click.option (useful for
            multitax multiple databases)
0.16.2    * remove useless function get_pipeline_location, get_package_location
            guess_scheduler from sequana_manager (not used)
          * store sequana version correctly in info.txt Fixing #89
          * sort click options alphabetically
          * --from-project not functional (example in multitax pipeline)
          * Click checks that input-directory is indeed a directory
0.16.1    * Fix/rename error_report into onerror to be included in the Snakemake
            onerror section. added *slurm* in slurm output log file in the
            profile
0.16.0    * scripts now use click instead of argparse
          * All Options classes have now an equivalent using click.
            For example GeneralOptions has a class ClickGeneralOptions.
            The GeneralOptions is kept for now for back compatibility
          * --run-mode removed and replaced by --profile options. Profiles are
            used and stored within .sequana/profiles
          * Remove --slurm-cores-per-job redundant with resources from snakemake
          * Way a main.py is coded fully refactored and simplified as described
            in the README
          * cluster_config are now deprecated in favor of profile
          * sequana_slurm_status removed. Use manager.error_report in pipelines
            instead
0.15.0    * remove useless code (readme, description) related to old rules
          * requirements.txt renamed in tools.txt to store the required tools to
            run a pipeline.
          * remove copy_requirements, not used in any pipelines (replaced by code
            in main.py of the pipelines)
          * a utility function called getmetadata that returns dictionary
            with name, version, wrappers version)
0.14.X    * Module now returns the list of requirements. SequanaManager
            creates a txt file with all standalones from the requirements.
0.13.0    * switch to pyproject and fixes #64
0.12.X    * automatically populate *wrappers* in PipelineManager based on the
            config entry *sequana_wrappers*.
          * Fix the singularity arguments by (i) adding -e and (ii) bind the
            /home. Indeed, snakemake sets --home to the current directory.
            Somehow the /home is lost. Removed deprecated function
          * factorise hash function to have url2hash easily accessible
          * remove hardcoded bind path for apptainer. Uses env variable instead
0.11.X    * fix regression, add codacy badge, applied black, remove
            init_pipeline deprecated function.
0.10.X    * Fixes https://github.com/sequana/sequana_pipetools/issues/49
            that properly sets the apptainer prefix in default mode
0.9.X     * replaced singularity word by apptainer (--use-apptainer instead of
            --use-singularity)
          * add config2schema utility function for developers
          * Ability to download automatically singularity images (as URLs) if
            set in the  pipelines (container field). add the --use-singularity
            option in all pipelines (and --singularity-prefix)
0.9.0     * **MAJOR update/Aug 2022**
          * new mechanism to handle  profile for Snakemake that will replace the
            cluster_config.yaml files
          * Major cleanup of PipelineManager (PipelineManagerGeneric was
            removed). The way input files are handled was also cleaned up.
            Fixes https://github.com/sequana/sequana_pipetools/issues/37
            and also files starting with common prefixes
0.8.X     * Better schema validation. switch from distutils to packaging
0.7.X     * simplify the setup() method in pipeline manager
            can set a SEQUANA_WRAPPERS env variable to use local wrappers
            add schema pipeline manager directory & fix attrdict error with yaml
          * Set the --wrapper-prefix to point to the  sequana-wrappers github
0.6.X     * Fix SequanaConfig file to include wrapper
            and take new snakemake syntax into account. update schema handling
          * Move all modules related to pipelines from sequana into
            sequana_pipetools
0.5.X     * feature removed in sequana to deal with adapter removal and
            changes updated in the package (removed the *design* option
            from the cutadapt rules and needed); add TrimmingOptions.
0.4.X     * add FeatureCounts options and slurm status utility
0.4.0     * stable version
0.3.X     * first stable release
0.2.X     * completion can now handle multiple directories/files properly
            better doc and more tests; add --from-project option to import
            existing config file; remove --paired-data option; add content
            from sequana.pipeline_common
0.1.X     * software creation
========= ======================================================================


            

Raw data

            {
    "_id": null,
    "home_page": "https://github.com/sequana/sequana_pipetools",
    "name": "sequana-pipetools",
    "maintainer": null,
    "docs_url": null,
    "requires_python": "<4.0,>=3.8",
    "maintainer_email": null,
    "keywords": "snakemake, sequana, pipelines",
    "author": "Sequana Team",
    "author_email": null,
    "download_url": "https://files.pythonhosted.org/packages/50/98/cf61066e8738434a12d8fc08855d3c9ffea40e5a5a09d413573f2030cf7f/sequana_pipetools-1.1.1.tar.gz",
    "platform": null,
    "description": "\n\n.. image:: https://badge.fury.io/py/sequana-pipetools.svg\n    :target: https://pypi.python.org/pypi/sequana_pipetools\n\n.. image:: https://github.com/sequana/sequana_pipetools/actions/workflows/main.yml/badge.svg?branch=main\n    :target: https://github.com/sequana/sequana_pipetools/actions/workflows/main.yml\n\n.. image:: https://coveralls.io/repos/github/sequana/sequana_pipetools/badge.svg?branch=main\n    :target: https://coveralls.io/github/sequana/sequana_pipetools?branch=main\n\n.. image:: https://readthedocs.org/projects/sequana-pipetools/badge/?version=latest\n    :target: https://sequana-pipetools.readthedocs.io/en/latest/?badge=latest\n    :alt: Documentation Status\n\n.. image:: https://app.codacy.com/project/badge/Grade/9031e4e4213e4e57a876fd5b792b5003\n   :target: https://app.codacy.com/gh/sequana/sequana_pipetools/dashboard?utm_source=gh&utm_medium=referral&utm_content=&utm_campaign=Badge_grade\n\n.. image:: http://joss.theoj.org/papers/10.21105/joss.00352/status.svg\n   :target: http://joss.theoj.org/papers/10.21105/joss.00352\n   :alt: JOSS (journal of open source software) DOI\n\n:Overview: A set of tools to help building or using Sequana pipelines\n:Status: Production\n:Issues: Please fill a report on `github <https://github.com/sequana/sequana_pipetools/issues>`__\n:Python version: Python 3.8, 3.9, 3.10, 3.11\n:Citation: Cokelaer et al, (2017), \u2018Sequana\u2019: a Set of Snakemake NGS pipelines, Journal of Open Source Software, 2(16), 352,  `JOSS DOI doi:10.21105/joss.00352 <http://www.doi2bib.org/bib/10.21105%2Fjoss.00352>`_\n\nInstallation\n============\n\nfrom pypi website::\n\n    pip install sequana_pipetools\n\nNo dependencies for this package except Python itself. In practice, this package\nhas no interest if not used within a Sequana pipeline. It is installed automatically when you install\na Sequana pipelines. 
For example::\n\n    pip install sequana_rnaseq\n    pip install sequana_fastqc\n\nSee `Sequana <https://sequana.readthedocs.io>`_ for a list of pipelines ready for production.\n\n\nTargetted audience\n==================\n\nThis package is intended for `Sequana <https://sequana.readthedocs.io>`_ developers seeking to integrate Snakemake pipelines into the Sequana project. Please refer below for more information. Additionally, note that as a developer, you can generate the reference documentation using Sphinx::\n\n    make html\n    browse build/html/index.html\n\n\nWhat is sequana_pipetools ?\n============================\n\n**sequana_pipetools** is a collection of tools designed to facilitate the management of `Sequana <https://sequana.readthedocs.io>`_ pipelines, which includes next-generation sequencing (NGS) pipelines like RNA-seq, variant calling, ChIP-seq, and others.\n\nThe aim of this package is to streamline the deployment of `Sequana pipelines <https://sequana.readthedocs.io>`_ by\ncreating a pure Python library that includes commonly used tools for various pipelines.\n\nPreviously, the Sequana framework incorporated all bioinformatics, Snakemake rules,\npipelines, and pipeline management tools into a single library (Sequana) as illustrated\nin **Fig 1** below.\n\n.. figure:: https://raw.githubusercontent.com/sequana/sequana_pipetools/main/doc/veryold.png\n\n    **Figure 1** Old Sequana framework will all pipelines and Sequana library in the same\n    place including pipetools (this library).\n\nDespite maintaining an 80% test coverage, whenever changes were introduced to the Sequana library, a comprehensive examination of the entire library was imperative. The complexity escalated further when incorporating new pipelines or dependencies. To address this challenge, we initially designed all pipelines to operate independently, as depicted in **Fig. 2**. 
This approach allowed modifications to pipelines without necessitating updates to Sequana and vice versa, resulting in a significant improvement.\n\n\n.. figure:: https://raw.githubusercontent.com/sequana/sequana_pipetools/main/doc/old.png\n\n    **Figure 2** v0.8 of Sequana moved the Snakemake pipelines in independent\n    repositories. A `cookie cutter <https://github.com/sequana/sequana_pipeline_template>`_\n    ease the creation of such pipelines\n\n\nNevertheless, certain tools, including those utilized for user interface and input data sanity checks, were essential for all pipelines, as illustrated by the pipetools box in the figure. With the continuous addition of new pipelines each month, our goal was to enhance the modularity of both the pipelines and Sequana. As a result, we developed a pure Python library named sequana_pipetools, depicted in **Fig. 3**, to further empower the autonomy of the pipelines.\n\n\n\n.. figure:: https://raw.githubusercontent.com/sequana/sequana_pipetools/main/doc/new.png\n\n    **Figure 3** New Sequana framework. The new Sequana framework comprises the core library\n    and bioinformatics tools, which are now separate from the pipelines. Moreover, the\n    sequana_pipetools library provides essential tools for the creation and management\n    of all pipelines, including a shared parser for options\n\nAs a final step, we separated the rules originally available in Sequana to create an independent package featuring a collection of Snakemake wrappers. These wrappers can be accessed at https://github.com/sequana/sequana-wrappers and offer the added benefit of being rigorously tested through continuous integration.\n\n.. figure:: https://raw.githubusercontent.com/sequana/sequana_pipetools/main/doc/wrappers.png\n\n    **Figure 3** New Sequana framework 2021. 
The library itself with the core, the\n    bioinformatics tools is now fully independent of the pipelines.\n\n\n\nQuick tour of the standalone\n============================\n\nThe **sequana_pipetools** package provide a standalone called **sequana_pipetools**. Here is a snapshot of the user interface:\n\n.. figure:: https://raw.githubusercontent.com/sequana/sequana_pipetools/main/doc/UI.png\n\nThere are several applications. The first one is for Linux users under\nbash to obtain completion of a sequana pipeline command line arguments::\n\n    sequana_pipetools --completion fastqc\n\nThe second is used to introspect slurm files to get a summary of the SLURM log\nfiles::\n\n    sequana_pipetools --slurm-diag\n\nIt searches for files with pattern **slurm** in the current directory and slurm files in the ./logs directory.\nThis is used within th pipeline but can be used manually as well and is useful to get a quick summary of common errors found in slurm files.\n\nThe following command provides statistics about Sequana pipelines installed on your system (number of rules, wrappers\nused)::\n\n    sequana_pipetools --stats\n\nAnd for developpers, a quick creation of schema file given a config file (experimental, developers would still need to edit the schema but it does 90% of the job)::\n\n    sequana_pipetools --config-to-schema config.yaml > schema.yaml\n\nYou can also convert the dot file into a nice PNG file using::\n\n    sequana_pipetools --dot2png dag.dot\n\n\nFor Sequana developers\n======================\n\nThe library is intended to help Sequana developers to design their pipelines.\nSee the `Sequana organization repository for examples <https://github.com/sequana>`_.\nIn addition to the standalone shown above, **sequana_pipetools** main goal is to provide utilities to help Sequana developers.\n\nFirst, let us create a pipeline\n\nInitiate a project (Sequana pipeline) with cookiecutter\n-------------------------------------------------------\n\nYou can start a 
Sequana pipeline skeleton as follows::\n\n    sequana_pipetools --init-new-pipeline\n\nand then follow the instructions. You will be asked some questions such as the name of your pipeline (eg. variant), a description, keywords and the *project_slug* (just press enter).\n\nUpdate the main script\n-----------------------\n\nGo to sequana_pipelines/NAME and look at the main.py script.\n\nWe currently provide a set of Options classes that should be used to\ndesign the API of your pipelines. For example, the\nsequana_pipetools.options.SlurmOptions can be used as follows inside a standard\nPython module (the last two lines is where the magic happens)::\n\n\n    import rich_click as click\n    from sequana_pipetools.options import *\n    from sequana_pipetools import SequanaManager\n\n    NAME = \"fastqc\"\n    help = init_click(NAME, groups={\n        \"Pipeline Specific\": [\n            \"--method\", \"--skip-multiqc\"],\n            }\n    )\n\n    @click.command(context_settings=help)\n    @include_options_from(ClickSnakemakeOptions, working_directory=NAME)\n    @include_options_from(ClickSlurmOptions)\n    @include_options_from(ClickInputOptions, add_input_readtag=False)\n    @include_options_from(ClickGeneralOptions)\n    @click.option(\"--method\", default=\"fastqc\", type=click.Choice([\"fastqc\", \"falco\"]), help=\"your msg\")\n    def main(**options):\n\n        # the real stuff is here\n        manager = SequanaManager(options, NAME)\n        manager.setup()\n\n        # just two aliases\n        options = manager.options\n        cfg = manager.config.config\n\n\n        # fills input_data, input_directory, input_readtag\n        manager.fill_data_options()\n\n        # fill specific options.\n        # create a function for a given option (here --method)\n        def fill_method():\n            # any extra sanity checks\n            cfg[\"method\"] = options[\"method\"]\n\n        if options[\"from-project\"]:\n            # in --from-project, we fill the 
method is --method is provided only (since already pre-filled)\n            if \"--method\" in sys.argv\n                fill_method()\n        else:\n            # in normal, we always want to fill the user-provided option\n            fill_method()\n\n        # finalise the command and save it; copy the snakemake. update the config\n        # file and save it.\n        manager.teardown()\n\n    if __name__ == \"__main__\":\n        main()\n\n\n\nDevelopers should look at e.g. module sequana_pipetools.options\nfor the API reference and one of the official sequana pipeline (e.g.,\nhttps://github.com/sequana/sequana_variant_calling) to get help from examples.\n\nThe Options classes provided can be used and combined to design pipelines.\n\n\nHow to use sequana pipetools within your Pipeline\n--------------------------------------------------\n\nFor FastQ files (paired ot not), The config file should look like::\n\n    sequana_wrappers: \"v0.15.1\"\n\n    input_directory: \".\"\n    input_readtag: \"_R[12]_\"\n    input_pattern: \"*fastq.gz\"\n\n\n    apptainers:\n        fastqc: \"https://zenodo.org/record/7923780/files/fastqc_0.12.1.img\"\n\n    section1:\n        key1: value1\n        key2: value2\n\nAnd your pipeline could make use of this as follows::\n\n    configfile: \"config.yaml\"\n\n    from sequana_pipetools import PipelineManager\n    manager = PipelineManager(\"fastqc\", config)\n\n    # you can then figure out wheter it is paired or not:\n    manager.paired\n\n    # get the wrapper version to be used within a rule:\n    manager.wrappers\n\n    # the raw data (with a wild card) for the first rule\n    manager.getrawdata()\n\n    # add a Makefile to clean things at the end\n    manager.teardown()\n\n\nSetting up and Running Sequana pipelines\n-----------------------------------------\n\n\nWhen you execute a sequana pipeline, e.g.::\n\n    sequana_fastqc --input-directory data\n\na working directory is created (with the name of the pipeline; here fastqc). 
Moreover, the working directory\ncontains a shell script that wraps the snakemake command. This snakemake command will make use\nof the sequana wrappers and will use the official sequana github repository by default\n(https://github.com/sequana/sequana-wrappers). This may be overridden. For instance, you may use a local clone. To do\nso, you will need to create an environment variable::\n\n    export SEQUANA_WRAPPERS=\"git+file:///home/user/github/sequana-wrappers\"\n\nIf you decide to use singularity/apptainer, one common error on a cluster is that non-standard paths are not found. You can bind them using the -B option, but a more general setup is to create this environment variable::\n\n    export SINGULARITY_BINDPATH=\"/path_to_bind\"\n\nor, for an Apptainer setup::\n\n    export APPTAINER_BINDPATH=\"/path_to_bind\"\n\n\n\nWhat is Sequana ?\n=================\n\n**Sequana** is a versatile tool that provides\n\n#. A Python library dedicated to NGS analysis (e.g., tools to visualise standard NGS formats).\n#. A set of Pipelines dedicated to NGS in the form of Snakefiles\n   (Makefile-like with Python syntax based on snakemake framework) with more\n   common wrappers.\n#. 
Standalone applications such as sequana_coverage and sequana_taxonomy.\n\nSee the `sequana home page <https://sequana.readthedocs.io>`_ for details.\n\n\nTo join the project, please let us know on `github <https://github.com/sequana/sequana/issues/306>`_.\n\n\n\nChangelog\n=========\n\n========= ======================================================================\nVersion   Description\n========= ======================================================================\n1.1.1     * symlink creation on apptainers skipped if permission error (file\n            is probably already present and created by another user, e.g.\n            the system admin)\n          * add --init-new-pipeline argument in sequana_pipetools standalone\n1.1.0     * add exclude_pattern in input data section\n1.0.6     * add py3.12, slight updates wrt slurm\n1.0.5     * introspect slurm files to extract stats\n1.0.4     * add utility function to download and untar a tar.gz file\n1.0.3     * add levenshtein function. some typo corrections.\n1.0.2     * add the dot2png command. pin docutils <0.21 due to pip error\n1.0.1     * hot fix in the profile creation (regression)\n1.0.0     * Stable release\n0.17.3    * remove useless code and fix a requirement\n0.17.2    * simpler logging\n0.17.1    * remove the --use-singularity (replaced by --use-apptainer in\n            previous release)\n          * slight updates on logging and slight update on slurm module\n0.17.0    * Remove deprecated options and deprecated functions. 
More tests.\n0.16.9    * Fix slurm sys exit (replaced by print)\n          * update doc\n          * more tests\n0.16.8    * stats command add the number of rules per pipeline\n          * better slurm parsing using profile tree directory (slurm in logs/)\n0.16.7    * add missing --trimming-quality option in list of TrimmingOption\n          * set default to cutadapt if no fastp available\n          * better UI for the completion script.\n0.16.6    * Set default value for the option trimming to 20\n          * Fix issue https://github.com/sequana/sequana_pipetools/issues/85\n0.16.5    * merge completion standalone into main sequana_pipetools application\n          * add application to create schema given a config file\n          * add application to get basic stats about the pipelines\n          * add precommit and applied black/isort on all files\n          * remove some useless code\n          * update completion to use click instead of argparse\n          * Rename Module into Pipeline (remove rules so Module are only made\n            of pipelines hence the renaming)\n0.16.4    * fix Trimming options (click) for the quality option\n0.16.3    * add class to handle multiplex entry for click.option (useful for\n            multitax multiple databases)\n0.16.2    * remove useless function get_pipeline_location, get_package_location\n            guess_scheduler from sequana_manager (not used)\n          * store sequana version correctly in info.txt Fixing #89\n          * sort click options alphabetically\n          * --from-project not functional (example in multitax pipeline)\n          * Click checks that input-directory is a directory indeed\n0.16.1    * Fix/rename error_report into onerror to be included in the Snakemake\n            onerror section. 
added *slurm* in slurm output log file in the\n            profile\n0.16.0    * scripts now use click instead of argparse\n          * All Options classes have now an equivalent using click.\n            For example GeneralOptions has a class ClickGeneralOptions.\n            The GeneralOptions is kept for now for backward compatibility\n          * --run-mode removed and replaced by --profile options. Profiles are\n            used and stored within .sequana/profiles\n          * Remove --slurm-cores-per-job redundant with resources from snakemake\n          * Way a main.py is coded fully refactored and simplified as described\n            in the README\n          * cluster_config are now deprecated in favor of profile\n          * sequana_slurm_status removed. Use manager.error_report in pipelines\n            instead\n0.15.0    * remove useless code (readme, description) related to old rules\n          * requirements.txt renamed in tools.txt to store the required tools to\n            run a pipeline.\n          * remove copy_requirements, not used in any pipelines (replaced by code\n            in main.py of the pipelines)\n          * a utility function called getmetadata that returns a dictionary\n            (name, version, wrappers version)\n0.14.X    * Module now returns the list of requirements. SequanaManager\n            creates a txt file with all standalones from the requirements.\n0.13.0    * switch to pyproject and fixes #64\n0.12.X    * automatically populate *wrappers* in PipelineManager based on the\n            config entry *sequana_wrappers*.\n          * Fix the singularity arguments by (i) adding -e and (ii) bind the\n            /home. Indeed, snakemake sets --home to the current directory.\n            Somehow the /home is lost. Removed deprecated function\n          * factorise hash function to have url2hash easily accessible\n          * remove hardcoded bind path for apptainer. 
Uses env variable instead\n0.11.X    * fix regression, add codacy badge, applied black, remove\n            init_pipeline deprecated function.\n0.10.X    * Fixes https://github.com/sequana/sequana_pipetools/issues/49\n            that properly sets the apptainer prefix in default mode\n0.9.X     * replaced singularity word by apptainer (--use-apptainer instead of\n            --use-singularity)\n          * add config2schema utility function for developers\n          * Ability to download automatically singularity images (as URLs) if\n            set in the pipelines (container field). add the --use-singularity\n            option in all pipelines (and --singularity-prefix)\n0.9.0     * **MAJOR update/Aug 2022**\n          * new mechanism to handle profile for Snakemake that will replace the\n            cluster_config.yaml files\n          * Major cleanup of PipelineManager (PipelineManagerGeneric was\n            removed). The way input files are handled was also cleaned up.\n            Fixes https://github.com/sequana/sequana_pipetools/issues/37\n            and also files starting with common prefixes\n0.8.X     * Better schema validation. switch from distutils to packaging\n0.7.X     * simplify the setup() method in pipeline manager\n            can set a SEQUANA_WRAPPERS env variable to use local wrappers\n            add schema pipeline manager directory & fix attrdict error with yaml\n          * Set the --wrapper-prefix to point to the sequana-wrappers github\n0.6.X     * Fix SequanaConfig file to include wrapper\n            and take new snakemake syntax into account. 
update schema handling\n          * Move all modules related to pipelines from sequana into\n            sequana_pipetools\n0.5.X     * feature removed in sequana to deal with adapter removal and\n            changes updated in the package (removed the *design* option\n            from the cutadapt rules and needed); add TrimmingOptions.\n0.4.X     * add FeatureCounts options and slurm status utility\n0.4.0     * stable version\n0.3.X     * first stable release\n0.2.X     * completion can now handle multiple directories/files properly\n            better doc and more tests; add --from-project option to import\n            existing config file; remove --paired-data option; add content\n            from sequana.pipeline_common\n0.1.X     * software creation\n========= ======================================================================\n\n",
    "bugtrack_url": null,
    "license": "BSD-3",
    "summary": "A set of tools to help building or using Sequana pipelines",
    "version": "1.1.1",
    "project_urls": {
        "Homepage": "https://github.com/sequana/sequana_pipetools",
        "Repository": "https://github.com/sequana/sequana_pipetools"
    },
    "split_keywords": [
        "snakemake",
        " sequana",
        " pipelines"
    ],
    "urls": [
        {
            "comment_text": "",
            "digests": {
                "blake2b_256": "ef8af6ee92b69c09aa2bed9ab8f885317856c39a79753c6ab2fb014f2cc95f80",
                "md5": "e1693a603acd6b090b93ebc982b1e35c",
                "sha256": "310c472eab1f1ae5e6445c95ffcfe641990ab6afaf47c86a2440b39c0139f54c"
            },
            "downloads": -1,
            "filename": "sequana_pipetools-1.1.1-py3-none-any.whl",
            "has_sig": false,
            "md5_digest": "e1693a603acd6b090b93ebc982b1e35c",
            "packagetype": "bdist_wheel",
            "python_version": "py3",
            "requires_python": "<4.0,>=3.8",
            "size": 59506,
            "upload_time": "2024-11-21T22:38:46",
            "upload_time_iso_8601": "2024-11-21T22:38:46.837039Z",
            "url": "https://files.pythonhosted.org/packages/ef/8a/f6ee92b69c09aa2bed9ab8f885317856c39a79753c6ab2fb014f2cc95f80/sequana_pipetools-1.1.1-py3-none-any.whl",
            "yanked": false,
            "yanked_reason": null
        },
        {
            "comment_text": "",
            "digests": {
                "blake2b_256": "5098cf61066e8738434a12d8fc08855d3c9ffea40e5a5a09d413573f2030cf7f",
                "md5": "0ac15600a02a7e3e80badc31790bf44a",
                "sha256": "78169f1632ddfddacab0a7bbee5eb9b39af110f7d4e9b8f31d572b37365608ca"
            },
            "downloads": -1,
            "filename": "sequana_pipetools-1.1.1.tar.gz",
            "has_sig": false,
            "md5_digest": "0ac15600a02a7e3e80badc31790bf44a",
            "packagetype": "sdist",
            "python_version": "source",
            "requires_python": "<4.0,>=3.8",
            "size": 54620,
            "upload_time": "2024-11-21T22:38:49",
            "upload_time_iso_8601": "2024-11-21T22:38:49.123537Z",
            "url": "https://files.pythonhosted.org/packages/50/98/cf61066e8738434a12d8fc08855d3c9ffea40e5a5a09d413573f2030cf7f/sequana_pipetools-1.1.1.tar.gz",
            "yanked": false,
            "yanked_reason": null
        }
    ],
    "upload_time": "2024-11-21 22:38:49",
    "github": true,
    "gitlab": false,
    "bitbucket": false,
    "codeberg": false,
    "github_user": "sequana",
    "github_project": "sequana_pipetools",
    "travis_ci": false,
    "coveralls": false,
    "github_actions": true,
    "lcname": "sequana-pipetools"
}
        