sequana-demultiplex


:Name: sequana-demultiplex
:Version: 1.5.2
:Home page: https://github.com/sequana/demultiplex
:Summary: Pipeline that runs bcl2fastq and eases demultiplexing of sequencing data
:Upload time: 2023-12-29 13:49:18
:Author: Sequana Team
:Requires Python: >=3.8,<4.0
:License: BSD-3
:Keywords: bcl2fastq, illumina, sequana, base caller, demultiplexing
            
.. image:: https://badge.fury.io/py/sequana-demultiplex.svg
     :target: https://pypi.python.org/pypi/sequana_demultiplex

.. image:: https://github.com/sequana/demultiplex/actions/workflows/main.yml/badge.svg
   :target: https://github.com/sequana/demultiplex/actions/workflows/main.yml

.. image:: https://coveralls.io/repos/github/sequana/demultiplex/badge.svg?branch=main
    :target: https://coveralls.io/github/sequana/demultiplex?branch=main

.. image:: https://img.shields.io/badge/python-3.8%20%7C%203.9%20%7C3.10-blue.svg
    :target: https://pypi.python.org/pypi/sequana
    :alt: Python 3.8 | 3.9 | 3.10

.. image:: http://joss.theoj.org/papers/10.21105/joss.00352/status.svg
   :target: http://joss.theoj.org/papers/10.21105/joss.00352
   :alt: JOSS (journal of open source software) DOI

This is the **demultiplex** pipeline from the `Sequana <https://sequana.readthedocs.org>`_ project.

:Overview: Runs bcl2fastq on raw BCL data and creates plots to ease the QC validation
:Input: A valid Illumina base calling directory and sample sheet file
:Output: An HTML report, a set of PNG files and the expected FastQ files
:Status: production
:Wiki: https://github.com/sequana/demultiplex/wiki
:Documentation: This README file, the Wiki from the github repository (link above) and https://sequana.readthedocs.io
:Citation: Cokelaer et al, (2017), 'Sequana': a Set of Snakemake NGS pipelines, Journal of Open Source Software, 2(16), 352, JOSS DOI https://doi.org/10.21105/joss.00352


Installation
~~~~~~~~~~~~

Install the **sequana_demultiplex** package as follows::

    pip install sequana_demultiplex

Usage
~~~~~

::

    sequana_demultiplex --help
    sequana_demultiplex --working-directory DATAPATH --bcl-directory bcldata --sample-sheet SampleSheet.csv --merging-strategy merge

The --bcl-directory option indicates where to find your raw data. The --sample-sheet
option expects a sample sheet compatible with the Illumina Experiment Manager (IEM)
software. The --merging-strategy option can be set to *none* or *merge*. The *merge*
option merges the lanes, which is useful for e.g. NextSeq sequencers.

This creates the working directory (DATAPATH in the example above, **fastq** by default).
To execute the pipeline, go to that directory and run the script::

    cd DATAPATH
    sh demultiplex.sh  # for a local run

These commands launch a snakemake pipeline. If you are familiar with snakemake, you can retrieve the demultiplex.rules and config.yaml files and then execute the pipeline yourself with specific parameters::

    snakemake -s demultiplex.rules --cores 4 --stats stats.txt \
        --wrapper-prefix https://raw.githubusercontent.com/sequana/sequana-wrappers/


You may also use `sequanix <https://sequana.readthedocs.io/en/master/sequanix.html>`_ for a graphical interface.

Should you need to merge the lanes, add the --merging-strategy argument
followed by *merge*::

    sequana_demultiplex --bcl-directory bcl_data --merging-strategy merge --sample-sheet SampleSheet.csv
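
Because the merging strategy must always be given explicitly, a thin wrapper can reject a
mistyped value before the pipeline is set up. The sketch below is only an illustration: the
function name build_demultiplex_cmd is hypothetical, it accepts just the two values
documented above (*none* and *merge*), and it prints the command instead of running it::

    # Hypothetical helper (not part of sequana_demultiplex): validate the
    # merging strategy, then print the command line that would be executed.
    build_demultiplex_cmd() {
        bcl_dir=$1; sample_sheet=$2; strategy=$3
        case "$strategy" in
            none|merge) ;;
            *) echo "merging strategy must be 'none' or 'merge'" >&2; return 1 ;;
        esac
        echo "sequana_demultiplex --bcl-directory $bcl_dir --sample-sheet $sample_sheet --merging-strategy $strategy"
    }

    build_demultiplex_cmd bcl_data SampleSheet.csv merge

Printing the command first makes it easy to review before copying it into a terminal or a job script.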


Requirements
~~~~~~~~~~~~

This pipeline requires the following third-party tool(s):

- bcl2fastq 2.20.0

This software has an end-user license agreement (EULA). Under the terms of the
`Illumina license <https://support.illumina.com/content/dam/illumina-support/documents/downloads/software/bcl2fastq/bcl2fastq2-v2-20-eula.pdf>`_,
it cannot be redistributed. Therefore, you should install it yourself. On a cluster
facility, you may ask your system administrator to provide it. For instance::

    module load bcl2fastq/2.20.0

For the same reason, you cannot find it on community channels such as bioconda or Docker (as of August 2020).

So, you will need to download the software yourself. The easiest way is to download the
RPM from `Illumina
<https://support.illumina.com/sequencing/sequencing_software/bcl2fastq-conversion-software/downloads.html>`_
and accept the agreements. If you have an RPM-based (Red Hat-like) system, install it
with rpm (root privileges are usually required)::

    rpm -i file.rpm

If you do not have an RPM-based system, you can look at https://damona.readthedocs.io where we provide
Singularity recipes to build an image from your own RPM. Recipes can be found
`here <https://github.com/cokelaer/damona/tree/master/damona/recipes/bcl2fastq>`_.
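
Whichever route you choose, it can help to confirm that the executable is reachable
before launching the pipeline. A minimal sketch, assuming the executable is named
bcl2fastq and is expected on your PATH (the helper name require_tool is hypothetical)::

    # Hypothetical helper: succeed only if the given command is on the PATH.
    require_tool() {
        if command -v "$1" >/dev/null 2>&1; then
            echo "$1 found"
        else
            echo "$1 not found: load the module or install it first" >&2
            return 1
        fi
    }

    require_tool bcl2fastq || echo "install bcl2fastq before running the pipeline"

On a cluster, run the check after the module load command so that the module's PATH changes are taken into account.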


Details
~~~~~~~~~
.. image:: https://raw.githubusercontent.com/sequana/demultiplex/master/sequana_pipelines/demultiplex/dag.png

This pipeline runs bcl2fastq 2.20 and creates a set of diagnostic plots to help
decipher common issues such as missing indexes and sample sheet errors.
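
One sample sheet error that is easy to catch up front is a missing [Data] section, which
IEM-style sheets use to list the samples. The check below is a rough sketch, not the
validation performed by the pipeline; the helper name has_data_section and the temporary
example file are made up for the illustration::

    # Hypothetical helper: succeed if the file has a line starting with [Data].
    has_data_section() {
        grep -q '^\[Data\]' "$1"
    }

    # Build a tiny example sheet in a temporary file and check it.
    sheet=$(mktemp)
    printf '[Header]\nIEMFileVersion,4\n[Data]\nSample_ID,index\nS1,ACGTACGT\n' > "$sheet"
    has_data_section "$sheet" && echo "[Data] section present"
    rm -f "$sheet"

The pattern only anchors the start of the line, so section headers padded with trailing commas still match.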


Rules and configuration details
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

Here is the `latest documented configuration file <https://raw.githubusercontent.com/sequana/demultiplex/master/sequana_pipelines/demultiplex/config.yaml>`_
to be used with the pipeline. Each rule used in the pipeline may have a section in the configuration file.



Changelog
~~~~~~~~~

========= =======================================================================
Version   Description
========= =======================================================================
1.5.2     * rename requirements.txt into tools.txt; update __init__
1.5.1     * switch working directory to fastq instead of demultiplex (regression)
1.5.0     * Uses click and new sequana_pipetools, add multiqc
1.4.0     * Implement demultiplexing of single cell ATAC seq data with
            cellranger.
1.3.1     * use sequana_wrappers version in the config file
1.3.0     * use latest sequana-wrappers to benefit from the graphviz apptainer
1.2.1     * Update CI action and use new sequana_pipetools v0.9.0
1.2.0     * stable release with cleanup of the setup and README
1.1.3     * add the --mars-seq option that fills the config automatically
1.1.2     * fix the none_and_force merging strategy option
1.1.1     * fix a regression bug
1.1.0     * Uses new sequana-wrappers repository
1.0.5     * Fix regression bug to cope with new snakemake API
          * Compatibility with sequanix GUI
1.0.4     * Better HTML report with updated images.
          * validate the SampleSheet when using sequana_demultiplex and/or the
            pipeline
          * Add error handler from sequana_pipetools
          * save all undetermined barcodes (not just first 20)
          * No changes to the UI
          * technically, the input_directory option is now in a section so that
            it can be used in Sequanix
1.0.3     * remove check_samplesheet and fix_samplesheet modules now in sequana
          * check sample sheet but do not fail. Instead, informing users that
            there is an error and suggest to use 'sequana samplesheet
            --quick-fix'
1.0.2     Use 'sequana samplesheet --check' command instead of deprecated
          sequana_check_sample_sheet command
1.0.1     change some default behaviour:

          * write_fastq_reverse_complement is now set to False by default
            like bcl2fastq
          * The --no-bgzf-compression option is changed into
            --bgzf-compression. We do not want this option by default.
          * The --ignore-missing-bcls option is changed into
            --no-ignore-missing-bcls so as to ignore missing bcls by default;
            the option remains a flag and keeps the same behaviour
          * Fix HTML syntax
1.0.0     * stable version pinned on sequana libraries
0.9.11    * fix label in plot_summary,
          * add new plot to show reads per sample + undetermined
          * add two tools one to check the samplesheet called
            sequana_sample_sheet and one called sequana_fix_samplesheet. The
            former is now inside the pipeline as well and when creating the
            pipeline
          * set --write_reverse_complement to False by default
          * remove the --ignore-missing-control which is deprecated anyway
0.9.10    * implement the new option --from-project, add missing MANIFEST
0.9.9     * simplification of the pipeline to use sequana 0.8.4 to speed up
            the --help calls.
          * include a summary HTML report
0.9.8     * fix typos
0.9.7     * Use new release of sequana_pipetools
          * set matplotlib backend to agg
          * include a simple HTML report
0.9.6     * Handle different RunParameter.xml name (NextSeq vs HiSeq)
0.9.5     * Fix a regression bug due to new sequana release. We do not check
            the input file (fastq) since this is not a sequence analysis
            pipeline
          * Check whether it is a NextSeq run. If so, merging-strategy must be
            set to 'merge'. Can be bypassed using --force
0.9.4     * Check the presence of the bcl input directory and samplesheet.
          * More help in the --help message.
          * add --sample-sheet option to replace --samplesheet option
          * Fix the schema file
          * Check for presence of RunParameters.xml and provide information
            if merging-strategy is set to None whereas it is a NextSeq run
0.9.3     Fix regression bug
0.9.2     remove warning due to relative paths.
0.9.1     Make the merging options compulsory. Users must tell whether they
          want to merge the lanes or not. This avoids merging (or not merging)
          when the opposite was expected.
0.8.6     Uses 64G/biomics queue and 16 cores on a SLURM scheduler
========= =======================================================================



Contribute & Code of Conduct
~~~~~~~~~~~~~~~~~~~~~~~~~~~~

To contribute to this project, please take a look at the
`Contributing Guidelines <https://github.com/sequana/sequana/blob/master/CONTRIBUTING.rst>`_ first. Please note that this project is released with a
`Code of Conduct <https://github.com/sequana/sequana/blob/master/CONDUCT.md>`_. By contributing to this project, you agree to abide by its terms.


            
