sequana-pacbio-qc


:Name: sequana-pacbio-qc
:Version: 1.0.1
:Home page: https://github.com/sequana/
:Summary: QC on various types of PacBio data
:Upload time: 2023-07-07 13:13:21
:Maintainer: Thomas Cokelaer
:Author: Thomas Cokelaer
:License: new BSD
:Keywords: pacbio, snakemake, ngs, sequana

.. image:: https://badge.fury.io/py/sequana-pacbio-qc.svg
   :target: https://pypi.python.org/pypi/sequana_pacbio_qc

.. image:: http://joss.theoj.org/papers/10.21105/joss.00352/status.svg
    :target: http://joss.theoj.org/papers/10.21105/joss.00352
    :alt: JOSS (journal of open source software) DOI

.. image:: https://github.com/sequana/pacbio_qc/actions/workflows/main.yml/badge.svg
   :target: https://github.com/sequana/pacbio_qc/actions/workflows    

.. image:: https://img.shields.io/badge/python-3.8%20%7C%203.9%20%7C3.10-blue.svg
    :target: https://pypi.python.org/pypi/sequana
    :alt: Python 3.8 | 3.9 | 3.10

|Codacy-Grade|


This is the **pacbio_qc** pipeline from the `Sequana <https://sequana.readthedocs.org>`_ project.

:Overview: Quality control for PacBio data files (raw data or CCS files).
:Input: BAM files produced by PacBio sequencers.
:Output: HTML reports with various plots, including a taxonomic plot.
:Status: production
:Documentation: This README file, the wiki of the GitHub repository (link above) and https://sequana.readthedocs.io
:Citation: Cokelaer et al. (2017), ‘Sequana’: a Set of Snakemake NGS pipelines, Journal of Open Source Software, 2(16), 352, doi:10.21105/joss.00352


Installation
~~~~~~~~~~~~

Install this package with pip::

    pip install sequana_pacbio_qc

You will also need **samtools** and, optionally, **kraken2** for the taxonomic analysis.


Usage
~~~~~

::

    sequana_pacbio_qc --help
    sequana_pacbio_qc --input-directory DATAPATH


To filter out some BAM files, use the input pattern in the 'input data' tab.
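
The effect of such a pattern can be illustrated with a short sketch. It assumes, without confirmation from the pipeline source, that the pattern behaves like a shell-style glob; the file names and the ``select_bam_files`` helper are hypothetical and are not part of Sequana:

.. code-block:: python

    from fnmatch import fnmatchcase

    def select_bam_files(filenames, pattern="*.bam"):
        """Return the sorted file names matching a shell-style pattern.

        Hypothetical helper: it only illustrates glob-style selection
        and is not the code used by the pipeline.
        """
        return sorted(name for name in filenames if fnmatchcase(name, pattern))

    # Hypothetical content of an input directory.
    files = ["run1.subreads.bam", "run1.ccs.bam", "run2.subreads.bam", "notes.txt"]

    # Keep only the CCS files.
    ccs_only = select_bam_files(files, "*.ccs.bam")

A narrower pattern is therefore a simple way to leave unwanted BAM files out of an analysis without moving them out of the input directory.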

In the kraken section of the configuration tab, add as many databases
as you wish. To skip the taxonomy step, which is experimental, simply unset
the first database.


The ``sequana_pacbio_qc`` command above creates a directory containing the pipeline
and its configuration file. You then need to execute the pipeline::

    cd pacbio_qc
    sh pacbio_qc.sh  # for a local run

This launches a Snakemake pipeline. If you are familiar with Snakemake, you can
retrieve the pipeline itself and its configuration files and then execute the pipeline yourself with specific parameters::

    snakemake -s pacbio_qc.rules --configfile config.yaml --cores 4 --stats stats.txt

Alternatively, use the `Sequanix <https://sequana.readthedocs.io/en/main/sequanix.html>`_ interface.

Requirements
~~~~~~~~~~~~

This pipeline requires the following executables:

- sequana
- samtools
- kraken2
- multiqc
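
Before launching the pipeline, it can be convenient to check that these executables are available. The following is a minimal sketch using only the Python standard library; the ``missing_tools`` helper is hypothetical and is not provided by Sequana:

.. code-block:: python

    import shutil

    # Executables listed in the Requirements section above.
    REQUIRED_TOOLS = ["sequana", "samtools", "kraken2", "multiqc"]

    def missing_tools(tools=REQUIRED_TOOLS):
        """Return the tools that cannot be found on the PATH."""
        return [tool for tool in tools if shutil.which(tool) is None]

    if __name__ == "__main__":
        missing = missing_tools()
        if missing:
            print("Missing executables:", ", ".join(missing))
        else:
            print("All required executables were found.")

Since the taxonomic analysis is optional, a missing ``kraken2`` only matters when that step is enabled.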

.. image:: https://raw.githubusercontent.com/sequana/pacbio_qc/main/sequana_pipelines/pacbio_qc/dag.png


Details
~~~~~~~~~

This pipeline takes as input a set of BAM files from PacBio sequencers. It
computes a set of basic statistics related to the read lengths. It also shows
histograms of the GC content, the SNR of the diodes and the number of passes.
Finally, a quick taxonomy can be performed using Kraken. HTML reports
are created for each sample, as well as a multiqc summary page.
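
To make the kind of statistics mentioned above concrete, here is a minimal sketch in plain Python. It is not the pipeline's implementation (the pipeline relies on the Sequana library and reads BAM files); the function names and the toy sequences are assumptions made for illustration only:

.. code-block:: python

    from statistics import mean, median

    def read_length_stats(sequences):
        """Return basic read-length statistics for an iterable of reads."""
        lengths = [len(seq) for seq in sequences]
        if not lengths:
            raise ValueError("at least one read is required")
        return {
            "count": len(lengths),
            "min": min(lengths),
            "max": max(lengths),
            "mean": mean(lengths),
            "median": median(lengths),
        }

    def gc_content(sequence):
        """Return the percentage of G and C bases in a read (0.0 if empty)."""
        if not sequence:
            return 0.0
        seq = sequence.upper()
        return 100.0 * (seq.count("G") + seq.count("C")) / len(seq)

    # Toy reads standing in for sequences extracted from a BAM file.
    reads = ["ACGT", "AC", "ACGTAC"]
    stats = read_length_stats(reads)
    gc_values = [gc_content(read) for read in reads]

The per-read GC percentages in ``gc_values`` are the kind of values that a GC content histogram summarises.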

Kraken databases are not provided with the pipeline. The taxonomy step is
optional and not used by default.


Changelog
~~~~~~~~~

========= ====================================================================
Version   Description
========= ====================================================================
1.0.1     Fix missing import in the summary
1.0.0     Use latest wrappers and graphviz apptainers
0.11.0    Release to use the latest sequana_pipetools framework
0.10.0    Update to use the latest tools from the sequana framework
0.9.0     First release of sequana_pacbio_qc using the latest sequana rules
          and modules (0.9.5)
========= ====================================================================


Contribute & Code of Conduct
~~~~~~~~~~~~~~~~~~~~~~~~~~~~

To contribute to this project, please take a look at the 
`Contributing Guidelines <https://github.com/sequana/sequana/blob/main/CONTRIBUTING.rst>`_ first. Please note that this project is released with a 
`Code of Conduct <https://github.com/sequana/sequana/blob/main/CONDUCT.md>`_. By contributing to this project, you agree to abide by its terms.


Rules and configuration details
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

Here is the `latest documented configuration file <https://raw.githubusercontent.com/sequana/sequana_pacbio_qc/main/sequana_pipelines/pacbio_qc/config.yaml>`_
to be used with the pipeline. Each rule used in the pipeline may have a section in the configuration file. 



.. |Codacy-Grade| image:: https://app.codacy.com/project/badge/Grade/9b8355ff642f4de9acd4b270f8d14d10
   :target: https://www.codacy.com/gh/sequana/pacbio_qc/dashboard
            
