stpipeline

Name	stpipeline
Version	2.0.0
home_page	https://github.com/jfnavarro/st_pipeline
Summary	ST Pipeline: An automated pipeline for spatial mapping of unique transcripts
upload_time	2025-02-09 10:39:42
maintainer	None
docs_url	https://pythonhosted.org/stpipeline/
author	Jose Fernandez Navarro
requires_python	<3.13,>=3.10
license	MIT
keywords	visium analysis pipeline spatial transcriptomics toolkit
requirements	argparse cfgv contourpy cycler distance distlib dnaio filelock fonttools htseq identify isal joblib kiwisolver matplotlib nodeenv numpy packaging pandas-stubs pandas pillow platformdirs pre-commit pyparsing pysam python-dateutil pytz pyyaml regex scikit-learn scipy seaborn six taggd threadpoolctl types-pytz types-regex tzdata virtualenv xopen zlib-ng

# Spatial Transcriptomics (ST) Pipeline

[![License: MIT](https://img.shields.io/badge/License-MIT-yellow.svg)](https://opensource.org/licenses/MIT)
[![Python 3.10](https://img.shields.io/badge/python-3.10-blue.svg)](https://www.python.org/downloads/release/python-310/)
[![Python 3.11](https://img.shields.io/badge/python-3.11-blue.svg)](https://www.python.org/downloads/release/python-311/)
[![Python 3.12](https://img.shields.io/badge/python-3.12-blue.svg)](https://www.python.org/downloads/release/python-312/)
[![PyPI version](https://badge.fury.io/py/stpipeline.svg)](https://badge.fury.io/py/stpipeline)
[![Build Status](https://github.com/jfnavarro/st_pipeline/actions/workflows/dev.yml/badge.svg)](https://github.com/jfnavarro/st_pipeline/actions/workflows/dev)

The ST Pipeline provides the tools, algorithms and scripts needed to process and analyze the raw
data generated with Spatial Transcriptomics or Visium in FASTQ format and to produce datasets for downstream analysis.

The ST Pipeline can also be used to process single cell/nuclei RNA-seq data as long as a
file with molecular `barcodes` identifying each cell is provided (same template as the files in the folder "ids").

The ST Pipeline can also be used to process bulk RNA-seq data; in this case the barcodes file is not required.

The ST Pipeline has been optimized for speed and robustness, and it is easy to use, with many parameters available to adjust its settings.
The ST Pipeline is fully parallel and has constant memory use.
The ST Pipeline allows you to skip any of the main steps and provides multiple customization options.
The ST Pipeline can use either the genome or the transcriptome as reference.

In default mode, the ST Pipeline performs the following steps:

- Quality trimming step (read 1 and read 2):
  - Remove low quality bases
  - Sanity check (reads same length, reads order, etc.)
  - Check the quality of the UMI
  - Remove artifacts (PolyT, PolyA, PolyG, PolyN and PolyC) of user defined length
  - Check for AT and GC content
  - Discard reads that fall below a minimum number of bases or that failed any of the checks above
- Contaminant filter step (e.g. rRNA genome) (Optional)
- Mapping with [STAR](https://github.com/alexdobin/STAR) step (only read 2) (Optional)
- Demultiplexing with [Taggd](https://github.com/jfnavarro/taggd) step (only read 1) (Optional)
- Keep reads (read 2) that contain a valid barcode and are correctly mapped
- Annotate the reads to the reference (Optional)
- Group annotated reads by barcode (spot position), gene and genomic location (with an offset) to get a read count
- In the grouping/counting only unique molecules (UMIs) are kept (Optional)
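
The quality trimming checks listed above can be illustrated with a minimal sketch. This is not the ST Pipeline's own code: the function names, the default thresholds and the choice to cut the read at the start of a homopolymer run are all assumptions made for illustration.

```python
import re
from typing import Optional


def trim_homopolymer(seq: str, base: str, min_len: int) -> str:
    """Cut the read at the first run of `base` that is at least `min_len` long.

    Hypothetical helper: trimming everything from the artifact onwards is an
    assumption, not necessarily what the ST Pipeline does.
    """
    match = re.search(f"{base}{{{min_len},}}", seq)
    return seq[: match.start()] if match else seq


def clean_read(
    seq: str,
    min_length: int = 20,      # assumed minimum number of bases to keep a read
    homopolymer_len: int = 10,  # assumed user-defined artifact length
    max_at: float = 0.9,       # assumed maximum AT fraction
    max_gc: float = 0.9,       # assumed maximum GC fraction
) -> Optional[str]:
    """Return the trimmed read, or None if the read should be discarded."""
    # Remove PolyA, PolyT, PolyG, PolyC and PolyN artifacts.
    for base in "ATGCN":
        seq = trim_homopolymer(seq, base, homopolymer_len)

    # Discard reads that are too short after trimming.
    if not seq or len(seq) < min_length:
        return None

    # Discard reads whose AT or GC content is too high.
    at_fraction = (seq.count("A") + seq.count("T")) / len(seq)
    gc_fraction = (seq.count("G") + seq.count("C")) / len(seq)
    if at_fraction > max_at or gc_fraction > max_gc:
        return None

    return seq
```

A read made of a mixed sequence followed by a long PolyA tail would be returned without the tail, while a read that is almost entirely PolyA would be discarded because too few bases remain.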

A more detailed graphical description of the workflow is available in the documents `workflow.pdf` and `workflow_extended.pdf`.

The output dataset is a matrix of counts (genes as columns, spots as rows) in TSV format.
The ST pipeline will also output a log file with useful stats and information.
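
As a rough illustration of the counting step and of the output layout described above, the sketch below counts unique UMIs per spot and gene and writes them as a tab-separated matrix with genes as columns and spots as rows. It is a simplified stand-in, not the pipeline's implementation: it matches UMIs exactly and ignores the genomic location and offset, and the function names, the `XxY` spot labels and the empty top-left header cell are assumptions.

```python
import csv
import io
from collections import defaultdict
from typing import Dict, Iterable, TextIO, Tuple

Read = Tuple[str, str, str]  # (spot, gene, umi), a hypothetical record layout


def count_unique_molecules(reads: Iterable[Read]) -> Dict[Tuple[str, str], int]:
    """Count distinct UMIs for every (spot, gene) pair."""
    umis = defaultdict(set)
    for spot, gene, umi in reads:
        umis[(spot, gene)].add(umi)  # duplicates of the same UMI collapse here
    return {key: len(values) for key, values in umis.items()}


def write_counts_tsv(counts: Dict[Tuple[str, str], int], handle: TextIO) -> None:
    """Write the counts as a matrix: one row per spot, one column per gene."""
    spots = sorted({spot for spot, _ in counts})
    genes = sorted({gene for _, gene in counts})
    writer = csv.writer(handle, delimiter="\t", lineterminator="\n")
    writer.writerow([""] + genes)  # header row with gene names
    for spot in spots:
        writer.writerow([spot] + [counts.get((spot, gene), 0) for gene in genes])


reads = [
    ("1x1", "GeneA", "AAAA"),
    ("1x1", "GeneA", "AAAA"),  # same UMI again: counted once
    ("1x1", "GeneA", "CCCC"),
    ("2x3", "GeneB", "GGGG"),
]
counts = count_unique_molecules(reads)
buffer = io.StringIO()
write_counts_tsv(counts, buffer)
tsv_text = buffer.getvalue()
```

Skipping the UMI step, which the workflow marks as optional, would correspond to counting every read instead of every distinct UMI.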

## Installation

For users, see [install](docs/installation.md)

For developers, see [contributing](CONTRIBUTING.md)

## Usage

See [usage](docs/usage.md)

## Authors

See [authors](AUTHORS.md)

## License

The ST Pipeline is open source under the MIT license, which means that you can use it,
change it and redistribute it, but you must always refer to our license (see LICENSE).

## Credits

If you use the ST Pipeline, please cite its publication:
ST Pipeline: An automated pipeline for spatial mapping of unique transcripts
Oxford BioInformatics
10.1093/bioinformatics/btx211

## Example dataset

A real dataset, obtained from the public data of
the following publication (http://science.sciencemag.org/content/353/6294/78),
is available in the folder called "data".

## Contact

For questions, bugs, feedback, etc., you can contact:

Jose Fernandez Navarro <jc.fernandez.navarro@gmail.com>

Raw data

{
    "_id": null,
    "home_page": "https://github.com/jfnavarro/st_pipeline",
    "name": "stpipeline",
    "maintainer": null,
    "docs_url": "https://pythonhosted.org/stpipeline/",
    "requires_python": "<3.13,>=3.10",
    "maintainer_email": null,
    "keywords": "visium, analysis, pipeline, spatial, transcriptomics, toolkit",
    "author": "Jose Fernandez Navarro",
    "author_email": "jc.fernandez.navarro@gmail.com",
    "download_url": "https://files.pythonhosted.org/packages/92/70/c43274e021746d5fcbd25c9ecaa433745b59eb14b62f2ea3892f2482823a/stpipeline-2.0.0.tar.gz",
    "platform": null,
    "bugtrack_url": null,
    "license": "MIT",
    "summary": "ST Pipeline: An automated pipeline for spatial mapping of unique transcripts",
    "version": "2.0.0",
    "project_urls": {
        "Homepage": "https://github.com/jfnavarro/st_pipeline",
        "Repository": "https://github.com/jfnavarro/st_pipeline"
    },
    "split_keywords": [
        "visium",
        " analysis",
        " pipeline",
        " spatial",
        " transcriptomics",
        " toolkit"
    ],
    "urls": [
        {
            "comment_text": "",
            "digests": {
                "blake2b_256": "a8e52cf855801a31a8e6d55077536f77f822842f76a7849df102931063acbe1a",
                "md5": "362b94765073583e8b505b14313be92b",
                "sha256": "11a0f6952a7e4a88c9a26fa9907c94a87102b5816331e05bb3050270f26fb97e"
            },
            "downloads": -1,
            "filename": "stpipeline-2.0.0-py3-none-any.whl",
            "has_sig": false,
            "md5_digest": "362b94765073583e8b505b14313be92b",
            "packagetype": "bdist_wheel",
            "python_version": "py3",
            "requires_python": "<3.13,>=3.10",
            "size": 56896,
            "upload_time": "2025-02-09T10:39:39",
            "upload_time_iso_8601": "2025-02-09T10:39:39.953468Z",
            "url": "https://files.pythonhosted.org/packages/a8/e5/2cf855801a31a8e6d55077536f77f822842f76a7849df102931063acbe1a/stpipeline-2.0.0-py3-none-any.whl",
            "yanked": false,
            "yanked_reason": null
        },
        {
            "comment_text": "",
            "digests": {
                "blake2b_256": "9270c43274e021746d5fcbd25c9ecaa433745b59eb14b62f2ea3892f2482823a",
                "md5": "ee804e113bea9037dd3a836bd3ca4997",
                "sha256": "44efbbc5f4fd97cad0b540e8494e6954103327fc14caaea3b5bff5cb75a8927d"
            },
            "downloads": -1,
            "filename": "stpipeline-2.0.0.tar.gz",
            "has_sig": false,
            "md5_digest": "ee804e113bea9037dd3a836bd3ca4997",
            "packagetype": "sdist",
            "python_version": "source",
            "requires_python": "<3.13,>=3.10",
            "size": 46153,
            "upload_time": "2025-02-09T10:39:42",
            "upload_time_iso_8601": "2025-02-09T10:39:42.038592Z",
            "url": "https://files.pythonhosted.org/packages/92/70/c43274e021746d5fcbd25c9ecaa433745b59eb14b62f2ea3892f2482823a/stpipeline-2.0.0.tar.gz",
            "yanked": false,
            "yanked_reason": null
        }
    ],
    "upload_time": "2025-02-09 10:39:42",
    "github": true,
    "gitlab": false,
    "bitbucket": false,
    "codeberg": false,
    "github_user": "jfnavarro",
    "github_project": "st_pipeline",
    "travis_ci": false,
    "coveralls": false,
    "github_actions": true,
    "requirements": [
        {
            "name": "argparse",
            "specs": [
                [
                    "==",
                    "1.4.0"
                ]
            ]
        },
        {
            "name": "cfgv",
            "specs": [
                [
                    "==",
                    "3.4.0"
                ]
            ]
        },
        {
            "name": "contourpy",
            "specs": [
                [
                    "==",
                    "1.3.1"
                ]
            ]
        },
        {
            "name": "cycler",
            "specs": [
                [
                    "==",
                    "0.12.1"
                ]
            ]
        },
        {
            "name": "distance",
            "specs": [
                [
                    "==",
                    "0.1.3"
                ]
            ]
        },
        {
            "name": "distlib",
            "specs": [
                [
                    "==",
                    "0.3.9"
                ]
            ]
        },
        {
            "name": "dnaio",
            "specs": [
                [
                    "==",
                    "1.2.3"
                ]
            ]
        },
        {
            "name": "filelock",
            "specs": [
                [
                    "==",
                    "3.16.1"
                ]
            ]
        },
        {
            "name": "fonttools",
            "specs": [
                [
                    "==",
                    "4.55.3"
                ]
            ]
        },
        {
            "name": "htseq",
            "specs": [
                [
                    "==",
                    "2.0.9"
                ]
            ]
        },
        {
            "name": "identify",
            "specs": [
                [
                    "==",
                    "2.6.5"
                ]
            ]
        },
        {
            "name": "isal",
            "specs": [
                [
                    "==",
                    "1.7.1"
                ]
            ]
        },
        {
            "name": "joblib",
            "specs": [
                [
                    "==",
                    "1.4.2"
                ]
            ]
        },
        {
            "name": "kiwisolver",
            "specs": [
                [
                    "==",
                    "1.4.8"
                ]
            ]
        },
        {
            "name": "matplotlib",
            "specs": [
                [
                    "==",
                    "3.10.0"
                ]
            ]
        },
        {
            "name": "nodeenv",
            "specs": [
                [
                    "==",
                    "1.9.1"
                ]
            ]
        },
        {
            "name": "numpy",
            "specs": [
                [
                    "==",
                    "2.2.1"
                ]
            ]
        },
        {
            "name": "packaging",
            "specs": [
                [
                    "==",
                    "24.2"
                ]
            ]
        },
        {
            "name": "pandas-stubs",
            "specs": [
                [
                    "==",
                    "2.2.3.241126"
                ]
            ]
        },
        {
            "name": "pandas",
            "specs": [
                [
                    "==",
                    "2.2.3"
                ]
            ]
        },
        {
            "name": "pillow",
            "specs": [
                [
                    "==",
                    "11.1.0"
                ]
            ]
        },
        {
            "name": "platformdirs",
            "specs": [
                [
                    "==",
                    "4.3.6"
                ]
            ]
        },
        {
            "name": "pre-commit",
            "specs": [
                [
                    "==",
                    "4.0.1"
                ]
            ]
        },
        {
            "name": "pyparsing",
            "specs": [
                [
                    "==",
                    "3.2.1"
                ]
            ]
        },
        {
            "name": "pysam",
            "specs": [
                [
                    "==",
                    "0.22.1"
                ]
            ]
        },
        {
            "name": "python-dateutil",
            "specs": [
                [
                    "==",
                    "2.9.0.post0"
                ]
            ]
        },
        {
            "name": "pytz",
            "specs": [
                [
                    "==",
                    "2024.2"
                ]
            ]
        },
        {
            "name": "pyyaml",
            "specs": [
                [
                    "==",
                    "6.0.2"
                ]
            ]
        },
        {
            "name": "regex",
            "specs": [
                [
                    "==",
                    "2024.11.6"
                ]
            ]
        },
        {
            "name": "scikit-learn",
            "specs": [
                [
                    "==",
                    "1.6.1"
                ]
            ]
        },
        {
            "name": "scipy",
            "specs": [
                [
                    "==",
                    "1.15.1"
                ]
            ]
        },
        {
            "name": "seaborn",
            "specs": [
                [
                    "==",
                    "0.13.2"
                ]
            ]
        },
        {
            "name": "six",
            "specs": [
                [
                    "==",
                    "1.17.0"
                ]
            ]
        },
        {
            "name": "taggd",
            "specs": [
                [
                    "==",
                    "0.4.0"
                ]
            ]
        },
        {
            "name": "threadpoolctl",
            "specs": [
                [
                    "==",
                    "3.5.0"
                ]
            ]
        },
        {
            "name": "types-pytz",
            "specs": [
                [
                    "==",
                    "2024.2.0.20241221"
                ]
            ]
        },
        {
            "name": "types-regex",
            "specs": [
                [
                    "==",
                    "2024.11.6.20241221"
                ]
            ]
        },
        {
            "name": "tzdata",
            "specs": [
                [
                    "==",
                    "2024.2"
                ]
            ]
        },
        {
            "name": "virtualenv",
            "specs": [
                [
                    "==",
                    "20.28.1"
                ]
            ]
        },
        {
            "name": "xopen",
            "specs": [
                [
                    "==",
                    "2.0.2"
                ]
            ]
        },
        {
            "name": "zlib-ng",
            "specs": [
                [
                    "==",
                    "0.5.1"
                ]
            ]
        }
    ],
    "lcname": "stpipeline"
}
