taggd


- Name: taggd
- Version: 0.4.0
- Home page: https://github.com/jfnavarro/taggd
- Summary: Bioinformatics genetic barcode demultiplexing (Spatial Transcriptomics)
- Upload time: 2025-02-08 16:44:25
- Author: Jose Fernandez Navarro
- Requires Python: >=3.10
- License: MIT
- Keywords: bioinformatics, demultiplexing

# TagGD: Barcode Demultiplexing Utilities for Spatial Transcriptomics Data

[![License: MIT](https://img.shields.io/badge/License-MIT-yellow.svg)](https://opensource.org/licenses/MIT)
[![Python 3.10](https://img.shields.io/badge/python-3.10-blue.svg)](https://www.python.org/downloads/release/python-310/)
[![Python 3.11](https://img.shields.io/badge/python-3.11-blue.svg)](https://www.python.org/downloads/release/python-311/)
[![Python 3.12](https://img.shields.io/badge/python-3.12-blue.svg)](https://www.python.org/downloads/release/python-312/)
[![PyPI version](https://badge.fury.io/py/taggd.svg)](https://badge.fury.io/py/taggd)
[![Build Status](https://github.com/jfnavarro/taggd/actions/workflows/dev.yml/badge.svg)](https://github.com/jfnavarro/taggd/actions/workflows/dev)

**TagGD** is a Python-based barcode demultiplexer for Spatial Transcriptomics data.
It provides a generalized, optimized, and up-to-date version of the original C++ demultiplexer "findIndexes," available [here](https://github.com/pelinakan/UBD).

For the original peer-reviewed reference to the program, see [PLOS ONE](http://www.plosone.org/article/info%3Adoi%2F10.1371%2Fjournal.pone.0057521).

## Overview

The primary goal of TagGD is to extract cDNA barcodes from input files (FASTQ, FASTA, SAM, or BAM)
and match them against a list of reference barcodes using a k-mer-based approach. Matched reads are
output with barcode and spatial information added to each record.
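
As a rough illustration of the k-mer idea described above, the sketch below indexes reference barcodes by their k-mers, uses that index to shortlist candidates for an observed barcode, and then scores the candidates. It is not TagGD's implementation: the function names are hypothetical, and Hamming distance is used only to keep the example short.

```python
from collections import defaultdict


def build_kmer_index(barcodes, k):
    """Map every k-mer to the set of reference barcodes that contain it."""
    index = defaultdict(set)
    for barcode in barcodes:
        for i in range(len(barcode) - k + 1):
            index[barcode[i:i + k]].add(barcode)
    return index


def hamming(a, b):
    """Count mismatching positions between two equal-length strings."""
    return sum(x != y for x, y in zip(a, b))


def match_barcode(observed, index, k, max_distance):
    """Return the closest reference barcodes, or [] if none is close enough."""
    # Shortlist: only barcodes sharing at least one k-mer are scored.
    candidates = set()
    for i in range(len(observed) - k + 1):
        candidates |= index.get(observed[i:i + k], set())
    scored = [(hamming(observed, c), c)
              for c in candidates if len(c) == len(observed)]
    scored = [(d, c) for d, c in scored if d <= max_distance]
    if not scored:
        return []  # would be reported as unmatched
    best = min(d for d, _ in scored)
    # More than one barcode at the best distance would be ambiguous.
    return sorted(c for d, c in scored if d == best)


reference = ["ACGTACGT", "TGCATGCA"]
index = build_kmer_index(reference, k=4)
print(match_barcode("ACGTACGA", index, k=4, max_distance=2))
```

The shortlist step is what makes the approach efficient: most reference barcodes share no k-mer with the read and are never scored.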

TagGD is versatile and can be used to demultiplex any type of index if a reference file is provided.
Users can even create fake spatial coordinates (X, Y) for general-purpose demultiplexing tasks.
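
For general-purpose demultiplexing, a reference file with placeholder coordinates could be generated along these lines. This is a minimal sketch: the helper name is hypothetical, the tab separator is an assumption, and the choice of distinct `(i, 0)` coordinates is arbitrary.

```python
def make_reference_lines(barcodes):
    """Pair each barcode with a placeholder coordinate (i, 0)."""
    # Assumption: columns are tab-separated; adjust if your file uses spaces.
    return [f"{barcode}\t{i}\t0" for i, barcode in enumerate(barcodes)]


lines = make_reference_lines(["ACGTACGT", "TGCATGCA"])
print("\n".join(lines))
```

Writing the result to disk is then a matter of `Path("barcodes.tsv").write_text("\n".join(lines) + "\n")`, with `Path` imported from `pathlib`.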

## Key Features

- Supports FASTQ, FASTA, SAM, and BAM formats.
- Handles multiple indexes per read.
- K-mer-based matching for efficient and accurate demultiplexing.
- Outputs matched, unmatched, and ambiguous reads with annotated barcodes.
- Multiple options and distance metrics.
- Fast and memory efficient.
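
To make the notion of a distance metric concrete, here is a generic Levenshtein (edit) distance between an observed and a reference barcode. It is a textbook sketch for illustration only; the metrics TagGD actually offers are listed by `taggd_demultiplex -h`.

```python
def levenshtein(a, b):
    """Minimum number of substitutions, insertions, and deletions turning a into b."""
    previous = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        current = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            current.append(min(
                previous[j] + 1,         # delete ca
                current[j - 1] + 1,      # insert cb
                previous[j - 1] + cost,  # substitute or match
            ))
        previous = current
    return previous[-1]


print(levenshtein("ACGTACGT", "ACGTTCGT"))  # one substitution
print(levenshtein("ACGTACGT", "ACGTACG"))   # one deletion
```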

---

## Requirements

- Python 3.10 or higher
- cython
- pysam
- numpy
- dnaio
- pytest (testing)

---

## Installation

### From Source

If you are using a virtual environment like Anaconda:

```console
git clone https://github.com/jfnavarro/taggd.git
cd taggd
python setup.py build
python setup.py install
```

Or, using pip:

```console
git clone https://github.com/jfnavarro/taggd.git
cd taggd
pip install .
```

### Using `pip`

Install directly from PyPI:

```console
pip install taggd
```

---

## Building the Project

If you are contributing, testing or making changes to the code, you may need to build or rebuild the Cython extensions:

```console
python setup.py build_ext --inplace
```

## Testing the Project

Run the test suite with pytest:

```console
pytest
```

---

## Usage

### Basic Command

To see all available options, run:

```console
taggd_demultiplex -h
```

### Input Reference File Format

The reference file should contain barcodes and optional spatial coordinates, formatted as follows:

```tsv
BARCODE X Y
```

Example:

```tsv
ACGTACGT 0 0
TGCATGCA 1 1
```
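
Before a long run, it can help to sanity-check the reference file. The sketch below is a hypothetical pre-flight validator, not TagGD's own parser. It assumes whitespace-separated columns and integer coordinates, and it accepts barcode-only lines because the coordinates are described as optional.

```python
def parse_reference(lines):
    """Return {barcode: (x, y)} or {barcode: None} from 'BARCODE [X Y]' lines."""
    barcodes = {}
    for number, line in enumerate(lines, start=1):
        fields = line.split()
        if not fields:
            continue  # skip blank lines
        if len(fields) not in (1, 3):
            raise ValueError(f"line {number}: expected 1 or 3 columns, got {len(fields)}")
        barcode = fields[0]
        if barcode in barcodes:
            raise ValueError(f"line {number}: duplicate barcode {barcode}")
        # Assumption: coordinates are integers, as in the example above.
        barcodes[barcode] = (int(fields[1]), int(fields[2])) if len(fields) == 3 else None
    return barcodes


print(parse_reference(["ACGTACGT 0 0", "TGCATGCA 1 1"]))
```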

---

### Example Commands

#### Example

```console
taggd_demultiplex \
  --k 6 \
  --max-edit-distance 3 \
  --overhang 2 \
  --subprocesses 4 \
  --seed randomseed \
  <barcodes.tsv> <input_file> <output_prefix>
```
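
When the tool is driven from a pipeline script, the same command can be assembled in Python. The sketch below only builds the argument list. The flags are copied from the example above, the default values mirror that example rather than TagGD's own defaults, and the function name and file paths are hypothetical.

```python
import shlex


def build_command(barcodes, reads, prefix, k=6, max_edit_distance=3,
                  overhang=2, subprocesses=4, seed="randomseed"):
    """Assemble the taggd_demultiplex argument list shown in the example."""
    return [
        "taggd_demultiplex",
        "--k", str(k),
        "--max-edit-distance", str(max_edit_distance),
        "--overhang", str(overhang),
        "--subprocesses", str(subprocesses),
        "--seed", str(seed),
        str(barcodes), str(reads), str(prefix),
    ]


cmd = build_command("barcodes.tsv", "reads.fastq", "run1")
print(shlex.join(cmd))
```

To execute it, pass the list to `subprocess.run(cmd, check=True)`. Using a list instead of a single shell string avoids quoting problems with unusual file names.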

---

## Output

TagGD generates the following output files:

- `<output_prefix>_matched.*`: Reads that matched reference barcodes.
- `<output_prefix>_unmatched.*`: Reads that did not match any reference barcodes.
- `<output_prefix>_ambiguous.*`: Reads that matched multiple barcodes.
- `<output_prefix>_results.tsv`: Summary statistics of the run.
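
A downstream script can locate these files from the prefix alone. The following is a minimal sketch with a hypothetical helper name. It relies only on the naming patterns listed above and makes no assumption about the file extensions of the read files.

```python
from pathlib import Path

OUTPUT_KINDS = ("matched", "unmatched", "ambiguous", "results")


def find_outputs(output_prefix):
    """Return {kind: [paths]} for files named <output_prefix>_<kind>.*"""
    prefix = Path(output_prefix)
    return {
        kind: sorted(prefix.parent.glob(f"{prefix.name}_{kind}.*"))
        for kind in OUTPUT_KINDS
    }


# Example: list whatever a run with prefix "run1" left in the current directory.
for kind, paths in find_outputs("run1").items():
    print(kind, [p.name for p in paths])
```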

### Options

Run `taggd_demultiplex -h` to view all available options and their descriptions.

---

## Contact

For questions, bug reports, or contributions, please contact:

- **Jose Fernandez Navarro**: <jc.fernandez.navarro@gmail.com>

            

Raw data

            {
    "_id": null,
    "home_page": "https://github.com/jfnavarro/taggd",
    "name": "taggd",
    "maintainer": null,
    "docs_url": null,
    "requires_python": ">=3.10",
    "maintainer_email": null,
    "keywords": "bioinformatics, demultiplexing",
    "author": "Jose Fernandez Navarro",
    "author_email": "jc.fernandez.navarro@gmail.com",
    "download_url": "https://files.pythonhosted.org/packages/2b/fb/909594bd17cc81c283192643bbb8abb0b93bca53f713bd9fe951d5717732/taggd-0.4.0.tar.gz",
    "platform": null,
    "bugtrack_url": null,
    "license": "MIT",
    "summary": "Bioinformatics genetic barcode demultiplexing (Spatial Transcriptomics)",
    "version": "0.4.0",
    "project_urls": {
        "Download": "https://github.com/jfnavarro/taggd/0.4.0",
        "Homepage": "https://github.com/jfnavarro/taggd"
    },
    "split_keywords": [
        "bioinformatics",
        " demultiplexing"
    ],
    "urls": [
        {
            "comment_text": null,
            "digests": {
                "blake2b_256": "fb24d4d6ae804bf1621c37a7b02a4d5c00e06db00271f7d1800912e6acef1310",
                "md5": "94d8999a5375adc1737a32a58cf5f5d1",
                "sha256": "d2ca2cdcb4733f61b80095a9d92ffcae4681ac6bdc350dd8db829390f498162d"
            },
            "downloads": -1,
            "filename": "taggd-0.4.0-cp310-cp310-macosx_11_0_arm64.whl",
            "has_sig": false,
            "md5_digest": "94d8999a5375adc1737a32a58cf5f5d1",
            "packagetype": "bdist_wheel",
            "python_version": "cp310",
            "requires_python": ">=3.10",
            "size": 199859,
            "upload_time": "2025-02-08T16:44:23",
            "upload_time_iso_8601": "2025-02-08T16:44:23.087437Z",
            "url": "https://files.pythonhosted.org/packages/fb/24/d4d6ae804bf1621c37a7b02a4d5c00e06db00271f7d1800912e6acef1310/taggd-0.4.0-cp310-cp310-macosx_11_0_arm64.whl",
            "yanked": false,
            "yanked_reason": null
        },
        {
            "comment_text": null,
            "digests": {
                "blake2b_256": "2bfb909594bd17cc81c283192643bbb8abb0b93bca53f713bd9fe951d5717732",
                "md5": "10e1d1ce41559b5c9cf195dd07b5abee",
                "sha256": "452af6d66615c00de2a4a205a5cc2a22d94a259b4c99e5c16981baaf34fca18e"
            },
            "downloads": -1,
            "filename": "taggd-0.4.0.tar.gz",
            "has_sig": false,
            "md5_digest": "10e1d1ce41559b5c9cf195dd07b5abee",
            "packagetype": "sdist",
            "python_version": "source",
            "requires_python": ">=3.10",
            "size": 431238,
            "upload_time": "2025-02-08T16:44:25",
            "upload_time_iso_8601": "2025-02-08T16:44:25.562611Z",
            "url": "https://files.pythonhosted.org/packages/2b/fb/909594bd17cc81c283192643bbb8abb0b93bca53f713bd9fe951d5717732/taggd-0.4.0.tar.gz",
            "yanked": false,
            "yanked_reason": null
        }
    ],
    "upload_time": "2025-02-08 16:44:25",
    "github": true,
    "gitlab": false,
    "bitbucket": false,
    "codeberg": false,
    "github_user": "jfnavarro",
    "github_project": "taggd",
    "travis_ci": false,
    "coveralls": false,
    "github_actions": true,
    "lcname": "taggd"
}
        