# TagGD: Barcode Demultiplexing Utilities for Spatial Transcriptomics Data
[License: MIT](https://opensource.org/licenses/MIT)
[Python 3.10](https://www.python.org/downloads/release/python-310/)
[Python 3.11](https://www.python.org/downloads/release/python-311/)
[Python 3.12](https://www.python.org/downloads/release/python-312/)
[PyPI version](https://badge.fury.io/py/taggd)
[Build status](https://github.com/jfnavarro/taggd/actions/workflows/dev)
**TagGD** is a Python-based barcode demultiplexer for Spatial Transcriptomics data.
It provides a generalized, optimized, and up-to-date version of the original C++ demultiplexer "findIndexes," available [here](https://github.com/pelinakan/UBD).
For the original peer-reviewed reference to the program, see [PLOS ONE](http://www.plosone.org/article/info%3Adoi%2F10.1371%2Fjournal.pone.0057521).
## Overview
The primary goal of TagGD is to extract cDNA barcodes from input files (FASTQ, FASTA, SAM, or BAM)
and match them against a list of reference barcodes using a k-mer-based approach. Matched reads are
output with barcode and spatial information added to each record.
TagGD is versatile and can be used to demultiplex any type of index if a reference file is provided.
Users can even create fake spatial coordinates (X, Y) for general-purpose demultiplexing tasks.
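
To make the k-mer-based approach concrete, here is a minimal pure-Python sketch of the general idea: index the reference barcodes by their k-mers, use shared k-mers to shortlist candidates, then rank the candidates by edit distance. This is an illustration only, not TagGD's actual implementation; the function names `build_kmer_index`, `edit_distance`, and `match_barcode` are hypothetical, and plain Levenshtein distance is assumed as the metric.

```python
from collections import defaultdict


def build_kmer_index(barcodes, k):
    """Map every k-mer to the set of reference barcodes that contain it."""
    index = defaultdict(set)
    for barcode in barcodes:
        for i in range(len(barcode) - k + 1):
            index[barcode[i:i + k]].add(barcode)
    return index


def edit_distance(a, b):
    """Plain Levenshtein distance, keeping only one row of the DP table."""
    previous = list(range(len(b) + 1))
    for i, char_a in enumerate(a, start=1):
        current = [i]
        for j, char_b in enumerate(b, start=1):
            cost = 0 if char_a == char_b else 1
            current.append(min(previous[j] + 1,          # deletion
                               current[j - 1] + 1,       # insertion
                               previous[j - 1] + cost))  # substitution
        previous = current
    return previous[-1]


def match_barcode(read_barcode, index, k, max_distance):
    """Return the closest reference barcodes within max_distance (may be empty)."""
    # Shortlist: only barcodes sharing at least one k-mer with the read.
    candidates = set()
    for i in range(len(read_barcode) - k + 1):
        candidates |= index.get(read_barcode[i:i + k], set())
    # Score the shortlist and keep the candidates tied for the best distance.
    scored = [(edit_distance(read_barcode, c), c) for c in candidates]
    scored = [(d, c) for d, c in scored if d <= max_distance]
    if not scored:
        return []
    best = min(d for d, _ in scored)
    return sorted(c for d, c in scored if d == best)


index = build_kmer_index(["ACGTACGT", "TGCATGCA"], k=4)
hits = match_barcode("ACGTACGA", index, k=4, max_distance=2)
```

In this sketch, an empty result would correspond to an unmatched read, a single hit to a matched read, and several tied hits to an ambiguous read.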
## Key Features
- Supports FASTQ, FASTA, SAM, and BAM formats.
- Handles multiple indexes per read.
- K-mer-based matching for efficient and accurate demultiplexing.
- Outputs matched, unmatched, and ambiguous reads with annotated barcodes.
- Multiple options and distance metrics.
- Fast and memory efficient.
---
## Requirements
- Python 3.10 or higher
- cython
- pysam
- numpy
- dnaio
- pytest (testing)
---
## Installation
### From Source
If you are using a virtual environment like Anaconda:
```console
git clone https://github.com/jfnavarro/taggd.git
cd taggd
python setup.py build
python setup.py install
```
Or, using pip:
```console
git clone https://github.com/jfnavarro/taggd.git
cd taggd
pip install .
```
### Using `pip`
Install directly from PyPI:
```console
pip install taggd
```
---
## Building the Project
If you are contributing, testing or making changes to the code, you may need to build or rebuild the Cython extensions:
```console
python setup.py build_ext --inplace
```
## Testing the Project
```console
pytest
```
---
## Usage
### Basic Command
To see all available options, run:
```console
taggd_demultiplex -h
```
### Input Reference File Format
The reference file should contain barcodes and optional spatial coordinates, formatted as follows:
```tsv
BARCODE X Y
```
Example:
```tsv
ACGTACGT 0 0
TGCATGCA 1 1
```
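
For general-purpose demultiplexing with placeholder coordinates, a reference file in this layout can be generated with a few lines of Python. The sketch below assumes tab-separated columns (as the `tsv` label suggests) and simply reuses each barcode's position in the list as both X and Y; the helper name `write_reference` is hypothetical and not part of TagGD.

```python
import csv
import io


def write_reference(barcodes, handle):
    """Write one 'BARCODE<TAB>X<TAB>Y' row per barcode with placeholder coordinates."""
    writer = csv.writer(handle, delimiter="\t", lineterminator="\n")
    for position, barcode in enumerate(barcodes):
        # The coordinates carry no spatial meaning here; they only keep rows unique.
        writer.writerow([barcode, position, position])


# Write to an in-memory buffer; pass an open file handle to write to disk instead.
buffer = io.StringIO()
write_reference(["ACGTACGT", "TGCATGCA"], buffer)
reference_text = buffer.getvalue()
```

To write to disk, open the target file with `open(path, "w", newline="")` and pass the handle in place of the buffer.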
---
### Example Commands
#### Example
```console
taggd_demultiplex --k 6 --max-edit-distance 3 --overhang 2 --subprocesses 4 --seed randomseed <barcodes.tsv> <input_file> <output_prefix>
```
---
## Output
TagGD generates the following output files:
- `<output_prefix>_matched.*`: Reads that matched reference barcodes.
- `<output_prefix>_unmatched.*`: Reads that did not match any reference barcodes.
- `<output_prefix>_ambiguous.*`: Reads that matched multiple barcodes.
- `<output_prefix>_results.tsv`: Summary statistics of the run.
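
When scripting around a run, it can help to collect these files by category. The following is a small sketch that relies only on the naming pattern listed above; the helper `find_outputs` is hypothetical, and it assumes the output prefix contains no glob wildcard characters.

```python
from pathlib import Path

# Category suffixes taken from the output file list above.
CATEGORIES = ("matched", "unmatched", "ambiguous", "results")


def find_outputs(output_prefix):
    """Group existing output files by category for a given output prefix."""
    prefix = Path(output_prefix)
    found = {}
    for category in CATEGORIES:
        # For example, 'run_matched.*' matches 'run_matched.fastq' but not 'run_unmatched.fastq'.
        pattern = f"{prefix.name}_{category}.*"
        found[category] = sorted(prefix.parent.glob(pattern))
    return found
```

A call such as `find_outputs("results/run")` returns a dictionary mapping each category to the list of matching paths, which is empty for any category whose file is absent.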
### Options
Run `taggd_demultiplex -h` to view all available options and their descriptions.
---
## Contact
For questions, bug reports, or contributions, please contact:
- **Jose Fernandez Navarro**: <jc.fernandez.navarro@gmail.com>