- Name: bigtrack
- Version: 0.2
- Summary: A lightweight Python package for creating UCSC Track Hubs with ease.
- Upload time: 2025-08-28 04:52:35
- Author: Shilong Zhang <shilong.zhang@sjtu.edu.cn>
- Requires Python: >=3.7
- License: MIT
- Keywords: ucsc, track hub, genomics, bioinformatics

# bigtrack

[![GitHub Actions Workflow Status](https://img.shields.io/github/actions/workflow/status/zhang-shilong/bigtrack/python-publish.yml)](https://github.com/zhang-shilong/bigtrack/actions)
[![PyPI - Version](https://img.shields.io/pypi/v/bigtrack?label=PyPI&color=%230073b7)](https://pypi.org/project/bigtrack/)
[![GitHub License](https://img.shields.io/github/license/zhang-shilong/bigtrack)](./LICENSE)

A lightweight Python package for creating UCSC Track Hubs with ease.

_Note: This package was primarily developed to generate track hubs for my previous publications. It has not been tested for production use._

## Installation

Install with pip:

```bash
pip install bigtrack
```

Install the latest version from source:

```bash
git clone https://github.com/zhang-shilong/bigtrack
cd bigtrack/
pip install .
```

## Usage

### Quick start

```python
import bigtrack

# make a hub
hub = bigtrack.Hub(
    hub="ExampleHub",
    shortLabel="ExampleHub",
    longLabel="ExampleHub",
    email="example@email.com",
)

# make a genome
genome = bigtrack.Genome(
    genome="ExampleGenome",
    organism="Example Organism",
    scientificName="Example Organism",
    twoBitPath="/path/to/two/bit/file",
    chromSizes="/path/to/sizes/file",
    defaultPos="chr1:0-100000",
    orderKey=1,
    description="This is an example",
    htmlPath="/path/to/html/description",
)
hub.add_genome(genome)  # add the genome to hub

# make a group
group_map = bigtrack.Group(
    name="map",
    label="Mapping and Sequencing",
    priority=2,
)
genome.add_group(group_map)  # add the group to genome

# make a trackDb
trackDb_map = bigtrack.TrackDb(
    include="trackDb_map.txt",
)
genome.add_trackDb(trackDb_map)  # add the trackDb to genome

# make a track
track_ideogram = bigtrack.Track(
    track="cytoBandIdeo",
    shortLabel="Chromosome Band (Ideogram)",
    longLabel="Ideogram for Orientation",
    bigDataUrl="/path/to/track/file",
    type="bigBed 4 +",
    group="map",
)
trackDb_map.add_track(track_ideogram)  # add the track to trackDb

# finally, one function to generate the file structure
hub.generate()
```

Then, find your track hub under the `ExampleHub/` directory.

### Data structure

When `hub.generate()` runs, bigtrack writes a directory tree suitable for hosting as a UCSC Track Hub. The exact layout can be configured, but a typical generated structure looks like:

```
ExampleHub/
├─ hub.txt
├─ genomes.txt
├─ ExampleGenome/
│  ├─ groups.txt
│  ├─ trackDb.txt  # include all trackDbs
│  ├─ trackDb_map.txt
│  └─ trackDb_xxx.txt
└─ AnotherGenome/
   ├─ groups.txt
   ├─ trackDb.txt  # include all trackDbs
   ├─ trackDb_map.txt
   └─ trackDb_xxx.txt
```
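
To confirm that a generated hub matches this layout before uploading it, a small check along the following lines may help. This is a minimal sketch that uses only the standard library and the default file names shown in the tree; `missing_hub_files` is a hypothetical helper, not part of bigtrack.

```python
from pathlib import Path


def missing_hub_files(hub_dir, genomes):
    """Return expected hub files (relative paths) that are absent under hub_dir."""
    root = Path(hub_dir)
    expected = [root / "hub.txt", root / "genomes.txt"]
    for genome in genomes:
        # Each genome directory carries its own groups.txt and trackDb.txt.
        expected.append(root / genome / "groups.txt")
        expected.append(root / genome / "trackDb.txt")
    return [p.relative_to(root).as_posix() for p in expected if not p.is_file()]


if __name__ == "__main__":
    # An empty list means every expected file was found.
    print(missing_hub_files("ExampleHub", ["ExampleGenome"]))
```

If you override the default file names (for example `genomesFile`), adjust the expected paths accordingly.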

You can host this directory on any web server (HTTP/HTTPS/FTP) and point UCSC Genome Browser at the `hub.txt` URL.
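
Once the hub is hosted, a shareable browser link can be built from the `hub.txt` URL. The sketch below assumes the UCSC `hgTracks` page accepts `hubUrl` and `genome` query parameters, as described in UCSC's track hub documentation; `browser_link` and the `example.org` URL are placeholders, not part of bigtrack.

```python
from urllib.parse import urlencode


def browser_link(hub_url, genome=None,
                 base="https://genome.ucsc.edu/cgi-bin/hgTracks"):
    """Build a Genome Browser URL that attaches the hub at hub_url."""
    params = {"hubUrl": hub_url}
    if genome is not None:
        # For assembly hubs, select which genome in the hub to open.
        params["genome"] = genome
    return base + "?" + urlencode(params)


print(browser_link("https://example.org/ExampleHub/hub.txt",
                   genome="ExampleGenome"))
```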

### Hub components

bigtrack models the standard UCSC hub components as Python classes. Each class has a set of reserved keys that are required for correct hub generation; some of them have sensible defaults. Note that the keys bigtrack requires may not match the ones UCSC's documentation lists as required.
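
The "required keys with defaults" idea can be pictured as follows. This is an illustrative sketch, not bigtrack's implementation: `resolve_settings` is a hypothetical helper, and the key names and defaults are taken from the Group section below.

```python
GROUP_REQUIRED = ("name", "label", "priority", "defaultIsClosed")
GROUP_DEFAULTS = {"priority": 1, "defaultIsClosed": 0}


def resolve_settings(given, required, defaults):
    """Fill in defaults, then fail if a required key is still absent."""
    settings = {**defaults, **given}  # user-supplied values win over defaults
    missing = [key for key in required if key not in settings]
    if missing:
        raise ValueError("missing required keys: " + ", ".join(missing))
    return settings


group = resolve_settings(
    {"name": "map", "label": "Mapping and Sequencing", "priority": 2},
    GROUP_REQUIRED,
    GROUP_DEFAULTS,
)
print(group)
```

A key with a default (such as `defaultIsClosed`) never has to be passed explicitly, while a key without one (such as `label`) raises an error when omitted.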

#### Hub

Top-level hub object. Represents `hub.txt`.

Required keys: `hub`, `shortLabel`, `longLabel`, `genomesFile` (default: `genomes.txt`), `email`.
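
For orientation, a UCSC `hub.txt` is a plain-text stanza with one `key value` pair per line. The sketch below renders the quick-start hub settings in that form; `render_stanza` is a hypothetical helper used only for illustration, and the exact output bigtrack writes may differ.

```python
def render_stanza(settings):
    """Render one stanza: a 'key value' pair per line."""
    return "".join(f"{key} {value}\n" for key, value in settings.items())


hub_txt = render_stanza({
    "hub": "ExampleHub",
    "shortLabel": "ExampleHub",
    "longLabel": "ExampleHub",
    "genomesFile": "genomes.txt",  # the documented default
    "email": "example@email.com",
})
print(hub_txt, end="")
```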

#### Genome

Represents a genome entry (appears in `genomes.txt` and holds per-genome resources).

Required keys: `genome`, `trackDb` (default: `trackDb.txt`), `groups` (default: `groups.txt`), `organism`, `scientificName`.

#### Group

A logical grouping of tracks, used to organize the browser UI.

Required keys: `name`, `label`, `priority` (default: 1), `defaultIsClosed` (default: 0).

#### TrackDb

A container class that holds tracks and writes a trackDb file.

Required keys: `include`.
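
In the directory layout above, the top-level `trackDb.txt` pulls in every per-topic trackDb file. UCSC trackDb files support an `include` directive for this, so the top-level file can be as simple as one `include` line per file. A minimal sketch, assuming that is the intended shape (`render_includes` is hypothetical and not part of bigtrack):

```python
def render_includes(trackdb_files):
    """Render a top-level trackDb.txt that includes each listed file."""
    return "".join(f"include {name}\n" for name in trackdb_files)


print(render_includes(["trackDb_map.txt", "trackDb_xxx.txt"]), end="")
```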

#### Track

Basic (atomic) track object.

Required keys: `track`, `parent` (default: `None`), `shortLabel`, `longLabel`, `type`.

For convenience, track collection classes are also available:

#### CompositeTrack

A composite track groups multiple subtracks that share the same type. See UCSC docs for [composite track settings](https://genome.ucsc.edu/goldenpath/help/trackDb/trackDbHub.html#Composite_Track_Settings).

Required keys: `track`, `compositeTrack` (default: `on`), `parent` (default: `None`), `shortLabel`, `longLabel`, `type`.
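
In trackDb terms, a composite is a parent stanza carrying `compositeTrack on`, followed by subtrack stanzas that name it through `parent`. The sketch below renders that shape with plain Python. The helpers, track names, and file names are made up for illustration; this is not how bigtrack necessarily formats its output.

```python
def render_stanza(settings, indent=""):
    """Render one trackDb stanza as 'key value' lines."""
    return "".join(f"{indent}{key} {value}\n" for key, value in settings.items())


def render_composite(parent, children):
    """Render a composite parent followed by its subtracks."""
    blocks = [render_stanza(parent)]
    for child in children:
        # Each subtrack names its composite through the `parent` setting.
        linked = {**child, "parent": parent["track"]}
        blocks.append(render_stanza(linked, indent="    "))
    return "\n".join(blocks)  # blank line between stanzas


parent = {"track": "signal", "compositeTrack": "on", "shortLabel": "Signal",
          "longLabel": "Signal tracks", "type": "bigWig"}
children = [
    {"track": "sampleA", "shortLabel": "A", "longLabel": "Sample A",
     "type": "bigWig", "bigDataUrl": "sampleA.bw"},
    {"track": "sampleB", "shortLabel": "B", "longLabel": "Sample B",
     "type": "bigWig", "bigDataUrl": "sampleB.bw"},
]
text = render_composite(parent, children)
print(text, end="")
```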

#### SampledCompositeTrack

A convenience helper that produces a sampled subset of a CompositeTrack automatically. Useful when you have many samples and want to produce a smaller subset for quick browsing.

```python
bigtrack.SampledCompositeTrack(
    full_track: bigtrack.CompositeTrack,
    number: int,  # number of sampled child tracks from full_track
    random_seed: int = 0,
    suffix: str = "_subset",
    **kwargs,  # kwargs to override
)
```
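
How bigtrack picks the subset internally is not described here. As an illustration of what seeded sampling buys you, the sketch below draws a reproducible subset with a private `random.Random` instance; `sample_children` is a hypothetical helper whose parameter names mirror the signature above.

```python
import random


def sample_children(children, number, random_seed=0):
    """Pick `number` children reproducibly, keeping their original order."""
    if number > len(children):
        raise ValueError("cannot sample more children than are available")
    rng = random.Random(random_seed)  # private RNG; global state is untouched
    picked = set(rng.sample(range(len(children)), number))
    return [child for i, child in enumerate(children) if i in picked]


names = [f"sample{i}" for i in range(10)]
subset = sample_children(names, 3, random_seed=0)
print(subset)
```

Because the seed is fixed, rerunning the script regenerates the same subset, which keeps the hub stable between builds.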

#### SuperTrack

A superTrack is a higher-level container that can hold multiple composite tracks or plain tracks. See UCSC docs for [super track settings](https://genome.ucsc.edu/goldenpath/help/trackDb/trackDbHub.html#superTrack).

Required keys: `track`, `superTrack` (default: `on`), `parent` (default: `None`), `shortLabel`, `longLabel`.

#### MultiWig

A multiWig track enables the simultaneous display and comparison of multiple wiggle signal tracks. See UCSC docs for [multiWig settings](https://genome.ucsc.edu/goldenpath/help/trackDb/trackDbHub.html#multiWig).

Required keys: `track`, `parent` (default: `None`), `container` (default: `multiWig`), `type` (default: `bigWig`), `shortLabel`, `longLabel`.

### Example

See the code for the [T2T Macaque Hub](./trackhubs/generate_T2TMacaqueHub.py).

## Todo

- [ ] Add pre-flight checks while generating hubs
- [ ] Add automatic format conversion

## Acknowledgement

Thanks to the Python package [daler/trackhub](https://github.com/daler/trackhub).

## Citation

1. Zhang, S., Xu, N., Fu, L. *et al*. Integrated analysis of the complete sequence of a macaque genome. *Nature* (2025). [https://doi.org/10.1038/s41586-025-08596-w](https://doi.org/10.1038/s41586-025-08596-w)
2. Zhang, S. *et al*. A complete and near-perfect rhesus macaque reference genome: lessons from subtelomeric repeats and sequencing bias. *bioRxiv* (2025). [https://doi.org/10.1101/2025.08.04.668424](https://doi.org/10.1101/2025.08.04.668424)

## License

This project is licensed under the MIT License — see the [LICENSE](./LICENSE) file for details.

            
