ONTraC


- **Name:** ONTraC
- **Version:** 0.0.5
- **Summary:** A niche-centered, machine learning method for constructing spatially continuous trajectories
- **Upload time:** 2024-04-25 15:52:35
- **Requires Python:** ==3.11.*
- **License:** MIT
- **Keywords:** deep-learning, pytorch, pytorch geometric, trajectory inference, spatial omics
# **ONTraC**

ONTraC (Ordered Niche Trajectory Construction) is a niche-centered, machine learning
method for constructing spatially continuous trajectories. ONTraC differs from existing tools in
that it treats a niche, rather than an individual cell, as the basic unit for spatial trajectory
analysis. In this context, we define a niche as a multicellular, spatially localized region where
different cell types may coexist and interact with each other. ONTraC integrates
cell-type composition and spatial information by using the graph neural network modeling
framework. Its output, called the niche trajectory, can be viewed as a one-dimensional
representation of the tissue microenvironment continuum. By disentangling cell-level and
niche-level properties, niche trajectory analysis provides a coherent framework for studying coordinated
responses from all the cells in association with continuous tissue microenvironment variations.

![ONTraC Structure](docs/source/_static/images/ONTraC_structure.png)
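
To make the niche idea concrete, here is a minimal Python sketch. It is not ONTraC's actual implementation: it assumes a niche can be approximated by a cell's `k` nearest cells within one sample, and it summarizes that neighborhood as cell-type fractions. The function name `niche_composition` and the toy coordinates are hypothetical.

```python
import math
from collections import Counter


def niche_composition(cells, k):
    """Return, for each cell, the cell-type fractions among its k nearest cells.

    `cells` is a list of (cell_id, cell_type, x, y) tuples from one sample.
    The cell itself counts as one of its own k nearest cells.
    """
    compositions = {}
    for cell_id, _, x, y in cells:
        # Brute-force nearest-neighbour search; fine for a toy example only.
        nearest = sorted(cells, key=lambda c: math.hypot(c[2] - x, c[3] - y))[:k]
        counts = Counter(c[1] for c in nearest)
        compositions[cell_id] = {t: n / len(nearest) for t, n in counts.items()}
    return compositions


toy_cells = [
    ("c1", "Fibro", 0.0, 0.0),
    ("c2", "Fibro", 1.0, 0.0),
    ("c3", "Neuron", 0.0, 1.0),
    ("c4", "Neuron", 10.0, 10.0),
]
compositions = niche_composition(toy_cells, k=3)
print(compositions["c1"])
```

Two cells with similar neighborhoods get similar composition vectors even if their own cell types differ, which is the sense in which the niche, rather than the cell, becomes the unit of analysis.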

## Required packages

- pyyaml=6.0.1
- pandas=2.1.1
- pytorch=2.2.1
- torch_geometric=2.5.1

## Installation

Please see the [installation tutorial](tutorials/installation.md).

## Tutorial

### Input File

An example input file is provided in `examples/stereo_seq_brain/original_data.csv`.
This file contains all input information in five columns: Cell_ID, Sample, Cell_Type, x, and y.

| Cell_ID         | Sample   | Cell_Type | x       | y     |
| --------------- | -------- | --------- | ------- | ----- |
| E12_E1S3_100034 | E12_E1S3 | Fibro     | 15940   | 18584 |
| E12_E1S3_100035 | E12_E1S3 | Fibro     | 15942   | 18623 |
| ...             | ...      | ...       | ...     | ...   |
| E16_E2S7_326412 | E16_E2S7 | Fibro     | 32990.5 | 14475 |

For detailed information about the input and output files, please see [IO files explanation](tutorials/IO_files.md#input-files).
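
Before a long run, it can help to check that a file has the five columns described above. The following is a minimal sketch using only the Python standard library. It assumes the file is comma-separated with a header row; the helper name `check_input_file` is hypothetical and not part of ONTraC.

```python
import csv

REQUIRED_COLUMNS = ["Cell_ID", "Sample", "Cell_Type", "x", "y"]


def check_input_file(path):
    """Return a list of problems found in the input CSV (empty if none)."""
    problems = []
    with open(path, newline="") as handle:
        reader = csv.DictReader(handle)
        header = reader.fieldnames or []
        missing = [col for col in REQUIRED_COLUMNS if col not in header]
        if missing:
            return [f"missing columns: {', '.join(missing)}"]
        # Data rows start on line 2 because line 1 is the header.
        for line_no, row in enumerate(reader, start=2):
            for axis in ("x", "y"):
                try:
                    float(row[axis])
                except (TypeError, ValueError):
                    problems.append(
                        f"line {line_no}: {axis} is not numeric: {row[axis]!r}"
                    )
    return problems
```

For example, `check_input_file("examples/stereo_seq_brain/original_data.csv")` should return an empty list if the example file matches the table above.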

### Running ONTraC

The required options for running ONTraC are the paths to the input file and the three output directories:

- **preprocessing-dir:** This directory stores preprocessed data and other intermediary datasets for analysis.
- **GNN-dir:** This directory stores output from running the GP (Graph Pooling) algorithm.
- **NTScore-dir:** This directory stores NTScore output.

```{sh}
cd examples/stereo_seq_brain
ONTraC -d original_data.csv --preprocessing-dir stereo_seq_preprocessing_dir --GNN-dir stereo_seq_GNN --NTScore-dir stereo_seq_NTScore
```
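
If you launch ONTraC from a Python script or pipeline, the same command can be assembled as an argument list. This is a minimal sketch, assuming the `ONTraC` executable is on your `PATH`; the helper `build_ontrac_command` is hypothetical, and the only flags it uses are ones that appear in this README.

```python
import shlex


def build_ontrac_command(dataset, preprocessing_dir, gnn_dir, ntscore_dir, **options):
    """Assemble an ONTraC argument list.

    Keyword options map to long flags, e.g. n_neighbors=50 -> --n-neighbors 50.
    No validation is done: pass only options that ONTraC actually accepts.
    """
    command = [
        "ONTraC",
        "-d", str(dataset),
        "--preprocessing-dir", str(preprocessing_dir),
        "--GNN-dir", str(gnn_dir),
        "--NTScore-dir", str(ntscore_dir),
    ]
    for name, value in options.items():
        command += ["--" + name.replace("_", "-"), str(value)]
    return command


command = build_ontrac_command(
    "original_data.csv",
    "stereo_seq_preprocessing_dir",
    "stereo_seq_GNN",
    "stereo_seq_NTScore",
    device="cuda",
    n_neighbors=50,
)
print(shlex.join(command))
# To launch it from the example directory:
#   import subprocess; subprocess.run(command, check=True)
```

Passing a list rather than a single string avoids shell-quoting problems when directory names contain spaces.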

We recommend running `ONTraC` on a GPU; it may take much longer on a CPU-only machine such as a laptop.

All available parameter options are listed below.

```{text}
Usage: ONTraC <-d DATASET> <--preprocessing-dir PREPROCESSING_DIR> <--GNN-dir GNN_DIR> <--NTScore-dir NTSCORE_DIR>
    [--n-cpu N_CPU] [--n-neighbors N_NEIGHBORS] [--device DEVICE] [--epochs EPOCHS] [--patience PATIENCE] [--min-delta MIN_DELTA] 
    [--min-epochs MIN_EPOCHS] [--batch-size BATCH_SIZE] [-s SEED] [--seed SEED] [--lr LR] [--hidden-feats HIDDEN_FEATS] [-k K_CLUSTERS]
    [--modularity-loss-weight MODULARITY_LOSS_WEIGHT] [--purity-loss-weight PURITY_LOSS_WEIGHT] 
    [--regularization-loss-weight REGULARIZATION_LOSS_WEIGHT] [--beta BETA]

All steps of ONTraC including dataset creation, Graph Pooling, and NT score
calculation.

Options:
  --version             show program's version number and exit
  -h, --help            show this help message and exit

  IO:
    -d DATASET, --dataset=DATASET
                        Original input dataset.
    --preprocessing-dir=PREPROCESSING_DIR
                        Directory for preprocessing outputs.
    --GNN-dir=GNN_DIR   Directory for the GNN output.
    --NTScore-dir=NTSCORE_DIR
                        Directory for the NTScore output

  Niche Network Construction:
    --n-cpu=N_CPU       Number of CPUs used for parallel computing. Default is
                        4.
    --n-neighbors=N_NEIGHBORS
                        Number of neighbors used for kNN graph construction.
                        Default is 50.

  Options for training:
    --device=DEVICE     Device for training. We support cpu and cuda now. Auto
                        select if not specified.
    --epochs=EPOCHS     Number of maximum epochs for training. Default is
                        1000.
    --patience=PATIENCE
                        Number of epochs wait for better result. Default is
                        100.
    --min-delta=MIN_DELTA
                        Minimum delta for better result. Default is 0.001
    --min-epochs=MIN_EPOCHS
                        Minimum number of epochs for training. Default is 50.
                        Set to 0 to disable.
    --batch-size=BATCH_SIZE
                        Batch size for training. Default is 0 for whole
                        dataset.
    -s SEED, --seed=SEED
                        Random seed for training. Default is random.
    --lr=LR             Learning rate for training. Default is 0.03.
    --hidden-feats=HIDDEN_FEATS
                        Number of hidden features. Default is 4.
    -k K, --k-clusters=K
                        Number of niche clusters. Default is 6.
    --modularity-loss-weight=MODULARITY_LOSS_WEIGHT
                        Weight for modularity loss. Default is 0.3.
    --purity-loss-weight=PURITY_LOSS_WEIGHT
                        Weight for purity loss. Default is 300.
    --regularization-loss-weight=REGULARIZATION_LOSS_WEIGHT
                        Weight for regularization loss. Default is 0.1.
    --beta=BETA         Beta value control niche cluster assignment matrix.
                        Default is 0.3.

```

### Output

The intermediate and final results are located in the `preprocessing-dir`, `GNN-dir`, and `NTScore-dir` directories. Please see [IO files explanation](tutorials/IO_files.md#output-files) for detailed information.

### Visualization

Please see the [post-analysis tutorial](tutorials/post_analysis.md).

## Citation

            

Raw data

{
    "_id": null,
    "home_page": null,
    "name": "ONTraC",
    "maintainer": null,
    "docs_url": null,
    "requires_python": "==3.11.*",
    "maintainer_email": null,
    "keywords": "deep-learning, pytorch, pytorch geometric, trajectory inference, spatial omics",
    "author": null,
    "author_email": "Wen Wang <wwang.bio@gmail.com>, Shiwei Zheng <swzheng29@gmail.com>, Crystal Shin <sjcshin5040@gmail.com>, Guo-Cheng Yuan <gcyuan@gmail.com>",
    "download_url": "https://files.pythonhosted.org/packages/d6/7d/f2c57c86e02161526e67536141a67e8f44bf423150fdf3076e02a1e62ca7/ontrac-0.0.5.tar.gz",
    "platform": null,
    "bugtrack_url": null,
    "license": "MIT",
    "summary": "A niche-centered, machine learning method for constructing spatially continuous trajectories",
    "version": "0.0.5",
    "project_urls": {
        "Homepage": "https://github.com/gyuanlab/ONTraC",
        "Issue Tracker": "https://github.com/gyuanlab/ONTraC/issues",
        "Repository": "https://github.com/gyuanlab/ONTraC"
    },
    "split_keywords": [
        "deep-learning",
        " pytorch",
        " pytorch geometric",
        " trajectory inference",
        " spatial omics"
    ],
    "urls": [
        {
            "comment_text": "",
            "digests": {
                "blake2b_256": "f2c7823b5db7cd4d66a3a75d748a59b3fb0e0262db4316b217e523fa93ee22c7",
                "md5": "83f9bb1fbd4bf238901360d3f724eec4",
                "sha256": "6bf8704f22df9b06d97bcac2025c10ef478599c810bc13cf4fb0c4568cd85cc0"
            },
            "downloads": -1,
            "filename": "ONTraC-0.0.5-py3-none-any.whl",
            "has_sig": false,
            "md5_digest": "83f9bb1fbd4bf238901360d3f724eec4",
            "packagetype": "bdist_wheel",
            "python_version": "py3",
            "requires_python": "==3.11.*",
            "size": 44117,
            "upload_time": "2024-04-25T15:51:54",
            "upload_time_iso_8601": "2024-04-25T15:51:54.834423Z",
            "url": "https://files.pythonhosted.org/packages/f2/c7/823b5db7cd4d66a3a75d748a59b3fb0e0262db4316b217e523fa93ee22c7/ONTraC-0.0.5-py3-none-any.whl",
            "yanked": false,
            "yanked_reason": null
        },
        {
            "comment_text": "",
            "digests": {
                "blake2b_256": "d67df2c57c86e02161526e67536141a67e8f44bf423150fdf3076e02a1e62ca7",
                "md5": "feeb91ea7672e537c572e18a929efff9",
                "sha256": "22b2ab600c75bb4fe2d0f513e55ee9da9a62e045126792eaedbd07ddaa85024a"
            },
            "downloads": -1,
            "filename": "ontrac-0.0.5.tar.gz",
            "has_sig": false,
            "md5_digest": "feeb91ea7672e537c572e18a929efff9",
            "packagetype": "sdist",
            "python_version": "source",
            "requires_python": "==3.11.*",
            "size": 52426115,
            "upload_time": "2024-04-25T15:52:35",
            "upload_time_iso_8601": "2024-04-25T15:52:35.720842Z",
            "url": "https://files.pythonhosted.org/packages/d6/7d/f2c57c86e02161526e67536141a67e8f44bf423150fdf3076e02a1e62ca7/ontrac-0.0.5.tar.gz",
            "yanked": false,
            "yanked_reason": null
        }
    ],
    "upload_time": "2024-04-25 15:52:35",
    "github": true,
    "gitlab": false,
    "bitbucket": false,
    "codeberg": false,
    "github_user": "gyuanlab",
    "github_project": "ONTraC",
    "travis_ci": false,
    "coveralls": false,
    "github_actions": false,
    "lcname": "ontrac"
}
        