# **ONTraC**
ONTraC (Ordered Niche Trajectory Construction) is a niche-centered, machine learning
method for constructing spatially continuous trajectories. ONTraC differs from existing tools in
that it treats a niche, rather than an individual cell, as the basic unit for spatial trajectory
analysis. In this context, we define a niche as a multicellular, spatially localized region where
different cell types may coexist and interact with each other. ONTraC seamlessly integrates
cell-type composition and spatial information by using the graph neural network modeling
framework. Its output, which is called the niche trajectory, can be viewed as a one-dimensional
representation of the tissue microenvironment continuum. By disentangling cell-level and niche-
level properties, niche trajectory analysis provides a coherent framework to study coordinated
responses from all the cells in association with continuous tissue microenvironment variations.
![ONTraC Structure](docs/source/_static/images/ONTraC_structure.png)
## Required packages
- pyyaml=6.0.1
- pandas=2.1.1
- pytorch=2.2.1
- torch_geometric=2.5.1
## Installation
Please see the [installation tutorial](tutorials/installation.md).
## Tutorial
### Input File
An example input file is provided in `examples/stereo_seq_brain/original_data.csv`.
This file contains all input information in five columns: Cell_ID, Sample, Cell_Type, x, and y.
| Cell_ID | Sample | Cell_Type | x | y |
| --------------- | -------- | --------- | ------- | ----- |
| E12_E1S3_100034 | E12_E1S3 | Fibro | 15940 | 18584 |
| E12_E1S3_100035 | E12_E1S3 | Fibro | 15942 | 18623 |
| ... | ... | ... | ... | ... |
| E16_E2S7_326412 | E16_E2S7 | Fibro | 32990.5 | 14475 |
For detailed information about the input and output files, please see [IO files explanation](tutorials/IO_files.md#input-files).
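Before running ONTraC, it can help to confirm that a file has the five columns described above. The sketch below uses only the Python standard library. The function name `check_input_table` and the specific checks (unique `Cell_ID`, numeric `x` and `y`) are illustrative assumptions, not ONTraC's own validation logic.

```python
import csv
import io

REQUIRED_COLUMNS = ["Cell_ID", "Sample", "Cell_Type", "x", "y"]


def check_input_table(handle):
    """Return a list of problems found in an ONTraC-style input CSV.

    The checks are illustrative assumptions, not ONTraC's own validation.
    """
    reader = csv.DictReader(handle)
    problems = []
    missing = [c for c in REQUIRED_COLUMNS if c not in (reader.fieldnames or [])]
    if missing:
        return [f"missing columns: {', '.join(missing)}"]
    seen_ids = set()
    for line_no, row in enumerate(reader, start=2):  # line 1 is the header
        if row["Cell_ID"] in seen_ids:
            problems.append(f"line {line_no}: duplicate Cell_ID {row['Cell_ID']}")
        seen_ids.add(row["Cell_ID"])
        for axis in ("x", "y"):
            try:
                float(row[axis])
            except (TypeError, ValueError):
                problems.append(f"line {line_no}: {axis} is not numeric")
    return problems


# Rows copied from the example table; an empty list means no problems found.
example = io.StringIO(
    "Cell_ID,Sample,Cell_Type,x,y\n"
    "E12_E1S3_100034,E12_E1S3,Fibro,15940,18584\n"
    "E12_E1S3_100035,E12_E1S3,Fibro,15942,18623\n"
    "E16_E2S7_326412,E16_E2S7,Fibro,32990.5,14475\n"
)
print(check_input_table(example))
```

To check a file on disk, pass an open file handle instead of the in-memory example, for instance `open("original_data.csv", newline="")`.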
### Running ONTraC
The required options for running ONTraC are the paths to the input file and the three output directories:
- **preprocessing-dir:** This directory stores preprocessed data and other intermediary datasets for analysis.
- **GNN-dir:** This directory stores output from running the GP (Graph Pooling) algorithm.
- **NTScore-dir:** This directory stores NTScore output.
```{sh}
cd examples/stereo_seq_brain
ONTraC -d original_data.csv --preprocessing-dir stereo_seq_preprocessing_dir --GNN-dir stereo_seq_GNN --NTScore-dir stereo_seq_NTScore
```
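If you prefer to launch the same run from a Python script, one possible approach is to assemble the argument list and hand it to `subprocess`. This is a minimal sketch: the helper `build_ontrac_command` is a hypothetical name introduced here, and only the four required options shown in the command above are used.

```python
import shlex
import subprocess


def build_ontrac_command(dataset, preprocessing_dir, gnn_dir, ntscore_dir, extra_args=()):
    """Assemble the ONTraC argument list from the four required options."""
    return [
        "ONTraC",
        "-d", str(dataset),
        "--preprocessing-dir", str(preprocessing_dir),
        "--GNN-dir", str(gnn_dir),
        "--NTScore-dir", str(ntscore_dir),
        *extra_args,  # optional flags, e.g. ["--device", "cpu"]
    ]


cmd = build_ontrac_command(
    "original_data.csv",
    "stereo_seq_preprocessing_dir",
    "stereo_seq_GNN",
    "stereo_seq_NTScore",
)
print(shlex.join(cmd))  # show the equivalent shell command

# To actually launch the run (requires ONTraC to be installed and on PATH),
# uncomment the next line:
# subprocess.run(cmd, check=True, cwd="examples/stereo_seq_brain")
```

Passing the arguments as a list avoids shell quoting problems when directory names contain spaces.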
We recommend running `ONTraC` on a GPU; training may take much longer on a CPU-only machine such as a laptop.
All available parameter options are listed below.
```{text}
Usage: ONTraC <-d DATASET> <--preprocessing-dir PREPROCESSING_DIR> <--GNN-dir GNN_DIR> <--NTScore-dir NTSCORE_DIR>
[--n-cpu N_CPU] [--n-neighbors N_NEIGHBORS] [--device DEVICE] [--epochs EPOCHS] [--patience PATIENCE] [--min-delta MIN_DELTA]
[--min-epochs MIN_EPOCHS] [--batch-size BATCH_SIZE] [-s SEED] [--seed SEED] [--lr LR] [--hidden-feats HIDDEN_FEATS] [-k K_CLUSTERS]
[--modularity-loss-weight MODULARITY_LOSS_WEIGHT] [--purity-loss-weight PURITY_LOSS_WEIGHT]
[--regularization-loss-weight REGULARIZATION_LOSS_WEIGHT] [--beta BETA]
All steps of ONTraC including dataset creation, Graph Pooling, and NT score
calculation.
Options:
--version show program's version number and exit
-h, --help show this help message and exit
IO:
-d DATASET, --dataset=DATASET
Original input dataset.
--preprocessing-dir=PREPROCESSING_DIR
Directory for preprocessing outputs.
--GNN-dir=GNN_DIR Directory for the GNN output.
--NTScore-dir=NTSCORE_DIR
Directory for the NTScore output
Niche Network Construction:
--n-cpu=N_CPU Number of CPUs used for parallel computing. Default is
4.
--n-neighbors=N_NEIGHBORS
Number of neighbors used for kNN graph construction.
Default is 50.
Options for training:
--device=DEVICE Device for training. We support cpu and cuda now. Auto
select if not specified.
--epochs=EPOCHS Number of maximum epochs for training. Default is
1000.
--patience=PATIENCE
Number of epochs wait for better result. Default is
100.
--min-delta=MIN_DELTA
Minimum delta for better result. Default is 0.001
--min-epochs=MIN_EPOCHS
Minimum number of epochs for training. Default is 50.
Set to 0 to disable.
--batch-size=BATCH_SIZE
Batch size for training. Default is 0 for whole
dataset.
-s SEED, --seed=SEED
Random seed for training. Default is random.
--lr=LR Learning rate for training. Default is 0.03.
--hidden-feats=HIDDEN_FEATS
Number of hidden features. Default is 4.
-k K, --k-clusters=K
Number of niche clusters. Default is 6.
--modularity-loss-weight=MODULARITY_LOSS_WEIGHT
Weight for modularity loss. Default is 0.3.
--purity-loss-weight=PURITY_LOSS_WEIGHT
Weight for purity loss. Default is 300.
--regularization-loss-weight=REGULARIZATION_LOSS_WEIGHT
Weight for regularization loss. Default is 0.1.
--beta=BETA Beta value control niche cluster assignment matrix.
Default is 0.3.
```
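The `--patience`, `--min-delta`, and `--min-epochs` options describe a patience-based early-stopping scheme. The sketch below shows how such options commonly interact in a training loop. It is a generic illustration under that assumption: the class `EarlyStopper` is hypothetical and is not taken from ONTraC's source, whose exact behavior may differ.

```python
class EarlyStopper:
    """Generic patience-based early stopping.

    An assumption about how --patience, --min-delta and --min-epochs
    interact; this is not ONTraC's implementation.
    """

    def __init__(self, patience=100, min_delta=0.001, min_epochs=50):
        self.patience = patience
        self.min_delta = min_delta
        self.min_epochs = min_epochs
        self.best_loss = float("inf")
        self.epochs_without_improvement = 0

    def should_stop(self, epoch, loss):
        """Record the loss for a 1-based epoch and report whether to stop."""
        if self.best_loss - loss > self.min_delta:
            # Improvement larger than min_delta: reset the waiting counter.
            self.best_loss = loss
            self.epochs_without_improvement = 0
        else:
            self.epochs_without_improvement += 1
        if epoch < self.min_epochs:
            return False  # never stop before the minimum number of epochs
        return self.epochs_without_improvement >= self.patience


# Small values so the behavior is easy to follow by hand.
stopper = EarlyStopper(patience=3, min_delta=0.001, min_epochs=5)
losses = [1.0, 0.5, 0.4, 0.4, 0.4, 0.4, 0.4]
for epoch, loss in enumerate(losses, start=1):
    if stopper.should_stop(epoch, loss):
        print(f"stopping at epoch {epoch}")  # prints "stopping at epoch 6"
        break
```

In this toy run the loss stops improving after epoch 3, so the counter reaches the patience of 3 at epoch 6, which is also past the minimum of 5 epochs.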
### Output
The intermediate and final results are written to the `preprocessing-dir`, `GNN-dir`, and `NTScore-dir` directories. Please see [IO files explanation](tutorials/IO_files.md#output-files) for detailed information.
### Visualization
Please see [post analysis tutorial](tutorials/post_analysis.md).
## Citation
Raw data
{
"_id": null,
"home_page": null,
"name": "ONTraC",
"maintainer": null,
"docs_url": null,
"requires_python": "==3.11.*",
"maintainer_email": null,
"keywords": "deep-learning, pytorch, pytorch geometric, trajectory inference, spatial omics",
"author": null,
"author_email": "Wen Wang <wwang.bio@gmail.com>, Shiwei Zheng <swzheng29@gmail.com>, Crystal Shin <sjcshin5040@gmail.com>, Guo-Cheng Yuan <gcyuan@gmail.com>",
"download_url": "https://files.pythonhosted.org/packages/d6/7d/f2c57c86e02161526e67536141a67e8f44bf423150fdf3076e02a1e62ca7/ontrac-0.0.5.tar.gz",
"platform": null,
"bugtrack_url": null,
"license": "MIT",
"summary": "A niche-centered, machine learning method for constructing spatially continuous trajectories",
"version": "0.0.5",
"project_urls": {
"Homepage": "https://github.com/gyuanlab/ONTraC",
"Issue Tracker": "https://github.com/gyuanlab/ONTraC/issues",
"Repository": "https://github.com/gyuanlab/ONTraC"
},
"split_keywords": [
"deep-learning",
" pytorch",
" pytorch geometric",
" trajectory inference",
" spatial omics"
],
"urls": [
{
"comment_text": "",
"digests": {
"blake2b_256": "f2c7823b5db7cd4d66a3a75d748a59b3fb0e0262db4316b217e523fa93ee22c7",
"md5": "83f9bb1fbd4bf238901360d3f724eec4",
"sha256": "6bf8704f22df9b06d97bcac2025c10ef478599c810bc13cf4fb0c4568cd85cc0"
},
"downloads": -1,
"filename": "ONTraC-0.0.5-py3-none-any.whl",
"has_sig": false,
"md5_digest": "83f9bb1fbd4bf238901360d3f724eec4",
"packagetype": "bdist_wheel",
"python_version": "py3",
"requires_python": "==3.11.*",
"size": 44117,
"upload_time": "2024-04-25T15:51:54",
"upload_time_iso_8601": "2024-04-25T15:51:54.834423Z",
"url": "https://files.pythonhosted.org/packages/f2/c7/823b5db7cd4d66a3a75d748a59b3fb0e0262db4316b217e523fa93ee22c7/ONTraC-0.0.5-py3-none-any.whl",
"yanked": false,
"yanked_reason": null
},
{
"comment_text": "",
"digests": {
"blake2b_256": "d67df2c57c86e02161526e67536141a67e8f44bf423150fdf3076e02a1e62ca7",
"md5": "feeb91ea7672e537c572e18a929efff9",
"sha256": "22b2ab600c75bb4fe2d0f513e55ee9da9a62e045126792eaedbd07ddaa85024a"
},
"downloads": -1,
"filename": "ontrac-0.0.5.tar.gz",
"has_sig": false,
"md5_digest": "feeb91ea7672e537c572e18a929efff9",
"packagetype": "sdist",
"python_version": "source",
"requires_python": "==3.11.*",
"size": 52426115,
"upload_time": "2024-04-25T15:52:35",
"upload_time_iso_8601": "2024-04-25T15:52:35.720842Z",
"url": "https://files.pythonhosted.org/packages/d6/7d/f2c57c86e02161526e67536141a67e8f44bf423150fdf3076e02a1e62ca7/ontrac-0.0.5.tar.gz",
"yanked": false,
"yanked_reason": null
}
],
"upload_time": "2024-04-25 15:52:35",
"github": true,
"gitlab": false,
"bitbucket": false,
"codeberg": false,
"github_user": "gyuanlab",
"github_project": "ONTraC",
"travis_ci": false,
"coveralls": false,
"github_actions": false,
"lcname": "ontrac"
}