demuxnet


 - **Name**: demuxnet
 - **Version**: 1.1.6
 - **Home page**: https://github.com/paddi1990/DemuxNet
 - **Summary**: Machine learning augmented sample demultiplexing of pooled single-cell RNA-seq data
 - **Upload time**: 2024-11-15 07:40:47
 - **Author**: You Wu
 - **Requires Python**: >=3.6

![Logo](images/logo.png)

## Machine learning augmented sample demultiplexing of pooled single-cell RNA-seq data
---


![GitHub release (latest SemVer)](https://img.shields.io/badge/Version-v1.1.0-yellowgreen) ![GitHub release (latest SemVer)](https://img.shields.io/badge/Language-python-yellowgreen)

DemuxNet is a computational tool for sample demultiplexing of single-cell RNA sequencing (scRNA-seq) data. When multiple samples are pooled in a single sequencing run, it assigns each cell to its sample of origin. DemuxNet uses machine learning to predict CMO (Cell Multiplexing Oligo) labels for barcodes whose label is missing or incomplete. The tool takes a sparse single-cell RNA expression matrix in RDS format, trains on the barcodes with known labels, and predicts sample identities for the cells with missing or ambiguous CMO labels.
For more information, please refer to our manuscript: [******].

---

### Installation

#### Requirements

 - **Python**: Version 3.7 or higher is required to run DemuxNet.
 - **GCC**: Version 7 or higher is required to install dependencies, ensuring compatibility with the underlying R environment.


#### Install demuxnet from source code
```bash
git clone https://github.com/paddi1990/DemuxNet.git
cd DemuxNet
python setup.py install
```
#### Install demuxnet via pip
```bash
pip install demuxnet
```

---

### Data preparation
DemuxNet takes a sparse single-cell expression matrix in RDS format as input. To preprocess your single-cell RNA expression data into the required format, follow the steps below:

```bash
************
************
************
************
************
```

---

### Usage

Once installed, DemuxNet can be run from the command line. The tool automatically detects the "CMO" keyword within barcode strings and uses this information for training. For barcodes that do not contain the "CMO" information, DemuxNet will predict and fill in the missing CMO class.
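
As a rough illustration of the labeled/unlabeled split described above, the sketch below partitions barcodes by a plain substring test. The barcode strings and the `split_by_cmo` helper are hypothetical; the exact barcode format and the way DemuxNet parses it are not specified here, so treat this as an assumption rather than the tool's implementation.

```python
def split_by_cmo(barcodes, keyword="CMO"):
    """Partition barcodes into labeled (contain the keyword) and unlabeled."""
    labeled, unlabeled = [], []
    for barcode in barcodes:
        (labeled if keyword in barcode else unlabeled).append(barcode)
    return labeled, unlabeled


# Hypothetical barcode strings; the real naming scheme may differ.
barcodes = ["AAACCTGAGA-1_CMO301", "AAACCTGCAT-1", "AAACGGGTCA-1_CMO302"]
labeled, unlabeled = split_by_cmo(barcodes)
```

Here the labeled barcodes would serve as training data, and the unlabeled ones are the cells whose CMO class is to be predicted.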

#### Command Line Usage

To run DemuxNet, use the following command:

```bash
demuxnet -i gene_expression_matrix.rds -model DNN -feature 6000 -out prediction.csv
```
This command reads the input expression matrix, selects the 6000 features (genes) with the most non-zero counts, trains the DNN model, and writes the predicted CMO labels to `prediction.csv`.

#### Parameters:
- **`-i`** `gene_expression_matrix.rds`  
  Path to the input file in RDS format, which contains the sparse single-cell RNA expression matrix. The matrix should have one row per cell and one column per gene, with entries giving expression levels.
  
- **`-model`** `DNN`  
  The machine learning model to use for predicting the CMO labels. Currently, the available model is `DNN` (Deep Neural Network), but other models may be supported in future versions.
  
- **`-feature`** `6000`  
  The number of top features (genes) to use for training the model, selected based on non-zero counts in the expression matrix. This parameter determines the subset of genes used to train the model and should be tuned according to the dataset size and complexity.
  
- **`-out`** `prediction.csv`  
  The output file where the predicted CMO labels will be saved. The output is a CSV file with two columns: "Barcode" and "Predicted CMO Label".
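
Once a prediction file exists, a quick sanity check is to count how many barcodes were assigned to each CMO label. The sketch below assumes the CSV header uses the two column names given above verbatim; the `count_predictions` helper and the sample rows are made up for illustration.

```python
import csv
import io
from collections import Counter


def count_predictions(handle, label_column="Predicted CMO Label"):
    """Tally how many barcodes were assigned to each predicted CMO label."""
    reader = csv.DictReader(handle)
    return Counter(row[label_column] for row in reader)


# Made-up rows standing in for prediction.csv; real output may differ.
sample = io.StringIO(
    "Barcode,Predicted CMO Label\n"
    "AAACCTGAGA-1,CMO301\n"
    "AAACCTGCAT-1,CMO302\n"
    "AAACGGGTCA-1,CMO301\n"
)
counts = count_predictions(sample)
```

For a real file, pass an open handle instead, for example `with open("prediction.csv", newline="") as handle: count_predictions(handle)`.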

---


### Model Training

DemuxNet trains a Deep Neural Network (DNN) on the provided input data. The training process involves:

1. **Data Splitting**: The input dataset is split into training, validation, and test sets.
2. **Feature Selection**: The top `n` features (genes) with the highest non-zero counts are selected for training the model.
3. **Model Training**: A DNN model is trained using the selected features and corresponding CMO labels. Cross-entropy loss is used for multi-class classification tasks.
4. **Model Inference**: The trained model is used to predict the missing CMO labels for the test set (i.e., barcodes with missing or ambiguous CMO labels).
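
Two of the steps above can be sketched in plain Python: ranking genes by their number of non-zero entries, and the cross-entropy loss for a single cell. This is a minimal sketch under stated assumptions, not DemuxNet's own code; the helper names, the toy matrix, and the tie-breaking rule (ties keep the original column order) are all illustrative choices.

```python
import math


def top_nonzero_features(matrix, n):
    """Return indices of the n columns (genes) with the most non-zero entries."""
    n_genes = len(matrix[0])
    nonzero = [sum(1 for row in matrix if row[g] != 0) for g in range(n_genes)]
    # Sort gene indices by descending non-zero count; ties keep original order.
    ranked = sorted(range(n_genes), key=lambda g: -nonzero[g])
    return ranked[:n]


def cross_entropy(logits, target):
    """Cross-entropy for one cell: -log(softmax(logits)[target])."""
    peak = max(logits)  # subtract the max for numerical stability
    log_sum = peak + math.log(sum(math.exp(z - peak) for z in logits))
    return log_sum - logits[target]


# Toy cells-by-genes count matrix (3 cells, 4 genes).
counts = [
    [0, 2, 0, 1],
    [0, 1, 3, 4],
    [0, 5, 0, 2],
]
selected = top_nonzero_features(counts, 2)
loss = cross_entropy([2.0, 0.5, 0.1], target=0)
```

In the toy matrix, genes 1 and 3 are non-zero in every cell, so they are the two selected features. The loss shrinks as the logit of the true class grows relative to the others.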

### Performance Metrics

- **Validation Accuracy**: DemuxNet provides an accuracy score on the validation set to assess the model's performance.
- **Prediction**: After training, the model performs inference on the test set and generates predictions for the missing CMO labels.
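
Validation accuracy is the fraction of validation cells whose predicted label matches the known CMO label. The helper and labels below are hypothetical and only illustrate the metric; how DemuxNet computes and reports it internally is not shown here.

```python
def accuracy(true_labels, predicted_labels):
    """Fraction of cells whose predicted label matches the known label."""
    if len(true_labels) != len(predicted_labels):
        raise ValueError("label lists must have the same length")
    if not true_labels:
        raise ValueError("at least one label is required")
    hits = sum(t == p for t, p in zip(true_labels, predicted_labels))
    return hits / len(true_labels)


# Hypothetical validation labels and predictions.
known = ["CMO301", "CMO302", "CMO301", "CMO303"]
predicted = ["CMO301", "CMO302", "CMO303", "CMO303"]
val_acc = accuracy(known, predicted)
```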

---


### Visualization
To do

---
### Contact
DemuxNet is maintained by the Hu lab. For questions or problems, please open an issue on the GitHub repository.
If you use DemuxNet in your research, please cite *************************.

---

### License

This project is licensed under the MIT License - see the [LICENSE](LICENSE) file for details.

---

### Contributing

We welcome contributions! If you have suggestions or improvements, feel free to submit a pull request. For major changes, please open an issue to discuss what you would like to change.

            
