# 3DMolMS
[![CC BY-NC-SA 4.0][cc-by-nc-sa-shield]][cc-by-nc-sa] (free for academic use)
**3D** **Mol**ecular Network for **M**ass **S**pectra Prediction (3DMolMS) is a deep neural network that predicts the MS/MS spectra of compounds from their 3D conformations. The molecular representation it learns through MS/MS prediction can also be reused to improve other molecular property prediction tasks, such as retention time and collision cross section prediction.
[Read our paper in Bioinformatics](https://academic.oup.com/bioinformatics/article/39/6/btad354/7186501) | [Try our online service at GNPS](https://spectrumprediction.gnps2.org) | [Install from PyPI](https://pypi.org/project/molnetpack/)
## Installation
3DMolMS is available on PyPI. You can install the latest version using `pip`:
```bash
pip install molnetpack
# PyTorch must be installed separately.
# For CUDA 11.6, install PyTorch with the following command:
pip install torch==1.13.0+cu116 torchvision==0.14.0+cu116 torchaudio==0.13.0 --extra-index-url https://download.pytorch.org/whl/cu116
# For CUDA 11.7, use:
pip install torch==1.13.0+cu117 torchvision==0.14.0+cu117 torchaudio==0.13.0 --extra-index-url https://download.pytorch.org/whl/cu117
# For CPU-only usage, use:
pip install torch==1.13.0+cpu torchvision==0.14.0+cpu torchaudio==0.13.0 --extra-index-url https://download.pytorch.org/whl/cpu
```
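Because PyTorch is not pulled in automatically, it can help to confirm that both packages are visible to the interpreter before running anything. The sketch below uses only the standard library; the helper name `installed_version` is illustrative and not part of molnetpack.

```python
from importlib import metadata


def installed_version(dist_name):
    """Return the installed version of a distribution, or None if it is missing."""
    try:
        return metadata.version(dist_name)
    except metadata.PackageNotFoundError:
        return None


# Distribution names as used in the pip commands above.
for name in ("molnetpack", "torch"):
    version = installed_version(name)
    status = version if version is not None else "NOT INSTALLED"
    print(f"{name}: {status}")
```

If `torch` is reported as missing, install it with one of the commands above that matches your CUDA setup.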
3DMolMS can also be installed from source:
```bash
git clone https://github.com/JosieHong/3DMolMS.git
cd 3DMolMS
pip install .
```
## Usage
To get started quickly, you can load a CSV or MGF file to predict MS/MS and then plot the predicted results.
```python
import torch
from molnetpack import MolNet
# Set the device to CPU for CPU-only usage:
device = torch.device("cpu")
# For GPU usage, set the device as follows (replace '0' with your desired GPU index):
# gpu_index = 0
# device = torch.device(f"cuda:{gpu_index}")
# Instantiate a MolNet object
molnet_engine = MolNet(device, seed=42) # The random seed can be any integer.
# Load input data (here we use a CSV file as an example)
molnet_engine.load_data(path_to_test_data='./test/input_msms.csv') # Increase the batch size if you want to speed this up.
# molnet_engine.load_data(path_to_test_data='./test/input_msms.mgf') # MGF files are also supported.
# molnet_engine.load_data(path_to_test_data='./test/input_msms.pkl') # PKL files load faster.
# Predict MS/MS ('instrument' can be 'qtof' or 'orbitrap')
spectra = molnet_engine.pred_msms(path_to_results='./test/output_qtof_msms.mgf', instrument='qtof')
# Alternatively, download a checkpoint from the project's releases and pass it via 'path_to_checkpoint':
# spectra = molnet_engine.pred_msms(path_to_results='./test/output_msms.mgf', path_to_checkpoint='<path to the checkpoint>')
# Plot the predicted MS/MS with 3D molecular conformation
molnet_engine.plot_msms(dir_to_img='./img/', instrument='qtof')
```
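The predicted spectra are written in MGF, a plain-text format in which each spectrum sits between `BEGIN IONS` and `END IONS`, with `KEY=VALUE` header lines followed by `m/z intensity` peak lines. To inspect the results without extra dependencies, a minimal reader along the following lines may be enough. It is a sketch, not part of molnetpack: `read_mgf` is a hypothetical helper, it makes no assumption about which header keys 3DMolMS writes, and the record in the demo is made up.

```python
def read_mgf(lines):
    """Parse MGF text (an iterable of lines) into a list of spectra.

    Each spectrum is a dict with 'params' (header key/value pairs)
    and 'peaks' (a list of (m/z, intensity) tuples).
    """
    spectra, current = [], None
    for raw in lines:
        line = raw.strip()
        if not line:
            continue
        if line == "BEGIN IONS":
            current = {"params": {}, "peaks": []}
        elif line == "END IONS":
            if current is not None:
                spectra.append(current)
            current = None
        elif current is not None:
            if "=" in line:
                # Header line; split on the first '=' only, since values
                # such as SMILES strings may themselves contain '='.
                key, value = line.split("=", 1)
                current["params"][key.strip()] = value.strip()
            else:
                # Peak line: m/z and intensity separated by whitespace.
                fields = line.split()
                if len(fields) >= 2:
                    current["peaks"].append((float(fields[0]), float(fields[1])))
    return spectra


# Made-up record, only to show the shape of the parsed result.
example = """BEGIN IONS
TITLE=demo
PEPMASS=181.07
51.02 12.5
91.05 100.0
END IONS
""".splitlines()

spectra = read_mgf(example)
print(len(spectra), spectra[0]["params"], spectra[0]["peaks"])

# For a real prediction file, pass an open file handle instead:
# with open('./test/output_qtof_msms.mgf') as f:
#     spectra = read_mgf(f)
```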
For CCS prediction, use the following code after instantiating a MolNet object.
```python
# Load input data
molnet_engine.load_data(path_to_test_data='./test/input_ccs.csv')
# Pred CCS
ccs_df = molnet_engine.pred_ccs(path_to_results='./test/output_ccs.csv')
```
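If a CSV file is meant only for CCS prediction, its `Collision_Energy` column may hold any number. One way to fill in missing values before loading is sketched below with the standard `csv` module. Only the `Collision_Energy` column name comes from this README; the helper name, the placeholder value, and the other columns in the demo table are illustrative assumptions.

```python
import csv
import io


def fill_collision_energy(src, dst, placeholder="20"):
    """Copy CSV rows from src to dst, filling empty Collision_Energy cells."""
    reader = csv.DictReader(src)
    fieldnames = list(reader.fieldnames or [])
    if "Collision_Energy" not in fieldnames:
        # Add the column when the input does not have it at all.
        fieldnames.append("Collision_Energy")
    writer = csv.DictWriter(dst, fieldnames=fieldnames, lineterminator="\n")
    writer.writeheader()
    for row in reader:
        if not (row.get("Collision_Energy") or "").strip():
            row["Collision_Energy"] = placeholder
        writer.writerow(row)


# Demo on an in-memory table; the ID and SMILES columns are made up.
src = io.StringIO("ID,SMILES,Collision_Energy\nmol_1,CCO,\nmol_2,CC(=O)O,35\n")
dst = io.StringIO()
fill_collision_energy(src, dst)
print(dst.getvalue())
```

With real files, open the input and output with `open(..., newline='')` and pass the two handles in place of the `StringIO` objects.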
For RT prediction, use the following code after instantiating a MolNet object. Note that because this model is trained on the METLIN-SMRT dataset, the predicted retention times correspond to the experimental conditions of the METLIN-SMRT set.
```python
# Load input data
molnet_engine.load_data(path_to_test_data='./test/input_rt.csv')
# Pred RT
rt_df = molnet_engine.pred_rt(path_to_results='./test/output_rt.csv')
```
To save the molecular embeddings, use the following code after instantiating a MolNet object.
```python
# Load input data
molnet_engine.load_data(path_to_test_data='./test/input_savefeat.csv')
# Run inference to get the features
# (assuming save_features() returns the IDs together with the features)
ids, features = molnet_engine.save_features()
print('Titles:', ids)
print('Features shape:', features.shape)
```
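One possible use of the saved embeddings is comparing molecules by cosine similarity. The sketch below works on plain Python sequences so that it stays dependency-free. It assumes that `features` holds one fixed-length vector per molecule, which the snippet above does not confirm, and the vectors shown are made-up numbers rather than real 3DMolMS output.

```python
import math


def cosine_similarity(u, v):
    """Cosine of the angle between two equal-length vectors."""
    if len(u) != len(v):
        raise ValueError("vectors must have the same length")
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0  # convention chosen here for all-zero vectors
    return dot / (norm_u * norm_v)


# Toy stand-ins for two rows of `features`.
emb_a = [0.2, 0.0, 0.9]
emb_b = [0.1, 0.3, 0.8]
print(round(cosine_similarity(emb_a, emb_b), 3))
```

Values close to 1 indicate embeddings that point in nearly the same direction; whether that corresponds to chemical similarity depends on the downstream task.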
Sample input files in CSV and MGF format are located at `./test/demo_input.csv` and `./test/demo_input.mgf`, respectively. If the input is only used for CCS prediction, you may assign an arbitrary numerical value to the `Collision_Energy` field in the CSV file or to `COLLISION_ENERGY` in the MGF file. Note that any inputs that are not supported are automatically excluded during data loading. The table below lists the supported inputs:
| Item | Supported input |
|------------------|---------------------------------------------------------------|
| Atom number | <=300 |
| Atom types | 'C', 'O', 'N', 'H', 'P', 'S', 'F', 'Cl', 'B', 'Br', 'I', 'Na' |
| Precursor types | '[M+H]+', '[M-H]-', '[M+H-H2O]+', '[M+Na]+', '[M+2H]2+' |
| Collision energy | any number |
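Because unsupported entries are excluded automatically at load time, checking inputs beforehand can make it easier to see what will be dropped. The sketch below encodes the table above for a molecular formula string. It is not molnetpack's own filter: `parse_formula` and `check_supported` are hypothetical helpers, the formula parser handles only simple formulas, and since the table does not say whether the 300-atom limit includes hydrogens, this sketch counts every atom.

```python
import re

SUPPORTED_ATOMS = {"C", "O", "N", "H", "P", "S", "F", "Cl", "B", "Br", "I", "Na"}
SUPPORTED_PRECURSORS = {"[M+H]+", "[M-H]-", "[M+H-H2O]+", "[M+Na]+", "[M+2H]2+"}
MAX_ATOMS = 300

_ELEMENT = re.compile(r"([A-Z][a-z]?)(\d*)")


def parse_formula(formula):
    """Count atoms in a simple formula such as 'C6H12O6' (no brackets or charges)."""
    counts = {}
    pos = 0
    for match in _ELEMENT.finditer(formula):
        if match.start() != pos:
            raise ValueError(f"cannot parse formula: {formula!r}")
        element, number = match.groups()
        counts[element] = counts.get(element, 0) + (int(number) if number else 1)
        pos = match.end()
    if pos != len(formula) or not counts:
        raise ValueError(f"cannot parse formula: {formula!r}")
    return counts


def check_supported(formula, precursor_type):
    """Return a list of reasons the entry falls outside the table (empty if none)."""
    problems = []
    counts = parse_formula(formula)
    unknown = sorted(set(counts) - SUPPORTED_ATOMS)
    if unknown:
        problems.append(f"unsupported atom types: {unknown}")
    if sum(counts.values()) > MAX_ATOMS:
        problems.append(f"more than {MAX_ATOMS} atoms")
    if precursor_type not in SUPPORTED_PRECURSORS:
        problems.append(f"unsupported precursor type: {precursor_type}")
    return problems


print(check_supported("C6H12O6", "[M+H]+"))   # inside the supported ranges
print(check_supported("C2H6Si", "[M+K]+"))    # two reasons for exclusion
```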
Documentation for running MS/MS prediction from source is available in [MSMS_PRED.md](docs/MSMS_PRED.md).
## Citation
If you use 3DMolMS in your research, please cite our papers:
```bibtex
@article{hong20233dmolms,
title={3DMolMS: prediction of tandem mass spectra from 3D molecular conformations},
author={Hong, Yuhui and Li, Sujun and Welch, Christopher J and Tichy, Shane and Ye, Yuzhen and Tang, Haixu},
journal={Bioinformatics},
volume={39},
number={6},
pages={btad354},
year={2023},
publisher={Oxford University Press}
}
@article{hong2024enhanced,
title={Enhanced structure-based prediction of chiral stationary phases for chromatographic enantioseparation from 3D molecular conformations},
author={Hong, Yuhui and Welch, Christopher J and Piras, Patrick and Tang, Haixu},
journal={Analytical Chemistry},
volume={96},
number={6},
pages={2351--2359},
year={2024},
publisher={ACS Publications}
}
```
Thank you for considering 3DMolMS for your research needs!
## License
This work is licensed under a
[Creative Commons Attribution-NonCommercial-ShareAlike 4.0 International License][cc-by-nc-sa].
[![CC BY-NC-SA 4.0][cc-by-nc-sa-image]][cc-by-nc-sa]
[cc-by-nc-sa]: http://creativecommons.org/licenses/by-nc-sa/4.0/
[cc-by-nc-sa-image]: https://licensebuttons.net/l/by-nc-sa/4.0/88x31.png
[cc-by-nc-sa-shield]: https://img.shields.io/badge/License-CC%20BY--NC--SA%204.0-lightgrey.svg
Raw data
{
"_id": null,
"home_page": null,
"name": "molnetpack",
"maintainer": null,
"docs_url": null,
"requires_python": ">=3.8",
"maintainer_email": "Yuhui Hong <yuhhong@iu.edu>",
"keywords": null,
"author": "Yuhui Hong, Sujun Li, Christopher J Welch, Shane Tichy, Yuzhen Ye, Haixu Tang",
"author_email": null,
"download_url": "https://files.pythonhosted.org/packages/6c/7b/0a9a446c8e1ee01939964703d38ee874d1090bef63c0f926d8a586c4e8d7/molnetpack-1.1.9.tar.gz",
"platform": null,
"bugtrack_url": null,
"license": null,
"summary": "3DMolMS: prediction of tandem mass spectra from 3D molecular conformations",
"version": "1.1.9",
"project_urls": {
"Changelog": "https://github.com/JosieHong/3DMolMS/blob/main/CHANGE_LOG.md",
"Homepage": "https://github.com/JosieHong/3DMolMS/",
"Issues": "https://github.com/JosieHong/3DMolMS/issues"
},
"split_keywords": [],
"urls": [
{
"comment_text": "",
"digests": {
"blake2b_256": "78d7a8c186abbfe44e4d31fa70f7579dc7bfe21e0bdc1c00608c2d14664a217e",
"md5": "4e14339e3a720998521369a2f1c6ef72",
"sha256": "f31a5e17dbcefadb8c1983861607a0a9a9677ab2b054a5705dd6301abae11ba9"
},
"downloads": -1,
"filename": "molnetpack-1.1.9-py3-none-any.whl",
"has_sig": false,
"md5_digest": "4e14339e3a720998521369a2f1c6ef72",
"packagetype": "bdist_wheel",
"python_version": "py3",
"requires_python": ">=3.8",
"size": 30166,
"upload_time": "2024-08-29T15:51:34",
"upload_time_iso_8601": "2024-08-29T15:51:34.080341Z",
"url": "https://files.pythonhosted.org/packages/78/d7/a8c186abbfe44e4d31fa70f7579dc7bfe21e0bdc1c00608c2d14664a217e/molnetpack-1.1.9-py3-none-any.whl",
"yanked": false,
"yanked_reason": null
},
{
"comment_text": "",
"digests": {
"blake2b_256": "6c7b0a9a446c8e1ee01939964703d38ee874d1090bef63c0f926d8a586c4e8d7",
"md5": "cfefcc25e64f87d03f06fc56751a94fe",
"sha256": "ceca4dc1bbd51a0985ca83d8440f640785436dea31094618b6df65e4e0194121"
},
"downloads": -1,
"filename": "molnetpack-1.1.9.tar.gz",
"has_sig": false,
"md5_digest": "cfefcc25e64f87d03f06fc56751a94fe",
"packagetype": "sdist",
"python_version": "source",
"requires_python": ">=3.8",
"size": 27799,
"upload_time": "2024-08-29T15:51:35",
"upload_time_iso_8601": "2024-08-29T15:51:35.147847Z",
"url": "https://files.pythonhosted.org/packages/6c/7b/0a9a446c8e1ee01939964703d38ee874d1090bef63c0f926d8a586c4e8d7/molnetpack-1.1.9.tar.gz",
"yanked": false,
"yanked_reason": null
}
],
"upload_time": "2024-08-29 15:51:35",
"github": true,
"gitlab": false,
"bitbucket": false,
"codeberg": false,
"github_user": "JosieHong",
"github_project": "3DMolMS",
"travis_ci": false,
"coveralls": false,
"github_actions": false,
"lcname": "molnetpack"
}