# MMLMGN
## Introduction
`MMLMGN` is a Python library designed to implement feature engineering based on **multi-similarity modality hypergraph contrastive learning**. It serves as the core implementation for the paper:
**A Multi-Channel Graph Neural Network based on Multi-Similarity Modality Hypergraph Contrastive Learning for Predicting Unknown Types of Cancer Biomarkers**.
The library offers a high-level API that simplifies tasks related to **graph representation learning**.
## Installation
Install `MMLMGN` via `pip`:
```bash
pip install MMLMGN
```
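To confirm that the package is visible to the current interpreter, a small check like the one below can help. It is only a sketch: the import name `mmlmgn` is taken from the usage example in this README, and `is_installed` is a hypothetical helper, not part of the library.

```python
import importlib.util


def is_installed(module_name):
    """Return True if a top-level module with this name can be found."""
    return importlib.util.find_spec(module_name) is not None


# Hypothetical helper; "mmlmgn" is the import name used in the usage example.
print("mmlmgn installed:", is_installed("mmlmgn"))
```

If this prints `False` after installation, the package was probably installed into a different Python environment than the one running the script.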
## Usage Example
Below is a sample usage demonstrating how to set input and output paths and run the model:
```python
import os
from mmlmgn.mmlmgn import InputPaths, OutputPaths, run
data_dir = 'data'
output_dir = 'output'
# Ensure output directory exists
os.makedirs(output_dir, exist_ok=True)
# Define input paths or use default InputPaths and OutputPaths
input_paths = InputPaths(
    Bin_data_path=f"{data_dir}/NodeGraph.csv",
    SimWalk_RNA_path=f"{data_dir}/NodeAWalker.csv",
    SimWalk_Dis_path=f"{data_dir}/NodeBWalker.csv",
    SimST_RNA_path=f"{data_dir}/NodeASt.csv",
    SimST_Dis_path=f"{data_dir}/NodeBSt.csv"
)
output_paths = OutputPaths(
    Save_RNA_BinFeature_path=f"{output_dir}/NodeAGraphEmb.csv",
    Save_DIS_BinFeature_path=f"{output_dir}/NodeBGraphEmb.csv",
    Save_RNAWalkerFeature_path=f"{output_dir}/NodeAWalkerEmb.csv",
    Save_DISWalkerFeature_path=f"{output_dir}/NodeBWalkerEmb.csv",
    Save_RNA_STFeature_path=f"{output_dir}/NodeAStEmb.csv",
    Save_DIS_STFeature_path=f"{output_dir}/NodeBStEmb.csv"
)
# Execute the graph representation learning process
run(input_paths, output_paths,
    mi_num=467, dis_num=72,
    hidden_list=[256, 256],
    proj_hidden=64,
    validation=1,
    epochs=2,
    lr=0.00001)
```
Here `run` reads the files listed in `input_paths`, trains the model, and writes the resulting embeddings to the locations given in `output_paths`.
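Before starting a run, it can be useful to check that every expected input file is present. The following is a minimal sketch using only the standard library; `missing_files` is a hypothetical helper and the temporary directory stands in for the real `data` directory.

```python
import os
import tempfile


def missing_files(paths):
    """Return the subset of paths that do not point at an existing file."""
    return [p for p in paths if not os.path.isfile(p)]


# Stand-in for the real 'data' directory (assumption for this demo only).
data_dir = tempfile.mkdtemp()
expected = [
    os.path.join(data_dir, name)
    for name in (
        "NodeGraph.csv",
        "NodeAWalker.csv",
        "NodeBWalker.csv",
        "NodeASt.csv",
        "NodeBSt.csv",
    )
]

# Create only the first file so the check has something to report.
open(expected[0], "w").close()

missing = missing_files(expected)
for path in missing:
    print("missing input:", path)
```

In a real script, `expected` would hold the same paths that are passed to `InputPaths`.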
## Parameter Description
### Input
- `NodeGraph.csv` is the graph input, used to construct the kernel similarity modality hyperedges of the nodes.
- `NodeAWalker.csv` and `NodeBWalker.csv` are the feature inputs for the two node types. Each is an N×M matrix, where N is the number of nodes and M is the feature dimension. They are used to construct the nearest neighbor modality hyperedges.
- `NodeASt.csv` and `NodeBSt.csv` are the feature inputs for the two node types. Each is an N×M matrix, where N is the number of nodes and M is the feature dimension. They are used to construct the structural topology modality hyperedges.
To prepare custom input for `MMLMGN`, use the example files at https://github.com/1axin/MML-MGNN/tree/main/data as a template.
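The N×M layout of the feature files can be sketched as follows. This is an illustration under assumptions: it treats the files as header-less, comma-separated numeric CSVs with one node per row, and the feature dimension of 64 is arbitrary. Check the example files in the repository for the actual layout. `write_feature_csv` and `csv_shape` are hypothetical helpers.

```python
import csv
import os
import random
import tempfile


def write_feature_csv(path, n_nodes, n_features, seed=0):
    """Write an n_nodes x n_features matrix of random floats, one node per row."""
    rng = random.Random(seed)
    with open(path, "w", newline="") as f:
        writer = csv.writer(f)
        for _ in range(n_nodes):
            writer.writerow([f"{rng.random():.6f}" for _ in range(n_features)])


def csv_shape(path):
    """Return (rows, columns) and fail if the rows have differing lengths."""
    with open(path, newline="") as f:
        rows = [row for row in csv.reader(f) if row]
    widths = {len(row) for row in rows}
    if len(widths) > 1:
        raise ValueError(f"{path}: rows have differing lengths {sorted(widths)}")
    return len(rows), (widths.pop() if widths else 0)


tmp_dir = tempfile.mkdtemp()
walker_a = os.path.join(tmp_dir, "NodeAWalker.csv")

# 467 rows matches mi_num in the usage example; 64 columns is an arbitrary choice.
write_feature_csv(walker_a, n_nodes=467, n_features=64)
shape_a = csv_shape(walker_a)
print(shape_a)
```

A check like `csv_shape` can also be run on real input files to confirm that the row count agrees with the node count passed to `run`.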
### Output
- `NodeAGraphEmb.csv` and `NodeBGraphEmb.csv` are the kernel similarity modality embeddings of nodes A and B, respectively.
- `NodeAWalkerEmb.csv` and `NodeBWalkerEmb.csv` are the nearest neighbor similarity modality embeddings of nodes A and B, respectively.
- `NodeAStEmb.csv` and `NodeBStEmb.csv` are the structural topology similarity modality embeddings of nodes A and B, respectively.
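One possible way to consume these files downstream is to load the three embedding matrices for a node type and join them row by row. The sketch below is not the library's or the paper's method. It assumes header-less numeric CSVs with one node per row in the same order across files, and it uses tiny made-up matrices in place of real output. `load_embeddings` and `concat_modalities` are hypothetical helpers.

```python
import csv
import os
import tempfile


def load_embeddings(path):
    """Read a header-less numeric CSV into a list of float rows (assumed layout)."""
    with open(path, newline="") as f:
        return [[float(v) for v in row] for row in csv.reader(f) if row]


def concat_modalities(*matrices):
    """Concatenate the rows of several embedding matrices node by node."""
    row_counts = {len(m) for m in matrices}
    if len(row_counts) != 1:
        raise ValueError(f"row counts differ: {sorted(row_counts)}")
    return [sum(rows, []) for rows in zip(*matrices)]


# Made-up 2-node, 2-dimensional embeddings standing in for real output files.
out_dir = tempfile.mkdtemp()
demo = {
    "NodeAGraphEmb.csv": [[0.1, 0.2], [0.3, 0.4]],
    "NodeAWalkerEmb.csv": [[1.0, 2.0], [3.0, 4.0]],
    "NodeAStEmb.csv": [[5.0, 6.0], [7.0, 8.0]],
}
for name, rows in demo.items():
    with open(os.path.join(out_dir, name), "w", newline="") as f:
        csv.writer(f).writerows(rows)

node_a = concat_modalities(
    *(load_embeddings(os.path.join(out_dir, name)) for name in demo)
)
print(len(node_a), len(node_a[0]))
```

Whether concatenation is appropriate depends on the downstream model; the three modalities can equally be kept separate.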
### Parameters of run
- `input_paths`: An `InputPaths` object containing paths to the input data files. These files typically include network data or similarity data and serve as the primary input for the model.
- `output_paths`: An `OutputPaths` object that specifies where the feature extraction results and model outputs will be saved.
- `mi_num`: The number of A nodes in the input data (467 in the usage example).
- `dis_num`: The number of B nodes in the input data (72 in the usage example).
- `hidden_list`: A list that defines the number of neurons in each hidden layer of the model.
- `proj_hidden`: The size of the hidden layer used in the projection step, which adjusts the intermediate representation before the features are used downstream.
- `epochs`: The number of training epochs, that is, how many full passes the model makes over the training data.
- `lr`: Learning rate, which controls the step size during the parameter updates in the training process. A smaller learning rate often leads to more stable convergence but might require longer training time.
Together, these parameters configure the model structure and the training process. Adjust them to suit the task and the characteristics of the data.
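The trade-off described for `lr` can be illustrated with plain gradient descent on a toy function. This sketch is generic and unrelated to the library's internals: it minimizes f(x) = x² and counts how many updates each learning rate needs. `steps_to_converge` is a hypothetical helper.

```python
def steps_to_converge(lr, start=1.0, tol=1e-3, max_steps=100_000):
    """Run gradient descent on f(x) = x**2 and count updates until |x| < tol."""
    x = start
    for step in range(1, max_steps + 1):
        grad = 2.0 * x      # derivative of x**2
        x -= lr * grad      # one parameter update
        if abs(x) < tol:
            return step
    return None  # did not converge within max_steps


steps_large = steps_to_converge(lr=0.1)
steps_small = steps_to_converge(lr=0.01)
print(steps_large, steps_small)
```

The smaller learning rate needs more updates to reach the same tolerance, which is why a small `lr` is usually paired with a larger `epochs` value. Suitable settings depend on the data.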
## Features
- **Multi-similarity modality hypergraph contrastive learning**: Supports feature extraction across multiple similarity-based modalities.
- **Easy-to-use API**: Simple and flexible API for configuring input and output paths.
- **Customizable**: Users can easily adjust parameters such as hidden layers, projection dimensions, learning rate, and epochs.
## Documentation
For more detailed examples of input and output data, please visit https://github.com/1axin/MML-MGNN
## Contributing
Contributions are welcome! To contribute to this project, follow these steps:
1. Fork the repository.
2. Create a new branch (`git checkout -b feature/YourFeature`).
3. Commit your changes (`git commit -m 'Add new feature'`).
4. Push the branch (`git push origin feature/YourFeature`).
5. Open a Pull Request.
For more details, refer to the [contributing guide](CONTRIBUTING.md).
## License
This project is licensed under the [MIT License](LICENSE). Feel free to use, modify, and distribute it.
---
Thank you for using `MMLMGN`! If you encounter any issues or have suggestions, please open an issue on the repository.
Raw data
{
"_id": null,
"home_page": "https://github.com/1axin/MML-MGNN",
"name": "mmlmgn",
"maintainer": null,
"docs_url": null,
"requires_python": null,
"maintainer_email": null,
"keywords": "Multiple Similarity Modes, HGCN, Hypergraph Contrastive Learning",
"author": "axin",
"author_email": "xinfei106@gmail.com",
"download_url": "https://files.pythonhosted.org/packages/09/80/16953fd44ceac60b13636877ccd88c9d3bd731076644d20961e676a09752/mmlmgn-0.1.2.tar.gz",
"platform": null,
"bugtrack_url": null,
"license": null,
"summary": "Hypergraph contrastive learning framework for multiple similarity modalities",
"version": "0.1.2",
"project_urls": {
"Homepage": "https://github.com/1axin/MML-MGNN"
},
"split_keywords": [
"multiple similarity modes",
" hgcn",
" hypergraph contrastive learning"
],
"urls": [
{
"comment_text": "",
"digests": {
"blake2b_256": "c52a8e9d6ec2b37a0f18f7784708d2b2de6bf37d42330cc03fe4ca425133fd17",
"md5": "248a8121e338f54f07bbdba977adafeb",
"sha256": "d1b11f4aae9a14e5938273cce3fd90febe21111fcfcc3818b3a13b648fcbcde7"
},
"downloads": -1,
"filename": "mmlmgn-0.1.2-py3-none-any.whl",
"has_sig": false,
"md5_digest": "248a8121e338f54f07bbdba977adafeb",
"packagetype": "bdist_wheel",
"python_version": "py3",
"requires_python": null,
"size": 15462,
"upload_time": "2024-10-19T07:17:24",
"upload_time_iso_8601": "2024-10-19T07:17:24.116280Z",
"url": "https://files.pythonhosted.org/packages/c5/2a/8e9d6ec2b37a0f18f7784708d2b2de6bf37d42330cc03fe4ca425133fd17/mmlmgn-0.1.2-py3-none-any.whl",
"yanked": false,
"yanked_reason": null
},
{
"comment_text": "",
"digests": {
"blake2b_256": "098016953fd44ceac60b13636877ccd88c9d3bd731076644d20961e676a09752",
"md5": "0b0770876f4771500db776b6d9d2760f",
"sha256": "8c5c59e705776e4ac060bcc265f052f36288898bff551e69bee540aad952f3db"
},
"downloads": -1,
"filename": "mmlmgn-0.1.2.tar.gz",
"has_sig": false,
"md5_digest": "0b0770876f4771500db776b6d9d2760f",
"packagetype": "sdist",
"python_version": "source",
"requires_python": null,
"size": 15622,
"upload_time": "2024-10-19T07:17:25",
"upload_time_iso_8601": "2024-10-19T07:17:25.928904Z",
"url": "https://files.pythonhosted.org/packages/09/80/16953fd44ceac60b13636877ccd88c9d3bd731076644d20961e676a09752/mmlmgn-0.1.2.tar.gz",
"yanked": false,
"yanked_reason": null
}
],
"upload_time": "2024-10-19 07:17:25",
"github": true,
"gitlab": false,
"bitbucket": false,
"codeberg": false,
"github_user": "1axin",
"github_project": "MML-MGNN",
"travis_ci": false,
"coveralls": false,
"github_actions": false,
"requirements": [],
"lcname": "mmlmgn"
}