| Field | Value |
| --- | --- |
| Name | godm |
| Version | 0.1.0 |
| Home page | https://github.com/kayzliu/godm |
| Summary | GODM |
| Upload time | 2024-01-03 03:29:48 |
| Author | kayzliu |
| Keywords | outlier detection, data augmentation, diffusion models, graph neural networks, graph generative model |
| Requirements | tqdm, numpy, pygod, torch_geometric |

# GODM

[![PyPI version](https://badge.fury.io/py/godm.svg)](https://badge.fury.io/py/godm)

GODM is a data augmentation package for supervised graph outlier detection. It generates synthetic graph outliers with latent diffusion models. This is the official implementation of [Data Augmentation for Supervised Graph Outlier Detection with Latent Diffusion Models](https://arxiv.org/abs/2312.17679).

<p align="center">
<img src="modelfig.png"  alt="model architecture"/>
</p>

## Installation

It is recommended to use **pip** for installation:

```
pip install godm
```

Alternatively, you can build from source by cloning this repository:

```
git clone https://github.com/kayzliu/godm.git
cd godm
pip install .
```

## Usage

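```python
from pygod.utils import load_data
from godm import GODM

data = load_data('weibo')  # load the 'weibo' dataset
godm = GODM(lr=0.004)      # initialize GODM with a custom learning rate
aug_data = godm(data)      # augment the graph with synthetic outliers

# `detector` is a placeholder for a supervised graph outlier detector
# of your choice; define it before running this line.
detector(aug_data)         # train the detector on the augmented data
```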

The input data should be a [`torch_geometric.data.Data`](https://pytorch-geometric.readthedocs.io/en/latest/generated/torch_geometric.data.Data.html#torch_geometric.data.Data) object with the following keys:

- `x`: node features,
- `edge_index`: edge index, 
- `edge_time`: edge times (optional, name can be changed by `time_attr`),
- `edge_type`: edge types (optional, name can be changed by `type_attr`), 
- `y`: node labels, 
- `train_mask`: training node mask, 
- `val_mask`: validation node mask, 
- `test_mask`: testing node mask.

So far, no additional keys are allowed. We may support more keys by padding in the future.
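
Since extra keys are rejected, it can help to check a graph's attribute names before augmentation. The following is a minimal sketch, not part of the GODM API: `check_keys` is a hypothetical helper, and it assumes the optional keys keep their default names (`edge_time`, `edge_type`).

```python
# Keys from the list above. The optional ones use the default
# `time_attr` / `type_attr` names (an assumption of this sketch).
REQUIRED_KEYS = {"x", "edge_index", "y", "train_mask", "val_mask", "test_mask"}
OPTIONAL_KEYS = {"edge_time", "edge_type"}


def check_keys(keys):
    """Return (missing, unexpected) sets for a collection of attribute names.

    Hypothetical helper for illustration; it is not provided by GODM.
    """
    keys = set(keys)
    missing = REQUIRED_KEYS - keys
    unexpected = keys - REQUIRED_KEYS - OPTIONAL_KEYS
    return missing, unexpected


missing, unexpected = check_keys(["x", "edge_index", "y", "train_mask", "edge_attr"])
print(sorted(missing))     # ['test_mask', 'val_mask']
print(sorted(unexpected))  # ['edge_attr']
```

With a real graph, the names would come from the `Data` object itself (for example `data.keys()` in recent PyTorch Geometric releases); any key reported as unexpected would need to be removed before calling GODM.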

## Parameters

- ```hid_dim``` (type: `int`, default: `None`): hidden dimension for VAE, i.e., latent embedding dimension. `None` means the largest power of 2 that is less than or equal to the feature dimension divided by two.
- ```diff_dim``` (type: `int`, default: `None`): hidden dimension for denoiser. `None` means twice `hid_dim`.
- ```vae_epochs``` (type: `int`, default: `100`): number of epochs for training VAE.
- ```diff_epochs``` (type: `int`, default: `100`): number of epochs for training diffusion model.
- ```patience``` (type: `int`, default: `50`): patience for early stopping.
- ```lr``` (type: `float`, default: `0.001`): learning rate.
- ```wd``` (type: `float`, default: `0.`): weight decay.
- ```batch_size``` (type: `int`, default: `2048`): batch size.
- ```threshold``` (type: `float`, default: `0.75`): threshold for edge generation.
- ```wx``` (type: `float`, default: `1.`): weight for node feature reconstruction loss.
- ```we``` (type: `float`, default: `0.5`): weight for edge reconstruction loss.
- ```beta``` (type: `float`, default: `0.001`): weight for KL divergence loss.
- ```wt``` (type: `float`, default: `1.`): weight for time prediction loss.
- ```time_attr``` (type: `str`, default: `edge_time`): attribute name for edge time.
- ```type_attr``` (type: `str`, default: `edge_type`): attribute name for edge type.
- ```wp``` (type: `float`, default: `0.3`): weight for node prediction loss.
- ```gen_nodes``` (type: `int`, default: `None`): number of nodes to generate. `None` means the same as the number of outliers in the original graph.
- ```sample_steps``` (type: `int`, default: `50`): number of steps for diffusion model sampling.
- ```device``` (type: `int`, default: `0`): GPU index, set to -1 for CPU.
- ```verbose``` (type: `bool`, default: `False`): verbose mode, enable for logging.

## Cite Us

Our [paper](https://arxiv.org/abs/2312.17679) is publicly available. If you use GODM in a scientific publication, we would appreciate a citation:

    @article{liu2023data,
      title={Data Augmentation for Supervised Graph Outlier Detection with Latent Diffusion Models},
      author={Liu, Kay and Zhang, Hengrui and Hu, Ziqing and Wang, Fangxin and Yu, Philip S.},
      journal={arXiv preprint arXiv:2312.17679},
      year={2023}
    }

or:

    Liu, K., Zhang, H., Hu, Z., Wang, F., and Yu, P.S. 2023. Data Augmentation for Supervised Graph Outlier Detection with Latent Diffusion Models. arXiv preprint arXiv:2312.17679.
    