synthetic-graph-benchmarks


- **Name**: synthetic-graph-benchmarks
- **Version**: 0.1.1
- **Summary**: Standardized benchmarks for evaluating synthetic graph generation methods
- **Upload time**: 2025-07-23 10:31:20
- **Requires Python**: >=3.10
- **License**: MIT
- **Keywords**: benchmarks, evaluation-metrics, graph-generation, graph-neural-networks, machine-learning, networkx, synthetic-graphs

# Synthetic Graph Benchmarks

[![PyPI version](https://badge.fury.io/py/synthetic-graph-benchmarks.svg)](https://badge.fury.io/py/synthetic-graph-benchmarks)
[![Python 3.10+](https://img.shields.io/badge/python-3.10+-blue.svg)](https://www.python.org/downloads/)
[![License: MIT](https://img.shields.io/badge/License-MIT-yellow.svg)](https://opensource.org/licenses/MIT)

A Python package implementing standardized benchmarks for evaluating synthetic graph generation methods, based on the evaluation frameworks introduced in:

- [**SPECTRE: Spectral Conditioning Helps to Overcome the Expressivity Limits of One-shot Graph Generators**](https://arxiv.org/pdf/2204.01613) (ICML 2022)
- [**Efficient and Scalable Graph Generation through Iterative Local Expansion**](https://arxiv.org/html/2312.11529v4) (2023)

This package provides a unified interface for benchmarking graph generation algorithms against established datasets and metrics used in the graph generation literature.

## Features

- **Standardized Datasets**: Access to benchmark datasets including Stochastic Block Model (SBM), Planar graphs, and Tree graphs
- **Comprehensive Metrics**: Implementation of key evaluation metrics including:
  - Degree distribution comparison (MMD)
  - Clustering coefficient analysis  
  - Orbit count statistics (using ORCA)
  - Spectral properties analysis
  - Wavelet coefficient comparison
- **Validation Metrics**: Graph-type-specific validation (planarity, tree properties, SBM likelihood)
- **Reproducible Evaluation**: Consistent benchmarking across different graph generation methods
- **Easy Integration**: Simple API for evaluating your own graph generation algorithms

## Installation

### From PyPI (recommended)

```bash
pip install synthetic-graph-benchmarks
```

### From Source

```bash
git clone https://github.com/peteole/synthetic_graph_benchmarks.git
cd synthetic_graph_benchmarks
pip install -e .
```

## Quick Start

```python
import networkx as nx
from synthetic_graph_benchmarks import (
    benchmark_planar_results,
    benchmark_sbm_results, 
    benchmark_tree_results
)

# Generate some example graphs (replace with your graph generation method)
generated_graphs = [nx.erdos_renyi_graph(64, 0.1) for _ in range(20)]

# Benchmark against planar graph dataset
results = benchmark_planar_results(generated_graphs)
print(f"Planar accuracy: {results['planar_acc']:.3f}")
print(f"Average metric ratio: {results['average_ratio']:.3f}")

# Benchmark against SBM dataset  
sbm_results = benchmark_sbm_results(generated_graphs)
print(f"SBM accuracy: {sbm_results['sbm_acc']:.3f}")

# Benchmark against tree dataset
tree_results = benchmark_tree_results(generated_graphs)
print(f"Tree accuracy: {tree_results['planar_acc']:.3f}")
```

## Datasets

The package provides access to three standard benchmark datasets:

### Stochastic Block Model (SBM)
- **Size**: 200 graphs
- **Properties**: 2-5 communities, 20-40 nodes per community
- **Edge probabilities**: 0.3 intra-community, 0.05 inter-community

### Planar Graphs  
- **Size**: 200 graphs with 64 nodes each
- **Generation**: Delaunay triangulation on random points in the unit square
- **Properties**: Guaranteed planarity
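
One consequence of planarity is easy to check by hand: a simple planar graph on n ≥ 3 nodes has at most 3n - 6 edges, so a 64-node graph in this dataset has at most 186. The sketch below uses that bound as a cheap necessary (not sufficient) filter before a full planarity test such as NetworkX's `nx.check_planarity`. The helper name `may_be_planar` is hypothetical and not part of this package.

```python
def may_be_planar(num_nodes, edges):
    """Necessary condition only: simple planar graphs satisfy m <= 3n - 6 for n >= 3."""
    # Ignore self-loops and duplicate edges so the bound applies to the simple graph
    unique_edges = {frozenset(e) for e in edges if e[0] != e[1]}
    if num_nodes < 3:
        return True
    return len(unique_edges) <= 3 * num_nodes - 6


# K5 has 10 edges, but the bound for n = 5 is 9, so it cannot be planar.
k5 = [(u, v) for u in range(5) for v in range(u + 1, 5)]
print(may_be_planar(5, k5))  # False
print(3 * 64 - 6)            # 186, the edge ceiling for a 64-node planar graph
```

A graph that passes this filter may still be non-planar (K3,3 is the classic example), so it can only rule candidates out.
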

### Tree Graphs
- **Size**: 200 graphs with 64 nodes each  
- **Properties**: Connected acyclic graphs (trees)

## Evaluation Metrics

### Graph Statistics
- **Degree Distribution**: Maximum Mean Discrepancy (MMD) between degree histograms
- **Clustering Coefficient**: MMD between distributions of local clustering coefficients
- **Orbit Counts**: MMD between 4-node orbit count statistics, computed with the ORCA package
- **Spectral Properties**: MMD between Laplacian eigenvalue distributions
- **Wavelet Coefficients**: MMD between graph wavelet signatures
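
The degree metric can be sketched in a few lines of standard-library Python. This README does not state which kernel or normalization the package uses, so the Gaussian kernel on total variation distance below is an assumption, and all function names are hypothetical. The estimator is the usual biased one: mean k(x, x') + mean k(y, y') - 2 mean k(x, y).

```python
import math
from collections import Counter


def degree_histogram(num_nodes, edges):
    """Normalized degree histogram: hist[d] = fraction of nodes with degree d."""
    degree = Counter()
    for u, v in edges:
        degree[u] += 1
        degree[v] += 1
    counts = Counter(degree[v] for v in range(num_nodes))
    return [counts[d] / num_nodes for d in range(max(counts) + 1)]


def gaussian_tv_kernel(p, q, sigma=1.0):
    """Gaussian kernel on the total variation distance between two histograms."""
    size = max(len(p), len(q))
    p = p + [0.0] * (size - len(p))  # zero-pad to a common length
    q = q + [0.0] * (size - len(q))
    tv = 0.5 * sum(abs(a - b) for a, b in zip(p, q))
    return math.exp(-tv * tv / (2 * sigma * sigma))


def mmd_squared(xs, ys, kernel=gaussian_tv_kernel):
    """Biased estimate of squared MMD between two sets of histograms."""
    def mean_k(a, b):
        return sum(kernel(s, t) for s in a for t in b) / (len(a) * len(b))
    return mean_k(xs, xs) + mean_k(ys, ys) - 2 * mean_k(xs, ys)


path = degree_histogram(4, [(0, 1), (1, 2), (2, 3)])  # degrees 1, 2, 2, 1
star = degree_histogram(4, [(0, 1), (0, 2), (0, 3)])  # degrees 3, 1, 1, 1
print(mmd_squared([path], [path]))      # identical sets give 0.0
print(mmd_squared([path], [star]) > 0)  # different degree profiles give a positive value
```

The other statistics follow the same pattern: replace `degree_histogram` with a different per-graph descriptor and keep the MMD estimator unchanged.
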

### Validity Metrics
- **Planar Accuracy**: Fraction of generated graphs that are planar
- **Tree Accuracy**: Fraction of generated graphs that are trees (connected and acyclic)
- **SBM Accuracy**: Likelihood of graphs under fitted SBM parameters
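
As an illustration of a validity score, the sketch below computes a tree accuracy over a small batch of graphs using only the standard library. It is not the package's implementation; the helper names are hypothetical, and NetworkX users would typically call `nx.is_tree` instead. Note that being acyclic is not enough on its own: a forest with several components is not a tree.

```python
def is_tree(num_nodes, edges):
    """A graph is a tree iff it has exactly n - 1 edges and no cycle (union-find check)."""
    if num_nodes == 0 or len(edges) != num_nodes - 1:
        return False
    parent = list(range(num_nodes))

    def find(x):
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path halving
            x = parent[x]
        return x

    for u, v in edges:
        ru, rv = find(u), find(v)
        if ru == rv:  # this edge closes a cycle (or is a self-loop)
            return False
        parent[ru] = rv
    return True  # n - 1 edges without a cycle imply the graph is connected


def tree_accuracy(graphs):
    """Fraction of (num_nodes, edges) pairs that are trees."""
    return sum(is_tree(n, e) for n, e in graphs) / len(graphs)


graphs = [
    (4, [(0, 1), (1, 2), (2, 3)]),  # path: a tree
    (4, [(0, 1), (1, 2), (2, 0)]),  # triangle plus an isolated node: not a tree
    (4, [(0, 1), (2, 3)]),          # two components: acyclic, but not a tree
]
print(tree_accuracy(graphs))  # 1 of the 3 graphs is a tree
```
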

### Quality Scores
- **Uniqueness**: Fraction of non-isomorphic graphs in the generated set
- **Novelty**: Fraction of generated graphs not isomorphic to training graphs
- **Validity-Uniqueness-Novelty (VUN)**: Combined score for overall quality
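
The following toy sketch shows how these three scores relate. It assumes one common reading of VUN (the fraction of generated graphs that are valid, novel, and counted once per isomorphism class); this README does not spell out the package's exact definition. The brute-force canonical form only works for very small graphs, and all names are hypothetical. For real graphs, an isomorphism test such as NetworkX's `nx.is_isomorphic` is the practical choice.

```python
from itertools import permutations


def canonical_form(num_nodes, edges):
    """Brute-force isomorphism invariant; only practical for very small graphs."""
    best = None
    for perm in permutations(range(num_nodes)):
        relabeled = tuple(sorted(tuple(sorted((perm[u], perm[v]))) for u, v in edges))
        if best is None or relabeled < best:
            best = relabeled
    return (num_nodes, best)


def quality_scores(generated, training, is_valid):
    """Return (uniqueness, novelty, vun) for lists of (num_nodes, edges) graphs."""
    train_forms = {canonical_form(n, e) for n, e in training}
    gen_forms = [canonical_form(n, e) for n, e in generated]
    uniqueness = len(set(gen_forms)) / len(generated)
    novelty = sum(f not in train_forms for f in gen_forms) / len(generated)
    # VUN: count each isomorphism class once, and only if it is valid and novel
    vun_classes = {f for f, (n, e) in zip(gen_forms, generated)
                   if f not in train_forms and is_valid(n, e)}
    return uniqueness, novelty, len(vun_classes) / len(generated)


training = [(3, [(0, 1), (1, 2)])]            # a 3-node path
generated = [
    (3, [(1, 0), (0, 2)]),                    # the same path, relabeled
    (3, [(0, 1), (1, 2), (2, 0)]),            # triangle
    (3, [(2, 1), (1, 0), (0, 2)]),            # the same triangle, relabeled
    (4, [(0, 1), (1, 2), (2, 3)]),            # a 4-node path
]
# Toy validity rule standing in for a real check: exactly n - 1 edges
scores = quality_scores(generated, training, lambda n, e: len(e) == n - 1)
print(scores)  # (0.75, 0.75, 0.25)
```
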

## Advanced Usage

### Custom Evaluation

```python
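import networkx as nx

from synthetic_graph_benchmarks.dataset import Dataset
from synthetic_graph_benchmarks.spectre_utils import PlanarSamplingMetrics

# Placeholder graphs; replace with the output of your graph generation method
generated_graphs = [nx.erdos_renyi_graph(64, 0.1) for _ in range(20)]

# Load dataset manually
dataset = Dataset.load_planar()
print(f"Training graphs: {len(dataset.train_graphs)}")
print(f"Validation graphs: {len(dataset.val_graphs)}")

# Use metrics directly: first score the training graphs to get reference values,
# then score the generated graphs against that reference
metrics = PlanarSamplingMetrics(dataset)
test_metrics = metrics.forward(dataset.train_graphs, test=True)
results = metrics.forward(generated_graphs, ref_metrics={"test": test_metrics}, test=True)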
```

### Accessing Individual Metrics

```python
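from synthetic_graph_benchmarks import benchmark_planar_results

# Get detailed breakdown of all metrics
# (generated_graphs is a list of NetworkX graphs, as in the Quick Start example)
results = benchmark_planar_results(generated_graphs)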

# Individual metric values
print(f"Degree MMD: {results['degree']:.6f}")
print(f"Clustering MMD: {results['clustering']:.6f}")  
print(f"Orbit MMD: {results['orbit']:.6f}")
print(f"Spectral MMD: {results['spectre']:.6f}")
print(f"Wavelet MMD: {results['wavelet']:.6f}")

# Ratios compared to training set
print(f"Degree ratio: {results['degree_ratio']:.3f}")
print(f"Average ratio: {results['average_ratio']:.3f}")
```
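
The ratio keys can be read as follows. The sketch assumes that each ratio divides the MMD of the generated graphs by a reference MMD measured on the training set, and that the average ratio is the mean of the per-metric ratios; this README does not define them precisely, so treat this as one plausible interpretation. The function name and the numbers are made up for illustration.

```python
def metric_ratios(generated_mmd, reference_mmd):
    """Per-metric ratio of generated-set MMD to reference MMD, plus their mean."""
    ratios = {
        f"{name}_ratio": generated_mmd[name] / reference_mmd[name]
        for name in generated_mmd
        if reference_mmd.get(name, 0.0) > 0.0  # skip metrics with a zero baseline
    }
    if ratios:
        ratios["average_ratio"] = sum(ratios.values()) / len(ratios)
    return ratios


# Made-up numbers purely for illustration
reference = {"degree": 0.0002, "clustering": 0.03, "orbit": 0.0005}
generated = {"degree": 0.0004, "clustering": 0.06, "orbit": 0.0020}
print(metric_ratios(generated, reference))
```

Under this reading, a ratio close to 1 means the generated graphs are about as far from the test distribution as the training graphs are, and larger values indicate a worse match.
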

## Citing

If you use this package in your research, please cite the original papers:

```bibtex
@inproceedings{martinkus2022spectre,
  title={SPECTRE: Spectral Conditioning Helps to Overcome the Expressivity Limits of One-shot Graph Generators},
  author={Martinkus, Karolis and Loukas, Andreas and Perraudin, Nathanaël and Wattenhofer, Roger},
  booktitle={International Conference on Machine Learning},
  pages={15159--15202},
  year={2022},
  organization={PMLR}
}

@article{bergmeister2023efficient,
  title={Efficient and Scalable Graph Generation through Iterative Local Expansion},
  author={Bergmeister, Andreas and Martinkus, Karolis and Perraudin, Nathanaël and Wattenhofer, Roger},
  journal={arXiv preprint arXiv:2312.11529},
  year={2023}
}
```

## Dependencies

- Python ≥ 3.10
- NetworkX ≥ 3.4.2
- NumPy ≥ 2.2.6  
- SciPy ≥ 1.15.3
- PyGSP ≥ 0.5.1
- scikit-learn ≥ 1.7.1
- ORCA-graphlets ≥ 0.1.4
- PyTorch ≥ 2.3.0

## Contributing

Contributions are welcome! Please feel free to submit a Pull Request. For major changes, please open an issue first to discuss what you would like to change.

## License

This project is licensed under the MIT License - see the [LICENSE](LICENSE) file for details.

## Acknowledgments

This package builds on work by:
- Karolis Martinkus (SPECTRE paper)
- Andreas Bergmeister (Iterative Local Expansion paper)
- The authors of the original GRAN evaluation codebase
- The NetworkX and PyGSP communities
            

Raw data

{
    "_id": null,
    "home_page": null,
    "name": "synthetic-graph-benchmarks",
    "maintainer": null,
    "docs_url": null,
    "requires_python": ">=3.10",
    "maintainer_email": "Ole Petersen <peteole2707@gmail.com>",
    "keywords": "benchmarks, evaluation-metrics, graph-generation, graph-neural-networks, machine-learning, networkx, synthetic-graphs",
    "author": null,
    "author_email": "Ole Petersen <peteole2707@gmail.com>",
    "download_url": "https://files.pythonhosted.org/packages/46/7c/e0806b3ae699e876cfeb00f4d77c965aab0f0e3ae08be37e3ac986cbcc8e/synthetic_graph_benchmarks-0.1.1.tar.gz",
    "platform": null,
    "bugtrack_url": null,
    "license": "MIT",
    "summary": "Standardized benchmarks for evaluating synthetic graph generation methods",
    "version": "0.1.1",
    "project_urls": {
        "Bug Tracker": "https://github.com/peteole/synthetic_graph_benchmarks/issues",
        "Documentation": "https://github.com/peteole/synthetic_graph_benchmarks#readme",
        "Homepage": "https://github.com/peteole/synthetic_graph_benchmarks",
        "Repository": "https://github.com/peteole/synthetic_graph_benchmarks"
    },
    "split_keywords": [
        "benchmarks",
        " evaluation-metrics",
        " graph-generation",
        " graph-neural-networks",
        " machine-learning",
        " networkx",
        " synthetic-graphs"
    ],
    "urls": [
        {
            "comment_text": null,
            "digests": {
                "blake2b_256": "42f8f0b69a26f4dfa9756b6621e95b69dab5ef96aae168347cde6549ce2dc665",
                "md5": "92fd860e5e3d8d3359a3dc167e20c7bb",
                "sha256": "3c635520e2d64c5e0b19a99b1a7f5b3c0c24b66022f25abf1aee467ee9d0a84e"
            },
            "downloads": -1,
            "filename": "synthetic_graph_benchmarks-0.1.1-py3-none-any.whl",
            "has_sig": false,
            "md5_digest": "92fd860e5e3d8d3359a3dc167e20c7bb",
            "packagetype": "bdist_wheel",
            "python_version": "py3",
            "requires_python": ">=3.10",
            "size": 17878,
            "upload_time": "2025-07-23T10:31:18",
            "upload_time_iso_8601": "2025-07-23T10:31:18.568382Z",
            "url": "https://files.pythonhosted.org/packages/42/f8/f0b69a26f4dfa9756b6621e95b69dab5ef96aae168347cde6549ce2dc665/synthetic_graph_benchmarks-0.1.1-py3-none-any.whl",
            "yanked": false,
            "yanked_reason": null
        },
        {
            "comment_text": null,
            "digests": {
                "blake2b_256": "467ce0806b3ae699e876cfeb00f4d77c965aab0f0e3ae08be37e3ac986cbcc8e",
                "md5": "0804f630cb13f1cb206319e8d424e39e",
                "sha256": "9e2f57b1e0cfb19aadd48bf529e88d443330f6249bb8e96e5eb1b0fc541176fe"
            },
            "downloads": -1,
            "filename": "synthetic_graph_benchmarks-0.1.1.tar.gz",
            "has_sig": false,
            "md5_digest": "0804f630cb13f1cb206319e8d424e39e",
            "packagetype": "sdist",
            "python_version": "source",
            "requires_python": ">=3.10",
            "size": 87962,
            "upload_time": "2025-07-23T10:31:20",
            "upload_time_iso_8601": "2025-07-23T10:31:20.012258Z",
            "url": "https://files.pythonhosted.org/packages/46/7c/e0806b3ae699e876cfeb00f4d77c965aab0f0e3ae08be37e3ac986cbcc8e/synthetic_graph_benchmarks-0.1.1.tar.gz",
            "yanked": false,
            "yanked_reason": null
        }
    ],
    "upload_time": "2025-07-23 10:31:20",
    "github": true,
    "gitlab": false,
    "bitbucket": false,
    "codeberg": false,
    "github_user": "peteole",
    "github_project": "synthetic_graph_benchmarks",
    "travis_ci": false,
    "coveralls": false,
    "github_actions": false,
    "lcname": "synthetic-graph-benchmarks"
}
        