# Synthetic Graph Benchmarks
A Python package implementing standardized benchmarks for evaluating synthetic graph generation methods, based on the evaluation frameworks introduced in:
- [**SPECTRE: Spectral Conditioning Helps to Overcome the Expressivity Limits of One-shot Graph Generators**](https://arxiv.org/pdf/2204.01613) (ICML 2022)
- [**Efficient and Scalable Graph Generation through Iterative Local Expansion**](https://arxiv.org/html/2312.11529v4) (2023)
This package provides a unified interface for benchmarking graph generation algorithms against established datasets and metrics used in the graph generation literature.
## Features
- **Standardized Datasets**: Access to benchmark datasets including Stochastic Block Model (SBM), Planar graphs, and Tree graphs
- **Comprehensive Metrics**: Implementation of key evaluation metrics including:
  - Degree distribution comparison (MMD)
  - Clustering coefficient analysis
  - Orbit count statistics (using ORCA)
  - Spectral properties analysis
  - Wavelet coefficient comparison
- **Validation Metrics**: Graph-type specific validation (planarity, tree properties, SBM likelihood)
- **Reproducible Evaluation**: Consistent benchmarking across different graph generation methods
- **Easy Integration**: Simple API for evaluating your own graph generation algorithms
## Installation
### From PyPI (recommended)
```bash
pip install synthetic-graph-benchmarks
```
### From Source
```bash
git clone https://github.com/peteole/synthetic_graph_benchmarks.git
cd synthetic_graph_benchmarks
pip install -e .
```
## Quick Start
```python
import networkx as nx
from synthetic_graph_benchmarks import (
    benchmark_planar_results,
    benchmark_sbm_results,
    benchmark_tree_results,
)
# Generate some example graphs (replace with your graph generation method)
generated_graphs = [nx.erdos_renyi_graph(64, 0.1) for _ in range(20)]
# Benchmark against planar graph dataset
results = benchmark_planar_results(generated_graphs)
print(f"Planar accuracy: {results['planar_acc']:.3f}")
print(f"Average metric ratio: {results['average_ratio']:.3f}")
# Benchmark against SBM dataset
sbm_results = benchmark_sbm_results(generated_graphs)
print(f"SBM accuracy: {sbm_results['sbm_acc']:.3f}")
# Benchmark against tree dataset
tree_results = benchmark_tree_results(generated_graphs)
print(f"Tree accuracy: {tree_results['planar_acc']:.3f}")
```
## Datasets
The package provides access to three standard benchmark datasets:
### Stochastic Block Model (SBM)
- **Size**: 200 graphs
- **Properties**: 2-5 communities, 20-40 nodes per community
- **Edge probabilities**: 0.3 intra-community, 0.05 inter-community
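
As a rough illustration of the parameters listed above, a graph with the same community structure can be sampled with NetworkX. This is only a sketch: the helper name `sample_sbm` and the uniform sampling of community counts and sizes are assumptions, not the procedure used to build the packaged dataset.

```python
import random

import networkx as nx


def sample_sbm(seed: int = 0) -> nx.Graph:
    """Sample one SBM graph (hypothetical helper, not part of the package)."""
    rng = random.Random(seed)
    # Assumption: community count and sizes are drawn uniformly from the stated ranges.
    n_communities = rng.randint(2, 5)
    sizes = [rng.randint(20, 40) for _ in range(n_communities)]
    # 0.3 on the diagonal (intra-community), 0.05 elsewhere (inter-community).
    probs = [
        [0.3 if i == j else 0.05 for j in range(n_communities)]
        for i in range(n_communities)
    ]
    return nx.stochastic_block_model(sizes, probs, seed=seed)


graph = sample_sbm(seed=0)
print(graph.number_of_nodes(), graph.number_of_edges())
```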
### Planar Graphs
- **Size**: 200 graphs with 64 nodes each
- **Generation**: Delaunay triangulation on random points in unit square
- **Properties**: Guaranteed planarity
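
The generation procedure described above can be sketched with SciPy and NetworkX. The helper name `delaunay_planar_graph` and the seeding scheme are assumptions for illustration; the packaged dataset may have been built differently in its details.

```python
import networkx as nx
import numpy as np
from scipy.spatial import Delaunay


def delaunay_planar_graph(n_nodes: int = 64, seed: int = 0) -> nx.Graph:
    """Build a planar graph from a Delaunay triangulation (hypothetical helper)."""
    rng = np.random.default_rng(seed)
    points = rng.random((n_nodes, 2))  # uniform random points in the unit square
    triangulation = Delaunay(points)

    graph = nx.Graph()
    graph.add_nodes_from(range(n_nodes))
    # Each simplex is a triangle given by three point indices; add its three edges.
    for a, b, c in triangulation.simplices:
        graph.add_edges_from([(int(a), int(b)), (int(b), int(c)), (int(a), int(c))])
    return graph


graph = delaunay_planar_graph()
is_planar, _ = nx.check_planarity(graph)
print(graph.number_of_nodes(), is_planar)
```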
### Tree Graphs
- **Size**: 200 graphs with 64 nodes each
- **Properties**: Connected acyclic graphs (trees)
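
For comparison, a random 64-node tree can be produced by decoding a random Prüfer sequence. This is a sketch under the assumption that a uniformly random labeled tree is an acceptable stand-in; the README does not state how the packaged trees were generated, and `random_tree` is a hypothetical helper.

```python
import random

import networkx as nx


def random_tree(n_nodes: int = 64, seed: int = 0) -> nx.Graph:
    """Decode a random Prüfer sequence into a labeled tree (hypothetical helper)."""
    rng = random.Random(seed)
    # A Prüfer sequence of length n - 2 over labels 0..n-1 encodes exactly one tree.
    prufer = [rng.randrange(n_nodes) for _ in range(n_nodes - 2)]
    return nx.from_prufer_sequence(prufer)


tree = random_tree()
print(tree.number_of_nodes(), tree.number_of_edges(), nx.is_tree(tree))
```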
## Evaluation Metrics
### Graph Statistics
- **Degree Distribution**: Maximum Mean Discrepancy (MMD) between degree histograms
- **Clustering Coefficient**: Local clustering coefficient comparison
- **Orbit Counts**: 4-node orbit statistics using ORCA package
- **Spectral Properties**: Laplacian eigenvalue distribution analysis
- **Wavelet Coefficients**: Graph wavelet signature comparison
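
To make the MMD idea concrete, the sketch below computes a biased estimate of the squared MMD between the degree histograms of two graph sets. It assumes a plain Gaussian kernel on normalized histograms; the kernel and bandwidth used inside the package may differ, and all function names here are hypothetical.

```python
from itertools import chain

import networkx as nx
import numpy as np


def degree_histogram(graph: nx.Graph, n_bins: int) -> np.ndarray:
    """Normalized degree histogram, zero-padded to n_bins entries."""
    counts = nx.degree_histogram(graph)  # counts[d] = number of nodes with degree d
    hist = np.zeros(n_bins)
    hist[: len(counts)] = counts
    return hist / hist.sum()


def mean_kernel(xs, ys, sigma: float) -> float:
    """Average Gaussian kernel value over all pairs (x, y)."""
    values = [
        np.exp(-np.sum((x - y) ** 2) / (2.0 * sigma**2)) for x in xs for y in ys
    ]
    return float(np.mean(values))


def degree_mmd(reference, generated, sigma: float = 1.0) -> float:
    """Biased estimate of squared MMD between two sets of graphs."""
    max_degree = max(d for g in chain(reference, generated) for _, d in g.degree())
    ref = [degree_histogram(g, max_degree + 1) for g in reference]
    gen = [degree_histogram(g, max_degree + 1) for g in generated]
    return (
        mean_kernel(ref, ref, sigma)
        + mean_kernel(gen, gen, sigma)
        - 2.0 * mean_kernel(ref, gen, sigma)
    )


reference = [nx.erdos_renyi_graph(30, 0.2, seed=s) for s in range(5)]
paths = [nx.path_graph(30) for _ in range(5)]
mmd_same = degree_mmd(reference, reference)
mmd_diff = degree_mmd(reference, paths)
print(mmd_same, mmd_diff)
```

Comparing a set with itself gives zero, while sets with clearly different degree distributions give a larger value, which is the behavior the metric relies on.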
### Validity Metrics
- **Planar Accuracy**: Fraction of generated graphs that are planar
- **Tree Accuracy**: Fraction of generated graphs that are trees (connected and acyclic)
- **SBM Accuracy**: Likelihood of graphs under fitted SBM parameters
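
The planar and tree checks can be approximated directly with NetworkX, as in the sketch below. It follows the descriptions above literally; whether the package applies extra conditions (for example, requiring planar graphs to be connected) is not stated here, so treat these hypothetical helpers as illustrative only.

```python
import networkx as nx


def planar_accuracy(graphs) -> float:
    """Fraction of graphs that pass NetworkX's planarity test."""
    return sum(nx.check_planarity(g)[0] for g in graphs) / len(graphs)


def tree_accuracy(graphs) -> float:
    """Fraction of graphs that are trees (connected and acyclic)."""
    # nx.is_tree raises on an empty graph, so guard against zero nodes.
    return sum(g.number_of_nodes() > 0 and nx.is_tree(g) for g in graphs) / len(graphs)


graphs = [
    nx.path_graph(5),      # planar, tree
    nx.cycle_graph(5),     # planar, not a tree
    nx.complete_graph(5),  # K5: not planar, not a tree
    nx.star_graph(4),      # planar, tree
]
planar_acc = planar_accuracy(graphs)
tree_acc = tree_accuracy(graphs)
print(planar_acc, tree_acc)
```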
### Quality Scores
- **Uniqueness**: Fraction of non-isomorphic graphs in generated set
- **Novelty**: Fraction of generated graphs not isomorphic to training graphs
- **Validity-Uniqueness-Novelty (VUN)**: Combined score for overall quality
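
One way to read these three scores is sketched below using `nx.is_isomorphic`. The exact definitions are assumptions: uniqueness is taken as the number of isomorphism classes divided by the number of generated graphs, and VUN as the fraction of graphs that are valid, novel, and the first representative of their class. The function names are hypothetical and the package may compute these scores differently.

```python
import networkx as nx


def uniqueness(generated) -> float:
    """Number of isomorphism classes divided by the number of generated graphs."""
    representatives = []
    for g in generated:
        if not any(nx.is_isomorphic(g, r) for r in representatives):
            representatives.append(g)
    return len(representatives) / len(generated)


def novelty(generated, train) -> float:
    """Fraction of generated graphs not isomorphic to any training graph."""
    novel = [g for g in generated if not any(nx.is_isomorphic(g, t) for t in train)]
    return len(novel) / len(generated)


def vun(generated, train, is_valid) -> float:
    """Fraction of graphs that are valid, unique (first of their class), and novel."""
    seen, count = [], 0
    for g in generated:
        if any(nx.is_isomorphic(g, s) for s in seen):
            continue  # duplicate of an earlier generated graph
        seen.append(g)
        if is_valid(g) and not any(nx.is_isomorphic(g, t) for t in train):
            count += 1
    return count / len(generated)


generated = [nx.path_graph(4), nx.path_graph(4), nx.cycle_graph(4), nx.star_graph(3)]
train = [nx.star_graph(3)]
u = uniqueness(generated)
n = novelty(generated, train)
v = vun(generated, train, is_valid=nx.is_tree)
print(u, n, v)
```

Pairwise isomorphism tests scale quadratically with the number of graphs, so this form is only practical for small sets such as the 20-graph example in the Quick Start.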
## Advanced Usage
### Custom Evaluation
```python
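import networkx as nx

from synthetic_graph_benchmarks.dataset import Dataset
from synthetic_graph_benchmarks.spectre_utils import PlanarSamplingMetrics

# Placeholder graphs (replace with the output of your graph generation method)
generated_graphs = [nx.erdos_renyi_graph(64, 0.1) for _ in range(20)]

# Load the dataset manually
dataset = Dataset.load_planar()
print(f"Training graphs: {len(dataset.train_graphs)}")
print(f"Validation graphs: {len(dataset.val_graphs)}")

# Use the metrics directly: compute reference metrics on the training graphs,
# then evaluate the generated graphs against them
metrics = PlanarSamplingMetrics(dataset)
test_metrics = metrics.forward(dataset.train_graphs, test=True)
results = metrics.forward(generated_graphs, ref_metrics={"test": test_metrics}, test=True)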
```
### Accessing Individual Metrics
```python
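import networkx as nx
from synthetic_graph_benchmarks import benchmark_planar_results

# Placeholder graphs (replace with the output of your graph generation method)
generated_graphs = [nx.erdos_renyi_graph(64, 0.1) for _ in range(20)]

# Get a detailed breakdown of all metrics
results = benchmark_planar_results(generated_graphs)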
# Individual metric values
print(f"Degree MMD: {results['degree']:.6f}")
print(f"Clustering MMD: {results['clustering']:.6f}")
print(f"Orbit MMD: {results['orbit']:.6f}")
print(f"Spectral MMD: {results['spectre']:.6f}")
print(f"Wavelet MMD: {results['wavelet']:.6f}")
# Ratios compared to training set
print(f"Degree ratio: {results['degree_ratio']:.3f}")
print(f"Average ratio: {results['average_ratio']:.3f}")
```
## Citing
If you use this package in your research, please cite the original papers:
```bibtex
@inproceedings{martinkus2022spectre,
title={SPECTRE: Spectral Conditioning Helps to Overcome the Expressivity Limits of One-shot Graph Generators},
author={Martinkus, Karolis and Loukas, Andreas and Perraudin, Nathanaël and Wattenhofer, Roger},
booktitle={International Conference on Machine Learning},
pages={15159--15202},
year={2022},
organization={PMLR}
}
@article{bergmeister2023efficient,
title={Efficient and Scalable Graph Generation through Iterative Local Expansion},
author={Bergmeister, Andreas and Martinkus, Karolis and Perraudin, Nathanaël and Wattenhofer, Roger},
journal={arXiv preprint arXiv:2312.11529},
year={2023}
}
```
## Dependencies
- Python ≥ 3.10
- NetworkX ≥ 3.4.2
- NumPy ≥ 2.2.6
- SciPy ≥ 1.15.3
- PyGSP ≥ 0.5.1
- scikit-learn ≥ 1.7.1
- ORCA-graphlets ≥ 0.1.4
- PyTorch ≥ 2.3.0
## Contributing
Contributions are welcome! Please feel free to submit a Pull Request. For major changes, please open an issue first to discuss what you would like to change.
## License
This project is licensed under the MIT License - see the [LICENSE](LICENSE) file for details.
## Acknowledgments
This package is based on evaluation frameworks developed by:
- Karolis Martinkus (SPECTRE paper)
- Andreas Bergmeister (Iterative Local Expansion paper)
- The original GRAN evaluation codebase
- NetworkX and PyGSP communities
Raw data
{
"_id": null,
"home_page": null,
"name": "synthetic-graph-benchmarks",
"maintainer": null,
"docs_url": null,
"requires_python": ">=3.10",
"maintainer_email": "Ole Petersen <peteole2707@gmail.com>",
"keywords": "benchmarks, evaluation-metrics, graph-generation, graph-neural-networks, machine-learning, networkx, synthetic-graphs",
"author": null,
"author_email": "Ole Petersen <peteole2707@gmail.com>",
"download_url": "https://files.pythonhosted.org/packages/46/7c/e0806b3ae699e876cfeb00f4d77c965aab0f0e3ae08be37e3ac986cbcc8e/synthetic_graph_benchmarks-0.1.1.tar.gz",
"platform": null,
"bugtrack_url": null,
"license": "MIT",
"summary": "Standardized benchmarks for evaluating synthetic graph generation methods",
"version": "0.1.1",
"project_urls": {
"Bug Tracker": "https://github.com/peteole/synthetic_graph_benchmarks/issues",
"Documentation": "https://github.com/peteole/synthetic_graph_benchmarks#readme",
"Homepage": "https://github.com/peteole/synthetic_graph_benchmarks",
"Repository": "https://github.com/peteole/synthetic_graph_benchmarks"
},
"split_keywords": [
"benchmarks",
" evaluation-metrics",
" graph-generation",
" graph-neural-networks",
" machine-learning",
" networkx",
" synthetic-graphs"
],
"urls": [
{
"comment_text": null,
"digests": {
"blake2b_256": "42f8f0b69a26f4dfa9756b6621e95b69dab5ef96aae168347cde6549ce2dc665",
"md5": "92fd860e5e3d8d3359a3dc167e20c7bb",
"sha256": "3c635520e2d64c5e0b19a99b1a7f5b3c0c24b66022f25abf1aee467ee9d0a84e"
},
"downloads": -1,
"filename": "synthetic_graph_benchmarks-0.1.1-py3-none-any.whl",
"has_sig": false,
"md5_digest": "92fd860e5e3d8d3359a3dc167e20c7bb",
"packagetype": "bdist_wheel",
"python_version": "py3",
"requires_python": ">=3.10",
"size": 17878,
"upload_time": "2025-07-23T10:31:18",
"upload_time_iso_8601": "2025-07-23T10:31:18.568382Z",
"url": "https://files.pythonhosted.org/packages/42/f8/f0b69a26f4dfa9756b6621e95b69dab5ef96aae168347cde6549ce2dc665/synthetic_graph_benchmarks-0.1.1-py3-none-any.whl",
"yanked": false,
"yanked_reason": null
},
{
"comment_text": null,
"digests": {
"blake2b_256": "467ce0806b3ae699e876cfeb00f4d77c965aab0f0e3ae08be37e3ac986cbcc8e",
"md5": "0804f630cb13f1cb206319e8d424e39e",
"sha256": "9e2f57b1e0cfb19aadd48bf529e88d443330f6249bb8e96e5eb1b0fc541176fe"
},
"downloads": -1,
"filename": "synthetic_graph_benchmarks-0.1.1.tar.gz",
"has_sig": false,
"md5_digest": "0804f630cb13f1cb206319e8d424e39e",
"packagetype": "sdist",
"python_version": "source",
"requires_python": ">=3.10",
"size": 87962,
"upload_time": "2025-07-23T10:31:20",
"upload_time_iso_8601": "2025-07-23T10:31:20.012258Z",
"url": "https://files.pythonhosted.org/packages/46/7c/e0806b3ae699e876cfeb00f4d77c965aab0f0e3ae08be37e3ac986cbcc8e/synthetic_graph_benchmarks-0.1.1.tar.gz",
"yanked": false,
"yanked_reason": null
}
],
"upload_time": "2025-07-23 10:31:20",
"github": true,
"gitlab": false,
"bitbucket": false,
"codeberg": false,
"github_user": "peteole",
"github_project": "synthetic_graph_benchmarks",
"travis_ci": false,
"coveralls": false,
"github_actions": false,
"lcname": "synthetic-graph-benchmarks"
}