molecule-benchmarks 0.1.12

- Summary: A comprehensive benchmark suite for evaluating generative models for molecules
- Author: Ole Petersen <peteole2707@gmail.com>
- Requires Python: >=3.11
- License: MIT
- Keywords: benchmarking, cheminformatics, drug-discovery, generative-models, molecules
- Links: [Repository](https://github.com/peteole/molecule_benchmarks) · [Issues](https://github.com/peteole/molecule_benchmarks/issues) · [Documentation](https://molecule-benchmarks.readthedocs.io/)
- Uploaded: 2025-07-08 15:49:38
# Molecule Benchmarks

[![PyPI version](https://badge.fury.io/py/molecule-benchmarks.svg)](https://badge.fury.io/py/molecule-benchmarks)
[![Python 3.11+](https://img.shields.io/badge/python-3.11+-blue.svg)](https://www.python.org/downloads/)
[![License: MIT](https://img.shields.io/badge/License-MIT-yellow.svg)](https://opensource.org/licenses/MIT)

A comprehensive benchmark suite for evaluating generative models for molecules. This package provides standardized metrics and evaluation protocols for assessing the quality of molecular generation models in drug discovery and cheminformatics.

## Features

- **Comprehensive Metrics**: Validity, uniqueness, novelty, diversity, and similarity metrics
- **Standard Benchmarks**: Implements metrics from Moses, GuacaMol, and FCD papers
- **Easy Integration**: Simple interface for integrating with any generative model
- **Direct SMILES Evaluation**: Benchmark pre-generated SMILES lists without implementing a model interface
- **Multiple Datasets**: Built-in support for QM9, Moses, and GuacaMol datasets
- **Efficient Computation**: Optimized for large-scale evaluation with multiprocessing support

## Installation

```bash
pip install molecule-benchmarks
```
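
To reproduce the exact release documented on this page, you can pin the version (0.1.12 at the time of writing):

```bash
pip install "molecule-benchmarks==0.1.12"
```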

## Quick Start

You can use the benchmark suite in two ways:

### Option 1: Direct SMILES Evaluation (Simplified)

If you already have generated SMILES strings, you can benchmark them directly. Just ensure you have at least the number of samples specified in `num_samples_to_generate`.

```python
from molecule_benchmarks import Benchmarker, SmilesDataset

# Load a dataset
dataset = SmilesDataset.load_qm9_dataset(subset_size=10000)

# Initialize benchmarker
benchmarker = Benchmarker(
    dataset=dataset,
    num_samples_to_generate=10000,  # You need to generate at least this many samples
    device="cpu"  # or "cuda" for GPU
)

# Your generated SMILES (replace with your actual generated molecules)
generated_smiles = [
    "CCO",           # Ethanol
    "CC(=O)O",       # Acetic acid
    "c1ccccc1",      # Benzene
    "CC(C)O",        # Isopropanol
    "CCN",           # Ethylamine
    None,            # Invalid molecule (use None for failures)
    # ... more molecules up to num_samples_to_generate
]

# Run benchmarks directly on the SMILES list
results = benchmarker.benchmark(generated_smiles)
print(results)
```

### Option 2: Model-Based Evaluation

To use the benchmark suite with a generative model, implement the `MoleculeGenerationModel` protocol. The benchmarker then requests batches from the model until it has the required number of samples and runs the benchmarks on them.

```python
from molecule_benchmarks import Benchmarker, SmilesDataset
from molecule_benchmarks.model import MoleculeGenerationModel

class MyGenerativeModel(MoleculeGenerationModel):
    def __init__(self, model_path):
        # Initialize your model here
        self.model = load_model(model_path)
    
    def generate_molecule_batch(self) -> list[str | None]:
        """Generate a batch of molecules as SMILES strings.
        
        Returns:
            List of SMILES strings. Return None for invalid molecules.
        """
        # Your generation logic here
        batch = self.model.generate(batch_size=100)
        return [self.convert_to_smiles(mol) for mol in batch]

# Initialize your model
model = MyGenerativeModel("path/to/model")

# Create a benchmarker (as in Option 1) and run the benchmarks with the model
benchmarker = Benchmarker(
    dataset=SmilesDataset.load_qm9_dataset(subset_size=10000),
    num_samples_to_generate=10000,
    device="cpu",
)
results = benchmarker.benchmark_model(model)
print(results)
```

### Analyzing the Results

The benchmark returns comprehensive metrics:

```python
# Validity metrics
print(f"Valid molecules: {results['validity']['valid_fraction']:.3f}")
print(f"Valid & unique: {results['validity']['valid_and_unique_fraction']:.3f}")
print(f"Valid & unique & novel: {results['validity']['valid_and_unique_and_novel_fraction']:.3f}")

# Diversity and similarity metrics
print(f"Internal diversity: {results['moses']['IntDiv']:.3f}")
print(f"SNN score: {results['moses']['snn_score']:.3f}")

# Chemical property distribution similarity
print(f"KL divergence score: {results['kl_score']:.3f}")

# Fréchet ChemNet Distance
print(f"FCD score: {results['fcd']['fcd']:.3f}")
```

## Complete Examples

### Example 1: Direct SMILES Benchmarking (Recommended for Simplicity)

```python
from molecule_benchmarks import Benchmarker, SmilesDataset

# Load dataset
print("Loading dataset...")
dataset = SmilesDataset.load_qm9_dataset(max_train_samples=1000)

# Create benchmarker
benchmarker = Benchmarker(
    dataset=dataset,
    num_samples_to_generate=100,
    device="cpu"
)

# Your generated SMILES (replace with your actual generated molecules)
generated_smiles = [
    "CCO",           # Ethanol
    "CC(=O)O",       # Acetic acid
    "c1ccccc1",      # Benzene
    "CC(C)O",        # Isopropanol
    "CCN",           # Ethylamine
    None,            # Invalid molecule
    # ... add more molecules up to 100 total
] + [None] * (100 - 6)  # Pad with None to reach desired count

# Run benchmarks directly
print("Running benchmarks...")
results = benchmarker.benchmark(generated_smiles)

# Print the results (same output format as Example 2)
print("\n=== Validity Metrics ===")
print(f"Valid molecules: {results['validity']['valid_fraction']:.3f}")
print(f"Unique molecules: {results['validity']['unique_fraction']:.3f}")
print(f"Valid & unique: {results['validity']['valid_and_unique_fraction']:.3f}")
print(f"Novel molecules: {results['validity']['valid_and_unique_and_novel_fraction']:.3f}")

print("\n=== Moses Metrics ===")
print(f"Passing Moses filters: {results['moses']['fraction_passing_moses_filters']:.3f}")
print(f"SNN score: {results['moses']['snn_score']:.3f}")
print(f"Internal diversity (p=1): {results['moses']['IntDiv']:.3f}")
print(f"Internal diversity (p=2): {results['moses']['IntDiv2']:.3f}")

print("\n=== Distribution Metrics ===")
print(f"KL divergence score: {results['kl_score']:.3f}")
print(f"FCD score: {results['fcd']['fcd']:.3f}")
print(f"FCD (valid only): {results['fcd']['fcd_valid']:.3f}")
```

### Example 2: Model-Based Benchmarking

Here's a complete example using the built-in dummy model:

```python
from molecule_benchmarks import Benchmarker, SmilesDataset
from molecule_benchmarks.model import DummyMoleculeGenerationModel

# Load dataset
print("Loading dataset...")
dataset = SmilesDataset.load_qm9_dataset(max_train_samples=1000)

# Create benchmarker
benchmarker = Benchmarker(
    dataset=dataset,
    num_samples_to_generate=100,
    device="cpu"
)

# Create a dummy model (replace with your model)
model = DummyMoleculeGenerationModel([
    "CCO",           # Ethanol
    "CC(=O)O",       # Acetic acid
    "c1ccccc1",      # Benzene
    "CC(C)O",        # Isopropanol
    "CCN",           # Ethylamine
    None,            # Invalid molecule
])

# Run benchmarks using the model
print("Running benchmarks...")
results = benchmarker.benchmark_model(model)

# Print results
print("\n=== Validity Metrics ===")
print(f"Valid molecules: {results['validity']['valid_fraction']:.3f}")
print(f"Unique molecules: {results['validity']['unique_fraction']:.3f}")
print(f"Valid & unique: {results['validity']['valid_and_unique_fraction']:.3f}")
print(f"Novel molecules: {results['validity']['valid_and_unique_and_novel_fraction']:.3f}")

print("\n=== Moses Metrics ===")
print(f"Passing Moses filters: {results['moses']['fraction_passing_moses_filters']:.3f}")
print(f"SNN score: {results['moses']['snn_score']:.3f}")
print(f"Internal diversity (p=1): {results['moses']['IntDiv']:.3f}")
print(f"Internal diversity (p=2): {results['moses']['IntDiv2']:.3f}")

print("\n=== Distribution Metrics ===")
print(f"KL divergence score: {results['kl_score']:.3f}")
print(f"FCD score: {results['fcd']['fcd']:.3f}")
print(f"FCD (valid only): {results['fcd']['fcd_valid']:.3f}")
```

## Supported Datasets

The package includes several built-in datasets:

```python
from molecule_benchmarks import SmilesDataset

# QM9 dataset (small molecules)
dataset = SmilesDataset.load_qm9_dataset(subset_size=10000)

# Moses dataset (larger, drug-like molecules)
dataset = SmilesDataset.load_moses_dataset(fraction=0.1)

# GuacaMol dataset
dataset = SmilesDataset.load_guacamol_dataset(fraction=0.1)

# Custom dataset from files
dataset = SmilesDataset(
    train_smiles="path/to/train.txt",
    validation_smiles="path/to/valid.txt"
)
```

## Metrics Explained

### Validity Metrics

- **Valid fraction**: Fraction of generated molecules that are chemically valid
- **Unique fraction**: Fraction of generated molecules that are unique
- **Novel fraction**: Fraction of generated molecules that do not appear in the training data (see the sketch below)
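
For intuition, here is a minimal sketch of how these fractions can be computed with RDKit, assuming canonical-SMILES comparison; the package's own implementation may differ in details:

```python
from rdkit import Chem

def validity_metrics(generated: list[str | None], train_smiles: list[str]) -> dict:
    # None entries and unparsable SMILES both count as invalid.
    valid = []
    for smi in generated:
        mol = Chem.MolFromSmiles(smi) if smi is not None else None
        if mol is not None:
            valid.append(Chem.MolToSmiles(mol))  # canonical form
    unique = set(valid)
    train = {Chem.MolToSmiles(Chem.MolFromSmiles(s)) for s in train_smiles}
    novel = unique - train
    n = len(generated)
    return {
        "valid_fraction": len(valid) / n,
        "valid_and_unique_fraction": len(unique) / n,
        "valid_and_unique_and_novel_fraction": len(novel) / n,
    }
```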

### Moses Metrics

Based on the [Moses paper](https://arxiv.org/abs/1811.12823):

- **SNN score**: Average Tanimoto similarity of each generated molecule to its nearest neighbor in the training set
- **Internal diversity**: Average pairwise Tanimoto distance within the generated set (see the sketch below)
- **Scaffold similarity**: Similarity of molecular scaffolds to the training set
- **Fragment similarity**: Similarity of molecular fragments to the training set
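
As a rough illustration of the first two metrics, here is a simplified sketch using Morgan fingerprints, following the definitions in the Moses paper; it is not the package's exact implementation:

```python
from rdkit import Chem, DataStructs
from rdkit.Chem import AllChem

def _fp(smiles: str):
    # 2048-bit Morgan fingerprint of radius 2 (ECFP4-like)
    mol = Chem.MolFromSmiles(smiles)
    return AllChem.GetMorganFingerprintAsBitVect(mol, 2, nBits=2048)

def snn_score(gen_smiles: list[str], train_smiles: list[str]) -> float:
    # Mean Tanimoto similarity of each generated molecule
    # to its nearest neighbor in the training set.
    train_fps = [_fp(s) for s in train_smiles]
    nearest = [
        max(DataStructs.TanimotoSimilarity(_fp(s), t) for t in train_fps)
        for s in gen_smiles
    ]
    return sum(nearest) / len(nearest)

def internal_diversity(gen_smiles: list[str], p: int = 1) -> float:
    # IntDiv_p = 1 - (mean over all ordered pairs of Tanimoto^p)^(1/p)
    fps = [_fp(s) for s in gen_smiles]
    sims = [DataStructs.TanimotoSimilarity(a, b) ** p for a in fps for b in fps]
    return 1.0 - (sum(sims) / len(sims)) ** (1.0 / p)
```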

### Distribution Metrics

- **KL divergence score**: Measures how closely the distributions of simple molecular properties in the generated set match those of the training set
- **FCD score**: Fréchet ChemNet Distance; measures distribution similarity in a learned feature space (the distance itself is sketched below)
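
The FCD fits a Gaussian to the ChemNet activations of each molecule set and takes the Fréchet distance between the two Gaussians. Given the means and covariances of those activations (which the package computes for you), the distance itself can be evaluated as in this sketch:

```python
import numpy as np
from scipy import linalg

def frechet_distance(mu_g, sigma_g, mu_r, sigma_r) -> float:
    # ||mu_g - mu_r||^2 + Tr(sigma_g + sigma_r - 2 (sigma_g @ sigma_r)^(1/2))
    diff = mu_g - mu_r
    covmean, _ = linalg.sqrtm(sigma_g @ sigma_r, disp=False)
    if np.iscomplexobj(covmean):
        covmean = covmean.real  # discard numerical imaginary noise
    return float(diff @ diff + np.trace(sigma_g + sigma_r - 2.0 * covmean))
```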

## Advanced Usage

### Direct SMILES Evaluation

For most use cases, directly evaluating a list of generated SMILES is the simplest approach:

```python
# Custom number of samples and device
benchmarker = Benchmarker(
    dataset=dataset,
    num_samples_to_generate=50000,
    device="cuda"  # Use GPU for faster computation
)

# Your generated SMILES list (with None for invalid generations)
my_generated_smiles = [
    "CCO", "c1ccccc1", "CC(=O)O", None, "invalid_smiles", 
    # ... up to 50000 molecules
]

# Run benchmarks directly
results = benchmarker.benchmark(my_generated_smiles)

# Access specific metric computations
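# (note: underscore-prefixed helpers are internal and may change between releases)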
validity_scores = benchmarker._compute_validity_scores(my_generated_smiles)
fcd_scores = benchmarker._compute_fcd_scores(my_generated_smiles)
```

### Model-Based Evaluation

For integration with generative models:

```python
class BatchedModel(MoleculeGenerationModel):
    def __init__(self, model):
        self.model = model

    def generate_molecule_batch(self) -> list[str | None]:
        # Generate larger batches for efficiency
        return self.model.sample(batch_size=1000)

# Use the model with the benchmarker (my_model is your trained generator)
results = benchmarker.benchmark_model(BatchedModel(my_model))
```

### Important Notes

- **SMILES format**: Use `None` for molecules that failed to generate or are invalid
- **Sample count**: The `num_samples_to_generate` parameter determines how many molecules are evaluated (see the padding sketch below)
- **Validation**: Invalid SMILES are detected automatically and handled in the metrics
- **Memory**: For large evaluations (>10k molecules), consider GPU acceleration with `device="cuda"`
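
If your generator produced fewer samples than required, a tiny hypothetical helper like `pad_smiles` below can bring the list to the right length by padding with `None`, which the metrics simply count as invalid:

```python
def pad_smiles(smiles: list[str | None], n: int) -> list[str | None]:
    # Hypothetical helper: pad with None (counted as invalid) or truncate
    # so the list has exactly n entries.
    return (smiles + [None] * max(0, n - len(smiles)))[:n]
```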

## Contributing

Contributions are welcome! Please feel free to submit a Pull Request. For major changes, please open an issue first to discuss what you would like to change.

## License

This project is licensed under the MIT License - see the [LICENSE](LICENSE) file for details.

## Acknowledgments

This benchmark suite implements and builds upon metrics from several important papers:

- [Moses: A Benchmarking Platform for Molecular Generation Models](https://arxiv.org/abs/1811.12823)
- [GuacaMol: Benchmarking Models for De Novo Molecular Design](https://arxiv.org/abs/1811.09621)
- [Fréchet ChemNet Distance: A Metric for Generative Models for Molecules](https://arxiv.org/abs/1803.09518)

            
