dcmbench


Name: dcmbench
Version: 0.1.2
Home page: https://github.com/carlosguirado/dcmbench
Summary: A comprehensive benchmarking framework for discrete choice models
Upload time: 2025-08-29 17:35:21
Author: Carlos Guirado
Requires Python: >=3.8
License: MIT
Keywords: discrete choice, benchmarking, transportation, econometrics, machine learning

# Discrete Choice Model Benchmarking (DCMBench)

A comprehensive Python package for benchmarking, analyzing, and validating discrete choice models for transportation mode choice analysis. DCMBench provides a unified framework for comparing different model specifications, conducting sensitivity analysis, and visualizing results.

## Installation

You can install DCMBench using pip:

```bash
pip install dcmbench
```

## Key Features

### Model Estimation and Benchmarking

- **Multiple Model Types**: Support for Multinomial Logit (MNL), Nested Logit (NL), and Mixed Logit (ML) models
- **Standardized Metrics**: Compare models using log-likelihood, rho-squared, prediction accuracy, and market share
- **Cross-validation**: Evaluate model performance on training and testing datasets
- **Visualization**: Generate comparative plots showing model performance across different metrics

```python
from dcmbench.model_benchmarker import Benchmarker
from dcmbench.datasets import fetch_data

# Load dataset (automatically downloads if not in local cache)
data = fetch_data("swissmetro_dataset")

# Register a model and run the benchmark.
# `models` is a placeholder: define your model object(s) before this call.
benchmarker = Benchmarker()
benchmarker.register_model(models, "Model Name")
results = benchmarker.run_benchmark(data, choice_column="CHOICE")
benchmarker.print_comparison()
```
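To make the standardized metrics listed above concrete, here is a minimal sketch of how log-likelihood, rho-squared, prediction accuracy, and market shares can be computed from a matrix of predicted choice probabilities. It is not taken from the DCMBench source: the function name `fit_metrics` and the toy data are hypothetical, and the null model assumes every alternative is available to every respondent.

```python
import math

def fit_metrics(probs, chosen):
    """Compute common fit metrics (hypothetical helper, not part of DCMBench).

    probs  : one list of choice probabilities per observation
    chosen : index of the chosen alternative for each observation
    """
    n_obs = len(chosen)
    n_alt = len(probs[0])
    # Log-likelihood: sum of log-probabilities of the chosen alternatives
    ll = sum(math.log(p[c]) for p, c in zip(probs, chosen))
    # Null log-likelihood: equal probability for every alternative
    ll_null = n_obs * math.log(1.0 / n_alt)
    rho2 = 1.0 - ll / ll_null
    # Accuracy: share of observations whose most likely alternative was chosen
    hits = sum(
        1 for p, c in zip(probs, chosen)
        if max(range(n_alt), key=p.__getitem__) == c
    )
    # Market shares: predicted = mean probability, observed = choice frequency
    predicted = [sum(p[j] for p in probs) / n_obs for j in range(n_alt)]
    observed = [chosen.count(j) / n_obs for j in range(n_alt)]
    return {
        "log_likelihood": ll,
        "rho_squared": rho2,
        "accuracy": hits / n_obs,
        "predicted_shares": predicted,
        "observed_shares": observed,
    }

# Toy example with four observations and three alternatives
toy_probs = [[0.7, 0.2, 0.1], [0.1, 0.8, 0.1], [0.3, 0.3, 0.4], [0.6, 0.3, 0.1]]
toy_chosen = [0, 1, 2, 1]
metrics = fit_metrics(toy_probs, toy_chosen)
print(metrics)
```

Comparing `predicted_shares` with `observed_shares` gives a simple view of market share recovery, which is a different question from per-observation accuracy.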

### Advanced Analysis Capabilities

- **Sensitivity Analysis**: Evaluate how model predictions change with variations in key variables
  - Simulate changes in travel times, costs, and other attributes
  - Generate plots showing the evolution of market shares under different scenarios

- **Individual-Level Parameters**: For Mixed Logit models, calculate and visualize:
  - Individual-specific parameter distributions using Bayesian approaches
  - Value of Time (VOT) distributions across the population
  - Heterogeneity in preference structures

- **Model Calibration**: Automatically calibrate Alternative Specific Constants (ASCs) to match observed market shares
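
One common way to calibrate ASCs is the iterative adjustment `ASC_j += ln(observed_j / predicted_j)`, repeated until predicted shares match the targets. The sketch below applies it to a plain multinomial logit with fixed systematic utilities. Whether DCMBench uses this exact procedure is an assumption; the function names and the toy utilities are hypothetical.

```python
import math

def mnl_probs(utilities):
    """Multinomial logit probabilities, with the max subtracted for stability."""
    top = max(utilities)
    exps = [math.exp(u - top) for u in utilities]
    total = sum(exps)
    return [e / total for e in exps]

def predicted_shares(base_utils, ascs):
    """Average choice probabilities over all observations."""
    totals = [0.0] * len(ascs)
    for row in base_utils:
        probs = mnl_probs([v + a for v, a in zip(row, ascs)])
        for j, p in enumerate(probs):
            totals[j] += p
    return [t / len(base_utils) for t in totals]

def calibrate_ascs(base_utils, target_shares, tol=1e-8, max_iter=500):
    """Adjust ASCs until predicted shares match the target shares."""
    ascs = [0.0] * len(target_shares)
    for _ in range(max_iter):
        shares = predicted_shares(base_utils, ascs)
        if max(abs(s - t) for s, t in zip(shares, target_shares)) < tol:
            break
        ascs = [a + math.log(t / s) for a, s, t in zip(ascs, shares, target_shares)]
        # Normalise so the first alternative is the reference (ASC = 0)
        ascs = [a - ascs[0] for a in ascs]
    return ascs

# Toy systematic utilities (without constants) for four trips, three modes
toy_utils = [
    [-1.0, -0.5, -2.0],
    [-0.2, -1.5, -1.0],
    [-0.8, -0.8, -0.3],
    [-1.2, -0.1, -0.9],
]
targets = [0.5, 0.3, 0.2]
ascs = calibrate_ascs(toy_utils, targets)
print(ascs, predicted_shares(toy_utils, ascs))
```

Only differences between ASCs are identified, so the sketch pins the first constant at zero after every update.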

### Visualization and Reporting

- **Market Share Analysis**: Compare predicted vs. observed mode shares
- **Performance Plots**: Visualize how different models perform across multiple datasets
- **Parameter Distributions**: Plot distributions of random parameters and derived metrics like VOT
- **Sensitivity Curves**: Show how predicted mode shares change with variations in key variables

## Supported Datasets

- **Swissmetro** (`swissmetro_dataset`): Swiss inter-city travel mode choice
- **London Transport** (`ltds_dataset`): London Travel Demand Survey with urban mode choices
- **ModeCanada** (`modecanada_dataset`): Canadian inter-city travel dataset

Datasets are automatically downloaded from the [dcmbench-datasets](https://github.com/carlosguirado/dcmbench-datasets) repository on first use and cached locally:

```python
from dcmbench.datasets import fetch_data

# Use default cache location (~/.dcmbench/datasets)
data = fetch_data("swissmetro_dataset")

# Specify custom cache location
data = fetch_data("swissmetro_dataset", local_cache_dir="/path/to/cache")

# Get features and target separately
X, y = fetch_data("swissmetro_dataset", return_X_y=True)
```
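
The download-on-first-use behaviour described above follows a familiar caching pattern. The following sketch illustrates that pattern only and does not reproduce the internals of `fetch_data`: the helper `cached_path`, the injected `download` callable, and the `.csv` file extension are all assumptions made for the example.

```python
import tempfile
from pathlib import Path

def cached_path(name, download, cache_dir=None):
    """Return the local path of a dataset, downloading it only if missing.

    `download` is any callable that takes the dataset name and returns bytes.
    This helper is hypothetical and is not part of the DCMBench API.
    """
    root = Path(cache_dir) if cache_dir else Path.home() / ".dcmbench" / "datasets"
    root.mkdir(parents=True, exist_ok=True)
    target = root / f"{name}.csv"  # assumed file layout
    if not target.exists():
        target.write_bytes(download(name))
    return target

# Demonstration with a fake downloader and a temporary cache directory
download_calls = []

def fake_download(name):
    download_calls.append(name)
    return b"CHOICE,TT,CO\n1,30,10\n"

with tempfile.TemporaryDirectory() as tmp:
    first = cached_path("demo_dataset", fake_download, cache_dir=tmp)
    second = cached_path("demo_dataset", fake_download, cache_dir=tmp)
    same_path = first == second
    cached_bytes = second.read_bytes()

print(download_calls)  # the downloader ran only once
```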

## Example Applications

### Benchmarking Multiple Models

The package includes tools to benchmark multiple model types across different datasets:

```bash
# Run benchmark_all_models.py to compare models across datasets
python benchmark_all_models.py
```

This generates comparative visualizations showing how different model types perform across datasets, plotting metrics like choice accuracy and market share accuracy against model fit.

### Sensitivity Analysis

Analyze how changes in key variables affect predicted mode shares:

```bash
# Run sensitivity analysis on ModeCanada models
python sensitivity_analysis.py
```

This creates plots showing how mode shares evolve as you vary inputs such as:
- Travel costs for different modes
- Travel times
- Service frequencies
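
The idea behind such a sensitivity curve can be sketched independently of the script: scale one attribute, recompute choice probabilities, and record the resulting shares. The example below uses a plain multinomial logit with hypothetical coefficients and toy trips. None of the names come from `sensitivity_analysis.py`.

```python
import math

# Hypothetical coefficients, chosen only for illustration
BETA_COST = -0.05
BETA_TIME = -0.02

def mode_shares(trips, mode, cost_scale):
    """Average MNL probabilities after scaling the cost of one mode."""
    modes = list(trips[0])
    totals = {m: 0.0 for m in modes}
    for trip in trips:
        utils = {}
        for m, (cost, time) in trip.items():
            scaled = cost * cost_scale if m == mode else cost
            utils[m] = BETA_COST * scaled + BETA_TIME * time
        top = max(utils.values())
        exps = {m: math.exp(u - top) for m, u in utils.items()}
        denom = sum(exps.values())
        for m in modes:
            totals[m] += exps[m] / denom
    return {m: totals[m] / len(trips) for m in modes}

# Each trip maps a mode to (cost, travel time in minutes); toy values
trips = [
    {"car": (20.0, 60.0), "train": (30.0, 50.0), "bus": (10.0, 90.0)},
    {"car": (35.0, 120.0), "train": (45.0, 100.0), "bus": (20.0, 160.0)},
]
scenarios = [0.5, 1.0, 1.5, 2.0]
curve = [mode_shares(trips, "train", s)["train"] for s in scenarios]
for scale, share in zip(scenarios, curve):
    print(f"train cost x{scale:.1f}: train share = {share:.3f}")
```

Plotting `curve` against `scenarios` gives one sensitivity curve; repeating the loop for travel time or frequency gives the others.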

### Individual Parameter Analysis

For Mixed Logit models, analyze individual-level parameters and VOT:

```bash
# Generate individual parameter distributions for ModeCanada
python plot_individual_parameters_canada.py
```

This calculates individual-specific parameters using Bayesian conditioning and produces:
- Distributions of time and cost parameters
- Value of Time (VOT) distributions
- Summary statistics for preference heterogeneity
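
Bayesian conditioning of this kind is usually computed by simulation: draw coefficients from the estimated population distribution, weight each draw by the likelihood of the person's observed choices, and take the weighted mean. The sketch below shows this for a single normally distributed time coefficient with a fixed cost coefficient. All numbers and function names are hypothetical, and the script in the repository may be organised differently.

```python
import math
import random

def choice_prob(beta_time, beta_cost, alts, chosen):
    """Logit probability of the chosen alternative; alts = [(time, cost), ...]."""
    utils = [beta_time * t + beta_cost * c for t, c in alts]
    top = max(utils)
    exps = [math.exp(u - top) for u in utils]
    return exps[chosen] / sum(exps)

def conditional_mean_beta(choices, mean, sd, beta_cost, n_draws=2000, seed=0):
    """Simulated E[beta_time | observed choices] for one individual."""
    rng = random.Random(seed)
    draws = [rng.gauss(mean, sd) for _ in range(n_draws)]
    weights = []
    for beta in draws:
        likelihood = 1.0
        for alts, chosen in choices:
            likelihood *= choice_prob(beta, beta_cost, alts, chosen)
        weights.append(likelihood)
    return sum(b * w for b, w in zip(draws, weights)) / sum(weights)

# Two alternatives: fast but expensive (index 0) vs. slow but cheap (index 1)
alts = [(30.0, 10.0), (60.0, 5.0)]
BETA_COST = -0.1               # hypothetical fixed cost coefficient
POP_MEAN, POP_SD = -0.05, 0.02  # hypothetical population distribution

fast_chooser = [(alts, 0)] * 5
slow_chooser = [(alts, 1)] * 5
beta_fast = conditional_mean_beta(fast_chooser, POP_MEAN, POP_SD, BETA_COST)
beta_slow = conditional_mean_beta(slow_chooser, POP_MEAN, POP_SD, BETA_COST)

# Value of time = ratio of the time and cost coefficients (cost units per minute)
print("VOT, fast chooser:", beta_fast / BETA_COST)
print("VOT, slow chooser:", beta_slow / BETA_COST)
```

The person who repeatedly pays more to save time ends up with a more negative conditional time coefficient, and therefore a higher implied VOT, than the person who repeatedly picks the cheaper option.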

## Requirements

- Python >=3.8
- NumPy >=2.0.0
- Pandas >=2.0.0
- Biogeme >=3.2.14
- Matplotlib >=3.0.0
- Requests >=2.25.0
- SciPy (for statistical functions)
- Seaborn (for advanced visualizations)

## License

This project is licensed under the MIT License - see the LICENSE file for details.

## Contributing

We welcome contributions! Please see our contributing guidelines for details.

### Adding New Datasets

To add a new dataset to the DCMBench ecosystem:

1. Fork the [dcmbench-datasets](https://github.com/carlosguirado/dcmbench-datasets) repository
2. Add your dataset following the structure guidelines in the repository's CONTRIBUTING.md file
3. Submit a pull request to the dcmbench-datasets repository
4. Update the metadata.json file in the main DCMBench package to include your dataset information

This design allows the package to remain lightweight while providing access to a growing collection of transportation mode choice datasets.

            

Raw data

            {
    "_id": null,
    "home_page": "https://github.com/carlosguirado/dcmbench",
    "name": "dcmbench",
    "maintainer": null,
    "docs_url": null,
    "requires_python": ">=3.8",
    "maintainer_email": "Carlos Guirado <your.email@example.com>",
    "keywords": "discrete choice, benchmarking, transportation, econometrics, machine learning",
    "author": "Carlos Guirado",
    "author_email": "Carlos Guirado <your.email@example.com>",
    "download_url": "https://files.pythonhosted.org/packages/60/85/c08e972c2dd2f6463abcd1e4ef939a27dc6970339a05070fb1527555d999/dcmbench-0.1.2.tar.gz",
    "platform": null,
    "bugtrack_url": null,
    "license": "MIT",
    "summary": "A comprehensive benchmarking framework for discrete choice models",
    "version": "0.1.2",
    "project_urls": {
        "Bug Tracker": "https://github.com/carlosguirado/dcmbench/issues",
        "Documentation": "https://dcmbench.readthedocs.io",
        "Homepage": "https://github.com/carlosguirado/dcmbench",
        "Repository": "https://github.com/carlosguirado/dcmbench.git"
    },
    "split_keywords": [
        "discrete choice",
        " benchmarking",
        " transportation",
        " econometrics",
        " machine learning"
    ],
    "urls": [
        {
            "comment_text": null,
            "digests": {
                "blake2b_256": "16b8bae189cbe342e5fb657162fba14e458d9337b5bfe108a84cd2b2bc757d4c",
                "md5": "5a413fd24338699b83adf939a83a29f1",
                "sha256": "175e91b325a6df5b6af8f4dc8d281764a2791312f229f87a807509dab1dcc102"
            },
            "downloads": -1,
            "filename": "dcmbench-0.1.2-py3-none-any.whl",
            "has_sig": false,
            "md5_digest": "5a413fd24338699b83adf939a83a29f1",
            "packagetype": "bdist_wheel",
            "python_version": "py3",
            "requires_python": ">=3.8",
            "size": 127458,
            "upload_time": "2025-08-29T17:35:20",
            "upload_time_iso_8601": "2025-08-29T17:35:20.179133Z",
            "url": "https://files.pythonhosted.org/packages/16/b8/bae189cbe342e5fb657162fba14e458d9337b5bfe108a84cd2b2bc757d4c/dcmbench-0.1.2-py3-none-any.whl",
            "yanked": false,
            "yanked_reason": null
        },
        {
            "comment_text": null,
            "digests": {
                "blake2b_256": "6085c08e972c2dd2f6463abcd1e4ef939a27dc6970339a05070fb1527555d999",
                "md5": "70c7bb6f93a7d3641e69ea7744231619",
                "sha256": "64a117495eb6a265b9c53898cf8100c07b8423988eb2ef648c6a94ba3f5be6df"
            },
            "downloads": -1,
            "filename": "dcmbench-0.1.2.tar.gz",
            "has_sig": false,
            "md5_digest": "70c7bb6f93a7d3641e69ea7744231619",
            "packagetype": "sdist",
            "python_version": "source",
            "requires_python": ">=3.8",
            "size": 112351,
            "upload_time": "2025-08-29T17:35:21",
            "upload_time_iso_8601": "2025-08-29T17:35:21.531954Z",
            "url": "https://files.pythonhosted.org/packages/60/85/c08e972c2dd2f6463abcd1e4ef939a27dc6970339a05070fb1527555d999/dcmbench-0.1.2.tar.gz",
            "yanked": false,
            "yanked_reason": null
        }
    ],
    "upload_time": "2025-08-29 17:35:21",
    "github": true,
    "gitlab": false,
    "bitbucket": false,
    "codeberg": false,
    "github_user": "carlosguirado",
    "github_project": "dcmbench",
    "github_not_found": true,
    "lcname": "dcmbench"
}
        