dbgym


Namedbgym JSON
Version 0.1.3 PyPI version JSON
download
home_page
SummaryDBGym: platform for ML research and application on databases
upload_time2023-10-26 17:23:42
maintainer
docs_urlNone
author
requires_python>=3.8
license
keywords
VCS
bugtrack_url
requirements numpy pandas requests scikit_learn tensorboard torch torch_geometric tqdm xgboost yacs
Travis-CI No Travis.
coveralls test coverage No coveralls.
            [pypi-image]: https://badge.fury.io/py/dbgym.svg
<!-- [pypi-image]: https://img.shields.io/pypi/v/dbgym -->
[pypi-url]: https://pypi.python.org/pypi/dbgym
[download-image]: https://static.pepy.tech/personalized-badge/dbgym?period=total&left_color=grey&left_text=PyPI%20Download
[download-url]: https://pepy.tech/project/dbgym
[arxiv-image]: https://img.shields.io/badge/arXiv-2310.16837-b31b1b.svg
[arxiv-url]: https://arxiv.org/abs/2310.16837
[testing-image]: https://github.com/JiaxuanYou/DBGym/actions/workflows/testing.yml/badge.svg
[testing-url]: https://github.com/JiaxuanYou/DBGym/actions/workflows/testing.yml
[linting-image]: https://github.com/JiaxuanYou/DBGym/actions/workflows/linting.yml/badge.svg
[linting-url]: https://github.com/JiaxuanYou/DBGym/actions/workflows/linting.yml
[docs-image]: https://readthedocs.org/projects/dbgym/badge/?version=latest
[docs-url]: https://dbgym.readthedocs.io/en/latest
[license-image]: https://img.shields.io/github/license/JiaxuanYou/DBGym
[license-url]: https://github.com/JiaxuanYou/DBGym/blob/main/LICENSE
<div align="center">

<a href="http://dbgym.readthedocs.io"><img align="center" src="docs/Logo.png" width="1000px"/>

[![PyPI Version][pypi-image]][pypi-url]
[![Downloads][download-image]][download-url]
[![arXiv][arxiv-image]][arxiv-url]
[![Testing Status][testing-image]][testing-url]
[![Linting Status][linting-image]][linting-url]
[![Docs Status][docs-image]][docs-url]
[![GitHub license][license-image]][license-url]

</div>

https://github.com/JiaxuanYou/DBGym/assets/49870769/cc4437eb-bdb6-4285-914b-892062d7d84e

# Overview

DBGym is a platform designed to facilitate ML research and application on databases.
With less than **5 lines of code**, you can provide the path of your database, write the predictive query you want, and DBGym will output the predictions along with your database.

DBGym is an ongoing research project serving the vibrant open-source community. We sincerely welcome new ideas, use cases, and pull requests to extend DBGym's capability.

# Installation

Prebuilt DBGym can be easily installed with `pip` (tested with Python 3.8 and above): 

```bash
pip install dbgym
```

Alternatively, you can install the latest DBGym version from our code repository:

```bash
git clone https://github.com/JiaxuanYou/DBGym.git
cd DBGym
pip install -e . 
```


# DBGym makes ML on Relational DBs simple

In the current release, DBGym focuses on relational databases (databases with multiple tables), which can be as simple as a directory of CSV or Parquet files. Here we show a simple example of what DBGym can help you achieve.

## Example DBGym use case

You may have the following CSV files in your directory `database`, each representing a table
```bash
 database
 ├── user.csv
 ├── item.csv
 └── trans.csv
```

For example, `user.csv` may look like this:

| _user | x | y | time |
| -------- | -------- | -------- | -------- |
| 0 | 0.1 | A | 2021/01/01
| 1 | 0.8 | A | 2022/02/01
| 2 | 0.5 | B | 2023/05/01
| 3 | 0.5 | **N/A** | 2024/08/01

Your goal is to **leverage the full database** (not just the `user.csv` table) to predict the unknown **N/A** values in column `y` in `user.csv`.

Let's pause here - the simple capability of predicting unknown column values is in fact extremely expressive. For example, 
- It can be used for classification or regression tasks
- It can be used for time series prediction tasks suppose the N/A values are associated with future time column values
- It can be used for recommendation tasks by predicting with respect to pairs of user and item IDs
- It can unify node, edge and graph-level prediction tasks on graphs by representing (heterogeneous) graphs as multiple tables
- ... 

This is a non-exhaustive list. By being creative, we can do so many things with the capability of predicting unknown column values.

## What DBGym can do for you

DBGym defines a simple declarative **predictive query** to describe the prediction task above, by specifying `table_name.column_name` in DBGym config. In the example above, the predictive query should be `user.y`, indicating that you want to set up an ML prediction task for column `y` in table `user`, based on the full database's information. DBGym will then automatically set up the ML prediction pipeline, including splitting the train/val/test sets, training an ML model, getting evaluation metrics for train/val/test sets, and returning `user_pred.csv` where any missing N/A values will be replaced with DBGym predictions.

To solve a given predictive query, DBGym provides a variety of ML models from tabular ML to graph ML communities, including: 

| DBGym data API | Model |
| -------- | -------- |
| Tabular | XGBoost, MLP | 
| Graph | GCN, GraphSAGE, GIN, GAT | 
| Heterogeneous graph | HGCN, HGT | 

You can also easily create your own models and register them to DBGym. We hope to integrate more ML models into DBGym, and we sincerely appreciate your pull requests!




# DBGym Quick Start

## 0 Your first DBGym experiment

With less than 5 lines of code, you can set up and carry out experiments using DBGym.

```Python
from dbgym.run import run
from dbgym.config import get_config

# Get default DBGym config
config = get_config()
stats = run(config)
```

The experiment statistics and predictions returned from the `run()` function and are further saved in the `output` directory by default.

## 1 Run DBGym experiments on RDBench datasets

For users who want to customize DBGym experiments, we provide two ways to customize configuration: using Python and/or using `YAML` config file. Please refer to [`dbgym/config.py`](dbgym/config.py) for all the available configurations.

**Customize configs with Python**

You can easily set customized config values in the Python code, e.g., picking an RDBench dataset and writing your favorite predictive query. For example:

```Python
from dbgym.run import run
from dbgym.config import get_config

config = get_config()
# point to an RDBench dataset. Will auto-download if not cached
config.dataset.name = 'rdb1-ather'
# predictive query. Format: table_name.column_name
config.dataset.query = 'entry_examination.Cholesterol'
stats = run(config)
```

**Customize configs with YAML file**

Alternatively, you can write a customized config in a YAML file, e.g., `config_gcn.yaml`, and load with DBGym:

```Python
from dbgym.run import run
from dbgym.config import get_config

config = get_config()
config.merge_from_file('config_gcn.yaml')
stats = run(config)
```

where the yaml `config_gcn.yaml` can be defined as follows:

```yaml
# Example Experiment Configuration for GCN

train:
  epoch: 200

model:
  name: 'GCN'
  hidden_dim: 128
  layer: 3

optim:
  lr: 0.001
```

## 2 Apply DBGym to your own database datasets

You can also easily apply DBGym to your own database datasets. To start, you can first organize your customized data as follows:

```
dataset_path
├── dataset_1
│   ├── user.csv
│   ├── item.csv
│   └── ...
├── dataset_2
└── ...
```

Within each CSV file, DBGym assumes that most of the table columns can be arbitrarily named, except for the key columns, where you should *follow the naming convention*, where the primary key column for table `table` is named as `_table`. For example, the column names for each CSV file may look like:
```bash
user.csv:	_user, x1, x2, ...
item.csv:	_item, Feature1, Feature2, ...
```
where the special columns begin with `_` indicate key columns, and the remaining columns (can be arbitrarily named) are regarded as feature columns. For example, `_user` is the primary key column for `user.csv`, which should save the unique ID for each user; other tables could also refer to user ID information, e.g., by adding `_user` column in `trans.csv`, in which case `_user` is the foreign key column in `trans.csv` that could have non-unique values.


Then, you can point DBGym to the dataset directory, specify the dataset you want, and write a simple predictive query by pointing to any feature column:

```Python
from dbgym.run import run
from dbgym.config import get_config

config = get_config()

# provide path to your dataset
config.dataset.dir = 'dataset_path'
config.dataset.name = 'dataset_1'
config.dataset.query = 'target.x1'
# (optional) set additional customized configs
config.merge_from_file('config_gcn.yaml')
stats = run(config)
```

Finally, DBGym will generate experiment logs and predictions in the `output` directory in `dataset_path` folder by default. The predictions for column `x1` of `target.csv` will be saved as `target_pred.csv`.

Alternatively, you can customize your output directory name by adding the following line

```Python
config.log_dir = 'output_path'
```



## 3 Include customized models to DBGym

You can further easily register your customized models to DBGym and run experiments with them. This is especially helpful if you want to benchmark your proposed model with DBGym. 
You may follow the instructions below to customize your own model and potentially contribute back to the DBGym repository via pull requests.

To start, instead of using the simple `pip install dbgym` method, you should:
```bash
git clone https://github.com/JiaxuanYou/DBGym.git
pip uninstall dbgym  # uninstall existing DBGym
pip install -e .  # install with developer mode
```

Then, you can customize your own model in `dbgym/contribute/your_model.py`:

```Python
from dbgym.register import register

class YourModel(nn.Module):
    ...

your_model_type = 'tabular_model' or 'graph_model'
register(your_model_type, 'your_model_name', YourModel)
```

Next, you can use DBGym with your customized model

```Python
from dbgym.run import run
from dbgym.config import get_config

config = get_config()
config.model.name = 'your_model_name'
stats = run(config)
```

# DBGym Feature Highlights

<div align="center">
<img align="center" src="docs/Overview.png" width="1000px" />
<b><br>Figure 1: An overview of DBGym design principle.</b>
</div>


**1. Unified Task Definition for Various Data Formats.**
- To meet the requirements of diverse users, DBGym provides 3 kinds of APIs: tabular data, homogeneous graphs and heterogeneous graphs.
- For all these data formats, we propose a unified task definition, enabling results comparison between models for different formats of data.

**2. Hierarchical Datasets with Comprehensive Experiments** 
- Via RDBench, DBGym provides 11 datasets with a range of scales, domains, and relationships. 
- These datasets are categorized into three groups based on their relationship complexity.
- Extensive experiments with 10 baselines are carried out on these datasets.

**3. Easy-to-use Interfaces with Robust Results** 
- Highly modularized pipeline.
- Reproducible experiment configuration.
- Scalable experiment management.
- Flexible user customization.
- Results reported are averaged over the same dataset and same task type (classification or regression).

DBGym has enabled RDBench, a user-friendly toolkit for benchmarking ML methods on relational databases. Please refer to our paper for more details: *[RDBench: ML Benchmark for Relational Databases](tbf)*, arXiv, Zizhao Zhang*, Yi Yang*, Lutong Zou*, He Wen*, Tao Feng, Jiaxuan You.


# Logging

## Logs and predictions

After running DBGym, the logs and predictions are saved to an output directory named as `output` by default. Moreover, the command line output will also indicate the path where logs and predictions are saved. For example

```
Logs and predictions are saved to Datasets/output/rdb2-bank/loan.Status_MLP_42_20231024_123456
```

## Visualization with tensorboard

To see the visualization of the training process, you can go into the output directory and run

```bash
tensorboard --logdir .
```

Then visit `localhost:6006`, or any other port you specified.

## Citation
If you find DBGym useful or relevant to your research, please kindly cite our paper:

```bibtex
@article{zhang2023rdbench,
  title={RDBench: ML Benchmark for Relational Databases}, 
  author={Zizhao, Zhang and Yi, Yang and Lutong, Zou and He, Wen and Tao, Feng and Jiaxuan, You},
  journal={arXiv preprint arXiv:2310.16837},
  year={2023}
}
```

            

Raw data

            {
    "_id": null,
    "home_page": "",
    "name": "dbgym",
    "maintainer": "",
    "docs_url": null,
    "requires_python": ">=3.8",
    "maintainer_email": "",
    "keywords": "",
    "author": "",
    "author_email": "DBGym Team <dbgym@googlegroups.com>",
    "download_url": "https://files.pythonhosted.org/packages/5c/b1/7cdc5fbadf111cf9c9a1e471db3125df6131136a4bbffc9ed70bc5a33552/dbgym-0.1.3.tar.gz",
    "platform": null,
    "description": "[pypi-image]: https://badge.fury.io/py/dbgym.svg\n<!-- [pypi-image]: https://img.shields.io/pypi/v/dbgym -->\n[pypi-url]: https://pypi.python.org/pypi/dbgym\n[download-image]: https://static.pepy.tech/personalized-badge/dbgym?period=total&left_color=grey&left_text=PyPI%20Download\n[download-url]: https://pepy.tech/project/dbgym\n[arxiv-image]: https://img.shields.io/badge/arXiv-2310.16837-b31b1b.svg\n[arxiv-url]: https://arxiv.org/abs/2310.16837\n[testing-image]: https://github.com/JiaxuanYou/DBGym/actions/workflows/testing.yml/badge.svg\n[testing-url]: https://github.com/JiaxuanYou/DBGym/actions/workflows/testing.yml\n[linting-image]: https://github.com/JiaxuanYou/DBGym/actions/workflows/linting.yml/badge.svg\n[linting-url]: https://github.com/JiaxuanYou/DBGym/actions/workflows/linting.yml\n[docs-image]: https://readthedocs.org/projects/dbgym/badge/?version=latest\n[docs-url]: https://dbgym.readthedocs.io/en/latest\n[license-image]: https://img.shields.io/github/license/JiaxuanYou/DBGym\n[license-url]: https://github.com/JiaxuanYou/DBGym/blob/main/LICENSE\n<div align=\"center\">\n\n<a href=\"http://dbgym.readthedocs.io\"><img align=\"center\" src=\"docs/Logo.png\" width=\"1000px\"/>\n\n[![PyPI Version][pypi-image]][pypi-url]\n[![Downloads][download-image]][download-url]\n[![arXiv][arxiv-image]][arxiv-url]\n[![Testing Status][testing-image]][testing-url]\n[![Linting Status][linting-image]][linting-url]\n[![Docs Status][docs-image]][docs-url]\n[![GitHub license][license-image]][license-url]\n\n</div>\n\nhttps://github.com/JiaxuanYou/DBGym/assets/49870769/cc4437eb-bdb6-4285-914b-892062d7d84e\n\n# Overview\n\nDBGym is a platform designed to facilitate ML research and application on databases.\nWith less than **5 lines of code**, you can provide the path of your database, write the predictive query you want, and DBGym will output the predictions along with your database.\n\nDBGym is an ongoing research project serving the vibrant open-source community. We sincerely welcome new ideas, use cases, and pull requests to extend DBGym's capability.\n\n# Installation\n\nPrebuilt DBGym can be easily installed with `pip` (tested with Python 3.8 and above): \n\n```bash\npip install dbgym\n```\n\nAlternatively, you can install the latest DBGym version from our code repository:\n\n```bash\ngit clone https://github.com/JiaxuanYou/DBGym.git\ncd DBGym\npip install -e . \n```\n\n\n# DBGym makes ML on Relational DBs simple\n\nIn the current release, DBGym focuses on relational databases (databases with multiple tables), which can be as simple as a directory of CSV or Parquet files. Here we show a simple example of what DBGym can help you achieve.\n\n## Example DBGym use case\n\nYou may have the following CSV files in your directory `database`, each representing a table\n```bash\n database\n \u251c\u2500\u2500 user.csv\n \u251c\u2500\u2500 item.csv\n \u2514\u2500\u2500 trans.csv\n```\n\nFor example, `user.csv` may look like this:\n\n| _user | x | y | time |\n| -------- | -------- | -------- | -------- |\n| 0 | 0.1 | A | 2021/01/01\n| 1 | 0.8 | A | 2022/02/01\n| 2 | 0.5 | B | 2023/05/01\n| 3 | 0.5 | **N/A** | 2024/08/01\n\nYour goal is to **leverage the full database** (not just the `user.csv` table) to predict the unknown **N/A** values in column `y` in `user.csv`.\n\nLet's pause here - the simple capability of predicting unknown column values is in fact extremely expressive. For example, \n- It can be used for classification or regression tasks\n- It can be used for time series prediction tasks suppose the N/A values are associated with future time column values\n- It can be used for recommendation tasks by predicting with respect to pairs of user and item IDs\n- It can unify node, edge and graph-level prediction tasks on graphs by representing (heterogeneous) graphs as multiple tables\n- ... \n\nThis is a non-exhaustive list. By being creative, we can do so many things with the capability of predicting unknown column values.\n\n## What DBGym can do for you\n\nDBGym defines a simple declarative **predictive query** to describe the prediction task above, by specifying `table_name.column_name` in DBGym config. In the example above, the predictive query should be `user.y`, indicating that you want to set up an ML prediction task for column `y` in table `user`, based on the full database's information. DBGym will then automatically set up the ML prediction pipeline, including splitting the train/val/test sets, training an ML model, getting evaluation metrics for train/val/test sets, and returning `user_pred.csv` where any missing N/A values will be replaced with DBGym predictions.\n\nTo solve a given predictive query, DBGym provides a variety of ML models from tabular ML to graph ML communities, including: \n\n| DBGym data API | Model |\n| -------- | -------- |\n| Tabular | XGBoost, MLP | \n| Graph | GCN, GraphSAGE, GIN, GAT | \n| Heterogeneous graph | HGCN, HGT | \n\nYou can also easily create your own models and register them to DBGym. We hope to integrate more ML models into DBGym, and we sincerely appreciate your pull requests!\n\n\n\n\n# DBGym Quick Start\n\n## 0 Your first DBGym experiment\n\nWith less than 5 lines of code, you can set up and carry out experiments using DBGym.\n\n```Python\nfrom dbgym.run import run\nfrom dbgym.config import get_config\n\n# Get default DBGym config\nconfig = get_config()\nstats = run(config)\n```\n\nThe experiment statistics and predictions returned from the `run()` function and are further saved in the `output` directory by default.\n\n## 1 Run DBGym experiments on RDBench datasets\n\nFor users who want to customize DBGym experiments, we provide two ways to customize configuration: using Python and/or using `YAML` config file. Please refer to [`dbgym/config.py`](dbgym/config.py) for all the available configurations.\n\n**Customize configs with Python**\n\nYou can easily set customized config values in the Python code, e.g., picking an RDBench dataset and writing your favorite predictive query. For example:\n\n```Python\nfrom dbgym.run import run\nfrom dbgym.config import get_config\n\nconfig = get_config()\n# point to an RDBench dataset. Will auto-download if not cached\nconfig.dataset.name = 'rdb1-ather'\n# predictive query. Format: table_name.column_name\nconfig.dataset.query = 'entry_examination.Cholesterol'\nstats = run(config)\n```\n\n**Customize configs with YAML file**\n\nAlternatively, you can write a customized config in a YAML file, e.g., `config_gcn.yaml`, and load with DBGym:\n\n```Python\nfrom dbgym.run import run\nfrom dbgym.config import get_config\n\nconfig = get_config()\nconfig.merge_from_file('config_gcn.yaml')\nstats = run(config)\n```\n\nwhere the yaml `config_gcn.yaml` can be defined as follows:\n\n```yaml\n# Example Experiment Configuration for GCN\n\ntrain:\n  epoch: 200\n\nmodel:\n  name: 'GCN'\n  hidden_dim: 128\n  layer: 3\n\noptim:\n  lr: 0.001\n```\n\n## 2 Apply DBGym to your own database datasets\n\nYou can also easily apply DBGym to your own database datasets. To start, you can first organize your customized data as follows:\n\n```\ndataset_path\n\u251c\u2500\u2500 dataset_1\n\u2502   \u251c\u2500\u2500 user.csv\n\u2502   \u251c\u2500\u2500 item.csv\n\u2502   \u2514\u2500\u2500 ...\n\u251c\u2500\u2500 dataset_2\n\u2514\u2500\u2500 ...\n```\n\nWithin each CSV file, DBGym assumes that most of the table columns can be arbitrarily named, except for the key columns, where you should *follow the naming convention*, where the primary key column for table `table` is named as `_table`. For example, the column names for each CSV file may look like:\n```bash\nuser.csv:\t_user, x1, x2, ...\nitem.csv:\t_item, Feature1, Feature2, ...\n```\nwhere the special columns begin with `_` indicate key columns, and the remaining columns (can be arbitrarily named) are regarded as feature columns. For example, `_user` is the primary key column for `user.csv`, which should save the unique ID for each user; other tables could also refer to user ID information, e.g., by adding `_user` column in `trans.csv`, in which case `_user` is the foreign key column in `trans.csv` that could have non-unique values.\n\n\nThen, you can point DBGym to the dataset directory, specify the dataset you want, and write a simple predictive query by pointing to any feature column:\n\n```Python\nfrom dbgym.run import run\nfrom dbgym.config import get_config\n\nconfig = get_config()\n\n# provide path to your dataset\nconfig.dataset.dir = 'dataset_path'\nconfig.dataset.name = 'dataset_1'\nconfig.dataset.query = 'target.x1'\n# (optional) set additional customized configs\nconfig.merge_from_file('config_gcn.yaml')\nstats = run(config)\n```\n\nFinally, DBGym will generate experiment logs and predictions in the `output` directory in `dataset_path` folder by default. The predictions for column `x1` of `target.csv` will be saved as `target_pred.csv`.\n\nAlternatively, you can customize your output directory name by adding the following line\n\n```Python\nconfig.log_dir = 'output_path'\n```\n\n\n\n## 3 Include customized models to DBGym\n\nYou can further easily register your customized models to DBGym and run experiments with them. This is especially helpful if you want to benchmark your proposed model with DBGym. \nYou may follow the instructions below to customize your own model and potentially contribute back to the DBGym repository via pull requests.\n\nTo start, instead of using the simple `pip install dbgym` method, you should:\n```bash\ngit clone https://github.com/JiaxuanYou/DBGym.git\npip uninstall dbgym  # uninstall existing DBGym\npip install -e .  # install with developer mode\n```\n\nThen, you can customize your own model in `dbgym/contribute/your_model.py`:\n\n```Python\nfrom dbgym.register import register\n\nclass YourModel(nn.Module):\n    ...\n\nyour_model_type = 'tabular_model' or 'graph_model'\nregister(your_model_type, 'your_model_name', YourModel)\n```\n\nNext, you can use DBGym with your customized model\n\n```Python\nfrom dbgym.run import run\nfrom dbgym.config import get_config\n\nconfig = get_config()\nconfig.model.name = 'your_model_name'\nstats = run(config)\n```\n\n# DBGym Feature Highlights\n\n<div align=\"center\">\n<img align=\"center\" src=\"docs/Overview.png\" width=\"1000px\" />\n<b><br>Figure 1: An overview of DBGym design principle.</b>\n</div>\n\n\n**1. Unified Task Definition for Various Data Formats.**\n- To meet the requirements of diverse users, DBGym provides 3 kinds of APIs: tabular data, homogeneous graphs and heterogeneous graphs.\n- For all these data formats, we propose a unified task definition, enabling results comparison between models for different formats of data.\n\n**2. Hierarchical Datasets with Comprehensive Experiments** \n- Via RDBench, DBGym provides 11 datasets with a range of scales, domains, and relationships. \n- These datasets are categorized into three groups based on their relationship complexity.\n- Extensive experiments with 10 baselines are carried out on these datasets.\n\n**3. Easy-to-use Interfaces with Robust Results** \n- Highly modularized pipeline.\n- Reproducible experiment configuration.\n- Scalable experiment management.\n- Flexible user customization.\n- Results reported are averaged over the same dataset and same task type (classification or regression).\n\nDBGym has enabled RDBench, a user-friendly toolkit for benchmarking ML methods on relational databases. Please refer to our paper for more details: *[RDBench: ML Benchmark for Relational Databases](tbf)*, arXiv, Zizhao Zhang*, Yi Yang*, Lutong Zou*, He Wen*, Tao Feng, Jiaxuan You.\n\n\n# Logging\n\n## Logs and predictions\n\nAfter running DBGym, the logs and predictions are saved to an output directory named as `output` by default. Moreover, the command line output will also indicate the path where logs and predictions are saved. For example\n\n```\nLogs and predictions are saved to Datasets/output/rdb2-bank/loan.Status_MLP_42_20231024_123456\n```\n\n## Visualization with tensorboard\n\nTo see the visualization of the training process, you can go into the output directory and run\n\n```bash\ntensorboard --logdir .\n```\n\nThen visit `localhost:6006`, or any other port you specified.\n\n## Citation\nIf you find DBGym useful or relevant to your research, please kindly cite our paper:\n\n```bibtex\n@article{zhang2023rdbench,\n  title={RDBench: ML Benchmark for Relational Databases}, \n  author={Zizhao, Zhang and Yi, Yang and Lutong, Zou and He, Wen and Tao, Feng and Jiaxuan, You},\n  journal={arXiv preprint arXiv:2310.16837},\n  year={2023}\n}\n```\n",
    "bugtrack_url": null,
    "license": "",
    "summary": "DBGym: platform for ML research and application on databases",
    "version": "0.1.3",
    "project_urls": {
        "Bug Tracker": "https://github.com/JiaxuanYou/DBGym/issues",
        "Homepage": "https://github.com/JiaxuanYou/DBGym"
    },
    "split_keywords": [],
    "urls": [
        {
            "comment_text": "",
            "digests": {
                "blake2b_256": "ed71d7638a589997a47921e8bc10b163614b712ef98a35f0a113726c836a4781",
                "md5": "577c3f332f9d3c4f903536a3f5503882",
                "sha256": "92db3a7f7d3675debc6d3012c7f149c081a8d0ce76b636f3e77fd43e8b43d5a3"
            },
            "downloads": -1,
            "filename": "dbgym-0.1.3-py3-none-any.whl",
            "has_sig": false,
            "md5_digest": "577c3f332f9d3c4f903536a3f5503882",
            "packagetype": "bdist_wheel",
            "python_version": "py3",
            "requires_python": ">=3.8",
            "size": 29681,
            "upload_time": "2023-10-26T17:23:38",
            "upload_time_iso_8601": "2023-10-26T17:23:38.694768Z",
            "url": "https://files.pythonhosted.org/packages/ed/71/d7638a589997a47921e8bc10b163614b712ef98a35f0a113726c836a4781/dbgym-0.1.3-py3-none-any.whl",
            "yanked": false,
            "yanked_reason": null
        },
        {
            "comment_text": "",
            "digests": {
                "blake2b_256": "5cb17cdc5fbadf111cf9c9a1e471db3125df6131136a4bbffc9ed70bc5a33552",
                "md5": "c353b23d751f8225e9894b8250bf3a65",
                "sha256": "00ac765c4334ab21926889c38d49f63e053e1eb88b42d17ddfd490841767027c"
            },
            "downloads": -1,
            "filename": "dbgym-0.1.3.tar.gz",
            "has_sig": false,
            "md5_digest": "c353b23d751f8225e9894b8250bf3a65",
            "packagetype": "sdist",
            "python_version": "source",
            "requires_python": ">=3.8",
            "size": 3669798,
            "upload_time": "2023-10-26T17:23:42",
            "upload_time_iso_8601": "2023-10-26T17:23:42.973260Z",
            "url": "https://files.pythonhosted.org/packages/5c/b1/7cdc5fbadf111cf9c9a1e471db3125df6131136a4bbffc9ed70bc5a33552/dbgym-0.1.3.tar.gz",
            "yanked": false,
            "yanked_reason": null
        }
    ],
    "upload_time": "2023-10-26 17:23:42",
    "github": true,
    "gitlab": false,
    "bitbucket": false,
    "codeberg": false,
    "github_user": "JiaxuanYou",
    "github_project": "DBGym",
    "travis_ci": false,
    "coveralls": false,
    "github_actions": true,
    "requirements": [
        {
            "name": "numpy",
            "specs": []
        },
        {
            "name": "pandas",
            "specs": []
        },
        {
            "name": "requests",
            "specs": []
        },
        {
            "name": "scikit_learn",
            "specs": []
        },
        {
            "name": "tensorboard",
            "specs": []
        },
        {
            "name": "torch",
            "specs": []
        },
        {
            "name": "torch_geometric",
            "specs": []
        },
        {
            "name": "tqdm",
            "specs": []
        },
        {
            "name": "xgboost",
            "specs": []
        },
        {
            "name": "yacs",
            "specs": []
        }
    ],
    "lcname": "dbgym"
}
        
Elapsed time: 0.40378s