betfairdatabase

- Name: betfairdatabase
- Version: 1.1.0
- Summary: Turns a collection of historical Betfair data into a queryable SQL database.
- Upload time: 2024-03-11 15:22:08
- Requires Python: >=3.10
- License: MIT License
- Keywords: betfair, trading, betting, database

# betfair-database
[![test](https://github.com/mzaja/betfair-database/actions/workflows/test.yml/badge.svg?branch=main)](https://github.com/mzaja/betfair-database/actions/workflows/test.yml) [![Coverage Status](https://coveralls.io/repos/github/mzaja/betfair-database/badge.svg?branch=main)](https://coveralls.io/github/mzaja/betfair-database?branch=main) ![PyPI version](https://img.shields.io/pypi/v/betfairdatabase) ![Python version](https://img.shields.io/pypi/pyversions/betfairdatabase) ![License](https://img.shields.io/github/license/mzaja/betfair-database)

Turns a collection of historical Betfair data into a queryable SQL database.

## Installation
Install the package from **PyPI**:
```bash
pip install betfairdatabase
```

## Usage
### Getting started
1. Index the folder holding historical Betfair data to turn it into a database.
2. Use SQL queries to select data.

```py
import betfairdatabase as bfdb

path_to_data = "./my_betfair_data"
bfdb.index(path_to_data)  # Create an index to convert the folder into a database

# Select all greyhound races in Sheffield
dataset = bfdb.select(
    path_to_data, where="eventTypeId='4339' AND eventVenue='Sheffield'"
)
for market in dataset:
    print(
        market["marketDataFilePath"],  # Path to stream data file
        market["marketCatalogueFilePath"],  # Path to market catalogue file
    )
```

The historical data can be grouped and divided using any subfolder hierarchy, but it must follow this convention:

1. The market catalogue is stored in a JSON file named `<market id>.json`.
2. The market data file (containing stream data) is stored in the same folder as the market catalogue file. It shares the basename `<market id>` and either ends with `.zip`, `.gz` or `.bz2`, or has no extension (uncompressed data).

A sample database structure is shown below:
```
my_betfair_data/
└── arbitrary_folder/
    ├── 1.22334455.json  # Market catalogue file
    ├── 1.22334455  # Uncompressed market data file
    ├── 1.55667788.json  # Market catalogue file
    └── 1.55667788.zip  # Compressed market data file
```
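
The naming convention above can be checked before indexing. The following is a minimal sketch using only the standard library. `find_market_pairs` and `DATA_SUFFIXES` are hypothetical names introduced for illustration; they are not part of `betfairdatabase`, and the package's own indexing logic may differ.

```python
from pathlib import Path

# Extensions a market data file may carry, per the convention above.
# The empty string stands for an uncompressed file with no extension.
DATA_SUFFIXES = ("", ".zip", ".gz", ".bz2")


def find_market_pairs(root):
    """Yield (catalogue_path, data_path) for each complete market under root.

    Hypothetical helper for illustration only.
    """
    for catalogue in sorted(Path(root).rglob("*.json")):
        market_id = catalogue.stem  # "1.22334455.json" -> "1.22334455"
        for suffix in DATA_SUFFIXES:
            data_file = catalogue.with_name(market_id + suffix)
            if data_file.is_file():
                yield catalogue, data_file
                break  # Stop at the first matching data file
```

Catalogue files without a matching data file are simply skipped here, which makes the helper handy for spotting incomplete captures.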

### Retrieving data
The `select()` method accepts the following arguments:
- `database_dir`: Main directory of the database initialised with `index()`.
- `columns`: A list of columns (field names) to retrieve. If omitted, all columns are returned. View a list of available columns by calling `betfairdatabase.columns()`.
- `where`: SQL condition used to filter the results, written as the body of a `WHERE` clause (see the examples below). If omitted, all markets are returned.
- `limit`: Maximum number of results to return. If omitted, all results are returned.
- `return_dict`: If `True` (default), each result is returned as a dictionary where keys are column names and values are data. If `False`, each result is returned as a tuple containing only data. The second option is faster but makes data harder to work with.

Below are several examples of selecting and filtering data:

```py
import betfairdatabase as bfdb

path_to_data = "./my_betfair_data"

# Return all market ids and paths to data files in the database
bfdb.select(path_to_data, columns=["marketId", "marketDataFilePath"])

# Return full market metadata for horse racing win markets
bfdb.select(path_to_data, where="eventTypeId='7' AND marketType='WIN'")

# Return full market metadata for a maximum of 100 BSP markets
bfdb.select(path_to_data, where="bspMarket=true", limit=100)

# Return a maximum of 250 data file paths for horse and greyhound racing
bfdb.select(
    path_to_data,
    columns=["marketDataFilePath"],
    where="eventTypeId IN ('7', '4339') AND marketType='WIN'",
    limit=250,
)
```
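
To see how `columns`, `where` and `limit` relate to plain SQL, here is a sketch against an in-memory SQLite table. The table name `markets`, the sample rows and the use of SQLite are assumptions made for illustration; they say nothing definitive about how `betfairdatabase` stores its index internally. The last two lines also illustrate the trade-off behind `return_dict`: bare tuples versus dictionaries keyed by column name.

```python
import sqlite3

# Hypothetical stand-in for the database index
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE markets (marketId TEXT, eventTypeId TEXT, marketType TEXT)"
)
conn.executemany(
    "INSERT INTO markets VALUES (?, ?, ?)",
    [
        ("1.101", "7", "WIN"),
        ("1.102", "4339", "WIN"),
        ("1.103", "7", "PLACE"),
        ("1.104", "1", "MATCH_ODDS"),
    ],
)

columns = ["marketId", "eventTypeId"]
where = "eventTypeId IN ('7', '4339') AND marketType='WIN'"
limit = 250

# The condition is pasted into the statement as-is, so it must come
# from a trusted source (never from unchecked user input).
query = (
    f"SELECT {', '.join(columns)} FROM markets "
    f"WHERE {where} ORDER BY marketId LIMIT ?"
)
rows_as_tuples = conn.execute(query, (limit,)).fetchall()

# Dictionaries are easier to read but cost an extra pass over the rows
rows_as_dicts = [dict(zip(columns, row)) for row in rows_as_tuples]
```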

### Inserting data
The database can be updated with new files using the `insert()` method. This is much faster and more efficient than reindexing the whole database on each update. Files are moved by default, but they can be copied instead if the `copy=True` argument is provided.

```py
import betfairdatabase as bfdb

bfdb.insert("./my_betfair_data", "./my_capture_dir")
```
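
The difference between moving and copying matters when the capture directory must stay intact. The sketch below mimics that choice with the standard library. `transfer` is a hypothetical helper written for illustration and is not the package's implementation.

```python
import shutil
from pathlib import Path


def transfer(source_file, target_dir, copy=False):
    """Place source_file in target_dir, moving it unless copy is True.

    Hypothetical helper for illustration only.
    """
    target_dir = Path(target_dir)
    target_dir.mkdir(parents=True, exist_ok=True)
    destination = target_dir / Path(source_file).name
    if copy:
        shutil.copy2(source_file, destination)  # Source file is kept
    else:
        shutil.move(str(source_file), str(destination))  # Source file is gone
    return destination
```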

### Exporting data
The database index can be exported to a CSV file with the `export()` method. This is useful for debugging, visualising data and post-processing it with external tools.

```py
import betfairdatabase as bfdb

csv_file = bfdb.export("./my_betfair_data", "./my_data_dump")
print(csv_file)  # Prints: ./my_data_dump/my_betfair_data.csv
```
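
As an example of post-processing, the exported file can be read with Python's `csv` module. The sketch below assumes the export starts with a header row of column names and uses a small in-memory stand-in instead of a real export. The sample rows and the `count_by_event_type` helper are hypothetical.

```python
import csv
import io

# Stand-in for an exported file; a real export is assumed to have
# a header row and many more columns.
sample_csv = io.StringIO(
    "marketId,eventTypeId,marketDataFilePath\n"
    "1.22334455,7,arbitrary_folder/1.22334455\n"
    "1.55667788,4339,arbitrary_folder/1.55667788.zip\n"
)


def count_by_event_type(csv_file):
    """Count exported markets per eventTypeId."""
    counts = {}
    for row in csv.DictReader(csv_file):
        key = row["eventTypeId"]
        counts[key] = counts.get(key, 0) + 1
    return counts


counts = count_by_event_type(sample_csv)
```

With a real export, the file would be opened with `open(csv_file, newline="")` and passed to the same function.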

### Removing missing data
Over the course of a database's lifetime, indexed files may get removed. The `clean()` method checks for the presence of indexed market data files and removes the missing entries from the index, avoiding the need to reindex the whole database on every single file removal. However, reindexing the database may be the faster option when a large number of files have been removed.

```py
import betfairdatabase as bfdb

bfdb.clean("./my_betfair_data")
```
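
Conceptually, cleaning amounts to checking each indexed path and dropping the rows whose file is gone. The sketch below shows that idea on a SQLite table. The table name `markets`, the relative file paths and the `remove_missing_entries` helper are assumptions for illustration, not the package's actual implementation.

```python
import sqlite3
from pathlib import Path


def remove_missing_entries(conn, base_dir):
    """Delete index rows whose data file no longer exists.

    Hypothetical helper; returns the number of rows removed.
    """
    rows = conn.execute(
        "SELECT marketId, marketDataFilePath FROM markets"
    ).fetchall()
    missing = [
        (market_id,)
        for market_id, path in rows
        if not (Path(base_dir) / path).is_file()
    ]
    conn.executemany("DELETE FROM markets WHERE marketId = ?", missing)
    conn.commit()
    return len(missing)
```

Every indexed row costs one file system check, which is consistent with the note above that a full reindex may be faster after a large number of removals.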

### Object-oriented interface
All of the above methods can also be accessed through an object-oriented interface via the `BetfairDatabase` class. This is useful when performing multiple operations on the same database, as the database directory needs to be provided only once.
```py
from betfairdatabase import BetfairDatabase

db = BetfairDatabase("./my_betfair_data")
db.index()
db.select()
db.insert("./my_capture_dir")
db.export()
db.clean()
```

### Command line interface
The package also installs a `bfdb` command line app, which provides access to the following methods:
```bash
bfdb index "./my_database_dir"  # Index a database
bfdb export "./my_database_dir" "./my_db_dump.csv" # Export a database
bfdb insert "./my_database_dir" "./my_captured_data"  # Update the database
bfdb clean "./my_database_dir"  # Clean the database
```

For more information about the command line interface, run:
```bash
bfdb --help
```

            

Raw data

{
    "_id": null,
    "home_page": "",
    "name": "betfairdatabase",
    "maintainer": "",
    "docs_url": null,
    "requires_python": ">=3.10",
    "maintainer_email": "",
    "keywords": "betfair,trading,betting,database",
    "author": "",
    "author_email": "Mario Zaja <mzaja0@gmail.com>",
    "download_url": "https://files.pythonhosted.org/packages/88/c7/84cd1cb7d292439d6e739018c95507e1742292adbb5ebb8c077d6c85f87d/betfairdatabase-1.1.0.tar.gz",
    "platform": null,
    "bugtrack_url": null,
    "license": "MIT License",
    "summary": "Turns a collection of historical Betfair data into a queryable SQL database.",
    "version": "1.1.0",
    "project_urls": {
        "Bug Tracker": "https://github.com/mzaja/betfair-database/issues",
        "Homepage": "https://github.com/mzaja/betfair-database"
    },
    "split_keywords": [
        "betfair",
        "trading",
        "betting",
        "database"
    ],
    "urls": [
        {
            "comment_text": "",
            "digests": {
                "blake2b_256": "4b675de371ad80020a1876314b1624021fb7f8229ce80c13dca8552203aaf90a",
                "md5": "48716dcfe5c662b0f12994bd56403be6",
                "sha256": "bde60fc7a6478a7f89f7638d60bef7d715efbd55af5ed9eaaad1aa614f3783f2"
            },
            "downloads": -1,
            "filename": "betfairdatabase-1.1.0-py3-none-any.whl",
            "has_sig": false,
            "md5_digest": "48716dcfe5c662b0f12994bd56403be6",
            "packagetype": "bdist_wheel",
            "python_version": "py3",
            "requires_python": ">=3.10",
            "size": 19044,
            "upload_time": "2024-03-11T15:22:06",
            "upload_time_iso_8601": "2024-03-11T15:22:06.749844Z",
            "url": "https://files.pythonhosted.org/packages/4b/67/5de371ad80020a1876314b1624021fb7f8229ce80c13dca8552203aaf90a/betfairdatabase-1.1.0-py3-none-any.whl",
            "yanked": false,
            "yanked_reason": null
        },
        {
            "comment_text": "",
            "digests": {
                "blake2b_256": "88c784cd1cb7d292439d6e739018c95507e1742292adbb5ebb8c077d6c85f87d",
                "md5": "86919fde2010857336782feb81b7e2d4",
                "sha256": "e4acd5de2c00b97158eb34c172750edc6912d049221ad06368f639f67b352578"
            },
            "downloads": -1,
            "filename": "betfairdatabase-1.1.0.tar.gz",
            "has_sig": false,
            "md5_digest": "86919fde2010857336782feb81b7e2d4",
            "packagetype": "sdist",
            "python_version": "source",
            "requires_python": ">=3.10",
            "size": 27837,
            "upload_time": "2024-03-11T15:22:08",
            "upload_time_iso_8601": "2024-03-11T15:22:08.729660Z",
            "url": "https://files.pythonhosted.org/packages/88/c7/84cd1cb7d292439d6e739018c95507e1742292adbb5ebb8c077d6c85f87d/betfairdatabase-1.1.0.tar.gz",
            "yanked": false,
            "yanked_reason": null
        }
    ],
    "upload_time": "2024-03-11 15:22:08",
    "github": true,
    "gitlab": false,
    "bitbucket": false,
    "codeberg": false,
    "github_user": "mzaja",
    "github_project": "betfair-database",
    "travis_ci": false,
    "coveralls": true,
    "github_actions": true,
    "lcname": "betfairdatabase"
}
        