sportsball


Name: sportsball
Version: 0.34.147
Home page: https://github.com/8W9aG/sportsball
Summary: A library for pulling in and normalising sports stats.
Upload time: 2025-11-08 15:59:27
Author: Will Sackfield
License: MIT
Keywords: sports, data, betting
Requirements: pandas, requests, requests-cache, python-dateutil, tqdm, beautifulsoup4, openpyxl, joblib, pyarrow, ipython, pytz, python-dotenv, geocoder, retry-requests, openmeteo-requests, nba_api, timezonefinder, pydantic, flatten_json, extruct, wikipedia-api, tweepy, pytest-is-running, PySocks, func-timeout, tenacity, random_user_agent, wayback, cryptography, feedparser, dateparser, playwright, cchardet, lxml, gender-guesser, scrapesession, pyhigh, datefinder
# sportsball

<a href="https://pypi.org/project/sportsball/">
    <img alt="PyPi" src="https://img.shields.io/pypi/v/sportsball">
</a>

A library for pulling in and normalising sports stats.

<p align="center">
    <img src="sportsball.png" alt="sportsball" width="200"/>
</p>

## Dependencies :globe_with_meridians:

Python 3.11.6:

- [pandas](https://pandas.pydata.org/)
- [requests](https://requests.readthedocs.io/en/latest/)
- [requests-cache](https://requests-cache.readthedocs.io/en/stable/)
- [python-dateutil](https://github.com/dateutil/dateutil)
- [tqdm](https://github.com/tqdm/tqdm)
- [beautifulsoup](https://www.crummy.com/software/BeautifulSoup/)
- [openpyxl](https://openpyxl.readthedocs.io/en/stable/)
- [joblib](https://joblib.readthedocs.io/en/stable/)
- [pyarrow](https://arrow.apache.org/docs/python/index.html)
- [ipython](https://ipython.org/)
- [pytz](https://pythonhosted.org/pytz/)
- [python-dotenv](https://github.com/theskumar/python-dotenv)
- [geocoder](https://geocoder.readthedocs.io/)
- [retry-requests](https://github.com/bustawin/retry-requests)
- [timezonefinder](https://timezonefinder.michelfe.it/gui)
- [nba_api](https://github.com/swar/nba_api)
- [pydantic](https://docs.pydantic.dev/latest/)
- [flatten_json](https://github.com/amirziai/flatten)
- [pygooglenews](https://github.com/kotartemiy/pygooglenews)
- [extruct](https://github.com/scrapinghub/extruct)
- [wikipedia-api](https://github.com/martin-majlis/Wikipedia-API)
- [tweepy](https://www.tweepy.org/)
- [pytest-is-running](https://github.com/adamchainz/pytest-is-running)
- [PySocks](https://github.com/Anorov/PySocks)
- [func-timeout](https://github.com/kata198/func_timeout)
- [tenacity](https://github.com/jd/tenacity)
- [random_user_agent](https://github.com/Luqman-Ud-Din/random_user_agent)
- [wayback](https://github.com/edgi-govdata-archiving/wayback)
- [cryptography](https://cryptography.io/en/latest/)
- [feedparser](https://github.com/kurtmckee/feedparser)
- [dateparser](https://dateparser.readthedocs.io/en/latest/)
- [playwright](https://playwright.dev/)
- [cchardet](https://github.com/PyYoshi/cChardet)
- [lxml](https://lxml.de/)
- [gender-guesser](https://github.com/lead-ratings/gender-guesser)
- [scrapesession](https://github.com/8W9aG/scrapesession)
- [pyhigh](https://github.com/sgherbst/pyhigh)
- [datefinder](https://github.com/akoumjian/datefinder)

## Raison D'être :thought_balloon:

`sportsball` aims to be a library for pulling in historical information about sporting games in a standardised fashion for easy data processing.
Its models are designed to work across many different types of sports.

The supported leagues are:

* 🏉 [AFL](https://www.afl.com.au/)
* 🏉 [AFLW](https://www.afl.com.au/aflw)
* 🎾 [ATP](https://www.atptour.com/en)
* ⚽ [BUNDESLIGA](https://www.bundesliga.com/en/bundesliga)
* ⚽ [EPL](https://www.premierleague.com/ens)
* ⚽ [FIFA](https://www.fifa.com/en)
* 🐎 [HKJC](https://www.hkjc.com/home/english/index.aspx)
* 🏏 [IPL](https://www.iplt20.com/)
* ⚽ [LALIGA](https://www.laliga.com/en-GB)
* ⚾ [MLB](https://www.mlb.com/)
* 🏀 [NBA](https://www.nba.com/)
* 🏀 [NCAAB](https://www.ncaa.com/sports/basketball-men/d1)
* 🏀 [NCAABW](https://www.ncaa.com/sports/basketball-women/d1)
* 🏈 [NCAAF](https://www.ncaa.com/sports/football/fbs)
* 🏈 [NFL](https://www.nfl.com/)
* 🏒 [NHL](https://www.nhl.com/)
* 🏀 [WNBA](https://www.wnba.com/)
* 🎾 [WTA](https://www.wtatennis.com/)

## Architecture :triangular_ruler:

`sportsball` is an object-oriented library. The entities are organised like so:

* **Game**: A game within a season.
    * **Team**: The team within the game. Note that in games with individual players a team exists as a wrapper.
        * **Player**: A player within the team.
            * **Address**: The address information of the player's birthplace.
            * **Owner**: The owner of the player.
            * **Venue**: The college of the player.
        * **Odds**: The odds for the team to win the game.
            * **Bookie**: The bookie publishing the odds.
        * **News**: News about the team the day before the game.
        * **Social**: Social posts from the team the day before the game.
        * **Coach**: A coach for the team.
    * **Venue**: The venue the game was played in.
        * **Address**: The address information of a venue.
            * **Weather**: The weather at the address.
    * **Dividend**: The dividends the game pays out.
    * **Umpire**: The umpires adjudicating the game.
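
To make the nesting concrete, here is a minimal sketch of a few of these entities and of the "team as a wrapper" idea for individual sports. It uses plain dataclasses as stand-ins; the class fields and the `singles_match` helper are hypothetical and are not taken from the library's actual models.

```python
from dataclasses import dataclass, field
from typing import Optional


# Hypothetical stand-ins for the library's models; field names are illustrative only.
@dataclass
class Address:
    city: str


@dataclass
class Venue:
    name: str
    address: Optional[Address] = None


@dataclass
class Player:
    name: str
    birth_address: Optional[Address] = None
    college: Optional[Venue] = None  # a Venue doubles as the player's college


@dataclass
class Team:
    name: str
    players: list[Player] = field(default_factory=list)


@dataclass
class Game:
    teams: list[Team]
    venue: Optional[Venue] = None


def singles_match(player_a: str, player_b: str, venue: Venue) -> Game:
    """Wrap each individual competitor in a one-player team."""
    teams = [Team(name=p, players=[Player(name=p)]) for p in (player_a, player_b)]
    return Game(teams=teams, venue=venue)


game = singles_match("Player A", "Player B", Venue("Centre Court", Address("London")))
print([team.name for team in game.teams])
```

Modelling an individual competitor as a one-player team lets the same `Game` shape cover both team sports and sports such as tennis.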

## Caching

This library caches very aggressively because of its large data requirements. Requests about a recent game (generally within the last 7 days) bypass the caching. The caching layers are as follows:

1. A joblib disk cache that caches calls to pydantic model creation functions. This changes on every version update to keep the models in sync. This is the fastest cache.
2. A requests cache backed by sqlite that caches requests forever.
3. A lookup in the Wayback Machine; if an archived response is found, it is used.
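
The lookup order and the recent-game bypass described above can be sketched as follows. This is a simplified illustration, not the library's implementation: every layer is reduced to a URL-keyed lookup function, and the names `is_recent`, `fetch`, and `RECENT_WINDOW` are hypothetical.

```python
from datetime import datetime, timedelta, timezone
from typing import Callable, Optional, Sequence

RECENT_WINDOW = timedelta(days=7)  # assumed from "generally in the last 7 days"


def is_recent(game_time: datetime, now: datetime) -> bool:
    """Return True when the game is recent (or upcoming), so caches are skipped."""
    return now - game_time <= RECENT_WINDOW


def fetch(
    url: str,
    game_time: datetime,
    now: datetime,
    caches: Sequence[Callable[[str], Optional[str]]],
    download: Callable[[str], str],
) -> str:
    """Try each cache layer in order, falling back to a live download."""
    if not is_recent(game_time, now):
        for lookup in caches:
            cached = lookup(url)
            if cached is not None:
                return cached
    return download(url)


now = datetime(2025, 11, 8, tzinfo=timezone.utc)
url = "https://example.com/game/1"  # placeholder URL
disk_cache: dict[str, str] = {}  # stand-in for the fastest layer
http_cache = {url: "cached body"}  # stand-in for the sqlite-backed layer
layers = [disk_cache.get, http_cache.get]

old = fetch(url, now - timedelta(days=30), now, layers, lambda u: "live body")
new = fetch(url, now - timedelta(days=2), now, layers, lambda u: "live body")
print(old, "|", new)
```

The game from 30 days ago is served from the second layer, while the game from 2 days ago skips every layer and is downloaded live.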

It is strongly recommended to define proxies in the `PROXIES` environment variable. The more proxies you provide, the easier it is to collect data.

## Installation :inbox_tray:

This is a Python package hosted on PyPI. To install it, run the following command:

`pip install sportsball`

or install using this local repository:

`python setup.py install --old-and-unmanageable`

## Usage example :eyes:

There are many different ways of using sportsball, but we generally recommend the CLI.

### CLI

To fetch a dataframe containing information about a league, run the following command:

```
sportsball --league=nfl -
```

The final argument denotes the file to write to; in this case, `-` means stdout.

### Python

To pull a dataframe containing all the information for a particular league, the following example can be used:

```python
from sportsball import sportsball as spb

ball = spb.SportsBall()
league = ball.league(spb.League.AFL)
df = league.to_frame()
```

This results in a dataframe where each game is represented by all its features.
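
As a rough picture of what "each game is represented by all its features" can mean, the sketch below flattens a nested game record into a single row of columns. The `flatten` helper, the sample record, and the resulting column names are all hypothetical; the source does not show how the library builds its frame or names its columns.

```python
from typing import Any


def flatten(record: dict[str, Any], prefix: str = "", sep: str = "_") -> dict[str, Any]:
    """Flatten nested dicts and lists into a single level of column names."""
    row: dict[str, Any] = {}
    for key, value in record.items():
        name = f"{prefix}{sep}{key}" if prefix else str(key)
        if isinstance(value, dict):
            row.update(flatten(value, name, sep))
        elif isinstance(value, list):
            # List items get their position appended to the column name.
            for index, item in enumerate(value):
                item_name = f"{name}{sep}{index}"
                if isinstance(item, dict):
                    row.update(flatten(item, item_name, sep))
                else:
                    row[item_name] = item
        else:
            row[name] = value
    return row


# Made-up record for illustration only.
game = {
    "venue": {"name": "Example Oval"},
    "teams": [{"name": "Home", "points": 90}, {"name": "Away", "points": 75}],
}
row = flatten(game)
print(row)
```

A list of such rows, one per game, is the kind of input from which a dataframe can be constructed.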

### Environment

If you wish to use the providers that require API keys, you can create a `.env` file with the following variables inside it:

```
GOOGLE_API_KEY=APIKEY
GRIBSTREAM_API_KEY=APIKEY
X_API_KEY=APIKEY
X_API_SECRET_KEY=APISECRETKEY
X_ACCESS_TOKEN=ACCESSTOKEN
X_ACCESS_TOKEN_SECRET=ACCESSTOKENSECRET
PROXIES=CSVPROXIESLIST
```
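
Before a long run it can help to check which of these variables are set. The sketch below assumes that `PROXIES` holds a comma-separated list, as the `CSVPROXIESLIST` placeholder suggests; the helpers `parse_proxies` and `missing_keys` are hypothetical and not part of `sportsball`.

```python
import os
from typing import Mapping

# Variable names copied from the .env example above.
X_KEYS = (
    "X_API_KEY",
    "X_API_SECRET_KEY",
    "X_ACCESS_TOKEN",
    "X_ACCESS_TOKEN_SECRET",
)


def parse_proxies(raw: str) -> list[str]:
    """Split a comma-separated proxy list, dropping blanks (format is assumed)."""
    return [part.strip() for part in raw.split(",") if part.strip()]


def missing_keys(env: Mapping[str, str], keys: tuple[str, ...]) -> list[str]:
    """Return the keys that are unset or empty in the given environment."""
    return [key for key in keys if not env.get(key)]


# Placeholder values for illustration; pass os.environ to check the real environment.
example_env = {
    "X_API_KEY": "APIKEY",
    "PROXIES": "http://proxy1.example:8080, http://proxy2.example:8080",
}
proxies = parse_proxies(example_env.get("PROXIES", ""))
missing = missing_keys(example_env, X_KEYS)
print(proxies, missing)
```

Calling `missing_keys(os.environ, X_KEYS)` runs the same check against the real environment.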

## License :memo:

The project is available under the [MIT License](LICENSE).

            

Raw data

            {
    "_id": null,
    "home_page": "https://github.com/8W9aG/sportsball",
    "name": "sportsball",
    "maintainer": null,
    "docs_url": null,
    "requires_python": null,
    "maintainer_email": null,
    "keywords": "sports data betting",
    "author": "Will Sackfield",
    "author_email": "will.sackfield@gmail.com",
    "download_url": "https://files.pythonhosted.org/packages/21/74/23a174bb55b9a89750c76aa2894d385729ecc9ad2cf21081aad16b49cc13/sportsball-0.34.147.tar.gz",
    "platform": null,
    "bugtrack_url": null,
    "license": "MIT",
    "summary": "A library for pulling in and normalising sports stats.",
    "version": "0.34.147",
    "project_urls": {
        "Homepage": "https://github.com/8W9aG/sportsball"
    },
    "split_keywords": [
        "sports",
        "data",
        "betting"
    ],
    "urls": [
        {
            "comment_text": "",
            "digests": {
                "blake2b_256": "217423a174bb55b9a89750c76aa2894d385729ecc9ad2cf21081aad16b49cc13",
                "md5": "1e44862e4a68614b6cabd58bd6c0e587",
                "sha256": "10e452b18b555355dc20c6b7ab6e0f81958ce3040c75b700bf57af535a45381a"
            },
            "downloads": -1,
            "filename": "sportsball-0.34.147.tar.gz",
            "has_sig": false,
            "md5_digest": "1e44862e4a68614b6cabd58bd6c0e587",
            "packagetype": "sdist",
            "python_version": "source",
            "requires_python": null,
            "size": 743249,
            "upload_time": "2025-11-08T15:59:27",
            "upload_time_iso_8601": "2025-11-08T15:59:27.019757Z",
            "url": "https://files.pythonhosted.org/packages/21/74/23a174bb55b9a89750c76aa2894d385729ecc9ad2cf21081aad16b49cc13/sportsball-0.34.147.tar.gz",
            "yanked": false,
            "yanked_reason": null
        }
    ],
    "upload_time": "2025-11-08 15:59:27",
    "github": true,
    "gitlab": false,
    "bitbucket": false,
    "codeberg": false,
    "github_user": "8W9aG",
    "github_project": "sportsball",
    "travis_ci": false,
    "coveralls": false,
    "github_actions": false,
    "requirements": [
        {
            "name": "pandas",
            "specs": [
                [
                    ">=",
                    "2.2.3"
                ]
            ]
        },
        {
            "name": "requests",
            "specs": [
                [
                    ">=",
                    "2.31.0"
                ]
            ]
        },
        {
            "name": "requests-cache",
            "specs": [
                [
                    ">=",
                    "1.2.0"
                ]
            ]
        },
        {
            "name": "python-dateutil",
            "specs": [
                [
                    ">=",
                    "1.16.0"
                ]
            ]
        },
        {
            "name": "tqdm",
            "specs": [
                [
                    ">=",
                    "4.66.2"
                ]
            ]
        },
        {
            "name": "beautifulsoup4",
            "specs": [
                [
                    ">=",
                    "4.13.4"
                ]
            ]
        },
        {
            "name": "openpyxl",
            "specs": [
                [
                    ">=",
                    "3.1.5"
                ]
            ]
        },
        {
            "name": "joblib",
            "specs": [
                [
                    ">=",
                    "1.4.2"
                ]
            ]
        },
        {
            "name": "pyarrow",
            "specs": [
                [
                    ">=",
                    "18.0.0"
                ]
            ]
        },
        {
            "name": "ipython",
            "specs": [
                [
                    ">=",
                    "8.29.0"
                ]
            ]
        },
        {
            "name": "python-dateutil",
            "specs": [
                [
                    ">=",
                    "1.16.0"
                ]
            ]
        },
        {
            "name": "pytz",
            "specs": [
                [
                    ">=",
                    "2024.1"
                ]
            ]
        },
        {
            "name": "python-dotenv",
            "specs": [
                [
                    ">=",
                    "1.0.1"
                ]
            ]
        },
        {
            "name": "geocoder",
            "specs": [
                [
                    ">=",
                    "1.38.1"
                ]
            ]
        },
        {
            "name": "retry-requests",
            "specs": [
                [
                    ">=",
                    "2.0.0"
                ]
            ]
        },
        {
            "name": "openmeteo-requests",
            "specs": [
                [
                    ">=",
                    "1.3.0"
                ]
            ]
        },
        {
            "name": "nba_api",
            "specs": [
                [
                    ">=",
                    "1.9.0"
                ]
            ]
        },
        {
            "name": "timezonefinder",
            "specs": [
                [
                    ">=",
                    "6.6.3"
                ]
            ]
        },
        {
            "name": "pydantic",
            "specs": [
                [
                    ">=",
                    "2.11.9"
                ]
            ]
        },
        {
            "name": "flatten_json",
            "specs": [
                [
                    ">=",
                    "0.1.14"
                ]
            ]
        },
        {
            "name": "extruct",
            "specs": [
                [
                    ">=",
                    "0.18.0"
                ]
            ]
        },
        {
            "name": "wikipedia-api",
            "specs": [
                [
                    ">=",
                    "0.8.1"
                ]
            ]
        },
        {
            "name": "tweepy",
            "specs": [
                [
                    ">=",
                    "4.15.0"
                ]
            ]
        },
        {
            "name": "pytest-is-running",
            "specs": [
                [
                    ">=",
                    "1.5.1"
                ]
            ]
        },
        {
            "name": "PySocks",
            "specs": [
                [
                    ">=",
                    "1.7.1"
                ]
            ]
        },
        {
            "name": "func-timeout",
            "specs": [
                [
                    ">=",
                    "4.3.5"
                ]
            ]
        },
        {
            "name": "tenacity",
            "specs": [
                [
                    ">=",
                    "9.0.0"
                ]
            ]
        },
        {
            "name": "random_user_agent",
            "specs": [
                [
                    ">=",
                    "1.0.1"
                ]
            ]
        },
        {
            "name": "wayback",
            "specs": [
                [
                    ">=",
                    "0.4.5"
                ]
            ]
        },
        {
            "name": "cryptography",
            "specs": [
                [
                    ">=",
                    "44.0.0"
                ]
            ]
        },
        {
            "name": "feedparser",
            "specs": [
                [
                    ">=",
                    "6.0.11"
                ]
            ]
        },
        {
            "name": "dateparser",
            "specs": [
                [
                    ">=",
                    "1.2.0"
                ]
            ]
        },
        {
            "name": "playwright",
            "specs": [
                [
                    ">=",
                    "1.51.0"
                ]
            ]
        },
        {
            "name": "cchardet",
            "specs": [
                [
                    ">=",
                    "2.2.0a2"
                ]
            ]
        },
        {
            "name": "lxml",
            "specs": [
                [
                    ">=",
                    "5.3.0"
                ]
            ]
        },
        {
            "name": "gender-guesser",
            "specs": [
                [
                    ">=",
                    "0.4.0"
                ]
            ]
        },
        {
            "name": "scrapesession",
            "specs": [
                [
                    ">=",
                    "0.0.17"
                ]
            ]
        },
        {
            "name": "pyhigh",
            "specs": [
                [
                    ">=",
                    "0.0.6"
                ]
            ]
        },
        {
            "name": "datefinder",
            "specs": [
                [
                    ">=",
                    "0.7.3"
                ]
            ]
        }
    ],
    "lcname": "sportsball"
}
        