ncaa-stats-py


- Name: ncaa-stats-py
- Version: 0.0.5
- Summary: Allows a user to download and parse data from the National Collegiate Athletic Association (NCAA) and its member sports.
- Upload time: 2024-12-17 16:08:16
- Requires Python: >=3.10
- License: MIT
- Keywords: sports, college, college sports, baseball
- Requirements: beautifulsoup4, lxml, pandas, pytz, requests, tqdm

# ncaa_stats_py
Allows a user to download and parse data from the National Collegiate Athletic Association (NCAA) and its member sports.

# Basic Setup

## How to Install

This package is available through the
[`pip` package manager](https://en.wikipedia.org/wiki/Pip_(package_manager))
and can be installed with one of the following commands
in your terminal/shell:

```bash
pip install ncaa_stats_py
```

OR

```bash
python -m pip install ncaa_stats_py
```

If you are using a Linux/Mac instance,
you may need to specify `python3` when installing, as shown below:

```bash
python3 -m pip install ncaa_stats_py
```

Alternatively, `ncaa_stats_py` can be installed directly from
its GitHub repository with the following command through pip:

```bash
pip install git+https://github.com/armstjc/ncaa_stats_py
```

OR

```bash
python -m pip install git+https://github.com/armstjc/ncaa_stats_py
```

OR

```bash
python3 -m pip install git+https://github.com/armstjc/ncaa_stats_py
```
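
To confirm that the install worked, a minimal sketch along these lines can be run with the same interpreter you installed into. It uses only the standard library. The helper name `can_import` is hypothetical and is not part of `ncaa_stats_py`; the `>=3.10` Python requirement comes from the package metadata.

```python
import sys
from importlib.util import find_spec


def can_import(module_name: str) -> bool:
    """Return True if a top-level module can be found by this interpreter.

    Hypothetical helper for illustration; not part of ncaa_stats_py.
    """
    return find_spec(module_name) is not None


# The package metadata lists Python >=3.10 as a requirement.
python_ok = sys.version_info >= (3, 10)
print(f"Python {sys.version_info.major}.{sys.version_info.minor} "
      f"meets the >=3.10 requirement: {python_ok}")

if can_import("ncaa_stats_py"):
    print("ncaa_stats_py is importable from this environment.")
else:
    print("ncaa_stats_py was not found; try one of the pip commands above.")
```

If the package is reported as missing right after a successful install, the most likely cause is that `pip` and `python` point at different environments, which is why the `python -m pip` form can be the safer choice.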

## How to Use

`ncaa_stats_py` separates itself by doing the following
things when attempting to get data:

1. Automatically caching any data that has already been parsed.
2. Automatically enforcing a 5-second sleep timer for any HTML call,
    to ensure that function calls from this package
    won't get your IP address banned
    (you do not *need* to add sleep timers when you loop over
    calls to functions in this Python package).
3. Automatically refreshing any cached data if the data hasn't been refreshed in a while.

For example, the following code will work as-is,
and in the second loop, the code will load the team rosters
even faster because the data is cached
on the device where you're running this code.

```python
from timeit import default_timer as timer

from ncaa_stats_py.baseball import (
    get_baseball_team_roster,
    get_baseball_teams
)

start_time = timer()

# Loads in a table with every DI NCAA baseball team in the 2024 season.
# If this is the first time you run this script,
# it may take some time to repopulate the NCAA baseball team information data.

teams_df = get_baseball_teams(season=2024, level="I")

end_time = timer()

time_elapsed = end_time - start_time
print(f"Elapsed time: {time_elapsed:.3f} seconds.\n\n")

# Gets 5 random D1 teams from 2024
teams_df = teams_df.sample(5)
print(teams_df)
print()


# Let's send this to a list to make the loop slightly faster
team_ids_list = teams_df["team_id"].to_list()

# First loop
# If the data isn't cached, it should take 35-40 seconds to do this loop
start_time = timer()

for t_id in team_ids_list:
    print(f"On Team ID: {t_id}")
    df = get_baseball_team_roster(team_id=t_id)
    # print(df)

end_time = timer()

time_elapsed = end_time - start_time
print(f"Elapsed time: {time_elapsed:.3f} seconds.\n\n")

# Second loop
# Because the data has been parsed and cached,
# this shouldn't take that long to loop through
start_time = timer()

for t_id in team_ids_list:
    print(f"On Team ID: {t_id}")
    df = get_baseball_team_roster(team_id=t_id)
    # print(df)

end_time = timer()
time_elapsed = end_time - start_time
print(f"Elapsed time: {time_elapsed:.3f} seconds.\n\n")

```
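
The speed-up in the second loop comes from the caching, sleep timer, and refresh behavior listed at the start of this section. The sketch below illustrates that general pattern using only the standard library. It is not the package's actual implementation: the function `cached_fetch`, its parameters, the JSON file layout, and the 24-hour refresh window are all assumptions made for illustration. Only the 5-second delay is taken from the description above.

```python
import json
import tempfile
import time
from pathlib import Path
from typing import Callable


def cached_fetch(
    key: str,
    fetch: Callable[[], dict],
    cache_dir: Path,
    max_age_seconds: float = 86_400,  # assumed refresh window (24 hours)
    delay_seconds: float = 5.0,       # sleep timer described in this README
    sleep: Callable[[float], None] = time.sleep,
) -> dict:
    """Return cached data for `key`, refetching if it is missing or stale."""
    cache_dir.mkdir(parents=True, exist_ok=True)
    cache_file = cache_dir / f"{key}.json"

    if cache_file.exists():
        age = time.time() - cache_file.stat().st_mtime
        if age < max_age_seconds:
            # Fresh cache hit: nothing is downloaded, so no sleep is needed.
            return json.loads(cache_file.read_text(encoding="utf-8"))

    # Cache miss or stale entry: download, pause, then store the result.
    data = fetch()
    sleep(delay_seconds)
    cache_file.write_text(json.dumps(data), encoding="utf-8")
    return data


calls = []  # records each simulated download


def fake_fetch() -> dict:
    """Stand-in for a real download, so the sketch runs offline."""
    calls.append("download")
    return {"team_id": 1, "players": ["A", "B"]}


with tempfile.TemporaryDirectory() as tmp:
    cache = Path(tmp)
    # The sleep is replaced with a no-op so the demo finishes instantly.
    first = cached_fetch("roster_1", fake_fetch, cache, sleep=lambda s: None)
    second = cached_fetch("roster_1", fake_fetch, cache, sleep=lambda s: None)

print(first == second)  # True: both calls return the same data
print(len(calls))       # 1: only the first call triggered a download
```

Because the delay is only paid on a cache miss, a loop over already-cached items skips both the download and the sleep, which matches the behavior of the second loop in the example above.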

# Dependencies

`ncaa_stats_py` depends on the following Python packages:
- [`beautifulsoup4`](https://www.crummy.com/software/BeautifulSoup/): To assist with parsing HTML data.
- [`lxml`](https://lxml.de/): To work with `beautifulsoup4` in assisting with parsing HTML data.
- [`pandas`](https://github.com/pandas-dev/pandas): For `DataFrame` creation within package functions.
- [`pytz`](https://pythonhosted.org/pytz/): Used to attach timezone information for any date/date time objects encountered by this package.
- [`requests`](https://github.com/psf/requests): Used to make HTTPS requests.
- [`tqdm`](https://github.com/tqdm/tqdm): Used to show progress bars for actions in functions that are known to take minutes to load.
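
`pip` normally installs these dependencies for you. If you want to see which versions ended up in your environment, a small sketch like the following may help. It uses the standard-library `importlib.metadata` module; the helper name `dependency_report` is hypothetical and not part of `ncaa_stats_py`.

```python
from importlib.metadata import PackageNotFoundError, version
from typing import Dict, Iterable, Optional

# Distribution names as listed in the dependency list above.
DEPENDENCIES = ["beautifulsoup4", "lxml", "pandas", "pytz", "requests", "tqdm"]


def dependency_report(names: Iterable[str]) -> Dict[str, Optional[str]]:
    """Map each distribution name to its installed version, or None if absent.

    Hypothetical helper for illustration; not part of ncaa_stats_py.
    """
    report: Dict[str, Optional[str]] = {}
    for name in names:
        try:
            report[name] = version(name)
        except PackageNotFoundError:
            report[name] = None
    return report


for name, installed in dependency_report(DEPENDENCIES).items():
    print(f"{name}: {installed or 'not installed'}")
```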

# License

This package is licensed under the MIT license. You can view the package's license [here](https://github.com/armstjc/ncaa_stats_py/blob/main/LICENSE).

# Documentation

More information about this package, its functions, and ways you can use those functions can be found at [https://armstjc.github.io/ncaa_stats_py/](https://armstjc.github.io/ncaa_stats_py/).

            

Raw data

            {
    "_id": null,
    "home_page": null,
    "name": "ncaa-stats-py",
    "maintainer": null,
    "docs_url": null,
    "requires_python": ">=3.10",
    "maintainer_email": "Joseph Armstrong <armstrongjoseph08@gmail.com>",
    "keywords": "sports, college, college sports, baseball",
    "author": null,
    "author_email": "Joseph Armstrong <armstrongjoseph08@gmail.com>",
    "download_url": "https://files.pythonhosted.org/packages/f9/a6/9d8805d25c2e5356d4ae1bad5140a7c0f9612479484875561c6f3146fe84/ncaa_stats_py-0.0.5.tar.gz",
    "platform": null,
    "bugtrack_url": null,
    "license": "MIT",
    "summary": "Allows a user to download and parse data from the National Collegiate Athletics Association (NCAA), and it's member sports.",
    "version": "0.0.5",
    "project_urls": {
        "changelog": "https://github.com/armstjc/ncaa_stats_py/blob/main/CHANGELOG.md",
        "documentation": "https://github.com/armstjc/ncaa_stats_py",
        "homepage": "https://github.com/armstjc/ncaa_stats_py",
        "repository": "https://github.com/armstjc/ncaa_stats_py.git"
    },
    "split_keywords": [
        "sports",
        " college",
        " college sports",
        " baseball"
    ],
    "urls": [
        {
            "comment_text": "",
            "digests": {
                "blake2b_256": "711c5864849affcd0ed343720e5694935b70fa51dc76b86217cfdbf5334101ac",
                "md5": "38c82cf3907add94d2b828262250c736",
                "sha256": "4792ff23099bad3cdfafc5db8ef8d19717d9c291887ec96a2f9872ef4c4280d5"
            },
            "downloads": -1,
            "filename": "ncaa_stats_py-0.0.5-py3-none-any.whl",
            "has_sig": false,
            "md5_digest": "38c82cf3907add94d2b828262250c736",
            "packagetype": "bdist_wheel",
            "python_version": "py3",
            "requires_python": ">=3.10",
            "size": 95043,
            "upload_time": "2024-12-17T16:08:15",
            "upload_time_iso_8601": "2024-12-17T16:08:15.258638Z",
            "url": "https://files.pythonhosted.org/packages/71/1c/5864849affcd0ed343720e5694935b70fa51dc76b86217cfdbf5334101ac/ncaa_stats_py-0.0.5-py3-none-any.whl",
            "yanked": false,
            "yanked_reason": null
        },
        {
            "comment_text": "",
            "digests": {
                "blake2b_256": "f9a69d8805d25c2e5356d4ae1bad5140a7c0f9612479484875561c6f3146fe84",
                "md5": "8c11260d7d4a29f1f5bc338b69b93618",
                "sha256": "b61bba4b202c3edec387cc2e80bfaaf49660e907a2606d380adf094e341f654e"
            },
            "downloads": -1,
            "filename": "ncaa_stats_py-0.0.5.tar.gz",
            "has_sig": false,
            "md5_digest": "8c11260d7d4a29f1f5bc338b69b93618",
            "packagetype": "sdist",
            "python_version": "source",
            "requires_python": ">=3.10",
            "size": 88303,
            "upload_time": "2024-12-17T16:08:16",
            "upload_time_iso_8601": "2024-12-17T16:08:16.730158Z",
            "url": "https://files.pythonhosted.org/packages/f9/a6/9d8805d25c2e5356d4ae1bad5140a7c0f9612479484875561c6f3146fe84/ncaa_stats_py-0.0.5.tar.gz",
            "yanked": false,
            "yanked_reason": null
        }
    ],
    "upload_time": "2024-12-17 16:08:16",
    "github": true,
    "gitlab": false,
    "bitbucket": false,
    "codeberg": false,
    "github_user": "armstjc",
    "github_project": "ncaa_stats_py",
    "travis_ci": false,
    "coveralls": false,
    "github_actions": true,
    "requirements": [
        {
            "name": "beautifulsoup4",
            "specs": [
                [
                    ">=",
                    "4.12.2"
                ]
            ]
        },
        {
            "name": "lxml",
            "specs": [
                [
                    ">=",
                    "5.3"
                ]
            ]
        },
        {
            "name": "pandas",
            "specs": [
                [
                    ">=",
                    "2.2.3"
                ]
            ]
        },
        {
            "name": "pytz",
            "specs": [
                [
                    ">=",
                    "2024.2"
                ]
            ]
        },
        {
            "name": "requests",
            "specs": [
                [
                    ">=",
                    "2.32.3"
                ]
            ]
        },
        {
            "name": "tqdm",
            "specs": [
                [
                    ">=",
                    "4.67.1"
                ]
            ]
        }
    ],
    "lcname": "ncaa-stats-py"
}
        