# pcs_scraper
**v0.1.0**
A Python package to query, organize, and prepare pandas DataFrames from procyclingstats.com data for further analysis.
## Project Description
I started this as a side project while working as a data/race analyst with professional cycling teams. Packages that provide access to datasets are commonplace for many other major sports, but I couldn't find any for professional cycling. There were, however, already fantastic websites devoted to cataloguing this data and presenting it for free. The most user-friendly one I found was procyclingstats.com (PCS), but it didn't have a publicly available API, so I wrote this package to interact with its posted data programmatically.
_pcs_scraper_ lets users interact with PCS through three fundamental and distinct classes:
1. Riders (`pcs.Rider`)
2. Teams (`pcs.Team`)
3. Races (`pcs.Race`)
In future versions of this project, I would like to link the statistical data from PCS with rider Strava data.
## Installation
###### Via pip:
`pip install pcs-scraper`
###### Via conda:
Coming soon
###### Via source code:
Fork/clone this repo and create a conda environment to develop in using:
```
# create environment using existing environment file
conda env create -f environment.yml
# add pcs_scraper to path for environment
cd .../anaconda3/envs/pcs_env/lib/python3.9/site-packages
nano packages.pth
# then, in nano, type the **full path** to the main pcs_scraper directory (e.g. .../Users/name/Desktop/pcs_scraper)
# press Ctrl+O then Enter to save the file, then Ctrl+X to exit nano
```
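To confirm that the `.pth` entry took effect, a quick check from inside the activated environment can help. The following is a minimal sketch using only the standard library; `is_importable` is a hypothetical helper name and not part of pcs_scraper.

```python
import importlib.util


def is_importable(module_name: str) -> bool:
    """Return True if Python can locate the module on the current path."""
    return importlib.util.find_spec(module_name) is not None


if __name__ == "__main__":
    # Expect True once the pcs_scraper directory is listed in packages.pth
    print("pcs_scraper importable:", is_importable("pcs_scraper"))
```

If this prints `False`, the path written in `packages.pth` is the first thing to re-check.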
## Usage
###### Basic
```
# for specific rider
# import
import pcs_scraper as pcs
# request rider object for tadej pogacar
pogacar = pcs.Rider(name = 'tadej-pogacar')
# get pogacar's entire race history
pogacar_race_hx = pogacar.get_race_history()
```
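The `name` values in these examples look like lowercase, hyphen-separated ASCII slugs (`'tadej-pogacar'`, `'tour-de-france'`). Assuming that pattern holds, a display name can be converted with a small helper. This is a sketch only: `to_pcs_slug` is a hypothetical function, not part of pcs_scraper, and the slug rule is inferred from the examples rather than documented by PCS.

```python
import re
import unicodedata


def to_pcs_slug(name: str) -> str:
    """Convert a display name to a lowercase, hyphen-separated ASCII slug.

    ASSUMPTION: the slug format is inferred from the examples in this README;
    it is not a documented PCS or pcs_scraper rule.
    """
    # Decompose accented characters, then drop the combining marks
    ascii_name = (
        unicodedata.normalize("NFKD", name)
        .encode("ascii", "ignore")
        .decode("ascii")
    )
    # Collapse every run of non-alphanumeric characters into one hyphen
    return re.sub(r"[^a-z0-9]+", "-", ascii_name.lower()).strip("-")


print(to_pcs_slug("Tadej Pogačar"))  # tadej-pogacar
```

Letters without a Unicode decomposition (for example `ø`) are dropped rather than transliterated, so it is worth checking the result against the rider's PCS URL.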
```
# for specific race
# import
import pcs_scraper as pcs
# request race object for tour de france
tdf = pcs.Race(name = 'tour-de-france', year = 2021)
# if unsure about spelling of race name according to PCS you can search using:
# race_options = pcs.race_options_by_year(2021)
# can refine output using race circuit or classification when requesting
# race_options = pcs.race_options_by_year(2021, classification = '2.UWT', circuit = 'UCI World Tour')
# request the GC results
tdf_final_gc = tdf.get_results()
```
```
# for specific team
# import
import pcs_scraper as pcs
# request team object for Ineos
ineos = pcs.Team(name = 'ineos-grenadiers', year = 2021)
# if unsure about spelling of team name according to PCS you can search using:
# team_options_2021 = pcs.teams_by_year(year = 2021, gender = 'M')
# get the riders from the team
ineos_2021_riders = ineos.get_riders()
```
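When the same request is needed for several riders, races, or teams, a small loop with a pause between calls keeps the load on the website low. The sketch below is not part of pcs_scraper: `collect_by_slug` and its `delay_s` parameter are hypothetical, and the lookup itself is passed in as a function so that nothing is assumed about the package beyond the calls shown above.

```python
import time
from typing import Callable, Dict, Iterable, Optional, TypeVar

T = TypeVar("T")


def collect_by_slug(
    slugs: Iterable[str],
    fetch: Callable[[str], T],
    delay_s: float = 1.0,
) -> Dict[str, Optional[T]]:
    """Call fetch(slug) for each slug, sleeping delay_s seconds between calls.

    A failed lookup is stored as None so one bad slug does not stop the loop.
    """
    results: Dict[str, Optional[T]] = {}
    for i, slug in enumerate(slugs):
        if i > 0:
            time.sleep(delay_s)  # pause between requests to the website
        try:
            results[slug] = fetch(slug)
        except Exception as exc:  # broad on purpose: failure modes are unknown
            print(f"skipping {slug}: {exc}")
            results[slug] = None
    return results


# Hypothetical usage with the calls shown above:
# rider_slugs = ["tadej-pogacar"]  # add more PCS slugs here
# histories = collect_by_slug(
#     rider_slugs,
#     lambda slug: pcs.Rider(name=slug).get_race_history(),
# )
```

Whether the output of `get_riders()` can be fed straight into such a loop depends on its return format, which is not described here.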
###### Practical Examples
Coming soon
## Documentation
Coming soon