athlinks-races

- Name: athlinks-races
- Version: 0.0.7
- Home page: None
- Summary: Web scraper for race results hosted on Athlinks.
- Upload time: 2025-02-17 01:31:57
- Maintainer: None
- Docs URL: None
- Author: None
- Requires Python: >=3.11
- License: MIT License (Copyright (c) 2022 Aaron Schroeder; see LICENSE for the full text)
- Keywords: running, race, tower race, racing
- Requirements: No requirements were recorded.
- Travis-CI: No Travis.
- Coveralls test coverage: No coveralls.
            # athlinks_races: web scraper for race results hosted on Athlinks

[![Supported Python Versions](https://img.shields.io/pypi/pyversions/athlinks-races)](https://pypi.org/project/athlinks-races/)
[![PyPI version](https://badge.fury.io/py/athlinks-races.svg)](https://badge.fury.io/py/athlinks-races)
![OS support](https://img.shields.io/badge/Linux-red)
[![Downloads](https://static.pepy.tech/badge/athlinks-races)](https://pepy.tech/project/athlinks-races)

![Screenshot of athlinks_races_cli --tui](athlinks_capture_screenshot.png)

## NOTE

This is a fork of the original [scrapy-athlinks](https://github.com/josevnz/scrapy-athlinks). I decided to take it over
because I wanted to add features that were not available in the original project.


## Introduction


`athlinks_races` provides the [`RaceSpider`](athlinks_races/spiders/race.py) class.

This spider crawls through all results pages from a race hosted on athlinks.com,
building and following links to each athlete's individual results page, where it
collects their split data. It also collects some metadata about the race itself.

By default, the spider returns one race metadata object (`RaceItem`), and one `AthleteItem` per participant.

Each `AthleteItem` consists of some basic athlete info and a list of `RaceSplitItem` objects containing data from each
split they recorded.
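
Once exported, split data like this can be post-processed however you like. The sketch below computes a pace from one split; the field names (`time_ms`, `distance_m`) are assumptions for illustration, not the package's exact schema:

```python
def pace_min_per_km(split: dict) -> float:
    """Minutes per kilometre for one split (hypothetical field names)."""
    minutes = split["time_ms"] / 1000 / 60       # elapsed time in minutes
    kilometres = split["distance_m"] / 1000      # distance covered in kilometres
    return minutes / kilometres


# A 10 km split covered in 50 minutes works out to 5.0 min/km.
example_split = {"name": "10K", "time_ms": 3_000_000, "distance_m": 10_000}
print(pace_min_per_km(example_split))
```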

## How to use this package

### Using uv

If you have installed [uv](https://docs.astral.sh/uv/), running it is as simple as this:

```shell
uvx --from athlinks-races athlinks_races_cli --tui --race_url https://www.athlinks.com/event/382111/results/Event/1093108/Results
```

### Python scripts

Scrapy can be operated entirely from Python scripts.
[See the scrapy documentation for more info.](https://docs.scrapy.org/en/latest/topics/practices.html#run-scrapy-from-a-script)

#### Installation

The package is available on [PyPI](https://pypi.org/project/athlinks-races) and can be installed with `pip`:

```sh
python -m venv "$HOME/virtualenv/athlinks_races/"
. $HOME/virtualenv/athlinks_races/bin/activate
pip install athlinks_races
```

#### Example usage

[A demo script is included in this repo](athlinks_races/demo.py). It has plenty of features but is also monolithic on purpose.

```python
"""
Demonstrate the available classes.
You can run as python athlinks_races/demo.py
"""
from scrapy.crawler import CrawlerProcess
from athlinks_races import RaceSpider, AthleteItem, RaceItem


def main():
    # Make settings for two separate output files: one for athlete data,
    # one for race metadata.
    settings = {
        'FEEDS': {
            # Athlete data. Inside this file will be a list of dicts containing
            # data about each athlete's race and splits.
            'athletes.json': {
                'format': 'json',
                'overwrite': True,
                'item_classes': [AthleteItem],
            },
            # Race metadata. Inside this file will be a list with a single dict
            # containing info about the race itself.
            'metadata.json': {
                'format': 'json',
                'overwrite': True,
                'item_classes': [RaceItem],
            },
        }
    }
    process = CrawlerProcess(settings=settings)

    # Crawl results for the 2022 Leadville Trail 100 Run
    process.crawl(RaceSpider, 'https://www.athlinks.com/event/33913/results/Event/1018673/')
    process.start()


if __name__ == "__main__":
    main()
```
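
Once the crawl finishes, the two feeds are ordinary JSON files and can be inspected with nothing but the standard library. A minimal sketch (it assumes the demo above has already written `athletes.json` to the current directory):

```python
import json
from pathlib import Path


def load_feed(path: str) -> list:
    """Read a Scrapy JSON feed: a top-level list of item dicts."""
    return json.loads(Path(path).read_text())


feed = Path("athletes.json")
if feed.exists():
    athletes = load_feed(str(feed))
    print(f"{len(athletes)} athletes scraped")
```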

If you do a `pip install --editable ".[lint,dev]"`, then you can run:

```shell
athlinks_races_cli
```

Then you can build the wheel to install locally if needed:

```shell
python -m build .
```

### Command line

Alternatively, you may clone this repo for use like a typical Scrapy project
that you might create on your own.

#### Installation

```sh
python -m venv "$HOME/virtualenv/athlinks_races"
. "$HOME/virtualenv/athlinks_races/bin/activate"
git clone https://github.com/josevnz/athlinks-races
cd athlinks-races
pip install --editable ".[lint,dev]"
```

#### Example usage

Run a `RaceSpider` against a few races from different years:

```shell
cd athlinks_races
scrapy crawl race -a url=https://www.athlinks.com/event/33913/results/Event/1018673 -O $HOME/1018673.json
scrapy crawl race -a url=https://www.athlinks.com/event/382111/results/Event/1093108 -O $HOME/1093108.json
scrapy crawl race -a url=https://www.athlinks.com/event/382111/results/Event/1062909 -O $HOME/1062909.json
```
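
Each command above writes one feed file per event. If you want a single combined list of items, the files can be merged afterwards (a minimal stdlib sketch; adjust the paths to the feed files you actually produced):

```python
import json
from pathlib import Path


def merge_feeds(paths) -> list:
    """Concatenate the item lists from several Scrapy JSON feed files."""
    merged = []
    for p in paths:
        merged.extend(json.loads(Path(p).read_text()))
    return merged


# Example (uses the output paths from the crawls above):
# all_items = merge_feeds([f"{Path.home()}/1018673.json", f"{Path.home()}/1093108.json"])
```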

Or use the newer `athlinks_races_cli` with the `--tui` argument:

```shell
(athlinks-races) [josevnz@dmaf5 athlinks_races]$ athlinks_races_cli --tui
2025-01-12 14:56:31 [scrapy.middleware] INFO: Enabled item pipelines:
[]
2025-01-12 14:56:31 [scrapy.core.engine] INFO: Spider opened
2025-01-12 14:56:31 [scrapy.extensions.logstats] INFO: Crawled 0 pages (at 0 pages/min), scraped 0 items (at 0 items/min)
2025-01-12 14:56:31 [scrapy.extensions.telnet] INFO: Telnet console listening on 127.0.0.1:6023
2025-01-12 14:56:31 [asyncio] DEBUG: Using selector: EpollSelector

```

## Dependencies

All that is required is [Scrapy](https://scrapy.org/) and [Textual](https://github.com/Textualize/textual) (and its dependencies).

## Testing

```shell
. $HOME/virtualenv/athlink_races/bin/activate
pytest tests/*.py
```

Example session:

```shell
(athlinks_races) [josevnz@dmaf5 athlinks_races]$ pytest /home/josevnz/athlinks_races/tests/tests.py
============================================================= test session starts =============================================================
platform linux -- Python 3.11.6, pytest-8.3.3, pluggy-1.5.0
rootdir: /home/josevnz/athlinks_races
configfile: pyproject.toml
collected 6 items                                                                                                                             

tests/tests.py ......                                                                                                                   [100%]

============================================================== 6 passed in 0.33s ==============================================================

```

## License

This project is licensed under the MIT License. See the [LICENSE](LICENSE) file for details.

## Contact

You can get in touch with me here:

- GitHub: [https://github.com/josevnz](https://github.com/josevnz)

### Original Author

If you want to take a look at the original project, the original author's GitHub is below. He is not in charge of this forked version.

- GitHub: [github.com/aaron-schroeder](https://github.com/aaron-schroeder)
