board-game-scraper


Name: board-game-scraper
Version: 2.22.0
Home page: https://recommend.games/
Summary: Board games data scraping and processing from BoardGameGeek and more!
Upload time: 2024-02-11 12:43:15
Author: Markus Shepherd
Requires Python: >=3.7.0
License: MIT
Keywords: board games, tabletop games, data, datasets, scraper, scrapy, spider, boardgamegeek, bgg, ludoj, ludoj-scraper
            
# 🎲 Board Game Scraper 🕸

Scraping data about board games from the web. View the data live at
[Recommend.Games](https://recommend.games/)! Install via

```bash
pip install board-game-scraper
```

## Sources

* [BoardGameGeek](https://boardgamegeek.com/) (`bgg`)
* [DBpedia](https://wiki.dbpedia.org/) (`dbpedia`)
* [Luding.org](https://luding.org/) (`luding`)
* [Spielen.de](https://gesellschaftsspiele.spielen.de/) (`spielen`)
* [Wikidata](https://www.wikidata.org/) (`wikidata`)

## Run scrapers

[Requires Python 3](https://pythonclock.org/) (version 3.7 or later). Make sure
[Pipenv](https://docs.pipenv.org/) is installed, then create and activate the
virtual environment:

```bash
python3 -m pip install --upgrade pipenv
pipenv install --dev
pipenv shell
```

Run a spider like so:

```bash
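# JOBDIR makes Scrapy persist the crawl state so the run can be resumed;
# `date -u` prints UTC and works with both GNU and BSD `date`.
JOBDIR="jobs/${SPIDER}/$(date -u +'%Y-%m-%dT%H-%M-%S')"
scrapy crawl "${SPIDER}" \
    --output 'feeds/%(name)s/%(time)s/%(class)s.csv' \
    --set "JOBDIR=${JOBDIR}"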
```

where `$SPIDER` is one of the source IDs listed above, such as `bgg`.
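
The same invocation can be assembled from Python, for instance when scheduling crawls from a script. This is only a minimal sketch: `build_crawl_command` and `SPIDER_IDS` are hypothetical names that are not part of this package, and the argument list simply mirrors the shell snippet above.

```python
import shlex
from datetime import datetime, timezone

# Spider IDs as listed in the Sources section of this README.
SPIDER_IDS = ("bgg", "dbpedia", "luding", "spielen", "wikidata")


def build_crawl_command(spider, now=None):
    """Return the argument list for one `scrapy crawl` run of `spider`.

    Hypothetical helper: it only mirrors the shell snippet above.
    """
    if spider not in SPIDER_IDS:
        raise ValueError(f"unknown spider ID: {spider!r}")
    now = now or datetime.now(timezone.utc)
    # Same timestamp layout as the shell snippet, e.g. 2024-02-11T12-43-15.
    job_dir = f"jobs/{spider}/{now.strftime('%Y-%m-%dT%H-%M-%S')}"
    return [
        "scrapy",
        "crawl",
        spider,
        "--output",
        "feeds/%(name)s/%(time)s/%(class)s.csv",
        "--set",
        f"JOBDIR={job_dir}",
    ]


if __name__ == "__main__":
    # Print the command instead of running it, so it can be inspected first.
    print(" ".join(shlex.quote(arg) for arg in build_crawl_command("bgg")))
```

Passing the resulting list to `subprocess.run` from within the Pipenv environment would start the crawl; returning a list rather than a string avoids shell quoting issues with the `%(name)s` placeholders.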

Run all the spiders with the [`run_scrapers.sh`](run_scrapers.sh) script. Get a
list of the running scrapers' PIDs with the [`processes.sh`](processes.sh)
script. You can stop all the running scrapers via

```bash
./processes.sh stop
```

and resume them later.

## Tests

You can run `scrapy check` to perform contract tests for all spiders, or
`scrapy check $SPIDER` to test one particular spider. If a test fails,
the website has most likely changed and the spider needs updating.
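
To see which spiders need attention, the per-spider check can be looped over all source IDs. The sketch below assumes that `scrapy check` exits with a non-zero status when a contract fails; `failing_spiders` and `SPIDER_IDS` are hypothetical names, not part of this package.

```python
import subprocess

# Spider IDs as listed in the Sources section of this README.
SPIDER_IDS = ("bgg", "dbpedia", "luding", "spielen", "wikidata")


def failing_spiders(spiders=SPIDER_IDS, run=subprocess.call):
    """Return the spider IDs whose `scrapy check` run exits non-zero.

    `run` takes an argument list and returns an exit status; it is
    injectable so the function can be exercised without Scrapy installed.
    """
    return [spider for spider in spiders if run(["scrapy", "check", spider]) != 0]
```

Calling `failing_spiders()` inside the Pipenv environment runs the checks one spider at a time, so a failure in one source does not hide the results for the others.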

## Board game datasets

If you are interested in using any of the datasets produced by this scraper,
take a look at the
[BoardGameGeek guild](https://boardgamegeek.com/thread/2287371/boardgamegeek-games-and-ratings-datasets).
A subset of the data can also be found on [Kaggle](https://www.kaggle.com/mshepherd/board-games).

## Links

* [board-game-scraper](https://gitlab.com/recommend.games/board-game-scraper):
 This repository
* [Recommend.Games](https://recommend.games/): board game recommender using the
 scraped data
* [recommend-games-server](https://gitlab.com/recommend.games/recommend-games-server):
 Server code for [Recommend.Games](https://recommend.games/)
* [board-game-recommender](https://gitlab.com/recommend.games/board-game-recommender):
 Recommender code for [Recommend.Games](https://recommend.games/)

            

Raw data

            {
    "_id": null,
    "home_page": "https://recommend.games/",
    "name": "board-game-scraper",
    "maintainer": "",
    "docs_url": null,
    "requires_python": ">=3.7.0",
    "maintainer_email": "",
    "keywords": "board games,tabletop games,data,datasets,scraper,scrapy,spider,boardgamegeek,bgg,ludoj,ludoj-scraper",
    "author": "Markus Shepherd",
    "author_email": "markus@recommend.games",
    "download_url": "https://files.pythonhosted.org/packages/c4/be/9ec76b6d9c08c37aa3db3f9837f944f35a4e007fec669c1055a449706db7/board-game-scraper-2.22.0.tar.gz",
    "platform": null,
    "bugtrack_url": null,
    "license": "MIT",
    "summary": "Board games data scraping and processing from BoardGameGeek and more!",
    "version": "2.22.0",
    "project_urls": {
        "Documentation": "https://gitlab.com/recommend.games/board-game-scraper/blob/master/README.md",
        "Funding": "https://paypal.me/mschepke",
        "Homepage": "https://recommend.games/",
        "Say Thanks!": "https://saythanks.io/to/mk.schepke%40gmail.com",
        "Source": "https://gitlab.com/recommend.games/board-game-scraper",
        "Tracker": "https://gitlab.com/recommend.games/board-game-scraper/issues",
        "Twitter": "https://twitter.com/recommend_games"
    },
    "split_keywords": [
        "board games",
        "tabletop games",
        "data",
        "datasets",
        "scraper",
        "scrapy",
        "spider",
        "boardgamegeek",
        "bgg",
        "ludoj",
        "ludoj-scraper"
    ],
    "urls": [
        {
            "comment_text": "",
            "digests": {
                "blake2b_256": "cc48cbb26c02d404bf6a193a6386d8ca3a323b01d6926b92a7f3695b012deb14",
                "md5": "169f751dc252547430b37ccbf432058a",
                "sha256": "ac53dc7732d16eb99bea28abd5ed30679f2635b1833b9d7f9d23fdf3c7dd7f4b"
            },
            "downloads": -1,
            "filename": "board_game_scraper-2.22.0-py2.py3-none-any.whl",
            "has_sig": false,
            "md5_digest": "169f751dc252547430b37ccbf432058a",
            "packagetype": "bdist_wheel",
            "python_version": "py2.py3",
            "requires_python": ">=3.7.0",
            "size": 73792,
            "upload_time": "2024-02-11T12:43:13",
            "upload_time_iso_8601": "2024-02-11T12:43:13.489411Z",
            "url": "https://files.pythonhosted.org/packages/cc/48/cbb26c02d404bf6a193a6386d8ca3a323b01d6926b92a7f3695b012deb14/board_game_scraper-2.22.0-py2.py3-none-any.whl",
            "yanked": false,
            "yanked_reason": null
        },
        {
            "comment_text": "",
            "digests": {
                "blake2b_256": "c4be9ec76b6d9c08c37aa3db3f9837f944f35a4e007fec669c1055a449706db7",
                "md5": "e9fa3ff0bfd66e4ec2b7de4b69dcb8cd",
                "sha256": "fde27badb99b5f4699a2ce6da776b8511ea25d4242e3d6ac625f2f13e778b691"
            },
            "downloads": -1,
            "filename": "board-game-scraper-2.22.0.tar.gz",
            "has_sig": false,
            "md5_digest": "e9fa3ff0bfd66e4ec2b7de4b69dcb8cd",
            "packagetype": "sdist",
            "python_version": "source",
            "requires_python": ">=3.7.0",
            "size": 57731,
            "upload_time": "2024-02-11T12:43:15",
            "upload_time_iso_8601": "2024-02-11T12:43:15.752463Z",
            "url": "https://files.pythonhosted.org/packages/c4/be/9ec76b6d9c08c37aa3db3f9837f944f35a4e007fec669c1055a449706db7/board-game-scraper-2.22.0.tar.gz",
            "yanked": false,
            "yanked_reason": null
        }
    ],
    "upload_time": "2024-02-11 12:43:15",
    "github": false,
    "gitlab": true,
    "bitbucket": false,
    "codeberg": false,
    "gitlab_user": "recommend.games",
    "gitlab_project": "board-game-scraper",
    "lcname": "board-game-scraper"
}
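
The `digests` entries above can be used to check a downloaded file before installing it. The following is a minimal sketch using only the standard library; `sha256_of` and `matches_digest` are hypothetical helper names, and the file path is whatever location the archive was saved to.

```python
import hashlib


def sha256_of(path, chunk_size=1 << 16):
    """Return the hex SHA-256 digest of the file at `path`, read in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        # iter() with a sentinel stops once read() returns empty bytes.
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_digest(path, expected_sha256):
    """Compare a file's SHA-256 digest with the value from the metadata."""
    return sha256_of(path) == expected_sha256.strip().lower()
```

For example, `matches_digest("board-game-scraper-2.22.0.tar.gz", expected)` should return `True` when `expected` is the `sha256` value listed for the `sdist` entry and the download is intact.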
        