seriesheatmap


| Field | Value |
|---|---|
| Name | seriesheatmap |
| Version | 0.0.4 |
| Summary | Scraper and heatmap plotter for episode ratings of series on IMDB |
| Upload time | 2022-12-21 22:31:27 |
| Requires Python | >=3.7 |
| License | MIT License Copyright (c) 2022 Florian Trautweiler Permission is hereby granted, free of charge, to any person obtaining a copy of this software and associated documentation files (the "Software"), to deal in the Software without restriction, including without limitation the rights to use, copy, modify, merge, publish, distribute, sublicense, and/or sell copies of the Software, and to permit persons to whom the Software is furnished to do so, subject to the following conditions: The above copyright notice and this permission notice shall be included in all copies or substantial portions of the Software. THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE SOFTWARE. |
| Keywords | series, heatmap, scraper |

# IMDB Series Rating Scraper

## Introduction

This tool scrapes https://www.imdb.com for the ratings of individual episodes of a series.
The ratings are cached in a CSV file.
Using matplotlib, the tool then generates a heatmap of all episodes in the series.
Because this tool relies on scraping the HTML tree of the IMDB page, it might break at any time.
Feel free to message me if the scraper no longer works, or create a pull request with adjusted XPaths.

| ![](examples/img/Dark.png)  |  ![](examples/img/Breaking_Bad.png)  |
|---------|-----|
| ![](examples/img/Game_Of_Thrones.png) | ![](examples/img/NCIS_Naval_Criminal_Investigative_Service.png) |

## Examples

### Data output

The following table shows data that is generated by the scraper for the first season of **Breaking Bad**.
For the full data output see `examples/data/Breaking Bad.csv`.

| season | episode | name                          | rating |
|--------|---------|-------------------------------|--------|
| 1      | 1       | Pilot                         | 9.0    |
| 1      | 2       | Cat's in the Bag...           | 8.6    |
| 1      | 3       | ...And the Bag's in the River | 8.7    |
| 1      | 4       | Cancer Man                    | 8.2    |
| 1      | 5       | Gray Matter                   | 8.3    |
| 1      | 6       | Crazy Handful of Nothin'      | 9.3    |
| 1      | 7       | A No-Rough-Stuff-Type Deal    | 8.8    |
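As a minimal sketch of how the cached ratings could be turned into heatmap input, the snippet below parses rows with the columns shown above and arranges them into a season-by-episode grid. The helper name `ratings_grid`, the comma-separated layout with a header row, and the inline sample are assumptions for illustration; the package itself may read and store its data differently.

```python
import csv
import io

# Sample rows using the column names from the table above.
# Assumption: the cached file is comma-separated with a header row.
SAMPLE_CSV = """season,episode,name,rating
1,1,Pilot,9.0
1,2,Cat's in the Bag...,8.6
1,3,...And the Bag's in the River,8.7
"""


def ratings_grid(csv_text):
    """Return one list of ratings per season, padded with None.

    Hypothetical helper, not part of the seriesheatmap package.
    """
    by_season = {}
    for row in csv.DictReader(io.StringIO(csv_text)):
        season, episode = int(row["season"]), int(row["episode"])
        by_season.setdefault(season, {})[episode] = float(row["rating"])

    if not by_season:
        return []

    # The widest season determines the number of columns.
    max_episode = max(max(episodes) for episodes in by_season.values())
    # Seasons become rows, episodes become columns; gaps stay None.
    return [
        [by_season[season].get(episode) for episode in range(1, max_episode + 1)]
        for season in sorted(by_season)
    ]


print(ratings_grid(SAMPLE_CSV))  # [[9.0, 8.6, 8.7]]
```

A grid of this shape is the kind of 2D input that a plotting call such as matplotlib's `imshow` expects, once the `None` gaps are converted to NaN.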

### Heatmap output

The following image shows an example of the heatmap that can be generated.
Heatmaps of some example series can be found under `examples/img/`. 

![](examples/img/Breaking_Bad.png)

## Quickstart

### Dependencies

- Python `3.9.13` (the package metadata declares `requires_python >=3.7`)
- Python packages: see `requirements.txt`

### Setup

1. Clone this repository

| **HTTPS**  | `$ git clone https://github.com/trflorian/imdb-scraper-heatmap.git` |
| ---|---|
| **SSH** |`$ git clone git@github.com:trflorian/imdb-scraper-heatmap.git` |

2. (Optional) Create a virtual environment for this project.
3. Install the required Python packages in your Python environment.

`$ python -m pip install -r requirements.txt`

4. Run `$ python scraper.py` to scrape the IMDB website for a specific series.
5. Run `$ python heatmap.py` to create a plot for the scraped series.

### Usage

``` 
$ python .\examples\heatmap.py --help

usage: heatmap.py [-h] [-s] [-d] [-o] [-n NAME]

optional arguments:
  -h, --help            show this help message and exit
  -s, --show            show the heatmap plot instead of saving it
  -d, --dark            use dark mode for the plot style
  -o, --override        override existing plots, only used if show flag is not set
  -n NAME, --name NAME  name of the series, if not set the whole data directory will be scanned
```
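
For readers who want to see how such a command line could be declared, here is a minimal `argparse` sketch that mirrors the flags and help strings in the output above. The function name `build_parser` and the overall structure are assumptions; this is not the actual source of `heatmap.py`.

```python
import argparse


def build_parser():
    """Build a parser mirroring the documented flags (hypothetical reconstruction)."""
    parser = argparse.ArgumentParser(prog="heatmap.py")
    parser.add_argument("-s", "--show", action="store_true",
                        help="show the heatmap plot instead of saving it")
    parser.add_argument("-d", "--dark", action="store_true",
                        help="use dark mode for the plot style")
    parser.add_argument("-o", "--override", action="store_true",
                        help="override existing plots, only used if show flag is not set")
    parser.add_argument("-n", "--name",
                        help="name of the series, if not set the whole data directory will be scanned")
    return parser


# Example: dark mode for a single series; unset flags default to False.
args = build_parser().parse_args(["--dark", "-n", "Breaking Bad"])
print(args.show, args.dark, args.override, args.name)  # False True False Breaking Bad
```

Because `--name` has no default, it is `None` when omitted, which a script can use as the signal to scan the whole data directory.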

## Development


### Upload to PyPI

```python -m build```

```python -m twine upload --skip-existing dist/*```

            

Raw data

            {
    "_id": null,
    "home_page": "",
    "name": "seriesheatmap",
    "maintainer": "",
    "docs_url": null,
    "requires_python": ">=3.7",
    "maintainer_email": "",
    "keywords": "series,heatmap,scraper",
    "author": "",
    "author_email": "Florian Trautweiler <florian.trautweiler@hispeed.ch>",
    "download_url": "https://files.pythonhosted.org/packages/95/3b/3bba16d94af7d186f54d99e4d4e880a4b3a4514b9f7a58e63897236b021d/seriesheatmap-0.0.4.tar.gz",
    "platform": null,
    "bugtrack_url": null,
    "license": "MIT License  Copyright (c) 2022 Florian Trautweiler  Permission is hereby granted, free of charge, to any person obtaining a copy of this software and associated documentation files (the \"Software\"), to deal in the Software without restriction, including without limitation the rights to use, copy, modify, merge, publish, distribute, sublicense, and/or sell copies of the Software, and to permit persons to whom the Software is furnished to do so, subject to the following conditions:  The above copyright notice and this permission notice shall be included in all copies or substantial portions of the Software.  THE SOFTWARE IS PROVIDED \"AS IS\", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE SOFTWARE. ",
    "summary": "Scraper and heatmap plotter for episode ratings of series on IMDB",
    "version": "0.0.4",
    "split_keywords": [
        "series",
        "heatmap",
        "scraper"
    ],
    "urls": [
        {
            "comment_text": "",
            "digests": {
                "md5": "4c13d083828806d1e636ec7554dc5017",
                "sha256": "56ba6e9cb69de33e7aaddf98d664c2a028f9031b08a430502a35be6fd8330bec"
            },
            "downloads": -1,
            "filename": "seriesheatmap-0.0.4-py3-none-any.whl",
            "has_sig": false,
            "md5_digest": "4c13d083828806d1e636ec7554dc5017",
            "packagetype": "bdist_wheel",
            "python_version": "py3",
            "requires_python": ">=3.7",
            "size": 8726,
            "upload_time": "2022-12-21T22:31:25",
            "upload_time_iso_8601": "2022-12-21T22:31:25.422258Z",
            "url": "https://files.pythonhosted.org/packages/a5/e9/be899d080474802df78daa7230ded871ed0f2d42caa06dc2767415461f6c/seriesheatmap-0.0.4-py3-none-any.whl",
            "yanked": false,
            "yanked_reason": null
        },
        {
            "comment_text": "",
            "digests": {
                "md5": "7ba4d9ed5d39c1d4610a7e1086821dd9",
                "sha256": "6da2a0ab81c5130bb6b6949dd487c764b3fb01fa7a6a95dde4ad0e158150315d"
            },
            "downloads": -1,
            "filename": "seriesheatmap-0.0.4.tar.gz",
            "has_sig": false,
            "md5_digest": "7ba4d9ed5d39c1d4610a7e1086821dd9",
            "packagetype": "sdist",
            "python_version": "source",
            "requires_python": ">=3.7",
            "size": 1297037,
            "upload_time": "2022-12-21T22:31:27",
            "upload_time_iso_8601": "2022-12-21T22:31:27.337679Z",
            "url": "https://files.pythonhosted.org/packages/95/3b/3bba16d94af7d186f54d99e4d4e880a4b3a4514b9f7a58e63897236b021d/seriesheatmap-0.0.4.tar.gz",
            "yanked": false,
            "yanked_reason": null
        }
    ],
    "upload_time": "2022-12-21 22:31:27",
    "github": false,
    "gitlab": false,
    "bitbucket": false,
    "lcname": "seriesheatmap"
}
        