rottentomatoes-python


- Name: rottentomatoes-python
- Version: 1.2.0
- Home page: https://github.com/preritdas/rottentomatoes-python
- Summary: Scrape Rotten Tomatoes's website for basic information on movies, without the use of their hard-to-attain official REST API.
- Upload time: 2024-09-09 17:59:08
- Maintainer: None
- Docs URL: None
- Author: Prerit Das
- Requires Python: None
- License: None
- Keywords: python, movies, rottentomatoes
- Requirements: No requirements were recorded.
- Travis-CI: No Travis.
- Coveralls test coverage: No coveralls.

![tests](https://github.com/preritdas/rottentomatoes-python/actions/workflows/pytest.yml/badge.svg)
![pypi](https://github.com/preritdas/rottentomatoes-python/actions/workflows/python-publish.yml/badge.svg)
[![PyPI version](https://badge.fury.io/py/rottentomatoes-python.svg)](https://badge.fury.io/py/rottentomatoes-python)
![PyPI - Downloads](https://img.shields.io/pypi/dm/rottentomatoes-python)
![versions](https://img.shields.io/badge/python-3.8%20%7C%203.9%20%7C%203.10%20%7C%203.11-blue)


# :movie_camera: Rotten Tomatoes in Python (and API) :clapper:

> **Note**
> If this library stops working at any point in your project and the standalone functions or the `Movie` class start returning errors, first try updating it with `pip install -U rottentomatoes-python`. If it still doesn't work, submit an issue on this repo. 99% of the time it'll "stop working" because the Rotten Tomatoes site schema has changed, which means the scraping and extraction code under the hood needs updating. Tests run on this repo automatically once a day, so breaking changes to the Rotten Tomatoes site should be caught by me or a maintainer pretty quickly.

This package lets you easily fetch Rotten Tomatoes scores and other movie data, such as genres, without using the official Rotten Tomatoes API. It scrapes their website for the data instead. I built this because, unfortunately, getting access to their API requires submitting a special request, which takes an inordinate amount of time to process or doesn't go through at all.

The package now, by default, scrapes the Rotten Tomatoes search page to find the true URL of the first valid movie result (one that is a movie and has a Tomatometer). This means queries that previously didn't work because their URLs had a unique identifier or a year-released prefix now work. The limitation of this mechanism is that you only get the top result, and when you search for a specific movie (a sequel, a particular release year, etc.), Rotten Tomatoes seems to return the same results as for the original query. So it's difficult to craft a specific query that puts the desired movie at the top. See #4 for more info on this.
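
To make the "first valid movie response" rule concrete, here is a minimal sketch of that kind of filter. The result dictionaries, their keys (`is_movie`, `tomatometer`, `url`), and the function name are hypothetical, and the URLs are placeholders; this is not the library's internal data model.

```python
from typing import Dict, List, Optional


def first_valid_url(results: List[Dict]) -> Optional[str]:
    """Return the URL of the first result that is a movie with a Tomatometer."""
    for result in results:
        # Skip non-movie results and movies that have no score yet.
        if result.get("is_movie") and result.get("tomatometer") is not None:
            return result["url"]
    return None


# Made-up search results, in the order a search page might list them.
sample_results = [
    {"is_movie": False, "tomatometer": 90, "url": "https://example.com/tv/a"},
    {"is_movie": True, "tomatometer": None, "url": "https://example.com/m/b"},
    {"is_movie": True, "tomatometer": 58, "url": "https://example.com/m/c"},
]
print(first_valid_url(sample_results))  # → https://example.com/m/c
```

Because only the first match is returned, the order in which the search page lists results decides which movie you get, which is the limitation described above.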

There is now an API deployed to make querying multiple movies and getting several results easier. The base URL is https://rotten-tomatoes-api.ue.r.appspot.com and it's open and free to use. Visit `/docs` or `/redoc` in the browser to view the endpoints. Both endpoints that are currently live are browser accessible, meaning you don't need an HTTP client to use the API.

- https://rotten-tomatoes-api.ue.r.appspot.com/movie/bad_boys for a JSON response of the top result
- https://rotten-tomatoes-api.ue.r.appspot.com/search/bad_boys for a JSON response of all valid results


## Usage

You can either call the standalone functions `tomatometer`, `audience_score`, `genres`, etc., or use the `Movie` class, which takes only the movie name and fetches each attribute automatically. With the `Movie` class, you can print all attributes at once by printing the object itself, or access each attribute individually.

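The weighted score is calculated using the formula $\frac{2}{3}(tomatometer) + \frac{1}{3}(audience)$, and the result is returned as an integer. The sample outputs in this README (for example, a Tomatometer of 76 and an audience score of 96 giving 82) are consistent with dropping the fractional part rather than rounding to the nearest integer.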
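
As a quick worked check of the formula, here is a minimal sketch. The function name `weighted_score_sketch` is made up for illustration and is not the library's implementation. It uses integer floor division, an assumption chosen because it reproduces the sample outputs in this README (58 and 83 give 66; 76 and 96 give 82, where rounding to the nearest integer would give 83).

```python
def weighted_score_sketch(tomatometer: int, audience: int) -> int:
    """Compute 2/3 * tomatometer + 1/3 * audience, dropping the fraction."""
    # (2t + a) / 3 is the same weighted average; // keeps the arithmetic
    # in exact integers and discards the remainder.
    return (2 * tomatometer + audience) // 3


print(weighted_score_sketch(58, 83))  # Top Gun sample values → 66
print(weighted_score_sketch(76, 96))  # Bad Boys for Life sample values → 82
```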

Basic usage examples:

```python
import rottentomatoes as rt

print(rt.tomatometer("happy gilmore"))
# Output: 61
# Type: int

print(rt.audience_score('top gun maverick'))
# Output: 99
# Type: int

print(rt.rating('everything everywhere all at once'))
# Output: R
# Type: str

print(rt.genres('top gun'))
# Output: ['Action', 'Adventure']
# Type: list[str]

print(rt.weighted_score('happy gilmore'))
# Output: 69
# Type: int

print(rt.year_released('happy gilmore'))
# Output: 1996
# Type: str

print(rt.actors('top gun maverick', max_actors=5))
# Output: ['Tom Cruise', 'Miles Teller', 'Jennifer Connelly', 'Jon Hamm', 'Glen Powell']
# Type: list[str]

# --- Using the Movie class ---
movie = rt.Movie('top gun')
print(movie)
# Output
    # Top Gun, PG, 1h 49m.
    # Released in 1986.
    # Tomatometer: 58
    # Weighted score: 66
    # Audience Score: 83
    # Genres - ['Action', 'Adventure']
    # Prominent actors: Tom Cruise, Kelly McGillis, Anthony Edwards, Val Kilmer, Tom Skerritt.
# Type: str

print(movie.weighted_score)
# Output: 66
# Type: int

print(movie.url)
# Output: https://www.rottentomatoes.com/m/top_gun
# Type: str
```

## Exceptions

If you're using this package within a larger program, it's useful to know what exceptions are raised (and when) so they can be caught and handled.

### `LookupError`

When _any_ call is made to scrape the Rotten Tomatoes website (Tomatometer, Audience Score, Genres, etc.), a `LookupError` is raised if a proper movie page wasn't returned, for example because of a typo in the name or duplicate movie names. The error reports the attempted query URL.
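
One way to handle this in a larger program is to wrap each lookup and fall back to `None`. The sketch below is an assumption-laden illustration: `lookup_or_none` and `fake_tomatometer` are hypothetical names, and the fake function stands in for something like `rt.tomatometer` so the example runs without network access.

```python
from typing import Callable, Optional, TypeVar

T = TypeVar("T")


def lookup_or_none(fetch: Callable[[str], T], movie_name: str) -> Optional[T]:
    """Call a lookup function, returning None if no movie page is found."""
    try:
        return fetch(movie_name)
    except LookupError as exc:
        # Report the failure and let the caller decide what to do next.
        print(f"No result for {movie_name!r}: {exc}")
        return None


# Hypothetical stand-in for a scraping function; raises like the real one would.
def fake_tomatometer(movie_name: str) -> int:
    known = {"happy gilmore": 61}
    if movie_name not in known:
        raise LookupError(f"Unable to find {movie_name}")
    return known[movie_name]


print(lookup_or_none(fake_tomatometer, "happy gilmore"))  # → 61
print(lookup_or_none(fake_tomatometer, "hapy gilmor"))    # prints a notice, then None
```

With the package installed, the same wrapper could be called as `lookup_or_none(rt.tomatometer, "happy gilmore")`.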


## Performance

`v0.3.0` makes the `Movie` class roughly 19x faster (about 6.5 seconds versus 0.34 seconds in the test below). Data obtained from scraping Rotten Tomatoes is temporarily cached and reused to parse the other attributes. To test the performance difference, I used two separate virtual environments, `old` and `venv`. `rottentomatoes-python==0.2.5` was installed in `old`, and `rottentomatoes-python==0.3.0` was installed in `venv`. I then ran the same script, shown below, in each environment (Python 3.10.4).

```python
import rottentomatoes as rt
from time import perf_counter


def test() -> None:
    start = perf_counter()
    movie = rt.Movie('top gun maverick')
    print('\n', movie, sep='')
    print(f"That took {perf_counter() - start} seconds.")


if __name__ == "__main__":
    test()
```

The results:

```console
❯ deactivate && source old/bin/activate && python test.py

Top Gun Maverick, PG-13, 2h 11m.
Released in 2022.
Tomatometer: 97
Weighted score: 97
Audience Score: 99
Genres - ['Action', 'Adventure']

That took 6.506246249999094 seconds.
❯ deactivate && source venv/bin/activate && python test.py

Top Gun Maverick, PG-13, 2h 11m.
Released in 2022.
Tomatometer: 97
Weighted score: 97
Audience Score: 99
Genres - ['Action', 'Adventure']
Prominent actors: Tom Cruise, Miles Teller, Jennifer Connelly, Jon Hamm, Glen Powell.

That took 0.3400420409961953 seconds.
```
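
The caching idea described at the top of this section can be illustrated with a small, self-contained sketch. Everything here is hypothetical (the `CachedMovie` class, the `fake_fetch` stand-in, and the toy `key=value` page format); it is not the library's code, only one way to fetch a page once and parse it many times.

```python
from functools import cached_property
from typing import Callable, List


class CachedMovie:
    """Hypothetical illustration: fetch the page once, parse it many times."""

    def __init__(self, name: str, fetch: Callable[[str], str]) -> None:
        self.name = name
        self._fetch = fetch

    @cached_property
    def content(self) -> str:
        # Runs on first access only; later accesses reuse the stored value.
        return self._fetch(self.name)

    def _field(self, key: str) -> str:
        # Toy "parser" for a made-up "key=value;key=value" page format.
        pairs = dict(item.split("=") for item in self.content.split(";"))
        return pairs[key]

    @property
    def tomatometer(self) -> int:
        return int(self._field("tomatometer"))

    @property
    def audience_score(self) -> int:
        return int(self._field("audience"))


calls: List[str] = []


def fake_fetch(name: str) -> str:
    # Stand-in for a slow network request; records each call.
    calls.append(name)
    return "tomatometer=97;audience=99"


movie = CachedMovie("top gun maverick", fake_fetch)
print(movie.tomatometer, movie.audience_score, len(calls))  # → 97 99 1
```

Both attributes are read, but the fetch runs only once, which is where the time savings come from when each fetch is a real HTTP request.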

## API

The API is deployed at https://rotten-tomatoes-api.ue.r.appspot.com/. It currently has two endpoints, `/movie/{movie_name}` and `/search/{movie_name}`. The first returns one movie, the top result. The second returns a list of _all_ valid movie results.

The first, with `movie_name="bad boys"`:

```json
{
  "name": "Bad Boys for Life",
  "tomatometer": 76,
  "audience_score": 96,
  "weighted_score": 82,
  "genres": [
    "Action",
    "Comedy"
  ],
  "rating": "R",
  "duration": "2h 4m",
  "year": "2020",
  "actors": [
    "Will Smith",
    "Martin Lawrence",
    "Vanessa Hudgens",
    "Jacob Scipio",
    "Alexander Ludwig"
  ],
  "directors": [
    "Adil El Arbi",
    "Bilall Fallah"
  ]
}
```

The second, with `movie_name="bad boys"`:

```json
{
  "movies": [
    {
      "name": "Bad Boys for Life",
      "tomatometer": 76,
      "audience_score": 96,
      "weighted_score": 82,
      "genres": [
        "Action",
        "Comedy"
      ],
      "rating": "R",
      "duration": "2h 4m",
      "year": "2020",
      "actors": [
        "Will Smith",
        "Martin Lawrence",
        "Vanessa Hudgens",
        "Jacob Scipio",
        "Alexander Ludwig"
      ],
      "directors": [
        "Adil El Arbi",
        "Bilall Fallah"
      ]
    },
    {
      "name": "Bad Boys II",
      "tomatometer": 23,
      "audience_score": 78,
      "weighted_score": 41,
      "genres": [
        "Action",
        "Comedy"
      ],
      "rating": "R",
      "duration": "2h 26m",
      "year": "2003",
      "actors": [
        "Martin Lawrence",
        "Will Smith",
        "Jordi Mollà",
        "Gabrielle Union",
        "Peter Stormare"
      ],
      "directors": [
        "Michael Bay"
      ]
    },
    {
      "name": "Bad Boys",
      "tomatometer": 43,
      "audience_score": 78,
      "weighted_score": 54,
      "genres": [
        "Action",
        "Comedy"
      ],
      "rating": "R",
      "duration": "1h 58m",
      "year": "1995",
      "actors": [
        "Martin Lawrence",
        "Will Smith",
        "Téa Leoni",
        "Tchéky Karyo",
        "Theresa Randle"
      ],
      "directors": [
        "Michael Bay"
      ]
    },
    {
      "name": "Bad Boys",
      "tomatometer": 90,
      "audience_score": 82,
      "weighted_score": 87,
      "genres": [
        "Drama"
      ],
      "rating": "R",
      "duration": "2h 3m",
      "year": "1983",
      "actors": [
        "Sean Penn",
        "Reni Santoni",
        "Esai Morales",
        "Jim Moody",
        "Ally Sheedy"
      ],
      "directors": [
        "Rick Rosenthal 2"
      ]
    }
  ]
}
```
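
For use from a script rather than the browser, here is a minimal sketch using only the standard library. The helper names (`endpoint_url`, `fetch_json`, `pick_by_year`) are made up for this example. Replacing spaces with underscores mirrors the `bad_boys` links near the top of this README, but whether the API needs that is an assumption.

```python
import json
from typing import Optional
from urllib.parse import quote
from urllib.request import urlopen

BASE_URL = "https://rotten-tomatoes-api.ue.r.appspot.com"


def endpoint_url(kind: str, movie_name: str) -> str:
    """Build a /movie/ or /search/ URL for the given movie name."""
    if kind not in ("movie", "search"):
        raise ValueError(f"unknown endpoint: {kind}")
    # Assumption: underscores in place of spaces, as in the README's links.
    slug = quote(movie_name.strip().replace(" ", "_"))
    return f"{BASE_URL}/{kind}/{slug}"


def fetch_json(url: str, timeout: float = 10.0) -> dict:
    """GET a URL and decode its JSON body (performs a network request)."""
    with urlopen(url, timeout=timeout) as response:
        return json.load(response)


def pick_by_year(search_payload: dict, year: str) -> Optional[dict]:
    """Return the first movie in a /search/ response released in `year`."""
    for movie in search_payload["movies"]:
        if movie["year"] == year:  # the API returns years as strings
            return movie
    return None


print(endpoint_url("search", "bad boys"))
# → https://rotten-tomatoes-api.ue.r.appspot.com/search/bad_boys
```

Calling `pick_by_year(fetch_json(endpoint_url("search", "bad boys")), "1995")` would then select the 1995 film from a response shaped like the one above, which is one way around the top-result-only limitation of `/movie/`.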

            

Raw data

            {
    "_id": null,
    "home_page": "https://github.com/preritdas/rottentomatoes-python",
    "name": "rottentomatoes-python",
    "maintainer": null,
    "docs_url": null,
    "requires_python": null,
    "maintainer_email": null,
    "keywords": "python, movies, rottentomatoes",
    "author": "Prerit Das",
    "author_email": "<preritdas@gmail.com>",
    "download_url": "https://files.pythonhosted.org/packages/55/1d/365608f71ae7bd3c53216ad91038e030826757443d49d8546ef03ab0524a/rottentomatoes_python-1.2.0.tar.gz",
    "platform": null,
    "bugtrack_url": null,
    "license": null,
    "summary": "Scrape Rotten Tomatoes's website for basic information on movies, without the use of their hard-to-attain official REST API.",
    "version": "1.2.0",
    "project_urls": {
        "Homepage": "https://github.com/preritdas/rottentomatoes-python"
    },
    "split_keywords": [
        "python",
        " movies",
        " rottentomatoes"
    ],
    "urls": [
        {
            "comment_text": "",
            "digests": {
                "blake2b_256": "3cc4b0ccefaeaf3fb7c52102daa0ce486d6064088913a12a786c14cdb754081f",
                "md5": "38ec9edd7d1fe562af0b25a03fe21050",
                "sha256": "6fa1ac223c10a5e5f6a7a6b64fa62e8926486ce26cdefa2d1c0d2078d1021da2"
            },
            "downloads": -1,
            "filename": "rottentomatoes_python-1.2.0-py3-none-any.whl",
            "has_sig": false,
            "md5_digest": "38ec9edd7d1fe562af0b25a03fe21050",
            "packagetype": "bdist_wheel",
            "python_version": "py3",
            "requires_python": null,
            "size": 16020,
            "upload_time": "2024-09-09T17:59:06",
            "upload_time_iso_8601": "2024-09-09T17:59:06.926573Z",
            "url": "https://files.pythonhosted.org/packages/3c/c4/b0ccefaeaf3fb7c52102daa0ce486d6064088913a12a786c14cdb754081f/rottentomatoes_python-1.2.0-py3-none-any.whl",
            "yanked": false,
            "yanked_reason": null
        },
        {
            "comment_text": "",
            "digests": {
                "blake2b_256": "551d365608f71ae7bd3c53216ad91038e030826757443d49d8546ef03ab0524a",
                "md5": "4bea4e8f3f24564bf7d50ed84521d0a8",
                "sha256": "1f43df9ae807a97da81ccf3371da2ac37b35f40a43355ac2c94afda7c695458f"
            },
            "downloads": -1,
            "filename": "rottentomatoes_python-1.2.0.tar.gz",
            "has_sig": false,
            "md5_digest": "4bea4e8f3f24564bf7d50ed84521d0a8",
            "packagetype": "sdist",
            "python_version": "source",
            "requires_python": null,
            "size": 16444,
            "upload_time": "2024-09-09T17:59:08",
            "upload_time_iso_8601": "2024-09-09T17:59:08.614283Z",
            "url": "https://files.pythonhosted.org/packages/55/1d/365608f71ae7bd3c53216ad91038e030826757443d49d8546ef03ab0524a/rottentomatoes_python-1.2.0.tar.gz",
            "yanked": false,
            "yanked_reason": null
        }
    ],
    "upload_time": "2024-09-09 17:59:08",
    "github": true,
    "gitlab": false,
    "bitbucket": false,
    "codeberg": false,
    "github_user": "preritdas",
    "github_project": "rottentomatoes-python",
    "travis_ci": false,
    "coveralls": false,
    "github_actions": true,
    "requirements": [],
    "lcname": "rottentomatoes-python"
}
        