google-flights-scraper

- Name: google-flights-scraper
- Version: 0.1.0
- Home page: https://github.com/yourusername/google_flights_scraper
- Summary: A Python package to scrape flight data from Google Flights.
- Upload time: 2024-04-21 12:06:22
- Author: Hugo Gonçalves
- Requires Python: >=3.8
- Keywords: google flights scraper, google flights, flights scraper, flights

# google-flights-scraping

This Python module uses Playwright to scrape flight data from Google Flights. It automates the process of searching for flights based on user input parameters such as origin, destination, departure date, and number of passengers. The scraped data includes detailed flight information, such as prices, dates, companies, duration, stops, emissions, and more.
The current script was inspired by [Arthur Chukhrai's article on scraping Google Flights with Python](https://dev.to/chukhraiartur/scrape-google-flights-with-python-4dln).

## Features

- **Automated Scraping**: Automates the retrieval of flight information from Google Flights.
- **Customizable Search**: Allows specifying various parameters like departure and destination cities, dates, and number of passengers.
- **Detailed Flight Data**: Retrieves comprehensive details about each flight option available.
- **Production Ready**: Documents the settings to adjust for faster execution in production (see "Configuration for production").
- **Command-Line Interface**: Provides a CLI tool for easy interaction with the scraper.
- **Configurable Options**: Supports various options like verbose output, headless mode, and pretty printing of JSON output.

## Prerequisites

Before you begin using this module, ensure you have the following installed:

- Python 3.8 or higher
- Playwright
- Selectolax
- Click

## Installation

1. Clone the repository to your local machine:

```bash
git clone https://github.com/kurouge/google-flights-scraping.git
cd google-flights-scraping
```

2. Install the required dependencies:

```bash
pip install -r requirements.txt
```

## Usage

### Module Usage

To use the module, you need to provide the parameters for your flight search. Here is an example of how to run the script:

```python
import json

from google_flights import GoogleFlights

# Set your flight search parameters
origin = 'New York'
destination = 'London'
departure_date = '2024-09-15'
passengers = 1

# Launch the scraper without a visible browser window and run the search
scraper = GoogleFlights(headless=True)
results = scraper.search(origin, destination, departure_date, passengers)

# Serialize the results as indented JSON and print them
results_json = json.dumps(results, indent=4)
print(results_json)
```

### CLI Usage

The module also includes a command-line interface (CLI) for interacting with the Google Flights scraper without directly using Python scripts. This can be especially useful for automating tasks or integrating the scraper into larger workflows.

### Installation

Ensure the CLI script is executable:

```bash
chmod +x /path/to/google-flights-scraping/bin/cli_script.py
```

### Available Options

You can configure the following options via the command line:

- `--origin, -o`: Set the departure city (required).
- `--destination, -a`: Set the destination city (required).
- `--departure-date, -dd`: Set the departure date in DD-MM-YYYY format. Defaults to three days from the current date.
- `--passengers, -p`: Specify the number of passengers. Defaults to 1.
- `--verbose, -v`: Enable verbose output for more detailed logs.
- `--headless, -hl`: Run the browser in headless mode (no visible window), for example on a server without a display.
- `--pretty, -pr`: Enable pretty printing of the output JSON.
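
The date handling described for `--departure-date` can be sketched with the standard library alone. This is a minimal illustration, not the package's actual code: the helper names `default_departure_date` and `parse_departure_date` are hypothetical, and only the DD-MM-YYYY format and the three-day default come from the option list above.

```python
from datetime import date, datetime, timedelta
from typing import Optional

# DD-MM-YYYY, the format documented for --departure-date
CLI_DATE_FORMAT = "%d-%m-%Y"


def default_departure_date(today: Optional[date] = None) -> str:
    """Return the date three days after `today`, formatted as DD-MM-YYYY."""
    today = today or date.today()
    return (today + timedelta(days=3)).strftime(CLI_DATE_FORMAT)


def parse_departure_date(value: str) -> date:
    """Parse a DD-MM-YYYY string; raises ValueError for any other format."""
    return datetime.strptime(value, CLI_DATE_FORMAT).date()


print(default_departure_date(date(2024, 9, 12)))       # 15-09-2024
print(parse_departure_date("15-09-2024").isoformat())  # 2024-09-15
```

Parsing the value up front means a date given in another layout, such as `2024-09-15`, fails with a `ValueError` before any browser is started.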

### Running the CLI

To run the CLI tool, use the following command format:

```bash
/path/to/google-flights-scraping/cli_script.py --origin "New York" --destination "London" --departure-date "15-09-2024" --passengers 2 --verbose --headless --pretty
```

This command will search for flights from New York to London on September 15, 2024, for 2 passengers, with verbose, headless and pretty modes enabled.
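
When the CLI is driven from another Python program, the argument list can be assembled in code rather than typed by hand. The sketch below assumes only the option names listed under "Available Options"; the helper `build_cli_args` and the bare `cli_script.py` path are hypothetical placeholders.

```python
import shlex
from typing import List


def build_cli_args(script_path: str, origin: str, destination: str,
                   departure_date: str = "", passengers: int = 1,
                   verbose: bool = False, headless: bool = False,
                   pretty: bool = False) -> List[str]:
    """Assemble the argument list for the scraper CLI (hypothetical helper)."""
    args = [script_path, "--origin", origin, "--destination", destination]
    if departure_date:
        # Omit the option entirely to fall back on the CLI's own default date.
        args += ["--departure-date", departure_date]
    args += ["--passengers", str(passengers)]
    flags = (("--verbose", verbose), ("--headless", headless), ("--pretty", pretty))
    args += [flag for flag, enabled in flags if enabled]
    return args


args = build_cli_args("cli_script.py", "New York", "London",
                      departure_date="15-09-2024", passengers=2,
                      headless=True, pretty=True)
# shlex.join renders the list as a copy-pasteable shell command.
print(shlex.join(args))
```

Passing the list directly to `subprocess.run(args)` avoids shell quoting problems with multi-word city names such as "New York".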

## Configuration for production

In a production environment, make the following adjustments to the code:

- Reduce `slow_mo` (the Playwright launch option that delays each operation by a set number of milliseconds) for faster execution.
- Remove or comment out debug statements and any `time.sleep()` calls that are not strictly necessary.
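
One way to avoid editing the code by hand for each environment is to derive the browser launch settings from a single switch. This is only a sketch: the function `launch_options` is hypothetical, the 500 ms development delay is an arbitrary illustrative value, and how the package actually launches its browser is not shown in this README.

```python
from typing import Any, Dict


def launch_options(production: bool) -> Dict[str, Any]:
    """Return keyword arguments intended for Playwright's browser launch call."""
    if production:
        # No artificial delay and no visible browser window.
        return {"headless": True, "slow_mo": 0}
    # Development: visible browser, each operation slowed by 500 ms
    # (an arbitrary value chosen for illustration).
    return {"headless": False, "slow_mo": 500}


print(launch_options(production=True))
```

With Playwright's sync API, the resulting dictionary could be unpacked into the launch call, for example `playwright.chromium.launch(**launch_options(True))`, assuming the scraper uses Chromium.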

## Contributing

Contributions to this module are welcome. Please follow the standard procedures by forking the repository, making your changes, and submitting a pull request for review.

## License

This project is licensed under the MIT License - see the LICENSE file for details.

## Contact

Hugo Gonçalves - hugoglvs@icloud.com
Project Link: https://github.com/hugoglvs/google_flights_scraper

            

Raw data

            {
    "_id": null,
    "home_page": "https://github.com/yourusername/google_flights_scraper",
    "name": "google-flights-scraper",
    "maintainer": null,
    "docs_url": null,
    "requires_python": ">=3.8",
    "maintainer_email": null,
    "keywords": "google flights scraper, google flights, flights scraper, flights",
    "author": "Hugo Gon\u00e7alves",
    "author_email": "hugoglvs@icloud.com",
    "download_url": null,
    "platform": null,
    "bugtrack_url": null,
    "license": null,
    "summary": "A Python package to scrape flight data from Google Flights.",
    "version": "0.1.0",
    "project_urls": {
        "Homepage": "https://github.com/yourusername/google_flights_scraper"
    },
    "split_keywords": [
        "google flights scraper",
        " google flights",
        " flights scraper",
        " flights"
    ],
    "urls": [
        {
            "comment_text": "",
            "digests": {
                "blake2b_256": "3e797c1466742ece2edc937ec20083416b097cfd3082d84e60f0178fc052ae13",
                "md5": "d6655aa20eb86e84d9080fbf2a2e5221",
                "sha256": "092aabdfeab2e6037b3a01e82f3ed0eca51d72da3e49540cbfd0adadf111c1bd"
            },
            "downloads": -1,
            "filename": "google_flights_scraper-0.1.0-py3-none-any.whl",
            "has_sig": false,
            "md5_digest": "d6655aa20eb86e84d9080fbf2a2e5221",
            "packagetype": "bdist_wheel",
            "python_version": "py3",
            "requires_python": ">=3.8",
            "size": 7481,
            "upload_time": "2024-04-21T12:06:22",
            "upload_time_iso_8601": "2024-04-21T12:06:22.569066Z",
            "url": "https://files.pythonhosted.org/packages/3e/79/7c1466742ece2edc937ec20083416b097cfd3082d84e60f0178fc052ae13/google_flights_scraper-0.1.0-py3-none-any.whl",
            "yanked": false,
            "yanked_reason": null
        }
    ],
    "upload_time": "2024-04-21 12:06:22",
    "github": true,
    "gitlab": false,
    "bitbucket": false,
    "codeberg": false,
    "github_user": "yourusername",
    "github_project": "google_flights_scraper",
    "github_not_found": true,
    "lcname": "google-flights-scraper"
}