MetaDataScraper

Name	MetaDataScraper
Version	1.0.4
home_page	None
Summary	A module designed to automate the extraction of follower counts and post details from a public Facebook page.
upload_time	2024-08-15 14:20:06
maintainer	None
docs_url	None
author	None
requires_python	>=3.10
license	None
keywords	facebook scraper meta selenium webdriver-manager automation web-scraping web-crawling web-automation facebook-scraper facebook-web-scraper meta-scraper
VCS
bugtrack_url
requirements	No requirements were recorded.
Travis-CI	No Travis.
coveralls test coverage	No coveralls.

            [![Licence](https://badgen.net/github/license/ishan-surana/MetaDataScraper?color=DC143C)](https://github.com/ishan-surana/MetaDataScraper/blob/main/LICENCE) [![Python](https://img.shields.io/badge/python-%3E=3.10-slateblue.svg)](https://www.python.org/downloads/release/python-3119/) [![Wheel](https://img.shields.io/badge/wheel-yes-FF00C9.svg)](https://files.pythonhosted.org/packages/02/80/c53d5e8439361c913e23b6345e85e748a7ac7e82e22cb9f7cd9ec77d5d52/MetaDataScraper-1.0.0-py3-none-any.whl) [![Latest](https://badgen.net/github/release/ishan-surana/MetaDataScraper?label=latest+release&color=green)](https://pypi.org/project/MetaDataScraper/1.0.0/) [![Releases](https://badgen.net/github/releases/ishan-surana/MetaDataScraper?color=orange)](https://github.com/ishan-surana/MetaDataScraper/releases) [![Stars](https://badgen.net/github/stars/ishan-surana/MetaDataScraper?color=yellow)](https://github.com/ishan-surana/MetaDataScraper/stargazers) [![Forks](https://badgen.net/github/forks/ishan-surana/MetaDataScraper?color=dark)](https://github.com/ishan-surana/MetaDataScraper/forks) [![Issues](https://badgen.net/github/issues/ishan-surana/MetaDataScraper?color=800000)](https://github.com/ishan-surana/MetaDataScraper/issues) [![PRs](https://badgen.net/github/prs/ishan-surana/MetaDataScraper?color=C71585)](https://github.com/ishan-surana/MetaDataScraper/pulls) ![Downloads](https://img.shields.io/github/downloads/ishan-surana/MetaDataScraper/total) [![Last commit](https://badgen.net/github/last-commit/ishan-surana/MetaDataScraper?color=blue)](https://github.com/ishan-surana/MetaDataScraper/commits/main/) [![Workflow](https://github.com/ishan-surana/MetaDataScraper/actions/workflows/python-publish.yml/badge.svg)](https://github.com/ishan-surana/MetaDataScraper/blob/main/.github/workflows/python-publish.yml) [![PyPI](https://d25lcipzij17d.cloudfront.net/badge.svg?id=py&r=r&ts=1683906897&type=6e&v=1.0.0&x2=0)](https://pypi.org/project/MetaDataScraper/) 
[![Maintained](https://img.shields.io/badge/maintained-yes-cyan)](https://github.com/ishan-surana/MetaDataScraper/pulse) [![OS](https://img.shields.io/badge/OS-Windows-FF0000)](https://www.microsoft.com/software-download/windows11) [![Documentation Status](https://readthedocs.org/projects/metadatascraper/badge/?version=latest)](https://metadatascraper.readthedocs.io/en/latest/?badge=latest)<br>
---
## <div align=center>Support this package by donating here! ➡️ [![Buy Me a Coffee](https://img.shields.io/badge/Buy%20Me%20a%20Coffee-badge?style=plastic&logo=buy-me-a-coffee&color=black)](https://www.buymeacoffee.com/ishansurana) [![Paypal](https://img.shields.io/badge/PayPal-badge?style=plastic&logo=paypal&color=white)](https://www.paypal.com/paypalme/ishansurana)</div><br>

# MetaDataScraper

MetaDataScraper is a Python package that automates the extraction of information such as follower counts, post details, and post interactions from a public Facebook page, returned in the form of a list. It uses Selenium WebDriver for web automation and scraping.  
The module provides two classes: `LoginlessScraper` and `LoggedInScraper`. The `LoginlessScraper` class does not require any authentication or API keys to scrape the data. However, it cannot access some Facebook pages.
The `LoggedInScraper` class overcomes this drawback by using the credentials of the user's Facebook account to log in and scrape the data.
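
One way to combine the two classes is to try the loginless scraper first and fall back to the logged-in one only when needed. The sketch below is a minimal illustration, not part of the package: `scrape_with_fallback` is a hypothetical helper, and it assumes that a scraper which cannot access a page raises an exception, which this README does not confirm. The scrapers are passed in as zero-argument factories, so the helper itself does not depend on Selenium.

```python
from typing import Any, Callable


def scrape_with_fallback(
    primary: Callable[[], Any],
    fallback: Callable[[], Any],
) -> Any:
    """Return primary().scrape(), or fallback().scrape() if the primary fails.

    Hypothetical helper: assumes an inaccessible page surfaces as an
    exception raised while building the scraper or calling scrape().
    """
    try:
        return primary().scrape()
    except Exception as exc:  # broad on purpose: the failure type is unknown
        print(f"Primary scraper failed ({exc!r}); trying fallback.")
        return fallback().scrape()
```

With the package installed, a call might look like `scrape_with_fallback(lambda: LoginlessScraper(page_id), lambda: LoggedInScraper(page_id, email, password))`. If the loginless scraper instead returns empty data for an inaccessible page, the check would need to inspect the result rather than catch an exception.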

## Installation

You can install MetaDataScraper using pip:

```
pip install MetaDataScraper
```

Make sure you have Python 3.10 or later and pip installed.
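
The package metadata declares `requires_python >=3.10`. A quick way to confirm the environment before running a scraper is a check like the following. `check_environment` is a hypothetical helper name, and the sketch uses only the standard library.

```python
import sys
from importlib import metadata
from typing import Optional, Tuple


def check_environment(package: str = "MetaDataScraper") -> Tuple[bool, Optional[str]]:
    """Return (python_ok, installed_version); version is None if not installed."""
    python_ok = sys.version_info >= (3, 10)
    try:
        installed = metadata.version(package)
    except metadata.PackageNotFoundError:
        installed = None
    return python_ok, installed


python_ok, installed = check_environment()
print(f"Python >= 3.10: {python_ok}; MetaDataScraper installed: {installed}")
```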

## Usage

To use MetaDataScraper, follow these steps:

1. Import the `LoginlessScraper` or the `LoggedInScraper` class:

   ```python
   from MetaDataScraper import LoginlessScraper, LoggedInScraper
   ```

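2. Initialize the scraper with the Facebook page ID (and, for `LoggedInScraper`, your account credentials). Use one of the two options:

   ```python
   page_id = "your_target_page_id"

   # Option 1: scrape without logging in
   scraper = LoginlessScraper(page_id)

   # Option 2: log in with a Facebook account for broader page access
   email = "your_facebook_email"
   password = "your_facebook_password"
   scraper = LoggedInScraper(page_id, email, password)
   ```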

3. Scrape the Facebook page to retrieve information:

   ```python
   result = scraper.scrape()
   ```

4. Access the scraped data from the result dictionary:

   ```python
   print(f"Followers: {result['followers']}")
   print(f"Post Texts: {result['post_texts']}")
   print(f"Post Likes: {result['post_likes']}")
   print(f"Post Shares: {result['post_shares']}")
   print(f"Is Video: {result['is_video']}")
   print(f"Video Links: {result['video_links']}")
   ```
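
The keys above suggest that the post fields come back as parallel lists, one entry per post. Assuming that is the case (the README does not state it explicitly), a small helper can zip them into one record per post, which is often easier to filter or export. `to_post_records` and the sample values are hypothetical and used only for illustration.

```python
from typing import Any

POST_KEYS = ("post_texts", "post_likes", "post_shares", "is_video", "video_links")


def to_post_records(result: dict[str, Any]) -> list[dict[str, Any]]:
    """Zip the per-post lists of a result dictionary into one dict per post.

    Assumes each key in POST_KEYS maps to a list with one entry per post.
    zip() stops at the shortest list, so ragged input is truncated and a
    missing key yields no records at all.
    """
    columns = [result.get(key, []) for key in POST_KEYS]
    return [dict(zip(POST_KEYS, row)) for row in zip(*columns)]


# Made-up sample data in the shape shown above (not real scraped output).
sample = {
    "followers": "1.2K",
    "post_texts": ["Hello", "Watch this"],
    "post_likes": ["10", "25"],
    "post_shares": ["1", "4"],
    "is_video": [False, True],
    "video_links": [None, "https://example.com/video"],
}

for record in to_post_records(sample):
    print(record)
```

If the real lists turn out to differ in length (for example, if `video_links` only holds entries for video posts), the pairing would need to be adjusted accordingly.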

## Features

- **Automated Extraction**: Automatically fetches follower counts, post texts, likes, shares, and video links from Facebook pages.
- **Comprehensive Data Retrieval**: Retrieves detailed information about each post, including text content, interaction metrics (likes, shares), and multimedia (e.g., video links).
- **Flexible Handling**: Handles different post structures and content types found on Facebook pages, such as text posts and reels.
- **Enhanced Access with Logged-In Scraper**: Overcomes limitations faced by anonymous scraping (loginless) by utilizing Facebook account credentials for broader page access.
- **Headless Operation**: Executes scraping tasks in headless mode, ensuring seamless and non-intrusive data collection without displaying a browser interface.
- **Scalability**: Supports scaling to handle large volumes of data extraction efficiently, suitable for monitoring multiple Facebook pages simultaneously.
- **Dependency Management**: Uses Selenium WebDriver for web automation and scraping, and is compatible with Python 3.10 and later.
- **Ease of Use**: Simplifies the process with straightforward initialization and method calls, facilitating quick integration into existing workflows.
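
For the multi-page monitoring mentioned under Scalability, a simple starting point is a sequential loop that scrapes each page and records failures instead of stopping. This is a sketch under assumptions: `scrape_pages` is a hypothetical helper, the scraper is supplied as a factory that takes a page ID, and failures are assumed to surface as exceptions.

```python
from typing import Any, Callable


def scrape_pages(
    page_ids: list[str],
    make_scraper: Callable[[str], Any],
) -> tuple[dict[str, Any], dict[str, str]]:
    """Scrape each page in turn; return (results, errors) keyed by page ID."""
    results: dict[str, Any] = {}
    errors: dict[str, str] = {}
    for page_id in page_ids:
        try:
            results[page_id] = make_scraper(page_id).scrape()
        except Exception as exc:  # keep going so one bad page does not abort the run
            errors[page_id] = repr(exc)
    return results, errors
```

With the package installed, `scrape_pages(["page_a", "page_b"], LoginlessScraper)` would pass the class itself as the factory. Each scraper presumably drives its own browser session, so running many of them in parallel may need extra care.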

## Dependencies

- selenium
- webdriver_manager

## License

This project is licensed under the Apache License, Version 2.0. See the [LICENSE](https://github.com/ishan-surana/MetaDataScraper/blob/main/LICENCE) file for details.

Raw data

{
    "_id": null,
    "home_page": null,
    "name": "MetaDataScraper",
    "maintainer": null,
    "docs_url": null,
    "requires_python": ">=3.10",
    "maintainer_email": "Ishan Surana <ishansurana1234@gmail.com>",
    "keywords": "facebook, scraper, meta, selenium, webdriver-manager, automation, web-scraping, web-crawling, web-automation, facebook-scraper, facebook-web-scraper, meta-scraper",
    "author": null,
    "author_email": "Ishan Surana <ishansurana1234@gmail.com>",
    "download_url": "https://files.pythonhosted.org/packages/a0/0c/3e9706544734db509f31a8b33b0640ed8b421c123934eab50e78b7041622/metadatascraper-1.0.4.tar.gz",
    "platform": null,
    "bugtrack_url": null,
    "license": null,
    "summary": "A module designed to automate the extraction of follower counts and post details from a public Facebook page.",
    "version": "1.0.4",
    "project_urls": {
        "Changelog": "https://github.com/ishan-surana/MetaDataScraper/releases",
        "Documentation": "https://metadatascraper.readthedocs.io/en/latest/",
        "Homepage": "https://metadatascraper.readthedocs.io/en/latest/",
        "Issues": "https://github.com/ishan-surana/MetaDataScraper/issues",
        "Repository": "https://github.com/ishan-surana/MetaDataScraper"
    },
    "split_keywords": [
        "facebook",
        " scraper",
        " meta",
        " selenium",
        " webdriver-manager",
        " automation",
        " web-scraping",
        " web-crawling",
        " web-automation",
        " facebook-scraper",
        " facebook-web-scraper",
        " meta-scraper"
    ],
    "urls": [
        {
            "comment_text": "",
            "digests": {
                "blake2b_256": "0b54383b83694bfb524ee64be197c73ba5de96b7801730694b7b409d55da463d",
                "md5": "83e45ff49dc9bfb8f6e7c31445db7622",
                "sha256": "66e968f19b6d3e7e1127ffda11241d88c2a18bec860fcb165f99e797eb09b880"
            },
            "downloads": -1,
            "filename": "MetaDataScraper-1.0.4-py3-none-any.whl",
            "has_sig": false,
            "md5_digest": "83e45ff49dc9bfb8f6e7c31445db7622",
            "packagetype": "bdist_wheel",
            "python_version": "py3",
            "requires_python": ">=3.10",
            "size": 13346,
            "upload_time": "2024-08-15T14:20:05",
            "upload_time_iso_8601": "2024-08-15T14:20:05.037602Z",
            "url": "https://files.pythonhosted.org/packages/0b/54/383b83694bfb524ee64be197c73ba5de96b7801730694b7b409d55da463d/MetaDataScraper-1.0.4-py3-none-any.whl",
            "yanked": false,
            "yanked_reason": null
        },
        {
            "comment_text": "",
            "digests": {
                "blake2b_256": "a00c3e9706544734db509f31a8b33b0640ed8b421c123934eab50e78b7041622",
                "md5": "49ee7a4708f10db32d4d9ff4c2ae170f",
                "sha256": "731e6b94d85a32c3db76f8e4f4356b95429e027378230a59f07d3e97ca9d3bd2"
            },
            "downloads": -1,
            "filename": "metadatascraper-1.0.4.tar.gz",
            "has_sig": false,
            "md5_digest": "49ee7a4708f10db32d4d9ff4c2ae170f",
            "packagetype": "sdist",
            "python_version": "source",
            "requires_python": ">=3.10",
            "size": 13589,
            "upload_time": "2024-08-15T14:20:06",
            "upload_time_iso_8601": "2024-08-15T14:20:06.423320Z",
            "url": "https://files.pythonhosted.org/packages/a0/0c/3e9706544734db509f31a8b33b0640ed8b421c123934eab50e78b7041622/metadatascraper-1.0.4.tar.gz",
            "yanked": false,
            "yanked_reason": null
        }
    ],
    "upload_time": "2024-08-15 14:20:06",
    "github": true,
    "gitlab": false,
    "bitbucket": false,
    "codeberg": false,
    "github_user": "ishan-surana",
    "github_project": "MetaDataScraper",
    "travis_ci": false,
    "coveralls": false,
    "github_actions": true,
    "lcname": "metadatascraper"
}
