Scrapelytix


Name: Scrapelytix
Version: 0.0.2
Summary: A package to allow analyzing soccer event data easily
Author: Soham Basu, Debrup Mitra
Requires Python: >=3.8, <=3.12.1
License: Unlicensed
Keywords: python, soccer, soccer analysis, passing network
Upload time: 2024-05-24 22:13:59
# scrapelytix
Welcome to Scrapelytix! This package lets you scrape football event data and visualize metrics such as pass maps, progressive passes, and shot maps. This guide also shows how to scrape WhoScored, Sofascore, and FBref to gather the data in the first place.

## Installation
First, install the package using pip:

```shell
pip install scrapelytix
```

## Usage Guide
### Step 1: Prepare the Data
To use Scrapelytix, you need the URL of the match scorecard from the match center you want to scrape, plus a User-Agent string for the request headers.

### Step 2: Scrape the Data
Here's a step-by-step guide to using Scrapelytix:

Import the required modules:

```python
import re
import json
import pandas as pd
import requests
import numpy as np
from scrapelytix.extraction import pass_data, player_data
from scrapelytix.filter import analyze_passes, analyze_shots
from scrapelytix.plot import pass_network, prg_passes, shot_map
```

Prompt the user for the URL and set the request headers:

```python
url = input("Enter the URL for pass data: ")
url_shots = 'https://api.sofascore.com/api/v1/event/11352376/shotmap'
HEADERS = {
    'User-Agent': "your user agent string",  # replace with your browser's User-Agent
    'Referer': "https://www.whoscored.com/",
    'Accept-Language': "en-US,en;q=0.5",
    'Accept-Encoding': "gzip, deflate, br",
    'Connection': "keep-alive",
    'Upgrade-Insecure-Requests': "1",
    'TE': "Trailers"
}
```
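Under the hood, scraping a WhoScored-style match page typically means fetching the HTML with `requests` and pulling the embedded match JSON out with `re` and `json` (which is why those modules are imported above). Here is a minimal self-contained sketch of that pattern; the `matchCentreData` variable name and the sample HTML are illustrative assumptions, not a guaranteed match for the live page:

```python
import json
import re

# Sample HTML standing in for a fetched page; real pages embed a much
# larger JSON object inside a script tag.
sample_html = (
    '<script>var matchCentreData = '
    '{"playerIdNameDictionary": {"101": "Player A"}};</script>'
)

# Non-greedy match from the first "{" up to the closing "};"
match = re.search(r"matchCentreData\s*=\s*(\{.*?\});", sample_html)
if match:
    data = json.loads(match.group(1))
    print(data["playerIdNameDictionary"])
```

On a real page you would obtain `sample_html` from `requests.get(url, headers=HEADERS).text` before applying the regex.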

Extract player and pass data:

```python
players_df, df_passes = player_data(url, HEADERS)
print(players_df.head())
print(df_passes.head())
```
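To give a feel for what the pass table looks like, here is a hypothetical illustration of its shape; the real column names come from the scraped event feed and may differ, so always inspect `df_passes.head()` on your own data:

```python
import pandas as pd

# Hypothetical stand-in for the DataFrame returned by player_data;
# only teamId is relied on later in this guide.
df_passes_example = pd.DataFrame({
    "teamId":   [31, 31, 32],
    "playerId": [101, 102, 201],
    "x":    [35.0, 50.2, 60.1],   # pass origin, pitch coordinates
    "y":    [40.0, 20.5, 70.3],
    "endX": [52.3, 61.0, 75.4],   # pass destination
    "endY": [38.1, 25.0, 66.2],
})
print(df_passes_example["teamId"].unique())
```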

Analyze passes:

```python
# Extract the team ids
team_ids = df_passes['teamId'].unique()

# Define home and away colors for each team
home_colors = ['#FF0000', '#0000FF']  # example colors (replace with actual colors)
away_colors = ['#FFFFFF', '#FFFFFF']  # example colors (replace with actual colors)

# Create a DataFrame for the clubs with team IDs, labels, and colors
df_clubs = pd.DataFrame({
    'Team ID': team_ids,
    'Team Label': ['Team X', 'Team Y'],
    'Team Name': ['Team X', 'Team Y'],
    'Home Color': home_colors,
    'Away Color': away_colors,
})

# Assuming there are only two teams, assign home and away team IDs accordingly
home_team_id = team_ids[0]
away_team_id = team_ids[1]

(pass_between_home, pass_between_away, avg_loc_home, avg_loc_away,
 passes_home, passes_away, df_prg_home, df_comp_prg_home, df_uncomp_prg_home,
 df_prg_away, df_comp_prg_away, df_uncomp_prg_away) = analyze_passes(
    df_passes, players_df, home_team_id, away_team_id)

# Example print statements to verify the results
print("Passes Between Home Players:")
print(pass_between_home.head())

print("Average Locations of Home Players:")
print(avg_loc_home.head())

print("Home Team Progressive Passes:")
print(df_prg_home.head())
```
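Definitions of a progressive pass vary between data providers; a common heuristic counts a pass as progressive when it moves the ball substantially closer to the opponent's goal. The sketch below illustrates that idea under stated assumptions (a 100x100 pitch with the attacking goal at (100, 50) and a 25% distance-reduction threshold); `analyze_passes` may use a different coordinate system or definition:

```python
import numpy as np
import pandas as pd

def is_progressive(x, y, end_x, end_y, goal=(100.0, 50.0), threshold=0.25):
    """True when the pass cuts the distance to goal by at least `threshold`."""
    start_dist = np.hypot(goal[0] - x, goal[1] - y)
    end_dist = np.hypot(goal[0] - end_x, goal[1] - end_y)
    return end_dist <= (1.0 - threshold) * start_dist

# Two sample passes: one long forward ball, one short sideways touch
df = pd.DataFrame({
    "x": [30.0, 80.0], "y": [50.0, 50.0],
    "endX": [60.0, 82.0], "endY": [50.0, 50.0],
})
df["progressive"] = [
    is_progressive(r.x, r.y, r.endX, r.endY) for r in df.itertuples()
]
print(df["progressive"].tolist())  # [True, False]
```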

Visualize the data:

```python
# Pass network visualization
pass_network(pass_between_home, pass_between_away, avg_loc_home, avg_loc_away,
             home_team_id, away_team_id, df_clubs)

# Progressive passes visualization
prg_passes(df_comp_prg_home, df_uncomp_prg_home, df_comp_prg_away,
           df_uncomp_prg_away, home_team_id, away_team_id, df_clubs)
```

## Output Samples

The screenshots below show the output after each significant step, so you can check the intermediate results and the final visualizations.

1. *Initial Data Extraction:*
   - ![Player Data](images/player_data.png)
   - ![Pass Data](images/pass_data.png)

2. *Pass Network Visualization:*
   - ![Pass Network](images/pass_network.png)

3. *Progressive Passes Visualization:*
   - ![Progressive Passes](images/progressive_passes.png)

### Conclusion

Scrapelytix makes it easy to scrape football match data and visualize important statistics. By following the steps above, you can analyze and visualize data for any match you are interested in.

Feel free to explore the package and provide feedback or contributions on GitHub!

### Contribution

Contributions are welcome! If you find any issues or have suggestions, please open an issue or create a pull request on the [GitHub repository](https://github.com/30debrup/scrapelytix).

### License

This project is licensed under the MIT License. See the [LICENSE](LICENSE) file for details.

            
