# PyAttrScore - Python Attribution Modeling Package
PyAttrScore is a Python package for calculating marketing attribution scores with multiple models. It ships with input validation, logging, error handling, and a test suite, so it can be integrated into analytics pipelines that measure channel effectiveness.
## 🚀 Features
- **Multiple Attribution Models**: First Touch, Last Touch, Linear, Time Decay (Exponential & Linear), U-Shaped, Windowed First Touch, and **Football-Inspired Attribution**
- **🏈 Football Attribution Model**: Treats marketing channels as football players with distinct roles (Scorer, Assister, Key Passer, Most Passes, Most Minutes, Most Dribbles, Participant) and calculates a Channel Impact Score (CIS) based on role weights
- **Role-Based Attribution**: Assigns credit based on channel roles in the customer journey, providing intuitive team-based insights
- **Channel Archetypes**: Classifies channels into Generator, Assister, Closer, and Participant archetypes for strategic analysis
- **Configurable Role Weights**: Customize the impact of each football role on the CIS calculation
- **Comprehensive Channel Metrics**: Includes goals, assists, key passes, engagement time, expected goals, and more
- **Production Ready**: Robust error handling, logging, and validation for reliable use in analytics pipelines
- **Flexible Configuration**: YAML-based and programmatic configuration options for all models
- **Data Validation**: Built-in Pydantic models ensure input data integrity
- **Comprehensive Testing**: Over 90% test coverage with pytest for confidence in results
- **Easy Integration**: Simple API design for seamless integration into existing workflows
- **Performance Optimized**: Efficient algorithms designed for large-scale data processing
- **Advanced Analytics**: Team performance summaries, role-based channel analysis, and batch processing support
## 📦 Installation
### From PyPI (Recommended)
```bash
pip install pyattrscore
```
### From Source
```bash
git clone https://github.com/pyattrscore/pyattrscore.git
cd pyattrscore
pip install -e .
```
### Development Installation
```bash
git clone https://github.com/pyattrscore/pyattrscore.git
cd pyattrscore
pip install -e ".[dev]"
```
## 🏃‍♂️ Quick Start
### 🏈 Football Attribution Demo
Try the Football-Inspired Attribution model from the command line:
```bash
# Run the football attribution demo
python main.py --football
# Compare all attribution models
python main.py --compare
# Run detailed football analysis
python football_example.py
# Use sample data
python main.py --football --data sample_data.csv
```
### Basic Usage
```python
import pandas as pd
from datetime import datetime
from pyattrscore import FirstTouchAttribution, AttributionConfig
# Sample touchpoint data
data = pd.DataFrame([
    {
        'user_id': 'user_001',
        'touchpoint_id': 'tp_001',
        'channel': 'email',
        'timestamp': datetime(2023, 1, 1, 10, 0),
        'conversion': False,
        'conversion_value': None
    },
    {
        'user_id': 'user_001',
        'touchpoint_id': 'tp_002',
        'channel': 'social_media',
        'timestamp': datetime(2023, 1, 2, 14, 30),
        'conversion': False,
        'conversion_value': None
    },
    {
        'user_id': 'user_001',
        'touchpoint_id': 'tp_003',
        'channel': 'search',
        'timestamp': datetime(2023, 1, 3, 9, 15),
        'conversion': True,
        'conversion_value': 150.0
    }
])
# Initialize attribution model
config = AttributionConfig(attribution_window_days=30)
model = FirstTouchAttribution(config)
# Calculate attribution
results = model.calculate_attribution(data)
print(results)
```
### Using Different Models
```python
from pyattrscore import (
    LinearAttribution,
    ExponentialDecayAttribution,
    UShapedAttribution,
    FootballAttribution,
    get_model
)
# Reuses `config` and `data` from the Basic Usage example
# Method 1: Direct instantiation
linear_model = LinearAttribution(config)
results_linear = linear_model.calculate_attribution(data)
# Method 2: Using model factory
decay_model = get_model('exponential_decay', config)
results_decay = decay_model.calculate_attribution(data)
# Method 3: Football Attribution
football_model = get_model('football')
results_football = football_model.calculate_attribution(data)
# Method 4: U-Shaped with custom weights
u_shaped_model = UShapedAttribution(
    config,
    first_touch_weight=0.3,
    last_touch_weight=0.5
)
results_u_shaped = u_shaped_model.calculate_attribution(data)
```
## 📊 Attribution Models
### 1. First Touch Attribution
Assigns 100% credit to the first touchpoint in the customer journey.
```python
from pyattrscore import FirstTouchAttribution
model = FirstTouchAttribution()
results = model.calculate_attribution(data)
```
**Use Cases:**
- Understanding awareness channel effectiveness
- Short sales cycles
- Top-of-funnel optimization
### 2. Last Touch Attribution
Assigns 100% credit to the last touchpoint before conversion.
```python
from pyattrscore import LastTouchAttribution
model = LastTouchAttribution()
results = model.calculate_attribution(data)
```
**Use Cases:**
- Understanding closing channel effectiveness
- Bottom-of-funnel optimization
- Direct response campaigns
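Both single-touch rules (first touch and last touch) reduce to picking one touchpoint per converting journey. The following is a minimal standalone sketch of that idea, not the package's implementation; the helper `single_touch_credit` and the sample journey are hypothetical.

```python
from datetime import datetime

# Hypothetical journey: (channel, timestamp) pairs for one converting user.
journey = [
    ("email", datetime(2023, 1, 1, 10, 0)),
    ("social_media", datetime(2023, 1, 2, 14, 30)),
    ("search", datetime(2023, 1, 3, 9, 15)),
]


def single_touch_credit(touchpoints, position):
    """Give 100% credit to the first or last touchpoint, ordered by time.

    Assumes each channel appears at most once in the journey.
    """
    ordered = sorted(touchpoints, key=lambda tp: tp[1])
    winner = ordered[0] if position == "first" else ordered[-1]
    return {
        channel: (1.0 if (channel, ts) == winner else 0.0)
        for channel, ts in ordered
    }


first_touch = single_touch_credit(journey, "first")
last_touch = single_touch_credit(journey, "last")
print(first_touch)  # {'email': 1.0, 'social_media': 0.0, 'search': 0.0}
print(last_touch)   # {'email': 0.0, 'social_media': 0.0, 'search': 1.0}
```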
### 3. Linear Attribution
Distributes credit equally among all touchpoints within the attribution window.
```python
from pyattrscore import LinearAttribution, AttributionConfig
config = AttributionConfig(attribution_window_days=30)
model = LinearAttribution(config)
results = model.calculate_attribution(data)
```
**Use Cases:**
- Balanced view of customer journey
- Multi-touch attribution analysis
- Understanding overall channel contribution
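To make the equal split concrete, here is a small standalone sketch of linear attribution with an attribution window. It is an illustration under assumptions, not the package's code: the function `linear_credit` and the sample touchpoints are hypothetical, and the window is assumed to be measured backwards from the conversion time.

```python
from datetime import datetime, timedelta


def linear_credit(touchpoints, conversion_time, window_days=30):
    """Split one conversion equally across touchpoints inside the window."""
    window_start = conversion_time - timedelta(days=window_days)
    eligible = [
        tp for tp in touchpoints
        if window_start <= tp[1] <= conversion_time
    ]
    if not eligible:
        return []
    share = 1.0 / len(eligible)
    return [(channel, share) for channel, _ in eligible]


touchpoints = [
    ("display", datetime(2022, 11, 1)),  # falls outside a 30-day window
    ("email", datetime(2023, 1, 1)),
    ("social_media", datetime(2023, 1, 2)),
    ("search", datetime(2023, 1, 3)),
]
linear_scores = linear_credit(touchpoints, conversion_time=datetime(2023, 1, 3))
for channel, score in linear_scores:
    print(f"{channel}: {score:.4f}")
```

With three eligible touchpoints, each receives one third of the credit and the out-of-window `display` touchpoint receives none.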
### 4. Time Decay Attribution
Credits touchpoints based on their proximity to conversion.
```python
from pyattrscore import AttributionConfig, ExponentialDecayAttribution, LinearDecayAttribution
# Exponential decay
config = AttributionConfig(attribution_window_days=30, decay_rate=0.5)
exp_model = ExponentialDecayAttribution(config)
results_exp = exp_model.calculate_attribution(data)
# Linear decay
linear_decay_model = LinearDecayAttribution(config)
results_linear_decay = linear_decay_model.calculate_attribution(data)
```
**Use Cases:**
- Understanding recency impact
- Time-sensitive attribution analysis
- Weighting recent touchpoints higher
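The sketch below shows one common way to turn recency into weights. It assumes each touchpoint's raw weight is `decay_rate` raised to the number of days before conversion, followed by normalization so the weights sum to 1. The package's exact formula is not stated in this README and may differ; `exponential_decay_credit` is a hypothetical helper.

```python
from datetime import datetime


def exponential_decay_credit(touchpoints, conversion_time, decay_rate=0.5):
    """Weight touchpoints by decay_rate ** days_before_conversion, normalized."""
    days_before = [
        (conversion_time - ts).total_seconds() / 86400.0
        for _, ts in touchpoints
    ]
    raw = [decay_rate ** days for days in days_before]
    total = sum(raw)
    return [(channel, weight / total)
            for (channel, _), weight in zip(touchpoints, raw)]


touchpoints = [
    ("email", datetime(2023, 1, 1)),         # 2 days before conversion
    ("social_media", datetime(2023, 1, 2)),  # 1 day before conversion
    ("search", datetime(2023, 1, 3)),        # at conversion
]
decay_scores = exponential_decay_credit(touchpoints, datetime(2023, 1, 3))
for channel, score in decay_scores:
    print(f"{channel}: {score:.4f}")
```

Under these assumptions the raw weights are 0.25, 0.5, and 1.0, so the most recent touchpoint receives 4/7 of the credit.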
### 5. U-Shaped Attribution
Assigns higher credit to the first and last touchpoints and distributes the remainder across the middle touchpoints.
```python
from pyattrscore import UShapedAttribution
model = UShapedAttribution(
    first_touch_weight=0.4,
    last_touch_weight=0.4
    # Remaining 20% distributed to middle touchpoints
)
results = model.calculate_attribution(data)
```
**Use Cases:**
- Balancing awareness and conversion touchpoints
- Multi-touch customer journeys
- Understanding nurturing touchpoint value
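As a rough illustration of the U-shaped split, the sketch below computes per-position weights for a journey of `n` touchpoints. The equal split of the remainder among middle touchpoints and the handling of one- and two-touchpoint journeys are assumptions made for this example, and `u_shaped_weights` is a hypothetical helper rather than part of the package.

```python
def u_shaped_weights(n, first_touch_weight=0.4, last_touch_weight=0.4):
    """Return one weight per touchpoint position for a journey of length n."""
    if n <= 0:
        return []
    if n == 1:
        # Assumption: a single touchpoint takes all the credit.
        return [1.0]
    if n == 2:
        # Assumption: with no middle touchpoints, rescale first/last to sum to 1.
        total = first_touch_weight + last_touch_weight
        return [first_touch_weight / total, last_touch_weight / total]
    # Assumption: the remainder is shared equally by the middle touchpoints.
    middle = (1.0 - first_touch_weight - last_touch_weight) / (n - 2)
    return [first_touch_weight] + [middle] * (n - 2) + [last_touch_weight]


weights = u_shaped_weights(5)
print([round(w, 4) for w in weights])  # [0.4, 0.0667, 0.0667, 0.0667, 0.4]
```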
### 6. Windowed First Touch Attribution
Assigns 100% credit to the first touchpoint within the attribution window.
```python
from pyattrscore import WindowedFirstTouchAttribution, AttributionConfig
config = AttributionConfig(attribution_window_days=14)
model = WindowedFirstTouchAttribution(config)
results = model.calculate_attribution(data)
```
**Use Cases:**
- Understanding recent awareness drivers
- Time-bounded first touch analysis
- Focusing on relevant touchpoints
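The difference from plain first touch is easiest to see on a journey whose earliest touchpoint falls outside the window. This is a minimal sketch that assumes the window is counted backwards from the conversion time; `windowed_first_touch` is a hypothetical helper, not the package's API.

```python
from datetime import datetime, timedelta


def windowed_first_touch(touchpoints, conversion_time, window_days=14):
    """Return the channel of the earliest touchpoint inside the window."""
    window_start = conversion_time - timedelta(days=window_days)
    in_window = [
        tp for tp in touchpoints
        if window_start <= tp[1] <= conversion_time
    ]
    if not in_window:
        return None
    return min(in_window, key=lambda tp: tp[1])[0]


touchpoints = [
    ("display", datetime(2022, 12, 1)),  # earliest overall, but too old
    ("email", datetime(2022, 12, 25)),   # earliest inside the 14-day window
    ("search", datetime(2023, 1, 3)),
]
conversion_time = datetime(2023, 1, 3)
windowed_winner = windowed_first_touch(touchpoints, conversion_time, window_days=14)
unwindowed_winner = min(touchpoints, key=lambda tp: tp[1])[0]
print(windowed_winner, unwindowed_winner)  # email display
```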
### 7. 🏈 Football-Based Attribution Model (Improved Definition)
The Football-Based Attribution Model applies a football (soccer) metaphor to marketing attribution, treating marketing channels as players on a football team. Each channel is assigned a role based on its contribution to the customer journey, and a Channel Impact Score (CIS) is calculated to quantify its overall impact.
```python
from pyattrscore import FootballAttribution, FootballAttributionConfig
# Configure the football model
config = FootballAttributionConfig(
    attribution_window_days=30,
    scorer_weight=0.25,         # Final conversion touchpoint
    assister_weight=0.20,       # Setup touchpoint before conversion
    key_passer_weight=0.15,     # Journey initiator
    most_passes_weight=0.14,    # Most frequent engagement
    most_minutes_weight=0.10,   # Longest engagement time
    most_dribbles_weight=0.10,  # Cold lead revival
    participant_weight=0.06,    # Supporting touchpoint
    baseline_weight=0.1,
    cold_lead_threshold_days=7
)
model = FootballAttribution(config)
results = model.calculate_attribution(data)
# Get team performance summary
summary = model.get_channel_performance_summary(results)
print(summary)
```
### Football Roles and Their Marketing Analogies
- **Scorer**: The final touchpoint that directly leads to conversion, analogous to the striker who scores the goal.
- **Assister**: The touchpoint immediately preceding the conversion, setting up the "goal," similar to a midfielder providing an assist.
- **Key Passer**: The journey initiator, the first touchpoint that starts the conversion build-up, like a defender or playmaker starting the play.
- **Most Passes**: The channel with the highest frequency of engagement, representing consistent involvement.
- **Most Minutes**: The channel with the longest engagement time, indicating sustained interaction.
- **Most Dribbles**: The channel that revives cold leads, re-engaging users after inactivity.
- **Participant**: Supporting touchpoints that contribute but do not fit the above roles.
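The three positional roles in the list above (Scorer, Assister, Key Passer) can be sketched from journey order alone. The example below is a simplified illustration that ignores the frequency-, time-, and revival-based roles. How the package resolves short journeys or overlapping roles is not documented here, so the choices in `assign_positional_roles` (a hypothetical helper) are assumptions.

```python
def assign_positional_roles(channels):
    """Assign scorer / assister / key_passer by position in a converting journey.

    `channels` is the list of channel names in chronological order, ending
    with the converting touchpoint. A touchpoint may hold several roles.
    """
    roles = [[] for _ in channels]
    if not channels:
        return roles
    roles[-1].append("scorer")            # final touchpoint before conversion
    if len(channels) >= 2:
        roles[-2].append("assister")      # touchpoint just before the scorer
    roles[0].append("key_passer")         # journey initiator
    for touchpoint_roles in roles:
        if not touchpoint_roles:
            touchpoint_roles.append("participant")  # no special role
    return roles


journey = ["organic_search", "email", "paid_search", "referral"]
journey_roles = assign_positional_roles(journey)
for channel, touchpoint_roles in zip(journey, journey_roles):
    print(channel, touchpoint_roles)
```

For the four-step journey this prints `key_passer` for `organic_search`, `participant` for `email`, `assister` for `paid_search`, and `scorer` for `referral`.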
### Channel Archetypes
Channels are classified into archetypes based on their typical marketing role:
- **Generator**: Creates awareness and initiates plays (e.g., Organic Search, Social Media).
- **Assister**: Nurtures and sets up conversions (e.g., Email, Paid Search).
- **Closer**: Finishes conversions (e.g., Direct, Referral).
- **Participant**: Supporting roles that assist the team.
### Channel Impact Score (CIS) Formula
The CIS quantifies the contribution of each channel by combining a baseline weight with weighted role contributions:
```
CIS = baseline_weight + (1 - baseline_weight) × Σ(role_weight × role_indicator)
```
Where:
- `baseline_weight` is a minimum credit assigned to all touchpoints.
- `role_weight` is the predefined weight for each football role.
- `role_indicator` is 1 if the channel has the role, 0 otherwise.
This formula ensures that channels with key roles receive higher attribution while all channels receive some baseline credit.
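The formula can be checked by hand with a few lines of Python. This sketch uses the role weights from the configuration example earlier in this section and a baseline of 0.1; `channel_impact_score` is a hypothetical helper written for illustration, and any normalization the package applies afterwards is not shown.

```python
# Role weights copied from the FootballAttributionConfig example.
ROLE_WEIGHTS = {
    "scorer": 0.25,
    "assister": 0.20,
    "key_passer": 0.15,
    "most_passes": 0.14,
    "most_minutes": 0.10,
    "most_dribbles": 0.10,
    "participant": 0.06,
}


def channel_impact_score(roles, baseline_weight=0.1, role_weights=ROLE_WEIGHTS):
    """CIS = baseline + (1 - baseline) * sum of the weights of the roles held."""
    role_sum = sum(role_weights[role] for role in roles)
    return baseline_weight + (1.0 - baseline_weight) * role_sum


scorer_cis = channel_impact_score(["scorer"])                      # 0.1 + 0.9 * 0.25
multi_role_cis = channel_impact_score(["scorer", "most_minutes"])  # 0.1 + 0.9 * 0.35
no_role_cis = channel_impact_score([])                             # baseline only
print(round(scorer_cis, 3), round(multi_role_cis, 3), round(no_role_cis, 3))
# 0.325 0.415 0.1
```

A channel holding every role would reach a CIS of 1.0, because the seven role weights in this configuration sum to 1.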
## ⚙️ Configuration
### Using Configuration Objects
```python
from pyattrscore import AttributionConfig, FootballAttributionConfig
# Standard configuration
config = AttributionConfig(
    attribution_window_days=30,
    decay_rate=0.6,
    include_non_converting_paths=False
)
# Football-specific configuration
football_config = FootballAttributionConfig(
    attribution_window_days=30,
    scorer_weight=0.25,
    assister_weight=0.20,
    baseline_weight=0.1,
    channel_archetypes={
        'organic_search': 'generator',
        'paid_search': 'assister',
        'direct': 'closer',
        'referral': 'closer'
    }
)
```
### Using YAML Configuration
```yaml
# config.yaml
global:
  attribution_window_days: 30
  log_level: "INFO"
models:
  linear:
    use_attribution_window: true
  exponential_decay:
    decay_rate: 0.5
    use_attribution_window: true
  football:
    role_weights:
      scorer_weight: 0.25
      assister_weight: 0.20
      key_passer_weight: 0.15
    baseline_weight: 0.1
    cold_lead_threshold_days: 7
    channel_archetypes:
      organic_search: "generator"
      paid_search: "assister"
      direct: "closer"
```
```python
import yaml
from pyattrscore import AttributionConfig, FootballAttributionConfig
# Load the YAML file into a plain dictionary
with open('config.yaml', 'r') as f:
    config_dict = yaml.safe_load(f)
config = AttributionConfig(**config_dict['global'])
football_config = FootballAttributionConfig(**config_dict['models']['football'])
```
## 📈 Advanced Usage
### Football Attribution Analysis
```python
from pyattrscore import FootballAttribution
import pandas as pd
# Load your data
data = pd.read_csv('sample_data.csv')
# Initialize football model
model = FootballAttribution()
results = model.calculate_attribution(data)
# Analyze team performance
summary = model.get_channel_performance_summary(results)
# Top performers
print("🥅 Top Scorers (Closers):")
top_scorers = summary.nlargest(3, 'channel_goals')
print(top_scorers[['channel', 'channel_goals', 'channel_archetype']])
print("\n🎯 Top Assisters (Setup Channels):")
top_assisters = summary.nlargest(3, 'channel_assists')
print(top_assisters[['channel', 'channel_assists', 'channel_archetype']])
# Team formation analysis
print("\n🏟️ Team Formation Performance:")
archetype_performance = summary.groupby('channel_archetype').agg({
    'channel_goals': 'sum',
    'channel_assists': 'sum',
    'attribution_score': 'sum'
}).round(2)
print(archetype_performance)
```
### Model Comparison
```python
import pandas as pd
from pyattrscore import get_model, list_models
# Reuses `config` and `data` from the Basic Usage example
# Compare multiple models including football
models_to_compare = ['first_touch', 'last_touch', 'linear', 'u_shaped', 'football']
results_comparison = {}
for model_name in models_to_compare:
    model = get_model(model_name, config)
    results = model.calculate_attribution(data)
    # Aggregate by channel
    channel_attribution = results.groupby('channel')['attribution_score'].sum()
    results_comparison[model_name] = channel_attribution
comparison_df = pd.DataFrame(results_comparison).fillna(0)
print(comparison_df)
# Football-specific analysis
if 'football' in models_to_compare:
    football_model = get_model('football')
    football_results = football_model.calculate_attribution(data)
    team_summary = football_model.get_channel_performance_summary(football_results)
    print("\n🏈 Team Performance Summary:")
    print(team_summary[['channel', 'channel_archetype', 'channel_goals', 'channel_assists']])
```
### Batch Processing Multiple Users
```python
import pandas as pd
from datetime import datetime
from pyattrscore import FootballAttribution
# Large dataset with multiple users
large_data = pd.DataFrame([
    # User 1 journey
    {'user_id': 'user_001', 'touchpoint_id': 'tp_001', 'channel': 'email',
     'timestamp': datetime(2023, 1, 1), 'conversion': False, 'engagement_time': 30.0},
    {'user_id': 'user_001', 'touchpoint_id': 'tp_002', 'channel': 'search',
     'timestamp': datetime(2023, 1, 2), 'conversion': True, 'conversion_value': 100.0, 'engagement_time': 60.0},
    # User 2 journey
    {'user_id': 'user_002', 'touchpoint_id': 'tp_003', 'channel': 'social',
     'timestamp': datetime(2023, 1, 1), 'conversion': False, 'engagement_time': 25.0},
    {'user_id': 'user_002', 'touchpoint_id': 'tp_004', 'channel': 'email',
     'timestamp': datetime(2023, 1, 3), 'conversion': True, 'conversion_value': 200.0, 'engagement_time': 45.0},
])
model = FootballAttribution()
results = model.calculate_attribution(large_data)
# Analyze results by channel
channel_performance = results.groupby('channel').agg({
    'attribution_score': 'sum',
    'attribution_value': 'sum',
    'user_id': 'nunique',
    'channel_goals': 'first',
    'channel_assists': 'first'
}).round(4)
print(channel_performance)
```
## 🔧 Data Requirements
### Required Columns
Your input DataFrame must contain these columns:
- `user_id` (str): Unique identifier for each user/customer
- `touchpoint_id` (str): Unique identifier for each touchpoint
- `channel` (str): Marketing channel name (e.g., 'email', 'search', 'social')
- `timestamp` (datetime): When the touchpoint occurred
### Optional Columns
- `conversion` (bool): Whether this touchpoint led to a conversion
- `conversion_value` (float): Monetary value of the conversion
- `engagement_time` (float): Time spent on the touchpoint (recommended for Football Attribution)
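Before running a model, it can help to check the column names yourself. The sketch below is a hypothetical pre-flight check based only on the column lists above; it is independent of the package's own validation. Pass it any iterable of column names, such as `df.columns` from a pandas DataFrame.

```python
REQUIRED_COLUMNS = ("user_id", "touchpoint_id", "channel", "timestamp")
OPTIONAL_COLUMNS = ("conversion", "conversion_value", "engagement_time")


def check_columns(columns):
    """Return (missing required columns, absent optional columns)."""
    present = set(columns)
    missing_required = [c for c in REQUIRED_COLUMNS if c not in present]
    absent_optional = [c for c in OPTIONAL_COLUMNS if c not in present]
    return missing_required, absent_optional


# Hypothetical column list with one required column missing.
columns = ["user_id", "channel", "timestamp", "conversion"]
missing_required, absent_optional = check_columns(columns)
print("Missing required:", missing_required)  # ['touchpoint_id']
print("Absent optional:", absent_optional)    # ['conversion_value', 'engagement_time']
```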
### Sample Data File
Use the provided `sample_data.csv` for testing:
```python
import pandas as pd
from pyattrscore import FootballAttribution
# Load sample data
data = pd.read_csv('sample_data.csv')
print(data.head())
# Run football attribution
model = FootballAttribution()
results = model.calculate_attribution(data)
```
### Data Validation
PyAttrScore automatically validates your data:
```python
from pyattrscore.exceptions import InvalidInputError
try:
    results = model.calculate_attribution(invalid_data)
except InvalidInputError as e:
    print(f"Data validation failed: {e}")
    print(f"Invalid fields: {e.invalid_fields}")
```
## 📊 Output Format
Attribution results include:
```python
# Standard columns
results.columns
# ['user_id', 'touchpoint_id', 'channel', 'timestamp', 'conversion',
# 'attribution_score', 'attribution_percentage', 'model_name', 'attribution_value']
# Football-specific columns (when using FootballAttribution)
# ['football_roles', 'channel_archetype', 'channel_goals', 'channel_assists',
# 'channel_passes', 'channel_minutes', 'channel_expected_goals']
# Example output
print(results.head())
#     user_id touchpoint_id       channel  ... football_roles channel_archetype
# 0  user_001        tp_001         email  ...     [assister]          assister
# 1  user_001        tp_002  social_media  ...   [key_passer]         generator
# 2  user_001        tp_003        search  ...       [scorer]            closer
```
## 🧪 Testing
Run the test suite:
```bash
# Run all tests
pytest
# Run with coverage
pytest --cov=pyattrscore --cov-report=html
# Run football attribution tests
pytest tests/test_football.py
# Run with verbose output
pytest -v
```
## 🏈 Football Attribution Examples
### Example 1: Specification Example
```python
import pandas as pd
from datetime import datetime
from pyattrscore import FootballAttribution
# The classic example from the specification
data = pd.DataFrame({
    'user_id': ['customer_1', 'customer_1', 'customer_1'],
    'touchpoint_id': ['tp_1', 'tp_2', 'tp_3'],
    'channel': ['organic_search', 'paid_search', 'referral'],
    'timestamp': [
        datetime(2024, 1, 1, 10, 0),
        datetime(2024, 1, 2, 11, 0),
        datetime(2024, 1, 3, 12, 0)
    ],
    'conversion': [False, False, True],
    'conversion_value': [None, None, 100.0],
    'engagement_time': [30.0, 45.0, 60.0]
})
model = FootballAttribution()
results = model.calculate_attribution(data)
# Expected results:
# Referral (Closer): ~39%
# Paid Search (Assister): ~26%
# Organic Search (Generator): ~35%
```
### Example 2: Multi-Customer Analysis
```bash
# Run the comprehensive example
python football_example.py
# This will show:
# - Role assignments for each touchpoint
# - Channel performance metrics
# - Team formation analysis
# - Football analytics insights
```
## License
This project is licensed under the MIT License - see the [LICENSE](LICENSE) file for details.
## Acknowledgments
- Built with [pandas](https://pandas.pydata.org/), [numpy](https://numpy.org/), and [pydantic](https://pydantic-docs.helpmanual.io/)
- Inspired by marketing attribution research and football analytics
Raw data
{
"_id": null,
"home_page": null,
"name": "pyattrscore",
"maintainer": null,
"docs_url": null,
"requires_python": ">=3.8",
"maintainer_email": "Mani Gidijala <mani@example.com>",
"keywords": "attribution, marketing, analytics, conversion, touchpoint, customer-journey, marketing-mix-modeling, data-science, machine-learning",
"author": "Mani Gidijala",
"author_email": "Mani Gidijala <mani@example.com>",
"download_url": "https://files.pythonhosted.org/packages/7a/0a/9938d58666e245cf698d39a37cebeeb4bb596522e325d04ea247e8749e4c/pyattrscore-0.0.1.tar.gz",
"platform": null,
"description": "# PyAttrScore - Python Attribution Modeling Package\n\n[](https://www.python.org/downloads/)\n[](https://opensource.org/licenses/MIT)\n[](https://badge.fury.io/py/pyattrscore)\n[](https://github.com/pyattrscore/pyattrscore/actions)\n[](https://codecov.io/gh/pyattrscore/pyattrscore)\n\nPyAttrScore is a Python package designed to calculate marketing attribution scores using multiple models. It includes validation, logging, error handling, and comprehensive testing modules, making it ready for integration into analytics pipelines to measure channel effectiveness.\n\n## \ud83d\ude80 Features\n\n- **Multiple Attribution Models**: First Touch, Last Touch, Linear, Time Decay (Exponential & Linear), U-Shaped, Windowed First Touch, and **Football-Inspired Attribution**\n- **\ud83c\udfc8 Football Attribution Model**: Treats marketing channels as football players with distinct roles (Scorer, Assister, Key Passer, Most Passes, Most Minutes, Most Dribbles, Participant) and calculates a Channel Impact Score (CIS) based on role weights\n- **Role-Based Attribution**: Assigns credit based on channel roles in the customer journey, providing intuitive team-based insights\n- **Channel Archetypes**: Classifies channels into Generator, Assister, Closer, and Participant archetypes for strategic analysis\n- **Configurable Role Weights**: Customize the impact of each football role on the CIS calculation\n- **Comprehensive Channel Metrics**: Includes goals, assists, key passes, engagement time, expected goals, and more\n- **Production Ready**: Robust error handling, logging, and validation for reliable use in analytics pipelines\n- **Flexible Configuration**: YAML-based and programmatic configuration options for all models\n- **Data Validation**: Built-in Pydantic models ensure input data integrity\n- **Comprehensive Testing**: Over 90% test coverage with pytest for confidence in results\n- **Easy Integration**: Simple API design for seamless integration into existing 
workflows\n- **Performance Optimized**: Efficient algorithms designed for large-scale data processing\n- **Advanced Analytics**: Team performance summaries, role-based channel analysis, and batch processing support\n\n## \ud83d\udce6 Installation\n\n### From PyPI (Recommended)\n\n```bash\npip install pyattrscore\n```\n\n### From Source\n\n```bash\ngit clone https://github.com/pyattrscore/pyattrscore.git\ncd pyattrscore\npip install -e .\n```\n\n### Development Installation\n\n```bash\ngit clone https://github.com/pyattrscore/pyattrscore.git\ncd pyattrscore\npip install -e \".[dev]\"\n```\n\n## \ud83c\udfc3\u200d\u2642\ufe0f Quick Start\n\n### \ud83c\udfc8 Football Attribution Demo\n\nExperience the revolutionary Football-Inspired Attribution model:\n\n```bash\n# Run the football attribution demo\npython main.py --football\n\n# Compare all attribution models\npython main.py --compare\n\n# Run detailed football analysis\npython football_example.py\n\n# Use sample data\npython main.py --football --data sample_data.csv\n```\n\n### Basic Usage\n\n```python\nimport pandas as pd\nfrom datetime import datetime\nfrom pyattrscore import FirstTouchAttribution, AttributionConfig\n\n# Sample touchpoint data\ndata = pd.DataFrame([\n {\n 'user_id': 'user_001',\n 'touchpoint_id': 'tp_001',\n 'channel': 'email',\n 'timestamp': datetime(2023, 1, 1, 10, 0),\n 'conversion': False,\n 'conversion_value': None\n },\n {\n 'user_id': 'user_001',\n 'touchpoint_id': 'tp_002',\n 'channel': 'social_media',\n 'timestamp': datetime(2023, 1, 2, 14, 30),\n 'conversion': False,\n 'conversion_value': None\n },\n {\n 'user_id': 'user_001',\n 'touchpoint_id': 'tp_003',\n 'channel': 'search',\n 'timestamp': datetime(2023, 1, 3, 9, 15),\n 'conversion': True,\n 'conversion_value': 150.0\n }\n])\n\n# Initialize attribution model\nconfig = AttributionConfig(attribution_window_days=30)\nmodel = FirstTouchAttribution(config)\n\n# Calculate attribution\nresults = 
model.calculate_attribution(data)\nprint(results)\n```\n\n### Using Different Models\n\n```python\nfrom pyattrscore import (\n LinearAttribution,\n ExponentialDecayAttribution,\n UShapedAttribution,\n FootballAttribution,\n get_model\n)\n\n# Method 1: Direct instantiation\nlinear_model = LinearAttribution(config)\nresults_linear = linear_model.calculate_attribution(data)\n\n# Method 2: Using model factory\ndecay_model = get_model('exponential_decay', config)\nresults_decay = decay_model.calculate_attribution(data)\n\n# Method 3: Football Attribution\nfootball_model = get_model('football')\nresults_football = football_model.calculate_attribution(data)\n\n# Method 4: U-Shaped with custom weights\nu_shaped_model = UShapedAttribution(\n config, \n first_touch_weight=0.3, \n last_touch_weight=0.5\n)\nresults_u_shaped = u_shaped_model.calculate_attribution(data)\n```\n\n## \ud83d\udcca Attribution Models\n\n### 1. First Touch Attribution\nAssigns 100% credit to the first touchpoint in the customer journey.\n\n```python\nfrom pyattrscore import FirstTouchAttribution\n\nmodel = FirstTouchAttribution()\nresults = model.calculate_attribution(data)\n```\n\n**Use Cases:**\n- Understanding awareness channel effectiveness\n- Short sales cycles\n- Top-of-funnel optimization\n\n### 2. Last Touch Attribution\nAssigns 100% credit to the last touchpoint before conversion.\n\n```python\nfrom pyattrscore import LastTouchAttribution\n\nmodel = LastTouchAttribution()\nresults = model.calculate_attribution(data)\n```\n\n**Use Cases:**\n- Understanding closing channel effectiveness\n- Bottom-of-funnel optimization\n- Direct response campaigns\n\n### 3. 
Linear Attribution\nDistributes credit equally among all touchpoints within the attribution window.\n\n```python\nfrom pyattrscore import LinearAttribution, AttributionConfig\n\nconfig = AttributionConfig(attribution_window_days=30)\nmodel = LinearAttribution(config)\nresults = model.calculate_attribution(data)\n```\n\n**Use Cases:**\n- Balanced view of customer journey\n- Multi-touch attribution analysis\n- Understanding overall channel contribution\n\n### 4. Time Decay Attribution\nCredits touchpoints based on their proximity to conversion.\n\n```python\nfrom pyattrscore import ExponentialDecayAttribution, LinearDecayAttribution\n\n# Exponential decay\nconfig = AttributionConfig(attribution_window_days=30, decay_rate=0.5)\nexp_model = ExponentialDecayAttribution(config)\nresults_exp = exp_model.calculate_attribution(data)\n\n# Linear decay\nlinear_decay_model = LinearDecayAttribution(config)\nresults_linear_decay = linear_decay_model.calculate_attribution(data)\n```\n\n**Use Cases:**\n- Understanding recency impact\n- Time-sensitive attribution analysis\n- Weighting recent touchpoints higher\n\n### 5. U-Shaped Attribution\nAssigns higher credit to first and last touchpoints, distributing remainder to middle touchpoints.\n\n```python\nfrom pyattrscore import UShapedAttribution\n\nmodel = UShapedAttribution(\n first_touch_weight=0.4,\n last_touch_weight=0.4\n # Remaining 20% distributed to middle touchpoints\n)\nresults = model.calculate_attribution(data)\n```\n\n**Use Cases:**\n- Balancing awareness and conversion touchpoints\n- Multi-touch customer journeys\n- Understanding nurturing touchpoint value\n\n### 6. 
Windowed First Touch Attribution\nAssigns 100% credit to the first touchpoint within the attribution window.\n\n```python\nfrom pyattrscore import WindowedFirstTouchAttribution, AttributionConfig\n\nconfig = AttributionConfig(attribution_window_days=14)\nmodel = WindowedFirstTouchAttribution(config)\nresults = model.calculate_attribution(data)\n```\n\n**Use Cases:**\n- Understanding recent awareness drivers\n- Time-bounded first touch analysis\n- Focusing on relevant touchpoints\n\n### 7. \ud83c\udfc8 Football-Based Attribution Model (Improved Definition)\n\nThe Football-Based Attribution Model applies a football (soccer) metaphor to marketing attribution, treating marketing channels as players on a football team. Each channel is assigned a role based on its contribution to the customer journey, and a Channel Impact Score (CIS) is calculated to quantify its overall impact.\n\n```python\nfrom pyattrscore import FootballAttribution, FootballAttributionConfig\n\n# Configure the football model\nconfig = FootballAttributionConfig(\n attribution_window_days=30,\n scorer_weight=0.25, # Final conversion touchpoint\n assister_weight=0.20, # Setup touchpoint before conversion\n key_passer_weight=0.15, # Journey initiator\n most_passes_weight=0.14, # Most frequent engagement\n most_minutes_weight=0.10, # Longest engagement time\n most_dribbles_weight=0.10, # Cold lead revival\n participant_weight=0.06, # Supporting touchpoint\n baseline_weight=0.1,\n cold_lead_threshold_days=7\n)\n\nmodel = FootballAttribution(config)\nresults = model.calculate_attribution(data)\n\n# Get team performance summary\nsummary = model.get_channel_performance_summary(results)\nprint(summary)\n```\n\n### Football Roles and Their Marketing Analogies\n\n- **Scorer**: The final touchpoint that directly leads to conversion, analogous to the striker who scores the goal.\n- **Assister**: The touchpoint immediately preceding the conversion, setting up the \"goal,\" similar to a midfielder providing an 
assist.\n- **Key Passer**: The journey initiator, the first touchpoint that starts the conversion build-up, like a defender or playmaker starting the play.\n- **Most Passes**: The channel with the highest frequency of engagement, representing consistent involvement.\n- **Most Minutes**: The channel with the longest engagement time, indicating sustained interaction.\n- **Most Dribbles**: The channel that revives cold leads, re-engaging users after inactivity.\n- **Participant**: Supporting touchpoints that contribute but do not fit the above roles.\n\n### Channel Archetypes\n\nChannels are classified into archetypes based on their typical marketing role:\n\n- **Generator**: Creates awareness and initiates plays (e.g., Organic Search, Social Media).\n- **Assister**: Nurtures and sets up conversions (e.g., Email, Paid Search).\n- **Closer**: Finishes conversions (e.g., Direct, Referral).\n- **Participant**: Supporting roles that assist the team.\n\n### Channel Impact Score (CIS) Formula\n\nThe CIS quantifies the contribution of each channel by combining a baseline weight with weighted role contributions:\n\n```\nCIS = baseline_weight + (1 - baseline_weight) \u00d7 \u03a3(role_weight \u00d7 role_indicator)\n```\n\nWhere:\n\n- `baseline_weight` is a minimum credit assigned to all touchpoints.\n- `role_weight` is the predefined weight for each football role.\n- `role_indicator` is 1 if the channel has the role, 0 otherwise.\n\nThis formula ensures that channels with key roles receive higher attribution while all channels receive some baseline credit.\n\n\n## \u2699\ufe0f Configuration\n\n### Using Configuration Objects\n\n```python\nfrom pyattrscore import AttributionConfig, FootballAttributionConfig\n\n# Standard configuration\nconfig = AttributionConfig(\n attribution_window_days=30,\n decay_rate=0.6,\n include_non_converting_paths=False\n)\n\n# Football-specific configuration\nfootball_config = FootballAttributionConfig(\n attribution_window_days=30,\n 
scorer_weight=0.25,\n assister_weight=0.20,\n baseline_weight=0.1,\n channel_archetypes={\n 'organic_search': 'generator',\n 'paid_search': 'assister',\n 'direct': 'closer',\n 'referral': 'closer'\n }\n)\n```\n\n### Using YAML Configuration\n\n```yaml\n# config.yaml\nglobal:\n attribution_window_days: 30\n log_level: \"INFO\"\n\nmodels:\n linear:\n use_attribution_window: true\n \n exponential_decay:\n decay_rate: 0.5\n use_attribution_window: true\n \n football:\n role_weights:\n scorer_weight: 0.25\n assister_weight: 0.20\n key_passer_weight: 0.15\n baseline_weight: 0.1\n cold_lead_threshold_days: 7\n channel_archetypes:\n organic_search: \"generator\"\n paid_search: \"assister\"\n direct: \"closer\"\n```\n\n```python\nimport yaml\nfrom pyattrscore import AttributionConfig, FootballAttributionConfig\n\nwith open('config.yaml', 'r') as f:\n config_dict = yaml.safe_load(f)\n\nconfig = AttributionConfig(**config_dict['global'])\nfootball_config = FootballAttributionConfig(**config_dict['models']['football'])\n```\n\n## \ud83d\udcc8 Advanced Usage\n\n### Football Attribution Analysis\n\n```python\nfrom pyattrscore import FootballAttribution\nimport pandas as pd\n\n# Load your data\ndata = pd.read_csv('sample_data.csv')\n\n# Initialize football model\nmodel = FootballAttribution()\nresults = model.calculate_attribution(data)\n\n# Analyze team performance\nsummary = model.get_channel_performance_summary(results)\n\n# Top performers\nprint(\"\ud83e\udd45 Top Scorers (Closers):\")\ntop_scorers = summary.nlargest(3, 'channel_goals')\nprint(top_scorers[['channel', 'channel_goals', 'channel_archetype']])\n\nprint(\"\\n\ud83c\udfaf Top Assisters (Setup Channels):\")\ntop_assisters = summary.nlargest(3, 'channel_assists')\nprint(top_assisters[['channel', 'channel_assists', 'channel_archetype']])\n\n# Team formation analysis\nprint(\"\\n\ud83c\udfdf\ufe0f Team Formation Performance:\")\narchetype_performance = summary.groupby('channel_archetype').agg({\n 'channel_goals': 
'sum',\n    'channel_assists': 'sum',\n    'attribution_score': 'sum'\n}).round(2)\nprint(archetype_performance)\n```\n\n### Model Comparison\n\n```python\nimport pandas as pd\nfrom pyattrscore import get_model, list_models\n\n# `config` and `data` are the objects created in the earlier examples\n# Compare multiple models including football\nmodels_to_compare = ['first_touch', 'last_touch', 'linear', 'u_shaped', 'football']\nresults_comparison = {}\n\nfor model_name in models_to_compare:\n    model = get_model(model_name, config)\n    results = model.calculate_attribution(data)\n\n    # Aggregate by channel\n    channel_attribution = results.groupby('channel')['attribution_score'].sum()\n    results_comparison[model_name] = channel_attribution\n\ncomparison_df = pd.DataFrame(results_comparison).fillna(0)\nprint(comparison_df)\n\n# Football-specific analysis\nif 'football' in models_to_compare:\n    football_model = get_model('football')\n    football_results = football_model.calculate_attribution(data)\n    team_summary = football_model.get_channel_performance_summary(football_results)\n    print(\"\\n\ud83c\udfc8 Team Performance Summary:\")\n    print(team_summary[['channel', 'channel_archetype', 'channel_goals', 'channel_assists']])\n```\n\n### Batch Processing Multiple Users\n\n```python\nfrom datetime import datetime\n\nimport pandas as pd\nfrom pyattrscore import FootballAttribution\n\n# Dataset with multiple users (a small stand-in for a large one)\nlarge_data = pd.DataFrame([\n    # User 1 journey\n    {'user_id': 'user_001', 'touchpoint_id': 'tp_001', 'channel': 'email',\n     'timestamp': datetime(2023, 1, 1), 'conversion': False, 'engagement_time': 30.0},\n    {'user_id': 'user_001', 'touchpoint_id': 'tp_002', 'channel': 'search',\n     'timestamp': datetime(2023, 1, 2), 'conversion': True, 'conversion_value': 100.0, 'engagement_time': 60.0},\n\n    # User 2 journey\n    {'user_id': 'user_002', 'touchpoint_id': 'tp_003', 'channel': 'social',\n     'timestamp': datetime(2023, 1, 1), 'conversion': False, 'engagement_time': 25.0},\n    {'user_id': 'user_002', 'touchpoint_id': 'tp_004', 'channel': 'email',\n     'timestamp': datetime(2023, 1, 3), 'conversion': True, 
'conversion_value': 200.0, 'engagement_time': 45.0},\n])\n\nmodel = FootballAttribution()\nresults = model.calculate_attribution(large_data)\n\n# Analyze results by channel\nchannel_performance = results.groupby('channel').agg({\n 'attribution_score': 'sum',\n 'attribution_value': 'sum',\n 'user_id': 'nunique',\n 'channel_goals': 'first',\n 'channel_assists': 'first'\n}).round(4)\n\nprint(channel_performance)\n```\n\n## \ud83d\udd27 Data Requirements\n\n### Required Columns\n\nYour input DataFrame must contain these columns:\n\n- `user_id` (str): Unique identifier for each user/customer\n- `touchpoint_id` (str): Unique identifier for each touchpoint\n- `channel` (str): Marketing channel name (e.g., 'email', 'search', 'social')\n- `timestamp` (datetime): When the touchpoint occurred\n\n### Optional Columns\n\n- `conversion` (bool): Whether this touchpoint led to a conversion\n- `conversion_value` (float): Monetary value of the conversion\n- `engagement_time` (float): Time spent on the touchpoint (recommended for Football Attribution)\n\n### Sample Data File\n\nUse the provided `sample_data.csv` for testing:\n\n```python\nimport pandas as pd\nfrom pyattrscore import FootballAttribution\n\n# Load sample data\ndata = pd.read_csv('sample_data.csv')\nprint(data.head())\n\n# Run football attribution\nmodel = FootballAttribution()\nresults = model.calculate_attribution(data)\n```\n\n### Data Validation\n\nPyAttrScore automatically validates your data:\n\n```python\nfrom pyattrscore.exceptions import InvalidInputError\n\ntry:\n results = model.calculate_attribution(invalid_data)\nexcept InvalidInputError as e:\n print(f\"Data validation failed: {e}\")\n print(f\"Invalid fields: {e.invalid_fields}\")\n```\n\n## \ud83d\udcca Output Format\n\nAttribution results include:\n\n```python\n# Standard columns\nresults.columns\n# ['user_id', 'touchpoint_id', 'channel', 'timestamp', 'conversion',\n# 'attribution_score', 'attribution_percentage', 'model_name', 'attribution_value']\n\n# 
Football-specific columns (when using FootballAttribution)\n# ['football_roles', 'channel_archetype', 'channel_goals', 'channel_assists',\n#  'channel_passes', 'channel_minutes', 'channel_expected_goals']\n\n# Example output\nprint(results.head())\n#     user_id touchpoint_id       channel  ...  football_roles channel_archetype\n# 0  user_001        tp_001         email  ...      [assister]          assister\n# 1  user_001        tp_002  social_media  ...    [key_passer]         generator\n# 2  user_001        tp_003        search  ...        [scorer]            closer\n```\n\n## \ud83e\uddea Testing\n\nRun the test suite:\n\n```bash\n# Run all tests\npytest\n\n# Run with coverage\npytest --cov=pyattrscore --cov-report=html\n\n# Run football attribution tests\npytest tests/test_football.py\n\n# Run with verbose output\npytest -v\n```\n\n## \ud83c\udfc8 Football Attribution Examples\n\n### Example 1: Specification Example\n```python\nfrom datetime import datetime\n\nimport pandas as pd\nfrom pyattrscore import FootballAttribution\n\n# The classic example from the specification\ndata = pd.DataFrame({\n    'user_id': ['customer_1', 'customer_1', 'customer_1'],\n    'touchpoint_id': ['tp_1', 'tp_2', 'tp_3'],\n    'channel': ['organic_search', 'paid_search', 'referral'],\n    'timestamp': [\n        datetime(2024, 1, 1, 10, 0),\n        datetime(2024, 1, 2, 11, 0),\n        datetime(2024, 1, 3, 12, 0)\n    ],\n    'conversion': [False, False, True],\n    'conversion_value': [None, None, 100.0],\n    'engagement_time': [30.0, 45.0, 60.0]\n})\n\nmodel = FootballAttribution()\nresults = model.calculate_attribution(data)\n\n# Expected results:\n# Referral (Closer): ~39%\n# Paid Search (Assister): ~26%\n# Organic Search (Generator): ~35%\n```\n\n### Example 2: Multi-Customer Analysis\n```bash\n# Run the comprehensive example\npython football_example.py\n\n# This will show:\n# - Role assignments for each touchpoint\n# - Channel performance metrics\n# - Team formation analysis\n# - Football analytics insights\n```\n\n## \ud83d\udcc4 License\n\nThis project is licensed under the MIT License - see the [LICENSE](LICENSE) file for details.\n\n## \ud83d\ude4f Acknowledgments\n\n- Built with 
[pandas](https://pandas.pydata.org/), [numpy](https://numpy.org/), and [pydantic](https://pydantic-docs.helpmanual.io/)\n- Inspired by marketing attribution research and football analytics\n",
"bugtrack_url": null,
"license": null,
"summary": "Python Attribution Modeling Package for Marketing Analytics",
"version": "0.0.1",
"project_urls": null,
"split_keywords": [
"attribution",
" marketing",
" analytics",
" conversion",
" touchpoint",
" customer-journey",
" marketing-mix-modeling",
" data-science",
" machine-learning"
],
"urls": [
{
"comment_text": null,
"digests": {
"blake2b_256": "a54e50987b6b0ff6bc58fde1e22e009a484297693864e7a4285b715768037f2f",
"md5": "31bee5e88d2bdbd2d3d7e9e63982b0b3",
"sha256": "bede259b726afb3dc3e3157ec05a57a7c2392a5648a681b84dca57e0b029f75d"
},
"downloads": -1,
"filename": "pyattrscore-0.0.1-py3-none-any.whl",
"has_sig": false,
"md5_digest": "31bee5e88d2bdbd2d3d7e9e63982b0b3",
"packagetype": "bdist_wheel",
"python_version": "py3",
"requires_python": ">=3.8",
"size": 122031,
"upload_time": "2025-11-01T09:13:30",
"upload_time_iso_8601": "2025-11-01T09:13:30.724905Z",
"url": "https://files.pythonhosted.org/packages/a5/4e/50987b6b0ff6bc58fde1e22e009a484297693864e7a4285b715768037f2f/pyattrscore-0.0.1-py3-none-any.whl",
"yanked": false,
"yanked_reason": null
},
{
"comment_text": null,
"digests": {
"blake2b_256": "7a0a9938d58666e245cf698d39a37cebeeb4bb596522e325d04ea247e8749e4c",
"md5": "b844459894538272456d1c769af6532a",
"sha256": "72f7e639356f9dffadf500bea1669549a0b6bcf98b01527056eace062d438fc4"
},
"downloads": -1,
"filename": "pyattrscore-0.0.1.tar.gz",
"has_sig": false,
"md5_digest": "b844459894538272456d1c769af6532a",
"packagetype": "sdist",
"python_version": "source",
"requires_python": ">=3.8",
"size": 107908,
"upload_time": "2025-11-01T09:13:32",
"upload_time_iso_8601": "2025-11-01T09:13:32.773111Z",
"url": "https://files.pythonhosted.org/packages/7a/0a/9938d58666e245cf698d39a37cebeeb4bb596522e325d04ea247e8749e4c/pyattrscore-0.0.1.tar.gz",
"yanked": false,
"yanked_reason": null
}
],
"upload_time": "2025-11-01 09:13:32",
"github": false,
"gitlab": false,
"bitbucket": false,
"codeberg": false,
"lcname": "pyattrscore"
}