utm-referrer-attribution-parser


- Name: utm-referrer-attribution-parser
- Version: 0.1.3
- Summary: Modern Python library combining referrer parsing with tracking parameter extraction for web analytics
- Upload time: 2025-08-10 21:08:55
- Requires Python: >=3.8
- Keywords: analytics, attribution, referrer, tracking, utm

# utm-referrer-attribution-parser

A modern Python library that combines referrer parsing with tracking parameter extraction for comprehensive web analytics attribution.

## ✨ Super Simple API

```python
from utm_referrer_parser import webmetic_referrer

# Just pass the URL and optional referrer - that's it!
result = webmetic_referrer(
    url="https://example.com/page?utm_source=google&utm_medium=cpc&gclid=abc123",
    referrer="https://www.google.com/search?q=analytics"
)

print(result)
# {
#     'source': 'google',
#     'medium': 'cpc',
#     'click_id': 'abc123',
#     'click_id_type': 'gclid',
#     'term': 'analytics'
# }
```

## 🚀 Features

- **Ultra-Simple API**: Just `webmetic_referrer(url, referrer)` - that's it!
- **Unified Click Tracking**: Clean `click_id` and `click_id_type` fields instead of 15+ individual parameters
- **25+ Tracking Parameters**: UTM, Google Ads, Facebook, TikTok, LinkedIn, email platforms, and more
- **Smart Referrer Analysis**: Uses Snowplow's referrer database for accurate source/medium classification
- **Advanced Domain Parsing**: Uses tldextract for robust international domain handling (.co.uk, .com.au, etc.)
- **Auto-updating Database**: Weekly updates of referrer database with local fallback
- **High Performance**: In-memory caching and optimized parsing
- **Framework Agnostic**: Works with any Python web framework
- **Production Ready**: 99%+ accuracy validated with 150+ real-world test cases
- **International Support**: Handles global search engines (Google, Bing, Baidu, Yandex, Naver, etc.)

## 📦 Installation

```bash
pip install utm-referrer-attribution-parser
```

## 🎯 Quick Examples

### Google Ads Click
```python
from utm_referrer_parser import webmetic_referrer

result = webmetic_referrer(
    url="https://site.com/landing?utm_source=google&utm_medium=cpc&gclid=abc123"
)
# Returns: {'source': 'google', 'medium': 'cpc', 'click_id': 'abc123', 'click_id_type': 'gclid'}
```

### Facebook Ad
```python
result = webmetic_referrer(
    url="https://site.com/product?fbclid=fb123",
    referrer="https://www.facebook.com/"
)
# Returns: {'source': 'facebook', 'medium': 'cpc', 'click_id': 'fb123', 'click_id_type': 'fbclid'}
```

### Organic Search
```python
result = webmetic_referrer(
    url="https://site.com/blog",
    referrer="https://www.google.com/search?q=analytics+guide"
)
# Returns: {'source': 'Google', 'medium': 'search', 'term': 'analytics guide'}
```
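
The `term` value in the result above comes from the referrer's query string. A minimal sketch of that step, assuming the search engine carries the query in the `q` parameter as Google does in this example; `search_term` is a hypothetical helper, not part of the library:

```python
from typing import Optional
from urllib.parse import parse_qs, urlsplit


def search_term(referrer: str, param: str = "q") -> Optional[str]:
    """Return the decoded search term from a referrer URL, or None if absent."""
    # parse_qs decodes percent-escapes and turns '+' into a space.
    values = parse_qs(urlsplit(referrer).query).get(param)
    return values[0] if values else None


print(search_term("https://www.google.com/search?q=analytics+guide"))
```

Other engines may use a different parameter name, which is why `param` is left configurable here.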

### Direct Traffic
```python
result = webmetic_referrer("https://site.com/")
# Returns: {'source': '(direct)', 'medium': '(none)'}
```

### Internal Navigation
```python
result = webmetic_referrer(
    url="https://shop.example.com/products",
    referrer="https://example.com/"
)
# Returns: {'source': '(internal)', 'medium': 'internal'}
```

The library automatically detects internal navigation between subdomains using advanced TLD parsing, correctly handling complex domains like `.co.uk`, `.com.au`, `.org.br`, etc.
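
As a rough illustration of the idea (not the library's code), the sketch below treats two URLs as internal when they share a registrable domain. It uses only the standard library and a tiny hard-coded set of multi-part suffixes as a stand-in for the full public suffix list that tldextract consults; `registrable_domain`, `is_internal`, and `MULTI_PART_SUFFIXES` are hypothetical names.

```python
from urllib.parse import urlsplit

# Assumption: a tiny stand-in for the public suffix list that tldextract uses.
MULTI_PART_SUFFIXES = {"co.uk", "com.au", "org.br"}


def registrable_domain(url: str) -> str:
    """Return the registrable domain of a URL, e.g. 'example.co.uk'."""
    host = (urlsplit(url).hostname or "").lower()
    labels = host.split(".")
    # Keep three labels when the last two form a known multi-part suffix.
    keep = 3 if ".".join(labels[-2:]) in MULTI_PART_SUFFIXES else 2
    return ".".join(labels[-keep:])


def is_internal(url: str, referrer: str) -> bool:
    """True when both URLs share a non-empty registrable domain."""
    page = registrable_domain(url)
    return bool(page) and page == registrable_domain(referrer)


print(is_internal("https://shop.example.com/products", "https://example.com/"))
```

Comparing registrable domains rather than full hostnames is what lets `shop.example.com` and `example.com` match, while `example.co.uk` and `other.co.uk` stay distinct.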

## 🎯 Unified Click Tracking

Instead of tracking 15+ individual click ID fields, we provide a clean unified structure:

### Old Approach (Complex)
```python
# Multiple individual fields to check
result = {
    'gclid': 'abc123',
    'fbclid': None,
    'ttclid': None,
    'msclkid': None,
    # ... 15+ more fields
}
```

### New Approach (Clean)
```python
# Just 2 unified fields
result = {
    'click_id': 'abc123',        # The actual tracking value
    'click_id_type': 'gclid'     # Which parameter it came from
}
```

### Benefits
- **Cleaner API**: 2 fields instead of 15+
- **Easier Logic**: Simple `if result.get('click_id')` checks (the example results above omit the key when no click ID is present)
- **Platform Detection**: Still get source/medium attribution automatically
- **Priority Handling**: Google Ads → Facebook → Microsoft → Other platforms
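
The unified structure and its priority order can be sketched with the standard library alone. This is an illustration under assumptions, not the library's implementation: `extract_click_id` and `CLICK_ID_PRIORITY` are hypothetical names, and the tuple covers only the four parameters named in this section.

```python
from urllib.parse import parse_qs, urlsplit

# Assumption: checked in the order listed above (Google Ads, Facebook,
# Microsoft, then others); the real list is longer.
CLICK_ID_PRIORITY = ("gclid", "fbclid", "msclkid", "ttclid")


def extract_click_id(url: str) -> dict:
    """Return the first click ID found in priority order, or an empty dict."""
    params = parse_qs(urlsplit(url).query)
    for name in CLICK_ID_PRIORITY:
        values = params.get(name)
        if values:
            return {"click_id": values[0], "click_id_type": name}
    return {}


print(extract_click_id("https://site.com/?fbclid=fb123&gclid=abc123"))
```

Because the sketch returns an empty dict when nothing matches, callers can test `result.get("click_id")` without first checking which platform sent the visitor.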

## Supported Parameters

### Standard UTM
- `utm_source`, `utm_medium`, `utm_campaign`, `utm_term`, `utm_content`, `utm_id`

### Click Tracking (Unified)
- `click_id` - The actual click tracking value
- `click_id_type` - Which parameter provided it (`gclid`, `fbclid`, `ttclid`, etc.)

### Google Ads Metadata
- `gclsrc`, `gad_source`, `srsltid`

### Social Media
- `igshid` (Instagram), `sccid` (Snapchat)

### Email Marketing
- `mc_cid`, `mc_eid` (Mailchimp)
- `ml_subscriber_hash` (MailerLite)

### Other Platform Parameters
- `epik` (Pinterest), `ttd_uuid` (Trade Desk), `obOrigUrl` (Outbrain), and more
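
To illustrate how the UTM group can be read from a landing URL, here is a standard-library sketch. `extract_utm` and `UTM_KEYS` are hypothetical names, and stripping the `utm_` prefix is an assumption modeled on the `source`/`medium` keys shown in the earlier examples; the library's own extraction may differ.

```python
from urllib.parse import parse_qs, urlsplit

UTM_KEYS = (
    "utm_source", "utm_medium", "utm_campaign",
    "utm_term", "utm_content", "utm_id",
)


def extract_utm(url: str) -> dict:
    """Return the UTM parameters present in the URL, without the 'utm_' prefix."""
    params = parse_qs(urlsplit(url).query)
    # parse_qs drops blank values, so every key found has at least one value.
    return {key[len("utm_"):]: params[key][0] for key in UTM_KEYS if key in params}


print(extract_utm("https://example.com/page?utm_source=google&utm_medium=cpc&gclid=abc123"))
```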

## 🧪 Validation & Testing

This library has been extensively tested with:
- **150+ real database cases** from production environments
- **50+ diverse internet scenarios** covering global platforms
- **99%+ accuracy rate** in attribution detection
- **Robust error handling**: no crashes on malformed inputs in these tests

### Supported Platforms
- **Search Engines**: Google, Bing, Baidu, Yandex, DuckDuckGo, Naver, Yahoo, Ecosia
- **Social Media**: Facebook, Instagram, TikTok, Twitter, LinkedIn, Pinterest, Reddit, Snapchat
- **Email Marketing**: Mailchimp, MailerLite, Constant Contact, SendGrid, ConvertKit
- **Business Tools**: Slack, Microsoft Teams, Calendly, Notion, Zoom
- **E-commerce**: Amazon, eBay, Shopify, Etsy, AliExpress

## 🔄 Migration from Complex Systems

Replace complex tracking data dictionaries with simple function calls:

```python
# OLD: Complex dictionary approach
tracking_data = {
    "dl": "https://site.com/?utm_source=google&gclid=abc123",
    "dr": "https://www.google.com/search?q=analytics", 
    "bu": "https://site.com"
}
result = parse_attribution(tracking_data)

# NEW: Ultra-simple API
result = webmetic_referrer(
    url="https://site.com/?utm_source=google&gclid=abc123",
    referrer="https://www.google.com/search?q=analytics"
)
```
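
If existing code still produces the old payload, a thin adapter can bridge the two call styles. This sketch assumes, based on the example above, that `dl` holds the page URL and `dr` the referrer; `call_with_legacy_payload` is a hypothetical helper, and the parser is passed in as an argument so the adapter does not depend on any particular function.

```python
from typing import Callable


def call_with_legacy_payload(tracking_data: dict, parser: Callable[..., dict]) -> dict:
    """Map a legacy payload onto the url/referrer keyword arguments."""
    kwargs = {"url": tracking_data["dl"]}  # page URL in the old payload
    # Only forward the referrer when the payload actually carries one.
    if tracking_data.get("dr"):
        kwargs["referrer"] = tracking_data["dr"]
    return parser(**kwargs)
```

With the library installed, the call would look like `call_with_legacy_payload(tracking_data, webmetic_referrer)`.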

## 📊 What Makes This Different

- **Intelligent Priority**: UTM parameters → Click IDs → Referrer analysis → Direct traffic
- **Unified Click Tracking**: Clean `click_id`/`click_id_type` structure instead of 15+ individual fields
- **Click ID Detection**: Automatically identifies 25+ types of advertising click IDs
- **International Ready**: Built-in support for global search engines and platforms  
- **Real-world Tested**: Validated against actual production analytics data
- **Future Proof**: Auto-updating referrer database keeps up with new platforms
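
The priority chain in the first bullet can be pictured as a series of early returns. The following is a simplified sketch, not the library's implementation: `attribute`, `CLICK_ID_SOURCES`, and `SEARCH_HOSTS` are hypothetical, the two small tables stand in for the real click ID list and referrer database, and the `"(none)"` default for a missing `utm_medium` and the `"unknown"` medium for unrecognized referrers are assumptions.

```python
from urllib.parse import parse_qs, urlsplit

# Assumption: two-entry stand-ins for the real click ID and referrer tables.
CLICK_ID_SOURCES = {"gclid": "google", "fbclid": "facebook"}
SEARCH_HOSTS = {"www.google.com": "Google", "www.bing.com": "Bing"}


def attribute(url: str, referrer: str = "") -> dict:
    """Resolve source/medium by trying each signal in priority order."""
    params = parse_qs(urlsplit(url).query)

    # 1. Explicit UTM parameters win.
    if "utm_source" in params:
        return {"source": params["utm_source"][0],
                "medium": params.get("utm_medium", ["(none)"])[0]}

    # 2. A click ID implies a paid click on the matching platform.
    for name, source in CLICK_ID_SOURCES.items():
        if name in params:
            return {"source": source, "medium": "cpc"}

    # 3. Fall back to classifying the referrer host.
    host = urlsplit(referrer).hostname or ""
    if host in SEARCH_HOSTS:
        return {"source": SEARCH_HOSTS[host], "medium": "search"}
    if host:
        return {"source": host, "medium": "unknown"}

    # 4. No signal at all: direct traffic.
    return {"source": "(direct)", "medium": "(none)"}


print(attribute("https://site.com/blog", "https://www.google.com/search?q=analytics+guide"))
```

Ordering the checks this way means a tagged campaign URL is never reclassified by whatever referrer happens to accompany it.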

## License

MIT License - see LICENSE file for details.
            

Raw data

{
    "_id": null,
    "home_page": null,
    "name": "utm-referrer-attribution-parser",
    "maintainer": null,
    "docs_url": null,
    "requires_python": ">=3.8",
    "maintainer_email": null,
    "keywords": "analytics, attribution, referrer, tracking, utm",
    "author": null,
    "author_email": "Webmetic <support@webmetic.de>",
    "download_url": "https://files.pythonhosted.org/packages/82/89/60b5ca670c63249bc89de944ffbf67a28f8cbe261c76cecae3c323211b87/utm_referrer_attribution_parser-0.1.3.tar.gz",
    "platform": null,
    "bugtrack_url": null,
    "license": null,
    "summary": "Modern Python library combining referrer parsing with tracking parameter extraction for web analytics",
    "version": "0.1.3",
    "project_urls": {
        "Homepage": "https://webmetic.de",
        "Issues": "https://github.com/webmetic/utm-referrer-attribution-parser/issues",
        "Repository": "https://github.com/webmetic/utm-referrer-attribution-parser",
        "Website": "https://webmetic.de"
    },
    "split_keywords": [
        "analytics",
        " attribution",
        " referrer",
        " tracking",
        " utm"
    ],
    "urls": [
        {
            "comment_text": null,
            "digests": {
                "blake2b_256": "9a693ab5e5bfb5c2be099a91c0a28beb76ef2e42c84ebac471421296b08d7553",
                "md5": "cd2928f9375c5a1a67abe93fb6ed047b",
                "sha256": "c2d33a75b8365873ce9ff8cc9c6b92c2fab0f2a991dadd83974acb23d72e1f92"
            },
            "downloads": -1,
            "filename": "utm_referrer_attribution_parser-0.1.3-py3-none-any.whl",
            "has_sig": false,
            "md5_digest": "cd2928f9375c5a1a67abe93fb6ed047b",
            "packagetype": "bdist_wheel",
            "python_version": "py3",
            "requires_python": ">=3.8",
            "size": 34607,
            "upload_time": "2025-08-10T21:08:53",
            "upload_time_iso_8601": "2025-08-10T21:08:53.786924Z",
            "url": "https://files.pythonhosted.org/packages/9a/69/3ab5e5bfb5c2be099a91c0a28beb76ef2e42c84ebac471421296b08d7553/utm_referrer_attribution_parser-0.1.3-py3-none-any.whl",
            "yanked": false,
            "yanked_reason": null
        },
        {
            "comment_text": null,
            "digests": {
                "blake2b_256": "828960b5ca670c63249bc89de944ffbf67a28f8cbe261c76cecae3c323211b87",
                "md5": "679e083cdb8e1d14d79ab637fb725ed0",
                "sha256": "5c445157896372a8036b1f8efb35b98089ce267b655249923e6429c06bf1c35e"
            },
            "downloads": -1,
            "filename": "utm_referrer_attribution_parser-0.1.3.tar.gz",
            "has_sig": false,
            "md5_digest": "679e083cdb8e1d14d79ab637fb725ed0",
            "packagetype": "sdist",
            "python_version": "source",
            "requires_python": ">=3.8",
            "size": 47247,
            "upload_time": "2025-08-10T21:08:55",
            "upload_time_iso_8601": "2025-08-10T21:08:55.239427Z",
            "url": "https://files.pythonhosted.org/packages/82/89/60b5ca670c63249bc89de944ffbf67a28f8cbe261c76cecae3c323211b87/utm_referrer_attribution_parser-0.1.3.tar.gz",
            "yanked": false,
            "yanked_reason": null
        }
    ],
    "upload_time": "2025-08-10 21:08:55",
    "github": true,
    "gitlab": false,
    "bitbucket": false,
    "codeberg": false,
    "github_user": "webmetic",
    "github_project": "utm-referrer-attribution-parser",
    "github_not_found": true,
    "lcname": "utm-referrer-attribution-parser"
}
        