`tldextract` accurately separates a URL's subdomain, domain, and public suffix,
using the Public Suffix List (PSL).

    >>> import tldextract
    >>> tldextract.extract('http://forums.news.cnn.com/')
    ExtractResult(subdomain='forums.news', domain='cnn', suffix='com')
    >>> tldextract.extract('http://forums.bbc.co.uk/') # United Kingdom
    ExtractResult(subdomain='forums', domain='bbc', suffix='co.uk')
    >>> tldextract.extract('http://www.worldbank.org.kg/') # Kyrgyzstan
    ExtractResult(subdomain='www', domain='worldbank', suffix='org.kg')

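
To illustrate why a suffix list is needed at all, here is a minimal sketch of longest-suffix matching. It is not `tldextract`'s implementation: the `SUFFIXES` set is a hand-picked, hypothetical excerpt, `split_host` is an illustrative name, and the PSL's wildcard and exception rules are ignored.

```python
from urllib.parse import urlsplit

# Hypothetical, hand-picked excerpt; the real PSL holds thousands of rules.
SUFFIXES = {"com", "uk", "co.uk", "kg", "org.kg"}


def split_host(url):
    """Return (subdomain, domain, suffix) using the longest known suffix."""
    host = urlsplit(url).hostname or ""
    labels = host.split(".")
    # Candidates run from longest to shortest, so 'co.uk' wins over 'uk'.
    for i in range(len(labels)):
        candidate = ".".join(labels[i:])
        if candidate in SUFFIXES:
            domain = labels[i - 1] if i > 0 else ""
            subdomain = ".".join(labels[: max(i - 1, 0)])
            return subdomain, domain, candidate
    # No known suffix: treat the last label as the domain.
    return ".".join(labels[:-1]), labels[-1], ""


print(split_host("http://forums.bbc.co.uk/"))
```

Splitting on the last dot alone would report `uk` as the suffix and `co` as the domain for the second URL above; matching against the list is what yields `bbc` and `co.uk`.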
`ExtractResult` is a namedtuple, so it's simple to access the parts you want.

    >>> ext = tldextract.extract('http://forums.bbc.co.uk')
    >>> (ext.subdomain, ext.domain, ext.suffix)
    ('forums', 'bbc', 'co.uk')
    >>> # rejoin subdomain and domain
    >>> '.'.join(ext[:2])
    'forums.bbc'
    >>> # a common alias
    >>> ext.registered_domain
    'bbc.co.uk'

By default, this package supports the public ICANN TLDs and their exceptions.
You can optionally support the Public Suffix List's private domains as well.
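
The difference between the two sections of the list can be sketched as follows. This is an illustration under stated assumptions, not the package's own code: `PSL_EXCERPT` is a tiny hypothetical excerpt written in the PSL file format, `load_suffixes` and `public_suffix` are illustrative names, and wildcard and exception rules are left out.

```python
# Hypothetical excerpt in the PSL file format, with its section markers.
PSL_EXCERPT = """\
// ===BEGIN ICANN DOMAINS===
com
uk
co.uk
// ===END ICANN DOMAINS===
// ===BEGIN PRIVATE DOMAINS===
blogspot.com
// ===END PRIVATE DOMAINS===
"""


def load_suffixes(psl_text, include_private=False):
    """Collect suffix rules, skipping the private section unless requested."""
    suffixes = set()
    in_private = False
    for line in psl_text.splitlines():
        line = line.strip()
        if line.startswith("// ===BEGIN PRIVATE DOMAINS"):
            in_private = True
        elif line.startswith("// ===END PRIVATE DOMAINS"):
            in_private = False
        if not line or line.startswith("//"):
            continue  # blank lines and comments carry no rule
        if include_private or not in_private:
            suffixes.add(line)
    return suffixes


def public_suffix(host, suffixes):
    """Return the longest suffix of host found in suffixes, or ''."""
    labels = host.split(".")
    for i in range(len(labels)):
        candidate = ".".join(labels[i:])
        if candidate in suffixes:
            return candidate
    return ""


host = "waiting.blogspot.com"
print(public_suffix(host, load_suffixes(PSL_EXCERPT)))
print(public_suffix(host, load_suffixes(PSL_EXCERPT, include_private=True)))
```

With only the ICANN section loaded, the suffix of `waiting.blogspot.com` is `com`; once the private section is included, it becomes `blogspot.com`, so `waiting` is treated as the domain.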
Raw data
{
"_id": null,
"home_page": "https://github.com/john-kurkowski/tldextract",
"name": "tldextract",
"maintainer": "",
"docs_url": null,
"requires_python": ">=3.7",
"maintainer_email": "",
"keywords": "tld domain subdomain url parse extract urlparse urlsplit public suffix list publicsuffix publicsuffixlist",
"author": "John Kurkowski",
"author_email": "john.kurkowski@gmail.com",
"download_url": "https://files.pythonhosted.org/packages/5e/10/8b126c7314daadd52c2c27250d27fea46f20d1d8f823d4fedd3fb9dc7e79/tldextract-3.3.0.tar.gz",
"platform": null,
"description": " `tldextract` accurately separates a URL's subdomain, domain, and public suffix,\nusing the Public Suffix List (PSL).\n\n >>> import tldextract\n >>> tldextract.extract('http://forums.news.cnn.com/')\n ExtractResult(subdomain='forums.news', domain='cnn', suffix='com')\n >>> tldextract.extract('http://forums.bbc.co.uk/') # United Kingdom\n ExtractResult(subdomain='forums', domain='bbc', suffix='co.uk')\n >>> tldextract.extract('http://www.worldbank.org.kg/') # Kyrgyzstan\n ExtractResult(subdomain='www', domain='worldbank', suffix='org.kg')\n\n`ExtractResult` is a namedtuple, so it's simple to access the parts you want.\n\n >>> ext = tldextract.extract('http://forums.bbc.co.uk')\n >>> (ext.subdomain, ext.domain, ext.suffix)\n ('forums', 'bbc', 'co.uk')\n >>> # rejoin subdomain and domain\n >>> '.'.join(ext[:2])\n 'forums.bbc'\n >>> # a common alias\n >>> ext.registered_domain\n 'bbc.co.uk'\n\nBy default, this package supports the public ICANN TLDs and their exceptions.\nYou can optionally support the Public Suffix List's private domains as well.\n\n\n",
"bugtrack_url": null,
"license": "BSD License",
"summary": "Accurately separates a URL's subdomain, domain, and public suffix, using the Public Suffix List (PSL). By default, this includes the public ICANN TLDs and their exceptions. You can optionally support the Public Suffix List's private domains as well.",
"version": "3.3.0",
"split_keywords": [
"tld",
"domain",
"subdomain",
"url",
"parse",
"extract",
"urlparse",
"urlsplit",
"public",
"suffix",
"list",
"publicsuffix",
"publicsuffixlist"
],
"urls": [
{
"comment_text": "",
"digests": {
"md5": "d28ad18e37489f62920c0c82f9fd5c50",
"sha256": "5d88321b1b528ebb8f678c72ab023f37caf6381f6af9576b4e60fd266cff178c"
},
"downloads": -1,
"filename": "tldextract-3.3.0-py3-none-any.whl",
"has_sig": false,
"md5_digest": "d28ad18e37489f62920c0c82f9fd5c50",
"packagetype": "bdist_wheel",
"python_version": "py3",
"requires_python": ">=3.7",
"size": 93581,
"upload_time": "2022-05-04T23:35:38",
"upload_time_iso_8601": "2022-05-04T23:35:38.509901Z",
"url": "https://files.pythonhosted.org/packages/8f/ef/6a05da5e708016b495b2d559c773c7f89fc87fd683058697a89e237e30f3/tldextract-3.3.0-py3-none-any.whl",
"yanked": false,
"yanked_reason": null
},
{
"comment_text": "",
"digests": {
"md5": "69eac4b37a72121e0f3e31fd1915b7bb",
"sha256": "adcd24abf21ce3450417cd5a00f23b7e57554ce8ae827334dd12bfcbb6274cf1"
},
"downloads": -1,
"filename": "tldextract-3.3.0.tar.gz",
"has_sig": false,
"md5_digest": "69eac4b37a72121e0f3e31fd1915b7bb",
"packagetype": "sdist",
"python_version": "source",
"requires_python": ">=3.7",
"size": 110388,
"upload_time": "2022-05-04T23:35:40",
"upload_time_iso_8601": "2022-05-04T23:35:40.255362Z",
"url": "https://files.pythonhosted.org/packages/5e/10/8b126c7314daadd52c2c27250d27fea46f20d1d8f823d4fedd3fb9dc7e79/tldextract-3.3.0.tar.gz",
"yanked": false,
"yanked_reason": null
}
],
"upload_time": "2022-05-04 23:35:40",
"github": true,
"gitlab": false,
"bitbucket": false,
"github_user": "john-kurkowski",
"github_project": "tldextract",
"travis_ci": true,
"coveralls": false,
"github_actions": false,
"tox": true,
"lcname": "tldextract"
}