publicsuffixlist
===
[Public Suffix List](https://publicsuffix.org/) parser implementation for
Python 3.5+.
- Compliant with [TEST DATA](https://raw.githubusercontent.com/publicsuffix/list/master/tests/test_psl.txt)
- Supports IDNs (both Unicode and Punycode forms).
- Supports Python 3.5+.
- Ships with a built-in PSL and an updater script.
- Written in pure Python with no library dependencies.
Install
===
`publicsuffixlist` can be installed via `pip`.
```
$ pip install publicsuffixlist
```
Usage
===
Basic Usage:
```python
from publicsuffixlist import PublicSuffixList
psl = PublicSuffixList()
# Uses built-in PSL file
print(psl.publicsuffix("www.example.com")) # "com"
# the longest public suffix part
print(psl.privatesuffix("www.example.com")) # "example.com"
# the shortest domain assigned for a registrant
print(psl.privatesuffix("com")) # None
# Returns None if no private (non-public) part found
print(psl.publicsuffix("www.example.unknownnewtld")) # "unknownnewtld"
# New TLDs are valid public suffix by default
print(psl.publicsuffix("www.example.香港")) # "香港"
# Accepts unicode
print(psl.publicsuffix("www.example.xn--j6w193g")) # "xn--j6w193g"
# Accepts Punycode IDNs by default
print(psl.privatesuffix("WWW.EXAMPLE.COM")) # "example.com"
# Returns in lowercase by default
print(psl.privatesuffix("WWW.EXAMPLE.COM", keep_case=True)) # "EXAMPLE.COM"
# kwarg `keep_case=True` to disable the case conversion
```
The latest PSL is packaged once a day. If you need to parse your own version,
it can be passed as a file-like iterable object, or just a `str`:
```python
with open("latest_psl.dat", "rb") as f:
    psl = PublicSuffixList(f)
```
The unit tests and the PSL updater can be invoked as modules:
```
$ python -m publicsuffixlist.test
$ python -m publicsuffixlist.update
```
Additional convenience methods:
```python
print(psl.is_private("example.com")) # True
print(psl.is_public("example.com")) # False
print(psl.privateparts("aaa.www.example.com")) # ("aaa", "www", "example.com")
print(psl.subdomain("aaa.www.example.com", depth=1)) # "www.example.com"
```
Limitation
===
#### Domain Label Validation
`publicsuffixlist` does NOT validate domain names or labels.
In the DNS protocol, most 8-bit characters are acceptable in domain name
labels. While ICANN-compliant registries do not accept domain names containing
underscores (_), hostnames may include them. For example, DMARC records are
published under the `_dmarc` label. Users must confirm that the input domain
names are valid for their specific context.
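Since validation is left to the caller, one possible pre-check is sketched below. It is not part of `publicsuffixlist`; the function name `looks_like_hostname` and the `allow_underscore` flag are hypothetical, and the rules shown (ASCII letters, digits, and hyphens, 1 to 63 characters per label, at most 253 characters overall) are only one common policy for names that are already in ASCII/Punycode form.

```python
import re

# One label: 1-63 letters, digits, or hyphens, not starting or ending with "-".
_LDH_LABEL = re.compile(r"(?!-)[A-Za-z0-9-]{1,63}(?<!-)")
# Same rule, but underscores are also permitted (e.g. "_dmarc").
_UNDERSCORE_LABEL = re.compile(r"(?!-)[A-Za-z0-9_-]{1,63}(?<!-)")


def looks_like_hostname(name, allow_underscore=False):
    """Return True if `name` passes a simple ASCII hostname check."""
    if name.endswith("."):
        name = name[:-1]  # tolerate a single trailing root dot
    if not name or len(name) > 253:
        return False
    pattern = _UNDERSCORE_LABEL if allow_underscore else _LDH_LABEL
    return all(pattern.fullmatch(label) for label in name.split("."))


print(looks_like_hostname("www.example.com"))                            # True
print(looks_like_hostname("_dmarc.example.com"))                         # False
print(looks_like_hostname("_dmarc.example.com", allow_underscore=True))  # True
```

Whether underscores should be accepted depends on what the name is used for, which is why the sketch makes it an explicit choice.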
#### Punycode Handling
Partially encoded (Unicode-mixed) Punycode is not supported, because Punycode
encoding/decoding is very slow and its results on mixed input are
unpredictable. If you are unsure whether an input is valid Punycode, normalize
it first with `unknowndomain.encode("idna").decode("ascii")`. This conversion
to IDNA is idempotent, so already-encoded input passes through unchanged.
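The normalization step can be wrapped in a small helper, sketched here with only the standard library. The name `to_ascii_domain` is hypothetical, and returning `None` on failure is one possible design choice, not behavior of `publicsuffixlist`.

```python
def to_ascii_domain(name):
    """Return the IDNA (Punycode) form of `name`, or None if it cannot be encoded."""
    try:
        return name.encode("idna").decode("ascii")
    except UnicodeError:
        # Raised for empty or over-long labels, among other invalid input.
        return None


encoded = to_ascii_domain("www.example.香港")
print(encoded)                                # www.example.xn--j6w193g
print(to_ascii_domain(encoded))               # www.example.xn--j6w193g (unchanged)
print(to_ascii_domain("www.例.xn--j6w193g"))  # www.xn--fsq.xn--j6w193g
print(to_ascii_domain("bad..example"))        # None
```

Running untrusted input through such a helper once, before calling the parser, means the parser only ever sees fully encoded names.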
#### Handling Arbitrary Binary
If you need to accept arbitrary or malicious binary data, pass it as a tuple
of `bytes`, one element per label. Note that the returned bytes may include
byte patterns that cannot be decoded or represented as a standard domain name.
Example:
```python
psl.privatesuffix((b"a.a", b"a.example\xff", b"com")) # (b"a.example\xff", b"com")
# Note that IDNs must be punycoded when passed as tuple of bytes.
psl = PublicSuffixList("例.example")
psl.publicsuffix((b"xn--fsq", b"example")) # (b"xn--fsq", b"example")
# UTF-8 encoded bytes of "例" do not match.
psl.publicsuffix((b"\xe4\xbe\x8b", b"example")) # (b"example",)
```
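Because the returned labels may not be decodable, callers may want to inspect them before logging or storing them. The helpers below are a standard-library sketch under that assumption; `printable_labels` and `is_representable` are hypothetical names and are not provided by the package.

```python
def printable_labels(labels):
    """Render bytes labels for logs; non-ASCII bytes are shown as escapes."""
    return [label.decode("ascii", errors="backslashreplace") for label in labels]


def is_representable(labels):
    """Return True if every label is pure ASCII and has no embedded dot."""
    return all(
        b"." not in label and all(byte < 0x80 for byte in label)
        for label in labels
    )


result = (b"a.example\xff", b"com")  # shape of the tuple returned above
print(printable_labels(result))                # ['a.example\\xff', 'com']
print(is_representable(result))                # False
print(is_representable((b"example", b"com")))  # True
```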
License
===
- This module is licensed under Mozilla Public License 2.0.
- The Public Suffix List maintained by the Mozilla Foundation is licensed under
the Mozilla Public License 2.0.
- The PSL testcase dataset is in the public domain (CC0).
Development / Packaging
===
This module and its packaging workflow are maintained in the author's
repository located at https://github.com/ko-zu/psl.
A new package, which includes the latest PSL file, is automatically generated
and uploaded to PyPI. The last part of the version number represents the
release date. For example, `0.10.1.20230331` indicates a release date of March
31, 2023.
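The date component can be recovered from a version string with the standard library alone. The helper name `release_date` is hypothetical; the sketch assumes the last dot-separated field is always an eight-digit `YYYYMMDD` stamp, as in the example above.

```python
from datetime import date, datetime


def release_date(version):
    """Parse the trailing YYYYMMDD field of a version string into a date."""
    stamp = version.rsplit(".", 1)[-1]
    return datetime.strptime(stamp, "%Y%m%d").date()


print(release_date("0.10.1.20230331"))  # 2023-03-31
print(release_date("1.0.2.20250225"))   # 2025-02-25
```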
This package dropped support for Python 2.7 and for Python 3.4 and earlier in
version 1.0.0, released in June 2024. The last version that works on Python
2.x is 0.10.0.x.
Source / Link
===
- GitHub repository: (https://github.com/ko-zu/psl)
- PyPI: (https://pypi.org/project/publicsuffixlist/)
Raw data
{
"_id": null,
"home_page": "https://github.com/ko-zu/psl",
"name": "publicsuffixlist",
"maintainer": null,
"docs_url": null,
"requires_python": ">=3.5",
"maintainer_email": null,
"keywords": null,
"author": "ko-zu",
"author_email": "causeless@gmail.com",
"download_url": "https://files.pythonhosted.org/packages/43/15/0bb327fbbee1e76077ed47abe4f22a96c3a101fccea658b5dfd2ed802730/publicsuffixlist-1.0.2.20250225.tar.gz",
"platform": null,
"bugtrack_url": null,
"license": "MPL-2.0",
"summary": "publicsuffixlist implement",
"version": "1.0.2.20250225",
"project_urls": {
"Homepage": "https://github.com/ko-zu/psl"
},
"split_keywords": [],
"urls": [
{
"comment_text": null,
"digests": {
"blake2b_256": "0e588c73bdb34473090024da924cc920afaed96c179e305558cced66a48ae603",
"md5": "25c692c723fa43cbdd4322587375cc4a",
"sha256": "33b3a9e8ac4c125bcffbde42c65375cb72c7ff7352466395328146bf53ef1e55"
},
"downloads": -1,
"filename": "publicsuffixlist-1.0.2.20250225-py2.py3-none-any.whl",
"has_sig": false,
"md5_digest": "25c692c723fa43cbdd4322587375cc4a",
"packagetype": "bdist_wheel",
"python_version": "py2.py3",
"requires_python": ">=3.5",
"size": 104465,
"upload_time": "2025-02-25T03:32:21",
"upload_time_iso_8601": "2025-02-25T03:32:21.569172Z",
"url": "https://files.pythonhosted.org/packages/0e/58/8c73bdb34473090024da924cc920afaed96c179e305558cced66a48ae603/publicsuffixlist-1.0.2.20250225-py2.py3-none-any.whl",
"yanked": false,
"yanked_reason": null
},
{
"comment_text": null,
"digests": {
"blake2b_256": "43150bb327fbbee1e76077ed47abe4f22a96c3a101fccea658b5dfd2ed802730",
"md5": "a82939a47521500ef8e980b5b5e08ab0",
"sha256": "fb78dab1e437a84aacaed63c40fd18a0d34e0bf280d6d8acab890b472fdfd764"
},
"downloads": -1,
"filename": "publicsuffixlist-1.0.2.20250225.tar.gz",
"has_sig": false,
"md5_digest": "a82939a47521500ef8e980b5b5e08ab0",
"packagetype": "sdist",
"python_version": "source",
"requires_python": ">=3.5",
"size": 104803,
"upload_time": "2025-02-25T03:32:22",
"upload_time_iso_8601": "2025-02-25T03:32:22.766223Z",
"url": "https://files.pythonhosted.org/packages/43/15/0bb327fbbee1e76077ed47abe4f22a96c3a101fccea658b5dfd2ed802730/publicsuffixlist-1.0.2.20250225.tar.gz",
"yanked": false,
"yanked_reason": null
}
],
"upload_time": "2025-02-25 03:32:22",
"github": true,
"gitlab": false,
"bitbucket": false,
"codeberg": false,
"github_user": "ko-zu",
"github_project": "psl",
"travis_ci": false,
"coveralls": false,
"github_actions": true,
"lcname": "publicsuffixlist"
}