psl-dns


Name: psl-dns
Version: 1.1.1
Home page: https://github.com/sse-secure-systems/psl-dns
Summary: Query the Public Suffix List (PSL) via DNS and check the PSL status of a domain.
Upload time: 2024-08-20 18:15:45
Author: Peter Thomassen
Requirements: No requirements were recorded.

# DNS-based Public Suffix List handling for Python

This Python package provides a `PSL` class for [querying the Public
Suffix List (PSL)](https://publicsuffix.zone/) via the DNS. With the
library, you can look up the public suffix status of a domain as well
as the PSL rules governing it. A corresponding command-line tool,
`psl-dns_query`, makes the library convenient to use from the shell.

Public suffix information is based on DNS lookups only; no rule
matching is performed at lookup time. To make this possible, the PSL
rules have been encoded in the DNS itself (currently under the
DNSSEC-enabled zone `query.publicsuffix.zone`). This facilitates easy
querying without the need to keep the PSL at hand. The PSL zone is
maintained by [SSE](https://securesystems.de/) and usually updated once
a day.
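
As a rough illustration of this encoding, the sketch below builds the DNS name that a lookup for a given domain would target. The helper name `psl_query_name` is hypothetical and not part of the package; the name layout (domain prepended to the zone) is inferred from the log lines in the `psl-dns_check` example further down, and the actual lookups are best left to the `PSL` class.

```python
def psl_query_name(domain, zone="query.publicsuffix.zone"):
    """Build the fully qualified DNS name under the PSL zone for `domain`.

    Hypothetical helper for illustration only; not part of psl-dns.
    """
    # Drop a trailing dot, then convert Unicode labels to Punycode using
    # Python's built-in IDNA codec.
    ascii_domain = domain.rstrip(".").encode("idna").decode("ascii")
    return f"{ascii_domain}.{zone}."


print(psl_query_name("zone.id"))
```

A `PTR` query for the resulting name is what appears in the verbose checker output as `Querying for zone.id.query.publicsuffix.zone. PTR`.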

The `Parser` class (along with the `psl-dns_parse` command) is used to
iterate over a [PSL file](https://publicsuffix.org/list/public_suffix_list.dat)
and convert the ruleset into a list of DNS Resource Record sets for
submission to the DNS operator. The tool adds an extra `TXT` record at
the root of the PSL zone, containing the parsing timestamp and the
SHA-256 hash of the PSL file, so that consumers can check how current
the zone is.
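
The content of that `TXT` record can be sketched as follows. This is not the `Parser` implementation: the function name `build_root_txt` is hypothetical, and it assumes that the timestamp is in Unix seconds and that the hash is taken over the raw bytes of the PSL file, which is what the sample output in the `psl-dns_parse` section suggests.

```python
import hashlib
import time


def build_root_txt(psl_bytes, timestamp=None):
    """Return TXT content of the form '"<timestamp> <sha256 hex digest>"'.

    Hypothetical helper; the exact format used by psl-dns is inferred
    from the sample output shown in this README.
    """
    if timestamp is None:
        # Assumption: the parsing timestamp is in Unix seconds.
        timestamp = int(time.time())
    digest = hashlib.sha256(psl_bytes).hexdigest()
    return f'"{timestamp} {digest}"'


print(build_root_txt(b"example PSL content", timestamp=1555895008))
```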

The package also contains the `psl-dns_check` command (based on the
`Checker` class) to iterate over a PSL file and query the DNS for each
rule encountered, to verify whether the PSL zone contents are in
agreement with the file. (Note that DNS caching may delay updates:
after a zone update, you may receive outdated information until the
TTL of the PSL DNS records expires. To rule this out, specify one of
the PSL zone's authoritative servers as the `resolver` argument.)
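
Iterating over a PSL file mostly means skipping comments and blank lines. The sketch below is not the `Checker` or `Parser` code; it relies only on the published PSL file format, in which comment lines start with `//` and each rule is read up to the first whitespace. The name `iter_psl_rules` is made up for this example.

```python
def iter_psl_rules(lines):
    """Yield the rules contained in an iterable of PSL file lines.

    Hypothetical helper for illustration; not part of psl-dns.
    """
    for line in lines:
        line = line.strip()
        # Skip blank lines and comment lines.
        if not line or line.startswith("//"):
            continue
        # A rule extends only up to the first whitespace.
        yield line.split()[0]


sample = ["// ck : example section", "", "*.ck", "!www.ck", "com"]
print(list(iter_psl_rules(sample)))
```

With a real file, the same function can be fed an open file object, for example `iter_psl_rules(open(path, encoding="utf-8"))`.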

**Note:** DNS resolvers learn about the domains that get queried, so
depending on the use case, using this service may not be up to your
privacy standards. It is possible, though, to set up a private copy of
the query zone and configure a local resolver to avoid query leaks.

## Usage

### Python
The following examples show how to query the PSL via DNS using the
`PSL` class. For advanced use, please refer to the source.

Example use cases for the `Parser` and `Checker` classes can be found
in the scripts under `psl/commands/`.

#### Initialize
```python
>>> from psl_dns import PSL
>>> psl = PSL()
```

If your system resolver does not support `PTR` records, you can set
another resolver during initialization: `PSL(resolver='...')`

#### Query public suffix status of a domain (for the rules, see below)
```python
>>> psl.is_public_suffix('com')
True
>>> psl.is_public_suffix('checkip.dedyn.io')
False
>>> psl.is_public_suffix('takatsu.kawasaki.jp')
True
>>> psl.is_public_suffix('www.ikuoufukushi.takatsu.kawasaki.jp')
False
>>> psl.is_public_suffix('city.kawasaki.jp')
False
>>> psl.is_public_suffix('www.library.city.kawasaki.jp')
False
```

#### Get the public suffix for a domain
```python
>>> psl.get_public_suffix('com')
'com'
>>> psl.get_public_suffix('checkip.dedyn.io')
'dedyn.io'
```

The following examples are based on PSL wildcard rules. Wildcard labels
are expanded into the respective labels of the domain of interest:

```python
>>> psl.get_public_suffix('takatsu.kawasaki.jp')  # Wildcard *.kawasaki.jp
'takatsu.kawasaki.jp'
>>> psl.get_public_suffix('www.ikuoufukushi.takatsu.kawasaki.jp')  # same
'takatsu.kawasaki.jp'
>>> psl.get_public_suffix('city.kawasaki.jp')  # Wildcard exception
'jp'
>>> psl.get_public_suffix('www.library.city.kawasaki.jp')  # same
'jp'
```

If the queried domain has a trailing dot, the dot is preserved in the
response. Furthermore, IDNA mode is preserved so that Unicode queries
return Unicode responses, and Punycode queries return Punycode responses:

```python
>>> psl.get_public_suffix('www.xn--55qx5d.cn')
'xn--55qx5d.cn'
>>> psl.get_public_suffix('www.公司.cn.')
'公司.cn.'
```
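
One way to picture this behavior is a small post-processing step that renders a suffix in the same form as the query. The following is only a sketch of that idea using Python's built-in IDNA codec; `match_query_form` is a hypothetical name, and the library's internal handling may differ.

```python
def match_query_form(suffix, query):
    """Render a Unicode, dot-less `suffix` in the same form as `query`.

    Hypothetical helper for illustration; not part of psl-dns.
    """
    if query.isascii():
        # ASCII (possibly Punycode) query: answer in Punycode.
        suffix = suffix.encode("idna").decode("ascii")
    if query.endswith("."):
        # Preserve the trailing dot of the query.
        suffix += "."
    return suffix


print(match_query_form("公司.cn", "www.xn--55qx5d.cn"))
print(match_query_form("公司.cn", "www.公司.cn."))
```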

#### Get the set of rules applicable for a domain
```python
>>> psl.get_rules('com')
{'com'}
>>> psl.get_rules('checkip.dedyn.io')
{'dedyn.io'}
>>> psl.get_rules('takatsu.kawasaki.jp')
{'*.kawasaki.jp'}
>>> psl.get_rules('www.ikuoufukushi.takatsu.kawasaki.jp')
{'*.kawasaki.jp'}
>>> psl.get_rules('city.kawasaki.jp') # Note wildcard exception
{'jp', '!city.kawasaki.jp', '*.kawasaki.jp'}
>>> psl.get_rules('www.library.city.kawasaki.jp') # same
{'jp', '!city.kawasaki.jp', '*.kawasaki.jp'}
```

Rules are always returned in Unicode encoding and without a trailing
dot, consistent with the encoding in the Public Suffix List itself:

```python
>>> psl.get_rules('www.xn--55qx5d.cn.')
{'公司.cn'}
```

### Command line

#### psl-dns_query
This is a command-line interface to the `PSL` class demonstrated in the
previous section.

```sh
$ psl-dns_query -h
usage: psl-dns_query [-h] [--zone ZONE] [--resolver RESOLVER]
                     [--timeout TIMEOUT] [-l] [-c] [-v]
                     domain

Query the PSL via DNS and check the PSL status of a domain.

Returns the the word "public" or "private", followed by the public
suffix that covers the queried domain. IDNA mode and trailing dots
(if given) are preserved.

Optionally, the set of applicable rules and the PSL checksum can be
displayed.

Exit codes: 0 (public) or 1 (private).

positional arguments:
  domain               Domain to query

optional arguments:
  -h, --help           show this help message and exit
  --zone ZONE          PSL zone to use (default: query.publicsuffix.zone)
  --resolver RESOLVER  DNS resolver to use instead of system resolver
                       (default: None)
  --timeout TIMEOUT    DNS query timeout (seconds) (default: 5)
  -l                   Show set of applicable rules (default: False)
  -c                   Show PSL checksum (default: False)
  -v, --verbose        Increase output verbosity (default: 0)
```

##### Retrieve status and public suffix
```sh
# Plain
$ psl-dns_query com
public com

# Same, followed by the set of relevant rules, in no particular order
$ psl-dns_query www.ck -l
private *
*.ck
!www.ck
*
```
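
When calling the tool from a script, its output can be split into status, public suffix, and rules. The parser below is a sketch under the assumption that the output looks exactly like the examples above (first line `<status> <suffix>`, then one rule per line when `-l` is given, and no `-c` checksum output); `parse_query_output` is a hypothetical helper, not part of the package.

```python
def parse_query_output(stdout):
    """Split psl-dns_query output into (status, suffix, rules).

    Hypothetical helper; assumes the output format shown in this README
    and that the -c option was not used.
    """
    lines = [line.strip() for line in stdout.splitlines() if line.strip()]
    # First line: the word "public" or "private", then the public suffix.
    status, suffix = lines[0].split(maxsplit=1)
    # Remaining lines (only present with -l): the applicable rules.
    rules = set(lines[1:])
    return status, suffix, rules


print(parse_query_output("public com\n"))
```

The exit code (0 for public, 1 for private) carries the same status information and is usually the simpler thing to check in shell scripts.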

#### psl-dns_parse
```sh
$ psl-dns_parse -h
usage: psl-dns_parse [-h] [--zone ZONE] [--format FORMAT] [-l] [-v] psl_file

Print rules from a Public Suffix List (PSL) file in DNS RRsets format.

positional arguments:
  psl_file         Path to PSL file

optional arguments:
  -h, --help       show this help message and exit
  --zone ZONE      PSL zone to use (default: query.publicsuffix.zone)
  --format FORMAT  Output format to use (default: deSEC)
  -l               List available formats (default: False)
  -v, --verbose    Increase output verbosity (default: 0)
```

##### Convert current PSL file to deSEC RRsets
```sh
# Note: This produces very long output
$ time psl-dns_parse <(curl https://publicsuffix.org/list/public_suffix_list.dat) | jq .
[
  {
    "subname": "ac",
    "ttl": 86400,
    "type": "PTR",
    "records": [
      "ac."
    ]
  },
  ... # shortened for readability
  {
    "subname": "",
    "ttl": 86400,
    "type": "TXT",
    "records": [
      "\"1555895008 d205f587d61c6bbf05bec818776da1dd030ce68f2e8912fea732158b9a33cc54\""
    ]
  }
]

real	0m1.262s
user	0m0.475s
sys	0m0.239s
```
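
The root `TXT` record in this output can be read back with a few lines of Python. This sketch assumes the deSEC-style JSON layout shown above (a list of objects with `subname`, `type`, and `records` keys) and a record content of the form `"<timestamp> <hash>"`; the function name `root_txt_info` is made up for this example.

```python
import json


def root_txt_info(rrsets_json):
    """Return (timestamp, sha256_hex) from the root TXT RRset, or None.

    Hypothetical helper; assumes the JSON layout shown in this README.
    """
    for rrset in json.loads(rrsets_json):
        if rrset["subname"] == "" and rrset["type"] == "TXT":
            # Record content is wrapped in literal double quotes.
            timestamp, digest = rrset["records"][0].strip('"').split()
            return int(timestamp), digest
    return None
```

Comparing the returned hash with the SHA-256 hash of a local PSL file shows whether both refer to the same version of the list.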

#### psl-dns_check
```sh
$ psl-dns_check -h
usage: psl-dns_check [-h] [--resolver RESOLVER] [--timeout TIMEOUT]
                     [--zone ZONE] [-v]
                     psl_file

Check rules from the Public Suffix List (PSL) via DNS and output
inconsistencies.

positional arguments:
  psl_file             Path to PSL file

optional arguments:
  -h, --help           show this help message and exit
  --resolver RESOLVER  DNS resolver to use instead of system resolver
                       (default: None)
  --timeout TIMEOUT    DNS query timeout (in seconds) (default: 5)
  --zone ZONE          PSL zone to use (default: query.publicsuffix.zone)
  -v, --verbose        Increase output verbosity (default: 0)
```

##### Verifying the correctness of the PSL zone
```sh
$ time psl-dns_check -v <(curl https://publicsuffix.org/list/public_suffix_list.dat)
... # shortened for readability
INFO:psl:Querying for zone.id.query.publicsuffix.zone. TXT
INFO:psl:Querying for zone.id.query.publicsuffix.zone. PTR
INFO:psl:Querying for query.publicsuffix.zone. TXT
WARNING:psl:Hash mismatch! Input PSL file appears to differ from remote version.
8684 rules with 3 inconsistencies found

real	13m42.366s
user	0m38.560s
sys	0m8.383s
```

            

Raw data

            {
    "_id": null,
    "home_page": "https://github.com/sse-secure-systems/psl-dns",
    "name": "psl-dns",
    "maintainer": null,
    "docs_url": null,
    "requires_python": null,
    "maintainer_email": null,
    "keywords": null,
    "author": "Peter Thomassen",
    "author_email": "peter.thomassen@securesystems.de",
    "download_url": "https://files.pythonhosted.org/packages/c8/f9/f42662a9daee3568e74f31608ef7566beb3cfb1b066625871cc462ca1877/psl_dns-1.1.1.tar.gz",
    "platform": null,
    "description": "# DNS-based Public Suffix List handling for Python\n\nThis Python package provides a `PSL` class for [querying the Public\nSuffix List (PSL)](https://publicsuffix.zone/) via the DNS. By utilizing\nthe library, one can retrieve information about the public suffix\nstatus of a domain as well as the PSL rules governing it. There is also\na corresponding command-line tool, `psl-dns_query`, enabling convenient\nuse of the library from the shell.\n\nPublic suffix information is based on DNS lookups only; no rule\nmatching is performed at lookup time. To make this possible, the PSL\nrules have been encoded in the DNS itself (currently under the\nDNSSEC-enabled zone `query.publicsuffix.zone`). This facilitates easy\nquerying without the need to keep the PSL at hand. The PSL zone is\nmaintained by [SSE](https://securesystems.de/) and usually updated once\na day.\n\nThe `Parser` class (along with the `psl-dns_parse` command) is used to\niterate over a [PSL file](https://publicsuffix.org/list/public_suffix_list.dat)\nand convert the ruleset into a list of DNS Resource Record sets for\nsubmission to the DNS operator. The tool adds an extra `TXT` record at\nthe root of the PSL zone, containing the parsing timestamp as well as\nthe PSL file SHA-256 hash for currentness checking.\n\nThe package also contains the `psl-dns_check` command (based on the\n`Checker` class) to iterate over a PSL file and query the DNS for each\nrule encountered, to verify whether the PSL zone contents are in\nagreement with the file. (Note that DNS caching may cause update\ndelays; after a zone update, you may be receiving outdated information\nuntil the TTL of the PSL DNS records expires. To make sure, specify one\nof the PSL zone's authoritative servers as the `resolver` argument.)\n\n**Note:** DNS resolvers learn about the domains that get queried, so\ndepending on the use case, using this service may not be up to your\nprivacy standards. 
It is possible though to set up a private copy of\nthe query zone and configure a local resolver to avoid query leaks.\n\n## Usage\n\n### Python\nThe following examples show how to query the PSL via DNS using the\n`PSL` class. For advanced use, please refer to the source.\n\nExample use cases for the `Parser` and `Checker` classes can be found\nin the scripts under `psl/commands/`.\n\n#### Initialize\n```python\n>>> from psl_dns import PSL\n>>> psl = PSL()\n```\n\nIf your system resolver does not support `PTR` records, you can set\nanother resolver during initialization: `PSL(resolver='...')`\n\n#### Query public suffix status of a domain (for the rules, see below)\n```python\n>>> psl.is_public_suffix('com')\nTrue\n>>> psl.is_public_suffix('checkip.dedyn.io')\nFalse\n>>> psl.is_public_suffix('takatsu.kawasaki.jp')\nTrue\n>>> psl.is_public_suffix('www.ikuoufukushi.takatsu.kawasaki.jp')\nFalse\n>>> psl.is_public_suffix('city.kawasaki.jp')\nFalse\n>>> psl.is_public_suffix('www.library.city.kawasaki.jp')\nFalse\n```\n\n#### Get the public suffix for a domain\n```python\n>>> psl.get_public_suffix('com')\n'com'\n>>> psl.get_public_suffix('checkip.dedyn.io')\n'dedyn.io'\n```\n\nThe following examples are based on PSL wildcard rules. Wildcard labels\nare expanded into the respective labels of the domain of interest:\n\n```python\n>>> psl.get_public_suffix('takatsu.kawasaki.jp')  # Wildcard *.kawasaki.jp\n'takatsu.kawasaki.jp'\n>>> psl.get_public_suffix('www.ikuoufukushi.takatsu.kawasaki.jp')  # same\n'takatsu.kawasaki.jp'\n>>> psl.get_public_suffix('city.kawasaki.jp')  # Wildcard exception\n'jp'\n>>> psl.get_public_suffix('www.library.city.kawasaki.jp')  # same\n'jp'\n```\n\nIf the queried domain has a trailing dot, the dot is preserved in the\nresponse. 
Furthermore, IDDA mode is preserved so that Unicode queries\nreturn Unicode responses, and Punycode queries return Punycode responses:\n\n```python\n>>> psl.get_public_suffix('www.xn--55qx5d.cn')\n'xn--55qx5d.cn'\n>>> psl.get_public_suffix('www.\u516c\u53f8.cn.')\n'\u516c\u53f8.cn.'\n```\n\n#### Get the set of rules applicable for a domain\n```python\n>>> psl.get_rules('com')\n{'com'}\n>>> psl.get_rules('checkip.dedyn.io')\n{'dedyn.io'}\n>>> psl.get_rules('takatsu.kawasaki.jp')\n{'*.kawasaki.jp'}\n>>> psl.get_rules('www.ikuoufukushi.takatsu.kawasaki.jp')\n{'*.kawasaki.jp'}\n>>> psl.get_rules('city.kawasaki.jp') # Note wildcard exception\n{'jp', '!city.kawasaki.jp', '*.kawasaki.jp'}\n>>> psl.get_rules('www.library.city.kawasaki.jp') # same\n{'jp', '!city.kawasaki.jp', '*.kawasaki.jp'}\n```\n\nRules are always returned in Unicode encoding and without a trailing\ndot, consistent with the encoding in the Public Suffix List itself:\n\n```python\n>>> psl.get_rules('www.xn--55qx5d.cn.')\n{'\u516c\u53f8.cn'}\n```\n\n### Command line\n\n#### psl-dns_query\nThis is a command-line interface to the `PSL` class demonstrated in the\nprevious section.\n\n```sh\n$ psl-dns_query -h\nusage: psl-dns_query [-h] [--zone ZONE] [--resolver RESOLVER]\n                     [--timeout TIMEOUT] [-l] [-c] [-v]\n                     domain\n\nQuery the PSL via DNS and check the PSL status of a domain.\n\nReturns the the word \"public\" or \"private\", followed by the public\nsuffix that covers the queried domain. 
IDNA mode and trailing dots\n(if given) are preserved.\n\nOptionally, the set of applicable rules and the PSL checksum can be\ndisplayed.\n\nExit codes: 0 (public) or 1 (private).\n\npositional arguments:\n  domain               Domain to query\n\noptional arguments:\n  -h, --help           show this help message and exit\n  --zone ZONE          PSL zone to use (default: query.publicsuffix.zone)\n  --resolver RESOLVER  DNS resolver to use instead of system resolver\n                       (default: None)\n  --timeout TIMEOUT    DNS query timeout (seconds) (default: 5)\n  -l                   Show set of applicable rules (default: False)\n  -c                   Show PSL checksum (default: False)\n  -v, --verbose        Increase output verbosity (default: 0)\n```\n\n##### Retrieve status and public suffix\n```sh\n# Plain\n$ psl-dns_query com\npublic com\n\n# Same, followed by the set of relevant rules, in no particular order\n$ psl-dns_query www.ck -l\nprivate *\n*.ck\n!www.ck\n*\n```\n\n#### psl-dns_parse\n```sh\n$ psl-dns_parse -h\nusage: psl-dns_parse [-h] [--zone ZONE] [--format FORMAT] [-l] [-v] psl_file\n\nPrint rules from a Public Suffix List (PSL) file in DNS RRsets format.\n\npositional arguments:\n  psl_file         Path to PSL file\n\noptional arguments:\n  -h, --help       show this help message and exit\n  --zone ZONE      PSL zone to use (default: query.publicsuffix.zone)\n  --format FORMAT  Output format to use (default: deSEC)\n  -l               List available formats (default: False)\n  -v, --verbose    Increase output verbosity (default: 0)\n```\n\n##### Convert current PSL file to deSEC RRsets\n```sh\n# Note: This produces very long output\n$ time psl-dns_parse <(curl https://publicsuffix.org/list/public_suffix_list.dat) | jq .\n[\n  {\n    \"subname\": \"ac\",\n    \"ttl\": 86400,\n    \"type\": \"PTR\",\n    \"records\": [\n      \"ac.\"\n    ]\n  },\n  ... 
# shortened for readability\n  {\n    \"subname\": \"\",\n    \"ttl\": 86400,\n    \"type\": \"TXT\",\n    \"records\": [\n      \"\\\"1555895008 d205f587d61c6bbf05bec818776da1dd030ce68f2e8912fea732158b9a33cc54\\\"\"\n    ]\n  }\n]\n\nreal\t0m1.262s\nuser\t0m0.475s\nsys\t0m0.239s\n```\n\n#### psl-dns_check\n```sh\n$ psl-dns_check -h\nusage: psl-dns_check [-h] [--resolver RESOLVER] [--timeout TIMEOUT]\n                     [--zone ZONE] [-v]\n                     psl_file\n\nCheck rules from the Public Suffix List (PSL) via DNS and output\ninconsistencies.\n\npositional arguments:\n  psl_file             Path to PSL file\n\noptional arguments:\n  -h, --help           show this help message and exit\n  --resolver RESOLVER  DNS resolver to use instead of system resolver\n                       (default: None)\n  --timeout TIMEOUT    DNS query timeout (in seconds) (default: 5)\n  --zone ZONE          PSL zone to use (default: query.publicsuffix.zone)\n  -v, --verbose        Increase output verbosity (default: 0)\n```\n\n##### Verifying the correctness of the PSL zone\n```sh\n$ time psl-dns_check -v <(curl https://publicsuffix.org/list/public_suffix_list.dat)\n... # shortened for readability\nINFO:psl:Querying for zone.id.query.publicsuffix.zone. TXT\nINFO:psl:Querying for zone.id.query.publicsuffix.zone. PTR\nINFO:psl:Querying for query.publicsuffix.zone. TXT\nWARNING:psl:Hash mismatch! Input PSL file appears to differ from remote version.\n8684 rules with 3 inconsistencies found\n\nreal\t13m42.366s\nuser\t0m38.560s\nsys\t0m8.383s\n```\n",
    "bugtrack_url": null,
    "license": null,
    "summary": "Query the Public Suffix List (PSL) via DNS and check the PSL status of a domain.",
    "version": "1.1.1",
    "project_urls": {
        "Homepage": "https://github.com/sse-secure-systems/psl-dns"
    },
    "split_keywords": [],
    "urls": [
        {
            "comment_text": "",
            "digests": {
                "blake2b_256": "82d248bc3a13f4ef7c3249ab3a17385fe512cc767b7c3ea50d79ad84fa5cf1ba",
                "md5": "ee5fda12bffd8e78aab3b00739618127",
                "sha256": "1163a803919c5814d4bae6698241e6000d5364f1e695d6e2cb9c2fabd24371db"
            },
            "downloads": -1,
            "filename": "psl_dns-1.1.1-py3-none-any.whl",
            "has_sig": false,
            "md5_digest": "ee5fda12bffd8e78aab3b00739618127",
            "packagetype": "bdist_wheel",
            "python_version": "py3",
            "requires_python": null,
            "size": 16664,
            "upload_time": "2024-08-20T18:15:44",
            "upload_time_iso_8601": "2024-08-20T18:15:44.322844Z",
            "url": "https://files.pythonhosted.org/packages/82/d2/48bc3a13f4ef7c3249ab3a17385fe512cc767b7c3ea50d79ad84fa5cf1ba/psl_dns-1.1.1-py3-none-any.whl",
            "yanked": false,
            "yanked_reason": null
        },
        {
            "comment_text": "",
            "digests": {
                "blake2b_256": "c8f9f42662a9daee3568e74f31608ef7566beb3cfb1b066625871cc462ca1877",
                "md5": "bdefd973407ee1cb1577376e0c5817f6",
                "sha256": "6962d32c44fc4098b1d3bffc7cf781a29812a3e675e0ddf98f2985eef1466981"
            },
            "downloads": -1,
            "filename": "psl_dns-1.1.1.tar.gz",
            "has_sig": false,
            "md5_digest": "bdefd973407ee1cb1577376e0c5817f6",
            "packagetype": "sdist",
            "python_version": "source",
            "requires_python": null,
            "size": 15667,
            "upload_time": "2024-08-20T18:15:45",
            "upload_time_iso_8601": "2024-08-20T18:15:45.667789Z",
            "url": "https://files.pythonhosted.org/packages/c8/f9/f42662a9daee3568e74f31608ef7566beb3cfb1b066625871cc462ca1877/psl_dns-1.1.1.tar.gz",
            "yanked": false,
            "yanked_reason": null
        }
    ],
    "upload_time": "2024-08-20 18:15:45",
    "github": true,
    "gitlab": false,
    "bitbucket": false,
    "codeberg": false,
    "github_user": "sse-secure-systems",
    "github_project": "psl-dns",
    "travis_ci": false,
    "coveralls": false,
    "github_actions": false,
    "lcname": "psl-dns"
}
        