blocklist-aggregator


* Name: blocklist-aggregator
* Version: 1.3.1
* Home page: https://github.com/dmachard/blocklist-aggregator
* Summary: Domains blocklist aggregator
* Upload time: 2024-07-06 08:41:07
* Author: Denis MACHARD
* Keywords: blocklist, aggregator, domains, dns, blacklist, whitelist
* Requirements: pyyaml (==6.0.1), requests (==2.32.0), pure-cdb (==4.0.0)

![Testing](https://github.com/dmachard/blocklist-aggregator/workflows/Testing/badge.svg) ![Build](https://github.com/dmachard/blocklist-aggregator/workflows/Build/badge.svg) ![Publish](https://github.com/dmachard/blocklist-aggregator/workflows/Publish/badge.svg)

# Blocklist aggregator

This Python module aggregates several ads/tracking/malware lists and merges them into a single unified list with duplicates removed.
Use it to create your own list from several sources.

See the **[blocklist-domains](https://github.com/dmachard/blocklist-domains)** repository for an implementation.

Default sources are defined in the [configuration file](../main/blocklist_aggregator/blocklist.conf).

## Table of contents

* [Installation](#installation)
* [Get Started](#get-started)
* [Custom Configuration](#custom-configuration)
* [Fetch and save-it to files](#fetch-and-save-it-to-files)

## Installation

![python 3.12.x](https://img.shields.io/badge/python%203.12.x-tested-blue) ![python 3.11.x](https://img.shields.io/badge/python%203.11.x-tested-blue) ![python 3.10.x](https://img.shields.io/badge/python%203.10.x-tested-blue) ![python 3.9.x](https://img.shields.io/badge/python%203.9.x-tested-blue) ![python 3.8.x](https://img.shields.io/badge/python%203.8.x-tested-blue)

If you want to generate your own unified blocklist, 
install this module with the pip command.

```bash
pip install blocklist_aggregator
```

## Get started

This basic example fetches a unified list of domains.
You can then save it to a file or process it however you like.

```python
import blocklist_aggregator

# fetch the configured sources and merge them into one list without duplicates
unified = blocklist_aggregator.fetch()
print(unified)
# [ "doubleclick.net", ..., "telemetry.dropbox.com" ]

print(len(unified))
# 152978
```
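
The output above suggests that `fetch()` returns a plain list of domain strings, so ordinary Python is enough to post-process it. The sketch below illustrates that assumption and is not part of the module: `write_domains` is a hypothetical helper, and a small hard-coded list stands in for the result of `fetch()` so the snippet runs without network access.

```python
from pathlib import Path
import tempfile


def write_domains(domains, path):
    """Write domains to `path`, one per line, sorted and without duplicates.

    Hypothetical helper, not part of blocklist_aggregator.
    Returns the number of domains written.
    """
    # Normalise case and drop empty entries before deduplicating.
    cleaned = sorted({d.strip().lower() for d in domains if d.strip()})
    Path(path).write_text("".join(f"{d}\n" for d in cleaned), encoding="utf-8")
    return len(cleaned)


# Stand-in for `unified = blocklist_aggregator.fetch()`.
sample = ["doubleclick.net", "telemetry.dropbox.com", "DoubleClick.net"]

out_file = Path(tempfile.gettempdir()) / "unified_sample.txt"
count = write_domains(sample, out_file)
print(count, "domains written to", out_file)
```

Lower-casing before deduplication is a design choice of this sketch; whether the module normalises case itself is not stated here.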

## Custom configuration

See the default [configuration file](../main/blocklist_aggregator/blocklist.conf).

The configuration contains:

* the URLs of the ads/tracking/malware lists, each with the pattern (regex) used to extract domains
* the list of domains to exclude (whitelist)
* an additional list of domains to block (blacklist)

The configuration can be overridden at runtime with a YAML string.

```python
import blocklist_aggregator

# override individual settings with a YAML string
cfg_yaml = "verbose: true"
unified = blocklist_aggregator.fetch(cfg_update=cfg_yaml)
```

Alternatively, load the configuration from an external file:

```python
import blocklist_aggregator

unified = blocklist_aggregator.fetch(cfg_filename="/home/custom-blocklist.conf")
```
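
As a rough illustration of how the three configuration items described in this section could interact, here is a minimal, self-contained sketch. It is not the module's own code: the `aggregate` function, the sample list contents, and the two regexes are all hypothetical, and the lists are inline strings rather than downloaded URLs. It also assumes that the whitelist takes precedence over the blacklist, which the module may or may not do.

```python
import re


def aggregate(sources, whitelist=(), blacklist=()):
    """Merge domains from several raw lists into one deduplicated list.

    `sources` is an iterable of (text, pattern) pairs: the raw content of a
    list and the regex whose first group captures a domain on each line.
    All names here are hypothetical; this is not the module's own code.
    """
    domains = set()
    for text, pattern in sources:
        regex = re.compile(pattern, re.MULTILINE)
        domains.update(m.group(1).lower() for m in regex.finditer(text))
    # Additional domains to block.
    domains.update(d.lower() for d in blacklist)
    # Domains to exclude; applied last so the whitelist wins (an assumption).
    domains.difference_update(d.lower() for d in whitelist)
    return sorted(domains)


hosts_style = "0.0.0.0 ads.example.com\n0.0.0.0 tracker.example.net\n# comment\n"
plain_style = "tracker.example.net\nmalware.example.org\n"

unified = aggregate(
    sources=[
        (hosts_style, r"^0\.0\.0\.0\s+(\S+)$"),
        (plain_style, r"^([a-z0-9.-]+\.[a-z]+)$"),
    ],
    whitelist=["ads.example.com"],
    blacklist=["extra.example.io"],
)
print(unified)
```

Using a set keeps the merge free of duplicates regardless of how many sources list the same domain.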

## Fetch and save-it to files

This module can export the list to several formats:

* text
* hosts
* CDB (key/value database)

```python
import blocklist_aggregator

# fetch domains
unified = blocklist_aggregator.fetch()

# save to a text file
blocklist_aggregator.save_raw(filename="/tmp/unified_list.txt")

# save to hosts file
blocklist_aggregator.save_hosts(filename="/tmp/unified_hosts.txt", ip="0.0.0.0")

# save to CDB
blocklist_aggregator.save_cdb(filename="/tmp/unified_domains.cdb")
```
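
For readers unfamiliar with the hosts format, the sketch below shows the conventional layout of one `<ip> <domain>` pair per line. The exact output of `save_hosts` is not shown in this README, so this is an assumption about its shape; `to_hosts_lines` is a hypothetical helper, not a function of the module.

```python
def to_hosts_lines(domains, ip="0.0.0.0"):
    """Return one '<ip> <domain>' line per domain (hypothetical helper)."""
    return [f"{ip} {domain}" for domain in domains]


# Stand-in for the list returned by blocklist_aggregator.fetch().
sample = ["doubleclick.net", "telemetry.dropbox.com"]

lines = to_hosts_lines(sample)
print("\n".join(lines))
```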

## For developers

Run the unit tests:

```bash
python3 -m unittest discover tests/ -v
```

            
