hibp-downloader


 - Name: hibp-downloader
 - Version: 0.3.1
 - Summary: Efficiently download HIBP new pwned password data by hash-prefix for a local-copy
 - Upload time: 2024-02-09 01:48:57
 - Author: Nicholas de Jong
 - Requires Python: >=3.8,<4.0
 - License: BSD-3-Clause
 - Keywords: hibp-downloader, hibp, haveibeenpwned, haveibeenpwned-downloader, sha1, ntlm

# hibp-downloader

[![pypi](https://img.shields.io/pypi/v/hibp-downloader.svg)](https://pypi.python.org/pypi/hibp-downloader/)
[![python](https://img.shields.io/pypi/pyversions/hibp-downloader.svg)](https://github.com/threatpatrols/hibp-downloader/)
[![build tests](https://github.com/threatpatrols/hibp-downloader/actions/workflows/build-tests.yml/badge.svg)](https://github.com/threatpatrols/hibp-downloader/actions/workflows/build-tests.yml)
[![docs](https://img.shields.io/readthedocs/hibp-downloader)](https://hibp-downloader.readthedocs.io)
[![license](https://img.shields.io/github/license/threatpatrols/hibp-downloader.svg)](https://github.com/threatpatrols/hibp-downloader)

This is a CLI tool to efficiently download a local copy of the pwned password hash data from the very awesome
[HIBP](https://haveibeenpwned.com/Passwords) pwned passwords [api-endpoint](https://api.pwnedpasswords.com) using all the good bits:
multiprocessing, async-processes, local-caching, content-etags and http2-connection pooling, to make things as fast
as is Pythonly possible.

## Features

 - Easily resume interrupted `download` operations into a `--data-path` without re-clobbering the api-source.
 - Only download hash-prefix content blocks when the source content has changed (via content ETag values), which makes
   it easy to periodically re-sync when needed.
 - Ability to directly `query` for compromised password values from the data in-place; efficient enough to back a
   service under reasonable load.
 - Ability to generate a single text file with in-order pwned password hash values, similar to [PwnedPasswordsDownloader](https://github.com/HaveIBeenPwned/PwnedPasswordsDownloader) from the HIBP team.
 - Per-prefix file metadata in JSON format for easy data reuse by other tooling if required.
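
The hash-prefix and ETag ideas in the list above can be sketched in a few lines. This is a minimal illustration, not the tool's internal code. It assumes the public Pwned Passwords range convention, where the first five hex characters of the uppercase SHA-1 digest select a prefix block and each line of a block is `SUFFIX:COUNT`. The helper names `split_sha1`, `parse_range_body` and `conditional_headers` are hypothetical.

```python
import hashlib
from typing import Dict, Optional, Tuple


def split_sha1(password: str) -> Tuple[str, str]:
    """Return (prefix, suffix) of the uppercase SHA-1 hex digest."""
    digest = hashlib.sha1(password.encode("utf-8")).hexdigest().upper()
    # 5 characters select the prefix block, the other 35 identify the hash.
    return digest[:5], digest[5:]


def parse_range_body(body: str) -> Dict[str, int]:
    """Parse 'SUFFIX:COUNT' lines from one prefix block into a dict."""
    counts: Dict[str, int] = {}
    for line in body.splitlines():
        line = line.strip()
        if not line:
            continue
        suffix, _, count = line.partition(":")
        counts[suffix.upper()] = int(count)
    return counts


def conditional_headers(cached_etag: Optional[str]) -> Dict[str, str]:
    """Build request headers that let the server skip unchanged content."""
    # If-None-Match is the standard HTTP header for ETag revalidation; a
    # 304 Not Modified response means the cached block is still current.
    return {"If-None-Match": cached_etag} if cached_etag else {}
```

A caller would look up `suffix` in the parsed block for `prefix`, and would send `conditional_headers(...)` with the ETag stored from the previous download of that block so that unchanged blocks are not transferred again.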

## Install
```commandline
pip install --upgrade hibp-downloader
```

## Usage
![screenshot-help.png](https://raw.githubusercontent.com/threatpatrols/hibp-downloader/main/docs/content/assets/screenshot-help.png)

## Performance
Sample download activity log from a 12-core host on a 45Mbit/s DSL connection.
```text
2023-11-12T21:25:08+1000 | INFO | hibp-downloader | prefix=00ec3 source=[lc:10 et:2 rc:3800 ro:0 xx:0] processed=[62.0MB ~43589H/s] api=[105req/s 60.0MB] runtime=1.2min
2023-11-12T21:25:09+1000 | INFO | hibp-downloader | prefix=00eff source=[lc:10 et:2 rc:3850 ro:0 xx:0] processed=[62.8MB ~43547H/s] api=[105req/s 60.8MB] runtime=1.2min
2023-11-12T21:25:10+1000 | INFO | hibp-downloader | prefix=00f3b source=[lc:10 et:2 rc:3900 ro:0 xx:0] processed=[63.7MB ~43528H/s] api=[105req/s 61.7MB] runtime=1.2min
2023-11-12T21:25:11+1000 | INFO | hibp-downloader | prefix=00f6d source=[lc:10 et:2 rc:3950 ro:0 xx:0] processed=[64.5MB ~43541H/s] api=[105req/s 62.5MB] runtime=1.3min
```

 - 105x requests per second to `api.pwnedpasswords.com`
 - The `source` fields are shorthand counters (values from the last log line):
     - `lc`: 10x prefix files from local-cache
     - `et`: 2x etag-match responses
     - `rc`: 3950x from remote-cache
     - `ro`: 0x from remote-origin
     - `xx`: 0x failed download
 - 62MB downloaded in ~75 seconds
 - ~43k hash values per second
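
The figures above can be read straight from a log line. The sketch below is a hedged illustration that parses the sample format shown in this section. The regular expression and the function names are assumptions derived only from the four sample lines, not from the tool's logging code. The run-time estimate further assumes one request per five-character hex prefix and a constant request rate.

```python
import re
from typing import Dict, Union

# Pattern derived from the sample lines above; other log variants may differ.
LOG_PATTERN = re.compile(
    r"prefix=(?P<prefix>[0-9a-f]{5}) "
    r"source=\[lc:(?P<lc>\d+) et:(?P<et>\d+) rc:(?P<rc>\d+) "
    r"ro:(?P<ro>\d+) xx:(?P<xx>\d+)\] "
    r"processed=\[(?P<processed_mb>[\d.]+)MB ~(?P<hashes_per_s>\d+)H/s\] "
    r"api=\[(?P<req_per_s>\d+)req/s (?P<api_mb>[\d.]+)MB\] "
    r"runtime=(?P<runtime_min>[\d.]+)min"
)

FLOAT_FIELDS = {"processed_mb", "api_mb", "runtime_min"}


def parse_log_line(line: str) -> Dict[str, Union[str, int, float]]:
    """Extract the prefix and the numeric counters from one log line."""
    match = LOG_PATTERN.search(line)
    if match is None:
        raise ValueError("line does not match the sample log format")
    stats: Dict[str, Union[str, int, float]] = {}
    for name, value in match.groupdict().items():
        if name == "prefix":
            stats[name] = value
        elif name in FLOAT_FIELDS:
            stats[name] = float(value)
        else:
            stats[name] = int(value)
    return stats


def estimate_full_run_hours(req_per_s: float) -> float:
    """Hours to request all 16**5 prefixes at a constant request rate."""
    return 16 ** 5 / req_per_s / 3600
```

Under those assumptions, `estimate_full_run_hours(105)` gives a back-of-envelope figure of a little under three hours for a complete first download; real runs will vary with network conditions and cache hits.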

## Project

 - GitHub - [github.com/threatpatrols/hibp-downloader](https://github.com/threatpatrols/hibp-downloader)
 - PyPI - [pypi.org/project/hibp-downloader/](https://pypi.org/project/hibp-downloader/)
 - ReadTheDocs - [hibp-downloader.readthedocs.io](https://hibp-downloader.readthedocs.io)

## Copyright
 - Copyright &copy; 2023 [Threat Patrols Pty Ltd](https://www.threatpatrols.com)
 - Copyright &copy; 2023 [Nicholas de Jong](https://www.nicholasdejong.com)

All rights reserved.

## License
 * BSD-3-Clause - see LICENSE file for details.


            
