registry_downloader


- Name: registry_downloader
- Version: 0.1.5
- Summary: Download & analyze business registry data
- Upload time: 2024-11-11 22:50:36
- Requires Python: >=3.12
- Keywords: business data, downloader, registry

# Business Registry Download

This is a tool to download business registry data from the Estonian, Finnish, Latvian, Lithuanian and Czech business registers. The registers usually update these files daily, and they contain details on companies, their officers, and related records.

Happy to take PRs for other countries!

It's easy to load these files with [dlt](https://dlthub.com/) or [duckdb](https://duckdb.org/), transform them with [dbt](https://www.getdbt.com/), and integrate them into your data pipelines.

The downloads all run in parallel and asynchronously, so it's pretty fast.
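
To illustrate what running downloads in parallel with async code can look like, here is a minimal sketch built on `asyncio.gather`. It is not the package's actual implementation: `fetch_registry` and `download_all` are hypothetical names, the "download" is simulated with a short sleep instead of network I/O, and the country codes other than `ee` are assumed for illustration.

```python
import asyncio
import time


async def fetch_registry(country: str) -> str:
    """Hypothetical stand-in for one download; sleeps instead of using the network."""
    await asyncio.sleep(0.1)
    return f"{country}: done"


async def download_all(countries: list[str]) -> list[str]:
    # gather() runs the coroutines concurrently and returns results in input order.
    return await asyncio.gather(*(fetch_registry(c) for c in countries))


if __name__ == "__main__":
    start = time.perf_counter()
    # Codes other than "ee" are assumptions made for this example.
    results = asyncio.run(download_all(["ee", "fi", "lv"]))
    print(results, f"{time.perf_counter() - start:.2f}s")
```

Because the simulated downloads overlap, the total time should stay close to the duration of a single one rather than the sum of all three.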

## To use from command line

Make sure you have uv installed:
```sh
curl -LsSf https://astral.sh/uv/install.sh | sh
```

_Run with default settings:_
```sh
uvx registry_downloader
```

_Or override the download directory, the countries, and the source URL:_
```sh
uvx registry_downloader --download-dir "./downloads" --countries ee --override-url ee=https://avaandmed.ariregister.rik.ee/et/avaandmete-allalaadimine
```
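
The `--override-url` value above uses a `COUNTRY=URL` form. As a rough sketch of how such arguments could be parsed with `argparse`, consider the following. It is not the package's own parser: `parse_overrides` and `build_parser` are hypothetical names, and the defaults and `nargs` settings are assumptions. Only the three flag names come from the command above.

```python
import argparse


def parse_overrides(pairs: list[str]) -> dict[str, str]:
    """Turn ["ee=https://..."] into {"ee": "https://..."}."""
    overrides: dict[str, str] = {}
    for pair in pairs:
        # Split on the first "=" only, since URLs may contain "=" themselves.
        country, sep, url = pair.partition("=")
        if not sep or not country or not url:
            raise ValueError(f"expected COUNTRY=URL, got {pair!r}")
        overrides[country] = url
    return overrides


def build_parser() -> argparse.ArgumentParser:
    # Flag names mirror the command above; defaults and nargs are assumptions.
    parser = argparse.ArgumentParser(prog="registry_downloader")
    parser.add_argument("--download-dir", default="./downloads")
    parser.add_argument("--countries", nargs="*", default=[])
    parser.add_argument("--override-url", nargs="*", default=[])
    return parser


if __name__ == "__main__":
    args = build_parser().parse_args(
        ["--countries", "ee", "--override-url", "ee=https://example.org/data?x=1"]
    )
    print(args.download_dir, args.countries, parse_overrides(args.override_url))
```

Splitting on the first `=` only keeps URLs with query strings intact.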

## To use as a library

Install the `registry_downloader` package with either `pip` or `uv`:

**Using pip:**
```sh
pip install registry_downloader
```

**Using uv:**
```sh
uv add registry_downloader
```

_Run with default settings:_
```python
import asyncio
from registry_downloader import run_downloader

async def main() -> None:
    # No arguments: run with the package defaults
    await run_downloader()

if __name__ == "__main__":
    asyncio.run(main())
```

_Or override the download directory, the countries, and the source URL:_
```python
import asyncio
from registry_downloader import run_downloader

async def main() -> None:
    await run_downloader(
        download_dir="./downloads",
        countries=["ee"],
        # Each override is a COUNTRY=URL string, as on the command line
        override_url=["ee=https://avaandmed.ariregister.rik.ee/et/avaandmete-allalaadimine"]
    )

if __name__ == "__main__":
    asyncio.run(main())
```
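
After a run, it can be handy to check what landed in the download directory before loading it into a pipeline. The helper below is a small sketch for that. `summarize_downloads` is a hypothetical name and not part of `registry_downloader`, and the sketch assumes only that files are written somewhere under the `download_dir` you passed in.

```python
from pathlib import Path


def summarize_downloads(download_dir: str) -> list[tuple[str, int]]:
    """Return (relative path, size in bytes) for every file under download_dir."""
    root = Path(download_dir)
    if not root.is_dir():
        # Nothing downloaded yet, or the directory does not exist.
        return []
    return sorted(
        (path.relative_to(root).as_posix(), path.stat().st_size)
        for path in root.rglob("*")
        if path.is_file()
    )


if __name__ == "__main__":
    for name, size in summarize_downloads("./downloads"):
        print(f"{size:>12,d}  {name}")
```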

## To develop

1. Install uv
```sh
curl -LsSf https://astral.sh/uv/install.sh | sh
```

2. Create a virtual environment and activate it
```sh
uv venv && source .venv/bin/activate
```

3. Install dependencies and ensure the virtual environment is in sync
```sh
uv sync
```

4. Build the project or run it locally with defaults
```sh
uv build
```
_or_
```sh
uv run src/registry_downloader
```
            

Raw data

            {
    "_id": null,
    "home_page": null,
    "name": "registry_downloader",
    "maintainer": null,
    "docs_url": null,
    "requires_python": ">=3.12",
    "maintainer_email": null,
    "keywords": "business data, downloader, registry",
    "author": null,
    "author_email": "Martin Salo <martin@salo.ee>",
    "download_url": "https://files.pythonhosted.org/packages/9a/41/754115750789a9c4b1d8a3c2014fc298851f7ac242f6983f22dabb537665/registry_downloader-0.1.5.tar.gz",
    "platform": null,
    "bugtrack_url": null,
    "license": null,
    "summary": "Download & analyze business registry data",
    "version": "0.1.5",
    "project_urls": {
        "Bug Tracker": "https://github.com/salomartin/registry_downloader/issues",
        "Homepage": "https://github.com/salomartin/registry_downloader"
    },
    "split_keywords": [
        "business data",
        " downloader",
        " registry"
    ],
    "urls": [
        {
            "comment_text": null,
            "digests": {
                "blake2b_256": "91cdf12b66fdf85f3d079c17cb77a94ad744abbdfeb9cfdbe8a4cf089506a2c7",
                "md5": "17498baf0421de032fd20ce38b99cc31",
                "sha256": "398ee6692c351c80f3050399af5f4dcaf639011525139409713cc963897d51a4"
            },
            "downloads": -1,
            "filename": "registry_downloader-0.1.5-py3-none-any.whl",
            "has_sig": false,
            "md5_digest": "17498baf0421de032fd20ce38b99cc31",
            "packagetype": "bdist_wheel",
            "python_version": "py3",
            "requires_python": ">=3.12",
            "size": 15305,
            "upload_time": "2024-11-11T22:50:34",
            "upload_time_iso_8601": "2024-11-11T22:50:34.791774Z",
            "url": "https://files.pythonhosted.org/packages/91/cd/f12b66fdf85f3d079c17cb77a94ad744abbdfeb9cfdbe8a4cf089506a2c7/registry_downloader-0.1.5-py3-none-any.whl",
            "yanked": false,
            "yanked_reason": null
        },
        {
            "comment_text": null,
            "digests": {
                "blake2b_256": "9a41754115750789a9c4b1d8a3c2014fc298851f7ac242f6983f22dabb537665",
                "md5": "382a7dc67dff5afa7ef53cfc80b981a6",
                "sha256": "7fd16c9734f490f4bf86787f81c75efdf1fd634637a86f6028bcc55ecbf1ca14"
            },
            "downloads": -1,
            "filename": "registry_downloader-0.1.5.tar.gz",
            "has_sig": false,
            "md5_digest": "382a7dc67dff5afa7ef53cfc80b981a6",
            "packagetype": "sdist",
            "python_version": "source",
            "requires_python": ">=3.12",
            "size": 22679,
            "upload_time": "2024-11-11T22:50:36",
            "upload_time_iso_8601": "2024-11-11T22:50:36.298069Z",
            "url": "https://files.pythonhosted.org/packages/9a/41/754115750789a9c4b1d8a3c2014fc298851f7ac242f6983f22dabb537665/registry_downloader-0.1.5.tar.gz",
            "yanked": false,
            "yanked_reason": null
        }
    ],
    "upload_time": "2024-11-11 22:50:36",
    "github": true,
    "gitlab": false,
    "bitbucket": false,
    "codeberg": false,
    "github_user": "salomartin",
    "github_project": "registry_downloader",
    "travis_ci": false,
    "coveralls": false,
    "github_actions": true,
    "lcname": "registry_downloader"
}
        