# edi-energy.de scraper
The Python package `edi_energy_scraper` provides easy-to-use methods to mirror the free documents on bdew-mako.de.
### Rationale / Why?
If you'd like to be informed about new regulations or data formats published on bdew-mako.de, you can either
- visit the site every day and hope that you see the changes if this is your favourite hobby,
- or automate the task.
This repository helps you with the latter. It allows you to create an up-to-date copy of the site on your local
computer. Unlike a mirror created with `wget` or `curl`, the result has a clean and intuitive directory
structure.
From there you can, for example, commit the files into a VCS (see our [edi_energy_mirror](https://github.com/Hochfrequenz/edi_energy_mirror)) or scrape the PDF/Word files for later use.
We're all hoping for the day of true digitization on which this repository will become obsolete.
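To actually get informed about changes between two runs, one option is to compare checksums of the mirrored files. The following is a minimal sketch using only the standard library; it is not part of `edi_energy_scraper`, and the names `snapshot`, `diff_snapshots`, `report_changes` and the state file are hypothetical. It assumes the state file lives outside the mirror directory.

```python
import hashlib
import json
from pathlib import Path


def snapshot(mirror_dir: Path) -> dict[str, str]:
    """Map each file path (relative to mirror_dir) to its SHA-256 hex digest."""
    return {
        path.relative_to(mirror_dir).as_posix(): hashlib.sha256(path.read_bytes()).hexdigest()
        for path in sorted(mirror_dir.rglob("*"))
        if path.is_file()
    }


def diff_snapshots(old: dict[str, str], new: dict[str, str]) -> dict[str, list[str]]:
    """Report which files were added, removed or changed between two snapshots."""
    return {
        "added": sorted(new.keys() - old.keys()),
        "removed": sorted(old.keys() - new.keys()),
        "changed": sorted(name for name in old.keys() & new.keys() if old[name] != new[name]),
    }


def report_changes(mirror_dir: Path, state_file: Path) -> dict[str, list[str]]:
    """Compare the mirror with the snapshot stored in state_file, then update state_file."""
    # Assumption: state_file is NOT inside mirror_dir, otherwise it would be hashed as well.
    old = json.loads(state_file.read_text(encoding="utf-8")) if state_file.exists() else {}
    new = snapshot(mirror_dir)
    state_file.write_text(json.dumps(new, indent=2), encoding="utf-8")
    return diff_snapshots(old, new)
```

Calling `report_changes(Path("edi_energy_de"), Path("mirror_state.json"))` after each mirror run would return the files that are new, gone or modified since the previous run. Committing the mirror into a VCS gives you the same information with full history, so this is mainly useful for lightweight notifications.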
### See also
There is a similar project in C# by Fabian Wetzel: [fabsenet/edi-energy-extractor](https://github.com/fabsenet/edi-energy-extractor/).
Unlike this project, it stores the downloaded data in a database instead of a file system.
It also works with `bdew-mako.de`.
## How to use the Package (as a user)
Install via pip:
```bash
pip install edi_energy_scraper
```
Create a directory in which you'd like to save the mirrored data:
```bash
mkdir edi_energy_de
```
Then import it and start the download:
```python
import asyncio

from edi_energy_scraper import EdiEnergyScraper

# add the following lines to enable debug logging to stdout (CLI)
# import logging
# import sys
# logging.basicConfig(stream=sys.stdout, level=logging.DEBUG)


async def mirror():
    scraper = EdiEnergyScraper(path_to_mirror_directory="edi_energy_de")
    await scraper.mirror()


if __name__ == "__main__":
    # asyncio.run creates and closes its own event loop
    asyncio.run(mirror())
```
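Since the download depends on the network and the remote site, a single run may fail. The sketch below wraps the call in a simple retry loop. `run_with_retries` and `mirror_once` are hypothetical helpers, not part of the package, and the sketch assumes that calling `mirror()` again after a failure is safe.

```python
import asyncio
import logging
from collections.abc import Awaitable, Callable

_logger = logging.getLogger(__name__)


async def run_with_retries(
    task: Callable[[], Awaitable[None]], attempts: int = 3, delay_seconds: float = 30.0
) -> None:
    """Await task(); on failure wait delay_seconds and retry, at most `attempts` times in total."""
    if attempts < 1:
        raise ValueError("attempts must be at least 1")
    for attempt in range(1, attempts + 1):
        try:
            await task()
            return
        except Exception:  # broad on purpose: we don't know which errors the scraper raises
            _logger.exception("attempt %d of %d failed", attempt, attempts)
            if attempt == attempts:
                raise  # give up and surface the last error
            await asyncio.sleep(delay_seconds)


async def mirror_once() -> None:
    # Imported here so that the helper above also works without the package installed.
    from edi_energy_scraper import EdiEnergyScraper

    scraper = EdiEnergyScraper(path_to_mirror_directory="edi_energy_de")
    await scraper.mirror()


# In your script:
# asyncio.run(run_with_retries(mirror_once))
```

The retry count and delay are arbitrary defaults; pick values that are polite towards the server.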
This creates a directory structure:
```
-|-your_script_cwd.py
 |-edi_energy_de
    |- FV2310 (contains files valid since 2023-10-01)
        |- ahb.pdf
        |- ahb.docx
        |- ...
    |- FV2404 (contains files valid since 2024-04-03)
        |- mig.pdf
        |- mig.docx
        |- ...
    |- FV2504 (contains files valid since 2025-06-06)
        |- allgemeine_festlegungen.pdf
        |- schema.xsd
        |- ...
```
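Once the mirror exists, the folder layout can be processed with the standard library alone. The following sketch lists the files per format version folder in chronological order. The helper names are hypothetical, and it assumes that folder names follow the pattern `FV` + two-digit year + two-digit month. Note that this nominal month is not necessarily the exact start of validity, as the dates in the tree above show.

```python
import re
from datetime import date
from pathlib import Path

# Assumption: folder names look like "FV2310" (FV + two-digit year + two-digit month).
_FV_PATTERN = re.compile(r"^FV(?P<year>\d{2})(?P<month>\d{2})$")


def nominal_month(folder_name: str) -> date:
    """Return the first day of the month encoded in a folder name like 'FV2310'."""
    match = _FV_PATTERN.match(folder_name)
    if match is None:
        raise ValueError(f"not a format version folder: {folder_name!r}")
    return date(2000 + int(match["year"]), int(match["month"]), 1)


def files_by_format_version(mirror_dir: Path) -> dict[str, list[str]]:
    """Map each FV folder (in chronological order) to the sorted names of its files."""
    folders = [p for p in mirror_dir.iterdir() if p.is_dir() and _FV_PATTERN.match(p.name)]
    folders.sort(key=lambda p: nominal_month(p.name))
    return {
        folder.name: sorted(f.name for f in folder.iterdir() if f.is_file())
        for folder in folders
    }
```

For example, `files_by_format_version(Path("edi_energy_de"))` returns a dictionary whose keys are the folder names and whose values are the file names found in each folder; directories that do not match the assumed pattern are ignored.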
> [!TIP]
> You can extract the information encoded into the filenames:
> ```python
> from edi_energy_scraper import DocumentMetadata
> structured_information = DocumentMetadata.from_filename("AHB_COMDIS_1.0f_99991231_20250605_20250605_8872.pdf")
> # The result is a DocumentMetadata instance. Illustrative, truncated repr (apparently taken from a different document):
> # DocumentMetadata(kind='MIG', edifact_format=<EdifactFormat.REQOTE: 'REQOTE'>, valid_from=datetime.date(2023, 9, 30), valid_unt...traordinary_publication=True, is_error_correction=False, is_informational_reading_version=True, additional_text=None, id=10071)
> ```
## How to use this Repository on Your Machine (for development)
Please follow the instructions in
our [Python Template Repository](https://github.com/Hochfrequenz/python_template_repository#how-to-use-this-repository-on-your-machine).
For further information, see the [Tox Repository](https://github.com/tox-dev/tox).
## Contribute
You are very welcome to contribute to this repository by opening a pull request against the main branch.
Raw data
{
"_id": null,
"home_page": null,
"name": "edi-energy-scraper",
"maintainer": null,
"docs_url": null,
"requires_python": ">=3.10",
"maintainer_email": null,
"keywords": "ahb, automation, bdew, edi@energy",
"author": null,
"author_email": "Hochfrequenz Unternehmensberatung GmbH <info+github@hochfrequenz.de>",
"download_url": "https://files.pythonhosted.org/packages/57/98/d11e295a477324219accd79d4b29b1b7d5948fc5a23bb470f582c800f15d/edi_energy_scraper-2.0.8.tar.gz",
"platform": null,
"bugtrack_url": null,
"license": "MIT",
"summary": "a scraper to mirror edi-energy.de",
"version": "2.0.8",
"project_urls": {
"Changelog": "https://github.com/Hochfrequenz/edi_energy_scraper/releases",
"Homepage": "https://github.com/Hochfrequenz/edi_energy_scraper"
},
"split_keywords": [
"ahb",
" automation",
" bdew",
" edi@energy"
],
"urls": [
{
"comment_text": null,
"digests": {
"blake2b_256": "2da2a632de88b96a935a05003bc1d8a9a27a897b24b7da0570d14f29b8d95fba",
"md5": "ccdd8512b8a93ea88315a70e5bed9219",
"sha256": "35ab731dd2bc31af19eaa9f7e1481425a5b892c3578bec59900260c4b1ae2b26"
},
"downloads": -1,
"filename": "edi_energy_scraper-2.0.8-py3-none-any.whl",
"has_sig": false,
"md5_digest": "ccdd8512b8a93ea88315a70e5bed9219",
"packagetype": "bdist_wheel",
"python_version": "py3",
"requires_python": ">=3.10",
"size": 13159,
"upload_time": "2025-02-10T07:51:35",
"upload_time_iso_8601": "2025-02-10T07:51:35.576314Z",
"url": "https://files.pythonhosted.org/packages/2d/a2/a632de88b96a935a05003bc1d8a9a27a897b24b7da0570d14f29b8d95fba/edi_energy_scraper-2.0.8-py3-none-any.whl",
"yanked": false,
"yanked_reason": null
},
{
"comment_text": null,
"digests": {
"blake2b_256": "5798d11e295a477324219accd79d4b29b1b7d5948fc5a23bb470f582c800f15d",
"md5": "dd1b2d3e239102a39e8a545c84d8385f",
"sha256": "d890771e00139bcdcfc3bf6f573c3d96bcedc326633013d94dfb7c4eeac47d93"
},
"downloads": -1,
"filename": "edi_energy_scraper-2.0.8.tar.gz",
"has_sig": false,
"md5_digest": "dd1b2d3e239102a39e8a545c84d8385f",
"packagetype": "sdist",
"python_version": "source",
"requires_python": ">=3.10",
"size": 18384,
"upload_time": "2025-02-10T07:51:36",
"upload_time_iso_8601": "2025-02-10T07:51:36.920686Z",
"url": "https://files.pythonhosted.org/packages/57/98/d11e295a477324219accd79d4b29b1b7d5948fc5a23bb470f582c800f15d/edi_energy_scraper-2.0.8.tar.gz",
"yanked": false,
"yanked_reason": null
}
],
"upload_time": "2025-02-10 07:51:36",
"github": true,
"gitlab": false,
"bitbucket": false,
"codeberg": false,
"github_user": "Hochfrequenz",
"github_project": "edi_energy_scraper",
"travis_ci": false,
"coveralls": false,
"github_actions": true,
"requirements": [
{
"name": "aiohappyeyeballs",
"specs": [
[
"==",
"2.4.4"
]
]
},
{
"name": "aiohttp",
"specs": [
[
"==",
"3.11.11"
]
]
},
{
"name": "aiosignal",
"specs": [
[
"==",
"1.3.2"
]
]
},
{
"name": "annotated-types",
"specs": [
[
"==",
"0.7.0"
]
]
},
{
"name": "attrs",
"specs": [
[
"==",
"25.1.0"
]
]
},
{
"name": "efoli",
"specs": [
[
"==",
"1.4.0"
]
]
},
{
"name": "frozenlist",
"specs": [
[
"==",
"1.5.0"
]
]
},
{
"name": "idna",
"specs": [
[
"==",
"3.10"
]
]
},
{
"name": "more-itertools",
"specs": [
[
"==",
"10.6.0"
]
]
},
{
"name": "multidict",
"specs": [
[
"==",
"6.1.0"
]
]
},
{
"name": "propcache",
"specs": [
[
"==",
"0.2.1"
]
]
},
{
"name": "pydantic",
"specs": [
[
"==",
"2.10.6"
]
]
},
{
"name": "pydantic-core",
"specs": [
[
"==",
"2.27.2"
]
]
},
{
"name": "pypdf",
"specs": [
[
"==",
"5.2.0"
]
]
},
{
"name": "pytz",
"specs": [
[
"==",
"2025.1"
]
]
},
{
"name": "typing-extensions",
"specs": [
[
"==",
"4.12.2"
]
]
},
{
"name": "yarl",
"specs": [
[
"==",
"1.18.3"
]
]
}
],
"tox": true,
"lcname": "edi-energy-scraper"
}