[![Unit Tests](https://github.com/mpm-tu-berlin/eflips-ingest/actions/workflows/unittests.yml/badge.svg)](https://github.com/mpm-tu-berlin/eflips-ingest/actions/workflows/unittests.yml)
[![Code style: black](https://img.shields.io/badge/code%20style-black-000000.svg)](https://github.com/psf/black)
# eflips-ingest
---
Part of the [eFLIPS/simBA](https://github.com/stars/ludgerheide/lists/ebus2030) list of projects.
---
This repository contains code to import bus schedules from various sources into an [eFLIPS-Model](https://github.com/mpm-tu-berlin/eflips-model) database.
## Installation
1. Set up a [PostgreSQL](https://www.postgresql.org/) database with the [PostGIS](https://postgis.net/) and `btree_gist` extensions enabled.
```bash
apt install postgresql postgis
sudo -u postgres createdb eflips
sudo -u postgres psql eflips -c "CREATE EXTENSION postgis;"
sudo -u postgres psql eflips -c "CREATE EXTENSION btree_gist;"
```
2. Clone this git repository (or [download a specific release](https://github.com/mpm-tu-berlin/eflips-ingest/releases)).
```bash
git clone git@github.com:mpm-tu-berlin/eflips-ingest.git
```
3. Install the packages listed in `poetry.lock` and `pyproject.toml` into your Python environment. Notes:
- This project depends on [pyproj](https://pyproj4.github.io/pyproj/stable/installation.html), which may require the `proj-bin` package (`apt install proj-bin` on Ubuntu).
- The supported platforms are macOS and Linux.
- Using the [poetry](https://python-poetry.org/) package manager is recommended. It can be installed according to the
[official installation instructions](https://python-poetry.org/docs/#installing-with-the-official-installer).
```bash
poetry install
```
## Usage
### Command line
The code is organized into various Python files under the `eflips/ingest` folder, each for ingesting a specific data source. These files should be runnable using `python eflips/ingest/x.py`. The following data sources are currently supported:
- `bvgxml.py`: XML files emitted by BVG's proprietary software.
  - Requires at least the `GOOGLE_MAPS_API_KEY` environment variable. Setting `OPENELEVATION_URL` as well is suggested, as it saves money on Google Maps API calls.
  - Known limitations:
    - The source data is known to contain some partial rotations of the bus routes. These are imported, which may lead to errors when creating a depot for each start and end spot, as some rotations start or end at spots that are not depots.
    - Some routes that go from a stop back to that same stop are not imported, as they are considered invalid.
    - Some routes' distances and durations are adjusted when they are found to be unrealistically short or long.
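The environment variable requirement for `bvgxml.py` can be checked up front before a long import starts. The following is a minimal sketch, not the package's own code: the helper name `read_bvgxml_settings` and the returned dictionary keys are hypothetical, and only the two variable names come from the list above.

```python
import os


def read_bvgxml_settings(environ=None):
    """Collect the two environment variables named above.

    Hypothetical helper: raises RuntimeError when the required key is
    missing and returns None for the optional URL when it is unset.
    """
    env = os.environ if environ is None else environ

    api_key = env.get("GOOGLE_MAPS_API_KEY")
    if not api_key:
        raise RuntimeError("GOOGLE_MAPS_API_KEY must be set for bvgxml.py")

    # Optional: an empty or missing value is normalized to None.
    elevation_url = env.get("OPENELEVATION_URL") or None

    return {"google_maps_api_key": api_key, "openelevation_url": elevation_url}
```

Passing a plain dictionary instead of `os.environ` keeps the check easy to exercise without touching the real environment.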
### API
---
All API classes should inherit from `eflips.ingest.base.AbstractIngester`. This class also provides documentation on how to implement the required methods.
---
The `eflips-ingest` package is designed as a reusable package that provides standardized, programmatic access to the
different ingesters. It provides the `Ingester` class with the following methods:
- `__init__(self, database_url: str)`: The constructor. It takes the database URL as a parameter.
- `prepare(self, progress_callback: None | Callable[[float], None] = None, **kwargs: Dict[str, str | int | float | bool | Enum | Path]) -> Tuple[bool, UUID | Dict[str, str]]`:
  This method is used to validate the data. It
  takes multiple keyword arguments, one for each parameter. It returns a tuple containing a boolean indicating whether
  the data is valid and either a UUID (if the data is valid) or a dictionary containing the error names and messages
  (if the data is invalid). It is suggested that the input form is generated by
  [introspecting](https://docs.python.org/3/library/inspect.html#introspecting-callables-with-the-signature-object)
  this method. If a UUID is returned, this UUID can be used to start the ingestion process, even with another instance
  of this Ingester class (the validated input data is saved to a temporary directory).
- `ingest(self, UUID: UUID, progress_callback: None | Callable[[float], None]) -> None`: This method is used to start the
  ingestion process. It takes the UUID returned by the `prepare` method and a callback function that will be called
  with the progress of the ingestion process. It is suggested that the progress is displayed in the UI. This method
  may raise an exception if the ingestion process fails. *Note that exceptions should not be raised during the
  ingest process if everything goes well, as validity is checked in the `prepare` method.*
- `prepare_param_names()`: This property should return a dict with a short description of each parameter that the `prepare` method takes. This can be used for naming fields in the UI.
- `prepare_param_descriptions()`: This property should return a dict with a long description of each parameter that the `prepare` method takes. This can be used for tooltips in the UI.
In addition, the package provides a `get_ingesters` function that returns a list of `Ingester` classes.
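To illustrate the two-step flow described above, here is a minimal sketch built around a hypothetical `ToyIngester`. It does not import `eflips-ingest` and is not the package's implementation: the class, its `name` and `bus_count` parameters, the `ingest_id` argument name, the temporary file layout, and the `form_fields` helper are all assumptions made for illustration. It only mirrors the described contract: `prepare` validates and returns either a UUID or a dict of errors, and `ingest` picks up the saved input by UUID, possibly from another instance.

```python
import inspect
import tempfile
import uuid
from pathlib import Path
from typing import Callable, Dict, Optional, Tuple, Union
from uuid import UUID


class ToyIngester:
    """Hypothetical stand-in that mimics the interface described above."""

    def __init__(self, database_url: str):
        self.database_url = database_url
        self.ingested = None

    def prepare(
        self,
        progress_callback: Optional[Callable[[float], None]] = None,
        *,
        name: str = "",
        bus_count: int = 0,
    ) -> Tuple[bool, Union[UUID, Dict[str, str]]]:
        errors: Dict[str, str] = {}
        if not name:
            errors["name"] = "A name is required."
        if bus_count <= 0:
            errors["bus_count"] = "Must be a positive integer."
        if errors:
            return False, errors

        # Save the validated input so that another instance can pick it up.
        token = uuid.uuid4()
        target = Path(tempfile.gettempdir()) / f"toy-ingest-{token}.txt"
        target.write_text(f"{name}\n{bus_count}\n")
        if progress_callback is not None:
            progress_callback(1.0)
        return True, token

    def ingest(
        self,
        ingest_id: UUID,
        progress_callback: Optional[Callable[[float], None]] = None,
    ) -> None:
        source = Path(tempfile.gettempdir()) / f"toy-ingest-{ingest_id}.txt"
        name, bus_count = source.read_text().splitlines()
        # A real ingester would write to the database here.
        self.ingested = (name, int(bus_count))
        source.unlink()
        if progress_callback is not None:
            progress_callback(1.0)


def form_fields(ingester_cls) -> Dict[str, type]:
    """Map each keyword-only parameter of prepare() to its annotation."""
    signature = inspect.signature(ingester_cls.prepare)
    return {
        name: param.annotation
        for name, param in signature.parameters.items()
        if param.kind is inspect.Parameter.KEYWORD_ONLY
    }


if __name__ == "__main__":
    print(form_fields(ToyIngester))
    ok, result = ToyIngester("postgresql://example").prepare(name="demo", bus_count=3)
    if ok:
        # A second instance can finish the job using only the UUID.
        ToyIngester("postgresql://example").ingest(result, progress_callback=print)
    else:
        print(result)
```

A UI could build one input field per entry of `form_fields(...)`, show the error dictionary next to the matching fields when `prepare` returns `False`, and call `ingest` only once a UUID has been returned.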
## Testing
---
**NOTE**: Be aware that the tests will clear the database specified in the `DATABASE_URL` environment variable. Make sure that you are not using a database that you want to keep.
---
Testing is done using the `pytest` framework, with tests located in the `tests` directory. To run the tests, execute the following commands in the root directory of the repository:
```bash
export PYTHONPATH=tests:. # To make sure that the tests can find the eflips package
export DATABASE_URL=postgis://postgres:postgres@localhost:5432/postgres # Or whatever your database URL is
export GOOGLE_MAPS_API_KEY=put_your_api_key_here # Required for some tests
export OPENELEVATION_URL=put_your_url_here # Optional, required for some tests
pytest
```
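Because the tests clear the database named in `DATABASE_URL`, it can help to check which database a URL points at before running them. The sketch below is one possible guard and not part of the repository: the helper names and the idea of an allow-list of throwaway database names are assumptions.

```python
from urllib.parse import urlsplit


def database_name(database_url: str) -> str:
    """Return the database name (the URL path without its leading slash)."""
    return urlsplit(database_url).path.lstrip("/")


def looks_disposable(database_url: str, allowed) -> bool:
    """True when the URL points at a database explicitly listed as throwaway."""
    return database_name(database_url) in set(allowed)
```

For example, `looks_disposable(os.environ["DATABASE_URL"], allowed={"eflips_test"})` could be evaluated in a wrapper script before `pytest` is invoked, where `eflips_test` is a placeholder name.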
## Development
We utilize the [GitHub Flow](https://docs.github.com/get-started/quickstart/github-flow) branching structure. This means
that the `main` branch is always deployable and that all development happens in feature branches. The feature branches
are merged into `main` via pull requests.
We use [black](https://black.readthedocs.io/en/stable/) for code formatting. You can use
[pre-commit](https://pre-commit.com/) to ensure the code is formatted correctly before committing. You are also free to
use other methods to format the code, but please ensure that the code is formatted correctly before committing.
Please make sure that your `poetry.lock` and `pyproject.toml` files are consistent before committing. You can use `poetry check` to check this. This is also checked by pre-commit.
## License
This project is licensed under the AGPLv3 license - see the [LICENSE](LICENSE.md) file for details.
## Funding Notice
This code was developed as part of the project [eBus2030+](https://www.eflip.de/), funded by the German Federal Ministry for Digital and Transport (BMDV) under grant number 03EMF0402.
Raw data
{
"_id": null,
"home_page": "https://github.com/mpm-tu-berlin/eflips-ingest",
"name": "eflips-ingest",
"maintainer": null,
"docs_url": null,
"requires_python": "<4.0,>=3.10",
"maintainer_email": null,
"keywords": null,
"author": "Ludger Heide",
"author_email": "ludger.heide@lhtechnologies.de",
"download_url": "https://files.pythonhosted.org/packages/28/39/3dc50b4105a58485484fb91c83d05f7c3c316e11756c3b6303ca3dec949b/eflips_ingest-1.2.78.tar.gz",
"platform": null,
"bugtrack_url": null,
"license": "AGPL-3.0-or-later",
"summary": "A collection of import scripts for converting bus schedule data into the [eflips-model](https://github.com/mpm-tu-berlin/eflips-model) data format.",
"version": "1.2.78",
"project_urls": {
"Homepage": "https://github.com/mpm-tu-berlin/eflips-ingest",
"Repository": "https://github.com/mpm-tu-berlin/eflips-ingest"
},
"split_keywords": [],
"urls": [
{
"comment_text": "",
"digests": {
"blake2b_256": "c1fe96c32982fa4984347f42ca3a43a63ce587504140fd445cd9c0147db79664",
"md5": "8846cec21ad0c74f1222fc2638c8f148",
"sha256": "e396e7d0f244b75d651cdafea51fd99333f1d4c91d95c0588f158c796ef443ba"
},
"downloads": -1,
"filename": "eflips_ingest-1.2.78-py3-none-any.whl",
"has_sig": false,
"md5_digest": "8846cec21ad0c74f1222fc2638c8f148",
"packagetype": "bdist_wheel",
"python_version": "py3",
"requires_python": "<4.0,>=3.10",
"size": 80243,
"upload_time": "2024-12-11T13:29:58",
"upload_time_iso_8601": "2024-12-11T13:29:58.985929Z",
"url": "https://files.pythonhosted.org/packages/c1/fe/96c32982fa4984347f42ca3a43a63ce587504140fd445cd9c0147db79664/eflips_ingest-1.2.78-py3-none-any.whl",
"yanked": false,
"yanked_reason": null
},
{
"comment_text": "",
"digests": {
"blake2b_256": "28393dc50b4105a58485484fb91c83d05f7c3c316e11756c3b6303ca3dec949b",
"md5": "d2c456dc07002f150cf55b3a34ee6dbe",
"sha256": "1c6d6c2c788ce0dc051b3cd7524849ce13c51c4853ba40a32fe52eb8b9915a24"
},
"downloads": -1,
"filename": "eflips_ingest-1.2.78.tar.gz",
"has_sig": false,
"md5_digest": "d2c456dc07002f150cf55b3a34ee6dbe",
"packagetype": "sdist",
"python_version": "source",
"requires_python": "<4.0,>=3.10",
"size": 66536,
"upload_time": "2024-12-11T13:30:00",
"upload_time_iso_8601": "2024-12-11T13:30:00.463421Z",
"url": "https://files.pythonhosted.org/packages/28/39/3dc50b4105a58485484fb91c83d05f7c3c316e11756c3b6303ca3dec949b/eflips_ingest-1.2.78.tar.gz",
"yanked": false,
"yanked_reason": null
}
],
"upload_time": "2024-12-11 13:30:00",
"github": true,
"gitlab": false,
"bitbucket": false,
"codeberg": false,
"github_user": "mpm-tu-berlin",
"github_project": "eflips-ingest",
"travis_ci": false,
"coveralls": false,
"github_actions": true,
"lcname": "eflips-ingest"
}