| Field | Value |
| --- | --- |
| Name | databus |
| Version | 0.1.0 |
| Summary | Python SDK and command-line toolkit for GTFS data processing, validation, and analysis. Provides programmatic access to Databús APIs, GTFS manipulation utilities, data conversion tools, and automated testing frameworks for transit data workflows and research applications. |
| Upload time | 2025-08-20 18:53:55 |
| Requires Python | >=3.9 |
| Home page | None |
| Author | None |
| Maintainer | None |
| License | None |
| Docs URL | None |
| Keywords | gtfs, transit, transportation, data-processing, validation, api |
| Requirements | None recorded |
| Travis-CI | None |
| Coveralls test coverage | None |
# Databús Python SDK
[Python 3.9+](https://www.python.org/downloads/)
[License: MIT](LICENSE)
[Code style: black](https://github.com/psf/black)
Python SDK and command-line toolkit for GTFS data processing, validation, and analysis. Provides programmatic access to Databús APIs, GTFS manipulation utilities, data conversion tools, and automated testing frameworks for transit data workflows and research applications.
## Features
### 🚌 GTFS Data Processing
- Load and manipulate GTFS feeds from ZIP files or directories
- Filter feeds by geographic bounds or date ranges
- Export processed feeds to various formats
- Statistical analysis and reporting
### ✅ Data Validation
- Comprehensive GTFS specification compliance checking
- Custom validation rules and quality metrics
- Detailed validation reports with scoring
- Integration with standard validation tools
### 🌐 API Integration
- Full access to Databús API endpoints
- Automatic feed discovery and metadata retrieval
- Bulk download and synchronization capabilities
- Rate limiting and retry mechanisms
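The SDK's actual retry behavior is not documented here, so as a generic illustration only (not the Databús implementation), retry with exponential backoff can be sketched with the standard library; `with_retries` and `flaky` are hypothetical names:

```python
import time

def with_retries(func, attempts=3, base_delay=0.01):
    """Call func(), retrying with exponential backoff on failure."""
    for attempt in range(attempts):
        try:
            return func()
        except Exception:
            if attempt == attempts - 1:
                raise  # Out of attempts: surface the last error
            time.sleep(base_delay * (2 ** attempt))

calls = {"n": 0}

def flaky():
    """Simulates a transient network error on the first two calls."""
    calls["n"] += 1
    if calls["n"] < 3:
        raise ConnectionError("transient failure")
    return "ok"

print(with_retries(flaky))  # succeeds on the third attempt: ok
```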
### 🛠️ Command-Line Tools
- Intuitive CLI for common operations
- Rich output formatting and progress indicators
- Batch processing and automation support
- Integration with shell scripts and workflows
## Installation
### Using uv (recommended)
```bash
# Install from PyPI (when published)
uv pip install databus

# Install from source
git clone https://github.com/fabianabarca/databus-py.git
cd databus-py
uv pip install -e .
```
### Using pip
```bash
# Install from PyPI (when published)
pip install databus

# Install from source
git clone https://github.com/fabianabarca/databus-py.git
cd databus-py
pip install -e .
```
## Quick Start
### Python API
```python
from databus import DatabusClient, GTFSProcessor, GTFSValidator

# Connect to the Databús API
client = DatabusClient("https://api.databus.cr")
feeds = client.get_feeds(country="CR")

# Process a GTFS feed
processor = GTFSProcessor("costa_rica_gtfs.zip")
processor.load_feed()

# Get feed statistics
stats = processor.get_feed_stats()
print(f"Routes: {stats['routes']}, Stops: {stats['stops']}")

# Validate the feed
validator = GTFSValidator(processor)
report = validator.validate()
print(f"Validation score: {report.score}/100")

# Filter by geographic area
san_jose_area = processor.filter_by_bounding_box(
    9.8, -84.2, 10.1, -83.9
)
san_jose_area.export_to_zip("san_jose_gtfs.zip")
```
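The positional arguments above follow the `min_lat, min_lon, max_lat, max_lon` order shown in the GTFSProcessor reference below. A stdlib-only sketch of the membership test such a filter performs on each stop (the `in_bbox` helper is illustrative, not part of the SDK):

```python
def in_bbox(lat, lon, min_lat, min_lon, max_lat, max_lon):
    """True if (lat, lon) falls inside the bounding box, borders included."""
    return min_lat <= lat <= max_lat and min_lon <= lon <= max_lon

# A stop in downtown San José lies inside the box from the example above;
# a point well to the east falls outside it.
print(in_bbox(9.93, -84.08, 9.8, -84.2, 10.1, -83.9))  # True
print(in_bbox(9.99, -83.03, 9.8, -84.2, 10.1, -83.9))  # False
```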
### Command Line Interface
```bash
# List available feeds
databus api feeds --country CR

# Download a feed
databus api download costa-rica-gtfs

# Get feed information
databus gtfs info costa_rica_gtfs.zip

# Validate a feed
databus gtfs validate costa_rica_gtfs.zip

# Filter feed by bounding box
databus gtfs filter costa_rica_gtfs.zip san_jose.zip \
    --bbox "-84.2,9.8,-83.9,10.1"

# Filter by date range
databus gtfs filter costa_rica_gtfs.zip current_service.zip \
    --dates "2024-01-01,2024-12-31"
```
## Documentation
### Core Classes
#### DatabusClient
The main interface for interacting with Databús APIs:
```python
client = DatabusClient(
    base_url="https://api.databus.cr",
    api_key="your_api_key",  # Optional
    timeout=30,
)

# Discover feeds
feeds = client.get_feeds()
costarica_feeds = client.get_feeds(country="CR")

# Get detailed feed information
feed = client.get_feed("costa-rica-gtfs")

# Access GTFS data
agencies = client.get_agencies("costa-rica-gtfs")
routes = client.get_routes("costa-rica-gtfs")
stops = client.get_stops("costa-rica-gtfs")

# Download feeds
client.download_feed("costa-rica-gtfs", "costa_rica.zip")
```
#### GTFSProcessor
Load, manipulate, and analyze GTFS feeds:
```python
processor = GTFSProcessor("feed.zip")
processor.load_feed()

# Access GTFS tables as DataFrames
routes = processor.get_routes()
stops = processor.get_stops(as_geodataframe=True)
trips = processor.get_trips(route_id="route_1")

# Get comprehensive statistics
stats = processor.get_feed_stats()
route_stats = processor.get_route_stats("route_1")

# Filter and transform
filtered = processor.filter_by_bounding_box(
    min_lat=9.8, min_lon=-84.2,
    max_lat=10.1, max_lon=-83.9
)
date_filtered = processor.filter_by_dates(
    "2024-01-01", "2024-12-31"
)

# Export results
processor.export_to_zip("processed_feed.zip")
```
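Because the processor exposes GTFS tables as pandas DataFrames, per-route figures like those in `get_route_stats` can be derived with an ordinary groupby. A toy sketch on a hand-made trips table (the data and the chosen statistic are illustrative assumptions, not SDK output):

```python
import pandas as pd

# A minimal trips table in the shape GTFS uses (trip_id, route_id)
trips = pd.DataFrame({
    "trip_id": ["t1", "t2", "t3", "t4"],
    "route_id": ["route_1", "route_1", "route_2", "route_1"],
})

# Trips per route: the kind of figure a route-stats report would include
trips_per_route = trips.groupby("route_id")["trip_id"].count()
print(trips_per_route["route_1"])  # 3
```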
#### GTFSValidator
Validate GTFS feeds for compliance and quality:
```python
validator = GTFSValidator(processor)
report = validator.validate()

print(f"Status: {report.status}")
print(f"Score: {report.score}/100")
print(f"Errors: {len(report.errors)}")
print(f"Warnings: {len(report.warnings)}")

# Access detailed issues
for error in report.errors:
    print(f"Error: {error['message']}")

# Save report
with open("validation_report.json", "w") as f:
    f.write(report.to_json())
```
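The SDK does not specify how the 0–100 score is computed. Purely as a hedged illustration of one plausible scheme — errors weighted more heavily than warnings, floored at zero — where `quality_score` and its weights are assumptions, not the validator's formula:

```python
def quality_score(n_errors, n_warnings, error_weight=10, warning_weight=2):
    """Illustrative 0-100 score: start at 100, deduct per issue, floor at 0."""
    return max(0, 100 - error_weight * n_errors - warning_weight * n_warnings)

print(quality_score(n_errors=2, n_warnings=5))  # 70
```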
### Configuration
Configure the library using environment variables or configuration files:
```bash
# Environment variables
export DATABUS_API_URL="https://api.databus.cr"
export DATABUS_API_KEY="your_api_key"
export DATABUS_LOG_LEVEL="INFO"
```
Or create a configuration file at `~/.databus/config.json`:
```json
{
  "api": {
    "base_url": "https://api.databus.cr",
    "api_key": "your_api_key",
    "timeout": 30
  },
  "logging": {
    "level": "INFO"
  },
  "processing": {
    "chunk_size": 10000
  }
}
```
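A plausible precedence rule — environment variables override the config file — can be sketched with the standard library. This is an assumption about how the SDK resolves its settings, and `load_config` is a hypothetical helper, not a databus API:

```python
import json
import os
from pathlib import Path

def load_config(path=None):
    """Read ~/.databus/config.json if present; DATABUS_* env vars win."""
    config = {"api": {"base_url": None, "api_key": None, "timeout": 30}}
    p = Path(path) if path else Path.home() / ".databus" / "config.json"
    if p.exists():
        config.update(json.loads(p.read_text()))
    # Environment variables take precedence over file values
    if os.environ.get("DATABUS_API_URL"):
        config["api"]["base_url"] = os.environ["DATABUS_API_URL"]
    if os.environ.get("DATABUS_API_KEY"):
        config["api"]["api_key"] = os.environ["DATABUS_API_KEY"]
    return config

os.environ["DATABUS_API_URL"] = "https://api.databus.cr"
print(load_config(path="/nonexistent/config.json")["api"]["base_url"])
# https://api.databus.cr
```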
## Development
### Setup Development Environment
```bash
git clone https://github.com/fabianabarca/databus-py.git
cd databus-py
# Install with development dependencies
uv pip install -e ".[dev,test]"
# Install pre-commit hooks
pre-commit install
```
### Running Tests
```bash
# Run all tests
pytest
# Run with coverage
pytest --cov=databus --cov-report=html
# Run specific test file
pytest tests/test_gtfs_processor.py
```
### Code Quality
```bash
# Format code
black src/databus tests/
# Sort imports
isort src/databus tests/
# Lint code
flake8 src/databus tests/
# Type checking
mypy src/databus
```
## Contributing
1. Fork the repository
2. Create a feature branch (`git checkout -b feature/amazing-feature`)
3. Commit your changes (`git commit -m 'Add amazing feature'`)
4. Push to the branch (`git push origin feature/amazing-feature`)
5. Open a Pull Request
Please read [CONTRIBUTING.md](CONTRIBUTING.md) for details on our code of conduct and development process.
## License
This project is licensed under the MIT License - see the [LICENSE](LICENSE) file for details.
## Acknowledgments
- Built on top of [gtfs-kit](https://github.com/mrcagney/gtfs_kit) for GTFS processing
- Uses [pandas](https://pandas.pydata.org/) and [geopandas](https://geopandas.org/) for data manipulation
- CLI powered by [click](https://click.palletsprojects.com/) and [rich](https://rich.readthedocs.io/)
- Validation framework inspired by [gtfs-validator](https://github.com/MobilityData/gtfs-validator)
## Related Projects
- [Databús](https://github.com/fabianabarca/databus) - The main Databús platform
- [GTFS Specification](https://gtfs.org/) - General Transit Feed Specification
- [Transitland](https://transitland.org/) - Global transit data platform