ffiec-data-collector


- Name: ffiec-data-collector
- Version: 2.0.0rc2
- Home page: https://github.com/call-report/ffiec-data-collector
- Summary: Lightweight Python library for collecting bulk FFIEC CDR data
- Upload time: 2025-08-11 19:25:38
- Author: Michael
- Requires Python: >=3.10
- License: MPL-2.0
- Keywords: ffiec, banking, financial, data, cdr, ubpr
# FFIEC Data Collector

A lightweight Python library for collecting bulk FFIEC CDR data. It is a direct HTTP implementation that talks to the FFIEC's ASP.NET WebForms backend. Version 2.0 replaces the previous library, which required Selenium and a headless browser. Because no browser automation is involved, this version is faster and more efficient.
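As background on what "direct HTTP" against a WebForms backend involves: ASP.NET WebForms pages carry their state in hidden form fields such as `__VIEWSTATE` and `__EVENTVALIDATION`, and a client has to echo them back in each POST. The sketch below shows that general pattern with the standard library only. It is not this library's code; the sample HTML, `HiddenFieldParser`, `build_postback`, and the control names are all made up for illustration.

```python
from html.parser import HTMLParser
from urllib.parse import urlencode


class HiddenFieldParser(HTMLParser):
    """Collect name/value pairs from <input type="hidden"> elements."""

    def __init__(self) -> None:
        super().__init__()
        self.fields: dict[str, str] = {}

    def handle_starttag(self, tag, attrs):
        if tag != "input":
            return
        attr = dict(attrs)
        if (attr.get("type") or "").lower() == "hidden" and attr.get("name"):
            self.fields[attr["name"]] = attr.get("value") or ""


def build_postback(html: str, event_target: str, extra: dict[str, str]) -> str:
    """Return a URL-encoded form body that echoes the page state back."""
    parser = HiddenFieldParser()
    parser.feed(html)
    payload = dict(parser.fields)            # e.g. __VIEWSTATE, __EVENTVALIDATION
    payload["__EVENTTARGET"] = event_target  # the control that "fired" the postback
    payload.update(extra)                    # visible form values chosen by the caller
    return urlencode(payload)


# Illustrative page fragment, not copied from the FFIEC site.
SAMPLE = (
    '<form><input type="hidden" name="__VIEWSTATE" value="abc123" />'
    '<input type="hidden" name="__EVENTVALIDATION" value="xyz" />'
    '<select name="ReportList"></select></form>'
)
body = build_postback(SAMPLE, "DownloadButton", {"ReportList": "demo"})
print(body)
```

In a real client the resulting body would be sent as `application/x-www-form-urlencoded` to the same page URL, typically within one HTTP session so that cookies persist between the GET and the POST.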

## Key Functions

- Direct HTTP requests, no browser automation needed
- Access to CDR Bulk Data, UBPR Ratios, Rankings, and Statistics
- Data downloads in XBRL and TSV formats
- Detects breaking changes in the FFIEC's website structure
- No browser-automation dependencies such as Selenium
- Save to disk or process in memory

## Installation

### Release Candidate (Current)

```bash
# Install the latest release candidate
pip install --pre ffiec-data-collector

# Or install a specific RC version
pip install ffiec-data-collector==2.0.0rc2
```

Using uv:
```bash
uv add ffiec-data-collector==2.0.0rc2
```

### Stable Release (Future)

```bash
pip install ffiec-data-collector
```

Using uv:
```bash
uv add ffiec-data-collector
```

### Install from source

Using pip:
```bash
git clone https://github.com/call-report/ffiec-data-collector.git
cd ffiec-data-collector
pip install -e .
```

Using uv:
```bash
git clone https://github.com/call-report/ffiec-data-collector.git
cd ffiec-data-collector
uv sync --dev
```

## Quick Start

```python
from ffiec_data_collector import FFIECDownloader, Product, FileFormat

# Initialize downloader
downloader = FFIECDownloader()

# Download latest Call Report
result = downloader.download_latest(Product.CALL_SINGLE, FileFormat.XBRL)
print(f"Downloaded: {result.filename}")
print(f"Data last updated: {result.last_updated}")

# Download specific quarter
result = downloader.download_cdr_single_period("20240331")
print(f"Downloaded Q1 2024: {result.filename}")

# Get available quarters
info = downloader.get_bulk_data_sources_cdr()
print(f"Available quarters: {info['available_quarters']}")
```

## Available Data Products

| Product | Description | Periods |
|---------|-------------|---------|
| `CALL_SINGLE` | Call Reports - Single Period | One quarter |
| `CALL_FOUR_PERIODS` | Call Reports - Balance Sheet, Income Statement, Past Due | Four quarters |
| `UBPR_RATIO_SINGLE` | UBPR Ratio - Single Period | One quarter |
| `UBPR_RATIO_FOUR` | UBPR Ratio - Four Periods | Four quarters |
| `UBPR_RANK_FOUR` | UBPR Rank - Four Periods | Four quarters |
| `UBPR_STATS_FOUR` | UBPR Stats - Four Periods | Four quarters |

## Advanced Usage

### Download Multiple Quarters

```python
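from ffiec_data_collector import FFIECDownloader

downloader = FFIECDownloader()

# Quarter-end dates in YYYYMMDD format, newest first
quarters = ["20240331", "20231231", "20230930", "20230630"]
for quarter in quarters:
    result = downloader.download_cdr_single_period(quarter)
    if result.success:
        print(f"✓ {quarter}: {result.filename}")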
```
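The quarter strings above are quarter-end dates written as `YYYYMMDD`. Rather than typing them by hand, a small helper can generate them. The sketch below is standard-library Python only; `recent_quarter_ends` is a hypothetical helper, not part of this library, and a generated quarter may not yet be published, so it is worth checking the list against `get_bulk_data_sources_cdr()` before downloading.

```python
from datetime import date

# (month, day) of each calendar quarter end, in order.
_QUARTER_ENDS = [(3, 31), (6, 30), (9, 30), (12, 31)]


def recent_quarter_ends(as_of: date, count: int) -> list[str]:
    """Return the `count` latest quarter ends on or before `as_of`, newest first."""
    year, idx = as_of.year, 3
    # Step back until the candidate quarter end is not in the future.
    while date(year, *_QUARTER_ENDS[idx]) > as_of:
        idx -= 1
        if idx < 0:
            idx, year = 3, year - 1

    result = []
    for _ in range(count):
        month, day = _QUARTER_ENDS[idx]
        result.append(f"{year:04d}{month:02d}{day:02d}")
        idx -= 1
        if idx < 0:
            idx, year = 3, year - 1
    return result


quarters = recent_quarter_ends(date(2024, 5, 15), 4)
print(quarters)
```

The returned list can be passed to the loop above in place of the hard-coded `quarters` list.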

### Download to Memory

```python
import zipfile

from ffiec_data_collector import FFIECDownloader, Product, FileFormat

downloader = FFIECDownloader()

# Get content without saving to disk
content = downloader.download(
    product=Product.CALL_SINGLE,
    period="20240331",
    format=FileFormat.XBRL,
    save_to_disk=False
)

# Process the ZIP file in memory
# (ZipFile accepts any file-like object, e.g. an io.BytesIO)
with zipfile.ZipFile(content) as zf:
    for info in zf.infolist():
        print(f"{info.filename}: {info.file_size} bytes")
```
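When the download is in TSV format, the members of the archive can be parsed in memory as well. The following is a minimal sketch using only the standard library. The member name, column names, and values are synthetic demo data, and the UTF-8 encoding and `.txt`/`.tsv` extensions are assumptions rather than documented properties of the FFIEC files; `read_tsv_members` is a hypothetical helper.

```python
import csv
import io
import zipfile


def read_tsv_members(archive: io.BytesIO) -> dict[str, list[dict[str, str]]]:
    """Parse every .txt/.tsv member of a ZIP archive into a list of row dicts."""
    tables = {}
    with zipfile.ZipFile(archive) as zf:
        for info in zf.infolist():
            if not info.filename.lower().endswith((".txt", ".tsv")):
                continue
            with zf.open(info) as raw:
                # Assumed encoding; adjust if the real files differ.
                text = io.TextIOWrapper(raw, encoding="utf-8", newline="")
                tables[info.filename] = list(csv.DictReader(text, delimiter="\t"))
    return tables


# Build a tiny synthetic archive so the sketch runs without network access.
demo = io.BytesIO()
with zipfile.ZipFile(demo, "w") as zf:
    zf.writestr("demo.txt", "ID\tVALUE\n1\t100\n2\t250\n")
demo.seek(0)

tables = read_tsv_members(demo)
print(tables["demo.txt"])
```

With a real download, the in-memory `content` object from the example above would take the place of `demo`, provided it is a file-like object.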

### Website Structure Validation

```python
from ffiec_data_collector import ValidatedFFIECDownloader, Product

# Automatically validates that the website has not changed
validated_downloader = ValidatedFFIECDownloader()
result = validated_downloader.download(
    product=Product.CALL_SINGLE,
    period="20240331"
)
```

### Check Website Health

```python
from ffiec_data_collector import ThumbprintValidator

validator = ThumbprintValidator()
results = validator.validate_all()

for page_type, result in results.items():
    print(f"{page_type}: {'Valid' if result['valid'] else 'Invalid'}")
```
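In a scheduled job it can be useful to turn the validation results into a single pass/fail signal. The sketch below assumes only the result shape used in the loop above (a mapping from page type to a dict with a `valid` key). The page-type names in `sample_results` are placeholders, and `invalid_pages` is a hypothetical helper.

```python
def invalid_pages(results: dict[str, dict]) -> list[str]:
    """Return the page types whose validation result is not marked valid."""
    return sorted(name for name, res in results.items() if not res.get("valid"))


# Hand-written stand-in for the mapping returned by validate_all().
sample_results = {"page_a": {"valid": True}, "page_b": {"valid": False}}

failed = invalid_pages(sample_results)
exit_code = 1 if failed else 0  # non-zero lets a scheduler skip the download step
print(f"failed={failed} exit_code={exit_code}")
```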

## Command Line Interface

```bash
# Download latest Call Report
ffiec-download --product call-single --format xbrl

# Download specific quarter
ffiec-download --product call-single --quarter 20240331

# Validate website structure
ffiec-validate
```

## Documentation

Full documentation is available at [Read the Docs](https://ffiec-data-collector.readthedocs.io/).

### Building Documentation Locally

```bash
# Install documentation dependencies
pip install -e ".[docs]"

# Build HTML documentation
cd docs
make html

# View documentation (macOS; use xdg-open on Linux)
open _build/html/index.html
```

### Documentation Structure

The documentation includes:
- **Getting Started** - Installation and quick start guide
- **API Reference** - Complete API documentation
- **Examples** - Jupyter notebooks and code examples
- **Development** - Contributing guidelines and development setup

## Examples

See the `examples/` directory for Jupyter notebooks demonstrating:
- Basic downloading workflows
- Bulk data collection
- Building data pipelines
- Processing downloaded data

## API Reference

### FFIECDownloader

Main class for downloading FFIEC data.

**Methods:**
- `download(product, period, format)` - Download specific data
- `download_latest(product, format)` - Download most recent data
- `get_available_products()` - List all products
- `select_product(product)` - Get available periods for product
- `get_bulk_data_sources_cdr()` - Get CDR metadata
- `get_bulk_data_sources_ubpr()` - Get UBPR metadata

### Product Enum

Available data products:
- `Product.CALL_SINGLE` - Call Reports (single period)
- `Product.CALL_FOUR_PERIODS` - Call Reports (four periods)
- `Product.UBPR_RATIO_SINGLE` - UBPR Ratios (single period)
- `Product.UBPR_RATIO_FOUR` - UBPR Ratios (four periods)
- `Product.UBPR_RANK_FOUR` - UBPR Rankings (four periods)
- `Product.UBPR_STATS_FOUR` - UBPR Statistics (four periods)

### FileFormat Enum

Supported file formats:
- `FileFormat.XBRL` - eXtensible Business Reporting Language
- `FileFormat.TSV` - Tab-delimited values

## Requirements

- Python 3.10+
- requests
- python-dateutil

## Development

```bash
# Clone repository
git clone https://github.com/call-report/ffiec-data-collector.git
cd ffiec-data-collector

# Install in development mode
pip install -e ".[dev]"

# Run tests
pytest

# Format code
black ffiec_data_collector/

# Type checking
mypy ffiec_data_collector/
```

## Publishing to PyPI

```bash
# Build distribution packages
python -m build

# Upload to TestPyPI (for testing)
python -m twine upload --repository testpypi dist/*

# Upload to PyPI
python -m twine upload dist/*
```

## License

Mozilla Public License 2.0 - see LICENSE file for details.

## Contributing

Contributions are welcome! Please feel free to submit a Pull Request.

## Support

For issues, questions, or suggestions, please open an issue on GitHub.

## Important Disclaimers

### Government Website Usage Responsibility

**You are responsible for your use of this library with respect to the FFIEC government website.** The FFIEC website has terms of use and acceptable use policies that you must comply with. If you are integrating this code into software applications, you are responsible for implementing appropriate safeguards such as:

- Rate limiting and request throttling
- Circuit breakers to prevent excessive requests
- Monitoring and logging of usage patterns
- Respect for server resources and bandwidth

Failure to use this library responsibly may result in the FFIEC blocking your IP address or imposing other restrictions.
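As one possible starting point for the throttling and circuit-breaker safeguards listed above, the sketch below wraps any zero-argument callable. It enforces a minimum interval between calls and refuses further calls after a run of consecutive failures. `PoliteCaller` and `CircuitOpenError` are hypothetical names, not part of this library, and the default interval and failure limit are arbitrary values to be tuned to your own usage.

```python
import time
from typing import Callable, Optional, TypeVar

T = TypeVar("T")


class CircuitOpenError(RuntimeError):
    """Raised when too many consecutive failures have occurred."""


class PoliteCaller:
    """Space calls out by a minimum interval and stop after repeated failures."""

    def __init__(self, min_interval: float = 2.0, max_failures: int = 3,
                 clock: Callable[[], float] = time.monotonic,
                 sleep: Callable[[float], None] = time.sleep) -> None:
        self.min_interval = min_interval
        self.max_failures = max_failures
        self._clock = clock
        self._sleep = sleep
        self._last_call: Optional[float] = None
        self._failures = 0

    def call(self, fn: Callable[[], T]) -> T:
        # Simplified breaker: once open, it stays open for this object's lifetime.
        if self._failures >= self.max_failures:
            raise CircuitOpenError("too many consecutive failures; not calling")
        if self._last_call is not None:
            wait = self.min_interval - (self._clock() - self._last_call)
            if wait > 0:
                self._sleep(wait)
        self._last_call = self._clock()
        try:
            result = fn()
        except Exception:
            self._failures += 1
            raise
        self._failures = 0  # any success resets the failure streak
        return result


caller = PoliteCaller(min_interval=0.0, max_failures=2)
print(caller.call(lambda: "ok"))
```

A download would then be wrapped as, for example, `caller.call(lambda: downloader.download_cdr_single_period("20240331"))`. The `clock` and `sleep` parameters exist so the timing behaviour can be tested without real delays.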

### Website Structure Dependencies

**This library relies on the current structure of FFIEC web pages, which were not designed for automated access.** When the library runs, it validates that the assumed structure of each web page remains unchanged. If the FFIEC updates its website structure:

- The library will detect structural changes through its thumbprint validation system
- A `WebpageChangeException` will be raised to prevent incorrect operation
- You will need to update to a newer version of this library that supports the new structure

This design ensures the library fails safely rather than producing incorrect results when the website changes.
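The library's actual thumbprint scheme is not described here. As an illustration of the general idea only, the sketch below hashes the set of form controls on a page and raises when that hash no longer matches a recorded value. Everything in it is hypothetical: `thumbprint`, `ensure_unchanged`, the choice of tags to inspect, and `StructureChangedError`, which is a local stand-in for the library's `WebpageChangeException`.

```python
import hashlib
from html.parser import HTMLParser


class StructureChangedError(RuntimeError):
    """Local stand-in for the library's WebpageChangeException."""


class _FormFieldCollector(HTMLParser):
    """Record the tag and name of every form control on the page."""

    def __init__(self) -> None:
        super().__init__()
        self.controls: list[str] = []

    def handle_starttag(self, tag, attrs):
        if tag in {"input", "select", "button"}:
            name = dict(attrs).get("name") or ""
            self.controls.append(f"{tag}:{name}")


def thumbprint(html: str) -> str:
    """Hash the sorted list of form controls, ignoring their values."""
    collector = _FormFieldCollector()
    collector.feed(html)
    joined = "\n".join(sorted(collector.controls))
    return hashlib.sha256(joined.encode("utf-8")).hexdigest()


def ensure_unchanged(html: str, expected: str) -> None:
    """Raise if the page's thumbprint differs from the recorded one."""
    actual = thumbprint(html)
    if actual != expected:
        raise StructureChangedError(f"expected {expected[:12]}, got {actual[:12]}")


# Illustrative fragments, not taken from the FFIEC site.
BASELINE = '<form><select name="Products"></select><input name="Go" value="1"></form>'
SAME_STRUCTURE = '<form><select name="Products"></select><input name="Go" value="2"></form>'
CHANGED = '<form><select name="Reports"></select><input name="Go" value="1"></form>'

expected = thumbprint(BASELINE)
ensure_unchanged(SAME_STRUCTURE, expected)  # passes: only a field value changed
```

Hashing names rather than values keeps the check stable across ordinary content updates while still failing when a control is renamed, added, or removed.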

### General Disclaimer

This library is not affiliated with, endorsed by, or sponsored by the FFIEC. It is an independent tool for accessing publicly available data. Users are solely responsible for ensuring their usage complies with all applicable terms of service, laws, and regulations.

            
