odps-python


| Field | Value |
|-------|-------|
| Name | odps-python |
| Version | 0.1.0 |
| Summary | High-performance Python library for Open Data Product Specification (ODPS) v4.0 with caching, validation, and international standards compliance |
| Upload time | 2025-08-31 08:48:03 |
| Requires Python | >=3.8 |
| License | Apache-2.0 |
| Keywords | open-data, data-product, specification, odps, iso-standards, validation, rfc, e164, iso639, iso3166, iso4217, iso8601, performance, caching, protocols, type-safety |
| Requirements | PyYAML, pycountry, phonenumbers |

# ODPS Python Library

[![PyPI version](https://badge.fury.io/py/odps-python.svg)](https://badge.fury.io/py/odps-python)
[![Python Support](https://img.shields.io/badge/python-3.8%2B-blue.svg)](https://github.com/accenture/odps-python)
[![License: Apache-2.0](https://img.shields.io/badge/License-Apache%202.0-blue.svg)](https://opensource.org/licenses/Apache-2.0)

A comprehensive, high-performance Python library for creating, validating, and manipulating [Open Data Product Specification (ODPS) v4.0](https://opendataproducts.org/v4.0/) documents with full international standards compliance.

## 🚀 Features

### Core Capabilities
- **Complete ODPS v4.0 Support**: Full implementation of the Open Data Product Specification
- **International Standards Compliance**: Validates against ISO, RFC, and ITU-T standards
- **Flexible I/O**: JSON and YAML serialization/deserialization support
- **Type Safety**: Comprehensive type hints and protocol-based duck typing
- **Multilingual Support**: Full support for multilingual field dictionaries

### Performance & Architecture
- **High Performance**: Optimized with validation caching, serialization caching, and `__slots__`
- **Modular Architecture**: Pluggable validation framework and component system
- **Protocol-Based Design**: Duck typing protocols for better type safety
- **Comprehensive Error Handling**: Hierarchical exception system with 20+ specific error types

### Standards Validation
- **ISO 639-1**: Language code validation
- **ISO 3166-1 alpha-2**: Country code validation
- **ISO 4217**: Currency code validation
- **ISO 8601**: Date/time format validation
- **ITU-T E.164**: Phone number format validation
- **RFC 5322**: Email address validation
- **RFC 3986**: URI/URL validation

### Developer Experience
- **Comprehensive Documentation**: Full API documentation and examples
- **IDE Support**: Complete type hints for excellent IntelliSense
- **Detailed Error Messages**: Specific validation errors with context

## Installation

```bash
pip install odps-python

# For full standards validation support:
pip install "odps-python[validation]"

# For development:
pip install "odps-python[dev]"
```

## Quick Start

```python
from odps import OpenDataProduct
from odps.models import ProductDetails, DataAccessMethod, DataHolder, License

# Create a new data product with international standards compliance
product = ProductDetails(
    name="My Weather API",
    product_id="weather-api-v1", 
    visibility="public",
    status="production",
    type="dataset",
    description="Real-time weather data",
    language=["en", "fr"],  # ISO 639-1 language codes
    homepage="https://example.com"  # RFC 3986 compliant URI
)

# Create ODPS document
odp = OpenDataProduct(product)

# Add data access with required default method
default_access = DataAccessMethod(
    name={"en": "REST API", "fr": "API REST"},  # Multilingual support
    output_port_type="API",
    access_url="https://api.example.com/weather",  # RFC 3986 URI
    documentation_url="https://docs.example.com"   # RFC 3986 URI
)
odp.add_data_access(default_access)

# Add data holder with validated contact info
odp.data_holder = DataHolder(
    name="Weather Corp",
    email="contact@example.com",  # RFC 5322 email validation
    phone_number="+12125551234"    # E.164 phone validation
)

# Add license with ISO 8601 date validation
odp.license = License(
    scope_of_use="commercial",
    valid_from="2024-01-01",          # ISO 8601 date
    valid_until="2025-12-31T23:59:59Z"  # ISO 8601 datetime
)

# Comprehensive validation with all standards
try:
    odp.validate()
    print("✓ Document valid with full standards compliance")
except Exception as e:
    print(f"Validation errors: {e}")

# Export
print(odp.to_json())
odp.save("my-product.json")

# Load existing document
loaded = OpenDataProduct.from_file("my-product.json")
```

## Core Components

### ProductDetails (Required)
- `name`: Product name
- `product_id`: Unique identifier  
- `visibility`: public, private, etc.
- `status`: draft, production, etc.
- `type`: dataset, algorithm, etc.

### Optional Components
- **DataContract**: API specifications and data schemas
- **SLA**: Service level agreements  
- **DataQuality**: Quality metrics and rules
- **PricingPlans**: Pricing tiers with ISO 4217 currency validation
- **License**: Usage rights with ISO 8601 date validation
- **DataAccess**: Access methods (requires `default` method per ODPS v4.0)
- **DataHolder**: Contact information with email/phone validation
- **PaymentGateways**: Payment processing details

## Validation Standards

The library enforces all international standards referenced in ODPS v4.0:

| Standard | Used For | Example |
|----------|----------|----------|
| **ISO 639-1** | Language codes | `"en"`, `"fr"`, `"de"` |
| **ISO 3166-1 alpha-2** | Country codes | `"US"`, `"GB"`, `"DE"` |
| **ISO 4217** | Currency codes | `"USD"`, `"EUR"`, `"GBP"` |
| **ISO 8601** | Date/time formats | `"2024-01-01"`, `"2024-01-01T12:00:00Z"` |
| **E.164** | Phone numbers | `"+12125551234"` |
| **RFC 5322** | Email addresses | `"user@example.com"` |
| **RFC 3986** | URIs/URLs | `"https://example.com/api"` |
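
To give a feel for what two of these checks involve, the following standard-library sketch tests the shape of E.164 numbers and ISO 8601 strings. It is not the library's implementation (the Dependencies section indicates that `phonenumbers` and `pycountry` do the real work), and the function names `looks_like_e164` and `looks_like_iso8601` are hypothetical.

```python
import re
from datetime import datetime

# E.164 shape: a leading "+", a non-zero first digit, at most 15 digits in total.
_E164_RE = re.compile(r"\+[1-9]\d{1,14}")


def looks_like_e164(number: str) -> bool:
    """Shape check only; it does not verify that the number is assigned."""
    return _E164_RE.fullmatch(number) is not None


def looks_like_iso8601(value: str) -> bool:
    """Accept ISO 8601 dates and datetimes like the examples in the table."""
    # datetime.fromisoformat() only accepts a trailing "Z" from Python 3.11,
    # so normalise it to an explicit UTC offset first.
    if value.endswith("Z"):
        value = value[:-1] + "+00:00"
    try:
        datetime.fromisoformat(value)
    except ValueError:
        return False
    return True
```

A shape check like this is deliberately weaker than full validation: it accepts any well-formed string, whereas a dedicated package can also reject numbers that no numbering plan allows.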

### Multilingual Support

Fields like `dataAccess.name` and `dataAccess.description` support multilingual dictionaries:

```python
{
    "name": {
        "en": "Weather API",
        "fr": "API Météo",
        "de": "Wetter-API"
    }
}
```

All language keys are validated against ISO 639-1.
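
A minimal sketch of such a key check follows. It uses a tiny hard-coded sample of codes purely for illustration; a real check would consult the complete ISO 639-1 list. The helper name `invalid_language_keys` and the constant `SAMPLE_ISO639_1` are hypothetical and not part of the library.

```python
from typing import Dict, FrozenSet, List

# A tiny illustrative subset; a real check would use the full ISO 639-1 list.
SAMPLE_ISO639_1: FrozenSet[str] = frozenset({"en", "fr", "de", "es", "fi"})


def invalid_language_keys(
    field: Dict[str, str],
    known_codes: FrozenSet[str] = SAMPLE_ISO639_1,
) -> List[str]:
    """Return the keys of a multilingual dict that are not known language codes."""
    return sorted(key for key in field if key not in known_codes)


name = {"en": "Weather API", "fr": "API Météo", "xyz": "?"}
print(invalid_language_keys(name))  # ['xyz']
```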

## ⚡ Performance Features

### Intelligent Caching
The library caches validation and serialization results, so repeated calls on an unchanged document can skip the full processing:

```python
import time
from odps import OpenDataProduct, ProductDetails

# Create a product
details = ProductDetails(
    name="Performance Test",
    product_id="perf-001", 
    visibility="public",
    status="draft",
    type="dataset"
)
product = OpenDataProduct(details)

# First validation - full processing
start = time.perf_counter()
product.validate()
first_time = time.perf_counter() - start

# Second validation - cached result
start = time.perf_counter()
product.validate()
cached_time = time.perf_counter() - start

# Guard against a zero reading on coarse clocks before dividing
if cached_time > 0:
    print(f"Cache speedup: {first_time/cached_time:.1f}x")  # Typically 20-50x faster
```
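
The idea behind validation caching can be sketched without the library: key the cached result on a digest of the document content and re-run the checks only when that digest changes. The class below is a hypothetical illustration under that assumption, not the library's actual cache; `CachedValidator`, `_check`, and `full_runs` are made-up names.

```python
import hashlib
import json
from typing import Any, Dict, List, Optional


class CachedValidator:
    """Re-run validation only when the document content has changed."""

    def __init__(self) -> None:
        self._last_digest: Optional[str] = None
        self._last_errors: List[str] = []
        self.full_runs = 0  # counts how often the expensive path executed

    def _check(self, doc: Dict[str, Any]) -> List[str]:
        # Stand-in for the real, more expensive validation rules.
        self.full_runs += 1
        required = ("name", "product_id", "visibility", "status", "type")
        return [f"missing required field: {key}" for key in required if key not in doc]

    def validate(self, doc: Dict[str, Any]) -> List[str]:
        # A stable digest of the content serves as the cache key.
        digest = hashlib.sha256(
            json.dumps(doc, sort_keys=True).encode("utf-8")
        ).hexdigest()
        if digest != self._last_digest:
            self._last_errors = self._check(doc)
            self._last_digest = digest
        return list(self._last_errors)
```

Keying on content rather than on object identity means an in-place edit to the document invalidates the cache automatically, at the cost of serializing the document on every call.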

### Compliance Assessment
```python
# Comprehensive compliance checking
compliance_level = product.compliance_level  # "minimal", "basic", "substantial", "full"
is_production_ready = product.is_production_ready
validation_errors = product.validation_errors  # No exceptions raised
component_count = product.component_count
```

## 🔧 Advanced Usage

### Custom Validation

```python
from odps.validators import ODPSValidator

# Validate individual components
print(ODPSValidator.validate_iso639_language_code("en"))  # True
print(ODPSValidator.validate_currency_code("USD"))        # True
print(ODPSValidator.validate_email("test@example.com"))   # True
print(ODPSValidator.validate_phone_number("+12125551234"))  # True
print(ODPSValidator.validate_iso8601_date("2024-01-01"))    # True
```

### Loading from Different Formats

```python
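from odps import OpenDataProduct

# From JSON (json_string is a str holding a serialized ODPS document)
odp = OpenDataProduct.from_json(json_string)

# From YAML (yaml_string is a str holding a serialized ODPS document)
odp = OpenDataProduct.from_yaml(yaml_string)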

# From file (auto-detects format)
odp = OpenDataProduct.from_file("product.json")
odp = OpenDataProduct.from_file("product.yaml")
```
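
How `from_file` detects the format is not documented here. One common approach is to dispatch on the file suffix, sketched below with a hypothetical `load_document` helper that returns a plain dict. The YAML branch assumes PyYAML is installed and uses its `yaml.safe_load`.

```python
import json
from pathlib import Path
from typing import Any, Dict


def load_document(path: str) -> Dict[str, Any]:
    """Parse a JSON or YAML file into a plain dict, chosen by file suffix."""
    file_path = Path(path)
    text = file_path.read_text(encoding="utf-8")
    suffix = file_path.suffix.lower()
    if suffix == ".json":
        return json.loads(text)
    if suffix in (".yaml", ".yml"):
        # PyYAML, imported lazily so JSON-only use needs no extra package.
        import yaml
        return yaml.safe_load(text)
    raise ValueError(f"unsupported file extension: {suffix!r}")
```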

## Development

```bash
git clone https://github.com/accenture/odps-python
cd odps-python
pip install -e ".[dev]"
python examples/comprehensive_example.py
```

### Dependencies

The library requires the following packages for full standards compliance:
- `pycountry`: ISO standards validation (languages, countries, currencies)
- `phonenumbers`: E.164 phone number validation
- `PyYAML`: YAML format support

## License

Apache License 2.0 - see LICENSE file for details.

## Contributing

Contributions welcome! Please read CONTRIBUTING.md for guidelines.

## Error Handling

The library provides detailed validation error messages that reference specific standards:

```python
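# `odp` is the OpenDataProduct built in Quick Start. ODPSValidationError
# must be imported first; its module path is not shown in this README.
try: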
    odp.validate()
except ODPSValidationError as e:
    print(e)
    # Output: "Validation errors: Invalid ISO 639-1 language code: 'xyz'; 
    #          dataHolder email must be a valid RFC 5322 email address"
```
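
A hierarchical exception design that reports every problem in one message could look like the sketch below. The class names `SketchODPSError` and `SketchValidationError` and the helper `raise_if_invalid` are hypothetical stand-ins, chosen so they cannot be mistaken for the library's own exception classes.

```python
from typing import List


class SketchODPSError(Exception):
    """Hypothetical base class for all errors in this sketch."""


class SketchValidationError(SketchODPSError):
    """Carries every problem found, not just the first one."""

    def __init__(self, errors: List[str]) -> None:
        self.errors = list(errors)
        super().__init__("Validation errors: " + "; ".join(self.errors))


def raise_if_invalid(errors: List[str]) -> None:
    """Raise a single aggregated error if any problems were collected."""
    if errors:
        raise SketchValidationError(errors)


try:
    raise_if_invalid([
        "Invalid ISO 639-1 language code: 'xyz'",
        "dataHolder email must be a valid RFC 5322 email address",
    ])
except SketchODPSError as exc:  # catching the base class also catches subclasses
    print(exc)
```

Keeping the individual messages on an `errors` attribute lets callers inspect them programmatically instead of parsing the joined string.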

## 🏆 Acknowledgments

We extend our gratitude to the following:

**[Open Data Product Initiative Team](https://opendataproducts.org/)** - Special thanks to the team at opendataproducts.org for their work in creating and maintaining the Open Data Product Specification (ODPS). Their vision of standardizing data product descriptions and enabling better data discovery and interoperability has made this library possible. The ODPS v4.0 specification represents years of collaborative effort from industry experts, data practitioners, and open source contributors who are driving the future of data standardization.

**Python Community** - For the exceptional ecosystem of libraries and tools that power this implementation, including PyYAML, pycountry, phonenumbers, and the countless other packages that make Python development a joy.

**Data Community** - For embracing open standards and driving the need for better data product specifications and tooling that benefits everyone in the data ecosystem.

## 📚 Links & References

- [Open Data Product Specification v4.0](https://opendataproducts.org/v4.0/)
- [ODPS Schema](https://opendataproducts.org/v4.0/schema/)
- [ISO 639-1 Language Codes](https://en.wikipedia.org/wiki/List_of_ISO_639-1_codes)
- [ISO 3166-1 Country Codes](https://en.wikipedia.org/wiki/ISO_3166-1_alpha-2)
- [ISO 4217 Currency Codes](https://en.wikipedia.org/wiki/ISO_4217)
- [ISO 8601 Date/Time Format](https://en.wikipedia.org/wiki/ISO_8601)
- [E.164 Phone Number Format](https://en.wikipedia.org/wiki/E.164)
- [RFC 5322 Email Format](https://datatracker.ietf.org/doc/html/rfc5322)
- [RFC 3986 URI Format](https://datatracker.ietf.org/doc/html/rfc3986)

            

Raw data

{
    "_id": null,
    "home_page": null,
    "name": "odps-python",
    "maintainer": null,
    "docs_url": null,
    "requires_python": ">=3.8",
    "maintainer_email": null,
    "keywords": "open-data, data-product, specification, odps, iso-standards, validation, rfc, e164, iso639, iso3166, iso4217, iso8601, performance, caching, protocols, type-safety",
    "author": null,
    "author_email": "Chris Howard <chris.howard@accenture.com>",
    "download_url": null,
    "platform": null,
    "bugtrack_url": null,
    "license": "Apache-2.0",
    "summary": "High-performance Python library for Open Data Product Specification (ODPS) v4.0 with caching, validation, and international standards compliance",
    "version": "0.1.0",
    "project_urls": {
        "Bug Tracker": "https://github.com/accenture/odps-python/issues",
        "Documentation": "https://github.com/accenture/odps-python#readme",
        "Homepage": "https://github.com/accenture/odps-python",
        "Repository": "https://github.com/accenture/odps-python"
    },
    "split_keywords": [
        "open-data",
        " data-product",
        " specification",
        " odps",
        " iso-standards",
        " validation",
        " rfc",
        " e164",
        " iso639",
        " iso3166",
        " iso4217",
        " iso8601",
        " performance",
        " caching",
        " protocols",
        " type-safety"
    ],
    "urls": [
        {
            "comment_text": null,
            "digests": {
                "blake2b_256": "2f2685a2f2a0ecdc2c4859aff9baf92d255ea3fa5a35bec9645d43707dc0ac04",
                "md5": "a3f038812646fda82dae327a1efb6af8",
                "sha256": "0fdf5858b94d8643a1dbecfff131a42d081fd84a215acfd1509740ef5a0fec28"
            },
            "downloads": -1,
            "filename": "odps_python-0.1.0-py3-none-any.whl",
            "has_sig": false,
            "md5_digest": "a3f038812646fda82dae327a1efb6af8",
            "packagetype": "bdist_wheel",
            "python_version": "py3",
            "requires_python": ">=3.8",
            "size": 35640,
            "upload_time": "2025-08-31T08:48:03",
            "upload_time_iso_8601": "2025-08-31T08:48:03.606484Z",
            "url": "https://files.pythonhosted.org/packages/2f/26/85a2f2a0ecdc2c4859aff9baf92d255ea3fa5a35bec9645d43707dc0ac04/odps_python-0.1.0-py3-none-any.whl",
            "yanked": false,
            "yanked_reason": null
        }
    ],
    "upload_time": "2025-08-31 08:48:03",
    "github": true,
    "gitlab": false,
    "bitbucket": false,
    "codeberg": false,
    "github_user": "accenture",
    "github_project": "odps-python",
    "travis_ci": false,
    "coveralls": false,
    "github_actions": false,
    "requirements": [
        {
            "name": "PyYAML",
            "specs": [
                [
                    ">=",
                    "6.0.2"
                ]
            ]
        },
        {
            "name": "pycountry",
            "specs": [
                [
                    ">=",
                    "24.6.1"
                ]
            ]
        },
        {
            "name": "phonenumbers",
            "specs": [
                [
                    ">=",
                    "9.0.11"
                ]
            ]
        }
    ],
    "lcname": "odps-python"
}
        