ddex-parser

Name: ddex-parser
Version: 0.4.0
Summary: High-performance DDEX XML parser for Python
Upload time: 2025-09-15 01:26:54
Requires Python: >=3.8
License: MIT
Keywords: ddex, xml, parser, music, metadata, ern
Requirements: No requirements were recorded.

# DDEX Parser - Python Bindings

[![PyPI version](https://img.shields.io/pypi/v/ddex-parser.svg)](https://pypi.org/project/ddex-parser/)
[![Python versions](https://img.shields.io/pypi/pyversions/ddex-parser.svg)](https://pypi.org/project/ddex-parser/)
[![Downloads](https://img.shields.io/pypi/dm/ddex-parser.svg)](https://pypi.org/project/ddex-parser/)
[![License: MIT](https://img.shields.io/badge/License-MIT-yellow.svg)](https://opensource.org/licenses/MIT)

High-performance DDEX XML parser for Python with built-in security features and comprehensive metadata extraction. In the benchmarks below, it parses DDEX files 10x to 32x faster than lxml and xml.etree, with support for all DDEX versions (3.2, 3.3, 4.0+) and profiles.

## Installation

```bash
pip install ddex-parser
```

## Security Notice
**v0.4.0 fixes a critical security vulnerability (RUSTSEC-2025-0020).**
All users should upgrade to v0.4.0 or later, which brings PyO3 0.24 compatibility and the associated security fix.
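
To confirm that an environment already has the fixed release, the installed version can be compared against 0.4.0. The following is a minimal sketch using only the standard library; the helper names `version_tuple`, `is_patched`, and `installed_is_patched` are illustrative and not part of ddex-parser, and pre-release suffixes such as `rc1` are ignored for simplicity.

```python
from importlib.metadata import PackageNotFoundError, version
from itertools import takewhile


def version_tuple(text: str) -> tuple:
    """Turn '0.4.0' into (0, 4, 0), keeping only the leading digits of each part."""
    parts = []
    for piece in text.split(".")[:3]:
        digits = "".join(takewhile(str.isdigit, piece))
        parts.append(int(digits) if digits else 0)
    while len(parts) < 3:  # pad '0.4' to (0, 4, 0)
        parts.append(0)
    return tuple(parts)


def is_patched(installed: str, minimum: str = "0.4.0") -> bool:
    """Return True when the installed version is at least the fixed release."""
    return version_tuple(installed) >= version_tuple(minimum)


def installed_is_patched(package: str = "ddex-parser") -> bool:
    """Check the version recorded in the current environment, if any."""
    try:
        return is_patched(version(package))
    except PackageNotFoundError:
        return False  # not installed, so nothing patched is available


print("ddex-parser patched:", installed_is_patched())
```

If this prints `False`, running `pip install --upgrade ddex-parser` should bring in the fixed release.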

## Quick Start

```python
from ddex_parser import DDEXParser

# Parse DDEX file
parser = DDEXParser()
result = parser.parse_file("release.xml")

# Access parsed data
print(f"Release: {result.release_title}")
print(f"Artist: {result.main_artist}")
print(f"Tracks: {len(result.tracks)}")

# Convert to DataFrame for analysis (requires pandas)
tracks_df = result.to_dataframe()
print(tracks_df.head())
```

## Features

### 🚀 High Performance
- **10x faster** than standard XML parsers
- Streaming support for large files (>100MB)
- Memory-efficient processing
- Native Rust implementation with Python bindings

### 🔒 Security Built-in
- XXE (XML External Entity) attack protection
- Entity expansion limits
- Memory-bounded parsing
- Deep nesting protection

### 📊 Data Science Ready
- Direct pandas DataFrame export
- Structured metadata extraction
- JSON serialization support
- Type hints for better IDE experience

### 🎵 Music Industry Focused
- Support for all DDEX versions (3.2, 3.3, 4.0+)
- Release, track, and artist metadata
- Rights and usage information
- Territory and deal terms
- Image and audio resource handling

## API Reference

### DDEXParser

```python
from ddex_parser import DDEXParser

parser = DDEXParser(
    max_entity_expansions=1000,  # Limit entity expansions for security
    max_depth=100,               # Maximum XML nesting depth
    streaming=True               # Enable streaming for large files
)
```
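
To illustrate what a nesting limit such as `max_depth` guards against, here is a small standard-library sketch that tracks element depth while streaming through a document and stops once a limit is exceeded. It is only an illustration of the idea, not ddex-parser's implementation; `check_depth` and `DepthLimitExceeded` are hypothetical names.

```python
import io
import xml.etree.ElementTree as ET


class DepthLimitExceeded(ValueError):
    """Raised when an XML document nests deeper than the allowed limit."""


def check_depth(xml_bytes: bytes, max_depth: int = 100) -> int:
    """Stream through the document and return its deepest nesting level.

    Raises DepthLimitExceeded as soon as the limit is crossed, so a
    pathologically nested input is rejected before it is fully read.
    """
    depth = 0
    deepest = 0
    events = ET.iterparse(io.BytesIO(xml_bytes), events=("start", "end"))
    for event, elem in events:
        if event == "start":
            depth += 1
            deepest = max(deepest, depth)
            if depth > max_depth:
                raise DepthLimitExceeded(
                    f"nesting depth {depth} exceeds limit {max_depth}"
                )
        else:
            depth -= 1
            elem.clear()  # release children we no longer need
    return deepest


print(check_depth(b"<a><b><c/></b></a>"))  # three nested elements
```

Checking the depth on each `start` event is what keeps the work bounded: the check fires at the first element past the limit rather than after the whole tree is built.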

### Parsing Methods

#### `parse_file(path: str) -> DDEXResult`

Parse a DDEX XML file from disk.

```python
result = parser.parse_file("path/to/release.xml")
```

#### `parse_string(xml: str) -> DDEXResult`

Parse DDEX XML from a string.

```python
with open("release.xml", "r", encoding="utf-8") as f:
    xml_content = f.read()
result = parser.parse_string(xml_content)
```

#### `parse_async(path: str) -> Awaitable[DDEXResult]`

Asynchronous parsing for non-blocking operations.

```python
import asyncio

async def parse_ddex():
    result = await parser.parse_async("release.xml")
    return result

# Usage
result = asyncio.run(parse_ddex())
```
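
When several files need parsing, the awaitable returned by `parse_async` can be combined with `asyncio.gather`. The sketch below assumes only the `parse_async(path)` signature documented above; `parse_many` and its `limit` parameter are hypothetical helpers, and whether the calls actually overlap depends on how the library schedules the work.

```python
import asyncio
from typing import Any, List, Sequence


async def parse_many(parser: Any, paths: Sequence[str], limit: int = 4) -> List[Any]:
    """Parse several files, running at most `limit` parses at a time.

    `parser` is any object exposing `async parse_async(path)`.
    Results come back in the same order as `paths`.
    """
    semaphore = asyncio.Semaphore(limit)

    async def parse_one(path: str) -> Any:
        async with semaphore:
            return await parser.parse_async(path)

    return await asyncio.gather(*(parse_one(path) for path in paths))
```

With a real parser this might be called as `results = asyncio.run(parse_many(parser, ["a.xml", "b.xml"]))`. The semaphore keeps a large batch from opening every file at once.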

## DataFrame Integration

Parsed results can be exported to pandas DataFrames for data analysis workflows:

```python
import pandas as pd
from ddex_parser import DDEXParser

parser = DDEXParser()
result = parser.parse_file("catalog.xml")

# Get tracks as DataFrame
tracks_df = result.to_dataframe("tracks")
print(tracks_df.columns)
# ['track_id', 'title', 'artist', 'duration', 'isrc', 'genre', ...]

# Analyze your catalog
genre_counts = tracks_df['genre'].value_counts()
avg_duration = tracks_df['duration'].mean()

# Export for further analysis
# (to_parquet needs a Parquet engine such as pyarrow or fastparquet installed)
tracks_df.to_csv("catalog_analysis.csv")
tracks_df.to_parquet("catalog_analysis.parquet")
```

## Performance Benchmarks

Performance comparison on a MacBook Pro M2:

| File Size | ddex-parser | lxml  | xml.etree | Speedup (vs. lxml / vs. xml.etree) |
|-----------|-------------|-------|-----------|------------------------------------|
| 10KB      | 0.8ms       | 8ms   | 12ms      | 10x / 15x                          |
| 100KB     | 3ms         | 45ms  | 78ms      | 15x / 26x                          |
| 1MB       | 28ms        | 380ms | 650ms     | 14x / 23x                          |
| 10MB      | 180ms       | 3.2s  | 5.8s      | 18x / 32x                          |

Memory usage is consistently 60-80% lower than traditional parsers.
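
Figures like these vary with hardware and input, so it can be worth timing your own files. The following is a rough sketch of a timing harness using only the standard library; `median_ms` and `speedup` are hypothetical helper names, and the synthetic `sample` document stands in for a real DDEX file. Any callable that accepts bytes can be passed as `parse`, for example a wrapper around `parser.parse_string`.

```python
import statistics
import time
import xml.etree.ElementTree as ET
from typing import Callable


def median_ms(parse: Callable[[bytes], object], data: bytes, repeats: int = 20) -> float:
    """Run `parse(data)` several times and return the median time in milliseconds."""
    samples = []
    for _ in range(repeats):
        start = time.perf_counter()
        parse(data)
        samples.append((time.perf_counter() - start) * 1000.0)
    return statistics.median(samples)


def speedup(baseline_ms: float, candidate_ms: float) -> float:
    """How many times faster the candidate is than the baseline."""
    return baseline_ms / candidate_ms


# Synthetic input; replace with the bytes of a real DDEX file.
sample = b"<root>" + b"<item>x</item>" * 1000 + b"</root>"
baseline = median_ms(ET.fromstring, sample)
print(f"xml.etree median: {baseline:.3f} ms")
```

The speedup column in the table is this same ratio, for instance 8ms / 0.8ms = 10x for the 10KB row. The median is used rather than the mean so that a single slow run does not skew the result.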

## Integration with ddex-builder

Round-trip compatibility with ddex-builder for complete workflows:

```python
from ddex_parser import DDEXParser
from ddex_builder import DDEXBuilder

# Parse existing DDEX file
parser = DDEXParser()
original = parser.parse_file("input.xml")

# Modify data
modified_data = original.to_dict()
modified_data['tracks'][0]['title'] = "New Title"

# Build new DDEX file
builder = DDEXBuilder()
new_xml = builder.build_from_dict(modified_data)

# Verify round-trip integrity
new_result = parser.parse_string(new_xml)
assert new_result.tracks[0].title == "New Title"
```
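
A single assertion checks only the field that was changed. To see everything that differs after a round trip, the dictionaries from before and after can be compared recursively. This is a sketch that assumes `to_dict()` returns plain nested dicts and lists, as the example above suggests; `diff_paths` is a hypothetical helper, not part of either library.

```python
from typing import Any, List


def diff_paths(before: Any, after: Any, prefix: str = "") -> List[str]:
    """Return the paths (e.g. 'tracks[0].title') whose values differ."""
    if isinstance(before, dict) and isinstance(after, dict):
        changed: List[str] = []
        for key in sorted(set(before) | set(after)):
            path = f"{prefix}.{key}" if prefix else str(key)
            if key not in before or key not in after:
                changed.append(path)  # key added or removed
            else:
                changed.extend(diff_paths(before[key], after[key], path))
        return changed
    if isinstance(before, list) and isinstance(after, list) and len(before) == len(after):
        changed = []
        for index, (old, new) in enumerate(zip(before, after)):
            changed.extend(diff_paths(old, new, f"{prefix}[{index}]"))
        return changed
    # Scalars, mismatched types, or lists of different length
    return [] if before == after else [prefix]


original_data = {"tracks": [{"title": "Old Title", "isrc": "XX0000000001"}]}
rebuilt_data = {"tracks": [{"title": "New Title", "isrc": "XX0000000001"}]}
print(diff_paths(original_data, rebuilt_data))
```

In a round-trip test, the expected result would be a list containing only the paths that were modified on purpose.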

## Requirements
- Python 3.8+
- pandas (optional, for DataFrame support)
- PyO3 0.24 compatible runtime

## License

This project is licensed under the MIT License - see the [LICENSE](https://github.com/daddykev/ddex-suite/blob/main/LICENSE) file for details.

## Related Projects

- **[ddex-builder](https://pypi.org/project/ddex-builder/)** - Build deterministic DDEX XML files
- **[ddex-parser (npm)](https://www.npmjs.com/package/ddex-parser)** - JavaScript/TypeScript bindings
- **[DDEX Suite](https://ddex-suite.org)** - Complete DDEX processing toolkit

---

Built for the music industry. Powered by Rust for maximum performance and safety.

            

Raw data

            {
    "_id": null,
    "home_page": null,
    "name": "ddex-parser",
    "maintainer": null,
    "docs_url": null,
    "requires_python": ">=3.8",
    "maintainer_email": null,
    "keywords": "ddex, xml, parser, music, metadata, ern",
    "author": null,
    "author_email": "Kevin Marques Moo <daddykev@gmail.com>",
    "download_url": "https://files.pythonhosted.org/packages/72/41/4f02e6f4748dce2a47844ded430e90bb3ed5fcd29ae7646fe296f3662a8c/ddex_parser-0.4.0.tar.gz",
    "platform": null,
    "bugtrack_url": null,
    "license": "MIT",
    "summary": "High-performance DDEX XML parser for Python",
    "version": "0.4.0",
    "project_urls": null,
    "split_keywords": [
        "ddex",
        " xml",
        " parser",
        " music",
        " metadata",
        " ern"
    ],
    "urls": [
        {
            "comment_text": null,
            "digests": {
                "blake2b_256": "ae1af7f1629f5173de53251f927287a23022d5d83cf06a0f56c647e5f0d173ad",
                "md5": "4f8bde36de6e12cef84c531024b41852",
                "sha256": "9a4aca023fd028cf9acd7093be9860df08aecf918f44c421718c518b170fe7ac"
            },
            "downloads": -1,
            "filename": "ddex_parser-0.4.0-cp38-abi3-macosx_11_0_arm64.whl",
            "has_sig": false,
            "md5_digest": "4f8bde36de6e12cef84c531024b41852",
            "packagetype": "bdist_wheel",
            "python_version": "cp38",
            "requires_python": ">=3.8",
            "size": 457526,
            "upload_time": "2025-09-15T01:26:52",
            "upload_time_iso_8601": "2025-09-15T01:26:52.424112Z",
            "url": "https://files.pythonhosted.org/packages/ae/1a/f7f1629f5173de53251f927287a23022d5d83cf06a0f56c647e5f0d173ad/ddex_parser-0.4.0-cp38-abi3-macosx_11_0_arm64.whl",
            "yanked": false,
            "yanked_reason": null
        },
        {
            "comment_text": null,
            "digests": {
                "blake2b_256": "72414f02e6f4748dce2a47844ded430e90bb3ed5fcd29ae7646fe296f3662a8c",
                "md5": "016457b53a6110083bf2c05451c5099a",
                "sha256": "473970df6e4d9f595af3b41d1ecf76e00ac23fa428f87b40d2cc666eada3503c"
            },
            "downloads": -1,
            "filename": "ddex_parser-0.4.0.tar.gz",
            "has_sig": false,
            "md5_digest": "016457b53a6110083bf2c05451c5099a",
            "packagetype": "sdist",
            "python_version": "source",
            "requires_python": ">=3.8",
            "size": 301645,
            "upload_time": "2025-09-15T01:26:54",
            "upload_time_iso_8601": "2025-09-15T01:26:54.361742Z",
            "url": "https://files.pythonhosted.org/packages/72/41/4f02e6f4748dce2a47844ded430e90bb3ed5fcd29ae7646fe296f3662a8c/ddex_parser-0.4.0.tar.gz",
            "yanked": false,
            "yanked_reason": null
        }
    ],
    "upload_time": "2025-09-15 01:26:54",
    "github": false,
    "gitlab": false,
    "bitbucket": false,
    "codeberg": false,
    "lcname": "ddex-parser"
}
        