oxidize


- **Name**: oxidize
- **Version**: 0.7.0
- **home_page**: None
- **Summary**: High-performance data processing tools for Python, built with Rust
- **upload_time**: 2025-09-07 01:03:44
- **maintainer**: None
- **docs_url**: None
- **author**: Eric Aleman
- **requires_python**: >=3.9
- **license**: None
- **keywords**: data-processing, rust, performance, tools
- **requirements**: No requirements were recorded.

# Oxidize

High-performance data processing tools for Python, built with Rust.

## Philosophy

- **Best of both worlds**: Python interfaces with Rust backends for both simplicity and performance
- **True parallelism**: GIL release for concurrent processing
- **Easy installation**: Pre-built wheels, no compilation required
- **Practical**: Specialized solutions for common data engineering tasks

## Tools

### [oxidize-postal](https://github.com/ericaleman/oxidize-postal)
oxidize-postal is an alternative to pypostal, the Python bindings for the libpostal library, which provides address parsing and normalization with international support.

oxidize-postal provides the same address parsing capabilities as pypostal but addresses key limitations: it installs without C compilation, releases the Python GIL for true parallel processing, and offers a cleaner API. It is built in Rust on top of the libpostal-rust bindings to the libpostal C library.

```python
import oxidize_postal

parsed = oxidize_postal.parse_address("781 Franklin Ave Brooklyn NY 11216")
# {'house_number': '781', 'road': 'franklin ave', 'city': 'brooklyn', 'state': 'ny', 'postcode': '11216'}

expansions = oxidize_postal.expand_address("123 Main St NYC NY")
# ['123 main street nyc new york', '123 main street nyc ny', ...]
```
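
Because the GIL is released during parsing, a plain thread pool should be enough to spread a batch of addresses across cores. The following is a minimal sketch under that assumption, not part of the oxidize-postal API: `parse_many` and `fake_parse` are hypothetical names introduced here, and `fake_parse` is only a stand-in so the example runs without the package installed.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Callable, Iterable, List, TypeVar

T = TypeVar("T")


def parse_many(
    addresses: Iterable[str],
    parse_fn: Callable[[str], T],
    max_workers: int = 4,
) -> List[T]:
    """Apply parse_fn to every address on a thread pool, preserving input order."""
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        # Executor.map yields results in the same order as the inputs.
        return list(pool.map(parse_fn, addresses))


def fake_parse(address: str) -> dict:
    """Hypothetical stand-in parser so the sketch runs without oxidize_postal."""
    return {"raw": address.lower()}


results = parse_many(["781 Franklin Ave", "123 Main St"], fake_parse)
```

In real use, `oxidize_postal.parse_address` would be passed as `parse_fn` in place of the stand-in. Threads only help if the underlying call actually releases the GIL, so it is worth measuring the speedup on your own data.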

### [oxidize-xml](https://github.com/ericaleman/oxidize-xml)
oxidize-xml is an alternative to lxml and provides streaming XML to JSON conversion for large files.

oxidize-xml is more specialized and opinionated than lxml, focusing on a common data engineering workflow: extracting repeated elements from large XML files such as API responses, log files, and data exports. It is built particularly for engineers and analysts working in DuckDB or Polars.

```python
import oxidize_xml

# Extract repeated elements to JSON Lines
count = oxidize_xml.parse_xml_file_to_json_file("data.xml", "book", "output.jsonl")

# Stream processing for large files
json_lines = oxidize_xml.parse_xml_file_to_json_string("export.xml", "record")
```
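
Once the repeated elements are written out as JSON Lines, the file can be consumed one record at a time. The sketch below uses only the standard library and assumes each non-empty line of the output is a single JSON value. `iter_json_lines` is a hypothetical helper introduced here, not part of oxidize-xml.

```python
import json
from pathlib import Path
from typing import Any, Iterator, Union


def iter_json_lines(path: Union[str, Path]) -> Iterator[Any]:
    """Yield one parsed value per non-empty line of a JSON Lines file."""
    with open(path, "r", encoding="utf-8") as handle:
        for line in handle:
            line = line.strip()
            if line:  # skip blank lines
                yield json.loads(line)
```

For example, `for record in iter_json_lines("output.jsonl"): ...` walks the file without loading it all into memory. For analytical work, a reader such as `polars.read_ndjson` is likely a better fit than this loop.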

## Future Tools

New versions of oxidize-xml and oxidize-postal, plus new packages, are coming soon.

## License

MIT License for all tools.

            

Raw data

            {
    "_id": null,
    "home_page": null,
    "name": "oxidize",
    "maintainer": null,
    "docs_url": null,
    "requires_python": ">=3.9",
    "maintainer_email": null,
    "keywords": "data-processing, rust, performance, tools",
    "author": "Eric Aleman",
    "author_email": null,
    "download_url": "https://files.pythonhosted.org/packages/29/da/bf69cdb542d3fd324f934445c462456a5faaa9a6b22ac27006308aead2af/oxidize-0.7.0.tar.gz",
    "platform": null,
    "description": "# Oxidize\n\nHigh-performance data processing tools for Python, built with Rust.\n\n## Philosophy\n\n- **Best of both worlds**: Python interfaces with Rust backends for both simplicity and performance\n- **True parallelism**: GIL release for concurrent processing\n- **Easy installation**: Pre-built wheels, no compilation required\n- **Practical**: Specialized solutions for common data engineering tasks\n\n## Tools\n\n### [oxidize-postal](https://github.com/ericaleman/oxidize-postal)\noxidize-postal is an alternative to pypostal for Python bindings of the libpostal library, which provides address parsing and normalization with international support. \n\noxidize-postal provides the same address parsing capabilities as pypostal but addresses key limitations: it installs without C compilation, releases the Python GIL for true parallel processing, and offers a cleaner API. Built using Rust and libpostal-rust bindings to the libpostal C library.\n\n```python\nimport oxidize_postal\n\nparsed = oxidize_postal.parse_address(\"781 Franklin Ave Brooklyn NY 11216\")\n# {'house_number': '781', 'road': 'franklin ave', 'city': 'brooklyn', 'state': 'ny', 'postcode': '11216'}\n\nexpansions = oxidize_postal.expand_address(\"123 Main St NYC NY\")\n# ['123 main street nyc new york', '123 main street nyc ny', ...]\n```\n\n### [oxidize-xml](https://github.com/ericaleman/oxidize-xml)\noxidize-xml is an alternative to lxml and provides streaming XML to JSON conversion for large files.\n\noxidize-xml is more specialized and opiniated, focusing on common data engineering workflows for extracting repeated elements from large XML files like API responses, log files, and data exports, is particularly built for engineers and analysts working in DuckDB or Polars.\n\n```python\nimport oxidize_xml\n\n# Extract repeated elements to JSON Lines\ncount = oxidize_xml.parse_xml_file_to_json_file(\"data.xml\", \"book\", \"output.jsonl\")\n\n# Stream processing for large 
files\njson_lines = oxidize_xml.parse_xml_file_to_json_string(\"export.xml\", \"record\")\n```\n\n## Future Tools\n\nNew versions to oxidize-xml / oxidize-postal plus new packages coming soon.\n\n## License\n\nMIT License for all tools.\n",
    "bugtrack_url": null,
    "license": null,
    "summary": "High-performance data processing tools for Python, built with Rust",
    "version": "0.7.0",
    "project_urls": {
        "Repository": "https://github.com/ericaleman/oxidize"
    },
    "split_keywords": [
        "data-processing",
        " rust",
        " performance",
        " tools"
    ],
    "urls": [
        {
            "comment_text": null,
            "digests": {
                "blake2b_256": "f9d6a583972b15f234f7c12ecef8e4c85ad509ac25359b0614995f0cc2ccb4a4",
                "md5": "26bd47c43637f149b8aeb88555cdd766",
                "sha256": "148d7297013e3c843b2c4f0f0ee41d91788c0809bd15009ca8c4ea8646c81709"
            },
            "downloads": -1,
            "filename": "oxidize-0.7.0-py3-none-any.whl",
            "has_sig": false,
            "md5_digest": "26bd47c43637f149b8aeb88555cdd766",
            "packagetype": "bdist_wheel",
            "python_version": "py3",
            "requires_python": ">=3.9",
            "size": 2912,
            "upload_time": "2025-09-07T01:03:43",
            "upload_time_iso_8601": "2025-09-07T01:03:43.374477Z",
            "url": "https://files.pythonhosted.org/packages/f9/d6/a583972b15f234f7c12ecef8e4c85ad509ac25359b0614995f0cc2ccb4a4/oxidize-0.7.0-py3-none-any.whl",
            "yanked": false,
            "yanked_reason": null
        },
        {
            "comment_text": null,
            "digests": {
                "blake2b_256": "29dabf69cdb542d3fd324f934445c462456a5faaa9a6b22ac27006308aead2af",
                "md5": "bcb871ef478c6c64ac481a7bc672c0a2",
                "sha256": "b3a446265641720d93af86941ef5af8274ce732b061454e129b7c368f9e98d28"
            },
            "downloads": -1,
            "filename": "oxidize-0.7.0.tar.gz",
            "has_sig": false,
            "md5_digest": "bcb871ef478c6c64ac481a7bc672c0a2",
            "packagetype": "sdist",
            "python_version": "source",
            "requires_python": ">=3.9",
            "size": 2755,
            "upload_time": "2025-09-07T01:03:44",
            "upload_time_iso_8601": "2025-09-07T01:03:44.426027Z",
            "url": "https://files.pythonhosted.org/packages/29/da/bf69cdb542d3fd324f934445c462456a5faaa9a6b22ac27006308aead2af/oxidize-0.7.0.tar.gz",
            "yanked": false,
            "yanked_reason": null
        }
    ],
    "upload_time": "2025-09-07 01:03:44",
    "github": true,
    "gitlab": false,
    "bitbucket": false,
    "codeberg": false,
    "github_user": "ericaleman",
    "github_project": "oxidize",
    "travis_ci": false,
    "coveralls": false,
    "github_actions": false,
    "lcname": "oxidize"
}
        