- Name: piah
- Version: 0.1.1
- Summary: automatically parse PDF's and texts to dataclasses
- Upload time: 2024-04-01 04:48:41
- Requires Python: >=3.10
- Keywords: ai, dataclass, parser, piah

# piah

[![PyPI - Version](https://img.shields.io/pypi/v/piah.svg)](https://pypi.org/project/piah)
[![PyPI - Python Version](https://img.shields.io/pypi/pyversions/piah.svg)](https://pypi.org/project/piah)

-----

Piah automatically parses data from PDFs or plain text based only on the [dataclass](https://docs.python.org/3/library/dataclasses.html#module-dataclasses) that you provide, and returns an instance of that [dataclass](https://docs.python.org/3/library/dataclasses.html#module-dataclasses) filled with the extracted values.
Piah is based on [OxyParser](https://github.com/oxylabs/OxyParser/).

**Table of Contents**

- [Installation](#installation)
- [Example](#example)
- [TODO](#todo)
- [Known Issues](#know-issues)
- [License](#license)

## Installation

```console
pip install piah
```

## Example
```python
from piah import Piah
from dataclasses import dataclass

@dataclass
class Person:
    name: str
    age: int

parser = Piah("gpt-3.5-turbo")
result = parser.parse("Hello, I am Python and I am 33 years old", Person)
```
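
Because the dataclass is the only schema passed to `parse`, its field names and type annotations are all the parser has to describe what should be extracted. The sketch below is not piah's implementation; it only uses the standard library to show what can be read from a dataclass such as `Person`. The helper name `describe_fields` is hypothetical.

```python
from dataclasses import dataclass, fields


@dataclass
class Person:
    name: str
    age: int


def describe_fields(cls) -> dict[str, str]:
    """Map each dataclass field name to the name of its annotated type."""
    return {
        f.name: getattr(f.type, "__name__", str(f.type))
        for f in fields(cls)
    }


print(describe_fields(Person))  # {'name': 'str', 'age': 'int'}
```

Descriptive field names matter for the same reason: a field called `age` says more about what to extract than one called `x`.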
To parse PDFs:
```python
from pathlib import Path

result = parser.parse("example.pdf", Person)
# or
result = parser.parse(Path("example.pdf"), Person)
```
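
The same `parse` method accepts both raw text and a PDF path, and this README does not say how piah tells the two apart. A caller who wants to be sure a string is treated as a file can check it first. The following is a minimal sketch under that assumption; `is_pdf_path` is a hypothetical helper, not part of piah.

```python
from pathlib import Path
from typing import Union


def is_pdf_path(source: Union[str, Path]) -> bool:
    """Return True if `source` names an existing file with a .pdf suffix."""
    path = Path(source)
    try:
        return path.suffix.lower() == ".pdf" and path.is_file()
    except (OSError, ValueError):
        # Long free text or strings with null bytes are not usable as paths.
        return False
```

With this check, a mistyped filename such as `"exmaple.pdf"` can be reported as a missing file before it is sent to the model.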

## TODO
- [ ] Write docstrings
- [ ] Improve allowed types
- [ ] Improve system prompt

## Know Issues
The tests for `piah` do not pass on every run, because the LLM does not always
parse large PDFs correctly.
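
Since results can vary between runs, one way to cope is to validate each result and retry when it looks wrong. How a failed parse shows up is not specified here (an exception, missing values, and wrong types are all possible), so the type check below is an assumption, and both helper names are hypothetical.

```python
from dataclasses import fields, is_dataclass
from typing import Callable, TypeVar

T = TypeVar("T")


def has_expected_types(obj) -> bool:
    """Check that each field annotated with a plain class holds that type."""
    return is_dataclass(obj) and all(
        isinstance(getattr(obj, f.name), f.type)
        for f in fields(obj)
        # Skip generics such as list[int]; isinstance() cannot check them.
        if type(f.type) is type
    )


def parse_with_retry(parse: Callable[[], T], attempts: int = 3) -> T:
    """Call `parse` until its result passes the check or attempts run out."""
    last_error = None
    for _ in range(attempts):
        try:
            result = parse()
        except Exception as exc:  # the underlying LLM call may fail outright
            last_error = exc
            continue
        if has_expected_types(result):
            return result
    raise RuntimeError(f"no valid result after {attempts} attempts") from last_error
```

With the `parser` and `Person` from the example above, this could be used as `parse_with_retry(lambda: parser.parse("example.pdf", Person))`. Each retry is another model call, so `attempts` should stay small.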

## License

`piah` is distributed under the terms of the [MIT](https://spdx.org/licenses/MIT.html) license.

            

Raw data

{
    "_id": null,
    "home_page": null,
    "name": "piah",
    "maintainer": null,
    "docs_url": null,
    "requires_python": ">=3.10",
    "maintainer_email": null,
    "keywords": "ai, dataclass, parser, piah",
    "author": null,
    "author_email": "fabiobarkoski <fabiobarkoskii@gmail.com>",
    "download_url": "https://files.pythonhosted.org/packages/3d/5d/cc169eb8984e8dd1170e4bcfd112485f6fb27dd6512df1d8c478fbb0db21/piah-0.1.1.tar.gz",
    "platform": null,
    "bugtrack_url": null,
    "license": null,
    "summary": "automatically parse PDF's and texts to dataclasses",
    "version": "0.1.1",
    "project_urls": {
        "Documentation": "https://github.com/fabiobarkoski/piah#readme",
        "Issues": "https://github.com/fabiobarkoski/piah/issues",
        "Source": "https://github.com/fabiobarkoski/piah"
    },
    "split_keywords": [
        "ai",
        " dataclass",
        " parser",
        " piah"
    ],
    "urls": [
        {
            "comment_text": null,
            "digests": {
                "blake2b_256": "f79e6b1ec6ad4014955b2bdbc0eefe5319e648bec0f389ecea552edfec3c4535",
                "md5": "fe09aa485e1ddf2826ba16791615e819",
                "sha256": "fa22eb8570337eb6dd498401ec96604983e985d4d0396ade0df2e9096f4bc7bb"
            },
            "downloads": -1,
            "filename": "piah-0.1.1-py3-none-any.whl",
            "has_sig": false,
            "md5_digest": "fe09aa485e1ddf2826ba16791615e819",
            "packagetype": "bdist_wheel",
            "python_version": "py3",
            "requires_python": ">=3.10",
            "size": 5351,
            "upload_time": "2024-04-01T04:48:42",
            "upload_time_iso_8601": "2024-04-01T04:48:42.674575Z",
            "url": "https://files.pythonhosted.org/packages/f7/9e/6b1ec6ad4014955b2bdbc0eefe5319e648bec0f389ecea552edfec3c4535/piah-0.1.1-py3-none-any.whl",
            "yanked": false,
            "yanked_reason": null
        },
        {
            "comment_text": null,
            "digests": {
                "blake2b_256": "3d5dcc169eb8984e8dd1170e4bcfd112485f6fb27dd6512df1d8c478fbb0db21",
                "md5": "d4e91b48a07cb0e3f4b062446fbdaf36",
                "sha256": "02fae02474a8990ecb7ff9d1b5c705268978ebd18b00d49d5be3b9c80d4c7cb4"
            },
            "downloads": -1,
            "filename": "piah-0.1.1.tar.gz",
            "has_sig": false,
            "md5_digest": "d4e91b48a07cb0e3f4b062446fbdaf36",
            "packagetype": "sdist",
            "python_version": "source",
            "requires_python": ">=3.10",
            "size": 53492,
            "upload_time": "2024-04-01T04:48:41",
            "upload_time_iso_8601": "2024-04-01T04:48:41.470108Z",
            "url": "https://files.pythonhosted.org/packages/3d/5d/cc169eb8984e8dd1170e4bcfd112485f6fb27dd6512df1d8c478fbb0db21/piah-0.1.1.tar.gz",
            "yanked": false,
            "yanked_reason": null
        }
    ],
    "upload_time": "2024-04-01 04:48:41",
    "github": true,
    "gitlab": false,
    "bitbucket": false,
    "codeberg": false,
    "github_user": "fabiobarkoski",
    "github_project": "piah#readme",
    "travis_ci": false,
    "coveralls": false,
    "github_actions": false,
    "lcname": "piah"
}
        