Name | piah |
Version | 0.1.1 |
home_page | None |
Summary | automatically parse PDF's and texts to dataclasses |
upload_time | 2024-04-01 04:48:41 |
maintainer | None |
docs_url | None |
author | None |
requires_python | >=3.10 |
license | None |
keywords | ai, dataclass, parser, piah |
VCS | |
bugtrack_url | |
requirements | No requirements were recorded. |
Travis-CI | No Travis. |
coveralls test coverage | No coveralls. |
# piah
[![PyPI - Version](https://img.shields.io/pypi/v/piah.svg)](https://pypi.org/project/piah)
[![PyPI - Python Version](https://img.shields.io/pypi/pyversions/piah.svg)](https://pypi.org/project/piah)
-----
Piah automatically parses data from PDFs or plain text based only on the [dataclass](https://docs.python.org/3/library/dataclasses.html#module-dataclasses) you provide, and returns an instance of that same [dataclass](https://docs.python.org/3/library/dataclasses.html#module-dataclasses) filled in with the extracted values.
Piah is based on [OxyParser](https://github.com/oxylabs/OxyParser/).
**Table of Contents**
- [Installation](#installation)
- [Example](#example)
- [TODO](#todo)
- [Known Issues](#known-issues)
- [License](#license)
## Installation
```console
pip install piah
```
## Example
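```python
from dataclasses import dataclass

from piah import Piah


@dataclass
class Person:
    name: str
    age: int


# The argument is the name of the model used for extraction.
parser = Piah("gpt-3.5-turbo")
# Pass the text to parse and the dataclass to fill in.
result = parser.parse("Hello Iam python and I have 33 years old", Person)
```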
To parse a PDF, pass its path as a string or as a `Path`:
```python
from pathlib import Path

result = parser.parse("example.pdf", Person)
# or
result = parser.parse(Path("example.pdf"), Person)
```
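Parsing "based only on the dataclass" is possible because a dataclass exposes its field names and annotations at runtime. The sketch below is not taken from piah's source; it only illustrates, with the standard library's `dataclasses.fields`, how a description of the expected data could be derived from a class such as `Person`. The helper name `describe_fields` is hypothetical.

```python
from dataclasses import dataclass, fields


@dataclass
class Person:
    name: str
    age: int


def describe_fields(cls) -> dict[str, str]:
    """Map each dataclass field name to a readable type name.

    Hypothetical helper for illustration; not part of piah's API.
    """
    described = {}
    for field in fields(cls):
        # field.type holds the annotation: usually a class such as str,
        # or a plain string when annotations are postponed.
        described[field.name] = getattr(field.type, "__name__", str(field.type))
    return described


print(describe_fields(Person))  # {'name': 'str', 'age': 'int'}
```

A mapping like this could, for instance, be embedded in a prompt so the model knows which values to look for.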
## TODO
- [ ] Write docstrings
- [ ] Improve allowed types
- [ ] Improve system prompt
## Known Issues
It seems that `piah` does not pass its tests on every run, because the LLM does not always parse large PDFs correctly.
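Since results can vary between runs, a caller may want to check a result before trusting it. The sketch below assumes only what this README states, namely that `parse` returns an instance of the dataclass it was given. How piah reports a failed parse is not documented here, so the sketch simply checks field types and retries. `has_expected_types`, `parse_with_retry`, and `flaky_parse` are hypothetical names; `flaky_parse` stands in for `parser.parse` so the example runs without a model.

```python
from dataclasses import dataclass, fields, is_dataclass
from typing import Any, Callable


def has_expected_types(obj: Any) -> bool:
    """Return True if every checkable field holds its annotated type."""
    if not is_dataclass(obj) or isinstance(obj, type):
        return False
    for field in fields(obj):
        expected = field.type
        if not isinstance(expected, type):
            continue  # string annotations and the like are skipped
        try:
            if not isinstance(getattr(obj, field.name), expected):
                return False
        except TypeError:
            continue  # annotation cannot be used with isinstance; skipped
    return True


def parse_with_retry(parse: Callable[[Any, type], Any], source: Any,
                     cls: type, attempts: int = 3) -> Any:
    """Call parse(source, cls) until the result passes the type check."""
    for _ in range(attempts):
        result = parse(source, cls)
        if isinstance(result, cls) and has_expected_types(result):
            return result
    raise ValueError(f"no valid {cls.__name__} after {attempts} attempts")


@dataclass
class Person:
    name: str
    age: int


calls = []


def flaky_parse(source, cls):
    """Stand-in parser: the first call returns a wrongly typed age."""
    calls.append(source)
    if len(calls) == 1:
        return cls(name="python", age="33")
    return cls(name="python", age=33)


person = parse_with_retry(flaky_parse, "Hello Iam python and I have 33 years old", Person)
print(person, len(calls))
```

With piah itself the call would presumably look like `parse_with_retry(parser.parse, "example.pdf", Person)`, assuming `parse` takes the arguments shown in the Example section. Each retry repeats the whole parse, so a low `attempts` value keeps the cost bounded.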
## License
`piah` is distributed under the terms of the [MIT](https://spdx.org/licenses/MIT.html) license.
Raw data
{
"_id": null,
"home_page": null,
"name": "piah",
"maintainer": null,
"docs_url": null,
"requires_python": ">=3.10",
"maintainer_email": null,
"keywords": "ai, dataclass, parser, piah",
"author": null,
"author_email": "fabiobarkoski <fabiobarkoskii@gmail.com>",
"download_url": "https://files.pythonhosted.org/packages/3d/5d/cc169eb8984e8dd1170e4bcfd112485f6fb27dd6512df1d8c478fbb0db21/piah-0.1.1.tar.gz",
"platform": null,
"bugtrack_url": null,
"license": null,
"summary": "automatically parse PDF's and texts to dataclasses",
"version": "0.1.1",
"project_urls": {
"Documentation": "https://github.com/fabiobarkoski/piah#readme",
"Issues": "https://github.com/fabiobarkoski/piah/issues",
"Source": "https://github.com/fabiobarkoski/piah"
},
"split_keywords": [
"ai",
" dataclass",
" parser",
" piah"
],
"urls": [
{
"comment_text": null,
"digests": {
"blake2b_256": "f79e6b1ec6ad4014955b2bdbc0eefe5319e648bec0f389ecea552edfec3c4535",
"md5": "fe09aa485e1ddf2826ba16791615e819",
"sha256": "fa22eb8570337eb6dd498401ec96604983e985d4d0396ade0df2e9096f4bc7bb"
},
"downloads": -1,
"filename": "piah-0.1.1-py3-none-any.whl",
"has_sig": false,
"md5_digest": "fe09aa485e1ddf2826ba16791615e819",
"packagetype": "bdist_wheel",
"python_version": "py3",
"requires_python": ">=3.10",
"size": 5351,
"upload_time": "2024-04-01T04:48:42",
"upload_time_iso_8601": "2024-04-01T04:48:42.674575Z",
"url": "https://files.pythonhosted.org/packages/f7/9e/6b1ec6ad4014955b2bdbc0eefe5319e648bec0f389ecea552edfec3c4535/piah-0.1.1-py3-none-any.whl",
"yanked": false,
"yanked_reason": null
},
{
"comment_text": null,
"digests": {
"blake2b_256": "3d5dcc169eb8984e8dd1170e4bcfd112485f6fb27dd6512df1d8c478fbb0db21",
"md5": "d4e91b48a07cb0e3f4b062446fbdaf36",
"sha256": "02fae02474a8990ecb7ff9d1b5c705268978ebd18b00d49d5be3b9c80d4c7cb4"
},
"downloads": -1,
"filename": "piah-0.1.1.tar.gz",
"has_sig": false,
"md5_digest": "d4e91b48a07cb0e3f4b062446fbdaf36",
"packagetype": "sdist",
"python_version": "source",
"requires_python": ">=3.10",
"size": 53492,
"upload_time": "2024-04-01T04:48:41",
"upload_time_iso_8601": "2024-04-01T04:48:41.470108Z",
"url": "https://files.pythonhosted.org/packages/3d/5d/cc169eb8984e8dd1170e4bcfd112485f6fb27dd6512df1d8c478fbb0db21/piah-0.1.1.tar.gz",
"yanked": false,
"yanked_reason": null
}
],
"upload_time": "2024-04-01 04:48:41",
"github": true,
"gitlab": false,
"bitbucket": false,
"codeberg": false,
"github_user": "fabiobarkoski",
"github_project": "piah#readme",
"travis_ci": false,
"coveralls": false,
"github_actions": false,
"lcname": "piah"
}