| Field | Value |
| --- | --- |
| Name | yadt |
| Version | 0.1.2 |
| home_page | None |
| Summary | Yet Another Document Translator |
| upload_time | 2025-02-03 16:20:00 |
| maintainer | None |
| docs_url | None |
| author | None |
| requires_python | <3.13,>=3.10 |
| license | None |
| keywords | pdf |
| VCS | |
| bugtrack_url | |
| requirements | No requirements were recorded. |
| Travis-CI | No Travis. |
| coveralls test coverage | No coveralls. |
<!-- # Yet Another Document Translator -->
<div align="center">
<img src="./docs/images/banner.png" width="320px" alt="YADT"/>
<h2 id="title">Yet Another Document Translator</h2>
<p>
<!-- PyPI -->
<a href="https://pypi.org/project/yadt/">
<img src="https://img.shields.io/pypi/v/yadt"></a>
<a href="https://pepy.tech/projects/yadt">
<img src="https://static.pepy.tech/badge/yadt"></a>
<!-- <a href="https://github.com/funstory-ai/yadt/pulls">
<img src="https://img.shields.io/badge/contributions-welcome-green"></a> -->
<!-- License -->
<a href="./LICENSE">
<img src="https://img.shields.io/github/license/funstory-ai/yadt"></a>
<a href="https://t.me/+Z9_SgnxmsmA5NzBl">
<img src="https://img.shields.io/badge/Telegram-2CA5E0?style=flat-squeare&logo=telegram&logoColor=white"></a>
</p>
</div>
PDF scientific paper translation and bilingual comparison library.
- Provides a simple [command line interface](#getting-started).
- Provides a [Python API](#python-api).
- Mainly designed to be embedded into other programs, but can also be used directly for simple translation tasks.
## Preview
<div align="center">
<img src="./docs/images/preview.gif" width="80%"/>
</div>
## Getting Started
### Install from PyPI
We recommend using the Tool feature of [uv](https://github.com/astral-sh/uv) to install yadt.
1. Follow the [uv installation](https://github.com/astral-sh/uv#installation) guide to install uv, and set up the `PATH` environment variable as prompted.
2. Use the following command to install yadt:
```bash
uv tool install --python 3.12 yadt
yadt --help
```
3. Use the `yadt` command. For example:
```bash
yadt --bing --files example.pdf
# multiple files
yadt --bing --files example1.pdf --files example2.pdf
```
### Install from Source
We still recommend using [uv](https://github.com/astral-sh/uv) to manage virtual environments.
1. Follow the [uv installation](https://github.com/astral-sh/uv#installation) guide to install uv, and set up the `PATH` environment variable as prompted.
2. Use the following command to install yadt:
```bash
# clone the project
git clone https://github.com/funstory-ai/yadt
# enter the project directory
cd yadt
# install dependencies and run yadt
uv run yadt --help
```
3. Use the `uv run yadt` command. For example:
```bash
uv run yadt --bing --files example.pdf
# multiple files
uv run yadt --bing --files example.pdf --files example2.pdf
```
> [!TIP]
> Using absolute paths for input files is recommended.
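
When the CLI is driven from another program, one way to follow this tip is to resolve every input path before building the command. The sketch below is only an illustration: `build_yadt_command` is a hypothetical helper, not part of yadt, and it uses nothing beyond the `--bing` and `--files` flags shown above.

```python
from pathlib import Path


def build_yadt_command(pdf_paths, service_flag="--bing"):
    """Build an argv list for the yadt CLI with absolute input paths.

    Hypothetical helper for illustration; not part of yadt itself.
    """
    command = ["yadt", service_flag]
    for pdf_path in pdf_paths:
        # resolve() turns a relative path into an absolute one
        command += ["--files", str(Path(pdf_path).resolve())]
    return command
```

The resulting list can be passed to `subprocess.run(command, check=True)`, assuming `yadt` is installed and on `PATH`.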
## Advanced Options
### Language Options
- `--lang-in`, `-li`: Source language code (default: en)
- `--lang-out`, `-lo`: Target language code (default: zh)
> [!TIP]
> Currently, this project mainly focuses on English-to-Chinese translation, and other scenarios have not been tested yet.
### PDF Processing Options
- `--files`: One or more file paths to input PDF documents.
- `--pages`, `-p`: Specify pages to translate (e.g., "1,2,1-,-3,3-5"). If not set, translate all pages
- `--split-short-lines`: Force split short lines into different paragraphs (may cause poor typesetting & bugs)
- `--short-line-split-factor`: Split threshold factor (default: 0.8). The actual threshold is the median length of all lines on the current page \* this factor
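
To make the `--pages` syntax and the split threshold concrete, here is a minimal Python sketch. It is not yadt's own parser: the function names are hypothetical, and the range semantics are an assumption (`N-` runs to the last page, `-N` starts at page 1, and both ends of `A-B` are inclusive). How yadt measures a line's "length" is also not stated here, so the second function simply takes a list of numbers.

```python
from statistics import median


def parse_pages(spec, page_count):
    """Expand a page spec such as "1,2,1-,-3,3-5" into sorted 1-based page numbers.

    Assumed semantics (not confirmed by the yadt docs): "N-" runs to the last
    page, "-N" starts at page 1, and both ends of "A-B" are inclusive.
    """
    pages = set()
    for part in spec.split(","):
        part = part.strip()
        if not part:
            continue
        if "-" in part:
            start_text, end_text = part.split("-", 1)
            start = int(start_text) if start_text else 1
            end = int(end_text) if end_text else page_count
        else:
            start = end = int(part)
        # Clamp to the document; an empty range contributes nothing.
        pages.update(range(max(start, 1), min(end, page_count) + 1))
    return sorted(pages)


def short_line_threshold(line_lengths, factor=0.8):
    """Median line length on the page multiplied by the split factor."""
    return median(line_lengths) * factor
```

With the default factor of 0.8, a page whose median line length is 20 gets a threshold of about 16, so shorter lines would be candidates for splitting.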
### Translation Service Options
- `--qps`: QPS (Queries Per Second) limit for translation service (default: 4)
- `--ignore-cache`: Ignore translation cache and force retranslation
- `--no-dual`: Do not output bilingual PDF files
- `--no-mono`: Do not output monolingual PDF files
- `--openai`: Use OpenAI for translation (default: False)
- `--bing`: Use Bing for translation (default: False)
- `--google`: Use Google Translate for translation (default: False)
> [!TIP]
>
> 1. You must specify one translation service among `--openai`, `--bing`, `--google`.
> 2. Models that are strongly compatible with the OpenAI API are recommended, such as `glm-4-flash`, `deepseek-chat`, etc.
> 3. The project has not yet been optimized for traditional translation engines such as Bing or Google, so using an LLM is recommended.
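
As an illustration of what a QPS limit means in practice, the sketch below spaces calls at least `1 / qps` seconds apart. It is a generic single-threaded limiter written for this note, not yadt's implementation; the class name and the injectable `clock` and `sleep` parameters are assumptions made to keep the example testable.

```python
import time


class QpsLimiter:
    """Space out calls so that at most `qps` of them start per second."""

    def __init__(self, qps, clock=time.monotonic, sleep=time.sleep):
        if qps <= 0:
            raise ValueError("qps must be positive")
        self.min_interval = 1.0 / qps
        self._clock = clock
        self._sleep = sleep
        self._next_allowed = None  # no call has been made yet

    def wait(self):
        """Block until the next call is allowed, then reserve its slot."""
        now = self._clock()
        if self._next_allowed is not None and now < self._next_allowed:
            self._sleep(self._next_allowed - now)
            now = self._next_allowed
        self._next_allowed = now + self.min_interval
```

Calling `limiter.wait()` before each translation request with `QpsLimiter(4)` would keep requests roughly 0.25 seconds apart, matching the default `--qps` value.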
### OpenAI Specific Options
- `--openai-model`: OpenAI model to use (default: gpt-4o-mini)
- `--openai-base-url`: Base URL for OpenAI API
- `--openai-api-key`: API key for OpenAI service
### Output Control
- `--output`, `-o`: Output directory for translated files. If not set, use same directory as input
- `--debug`, `-d`: Enable debug logging level and export detailed intermediate results in `~/.cache/yadt/working`.
### Configuration File
- `--config`, `-c`: Configuration file path. Use the TOML format.
Example Configuration:
```toml
[yadt]
debug = true
lang-in = "en-US"
lang-out = "zh-CN"
qps = 20
# this is a comment
# pages = 4
openai = true
openai-model = "SOME_ALSOME_MODEL"
openai-base-url = "https://example.example/v1"
openai-api-key = "[KEY]"
# All other options can also be set in the configuration file.
```
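
For readers who want to inspect such a file from Python, the sketch below loads the `[yadt]` table and maps each key to a CLI-style flag. How yadt itself reads the file is not documented here; the mapping assumes that option names match the CLI flags, as the example suggests, and `config_to_cli_args` is a hypothetical helper.

```python
try:
    import tomllib  # standard library from Python 3.11
except ModuleNotFoundError:
    import tomli as tomllib  # third-party backport with the same API for Python 3.10

EXAMPLE = """
[yadt]
debug = true
lang-in = "en-US"
lang-out = "zh-CN"
qps = 20
openai = true
"""


def config_to_cli_args(toml_text):
    """Turn the [yadt] table into a list of CLI-style arguments."""
    options = tomllib.loads(toml_text).get("yadt", {})
    args = []
    for key, value in options.items():
        flag = f"--{key}"
        if isinstance(value, bool):
            # Boolean options become bare flags; false means "leave it out".
            if value:
                args.append(flag)
        else:
            args += [flag, str(value)]
    return args
```

Printing `config_to_cli_args(EXAMPLE)` is a quick way to check that a configuration file says what you expect before running a long translation job.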
## Python API
You can refer to the example in [main.py](https://github.com/funstory-ai/yadt/blob/main/yadt/main.py) to use YADT's Python API.
Please note:
1. Make sure all font files described in `main.download_font_assets` exist.
2. The current `TranslationConfig` does not fully validate input parameters, so you need to ensure they are valid yourself.
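
One way to act on the second note is to check values before handing them to the API. The sketch below is a generic pre-flight check and makes no calls into yadt: the function name is hypothetical, and the parameter names simply mirror the CLI options above, so they may not match the actual `TranslationConfig` arguments.

```python
from pathlib import Path


def check_translation_inputs(input_file, lang_in, lang_out, qps):
    """Return a list of problems found; an empty list means the inputs look sane."""
    problems = []
    path = Path(input_file)
    if path.suffix.lower() != ".pdf":
        problems.append(f"not a PDF path: {input_file}")
    elif not path.is_file():
        problems.append(f"file not found: {input_file}")
    if not lang_in or not lang_out:
        problems.append("language codes must be non-empty")
    if not isinstance(qps, int) or qps <= 0:
        problems.append("qps must be a positive integer")
    return problems
```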
## Background
Many projects and teams are working to make document editing and translation easier, such as:
- [mathpix](https://mathpix.com/)
- [Doc2X](https://doc2x.noedgeai.com/)
- [minerU](https://github.com/opendatalab/MinerU)
- [PDFMathTranslate](https://github.com/Byaidu/PDFMathTranslate)
There are also solutions that address specific parts of the problem, such as:
- [layoutreader](https://github.com/microsoft/unilm/tree/master/layoutreader): the reading order of text blocks in a PDF
- [Surya](https://github.com/surya-is/surya): the structure of a PDF
This project hopes to promote a standard pipeline and interface to solve the problem.
In fact, a PDF parser or translator has two main stages:
- **Parsing**: extracting the structure of the PDF, such as text blocks, images, tables, etc.
- **Rendering**: rendering that structure into a new PDF or another format.
A service like mathpix parses the PDF into a structure, possibly in an XML format, and then renders it in a single-column reading order, as [layoutreader](https://github.com/microsoft/unilm/tree/master/layoutreader) does. The bad news is that the original structure is lost.
Some people use Adobe PDF Parser because it generates a Word document that keeps the original structure, but it is somewhat expensive.
Also, a PDF or Word document is not a good format for reading on mobile devices.
We offer an intermediate representation of the parser's results that can be rendered into a new PDF or another format. The pipeline is also plugin-based, so anyone can add their own model, OCR, renderer, etc.
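
The idea of a shared intermediate representation with pluggable stages can be sketched in a few lines of Python. Everything below is hypothetical and written only to illustrate the design; `TextBlock`, `Document`, `run_pipeline`, and the stand-in stages are not YADT's actual classes or functions.

```python
from dataclasses import dataclass, field
from typing import Callable


@dataclass
class TextBlock:
    """One block of text with its position on the page (hypothetical IR node)."""
    page: int
    bbox: tuple[float, float, float, float]
    text: str


@dataclass
class Document:
    """Intermediate representation shared by every stage."""
    blocks: list[TextBlock] = field(default_factory=list)


# A plugin is any callable that takes a Document and returns a Document.
Stage = Callable[[Document], Document]


def run_pipeline(document: Document, stages: list[Stage]) -> Document:
    """Apply each stage in order, passing the result to the next one."""
    for stage in stages:
        document = stage(document)
    return document


def uppercase_translator(document: Document) -> Document:
    """Stand-in for a translation plugin: keeps layout, changes only the text."""
    return Document([TextBlock(b.page, b.bbox, b.text.upper()) for b in document.blocks])


def render_plain_text(document: Document) -> str:
    """Stand-in renderer: emits the text in block order, tagged with its page."""
    return "\n".join(f"[p{b.page}] {b.text}" for b in document.blocks)
```

Because every stage reads and returns the same structure, a translation stage can change text while leaving positions untouched, and a different renderer can be swapped in without re-parsing the PDF.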
## Roadmap
- [ ] Add line support
- [ ] Add table support
- [ ] Add cross-page/cross-column paragraph support
- [ ] More advanced typesetting features
- [ ] Outline support
- [ ] ...
Our goal for version 1.0 is to complete a translation of [PDF Reference, Version 1.7](https://opensource.adobe.com/dc-acrobat-sdk-docs/pdfstandards/pdfreference1.7old.pdf) into the following languages:
- Simplified Chinese
- Traditional Chinese
- Japanese
- Spanish
And meet the following requirements:
- layout error less than 1%
- content loss less than 1%
## Known Issues
1. Parsing errors occur in the author and reference sections; they get merged into one paragraph after translation.
2. Lines are not supported.
3. Drop caps are not supported.
## How to Contribute
We encourage you to contribute to YADT! Please check out the [CONTRIBUTING](https://github.com/funstory-ai/yadt/blob/main/docs/CONTRIBUTING.md) guide.
Everyone interacting in YADT and its sub-projects' codebases, issue trackers, chat rooms, and mailing lists is expected to follow the YADT [Code of Conduct](https://github.com/funstory-ai/yadt/blob/main/docs/CODE_OF_CONDUCT.md).
## Acknowledgements
- [PDFMathTranslate](https://github.com/Byaidu/PDFMathTranslate)
- [DocLayout-YOLO](https://github.com/opendatalab/DocLayout-YOLO)
- [pdfminer](https://github.com/pdfminer/pdfminer.six)
- [PyMuPDF](https://github.com/pymupdf/PyMuPDF)
<h2 id="star_hist">Star History</h2>
<a href="https://star-history.com/#funstory-ai/yadt&Date">
<picture>
<source media="(prefers-color-scheme: dark)" srcset="https://api.star-history.com/svg?repos=funstory-ai/yadt&type=Date&theme=dark" />
<source media="(prefers-color-scheme: light)" srcset="https://api.star-history.com/svg?repos=funstory-ai/yadt&type=Date" />
<img alt="Star History Chart" src="https://api.star-history.com/svg?repos=funstory-ai/yadt&type=Date"/>
</picture>
</a>
Raw data
{
"_id": null,
"home_page": null,
"name": "yadt",
"maintainer": null,
"docs_url": null,
"requires_python": "<3.13,>=3.10",
"maintainer_email": null,
"keywords": "PDF",
"author": null,
"author_email": null,
"download_url": "https://files.pythonhosted.org/packages/d7/e1/8b98c1146388ee2b06e5b02acdb77a2b7535b42e9bbd5052a336c9fc7df6/yadt-0.1.2.tar.gz",
"platform": null,
"bugtrack_url": null,
"license": null,
"summary": "Yet Another Document Translator",
"version": "0.1.2",
"project_urls": {
"Homepage": "https://github.com/funstory-ai/yadt",
"Issues": "https://github.com/funstory-ai/yadt/issues"
},
"split_keywords": [
"pdf"
],
"urls": [
{
"comment_text": null,
"digests": {
"blake2b_256": "331bc67be468e0980ce1b7c8f30e5cc3ca6f2b027851ec5b722f6421d46b7a5a",
"md5": "4f58bdb14d4cacd3f2f87a325f4eec88",
"sha256": "d35acd26cf61b4829782462477f1d0682c31320b236a099fbe5dca35629c056a"
},
"downloads": -1,
"filename": "yadt-0.1.2-py3-none-any.whl",
"has_sig": false,
"md5_digest": "4f58bdb14d4cacd3f2f87a325f4eec88",
"packagetype": "bdist_wheel",
"python_version": "py3",
"requires_python": "<3.13,>=3.10",
"size": 87504,
"upload_time": "2025-02-03T16:19:57",
"upload_time_iso_8601": "2025-02-03T16:19:57.451202Z",
"url": "https://files.pythonhosted.org/packages/33/1b/c67be468e0980ce1b7c8f30e5cc3ca6f2b027851ec5b722f6421d46b7a5a/yadt-0.1.2-py3-none-any.whl",
"yanked": false,
"yanked_reason": null
},
{
"comment_text": null,
"digests": {
"blake2b_256": "d7e18b98c1146388ee2b06e5b02acdb77a2b7535b42e9bbd5052a336c9fc7df6",
"md5": "5fb88f687bac1d1973d4ad7c6a56fd5b",
"sha256": "bab0d12e0c1b05bd3ef059d4c1fa86fd7fc2762d5d9133a0657bc668d0f7fdc8"
},
"downloads": -1,
"filename": "yadt-0.1.2.tar.gz",
"has_sig": false,
"md5_digest": "5fb88f687bac1d1973d4ad7c6a56fd5b",
"packagetype": "sdist",
"python_version": "source",
"requires_python": "<3.13,>=3.10",
"size": 4306616,
"upload_time": "2025-02-03T16:20:00",
"upload_time_iso_8601": "2025-02-03T16:20:00.601097Z",
"url": "https://files.pythonhosted.org/packages/d7/e1/8b98c1146388ee2b06e5b02acdb77a2b7535b42e9bbd5052a336c9fc7df6/yadt-0.1.2.tar.gz",
"yanked": false,
"yanked_reason": null
}
],
"upload_time": "2025-02-03 16:20:00",
"github": true,
"gitlab": false,
"bitbucket": false,
"codeberg": false,
"github_user": "funstory-ai",
"github_project": "yadt",
"travis_ci": false,
"coveralls": false,
"github_actions": true,
"lcname": "yadt"
}