# Python GPT-4 PO File Translator
This command-line tool translates `.po` files using OpenAI's GPT models. The default model is `gpt-3.5-turbo-0125`; pass `--model` to select another, such as GPT-4. It supports bulk and entry-by-entry translation modes, handles fuzzy entries, and batches requests for larger projects, so it works across `.po` files of varying structure and size.
## Features
- **Bulk and Individual Translation Modes**: Allows efficient bulk translation or precise, entry-by-entry translations for nuanced content.
- **Detailed Language Option (`--detail-lang`)**: Accepts full language names (e.g., "Dutch, German") alongside shortcodes (e.g., `nl, de`), so that translation prompts name the target language explicitly.
- **Configurable Batch Size**: Set the number of entries to translate per batch during bulk translation, optimizing API usage.
- **Fuzzy Entry Management**: Automatically removes fuzzy flags and entries, ensuring only valid translations are processed.
- **Language Inference from Folder Structure**: Infers the target language from the folder structure, reducing the need for explicit language specifications.
- **Translation Validation and Retry Logic**: Built-in mechanisms validate translations and automatically retry to avoid incorrect or verbose translations.
- **Logging for Transparency**: Detailed logging for monitoring, debugging, and ensuring progress throughout the translation process.
- **OpenAI API Key Management**: Supports environment variables or command-line arguments for securely providing OpenAI API credentials.
- **Retry Mechanism for Failed Translations**: Retries failed translations up to three times, reducing incomplete or incorrect outputs.
- **Post-Processing for Concise Translations**: Automatically reviews translations to ensure they are concise and free of unnecessary explanations or repetitions.
## Requirements
- Python 3.x
- `polib` library (for `.po` file handling)
- `openai` Python package (for integration with OpenAI GPT models)
- `tenacity` library (for retry mechanisms)
- `python-dotenv` (for managing environment variables)
## Installation
### Via PyPI
Install the `gpt-po-translator` package directly from PyPI:
```bash
pip install gpt-po-translator
```
### Manual Installation
For manual installation or working with the latest code from the repository:
1. Clone the repository:
```bash
git clone https://github.com/pescheckit/python-gpt-po
```
2. Navigate to the cloned directory and install the package:
```bash
cd python-gpt-po
pip install .
```
## API Key Configuration
`gpt-po-translator` supports two methods for providing OpenAI API credentials:
1. **Environment Variable**: Set your OpenAI API key as an environment variable named `OPENAI_API_KEY`. This method is recommended for security and ease of API key management.
```bash
export OPENAI_API_KEY='your_api_key_here'
```
2. **Command-Line Argument**: Pass the API key as a command-line argument using the `--api_key` option.
```bash
gpt-po-translator --folder ./locales --lang de,fr --api_key 'your_api_key_here' --bulk --bulksize 100 --folder-language
```
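Keep your API key out of repositories and other public spaces. A key passed with `--api_key` can also end up in your shell history, which is another reason to prefer the environment variable.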
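The two methods above can be combined into a single lookup. The following is a minimal sketch, not the tool's actual code: the function name `resolve_api_key` is hypothetical, and the precedence (an explicit `--api_key` value wins over `OPENAI_API_KEY`) is an assumption.

```python
import os
from typing import Mapping, Optional


def resolve_api_key(cli_value: Optional[str],
                    environ: Mapping[str, str] = os.environ) -> str:
    """Return the key given on the command line, else the one in OPENAI_API_KEY."""
    # Assumption: an explicit command-line value takes precedence over the environment.
    key = cli_value or environ.get("OPENAI_API_KEY")
    if not key:
        raise ValueError("No API key found: set OPENAI_API_KEY or pass --api_key")
    return key
```

Passing the environment as a parameter keeps the function easy to test without touching the real process environment.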
## Usage
Use `gpt-po-translator` as a command-line tool for translating `.po` files:
```bash
gpt-po-translator --folder [path_to_po_files] --lang [language_codes] [--api_key [your_openai_api_key]] [--model [model_name]] [--fuzzy] [--bulk] [--bulksize [batch_size]] [--folder-language] [--detail-lang [full_language_names]]
```
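For readers who want to see how such a command line maps onto parsed values, here is a rough `argparse` sketch of the documented flags. It is an illustration under stated assumptions, not the package's real parser: the function name `build_parser` is hypothetical, treating `--folder` and `--lang` as required is inferred from the synopsis, and the defaults are the ones listed under Command-Line Options.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    """Build a parser mirroring the flags documented in this README."""
    parser = argparse.ArgumentParser(prog="gpt-po-translator")
    # Assumption: the synopsis shows these two without brackets, so they are required here.
    parser.add_argument("--folder", required=True, help="input folder containing .po files")
    parser.add_argument("--lang", required=True, help="comma-separated language codes, e.g. de,fr")
    parser.add_argument("--detail-lang", help="full language names, in the same order as --lang")
    parser.add_argument("--fuzzy", action="store_true", help="remove fuzzy entries first")
    parser.add_argument("--bulk", action="store_true", help="enable bulk translation mode")
    parser.add_argument("--bulksize", type=int, default=50, help="entries per bulk batch")
    parser.add_argument("--model", default="gpt-3.5-turbo-0125", help="OpenAI model name")
    parser.add_argument("--api_key", help="OpenAI API key (else read from the environment)")
    parser.add_argument("--folder-language", action="store_true",
                        help="infer the target language from the folder structure")
    return parser


args = build_parser().parse_args(["--folder", "./locales", "--lang", "de,fr", "--bulk"])
print(args.bulksize, args.model, args.folder_language)
```

Note that `argparse` converts the hyphenated flags to attribute names with underscores, such as `args.detail_lang` and `args.folder_language`.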
### Example
```bash
gpt-po-translator --folder ./locales --lang de,fr --api_key 'your_api_key_here' --bulk --bulksize 40 --folder-language --detail-lang "German,French"
```
This command translates `.po` files in the `./locales` folder to German and French, using the provided OpenAI API key and processing 40 translations per batch in bulk mode. It also infers the language from the folder structure.
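To make the batch size concrete, the sketch below splits a list of entries into groups of at most 40, as `--bulksize 40` would request. The helper name `batched` and the sample message IDs are hypothetical; how the tool itself slices entries internally is not documented here.

```python
from typing import Iterator, List, Sequence


def batched(entries: Sequence[str], batch_size: int) -> Iterator[List[str]]:
    """Yield consecutive slices of `entries`, each at most `batch_size` long."""
    if batch_size < 1:
        raise ValueError("batch_size must be at least 1")
    for start in range(0, len(entries), batch_size):
        yield list(entries[start:start + batch_size])


# 100 untranslated entries with a batch size of 40 produce three requests.
msgids = [f"message {i}" for i in range(100)]
batches = list(batched(msgids, 40))
print([len(batch) for batch in batches])  # [40, 40, 20]
```

A larger batch size means fewer API calls, while a smaller one limits how much work is lost if a single request fails.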
### Command-Line Options
- `--folder`: Specifies the input folder containing `.po` files.
- `--lang`: Comma-separated language codes to filter `.po` files (e.g., `de,fr`).
- `--detail-lang`: Optional argument for full language names, matching the order of `--lang` (e.g., "German,French").
- `--fuzzy`: Removes fuzzy entries before processing.
- `--bulk`: Enables bulk translation mode for faster processing.
- `--bulksize`: Sets the batch size for bulk translation (default is 50).
- `--model`: Specifies the OpenAI model to use for translations (default is `gpt-3.5-turbo-0125`).
- `--api_key`: OpenAI API key. Can be provided through the command line or as an environment variable.
- `--folder-language`: Infers the target language from the folder structure.
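One plausible way `--folder-language` could work is to look for a requested language code among the directory names of each `.po` file's path. The sketch below assumes exactly that; the function name `infer_language`, the exact-match rule, and the example layout `locales/de/LC_MESSAGES/` are illustrative assumptions rather than the tool's documented behavior.

```python
from pathlib import Path
from typing import Iterable, Optional


def infer_language(po_path: str, wanted: Iterable[str]) -> Optional[str]:
    """Return the requested language code that appears as a folder in `po_path`."""
    wanted_set = set(wanted)
    # Walk the directory components from the file upwards, so the nearest folder wins.
    for part in reversed(Path(po_path).parent.parts):
        if part in wanted_set:
            return part
    return None  # No folder matched; the caller has to decide what to do.


print(infer_language("locales/de/LC_MESSAGES/django.po", ["de", "fr"]))  # de
```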
## Detailed Language Names and Shortcodes
The `--detail-lang` option complements `--lang` by letting you supply full language names (e.g., `Dutch,German`) in addition to the language shortcodes. The names must be listed in the same order as the codes in `--lang`. The full names are then used in the OpenAI prompts, which makes the target language unambiguous for the GPT model.
Example usage:
```bash
gpt-po-translator --folder ./locales --lang nl,de --detail-lang "Dutch,German"
```
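Because the names are matched to the codes by position, a mismatch in count is worth catching early. The following sketch pairs the two lists and rejects unequal lengths. The function name `pair_languages` and the fallback of using the code itself when no names are given are assumptions made for illustration.

```python
from typing import Dict, Optional


def pair_languages(lang: str, detail_lang: Optional[str] = None) -> Dict[str, str]:
    """Map each code in `lang` to the name at the same position in `detail_lang`."""
    codes = [code.strip() for code in lang.split(",") if code.strip()]
    if detail_lang is None:
        # Assumption: without full names, the code itself is used in the prompt.
        return {code: code for code in codes}
    names = [name.strip() for name in detail_lang.split(",") if name.strip()]
    if len(names) != len(codes):
        raise ValueError(
            f"--detail-lang has {len(names)} names but --lang has {len(codes)} codes"
        )
    return dict(zip(codes, names))


print(pair_languages("de,fr", "German,French"))  # {'de': 'German', 'fr': 'French'}
```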
## Logging
The script logs detailed information about the files being processed, the number of translations, and batch details in bulk mode. Logs are essential for monitoring progress, debugging issues, and ensuring transparency throughout the translation process.
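As an illustration of what a per-batch log line might contain, here is a small sketch using the standard `logging` module. The logger name, the message format, and the helper `log_batch` are hypothetical; the tool's real log output may look different.

```python
import logging

# Hypothetical logger name, used only for this sketch.
logger = logging.getLogger("po_translator_sketch")


def log_batch(po_file: str, batch_index: int, batch_total: int, entry_count: int) -> None:
    """Log which batch of which file is being sent for translation."""
    logger.info("%s: batch %d/%d (%d entries)", po_file, batch_index, batch_total, entry_count)


logging.basicConfig(level=logging.INFO, format="%(asctime)s %(levelname)s %(message)s")
log_batch("locales/de/LC_MESSAGES/messages.po", 1, 3, 40)
```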
## Error Handling and Retries
The script includes robust error handling and retries to ensure reliable translation:
- **Failed Translations**: Automatically retries failed translations up to three times.
- **Empty Translations**: If an empty translation is returned, the script will attempt to translate the text again using an alternative approach.
- **Lengthy or Incorrect Translations**: Translations that are too long or contain explanations instead of direct translations are flagged and retried.
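The three rules above can be sketched as a validate-and-retry loop. This is a simplified illustration, not the project's implementation: the project lists `tenacity` as a dependency, whereas this sketch uses a plain loop to stay self-contained. The names `looks_valid` and `translate_with_retry`, and the length limit (three times the source length plus 20 characters), are arbitrary assumptions.

```python
from typing import Callable, Optional


def looks_valid(source: str, translation: str,
                max_ratio: float = 3.0, slack: int = 20) -> bool:
    """Reject empty output and output far longer than the source text."""
    text = translation.strip()
    if not text:
        return False
    # Assumption: a direct translation should not be many times longer than its source.
    return len(text) <= max_ratio * len(source) + slack


def translate_with_retry(source: str, translate: Callable[[str], str],
                         attempts: int = 3) -> str:
    """Call `translate` up to `attempts` times until the result passes validation."""
    last_error: Optional[Exception] = None
    for _ in range(attempts):
        try:
            candidate = translate(source)
        except Exception as exc:  # e.g. a network or API error
            last_error = exc
            continue
        if looks_valid(source, candidate):
            return candidate.strip()
    raise RuntimeError(f"No valid translation after {attempts} attempts") from last_error
```

Here `translate` stands in for whatever function calls the model, which also makes the loop testable with a fake translator.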
## License
This project is licensed under the MIT License. See the [LICENSE](LICENSE) file for details.
Raw data
{
"_id": null,
"home_page": "https://github.com/pescheckit/python-gpt-po",
"name": "gpt-po-translator",
"maintainer": null,
"docs_url": null,
"requires_python": null,
"maintainer_email": null,
"keywords": null,
"author": "Bram Mittendorff",
"author_email": "bram@pescheck.io",
"download_url": "https://files.pythonhosted.org/packages/da/d9/4b5a5f89d5cd9d9cdc1b8e86c2783d7f48f5d3cb6b9e077073bb74a089cf/gpt_po_translator-0.2.13.tar.gz",
"platform": null,
"bugtrack_url": null,
"license": "LICENSE",
"summary": "A CLI tool for translating .po files using GPT models.",
"version": "0.2.13",
"project_urls": {
"Homepage": "https://github.com/pescheckit/python-gpt-po"
},
"split_keywords": [],
"urls": [
{
"comment_text": "",
"digests": {
"blake2b_256": "db66948962b44d88d619c36695d1ffe336182ff86ef671684f3439200b7c51a1",
"md5": "14953aa95ee92d5846191dbae71ca057",
"sha256": "463415c04339930d353e3dc61571617519f4670206cf285ccbefb4d655d0959e"
},
"downloads": -1,
"filename": "gpt_po_translator-0.2.13-py3-none-any.whl",
"has_sig": false,
"md5_digest": "14953aa95ee92d5846191dbae71ca057",
"packagetype": "bdist_wheel",
"python_version": "py3",
"requires_python": null,
"size": 12413,
"upload_time": "2024-09-09T09:35:33",
"upload_time_iso_8601": "2024-09-09T09:35:33.863929Z",
"url": "https://files.pythonhosted.org/packages/db/66/948962b44d88d619c36695d1ffe336182ff86ef671684f3439200b7c51a1/gpt_po_translator-0.2.13-py3-none-any.whl",
"yanked": false,
"yanked_reason": null
},
{
"comment_text": "",
"digests": {
"blake2b_256": "dad94b5a5f89d5cd9d9cdc1b8e86c2783d7f48f5d3cb6b9e077073bb74a089cf",
"md5": "4a9f151b2fbe0a6fc0668d21b4a1c3e1",
"sha256": "a6cb7bfa2030eaad9fe97bebea8a9febe56985d566e75fbb3fafeaff247241b0"
},
"downloads": -1,
"filename": "gpt_po_translator-0.2.13.tar.gz",
"has_sig": false,
"md5_digest": "4a9f151b2fbe0a6fc0668d21b4a1c3e1",
"packagetype": "sdist",
"python_version": "source",
"requires_python": null,
"size": 12888,
"upload_time": "2024-09-09T09:35:34",
"upload_time_iso_8601": "2024-09-09T09:35:34.898042Z",
"url": "https://files.pythonhosted.org/packages/da/d9/4b5a5f89d5cd9d9cdc1b8e86c2783d7f48f5d3cb6b9e077073bb74a089cf/gpt_po_translator-0.2.13.tar.gz",
"yanked": false,
"yanked_reason": null
}
],
"upload_time": "2024-09-09 09:35:34",
"github": true,
"gitlab": false,
"bitbucket": false,
"codeberg": false,
"github_user": "pescheckit",
"github_project": "python-gpt-po",
"travis_ci": false,
"coveralls": false,
"github_actions": true,
"requirements": [
{
"name": "polib",
"specs": [
[
"==",
"1.2.0"
]
]
},
{
"name": "openai",
"specs": [
[
"==",
"v1.42.0"
]
]
},
{
"name": "python-dotenv",
"specs": [
[
"==",
"1.0.0"
]
]
},
{
"name": "pytest",
"specs": [
[
"==",
"8.2.2"
]
]
},
{
"name": "tenacity",
"specs": [
[
"==",
"9.0.0"
]
]
},
{
"name": "setuptools-scm",
"specs": [
[
"==",
"8.1.0"
]
]
}
],
"lcname": "gpt-po-translator"
}