DarijaTranslatorAssistant


| Field           | Value |
|-----------------|-------|
| Name            | DarijaTranslatorAssistant |
| Version         | 1.0.1 |
| Home page       | https://github.com/aissam-out/DarijaTranslatorAssistant |
| Summary         | A library for assisting in translating Darija to English. It provides a list of potential translations for a given Darija word. It also supports translation of full sentences using LLMs (e.g., OpenAI). |
| Upload time     | 2024-09-06 10:01:18 |
| Author          | Aissam Outchakoucht |
| Maintainer      | None |
| Docs URL        | None |
| Requires Python | >=3.6 |
| License         | None |
| Requirements    | No requirements were recorded. |
| Travis-CI       | No Travis. |
| Coveralls       | No coveralls. |

# DarijaAssistant Library

![License: MIT](https://img.shields.io/badge/License-MIT-blue.svg)

**DarijaAssistant** is a Python library designed to assist in translating Moroccan Darija (a dialect of Arabic) into English. It integrates two main functionalities:

1. **Assisted Translation**: The `DarijaAssistant` class supports the translation of words and sentences with a custom [word-distance algorithm](https://pypi.org/project/DarijaDistance/), which helps improve accuracy on difficult or ambiguous phrases.

2. **LLM Client**: A client for interacting with a large language model (LLM) hosted at any URL. The library also has built-in support for OpenAI’s GPT models: provide an OpenAI API key and a model name, and it works out of the box.

The library supports both raw and assisted translation. Assisted translation improves the contextual understanding of Moroccan Darija sentences through caching, normalization, and additional linguistic analysis.


## Installation

To install the library, run:

```bash
pip install DarijaTranslatorAssistant
```

## Usage

### 1. Initializing the Translation Model

You can choose between a model hosted at any URL or OpenAI. Here's how to initialize the client:

```python
from DarijaTranslatorAssistant.llm_client import LLMClient

# Example using OpenAI GPT model
llm_client = LLMClient(use_openai=True, openai_api_key="your_openai_api_key", openai_model="gpt-4o")

# Example using an LLM hosted at a specific URL
llm_client = LLMClient(llm_url="http://your-llm-url.com", use_openai=False)
```
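
To keep the API key out of source code, the constructor arguments can be built from environment variables. The sketch below is only an illustration: the helper `build_client_kwargs` and the variable names `DARIJA_OPENAI_MODEL` and `DARIJA_LLM_URL` are assumptions made for this example, not part of the library. Only the `LLMClient` keyword arguments shown above are taken from the README.

```python
import os


def build_client_kwargs(env=None):
    """Build keyword arguments for LLMClient from environment variables.

    Hypothetical helper: the environment variable names are assumptions.
    """
    env = os.environ if env is None else env

    api_key = env.get("OPENAI_API_KEY")
    if api_key:
        # Prefer OpenAI when a key is available.
        return {
            "use_openai": True,
            "openai_api_key": api_key,
            "openai_model": env.get("DARIJA_OPENAI_MODEL", "gpt-4o"),
        }

    llm_url = env.get("DARIJA_LLM_URL")
    if llm_url:
        # Fall back to a self-hosted model.
        return {"llm_url": llm_url, "use_openai": False}

    raise RuntimeError("Set OPENAI_API_KEY or DARIJA_LLM_URL")


# Usage (requires the package to be installed):
# from DarijaTranslatorAssistant.llm_client import LLMClient
# llm_client = LLMClient(**build_client_kwargs())
```

Passing a plain dictionary as `env` makes the helper easy to exercise without touching the real environment.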

### 2. Simple Translation

You can perform a direct translation using the LLM client.

```python
sentence = "law3lm asahbi"
# only uses OpenAI's gpt-4o
translation_without_assistance = llm_client.translate(sentence)
print(translation_without_assistance)

# [output]: The world, my friend.
```

### 3. Assisted Translation

For a more context-aware translation, use the `DarijaAssistant` class, which supplements the LLM with a word-distance algorithm.

```python
from DarijaTranslatorAssistant.darija_assistant import DarijaAssistant

# Initialize DarijaAssistant with the LLM client
assistant = DarijaAssistant(llm_client=llm_client)

# Use assisted translation: OpenAI's gpt-4o + DarijaAssistant
sentence = "law3lm asahbi"
result = assistant.assist_and_translate(sentence)
print(result)

# [output]: I do not know my friend.
```
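
The README does not describe how the assistance reaches the model. One plausible approach, shown here purely as an assumption and not as the library's actual implementation, is to add candidate word meanings to the prompt. The function `build_assisted_prompt` and the prompt wording are hypothetical; the example hint reuses the "my friend" rendering of "asahbi" from the output above.

```python
def build_assisted_prompt(sentence, hints):
    """Compose a translation prompt that lists candidate word meanings.

    `hints` maps a Darija word to a list of possible English meanings.
    Hypothetical sketch; the real library may work differently.
    """
    lines = [f"Translate this Moroccan Darija sentence into English: {sentence}"]
    if hints:
        lines.append("Possible word meanings:")
        for word, meanings in hints.items():
            lines.append(f"- {word}: {', '.join(meanings)}")
    return "\n".join(lines)


# Illustrative hint only.
prompt = build_assisted_prompt("law3lm asahbi", {"asahbi": ["my friend"]})
print(prompt)
```

The resulting text could then be sent to any LLM; without hints, the prompt reduces to the plain translation request.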

### 4. Example Translations

Here's how GPT-4o translations compare with our approach, showing how Darija sentences are handled with and without specialized assistance.

| Darija Sentence    | GPT-4o Translation Without Assistance | Assisted Translation     |
|--------------------|---------------------------------------|--------------------------|
| law3lm asahbi      | The world, my friend.                 | I do not know my friend. |
| kbchlaba9ich       | I feel thirsty.                       | Fill my cup.             |
| 3rram dyal lbrahch | Brahch's pen.                         | Plenty of kids.          |
| chof 3la tfrnisa   | Check the outlet.                     | Look at the smile.       |
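
A comparison like the one in the table can be produced with a small helper that runs two translation callables over the same sentences. This is a sketch under stated assumptions: `compare_translations` is a hypothetical function written for this example, and it accepts any two callables, so it can be tried with stand-ins before wiring in `llm_client.translate` and `assistant.assist_and_translate`.

```python
def compare_translations(sentences, raw_translate, assisted_translate):
    """Return a Markdown table comparing two translation callables."""
    rows = [
        "| Darija Sentence | Without Assistance | Assisted |",
        "|---|---|---|",
    ]
    for sentence in sentences:
        raw = raw_translate(sentence)
        assisted = assisted_translate(sentence)
        rows.append(f"| {sentence} | {raw} | {assisted} |")
    return "\n".join(rows)


# Usage with the objects created in the sections above:
# print(compare_translations(
#     ["law3lm asahbi", "chof 3la tfrnisa"],
#     llm_client.translate,
#     assistant.assist_and_translate,
# ))
```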

### 5. Expanding the Dictionary

You can add new words and translations using the `DarijaDataManager` class from the DarijaDistance package, which the DarijaAssistant library relies on.

```python
from DarijaDistance.preprocess import DarijaDataManager

data_manager = DarijaDataManager()
data_manager.add_translations([('khona', 'brother')])
```

The word "khona" will now be recognized and translated as "brother". The addition is persistent: it is saved to the library's data rather than kept only for the current session, so future instances of `DarijaAssistant` apply it without you needing to re-add it.
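
Because additions are persistent, it can be worth tidying a batch of pairs before saving them. The helper below is a hypothetical pre-processing step, not part of either library; lowercasing the Darija word is an assumption about how entries should be stored, so adjust it if your data is case-sensitive.

```python
def clean_pairs(pairs):
    """Strip whitespace, drop empty entries, and remove duplicate pairs.

    Hypothetical helper; lowercasing the Darija word is an assumption.
    """
    seen = set()
    cleaned = []
    for darija, english in pairs:
        darija = darija.strip().lower()
        english = english.strip()
        if not darija or not english:
            continue  # skip incomplete entries
        if (darija, english) in seen:
            continue  # skip exact duplicates
        seen.add((darija, english))
        cleaned.append((darija, english))
    return cleaned


# Usage (requires DarijaDistance to be installed):
# data_manager.add_translations(clean_pairs([(" Khona ", "brother")]))
```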

### 6. Access to Word-Distance Methods

As a user of the DarijaAssistant library, you have access to all the methods from the [word-distance algorithm](https://pypi.org/project/DarijaDistance/), such as checking translation confidence, retrieving exact matches, and more.
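
The DarijaDistance algorithm is custom and its API is not shown here. To illustrate the general idea of matching a misspelled or variant word against known entries with a confidence score, the sketch below uses `difflib.SequenceMatcher` from the Python standard library as a stand-in. The function `rank_candidates`, the threshold, and the toy dictionary are assumptions for this example only.

```python
from difflib import SequenceMatcher


def rank_candidates(word, dictionary, min_score=0.6):
    """Rank dictionary entries by string similarity to `word`.

    Stand-in for a word-distance lookup; not the DarijaDistance algorithm.
    Returns (entry, meanings, score) tuples, best match first.
    """
    scored = []
    for entry, meanings in dictionary.items():
        score = SequenceMatcher(None, word, entry).ratio()
        if score >= min_score:
            scored.append((entry, meanings, round(score, 2)))
    return sorted(scored, key=lambda item: item[2], reverse=True)


# Toy dictionary reusing the entry added in the previous section.
toy_dictionary = {"khona": ["brother"]}
print(rank_candidates("khouna", toy_dictionary))
```

An exact match scores 1.0, and entries below `min_score` are dropped, which roughly mirrors the distinction between exact matches and lower-confidence suggestions.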

## Contributing

Contributions are welcome! If you have ideas or suggestions, or find a bug, please open an issue or submit a pull request on the GitHub repo.


## License

This project is licensed under the MIT License. See the [LICENSE](https://github.com/aissam-out/DarijaTranslatorAssistant/blob/main/License) file for more details.

## Contact

If you have any questions or feedback, you can find me on LinkedIn: [Aissam Outchakoucht](https://www.linkedin.com/in/aissam-outchakoucht/) or on X: [@aissam_out](https://x.com/aissam_out).

            

Raw data

            {
    "_id": null,
    "home_page": "https://github.com/aissam-out/DarijaTranslatorAssistant",
    "name": "DarijaTranslatorAssistant",
    "maintainer": null,
    "docs_url": null,
    "requires_python": ">=3.6",
    "maintainer_email": null,
    "keywords": null,
    "author": "Aissam Outchakoucht",
    "author_email": "aissam.outchakoucht@gmail.com",
    "download_url": "https://files.pythonhosted.org/packages/fa/a5/893dfa25716818e768d3a459664533865032cafecc709b0f459d27a72adb/darijatranslatorassistant-1.0.1.tar.gz",
    "platform": null,
    "bugtrack_url": null,
    "license": null,
    "summary": "A library for assisting in translating Darija to English. It provides a list of potential translations for a given darija word. It also supports translation of full sentences using LLMs (e.g., OpenAI).",
    "version": "1.0.1",
    "project_urls": {
        "Homepage": "https://github.com/aissam-out/DarijaTranslatorAssistant"
    },
    "split_keywords": [],
    "urls": [
        {
            "comment_text": "",
            "digests": {
                "blake2b_256": "8765f8a6e53bf3b72e27a815aacd4171ba7802eade4a766279a0cd7dd8cc52f2",
                "md5": "47459e4bd115c54004d025beea5eaa52",
                "sha256": "fd8043797ce86613a7b244165d91da2fe2577ddf3e5742c106a56bcb53b683e4"
            },
            "downloads": -1,
            "filename": "DarijaTranslatorAssistant-1.0.1-py3-none-any.whl",
            "has_sig": false,
            "md5_digest": "47459e4bd115c54004d025beea5eaa52",
            "packagetype": "bdist_wheel",
            "python_version": "py3",
            "requires_python": ">=3.6",
            "size": 8873,
            "upload_time": "2024-09-06T10:01:17",
            "upload_time_iso_8601": "2024-09-06T10:01:17.646143Z",
            "url": "https://files.pythonhosted.org/packages/87/65/f8a6e53bf3b72e27a815aacd4171ba7802eade4a766279a0cd7dd8cc52f2/DarijaTranslatorAssistant-1.0.1-py3-none-any.whl",
            "yanked": false,
            "yanked_reason": null
        },
        {
            "comment_text": "",
            "digests": {
                "blake2b_256": "faa5893dfa25716818e768d3a459664533865032cafecc709b0f459d27a72adb",
                "md5": "dc0cd0c705efc354f9d7d9bee740cc09",
                "sha256": "2db495bb074306733c471f8f7750b50302ab0180ad36bc6805ea1eae9f7ffe46"
            },
            "downloads": -1,
            "filename": "darijatranslatorassistant-1.0.1.tar.gz",
            "has_sig": false,
            "md5_digest": "dc0cd0c705efc354f9d7d9bee740cc09",
            "packagetype": "sdist",
            "python_version": "source",
            "requires_python": ">=3.6",
            "size": 7720,
            "upload_time": "2024-09-06T10:01:18",
            "upload_time_iso_8601": "2024-09-06T10:01:18.803483Z",
            "url": "https://files.pythonhosted.org/packages/fa/a5/893dfa25716818e768d3a459664533865032cafecc709b0f459d27a72adb/darijatranslatorassistant-1.0.1.tar.gz",
            "yanked": false,
            "yanked_reason": null
        }
    ],
    "upload_time": "2024-09-06 10:01:18",
    "github": true,
    "gitlab": false,
    "bitbucket": false,
    "codeberg": false,
    "github_user": "aissam-out",
    "github_project": "DarijaTranslatorAssistant",
    "travis_ci": false,
    "coveralls": false,
    "github_actions": false,
    "lcname": "darijatranslatorassistant"
}
        