# DarijaAssistant Library

**DarijaAssistant** is a Python library designed to assist in translating Moroccan Darija (a dialect of Arabic) into English. It integrates two main functionalities:
1. **Assisted Translation**: The `DarijaAssistant` class supports the translation of words and sentences with a custom [word-distance algorithm](https://pypi.org/project/DarijaDistance/), intended to improve translation accuracy, especially for difficult or ambiguous phrases.
2. **LLM Client**: A client for interacting with a large language model (LLM) hosted at any URL. The library also has built-in support for OpenAI's GPT models: provide an OpenAI API key and a model name, and it works out of the box.

The library supports both raw and assisted translations, and improves the contextual understanding of Moroccan Darija sentences through caching, normalization, and additional linguistic analysis.
## Installation
To install the library, run:
```bash
pip install DarijaTranslatorAssistant
```
## Usage
### 1. Initializing the Translation Model
You can choose between a model hosted at any URL or OpenAI. Here's how to initialize the client:
```python
from DarijaTranslatorAssistant.llm_client import LLMClient
# Option A: use an OpenAI GPT model
llm_client = LLMClient(use_openai=True, openai_api_key="your_openai_api_key", openai_model="gpt-4o")
# Option B: use an LLM hosted at a specific URL instead
# (running both assignments overwrites llm_client, so keep only the one you need)
llm_client = LLMClient(llm_url="http://your-llm-url.com", use_openai=False)
```
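Hardcoding an API key in a source file makes it easy to leak. The sketch below is not part of the library; it builds the constructor arguments from environment variables instead. The helper name `client_kwargs_from_env` and the variable names `OPENAI_API_KEY`, `OPENAI_MODEL`, and `LLM_URL` are assumptions chosen for illustration. Only the keyword arguments mirror the calls shown above.

```python
import os


def client_kwargs_from_env(env=None):
    """Build keyword arguments for LLMClient from environment variables.

    Hypothetical helper: the variable names are illustrative and are not
    read by the library itself.
    """
    env = os.environ if env is None else env

    # Prefer OpenAI when an API key is available.
    api_key = env.get("OPENAI_API_KEY")
    if api_key:
        return {
            "use_openai": True,
            "openai_api_key": api_key,
            "openai_model": env.get("OPENAI_MODEL", "gpt-4o"),
        }

    # Otherwise fall back to a self-hosted model URL.
    llm_url = env.get("LLM_URL")
    if llm_url:
        return {"llm_url": llm_url, "use_openai": False}

    raise RuntimeError("Set OPENAI_API_KEY or LLM_URL before creating the client.")
```

With the package installed, the client could then be created with `llm_client = LLMClient(**client_kwargs_from_env())`.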
### 2. Simple Translation
You can perform a direct translation using the LLM client.
```python
sentence = "law3lm asahbi"
# Raw translation: the sentence goes straight to the LLM (gpt-4o if you configured OpenAI above)
translation_without_assistance = llm_client.translate(sentence)
print(translation_without_assistance)
# [output]: The world, my friend.
```
### 3. Assisted Translation
For more context-aware translation, use the `DarijaAssistant` class, which assists the translation process with a word-distance algorithm.
```python
from DarijaTranslatorAssistant.darija_assistant import DarijaAssistant
# Initialize DarijaAssistant with the LLM client
assistant = DarijaAssistant(llm_client=llm_client)
# Use assisted translation: OpenAI's gpt-4o + DarijaAssistant
sentence = "law3lm asahbi"
result = assistant.assist_and_translate(sentence)
print(result)
# [output]: I do not know my friend.
```
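To see how the two modes differ on your own sentences, a small helper can put their outputs side by side. This is a minimal sketch, not a library feature: `compare_translations` is a hypothetical name, and the stand-in callables below only replay the outputs already shown in this README so the example runs without an API key.

```python
def compare_translations(sentences, raw_translate, assisted_translate):
    """Return a Markdown table comparing two translation callables."""
    rows = [
        "| Darija Sentence | Raw Translation | Assisted Translation |",
        "|---|---|---|",
    ]
    for sentence in sentences:
        raw = raw_translate(sentence)
        assisted = assisted_translate(sentence)
        rows.append(f"| {sentence} | {raw} | {assisted} |")
    return "\n".join(rows)


# Stand-in callables so the sketch runs without network access.
fake_raw = {"law3lm asahbi": "The world, my friend."}.get
fake_assisted = {"law3lm asahbi": "I do not know my friend."}.get

print(compare_translations(["law3lm asahbi"], fake_raw, fake_assisted))
```

In real use, pass `llm_client.translate` and `assistant.assist_and_translate` as the two callables. Each sentence then triggers two LLM requests, so keep the list short when using a paid API.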
### 4. Example Translations
The table below compares GPT-4o translations of Darija sentences with and without DarijaAssistant's specialized assistance.

| Darija Sentence | GPT-4o Translation Without Assistance | Assisted Translation |
|--------------------|---------------------------------------|--------------------------|
| law3lm asahbi | The world, my friend. | I do not know my friend. |
| kbchlaba9ich | I feel thirsty. | Fill my cup. |
| 3rram dyal lbrahch | Brahch's pen. | Plenty of kids. |
| chof 3la tfrnisa | Check the outlet. | Look at the smile. |
### 5. Expanding the Dictionary
You can add new words and translations with `DarijaDataManager` from the DarijaDistance package, which the DarijaAssistant library relies on.
```python
from DarijaDistance.preprocess import DarijaDataManager
data_manager = DarijaDataManager()
data_manager.add_translations([('khona', 'brother')])
```
The word "khona" will now be recognized and translated as "brother". The addition is persistent: it is saved to the library's data rather than kept only for the current session, so future instances of `DarijaAssistant` apply it without you re-adding it.
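When there are many entries to add, it can be convenient to keep them in a CSV file and convert them to the list of tuples that `add_translations` receives in the example above. The following is a sketch under that assumption; `read_translation_pairs` and the two-column `darija,english` layout are hypothetical, not something DarijaDistance defines.

```python
import csv
import io


def read_translation_pairs(csv_text):
    """Parse 'darija,english' rows into a list of (darija, english) tuples."""
    pairs = []
    for row in csv.reader(io.StringIO(csv_text)):
        # Skip blank or malformed rows rather than storing partial entries.
        if len(row) != 2:
            continue
        darija, english = (cell.strip() for cell in row)
        if darija and english:
            pairs.append((darija, english))
    return pairs


# One valid row, one blank line, and one malformed row.
sample = "khona,brother\n\nmalformed-row\n"
pairs = read_translation_pairs(sample)
print(pairs)  # [('khona', 'brother')]
```

The resulting list can be passed to `data_manager.add_translations(pairs)`. Because additions are persistent, it is worth reviewing the file before importing it.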
### 6. Access to Word-Distance Methods
As a user of the DarijaAssistant library, you have access to all the methods from the [word-distance algorithm](https://pypi.org/project/DarijaDistance/), such as checking translation confidence, retrieving exact matches, and more.
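
The DarijaDistance methods are not listed here, so the following is only a conceptual sketch of what a word-distance lookup does. It is not the package's actual algorithm and does not use its method names. It applies the plain Levenshtein edit distance to a tiny hypothetical dictionary, which is enough to show how a spelling variant can be mapped to a known entry.

```python
def levenshtein(a, b):
    """Classic edit distance: insertions, deletions, and substitutions cost 1."""
    previous = list(range(len(b) + 1))
    for i, char_a in enumerate(a, start=1):
        current = [i]
        for j, char_b in enumerate(b, start=1):
            cost = 0 if char_a == char_b else 1
            current.append(min(
                previous[j] + 1,         # deletion
                current[j - 1] + 1,      # insertion
                previous[j - 1] + cost,  # substitution or match
            ))
        previous = current
    return previous[-1]


def closest_entries(word, dictionary, max_distance=2):
    """Return (distance, entry, translation) tuples within max_distance, nearest first."""
    scored = [
        (levenshtein(word.lower(), entry), entry, meaning)
        for entry, meaning in dictionary.items()
    ]
    return sorted(item for item in scored if item[0] <= max_distance)


# Hypothetical toy dictionary; the real package ships its own data.
toy_dictionary = {"khona": "brother", "sahbi": "my friend"}

# "khouna" is an illustrative spelling variant, one insertion away from "khona".
print(closest_entries("khouna", toy_dictionary))  # [(1, 'khona', 'brother')]
```

Candidates found this way could be handed to the LLM as hints, which is the general idea behind assisted translation. For the real methods and their signatures, refer to the DarijaDistance documentation.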
## Contributing
Contributions are welcome! If you have ideas or suggestions, or find a bug, please open an issue or submit a pull request on the [GitHub repo](https://github.com/aissam-out/DarijaTranslatorAssistant).
## License
This project is licensed under the MIT License. See the [LICENSE](https://github.com/aissam-out/DarijaTranslatorAssistant/blob/main/License) file for more details.
## Contact
If you have any questions or feedback, you can find me on LinkedIn: [Aissam Outchakoucht](https://www.linkedin.com/in/aissam-outchakoucht/) or on X: [@aissam_out](https://x.com/aissam_out).