Wolaita-POST

- Name: Wolaita-POST
- Version: 1.1.3
- Summary: A POS tagger for the Wolaita language using deep learning
- Home page: https://github.com/Sisagegn/Wolaita_POST
- Documentation: https://github.com/Sisagegn/Wolaita_POST/wiki
- Issue tracker: https://github.com/Sisagegn/Wolaita_POST/issues
- Author: Sisagegn Samuel
- Upload time: 2024-11-24 19:51:03
- Requires Python: >=3.6
- Keywords: wolaita pos tagging nlp deep learning

# Wolaita_POST

## Overview

Wolaita_POST is a Python package for Part-of-Speech (POS) tagging of Wolaita text. It uses deep learning sequence models, including Bi-GRU, together with FastText embeddings to improve tagging performance. The tagger runs on pretrained models that you supply by path (see Usage), so no training step is needed before tagging. It is intended for researchers and developers working on Natural Language Processing (NLP) for lesser-resourced languages.

## Features
- Accurate POS Tagging: Utilizes deep learning models (Bi-GRU, Bi-LSTM, etc.) to achieve precise Part-of-Speech tagging for Wolaita language text.
- Pretrained Models: Ready-to-use pretrained models for quick deployment and high accuracy.
- FastText Embeddings: Incorporates FastText word embeddings to capture subword information and improve performance on low-resource languages.
- Easy Integration: Simple API that allows researchers and developers to integrate POS tagging into their NLP pipelines.
- Supports Wolaita Language: Specifically designed for the Wolaita language, addressing the challenges of processing lesser-resourced languages.
- Customizable: Flexible configuration to accommodate different models, tokenizers, and word vectors based on project requirements.
- Efficient Deployment: Tagged output can feed downstream NLP applications, such as machine translation and named entity recognition (NER).
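
The subword idea behind the FastText bullet can be illustrated with a short sketch. FastText represents a word through its character n-grams, taken from the word wrapped in `<` and `>` boundary markers, so a rare or unseen inflected form still shares n-grams with known words. The helper name `char_ngrams` and the example word are illustrative only. The 3-to-6 range mirrors FastText's default `minn`/`maxn` settings, and the pretrained model used with this package may have been trained with different values.

```python
from typing import List


def char_ngrams(word: str, min_n: int = 3, max_n: int = 6) -> List[str]:
    """Return the character n-grams of `word`, FastText style.

    The word is wrapped in '<' and '>' so that prefixes and suffixes
    produce n-grams distinct from those found in the middle of a word.
    """
    wrapped = f"<{word}>"
    grams: List[str] = []
    for n in range(min_n, max_n + 1):
        # Slide a window of width n across the wrapped word.
        for start in range(len(wrapped) - n + 1):
            grams.append(wrapped[start:start + n])
    return grams


# Trigrams only, for a short English example word.
print(char_ngrams("where", min_n=3, max_n=3))
# prints ['<wh', 'whe', 'her', 'ere', 're>']
```

FastText also keeps one vector for the whole wrapped word in addition to these n-grams; the sketch covers only the n-gram extraction.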

## Installation
Install Wolaita_POST from PyPI with pip:

    pip install Wolaita_POST

In a Jupyter or Colab notebook, prefix the command with `!`.
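
To confirm that the installation succeeded, the installed version can be read from the package metadata. This is a minimal sketch, assuming Python 3.8 or newer (for `importlib.metadata`) and that the distribution is registered under its PyPI name, `Wolaita-POST`. The helper `installed_version` is hypothetical and not part of the package.

```python
from importlib.metadata import PackageNotFoundError, version
from typing import Optional


def installed_version(dist_name: str = "Wolaita-POST") -> Optional[str]:
    """Return the installed version of `dist_name`, or None if it is absent."""
    try:
        return version(dist_name)
    except PackageNotFoundError:
        return None


# Prints a version string when the package is installed, otherwise None.
print(installed_version())
```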

## Usage

After installation, use Wolaita_POST as follows:

1. Import the package:

       from Wolaita_POST import WolaitaPOSTagger

2. Set file paths for your pretrained model, word vectors, and tokenizers:

       import os

       base_dir = "/content/drive/MyDrive"  # Replace with the actual path

       # Build each artifact path relative to base_dir
       model_path = os.path.join(base_dir, "last_model/last_model/Bi_GRU_model.keras")
       fasttext_model_path = os.path.join(base_dir, "FastText_and_embedding_matrix/fasttext_model.bin")
       word_tokenizer_path = os.path.join(base_dir, "POS/word_tokenizer.pkl")
       tag_tokenizer_path = os.path.join(base_dir, "POS/tag_tokenizer.pkl")


3. Initialize the POS tagger:

       pos_tagger = WolaitaPOSTagger(
           model_path=model_path,
           word_vector_path=fasttext_model_path,
           word_tokenizer_path=word_tokenizer_path,
           tag_tokenizer_path=tag_tokenizer_path
       )

4. Use the POS tagger to tag Wolaita text:

       # The input is a list of strings
       text = ['Insert your sample text here']
       tagged_text = pos_tagger.tag(text)
       print(tagged_text)

`tagged_text` contains the part-of-speech tags for the given Wolaita text.
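
Before initializing the tagger in step 3, it can help to confirm that the four files from step 2 actually exist, since a mistyped path may otherwise show up only when the tagger tries to load it. The following is a minimal sketch that reuses the paths from step 2. The helper `missing_artifacts` is hypothetical and not part of Wolaita_POST, and `base_dir` remains a placeholder to replace.

```python
import os
from typing import Dict, List


def missing_artifacts(paths: Dict[str, str]) -> List[str]:
    """Return the labels whose paths do not point to an existing file."""
    return [label for label, path in paths.items() if not os.path.isfile(path)]


base_dir = "/content/drive/MyDrive"  # Replace with the actual path

# Same file layout as in step 2 of the usage instructions.
artifacts = {
    "model_path": os.path.join(base_dir, "last_model/last_model/Bi_GRU_model.keras"),
    "word_vector_path": os.path.join(base_dir, "FastText_and_embedding_matrix/fasttext_model.bin"),
    "word_tokenizer_path": os.path.join(base_dir, "POS/word_tokenizer.pkl"),
    "tag_tokenizer_path": os.path.join(base_dir, "POS/tag_tokenizer.pkl"),
}

missing = missing_artifacts(artifacts)
if missing:
    print("Missing files for:", ", ".join(missing))
else:
    print("All artifact files found.")
```

The dictionary keys match the keyword arguments shown in step 3, so once the check passes the same dictionary could be passed along as `WolaitaPOSTagger(**artifacts)`.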

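## Running Tests

To verify functionality, run pytest against the package's tests directory and redirect the output to a report file. The path below assumes the repository sits in Google Drive under Colab; replace it with the location of your own copy:

    pytest /content/drive/MyDrive/Wolaita_POST/tests > test_report.txt

In a Jupyter or Colab notebook, prefix the command with `!`.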

## License

This project is licensed under the MIT License. See the LICENSE file for more details.

## Contributing

Contributions are welcome! If you have suggestions for improving the package or find any issues, feel free to open a pull request or submit an issue on GitHub.

## Acknowledgements

Special thanks to the developers and researchers who contributed to this project, making it possible to expand NLP resources for the Wolaita language.


            
