ArrowTextClassifier


Name: ArrowTextClassifier
Version: 1.0.3
Home page: https://github.com/Bhargav230m/ArrowTextClassifier.git
Summary: ArrowTextClassifier is a simple text classification tool written in pytorch that allows you to train, summarize, and use text classification models for various tasks.
Upload time: 2024-04-20 14:25:39
Maintainer: None
Docs URL: None
Author: techpowerb
Requires Python: >=3.6
License: None
Keywords: text classification, natural language processing, nlp, pytorch, machine learning, deep learning, text summarization, preprocessing, data science, artificial intelligence, dataset, discord
Requirements: No requirements were recorded.
Travis-CI: No Travis.
Coveralls test coverage: No coveralls.

# ArrowTextClassifier

ArrowTextClassifier is a Python package for text classification. It lets you train models, summarize trained models, and classify text using a convolutional neural network (CNN) architecture.

## Installation

You can install ArrowTextClassifier via pip:

```bash
pip install ArrowTextClassifier
```

## How it Works

ArrowTextClassifier implements a convolutional neural network (CNN) architecture for text classification. It tokenizes input text, embeds the tokens, applies convolutional filters over the embedded tokens to extract features, and then classifies the text into predefined categories.
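The pipeline described above can be traced with a small, dependency-free sketch. This is not the package's implementation (which is written in PyTorch); it only illustrates the data flow of tokenizing, embedding, convolving, pooling, and classifying. Every name here (`tokenize`, `build_vocab`, `embed`, `conv_max_pool`, `classify`), the whitespace tokenizer, and the tiny random weights are assumptions made for illustration.

```python
import random


def tokenize(text):
    """Lowercase and split on whitespace (a deliberately simple tokenizer)."""
    return text.lower().split()


def build_vocab(texts):
    """Map each distinct token to an integer id; id 0 is reserved for unknown tokens."""
    vocab = {"<unk>": 0}
    for text in texts:
        for token in tokenize(text):
            vocab.setdefault(token, len(vocab))
    return vocab


def embed(tokens, vocab, table):
    """Look up one embedding vector per token (unknown tokens use row 0)."""
    return [table[vocab.get(token, 0)] for token in tokens]


def conv_max_pool(embedded, filt):
    """Slide one filter over consecutive token windows, apply ReLU, keep the maximum.

    Starting `best` at 0.0 folds the ReLU into the max. Texts shorter than
    the filter width produce no windows and therefore yield 0.0.
    """
    width = len(filt)
    best = 0.0
    for start in range(len(embedded) - width + 1):
        window = embedded[start:start + width]
        score = sum(w * x
                    for f_row, e_row in zip(filt, window)
                    for w, x in zip(f_row, e_row))
        best = max(best, score)
    return best


def classify(text, vocab, table, filters, weights, labels):
    """Run the full pipeline and return the label with the highest score."""
    embedded = embed(tokenize(text), vocab, table)
    features = [conv_max_pool(embedded, filt) for filt in filters]
    logits = [sum(w * f for w, f in zip(row, features)) for row in weights]
    return labels[logits.index(max(logits))]


def rand_matrix(rows, cols):
    """Random weights standing in for parameters that training would learn."""
    return [[random.uniform(-1.0, 1.0) for _ in range(cols)] for _ in range(rows)]


random.seed(0)
EMBED_DIM, NUM_FILTERS, FILTER_WIDTH = 4, 3, 2  # illustrative sizes only
labels = ["normal", "toxic"]
vocab = build_vocab(["Hey there!", "Hi!", "You suck!"])
table = rand_matrix(len(vocab), EMBED_DIM)
filters = [rand_matrix(FILTER_WIDTH, EMBED_DIM) for _ in range(NUM_FILTERS)]
weights = rand_matrix(len(labels), NUM_FILTERS)

# The weights are untrained, so the predicted label is arbitrary.
prediction = classify("Hey there!", vocab, table, filters, weights, labels)
print(prediction)
```

In a PyTorch model these steps would typically correspond to an embedding layer, one or more 1D convolution layers with max-over-time pooling, and a final linear layer, with the weights learned during training instead of drawn at random.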

## Usage

### Training

To train a text classification model, use the `train_model` method of the `Model` class:

```python
from ArrowTextClassifier import Model

model = Model(name="your_model_name")
# `dataset` is your training data in the Parquet format described under "How to make a dataset"
model.train_model(dataset)
```

#### How to make a dataset

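To build a custom training dataset, create a Parquet file in which every row has a `label` column and an `example` column. The example below shows the rows as JSON for readability; the file itself is stored in the Parquet format.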

*Example Parquet File*

```json
{"label":"normal","example":"Hey there!"}
{"label":"normal","example":"Hi!"}
{"label":"toxic","example":"You suck!"}
```

After you have created the Parquet file in the format above, supply it as the `dataset` argument to `train_model` to start training the model.
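One way to produce such a file is sketched below. It assumes pandas and a Parquet engine such as pyarrow are installed; neither is confirmed as a requirement of this package. The helper names `validate_rows` and `write_parquet` and the file name `dataset.parquet` are hypothetical, and the rule that both fields must be non-empty strings is an assumption based on the example rows.

```python
REQUIRED_COLUMNS = ("label", "example")

rows = [
    {"label": "normal", "example": "Hey there!"},
    {"label": "normal", "example": "Hi!"},
    {"label": "toxic", "example": "You suck!"},
]


def validate_rows(rows):
    """Raise ValueError unless every row has non-empty string `label` and `example` fields.

    The non-empty-string rule is an assumption, not a documented requirement.
    """
    for index, row in enumerate(rows):
        for column in REQUIRED_COLUMNS:
            value = row.get(column)
            if not isinstance(value, str) or not value.strip():
                raise ValueError(f"row {index}: missing or empty {column!r}")
    return rows


def write_parquet(rows, path):
    """Validate the rows, then write them to a Parquet file with pandas."""
    # Imported here so that validation alone needs no third-party packages.
    import pandas as pd

    frame = pd.DataFrame(validate_rows(rows), columns=list(REQUIRED_COLUMNS))
    frame.to_parquet(path, index=False)


validate_rows(rows)  # passes silently for the example rows
```

Calling `write_parquet(rows, "dataset.parquet")` then writes the file. Validating before writing catches missing fields early, before a training run starts.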

### Summarization

To summarize a trained model, you can use the `summarize` method:

```python
model.summarize(
    model_path="path_to_your_model",
    hyperparams_path="path_to_hyperparameters_file",
    vocabulary_path="path_to_vocabulary_file",
    modelSummary_write_path="path_to_write_model_summary"
)
```

### Classification

To classify text with a trained model, use the `classify` method:

```python
result = model.classify(
    model_path="path_to_your_model",
    hyperparams_path="path_to_hyperparameters_file",
    text="your_input_text",
    vocabulary_path="path_to_vocabulary_file"
)
print(result)
```

## Getting Started

This package provides tools for text classification tasks. You can explore and customize it according to your requirements. Refer to the documentation for detailed usage instructions. We also provide a Colab [notebook](https://colab.research.google.com/drive/1fGDLICkctfdpTgLoh_Bouv-NY-q-kdlQ?usp=sharing) that walks through training a custom offensive-language classifier with this package.

## License

This project is licensed under the MIT License - see the LICENSE file for details.

---

## Contact

For any questions or feedback, email technologypower24@gmail.com or reach me on Discord at techpowerb.

            

Raw data

{
    "_id": null,
    "home_page": "https://github.com/Bhargav230m/ArrowTextClassifier.git",
    "name": "ArrowTextClassifier",
    "maintainer": null,
    "docs_url": null,
    "requires_python": ">=3.6",
    "maintainer_email": null,
    "keywords": "text classification, natural language processing, NLP, PyTorch, machine learning, deep learning, text summarization, preprocessing, data science, artificial intelligence, dataset, discord",
    "author": "techpowerb",
    "author_email": "technologypower24@gmail.com",
    "download_url": "https://files.pythonhosted.org/packages/6c/1a/0010a3aef31d2ce95efdf9d42bc66475060b9c9c5d57887a4446b3b79846/ArrowTextClassifier-1.0.3.tar.gz",
    "platform": null,
    "bugtrack_url": null,
    "license": null,
    "summary": "ArrowTextClassifier is a simple text classification tool written in pytorch that allows you to train, summarize, and use text classification models for various tasks.",
    "version": "1.0.3",
    "project_urls": {
        "Homepage": "https://github.com/Bhargav230m/ArrowTextClassifier.git"
    },
    "split_keywords": [
        "text classification",
        " natural language processing",
        " nlp",
        " pytorch",
        " machine learning",
        " deep learning",
        " text summarization",
        " preprocessing",
        " data science",
        " artificial intelligence",
        " dataset",
        " discord"
    ],
    "urls": [
        {
            "comment_text": "",
            "digests": {
                "blake2b_256": "2bd2e6a1111141a1abed2d53209fb18315c298b7754d049173bf12e850f64644",
                "md5": "cb8b0d04dff09bf09616d16a6e4a0c5b",
                "sha256": "3433b196ff044e80e4c5fc016e9726ae01a133dc9d8fc3b4deecbb083b1f22af"
            },
            "downloads": -1,
            "filename": "ArrowTextClassifier-1.0.3-py3-none-any.whl",
            "has_sig": false,
            "md5_digest": "cb8b0d04dff09bf09616d16a6e4a0c5b",
            "packagetype": "bdist_wheel",
            "python_version": "py3",
            "requires_python": ">=3.6",
            "size": 9941,
            "upload_time": "2024-04-20T14:25:37",
            "upload_time_iso_8601": "2024-04-20T14:25:37.514166Z",
            "url": "https://files.pythonhosted.org/packages/2b/d2/e6a1111141a1abed2d53209fb18315c298b7754d049173bf12e850f64644/ArrowTextClassifier-1.0.3-py3-none-any.whl",
            "yanked": false,
            "yanked_reason": null
        },
        {
            "comment_text": "",
            "digests": {
                "blake2b_256": "6c1a0010a3aef31d2ce95efdf9d42bc66475060b9c9c5d57887a4446b3b79846",
                "md5": "80c29ad861f574fe7e106975be132599",
                "sha256": "d128a1210cc580c66fb0b6e2f98a27b9d117193945d5c6fbc26b53f93d041697"
            },
            "downloads": -1,
            "filename": "ArrowTextClassifier-1.0.3.tar.gz",
            "has_sig": false,
            "md5_digest": "80c29ad861f574fe7e106975be132599",
            "packagetype": "sdist",
            "python_version": "source",
            "requires_python": ">=3.6",
            "size": 7928,
            "upload_time": "2024-04-20T14:25:39",
            "upload_time_iso_8601": "2024-04-20T14:25:39.284521Z",
            "url": "https://files.pythonhosted.org/packages/6c/1a/0010a3aef31d2ce95efdf9d42bc66475060b9c9c5d57887a4446b3b79846/ArrowTextClassifier-1.0.3.tar.gz",
            "yanked": false,
            "yanked_reason": null
        }
    ],
    "upload_time": "2024-04-20 14:25:39",
    "github": true,
    "gitlab": false,
    "bitbucket": false,
    "codeberg": false,
    "github_user": "Bhargav230m",
    "github_project": "ArrowTextClassifier",
    "travis_ci": false,
    "coveralls": false,
    "github_actions": false,
    "requirements": [],
    "lcname": "arrowtextclassifier"
}
        