<div align="center">
[![python](https://img.shields.io/badge/Python-3.9|3.10|3.11|3.12|3.13-3776AB.svg?style=flat&logo=python&logoColor=white)](https://www.python.org) ![PyPI - Version](https://img.shields.io/pypi/v/textpredict) [![Code style: black](https://img.shields.io/badge/code%20style-black-000000.svg)](https://github.com/psf/black) [![Ruff](https://img.shields.io/endpoint?url=https://raw.githubusercontent.com/astral-sh/ruff/main/assets/badge/v2.json)](https://github.com/astral-sh/ruff) [![security: bandit](https://img.shields.io/badge/security-bandit-yellow.svg)](https://github.com/PyCQA/bandit) [![Downloads](https://static.pepy.tech/badge/textpredict)](https://pepy.tech/project/textpredict)
![TextPredict Logo](https://raw.githubusercontent.com/ankit-aglawe/textpredict/main/assets/logo.png)
## Advanced Text Classification with Transformer Models
</div>
TextPredict is a Python package for text analysis and prediction with transformer-based NLP models. It provides a single interface for sentiment analysis, emotion detection, zero-shot classification, named entity recognition (NER), and related tasks. It is built on top of Hugging Face's Transformers and works with both pre-trained models and custom models fine-tuned for specific tasks.
## Features
- **Sentiment Analysis**: Determine the sentiment of text (positive, negative, neutral).
- **Emotion Detection**: Identify emotions such as happiness, sadness, and anger.
- **Zero-Shot Classification**: Classify text into custom categories without additional training.
- **Named Entity Recognition (NER)**: Extract entities like names, locations, and organizations from text.
- **Sequence Classification**: Fine-tune models for custom classification tasks.
- **Token Classification**: Classify tokens within text for tasks like NER.
- **Sequence-to-Sequence (Seq2Seq)**: Perform tasks like translation and summarization.
- **Model Comparison**: Evaluate and compare multiple models on the same dataset.
- **Explainability**: Understand model predictions through feature importance analysis.
- **Text Cleaning**: Use the bundled utility functions to preprocess text data.
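The text-cleaning utilities themselves are not documented in this README. As a rough illustration of the kind of preprocessing that is commonly applied before tokenization, here is a minimal standard-library sketch. The function `clean_text` and its exact steps are hypothetical and are not TextPredict's API.

```python
import html
import re

# Patterns for the three cleanup steps (illustrative choices, not TextPredict's)
TAG_RE = re.compile(r"<[^>]+>")
URL_RE = re.compile(r"https?://\S+")
SPACE_RE = re.compile(r"\s+")


def clean_text(text: str) -> str:
    """Hypothetical helper: unescape entities, drop tags and URLs, collapse whitespace."""
    text = html.unescape(text)      # "&amp;" -> "&"
    text = TAG_RE.sub(" ", text)    # remove HTML tags first so URLs end at whitespace
    text = URL_RE.sub(" ", text)    # remove http(s) links
    return SPACE_RE.sub(" ", text).strip()


print(clean_text("<p>I &amp; my team   love this! https://example.com</p>"))
```

Tags are removed before URLs so that a closing tag directly after a link is not swallowed by the URL pattern.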
## Supported Tasks
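- Sentiment Analysis (`sentiment`)
- Emotion Detection (`emotion`)
- Zero-Shot Classification (`zeroshot`)
- Named Entity Recognition (NER) (`ner`)
- Sequence Classification (`sequence_classification`)
- Token Classification (`token_classification`)
- Sequence-to-Sequence (Seq2Seq) (`seq2seq`)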
## Installation
Install the package from PyPI with pip (Python 3.9 or later is required):
```sh
pip install textpredict
```
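To confirm that the installation succeeded and that the interpreter falls inside the range declared in the package metadata (`>=3.9,<4.0`), a small standard-library check like the following can help. The helper names `installed_version` and `python_supported` are illustrative only.

```python
import sys
from importlib import metadata


def installed_version(package: str = "textpredict"):
    """Return the installed version string, or None if the package is missing."""
    try:
        return metadata.version(package)
    except metadata.PackageNotFoundError:
        return None


def python_supported(version_info=sys.version_info) -> bool:
    """Check the interpreter against the declared range >=3.9,<4.0."""
    return (3, 9) <= tuple(version_info[:2]) < (4, 0)


print("textpredict:", installed_version() or "not installed")
print("Python supported:", python_supported())
```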
## Quick Start
### Initialization and Simple Prediction
Initialize the TextPredict model and perform simple predictions:
```python
import textpredict as tp
# Initialize for sentiment analysis.
# Supported task values include: "sentiment", "emotion", "zeroshot", "ner",
# "sequence_classification", "token_classification", and "seq2seq".
model = tp.initialize(task="sentiment")
# Pass a list to classify several texts in one call
result = model.analyze(text=["I love this product!", "I hate this product!"], return_probs=False)
print(f"Sentiment Prediction Result: {result}")
```
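The exact return format of `analyze` is not specified in this README. Conceptually, a classifier produces one raw score (logit) per label, and a flag such as `return_probs` decides whether only the top label or the full probability distribution is returned. The sketch below illustrates that idea with a plain softmax. The logits, the label names, and the output shape are made up for illustration and may differ from what TextPredict actually returns.

```python
import math


def softmax(logits):
    """Convert raw scores into probabilities that sum to 1."""
    m = max(logits)  # subtract the maximum for numerical stability
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def decode(logits, labels, return_probs=False):
    """Hypothetical decoder: top label only, or top label plus all probabilities."""
    probs = softmax(logits)
    best = max(range(len(probs)), key=probs.__getitem__)
    if return_probs:
        return {"label": labels[best], "probs": dict(zip(labels, probs))}
    return labels[best]


labels = ["positive", "neutral", "negative"]
print(decode([2.0, 0.5, -1.0], labels))
print(decode([2.0, 0.5, -1.0], labels, return_probs=True))
```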
### Using Pre-trained Models from Hugging Face
Use a specific pre-trained model from the Hugging Face Hub by passing its name:
```python
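import textpredict as tp
# The task should match what the model was trained for; this checkpoint is a sentiment model
model = tp.initialize(task="sentiment", model_name="AnkitAI/reviews-roberta-base-sentiment-analysis", source="huggingface")
result = model.analyze(text="I love this product!", return_probs=True)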
print(f"Sentiment Prediction Result: {result}")
```
### Using Models from Local Directory
Load and use a model from a local directory:
```python
import textpredict as tp
# model_name is the path to a directory that holds the saved model files
model = tp.initialize(task="ner", model_name="./results", source="local")
result = model.analyze(text="I love this product!", return_probs=True)
print(f"NER Prediction Result: {result}")
```
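Models saved in the Hugging Face format include a `config.json` file in the model directory. Checking for that file before loading can give a clearer error than a failed load. This sketch assumes that `source="local"` expects such a directory, which this README does not state explicitly, and the helper name is hypothetical.

```python
from pathlib import Path


def looks_like_model_dir(path) -> bool:
    """Return True if `path` is a directory that contains a config.json file."""
    p = Path(path)
    return p.is_dir() and (p / "config.json").is_file()


if not looks_like_model_dir("./results"):
    print("./results does not look like a saved model directory")
```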
### Training a Model
Train a model for sequence classification:
```python
import textpredict as tp
from datasets import load_dataset
# Load the dataset. IMDB has no separate validation split, so the test split
# is reused here for both validation and final evaluation.
train_data = load_dataset("imdb", split="train")
val_data = load_dataset("imdb", split="test")
# Initialize and train the model
trainer = tp.SequenceClassificationTrainer(
    model_name="bert-base-uncased",
    output_dir="./results",
    train_dataset=train_data,
    val_dataset=val_data,
)
trainer.train()
# Save and evaluate the trained model
trainer.save()
metrics = trainer.evaluate(test_dataset=val_data)
print(f"Evaluation Metrics: {metrics}")
```
For detailed examples, refer to the `examples` directory.
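The contents of the `metrics` dictionary returned by `evaluate` are not listed in this README. For a binary task such as IMDB sentiment, evaluation typically reports accuracy, precision, recall, and F1. The following sketch shows how those values are derived from true and predicted labels. The function `binary_metrics` is hypothetical and independent of TextPredict.

```python
def binary_metrics(y_true, y_pred, positive=1):
    """Compute accuracy, precision, recall, and F1 for binary labels."""
    if not y_true or len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must be non-empty and of equal length")
    pairs = list(zip(y_true, y_pred))
    true_pos = sum(t == positive and p == positive for t, p in pairs)
    false_pos = sum(t != positive and p == positive for t, p in pairs)
    false_neg = sum(t == positive and p != positive for t, p in pairs)
    correct = sum(t == p for t, p in pairs)
    # Guard against division by zero when a class is never predicted or never present
    precision = true_pos / (true_pos + false_pos) if true_pos + false_pos else 0.0
    recall = true_pos / (true_pos + false_neg) if true_pos + false_neg else 0.0
    f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
    return {
        "accuracy": correct / len(pairs),
        "precision": precision,
        "recall": recall,
        "f1": f1,
    }


print(binary_metrics([1, 0, 1, 1, 0], [1, 0, 0, 1, 1]))
```

Computing the same metrics for each candidate model on one shared test set is also the simplest form of model comparison.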
### Explainability and Feature Importance
Understand model predictions with feature importance:
```python
import textpredict as tp
text = "I love this product!"
explainer = tp.Explainability(model_name="bert-base-uncased", task="sentiment", device="cpu")
importance = explainer.feature_importance(text=text)
print(f"Feature Importance: {importance}")
```
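This README does not describe which attribution method `feature_importance` uses. One common model-agnostic approach is occlusion (leave-one-out): remove each token in turn and record how much the model's score changes. The sketch below demonstrates the idea with a toy lexicon scorer standing in for a real model. `TOY_LEXICON`, `toy_score`, and `occlusion_importance` are all hypothetical and are not TextPredict's implementation.

```python
# Toy stand-in for a model: sums per-word sentiment weights
TOY_LEXICON = {"love": 1.0, "great": 0.8, "hate": -1.0, "bad": -0.7}


def toy_score(tokens):
    """Score a token list by summing lexicon weights (unknown words count as 0)."""
    return sum(TOY_LEXICON.get(t.lower().strip("!.,?"), 0.0) for t in tokens)


def occlusion_importance(text, score_fn=toy_score):
    """Return (token, importance) pairs, where importance is the score drop
    observed when that token is removed from the input."""
    tokens = text.split()
    base = score_fn(tokens)
    return [
        (tok, base - score_fn(tokens[:i] + tokens[i + 1:]))
        for i, tok in enumerate(tokens)
    ]


for token, weight in occlusion_importance("I love this product!"):
    print(f"{token:10s} {weight:+.2f}")
```

With a real classifier, `score_fn` would return the probability of the predicted class, at the cost of one forward pass per token.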
## Documentation
For detailed documentation, please refer to the [TextPredict Documentation](https://github.com/ankit-aglawe/textpredict#readme).
## Contributing
Contributions are welcome! Please read our [Contributing Guidelines](CONTRIBUTING.md) before making a pull request.
## License
This project is licensed under the MIT License - see the [LICENSE](LICENSE) file for details.
## Links
- **GitHub Repository**: [GitHub](https://github.com/ankit-aglawe/textpredict)
- **PyPI Project**: [PyPI](https://pypi.org/project/textpredict/)
- **Documentation**: [README](https://github.com/ankit-aglawe/textpredict#readme)
- **Source Code**: [Source Code](https://github.com/ankit-aglawe/textpredict)
- **Issue Tracker**: [Issue Tracker](https://github.com/ankit-aglawe/textpredict/issues)
## Raw Data
{
"_id": null,
"home_page": "https://github.com/ankit-aglawe/textpredict",
"name": "textpredict",
"maintainer": null,
"docs_url": null,
"requires_python": "<4.0,>=3.9",
"maintainer_email": null,
"keywords": "text classification, emotion classification, sentiment analysis, text analysis, machine learning, NLP, transformers, data science, pre-trained models, text mining",
"author": "Ankit Aglawe",
"author_email": "aglawe.ankit@gmail.com",
"download_url": "https://files.pythonhosted.org/packages/55/32/692fa1ef3e22efedd0a0a2c1798f0e9aa237aa4e680c24f09c1d4ecd2630/textpredict-0.1.3.tar.gz",
"platform": null,
"bugtrack_url": null,
"license": "MIT",
"summary": "TextPredict is a powerful Python package designed for various text analysis and prediction tasks using advanced NLP models. It simplifies the process of performing sentiment analysis, emotion detection, zero-shot classification, named entity recognition (NER), and more.",
"version": "0.1.3",
"project_urls": {
"Changelog": "https://github.com/ankit-aglawe/textpredict/releases",
"Documentation": "https://github.com/ankit-aglawe/textpredict#readme",
"Homepage": "https://github.com/ankit-aglawe/textpredict",
"Repository": "https://github.com/ankit-aglawe/textpredict",
"Source": "https://github.com/ankit-aglawe/textpredict",
"Tracker": "https://github.com/ankit-aglawe/textpredict/issues"
},
"split_keywords": [
"text classification",
" emotion classification",
" sentiment analysis",
" text analysis",
" machine learning",
" nlp",
" transformers",
" data science",
" pre-trained models",
" text mining"
],
"urls": [
{
"comment_text": "",
"digests": {
"blake2b_256": "19065697b1aee5a0623a68f0cc85fb8fa94f63147f34c958ddc7acdfa8cf1a41",
"md5": "fe7019c1cd284efe46049ca52b87bfcd",
"sha256": "f159d10e1db38ace517a220aef12fb66c5501522f1d57a57b2d997d4510ce789"
},
"downloads": -1,
"filename": "textpredict-0.1.3-py3-none-any.whl",
"has_sig": false,
"md5_digest": "fe7019c1cd284efe46049ca52b87bfcd",
"packagetype": "bdist_wheel",
"python_version": "py3",
"requires_python": "<4.0,>=3.9",
"size": 29584,
"upload_time": "2024-07-10T16:32:02",
"upload_time_iso_8601": "2024-07-10T16:32:02.990939Z",
"url": "https://files.pythonhosted.org/packages/19/06/5697b1aee5a0623a68f0cc85fb8fa94f63147f34c958ddc7acdfa8cf1a41/textpredict-0.1.3-py3-none-any.whl",
"yanked": false,
"yanked_reason": null
},
{
"comment_text": "",
"digests": {
"blake2b_256": "5532692fa1ef3e22efedd0a0a2c1798f0e9aa237aa4e680c24f09c1d4ecd2630",
"md5": "d5889c473a8d06032b2f161dcdf8ec16",
"sha256": "66a31c5be5535d8ca530774dc97b83829daaa85fba0b458746697ec50c04ceae"
},
"downloads": -1,
"filename": "textpredict-0.1.3.tar.gz",
"has_sig": false,
"md5_digest": "d5889c473a8d06032b2f161dcdf8ec16",
"packagetype": "sdist",
"python_version": "source",
"requires_python": "<4.0,>=3.9",
"size": 18817,
"upload_time": "2024-07-10T16:32:04",
"upload_time_iso_8601": "2024-07-10T16:32:04.606434Z",
"url": "https://files.pythonhosted.org/packages/55/32/692fa1ef3e22efedd0a0a2c1798f0e9aa237aa4e680c24f09c1d4ecd2630/textpredict-0.1.3.tar.gz",
"yanked": false,
"yanked_reason": null
}
],
"upload_time": "2024-07-10 16:32:04",
"github": true,
"gitlab": false,
"bitbucket": false,
"codeberg": false,
"github_user": "ankit-aglawe",
"github_project": "textpredict",
"travis_ci": false,
"coveralls": false,
"github_actions": true,
"lcname": "textpredict"
}
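The `digests` entries in the metadata above can be used to verify a downloaded wheel or source archive. The following sketch hashes a local file in chunks and compares the result with an expected SHA-256 value. The helper names are illustrative, and the file is assumed to have been downloaded separately.

```python
import hashlib


def sha256_of(path, chunk_size=1 << 20):
    """Return the hex SHA-256 digest of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as fh:
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_digest(path, expected_hex):
    """Compare a file's SHA-256 with the expected hex string (case-insensitive)."""
    return sha256_of(path) == expected_hex.strip().lower()
```

For example, `matches_digest("textpredict-0.1.3.tar.gz", "66a31c5be5535d8ca530774dc97b83829daaa85fba0b458746697ec50c04ceae")` checks the source archive against the `sha256` value listed for it.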