[![pypi version](https://img.shields.io/pypi/v/bert-for-sequence-classification)](https://pypi.org/project/bert-for-sequence-classification)
[![pypi downloads](https://img.shields.io/pypi/dm/bert-for-sequence-classification)](https://pypi.org/project/bert-for-sequence-classification)
# bert-for-sequence-classification
A pipeline for easy fine-tuning of BERT-architecture models for sequence classification.
## Quick Start
### Installation
1. Install the library:
```
pip install bert-for-sequence-classification
```
2. If you want to train your model on a GPU, install a PyTorch version compatible with your device.
To find the version compatible with the CUDA version installed on your GPU, check the
[PyTorch website](https://pytorch.org/get-started/previous-versions/).
You can find the CUDA version installed on your device by running `nvidia-smi` in a console or
`!nvidia-smi` in a notebook cell.
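Once PyTorch is installed, you can confirm that it sees the GPU. This is a minimal sanity check using standard PyTorch calls, nothing specific to this library:

```python
import torch

# True if PyTorch was built with CUDA support and a GPU is visible.
print(torch.cuda.is_available())

if torch.cuda.is_available():
    # Name of the first visible CUDA device, e.g. "NVIDIA GeForce RTX 3090".
    print(torch.cuda.get_device_name(0))
```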
### CLI Use
```
bert-clf-train --path_to_config <path to yaml file>
```
An example config file can be found [here](config.yaml).
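For example, to train with the sample config from the repository root:

```
bert-clf-train --path_to_config config.yaml
```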
### Jupyter notebook
An example notebook can be found [here](example/pipeline_example.ipynb).
### Inference mode
How you load a trained model for inference depends on how the model was saved.

If `path_to_state_dict` in the [config](config.yaml) is set to `false`, the entire model object was saved, so with the library installed you can load it directly:
```python
import torch
import pandas as pd

device = torch.device("cuda" if torch.cuda.is_available() else "cpu")

# The whole model object was serialized, so torch.load returns it ready to use.
model = torch.load("path_to_saved_model", map_location=device)
model.eval()  # disable dropout for inference

df = pd.read_csv("path_to_some_df")

# model.predict takes a single text and returns the predicted label.
df["target_column"] = df["text_column"].apply(model.predict)
```
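For larger dataframes you may also want to disable gradient tracking while predicting. This is plain PyTorch and assumes `model.predict` handles one string at a time, as above:

```python
import torch

# Gradients are not needed at inference time; no_grad saves memory.
with torch.no_grad():
    df["target_column"] = df["text_column"].apply(model.predict)
```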
Otherwise, if only the state dict was saved, rebuild the model and load the weights:
```python
import torch
import json
import pandas as pd
from bert_clf.src.models.BertCLF import BertCLF
from transformers import AutoModel, AutoTokenizer

device = torch.device("cuda" if torch.cuda.is_available() else "cpu")

# Use the same pretrained checkpoint that the classifier was fine-tuned from.
tokenizer = AutoTokenizer.from_pretrained(
    pretrained_model_name_or_path="pretrained_model_name_or_path"
)
model_bert = AutoModel.from_pretrained(
    pretrained_model_name_or_path="pretrained_model_name_or_path"
).to(device)

# The id2label mapper is saved together with the state dict during training.
with open("path/to/saved/mapper") as f:
    id2label = json.load(f)

model = BertCLF(
    pretrained_model=model_bert,
    tokenizer=tokenizer,
    id2label=id2label,
    dropout=0.2,  # use the dropout value from your training config
    device=device
)

model.load_state_dict(
    torch.load("path_to_state_dict", map_location=device),
    strict=False
)
model.eval()  # disable dropout for inference

df = pd.read_csv("path_to_some_df")
df["target_column"] = df["text_column"].apply(model.predict)
```
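Since `model.predict` is applied row by row above, it can also be called on a single string:

```python
# Predict the label for one piece of text.
print(model.predict("Some text to classify"))
```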