[![pypi version](https://img.shields.io/pypi/v/bert-for-sequence-classification)](https://pypi.org/project/bert-for-sequence-classification)
[![pypi downloads](https://img.shields.io/pypi/dm/bert-for-sequence-classification)](https://pypi.org/project/bert-for-sequence-classification)
# bert-for-sequence-classification
A pipeline for easy fine-tuning of BERT-architecture models for sequence classification.
## Quick Start
### Installation
1. Install the library:
```
pip install bert-for-sequence-classification
```
2. If you want to train your model on a GPU, install a PyTorch version compatible with your device.
To find the version compatible with the CUDA version installed on your GPU, check the
[PyTorch website](https://pytorch.org/get-started/previous-versions/).
You can find the CUDA version installed on your device by running `nvidia-smi` in a console or
`!nvidia-smi` in a notebook cell.
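Once PyTorch is installed, you can confirm that it sees the GPU. This is a minimal sanity check using standard PyTorch calls, nothing specific to this library:

```python
import torch

# True if PyTorch was built with CUDA support and a GPU is visible.
print(torch.cuda.is_available())

if torch.cuda.is_available():
    # Name of the first visible CUDA device, e.g. "NVIDIA GeForce RTX 3090".
    print(torch.cuda.get_device_name(0))
```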
### CLI Use
```
bert-clf-train --path_to_config <path to yaml file>
```
An example config file can be found [here](config.yaml).
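For example, to train with the sample config from the repository root:

```
bert-clf-train --path_to_config config.yaml
```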
### Jupyter notebook
An example notebook can be found [here](example/pipeline_example.ipynb).
### Inference mode
How you load a trained model for inference depends on how the model was saved.

If `path_to_state_dict` in the [config](config.yaml) is set to `false`, the entire model object was saved, so with the library installed you can load it directly:
```python
import torch
import pandas as pd

device = torch.device("cuda" if torch.cuda.is_available() else "cpu")

# The whole model object was serialized, so torch.load returns it ready to use.
model = torch.load("path_to_saved_model", map_location=device)
model.eval()  # disable dropout for inference

df = pd.read_csv("path_to_some_df")

# model.predict takes a single text and returns the predicted label.
df["target_column"] = df["text_column"].apply(model.predict)
```
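For larger dataframes you may also want to disable gradient tracking while predicting. This is plain PyTorch and assumes `model.predict` handles one string at a time, as above:

```python
import torch

# Gradients are not needed at inference time; no_grad saves memory.
with torch.no_grad():
    df["target_column"] = df["text_column"].apply(model.predict)
```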
Otherwise, if only the state dict was saved, rebuild the model and load the weights:
```python
import torch
import json
import pandas as pd
from bert_clf.src.models.BertCLF import BertCLF
from transformers import AutoModel, AutoTokenizer

device = torch.device("cuda" if torch.cuda.is_available() else "cpu")

# Use the same pretrained checkpoint that the classifier was fine-tuned from.
tokenizer = AutoTokenizer.from_pretrained(
    pretrained_model_name_or_path="pretrained_model_name_or_path"
)
model_bert = AutoModel.from_pretrained(
    pretrained_model_name_or_path="pretrained_model_name_or_path"
).to(device)

# The id2label mapper is saved together with the state dict during training.
with open("path/to/saved/mapper") as f:
    id2label = json.load(f)

model = BertCLF(
    pretrained_model=model_bert,
    tokenizer=tokenizer,
    id2label=id2label,
    dropout=0.2,  # use the dropout value from your training config
    device=device
)

model.load_state_dict(
    torch.load("path_to_state_dict", map_location=device),
    strict=False
)
model.eval()  # disable dropout for inference

df = pd.read_csv("path_to_some_df")
df["target_column"] = df["text_column"].apply(model.predict)
```
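Since `model.predict` is applied row by row above, it can also be called on a single string:

```python
# Predict the label for one piece of text.
print(model.predict("Some text to classify"))
```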